C++ Exit-Time Destructors

martijntje · 2024-03-23T08:35:16

It's interesting to see that the complexity of c++ is so great that even the code in standards proposals contains undefined behaviour. The no_destroy class triggers UB when .get() is called, because it fails to launder the returned pointer. This is required since c++17.

pjmlp · 2024-03-23T10:16:31

The way C++ is going is no longer the language I felt in love as my next programming tool after Object Pascal (Turbo/Delphi).

C++26 is going to introduce erroneous behavior.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p27...

maccard · 2024-03-23T10:40:08

This is absolute madness, and IMO goes against everything c++ should stand for. It's a non-zero runtime overhead breaking language change, for a case that should be handled by disallowing uninitialized reads, with an opt in for undefined behaviour.

pjmlp · 2024-03-23T11:29:45

The madness is the increasing complexity, adding features into the standard without having them being tested on field (like in any other sane language), last meeting had 210 people voting in for their special features, trying to fix safety without cleaning out the features that make it impossible in first place, and so on.

As for non-zero runtime overhead, that is long gone, since stuff like std::map, std::regexp and refusal from compiler vendors to break an ABI that isn't even defined on the ISO level, yet it blocks any proposal that has the side effect of breaking it.

maccard · 2024-03-23T11:37:47

> adding features into the standard without having them being tested on field

Could not agree more. Modules are a perfect example of this.

> As for non-zero runtime overhead, that is long gone, since stuff like std::map, std::regexp

I have a little sympathy for this - they're libraries that are replaceable (yet you'd think we'd have learned from map, regex, unique_ptr that getting it right first time is impossible, and _maybe_ we should start putting these things in the language), and the overhead there is opt-in. In this case, if I have the following code:

    char buf[SOME_VERY_LARGE_CONSTANT];
    bool result = fill_buf_from_c_library(buf, sizeof(buf) / sizeof(buf[0]));
    assert(result); // we've guaranteed at runtime that this memory is initialized

now comes with a significant performance hit with a compiler upgrade, and requires code changes to fix

> refusal from compiler vendors to break an ABI that isn't even defined on the ISO level, yet it blocks any proposal that has the side effect of breaking it.

And yet this is true at the same time. Madness.

tialaramex · 2024-03-23T14:27:08

> I have a little sympathy for this - they're libraries that are replaceable

In effect unlike the C stdlib, where sure - there's almost nothing in the freestanding library, the C++ stdlib is full of higher level features even in freestanding, more like Rust's core. It's not as rich as Rust's core, but it's definitely in that direction.

So that means there's a responsibility for Quality of Implementation on those features, they're not really "replaceable" because they're in the fundamental standard library even on bare metal which means a "replacement" is just a parallel implementation. That's probably fine to some extent for std::unordered_map, but it's very silly for std::function, and I'd argue it's even silly for std::unique_ptr and std::string.

> now comes with a significant performance hit with a compiler upgrade, and requires code changes to fix

It's a problem that it took C++ until now to fix this, but that's still on them. There is a proposal (which I'd guess you'll hate) to have a Rust-style uninitialized wrapper type so that you can write what you meant here and it's clear to the compiler and to future human maintainers that those ain't chars, those are a kiss of death until after that C function successfully does what it says it does.

The C definition of assert, which C++ inherits, is problematic here, because it's a no-op in release builds.

maccard · 2024-03-23T14:55:56

> So that means there's a responsibility for Quality of Implementation on those features, they're not really "replaceable" because they're in the fundamental standard library even on bare metal which means a "replacement" is just a parallel implementation. That's probably fine to some extent for std::unordered_map, but it's very silly for std::function, and I'd argue it's even silly for std::unique_ptr and std::string.

I think we agree here - you can replace std::unordered_map with absl::map etc, so the impact is lower. I do wish we'd stop doing it, though - see fmt.

I agree that it's incredibly stupid for std::function and std::unique_ptr to not be language level features - unique_ptr is the poster child for something that could faster and safer with less compile time overhead if implemented in the language rather than in the library.

> There is a proposal (which I'd guess you'll hate)

No, I think that's great actually, but I don't want that _and_ noinit/indeterminite (which is proposed in the link above) _and_ a new category of behaviour. I want one option, and I want failure of adherence to it to either be compiler enforced, implelentation defined, or Undefined Behaviour, _not_ "erroneus behaviour".

> The C definition of assert, which C++ inherits, is problematic here, because it's a no-op in release builds.

Sorry, yeah you're totally right. my day to day C++ work has a custom set of assertions that in release builds are basically the following (simplified for posting here) -

#define our_assert(conditional) \ if (!conditional) { \ log_stuff_and_flush() \ std::abort() \ }

> because it's a no-op in release builds.

This falls into the same group of back compat as the unwillingness to break ABI IMO - you can have an optimised build without NDEBUG, or an unoptimised build with NDEBUG, and the side effects of disabling bounds checking is turning assert into a noop.

Asooka · 2024-03-23T13:23:04

Oh thank God! I have been asking for something like "erroneous behaviour" for years. Clang in particular is very fond of deleting safety checks if they check for conditions which it thinks are only possible under undefined behaviour. Every year I have to fix one or two bugs that stem from code from completely separate modules interacting in a way that manages to produce a state which leads to undefined behaviour. The option to opt-in to UB is also a very good idea - we can turn it on in the hot parts of code and restore performance where it really matters, while constraining the impact, i.e. even if the code has UB, that cannot poison the rest of the program.

I hope this renewed focus on safety will lead to the default for what is currently undefined behaviour being more in line with what was originally intended - sort-of arbitrary behaviour dependent on the platform's state at the time of code execution. As in, less constrained than "implementation defined", since there is no guarantee it will produce the same result every time, but more constrained than UB, since it does not invalidate the program.

lelanthran · 2024-03-25T14:40:08

I fell out of love with C++ over a decade ago.

The discipline required to not shoot your foot off in a non trivial project was far greater than than the extra code needed when writing plain C, or, for bigger projects, C# and Java.

I still reach for C first, and if I need higher level abstractions, I'm reaching for something other than C++.

geertj · 2024-03-23T11:03:51

Are you sure that with https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p05... this is still undefined behavior?

martijntje · 2024-03-23T12:30:37

If I read that paper correctly, that's only valid if the object can be constructed without code being run - i.e. a trivially constructible type.

Since there is no limitation on the no_destroy type, you can create any type with it, including those which have a non-trivial constructor. Of course, you could use a concept to restrain it to only trivially constructible types, or you could omit the launder in an if constexpr branch and still be compliant.

protomolecule · 2024-03-23T15:39:14

I don't think that get() needs to launder that pointer. The object of type T is constructed only once in the storage provided by an array of bytes.

LegionMammal978 · 2024-03-23T20:08:25

It doesn't really matter how many times the T is constructed: the pointer resulting from "new" points to the T object, but the pointer from accessing the member just points to the byte array, and the latter cannot be turned into the former without std::launder().

protomolecule · 2024-03-24T17:22:24

I don't think you are right here.

std::launder addresses a specific situation: when an object is replaced by another object of the same type but compiler doesn't know that the first object's lifetime has ended and the same address points to a different object. In this case compiler may still use cached values of constant or reference fields of the dead object or its vptr.

In this case the byte array merely provides storage for an object of type T created by the placement new operator and that object is never replaced by anything. The lifetime of the byte array itself doesn't end when a nested object is created.

LegionMammal978 · 2024-03-24T19:42:18

That's one of the situations for std::launder(). (However, in most cases where the new object has exactly the same type, C++20 has made laundering unnecessary with its 'transparently replaceable' criterion.)

Yet the other situation is precisely this one [0] [1]. The byte array does provide storage for the T, but a pointer to the byte array is not a pointer to the T. You need to launder the pointer to get from the former to the latter. Such are the rules when you're bound as strictly to type-based aliasing as standard C++ is.

(Not that any real compiler is nearly as strict about byte arrays in practice: if they were, then many Unix networking APIs would be practically unusable. Indeed, people care about it so little that Linux is happily adding new syscalls that write data into variable-length buffers and expect the user to arbitrarily cast them back into structs.)

[0] https://en.cppreference.com/w/cpp/utility/launder#Notes

[1] https://stackoverflow.com/a/39382728

protomolecule · 2024-03-24T23:02:50

You are right, thank you for taking time to explain.

worstspotgain · 2024-03-23T13:24:34

On Mac and iOS, this thorny issue is made even worse by some ancient Darwin design choices. When the process is terminated, the main thread executes static destructors while the other threads are still running. This causes unpredictable exit-time crashes in (typically cross-platform) code that uses static variables with nontrivial destructors.

The solution is to always terminate all threads before exiting. This requires a bit of internal infrastructure that's missing from many projects. If a cross-platform library is being used, it may start its own untracked threads.

On iOS, app processes only terminate in special circumstances. When this issue yields a crash it's (currently) imperceptible to the user. However, it's still tracked in the crash statistics.

gpderetta · 2024-03-23T16:29:55

std::quick_exit

worstspotgain · 2024-03-23T18:32:41

Sure, and on many systems there are platform-specific ways of terminating the process. However:

- Executing static destructors without crashing is the ideal result.

- On Mac, there are factors that make the thread-termination approach preferable, such as using cross-platform libraries or a framework like Qt.

- On iOS, none of these apply, as process termination is out of your hands.

pizlonator · 2024-03-23T02:29:56

Great write-up! I was just spelunking in this part of the language recently.

Worth noting that global constructors and destructors are generally sucky for a bunch of reasons. Don’t use them unless you really have to. Avoid them in libraries.

Reason to avoid global constructors: there’s no laziness to them; they definitely run on program start even if the global is never used. For small projects, no big deal. For large projects it means a large startup time penalty. Successful projects get big eventually, so if you’re playing the long game you want to assume you’re a big project. If you’re a library writer, it’s probable that you’ll be linked into something big. If you don’t have global ctors then your clients will thank you.

Atexit anything is just pointless, IMO. The best way for a process to exit is to just exit. Trying to “destruct” your state at exit risks race conditions and weird bugs but buys you very little. Even things like flushing or syncing a file probably don’t belong in atexit since you should assume your program will run a long time so you want important stuff flushed and synced way before the exit happens. Also, global dtors are implemented by having a global ctor that does __cxa_atexit, so see previous paragraph.

And as with all programming advice, your mileage may vary and some advice is best ignored. If global ctors/dtors achieve what you want and/or startup time isn’t an issue, then do what you feel, bro

akoboldfrying · 2024-03-23T08:39:21

>Even things like flushing or syncing a file probably don’t belong in atexit since you should assume your program will run a long time so you want important stuff flushed and synced way before the exit happens.

I definitely want buffers to be flushed to disk if the user Ctrl-C's the program. The OS only ensures that the resources are released at the OS level, but does nothing to prevent the persistent application state becoming inconsistent at the application level.

The really hardcore way to do this is to structure your whole program so that all state is recorded as deltas appended to a WAL, with a background thread/process periodically merging this into "current state" -- in that case, even pulling the power cord can't leave your application in an inconsistent state. But that's a whole lot of work, and just best-effort flushing buffers on exit gets you 80% of the way there.

gpderetta · 2024-03-23T16:37:07

Or you do the easy mode and use sqlite for your file format.

KerrAvon · 2024-03-23T05:01:47

Yes. If you have an atexit or global destructor on a program expected to run on anything Unixlike or Windows >= NT, you’re just making things slower for the user. Unless you’re on AmigaOS, let the kernel do its job; it can release your resources much faster than you can.

dataflow · 2024-03-23T05:35:08

> If you have an atexit or global destructor on a program expected to run on anything Unixlike or Windows >= NT, you’re just making things slower for the user. Unless you’re on AmigaOS, let the kernel do its job; it can release your resources much faster than you can.

This is a pretty myopic view of things? There are lots of other things you might wish to do when the program terminates besides releasing resources...

pizlonator · 2024-03-23T06:10:45

Like what?

gsliepen · 2024-03-23T07:19:44

At a previous job I was using destructors to ensure motors and other actuators in a telescope would shut down properly, and in the right order. This was very important, because incorrect shutdown could cause damage to equipment and potentially people.

This was actually my main reason for using C++ to program the control software: the fact that constructors and destructors run deterministically, especially when something bad happens and an exception is thrown. And while the language doesn't specify in which order they run for global variables in different compilation units, you still know how they are ordered with respect to `main()`, and if you pay attention to how the linker works, you can actually know how they will all be ordered.

gpderetta · 2024-03-23T16:41:19

If your program crashes or hangs it also doesn't run static destructors. And if this is safety critical you need the cleanup guarantee anyway, for example via a supervisor or watchdog process.

gsliepen · 2024-03-24T21:18:54

Of course. It is run from a supervisor, which will power off all actuators if the control software crashes. This is less ideal than a more clean shutdown though. And there is a hardware watchdog which will reboot the computer it is running on if the whole thing crashes. And hardware endswitches will physically cut power if the actuators run into a position they shouldn't be.

gpderetta · 2024-03-25T10:42:26

Can't the watchdog also perform the clean shutdown? I strongly subscribe to the Crash-only Software design principle, but it is useful to get feedback from actual experience in the field.

edit: to clarify, I'm not dogmatic, I have nothing against implmenting clean shutdown procedures for example to avoid alerts and allow quick restarts that don't need to roll-forward replay logs.

I just think that clean shutdown should be best-effort and never required for consistency/durability/safety.

We are probably in agreement.

gsliepen · 2024-03-25T22:56:44

It might be able to, but it doesn't know the state of the system. It could try to determine that somehow, but that would complicate the watchdog a lot, and that is code that is almost never exercised, so the chance that there would be bugs in that code is much higher. If shutdowns were more frequent, a crash-only design might indeed have made more sense.

dataflow · 2024-03-23T18:37:23

Can't such a destructor be in the supervisor?

  Guard g;
  // ...call subprocess...
  // Guard::~Guard will run at exit

pizlonator · 2024-03-23T13:08:00

Ok but couldn’t you have done that prior to calling exit?

dataflow · 2024-03-23T18:33:18

What problem would that have solved?

Also... can't you make that argument for anything? "Avoid unique_ptr -- can't you just free the memory before exiting?"

microtherion · 2024-03-23T13:16:04

For command line tools that use curses or mess with the terminal in other ways, you can leave things in a pretty ugly state if you don't clean up properly.

dataflow · 2024-03-23T06:41:34

Like https://stackoverflow.com/a/9334778

masklinn · 2024-03-23T07:26:13

Or signaling external systems, or removing lockfiles, …

Anything the kernel does not do on program exit, which is most things.

nmcveity · 2024-03-23T08:05:33

But sometimes not fast enough. IIRC, you can wait for a HPROCESS in Win32 with the WaitForSingleObject call but that is signaled when the application code finishes _not_ when the OS has finished it's clean up. So if the process you were waiting on was an application that would write, say, "a.txt" and you wanted to wait until that process was done so that you can read "a.txt", it is possible that you fail to open that file because the OS had not released the file resource.

Of course, I wouldn't put that clean up in an atexit.

gpderetta · 2024-03-23T16:39:04

Not much to add. Strongly agree. Destructors are to preserve internal program invariant. External invariants need transactional guaratees.

wyldfire · 2024-03-23T05:29:33

It'd be great if we could come together and specify some of these dark corners like what should happen to static destructors and at_thread_exit handlers at dlclose time.

Since it's unspecified, musl's behavior is just as correct as glibc's or BSD's or ...

ryandrake · 2024-03-23T09:44:54

Previous discussion: https://news.ycombinator.com/item?id=39737622

o11c · 2024-03-23T05:21:58

Side note: dynamic library unloading (and, for that manner, multiple loading of the same library; see `dlmopen`) is finicky because the language has no concept of ownership of things like function pointers and pointers to static data. It's easy to imagine a better language, though this would of course require a lot of low-level implementation work!

pjmlp · 2024-03-23T10:21:23

Which is why that kind of stuff only works properly with languages that combine dynamic loading with rich runtimes, alongside automatic memory management, having all puzzle pieces to work together to keep the application running when the library is removed again.

I came to accept that in languages like C and C++, the best path for dynamic loading code is really via OS IPC and external processes, the decrease in hardware resources and simpler programming model of using shared libraries as plugins, isn't worth the endless amount of debugging hours regarding program crashes.

dataflow · 2024-03-23T05:30:20

> the language has no concept of ownership of things like function pointers and pointers to static data. It's easy to imagine a better language

I actually have both a hard time imagining a better language (not saying it's impossible, just that it's not trivial for me to see) and also how ownership would resolve this issue. Could you elaborate?

lldb · 2024-03-23T06:42:43

I tested this recently and it definitely works as expected in c++ - destructors of static lifetime variables are called when the library is unloaded.

rwmj · 2024-03-23T07:39:53

Only in trivial cases. The problem is when another part of the program holds a function pointer into the library which has been unloaded. This can happen in particular when you combine libraries and threads:

libvirt has long been linked with -Wl,-z -Wl,nodelete to avoid this:

https://gitlab.com/libvirt/libvirt/-/commit/8e44e5593eb9b89f...

Another case in libnbd which really demonstrates how hard this is:

https://gitlab.com/nbdkit/libnbd/-/commit/368e3d0d5a8871aad5...

alexey-salmin · 2024-03-23T09:11:27

> The problem is when another part of the program holds a function pointer into the library which has been unloaded.

How is this different from holding a pointer to an object being deleted?

rwmj · 2024-03-23T09:40:37

That it's very hard to avoid (we think in fact impossible), because of the way that Linux _exit(2) works when you have threads, destructors and thread-local storage together. This is even if you are very careful. You should probably read the links I posted.

alexey-salmin · 2024-03-23T10:10:53

I did and this looks like a library design issue, which is now impossible to solve without breaking the ABI. I argue though that a design that doesn't suffer from this issue exists because there is a way to serialize all these operations in a safe way.

rwmj · 2024-03-23T11:53:16

Thread-local storage + destructors is, as far as we can tell, impossible to use safely by any significantly complicated program or library. It's largely because of how Linux shoots down programs on exit by killing the threads in any order without notice. It could be fixed in Linux, although I can also understand why Linux developers wouldn't want to do that as it would push significant complexity there.

jenadine · 2024-03-23T09:42:23

It's not much different. But programmers tend to assume that static data lives forever and don't always think of this problem.

saagarjha · 2024-03-23T02:57:56

> Potential race conditions: Destructors might execute during thread termination, while other threads still attempt to access the object. Examples: webkit

Haven’t looked into the bug closely but I will say that WebKit compiles with thread safety on statics turned off, which means they are in a non-standard language mode with far more races.

ooterness · 2024-03-23T16:38:46

As an embedded C++ developer, exit-time destructors have bit me many times.

In many cases, they pull in 100+ kB of useless code that only runs on exit. I've only got 128 kiB of RAM and my program doesn't exit, but good luck convincing the compiler this is true. It's so aggravating.

roeles · 2024-03-23T20:19:35

Recent tests with link time optimization give me the impression that code which is not actually called is not included (or even resolved). If you can convince the compiler that the destructor is never called, it might not include it.

ooterness · 2024-03-24T06:07:09

Yes, that's correct. Assuming "-ffunction-sections" is set during compilation, and "--gc-sections" is set during linking, it'll drop any unreachable code.

Sadly, small oversights (e.g., exception handling for undefined virtual methods) can cause disproportionate bloat.

MichaelMoser123 · 2024-03-23T14:12:57

the article didn't mention the “static initialization order ‘fiasco’"

https://isocpp.org/wiki/faq/ctors#static-init-order

If you have two static objects in two different files then good luck if one depends on the other, as the initialization order is not well defined.

If both files are in the same executable then you can get around that by linking the object files in the desired order - well mostly, kind of.

i have hit this problem at some point...

josefx · 2024-03-23T14:16:34

That sounds like hard mode, the easy solution is to just initialize them the first time they are used. GetInstance(){ if (!instance)instance = new Foo; return instance;}, add a lock if you need thread safety and done.

MichaelMoser123 · 2024-03-23T14:18:36

Yes, you fix the problem by calling the initialization stuff from a single point.

rcxdude · 2024-03-23T17:59:24

or just:

Foo &getInstance() { static Foo instance; return instance; }

Guaranteed to be threadsafe by the standard, even.