It's very platform specific. MacOS has had "containers" since the switch to the NeXTStep-derived OS X in 2001. An .app bundle is essentially a container from the software distribution PoV. Windows was late to the party but they have it now with the MSIX system.

It's really only Linux where you have to ship a complete copy of the OS (sans kernel) to even reliably boot up a web server. A lot of that is due to coordination problems. Linux is UNIX with extra bits, and UNIX wasn't really designed with software distribution in mind, so it's never moved beyond that legacy. A Docker-style container is a natural approach in such an environment.




Is it? I'm using LXC containers, but that's mostly because I don't want to run VMs on my devices (not enough cores). I've noted down the steps to configure them so that I can turn them into a shell script if I ever have to redo it. I don't see the coordination problem if you choose one distro as your base and then provision them with shell scripts or ansible. Shipping a container instead of a plain build is like shipping an Electron app instead of a native desktop app: optimizing for developer time instead of user resources.


> if you choose one distro as your base

Yes, obviously if you control the whole stack then you don't really need containers. If you're distributing software that's intended to run on "Linux" in general rather than on RHEL/Ubuntu/whatever specifically, then you can't rely on the userspace or the packaging formats, and that's when people go to containers.

And of course if part of your infrastructure is on containers, then there's value in consistency, so people go all the way. It introduces a lot of other problems but you can see why it happens.

Back in around 2005 I wasted a few years of my youth trying to get the Linux community on board with multi-distro thinking and a unified software installation format. It was called autopackage, and developers liked it. It wasn't the same as Docker: it focused on reusing dependencies from the base system, because static linking was badly supported and the kernel didn't have the features needed to do containers properly back then.

Distro makers hated it though, and back then the Linux community was far more ideological than it is today. Most desktops ran Windows, MacOS was a weird upstart thing with a nice GUI that nobody used and nobody was going to use, and most servers still ran big-iron UNIX. The community was mostly made up of true believers who had convinced themselves (wrongly) that the way the Linux distro landscape had evolved was a competitive advantage and would lead to inevitable victory for GNU-style freedom. I tried to convince them that nobody wanted to target Debian or Red Hat, they wanted to target Linux, but people just told me static linking was evil, Linux was just a kernel and I was an idiot.

Yeah, well, funny how that worked out. Now most software ships upstream, targets Linux-the-kernel, and just ships a whole "statically linked" app-specific distro with itself. And nobody really cares anymore. The community became dominated by people who don't care about Linux itself: it's just a substrate, they want their stuff to work, and so they standardized on Docker. The fight went out of the true believers who pushed against such trends.

This is a common pattern when people complain about egregious waste in computing. Look closely and you'll find the waste often has a sort of ideological basis to it. Some powerful group of people became subsidized so they could remain committed to a set of technical ideas regardless of the needs of the user base. Eventually people find a way to hack around them, but in an uncoordinated, undesigned and mostly unfunded fashion. The result is a very MVP set of technologies.


> A lot of that is due to coordination problems.

The dumpster fire at the bottom of that is libc and the C ABI. Practically everything is built around the assumption that software will be distributed as source code and configured and recompiled on the target machine, because maintaining ABI compatibility, and laying out the filesystem so that .so files could even be found in the right spot, was too hard.


To quote Wolfgang Pauli, this is not just not right, it's not even wrong ...

The "C ABI" and libc are a rather stable part of Linux. Changing the behaviour of system calls ? Linus himself will be after you. And libc interfaces, to the largest part, "are" UNIX - it's what IEEE1003.1 defines. While Linux' glibc extends that, it doesn't break it. That's not the least what symbol revisions are for, and glibc is a huge user of those. So that ... things don't break.

Now "all else on top" ... how ELF works (to some definition of "works"), the fact stuff like Gnome/Gtk love to make each rev incompatible to the prev, that "higher" Linux standards (LSB) don't care that much about backwards compat, true.

That, though, isn't the fault of either the "C ABI" or libc.


Sadly, things do break, all the time, because the GNU symbol versioning scheme is badly designed, badly documented and has extremely poor usability. I've been doing this stuff for over 20 years now [1] [2], and over that time period I have had to help people resolve mysterious errors caused by it over and over and over again.

Good platforms allow you to build on newer versions whilst targeting older versions. Developers often run newer platform releases than their users: they want to develop software that optionally uses newer features, they're power users who like to upgrade, they need toolchain fixes or security patches, or many other reasons. So devs need a "--release 12" type flag that lets them say: compile my software so it can run on platform release 12, and verify that it will.

On any platform designed by people who know what they're doing (literally all of the others) this is possible and easy. On Linux it is nearly impossible because the entire user land just does not care about supporting this feature. You can, technically, force GNU ld to pick a symbol version that isn't the latest (there's a sketch of what that looks like after this list), but:

• How to do this is documented only in the middle of a dusty ld manual nobody has ever read.

• It has to be done on a per-symbol basis. You can't just say "target glibc 2.25".

• What versions exist for each symbol isn't documented. You have to discover that using nm.

• What changed between the versions of each symbol isn't documented, not even in the glibc source code. The headers, for example, may in theory no longer match the older versions of the symbols (although in practice they usually do).

• Which version of glibc ships with each version of each distribution isn't documented.

• Weak linking barely works on Linux: it can only be done at the level of whole libraries, whereas what you need is symbol-level weak linking. Note that Darwin gets this right.
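
For the record, the per-symbol dance looks roughly like this - a minimal sketch, and GLIBC_2.2.5 is just the oldest version x86-64 glibc happens to export for memcpy, so check what your libc actually provides first (e.g. with objdump -T /usr/lib/libc.so.6 | grep ' memcpy'):

  /* pin.c - force an old glibc symbol version, one symbol at a time. */
  #include <stdio.h>
  #include <string.h>

  /* Without this, a new toolchain may emit memcpy@GLIBC_2.14 and the
     binary will refuse to load on older systems. The directive has to
     be repeated for every versioned symbol you happen to pull in. */
  __asm__(".symver memcpy, memcpy@GLIBC_2.2.5");

  int main(void) {
      char dst[16];
      memcpy(dst, "hello", 6);
      puts(dst);
      return 0;
  }

And then readelf -sW on the result is the only way to be sure nothing newer leaked in.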

And then it used to be that the problems would repeat at higher levels of the stack: e.g. compiling against the headers of a newer version of GTK2 would helpfully give your binary silent dependencies on that newer version of the library, even if you thought you didn't use any of its new features. Of course everyone gave up on desktop Linux long ago, so that hardly matters now.

The only parts of the Linux userland that still matter are the C library and a few other low-level libs like OpenSSL (sometimes, depending on your language). Even those are going away. A lot of apps are now statically linked against musl. Go apps make syscalls directly. Increasingly the only API that matters is the Linux syscall API: it's stable in practice and not only in theory, and it's designed to let you fail gracefully if you try to use new features on an old kernel.
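
To illustrate that last point - a minimal sketch, using getrandom() (added in kernel 3.17) as an arbitrary example of a newer syscall:

  /* fallback.c - fail gracefully on an older kernel instead of at load time. */
  #define _GNU_SOURCE
  #include <errno.h>
  #include <stdio.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  int main(void) {
      unsigned char buf[16];
      long n = syscall(SYS_getrandom, buf, sizeof buf, 0);
      if (n < 0 && errno == ENOSYS) {
          /* Old kernel: take a legacy path, e.g. read /dev/urandom. */
          puts("getrandom() not available, falling back");
      } else {
          puts("got random bytes via getrandom()");
      }
      return 0;
  }

Compare that with an unresolvable glibc symbol version, where the dynamic linker refuses to start the binary at all.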

The result is this kind of disconnect: people say "the user land is unstable, I can't make it work" and then people who have presumably never tried to distribute software to Linux users themselves step in to say, well technically it does work. No, it has never worked, not well enough for people to trust it.

[1] Here's a guide to writing shared libraries for Linux that I wrote in 2004: https://plan99.net/~mike/writing-shared-libraries.html which apparently some people still use!

[2] Here's a script that used to help people compile binaries that worked on older GNU userspaces: https://github.com/DeaDBeeF-Player/apbuild


> How to do this is documented only in the middle of a dusty ld manual nobody has ever read.

This got an audible laugh out of me.

> Good platforms allow you to build on newer versions whilst targeting older versions.

I haven't been doing this for 20 years (13), but I've written a fair amount of C. This, among other things, is what made me start dabbling with zig.

  ~  gcc -o foo foo.c
  ~  du -sh foo
  16K foo
  ~  readelf -sW foo | grep 'GLIBC' | sort -h
       1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.34 (2)
       3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5 (3)
       6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.34
       6: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@GLIBC_2.2.5 (3)
       9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5
      22: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@GLIBC_2.2.5
  ~  ldd foo                                 
    linux-vdso.so.1 (0x00007ffc1cbac000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f9c3a849000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f9c3aa72000)


  ~  zig cc -target x86_64-linux-gnu.2.5 foo.c -o foo
  ~  du -sh foo
  8.0K  foo
  ~  readelf -sW foo | grep 'GLIBC' | sort -h        
       1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
       3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5 (2)
  ~  ldd foo                                 
    linux-vdso.so.1 (0x00007ffde2a76000)
    libc.so.6 => /usr/lib/libc.so.6 (0x0000718e94965000)
    /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x0000718e94b89000)

edit: I haven't built anything as complicated with zig as I have with the other C build systems, but so far it seems to have some legit quality-of-life improvements.


Interesting that zig does this. I wonder what the binaries miss out on by defaulting to such an old symbol version. That's part of the problem of course: finding that out requires reverse engineering the glibc source code.


Maybe just nitpicking but he _specified_ the target version for the zig compile.

(Haven't tested what it would link against were that not given)


> Maybe just nitpicking but he _specified_ the target version for the zig compile.

Right, but I was able to do it as a whole. I didn't have to do it per symbol.


Thanks for the extensive examples of "the mess" ...

I'd only like to add one thing here ... on static linking.

It's not a panacea. For non-local applications (network services), it may isolate you from compatibility issues, but only to a degree.

First, there are Linux syscalls with "version featuritis" - and by design. Meaning kernel 4.x may support a different feature set for a given syscall than 5.x or 6.x does. Nothing wrong with feature flags at all ... but a complication nonetheless. Dynamic linking against libc may take advantage of newer features of the host platform, whereas the statically linked binary may need recompilation.
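
A minimal sketch of what probing for that looks like in practice; renameat2() is just one example (the syscall appeared in kernel 3.15, and whether a given flag such as RENAME_NOREPLACE works also depends on the filesystem):

  /* noreplace.c - probe a per-syscall feature at run time and fall back. */
  #define _GNU_SOURCE
  #include <errno.h>
  #include <fcntl.h>      /* AT_FDCWD */
  #include <stdio.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  #ifndef RENAME_NOREPLACE
  #define RENAME_NOREPLACE (1 << 0)   /* normally from <linux/fs.h> */
  #endif

  /* Rename without clobbering an existing target, atomically if the
     kernel and filesystem can do it; otherwise fall back to a racy
     check-then-rename (good enough for a sketch). */
  static int rename_noreplace(const char *from, const char *to) {
      if (syscall(SYS_renameat2, AT_FDCWD, from, AT_FDCWD, to,
                  RENAME_NOREPLACE) == 0)
          return 0;
      if (errno != ENOSYS && errno != EINVAL)
          return -1;
      if (access(to, F_OK) == 0) { errno = EEXIST; return -1; }
      return rename(from, to);
  }

  int main(int argc, char **argv) {
      if (argc != 3) { fprintf(stderr, "usage: %s from to\n", argv[0]); return 2; }
      if (rename_noreplace(argv[1], argv[2]) != 0) { perror("rename"); return 1; }
      return 0;
  }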

Second, certain "features" of UNIX are not implemented by the kernel. The biggest one there is "everything names" - whether hostnames/DNS, users/groups, named services ... all that infra has "defined" UNIX interfaces (get...ent, get...name..., ...) yet the implementation is entirely userland. It's libc which ties this together - it makes sure that every app on a given host / in a given container gets the same name/ID mappings. This does not matter for networked applications which do not "have" (or "use") any host-local IDs, and whether the DNS lookup for that app and the rest of the system gives the same result is irrelevant if all-there-is is pid1 of the respective docker container / k8s pod. But it would affect applications that share host state. Heck, the kernel's NFS code _calls out to a userland helper_ for ID mapping because of this. Reimplement it from scratch ... and there is absolutely no way for your app and the system's view to be "identical". glibc's nss code is ... a true abyss.

Another such example (another "historical" wart) is timezones and localization. glibc abstracts these for you, but language-runtime reimplementations exist (like the C++20 date/timezone libraries) that may or may not use the same underlying state - and may or may not behave the same when statically compiled and the binary run on a different host.
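
A minimal sketch of the timezone case: glibc's localtime() consults $TZ and the host's zoneinfo database (/etc/localtime, /usr/share/zoneinfo) at run time, so the answer depends on host state that the binary does not carry with it:

  /* tz.c - the timezone answer comes from the host, not the binary. */
  #include <stdio.h>
  #include <time.h>

  int main(void) {
      time_t now = time(NULL);
      struct tm tm;
      localtime_r(&now, &tm);   /* reads $TZ / /etc/localtime via tzset() */

      char buf[64];
      strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S %Z%z", &tm);
      puts(buf);
      return 0;
  }

A runtime that bundles its own tzdata can print something different on the same host, and the same statically linked binary can print something different on another host.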

Static linking "solves" compatibility issues also only to a degree.


glibc is not stable on Linux. Syscalls are.


glibc is ABI-compatible in the forward direction: binaries built against an older glibc keep working on a newer one, not the other way around.


https://cdn.kernel.org/pub/software/libs/glibc/hjl/compat/

It provides backwards compatibility (via symbol versioning). That way behaviour can evolve while the old behaviour is retained for the binaries that need it.

I would agree it's possibly messy, especially if you're not willing or able to change your code when providing builds for newer distros. That said though ... ship the old builds. If all they need is libc, they'll be fine.

(the "dumpster fire" is really higher up the chain)


> Practically everything is built around the assumption that software will be distributed as source code

Yup, and I vendor a good number of dependencies and distribute source for this reason. That, and because distributing libs via package managers kinda stinks too; it's a lot of work. I'd rather my users just download a tarball from my website and build everything locally.


I don't think that users expect developers to maintain packages for every distro. I had to compile ffmpeg lately for a Debian installation and it went without a hitch. Yes, the average user is far away from compiling packages, but they're also far away from random distributions.


I think Flatpak is closer to .app bundles, so the argument is a little unfair.



