May 06

This post is meant to be an enlightening and entertaining explanation (or should I say, it shouldn't cure anyone's insomnia) of just how many things can go wrong with shared libraries if they are not handled correctly, regardless of platform.

Now I know this is an utter wall of text, but if you want to work with shared libraries you probably should read all of this. It'll get you up to speed on the theory of using them by preventing a repeat of my experiences, no war stories for you!

If you have inside knowledge of how shared libraries work, please expand upon this in the comments, perhaps we can get an article out of it for the site.

Some of the advice in this article may go against your previous experiences working with shared libraries. The recommendations here exist because the alternatives have proven to be problematic in a large portion of support requests over a two year period. If you understand what you are doing, you can of course disregard a particular piece, and you may want to expand upon or refine the information given here so that we can create a great overview of the subject for future programmers to learn from!

Latest copy can be found here.

Glossary

Before we begin to get into actual content we should probably cover some basic terms.

  • Binary (within a process, can be known as an image or module): An executable or shared library.
  • Static library: An archive containing one or more object files.
  • Shared library: A reusable and multi-loadable binary that typically does not contain an entry point function.
  • Out of binary: A symbol that does not exist in the current (compiling/linking) binary.
  • Visibility override switch: A compiler switch that changes the default symbol mode of symbols, unless stated otherwise.
  • DllImport override switch: A compiler switch that changes the default symbol mode for symbols that are external, unless stated otherwise.
  • Silo'd: a library that is unaware of other instances of itself (my own definition for the usage of this article).
  • Isolated: a library that is sandboxed so that no resources can cross into other code (my own definition for the usage of this article).

Table of Contents

  • Common Mistakes

    Not asking for help in understanding the theory behind shared libraries, linking and loading in general is going to lead to failure for your project. No matter how good you are with this stuff, help will be needed at some point.

  • Things That are Not Covered

    Not everything that can impact shared library usage in D has been described here. It is not a tutorial, but a reference for before you start using them.

  • Is a Dynamic Link Library a Shared Library?

    Yes, but they make it easy to think otherwise!

  • Import Libraries are Special Yes?

    There is nothing special about import libraries, don't export global variables, oh and you should probably just link against a DLL dynamically!

  • Symbol Modes Make Ya Go Mad!

    When dealing with shared libraries there are three modes a symbol can be in: Internal, DllImport and DllExport. Setting these up right is the core problem that results in both linkage failures and runtime errors.

    • Not Everything Should Be Exported

      Just because something can be exported, doesn't mean it should be, e.g. TLS.

  • Symbol, What Symbol?

    Current language is not very helpful with any generated symbols and this can lead to program corruption.

  • Knowing When to DllImport

    Current solutions are too broad, inconsistent and will outright result in linker errors without any compiler assistance. They outright prevent intermediary usage of static libraries and object files without issues arising.

  • Why Not Intermediary Static Libraries?

    A static library does not fully get included, eliding FTW! Use object files for intermediaries rather than static libraries for anything that gets exported.

  • It is Loaded, Works Yes?

    Just because it linked, doesn't mean it'll load even with the right dependencies, and the behavior of loaders is not consistent between platforms.

  • Unloading

    To keep your sanity, don't unload a shared library unless your process is dying.

  • Initializing Your Shared Library

    A shared library that allows you to borrow resources it owns, and borrows from another is full of failure modes that may not be avoidable.

    • TLS Hooking

      Only Windows offers hooking of threads, which supports zero or more DllMain's and for druntime should be automatically injected.

    • Scenario: Your Own Memory Allocator

      The order of deinitialization can matter between sibling shared libraries, if you can avoid letting a sibling shared library borrow resources from you, you should avoid it.

    • Scenario: Your Own Threads

      If you're going to do your own threads, don't forget to register them with druntime and handle cyclic registration to and from.

  • Where Is Thy Runtime?

    Did you follow my advice in Unloading, no? Well good luck with that. If you have a runtime loaded don't have duplicates of it, stick to a single shared library build of it.

  • Who Needs a Scope Anyway?

    Go ahead be smart! Don't use shared libraries or static libraries, go import only! See how quickly you kill off that scope that depends on having state.

Common Mistakes

TLDR: Not asking for help in understanding the theory behind shared libraries, linking and loading in general is going to lead to failure for your project. No matter how good you are with this stuff, help will be needed at some point.

I would write a lot more here, but currently the language and the tooling simply do not help you get what you need sent to the linker.

  • You cannot tell the compiler that a module is not in your binary. See: Knowing When to DllImport. My DIP fixes this.
  • You cannot tell the compiler that something private actually needs to be exported and have it work correctly (export is currently a visibility modifier). See Átila Neves's DConf 2023 talk, You're Writing D Wrong, for why it is very worrying that we cannot currently do this. This is something my DIP resolves.
  • If you are able to tell the compiler that a type needs to be exported, it will not export the things it generates, so it will not work anyway. See: Symbol, What Symbol?. Another thing my DIP fixes.
  • If it does work, it's going to cause silent program corruption. See: Symbol, What Symbol?.

In general if you're going to work with shared libraries, you will likely run into situations where you need help. Buying, reading and learning from Linkers & Loaders is not going to be enough to get you to a successful outcome.

Things That are Not Covered

TLDR: Not everything that can impact shared library usage in D has been described here. It is not a tutorial, but a reference for before you start using them.

  • No D code with build file examples
  • Exceptions
  • Template instantiations that cross the shared library boundary

Is a Dynamic Link Library a Shared Library?

TLDR: Yes, but they make it easy to think otherwise!

So let's start with something simple: "a Dynamic Link Library (DLL) is not a shared library". This is not an accurate statement, as a DLL fills the same role on Windows that a shared library does on non-Windows systems. The claim comes up in a few places, such as Windows System Programming 3rd edition pg. 150, the documentation for GetFullPathNameA, and an answer on Stack Overflow.

The shared library model is notable because of the reusable nature of a binary that the OS loader can merge into your process. Either during initial load started by the kernel or during execution of your program at your request.

Of note is that each binary that makes up a process (executable or shared library) is not isolated. They are merged. Once merged, the only thing preventing exposure of one to another is the symbol table kept for each binary, which is used for patching.

In another section Where Is Thy Runtime I describe a library that is silo'd, this just means it does not know about other things in the process. Isolation on the other hand would refer to sandboxing which as far as I am aware no OS does.

Okay so how is that entertaining? Great question! Due to the indirection introduced by DLLs it can appear that they are in fact isolated, which can lead to some quite interesting moments!

Import Libraries are Special Yes?

TLDR: There is nothing special about import libraries, don't export global variables, oh and you should probably just link against a DLL dynamically!

Whenever you link a binary you may have noted a corresponding file has been created along with it. This is an import library, it was generated by the linker when it saw that you exported something. These are quite informational, they tell you what symbols were exported, but more importantly they tell a future linker invocation about them too!

Not all platforms use these files; others such as Linux rely solely on what is in the binary to provide this information. Windows uses both import libraries and information in the shared library to map symbols, which works great for a commercially concerned OS!

So what are import libraries? Some custom format or other horrendous thing to never learn about?

No! In fact they are just regular static libraries! If you can emit a static library you can probably create your own without much work.

The two main things of interest that they contain are the extern symbols that have _imp prefixed to their name, and wrappers around these symbols that do a simple jump (or similar) to what is pointed at: jmp [_imp_symbol];. These wrapper symbols are generated with the original symbol name (without the _imp).

Those generated wrappers are why the druntime bindings to WinAPI currently work, without DllImport support being cleanly defined and in active use by the language!

This has another interesting tidbit: you should only have the ability to export functions, not global variables. You can see this in Microsoft's libc, where what looks like a global is a macro wrapping a function call.

What is great about this is in practice there is no difference between linking against a shared library statically (using linker) or loading dynamically (using loader yourself). Either way you're dealing with an indirection of using a global pointer!

So if you're ever asking yourself if you should statically or dynamically link against a shared library on Windows, you should probably link dynamically unless you're distributing the end binary as it makes no difference when using a symbol.
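To illustrate the dynamic side of that comparison, here is a minimal sketch that resolves a function at runtime and calls it through a pointer, next to the directly linked call. The choice of GetCurrentProcessId (Windows) and getpid (POSIX) is only an assumption to keep the example self-contained; any exported function would do, and the helper names lookup and direct are hypothetical.

```d
version (Windows)
{
    import core.sys.windows.windows : GetCurrentProcessId, GetProcAddress, LoadLibraryA;

    alias Fn = extern(Windows) uint function() nothrow @nogc;

    // Directly linked call (resolved through the import library).
    uint direct() { return GetCurrentProcessId(); }

    // Dynamically resolved address of the same function.
    void* lookup()
    {
        auto lib = LoadLibraryA("kernel32.dll");
        return lib is null ? null : cast(void*) GetProcAddress(lib, "GetCurrentProcessId");
    }
}
else version (Posix)
{
    import core.sys.posix.dlfcn : dlopen, dlsym, RTLD_LAZY;
    import core.sys.posix.unistd : getpid;

    alias Fn = extern(C) int function() nothrow @nogc;

    // Directly linked call.
    int direct() { return getpid(); }

    // Dynamically resolved address of the same function.
    // dlopen(null, ...) gives a handle covering the already loaded binaries.
    void* lookup()
    {
        auto self = dlopen(null, RTLD_LAZY);
        return self is null ? null : dlsym(self, "getpid");
    }
}

void main()
{
    // Either way the call ends up going through a pointer to the real function.
    auto fn = cast(Fn) lookup();
    assert(fn !is null);
    assert(fn() == direct());
}
```

The library is deliberately never unloaded here, in line with the advice in Unloading.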

Symbol Modes Make Ya Go Mad!

TLDR: When dealing with shared libraries there are three modes a symbol can be in: Internal, DllImport and DllExport. Setting these up right is the core problem that results in both linkage failures and runtime errors.

In the traditionally applied (POSIX) shared library model, the only symbol modes relevant to discussion are internal versus external. An external symbol is one not defined in a given binary, and internal is found within. However just because a symbol is internal does not mean it has its symbol name known or accessible to other binaries to link against.

Along came Windows DLLs, and we no longer use the internal versus external terminology with shared libraries. It is still relevant to object files, and it is how linkers and loaders still operate at the lowest level, even if we are no longer operating solely within it. Now we use Internal, DllImport and DllExport regardless of the platform.

  • An Internal symbol is a symbol that is found in a binary that is not directly accessible by name externally to that binary.
  • A DllImport symbol is a symbol that is not found in the current binary and is external to it. For Windows specifically this refers to the symbol having indirection via a global pointer to the internal symbol. See _imp prefixed symbols in import libraries heading above.
  • A DllExport symbol is an internal symbol that has an exportation linker flag applied to it. Traditionally this will expose the symbol name for the symbol. For Windows it will hide the internal symbol and instead expose a new global variable which is a pointer, using the name with the prefix _imp, that points to the internal symbol.

Each platform has its own tunings to the shared library model. OSX and Linux may both be POSIX, but they each have their own behaviors that are not necessarily POSIX compliant.

LLVM has some explanations for these modes: one for internal, and one for DllImport/DllExport. It supports many others, although they are not relevant to this document.

Symbol modes are the heart and soul of the majority of issues relating to shared library support in the language. Most specifically: what should be exported automatically, and when do we apply DllImport instead of Internal?
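As a small sketch of how two of these modes are expressed in D today, assuming a hypothetical module named mylib: a function without export stays Internal when the build hides non-exported symbols (on some platforms its name remains visible by default), while export puts a symbol into DllExport mode. DllImport is left out because it needs a second binary to link against.

```d
module mylib;

// No `export`: Internal in this article's terms, provided the build hides
// non-exported symbols. Other binaries are not meant to link against it.
private int helper()
{
    return 41;
}

// `export`: DllExport. The symbol is exported from the binary being built.
export int answer()
{
    return helper() + 1;
}

void main()
{
    // Inside the defining binary both are called directly.
    assert(answer() == 42);
}
```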

Not Everything Should Be Exported

TLDR: Just because something can be exported, doesn't mean it should be, i.e. TLS.

The vast majority of errors with symbols that are user written (not compiler generated) are due to the symbol modes DllImport and Internal being mixed up. But sometimes DllExport can cause issues for both generated and user written symbols.

According to Ulrich Drepper and at least one other Stack Overflow user, C constructors/destructors on Linux do not need to be exported.
Since they are not required to be exported, exporting them unnecessarily can only invite problems. See the bug ticket to track disallowing exportation of functions marked as such.

Alternatively another set of issues can be seen with generated symbols such as ModuleInfo or TypeInfo. By not exporting ModuleInfo and assuming it is available the compiler introduces a hidden dependency on a generated symbol that may not exist.

This is a bit of a problem with shared libraries. Especially when a D file could actually be a binding to a C library (like Deimos). See these two tracking issues for ModuleInfo exportation problems: Export ModuleInfo and Remove dependency.

Unfortunately the removal of the dependency can only work correctly if you know that the module is out of binary or you end up with fun situations where a dependency module does not initialize before you try to access it.

See Why Not Intermediary Static Libraries? for an explanation on why a static library should not contain exports.

Thread local variables (TLS) and Fiber local variables (FLS) are examples of specialty global variables that should never be exported. The scheme used for each depends on the platform and can change over time (Android has recently changed its TLS scheme for instance).

The global itself could be a key into some sort of map that the operating system provides, or it could be emulated into existence by the toolchain. The creation of the key into the map may be done by user code, as is done with pthread and with Win32, which has explicit mention that the handle may not cross the DLL boundary.

Instead of exporting a TLS variable you can wrap the access to the storage pointer by a function that returns it. This should be done automatically by the compiler or disallowed.
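A minimal sketch of that wrapping, done by hand: the thread-local variable stays private to the binary and only an accessor function is exported. The names counter and counterStorage are hypothetical.

```d
import core.thread : Thread;

// Module-level variables in D are thread-local by default.
private int counter;

// Export an accessor instead of the TLS variable itself. Each calling
// thread gets a pointer to its own copy of the storage.
export int* counterStorage() nothrow @nogc
{
    return &counter;
}

void main()
{
    *counterStorage() = 1;

    int seenByOtherThread = -1;
    auto t = new Thread({
        // A new thread starts with its own default-initialized copy.
        seenByOtherThread = *counterStorage();
    });
    t.start();
    t.join();

    assert(*counterStorage() == 1);
    assert(seenByOtherThread == 0);
}
```

The pointer returned is only meaningful on the thread that asked for it, so callers should not cache it across threads.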

Symbol, What Symbol?

TLDR: Current language is not very helpful with any generated symbols and this can lead to program corruption.

So you've got yourself a fancy pants type and you've done everything right. You exported all the symbols that don't get exported automatically (that the compiler is supposed to be exporting for you, since you can't in the language), annotated the type and its methods with export, but... you get a segfault when you use it. What would you do?

I have had to deal with this very situation before multiple times when the D code looks like this:

MyType var;
var = MyType(...);

It looks like it should be working fine! The segfault isn't even in this function!!! How is this code buggy? Well you won't believe this... but that variable initialization, didn't initialize.

See dmd is rather "helpful" even though it didn't know that the .init symbol is in DllImport mode rather than Internal, and because of the way the codegen works it still linked and didn't cause any memory corruption!

So when the copy from the .init symbol to the stack occurs it sees a zero length, and it thinks I'm done! Wahoo, I did the thing. Except it didn't do the thing. In fact it did zero of the things it was meant to do.

What you end up with is a variable containing junk leftover stack data, which can be pretty much anything. This shows up very easily when you are dealing with library based reference counting, due to the atomic alignment check. Not a fun time to be had.

This shows us how important it is to export symbols generated from a type automatically when other symbols have been explicitly exported. D has a lot of housekeeping symbols that get generated, including opCmp! All of these must be handled for you, or it hasn't got a chance to work and there will be a lot of distractions requiring a significant amount of debugging to resolve.
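For reference, this is what the declaration is supposed to do when the .init symbol is reachable. It is a sketch in a single binary, with hypothetical fields on MyType; it does not reproduce the cross-binary failure, it only shows the copy that was skipped.

```d
struct MyType
{
    // Non-zero defaults, so the compiler has to copy them from MyType.init.
    int refCount = 1;
    double scale = 2.5;
}

void main()
{
    // Default initialization copies the contents of MyType.init onto the stack.
    MyType var;
    assert(var == MyType.init);
    assert(var.refCount == 1);
    assert(var.scale == 2.5);
}
```

If that copy sees a zero length, none of these defaults arrive and the fields hold whatever was on the stack before.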

Knowing When to DllImport

TLDR: Current solutions are too broad, inconsistent and will outright result in linker errors without any compiler assistance. They outright prevent intermediary usage of static libraries and object files without issues arising.

So we've so far covered how the compiler needs to assist with exportation automatically and that you must have a way to put a symbol into DllExport mode, but we still have to cover DllImport, and what the compiler can do to assist you.

Nothing. It cannot help you. It will get it wrong, things will not link.

So it is fully on you to put symbols into DllImport mode, and that right there is the giant problem, how do you do this?

Well you can start with the dllimport override switch that ldc has introduced. But you are limited to either system libraries like druntime and phobos, or every shared library. There is no finer grained solution as part of CLI switches currently.

If you do it in code, now suddenly you have to maintain both an interface file and the source file. Oh did I mention that the compiler can't help here either? Yeah... the D interface generator has no knowledge of if you want the resulting file to be used for a static library or shared library. Even if it was going to work, it isn't going to work for you today.

So you have got to annotate per symbol that it is in DllImport mode. In my DIP for exportation I changed this to have the consistent syntax of export with extern and this applies to all symbols.

Still this isn't a good enough situation. It doesn't help build managers and it certainly is a major pain. Obviously nobody is going to do this manually if they have a choice.

While it is great to have a fine grained solution (including conditionally) for setting DllImport mode, this shouldn't be your primary way of setting up the symbol modes.

There is an alternative that works great as a story for both build managers and for people who don't know anything about why it exists!

The external import path switch, -extI, is a switch I have proposed that is similar to -I. If you understand the import path switch, you can understand that the external import switch is just for modules found in a shared library. Easy swap!

From a compiler perspective, it knows that any module found from an external import switch is in another binary, and that any module found from the import switch is in the currently compiling binary!

This enables it to switch any found DllExport symbols to DllImport without any action on each symbol by the programmer. How wonderful!

But what if we didn't annotate with export and instead used the visibility override switch to set exportation? Well, use the dllimport override switch to apply to all symbols found from an external module. Great, more compiler assistance with minimal changes!

But why not just use the override switches? Isn't this good enough? No, no it is not. It's too broad.

Without the ability to pick which modules are out of binary versus being linked into the current binary, you get linker warnings. They exist because you are outright doing the wrong thing by adding extra indirection (which may not have been enabled, due to a missing or different setting of the visibility override switch).
This has the unfortunate casualty of ruling out static library or object file intermediaries that do not cause problems.

Why Not Intermediary Static Libraries?

TLDR: A static library does not fully get included, eliding FTW! Use object files for intermediaries rather than static libraries for anything that gets exported.

So you've been a good programmer, split up your code base so that there are intermediary compilation steps to enable faster rebuild times and proper scoping of project work. Nothing could go wrong with that when it comes to shared libraries right? Right???

Oh how naive you are! There is so much wrong with this that you're going to rethink everything you have ever done.

So by default linkers don't just include a static library whole; they only include an object file that it contains if something references it. Great for when you are building executables, not so great when you are constructing a shared library from static libraries containing exports that do not get pulled in by anything.

Unfortunately, while there is a way to force it, you need to know the static library's name and it can be a bit buggy depending on the linker in question. The only reasonable solution to this is to use object files, which do not get elided.

According to Adam Wilson, the recommendation from Microsoft internally is to not export from static libraries and this makes sense given the above issues. So while you can use a static library to contribute towards your shared library, it should not be providing any exported symbols.

This is problematic with dub, as it does not support object files currently. See this ticket for a potential redesign of how dub works with target types.

You should also be aware that with both of the override switches (visibility and dllimport) you will not have fine grained control over exports in a static library versus object files in dub today based upon the (sub)package. There are multiple things that will need to be done to enable people to avoid running afoul of these recommendations whilst still enabling full control.

To further complicate matters, if you want to fully isolate a static library neither dub nor the compiler can assist you (by using the .di generator). This will require further research to enable this advice of not exporting from static libraries to be automatically applied with minimal intervention by the programmer.

It is Loaded, Works Yes?

TLDR: Just because it linked, doesn't mean it'll load even with the right dependencies, and the behavior of loaders is not consistent between platforms.

So you have successfully compiled and linked. Symbols that were supposed to be exported were, and those that weren't weren't. So it will work now yes? YES?

NOPE. We are not done yet.

Now we gotta talk about loading of shared libraries and ensuring their state is valid.

But where does a loader look for a shared library to load? The first place is system directories, which of course depends upon your system configuration.

For POSIX systems it uses some environment variables to determine auxiliary locations. It also looks in a special string within a binary (executable and shared libraries) called RPATH, however keep in mind this will carry with the binary no matter where it's called or by whom.

On Windows and OSX it'll look in the current working directory by default too, not just system directories or the PATH variable.

Windows does support some customization for the usage of launchers, which allows some additional paths to be set up at runtime.

So much variety in behavior of the system loader, how do we ensure we have a consistent behavior that "just works" with our build managers? Outside of the build manager we really can't do a whole lot.

But what we can do is unify upon placing them into the same directory as the executable and then letting the build manager use the appropriate environment variables to set up the lookup paths to point to it. If all you want to do is run your program, that is great.
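When you load a library yourself rather than relying on the loader's search order, the same convention can be followed explicitly. The following is a minimal sketch that builds the path of a library sitting beside the executable; the base name mylib and both helper names are hypothetical, and the file naming rules are the common defaults rather than anything a given build manager guarantees.

```d
import std.file : thisExePath;
import std.path : buildPath, dirName;
import std.stdio : writeln;

// Platform specific file name for a shared library with the given base name.
string sharedLibraryFileName(string baseName)
{
    version (Windows)
        return baseName ~ ".dll";
    else version (OSX)
        return "lib" ~ baseName ~ ".dylib";
    else
        return "lib" ~ baseName ~ ".so";
}

// Full path of the library if it sits in the same directory as the executable.
string besideExecutable(string baseName)
{
    return buildPath(dirName(thisExePath()), sharedLibraryFileName(baseName));
}

void main()
{
    // The resulting path could then be handed to the loader,
    // e.g. Runtime.loadLibrary from core.runtime.
    writeln(besideExecutable("mylib"));
}
```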

I have a PR to add this capability to dub, which has been a tad contentious for those who are not me or Martin.

Of course all of this assumes you have all the dependencies set up with no conflicts in place (such as versioning). If you don't, you're going to need a tool like Dependencies to figure this one out the hard way.

Unloading

TLDR: To keep your sanity, don't unload a shared library unless your process is dying.

Remember when I said shared libraries are not isolated (sandboxed)? Yeah that. That is a bit of a problem...

If you unload a shared library you are putting your process into a state where you cannot determine whether it has been corrupted. For this reason I would not recommend unloading a shared library except in one rather particular case.

If you can guarantee that a given shared library has not, during its existence, been sharing its resources and you have not been taking any pointers into it, you may unload it.

To work around this limitation of no sharing of resources, you can use handles, as long as they are not the integral representation of a pointer. To convert a handle to a pointer internally, use a data structure that maps one to the other. A much slower approach, but safer if you need to do unloading.
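As a minimal sketch of such a mapping, assuming a hypothetical HandleTable type that lives inside the shared library: callers only ever hold an opaque integer, and a stale handle resolves to null instead of a dangling pointer. It is not thread safe as written.

```d
// Maps opaque integer handles to pointers that never leave the library.
struct HandleTable(T)
{
    private T*[size_t] live;  // handle -> pointer
    private size_t next = 1;  // 0 is reserved to mean "no handle"

    // Hand out a fresh handle for a pointer owned by the library.
    size_t add(T* ptr)
    {
        immutable handle = next++;
        live[handle] = ptr;
        return handle;
    }

    // Look a handle up; returns null if it was never issued or was released.
    T* resolve(size_t handle)
    {
        auto found = handle in live;
        return found is null ? null : *found;
    }

    // Forget a handle; returns true if it was live.
    bool release(size_t handle)
    {
        return live.remove(handle);
    }
}

void main()
{
    HandleTable!int table;
    int value = 42;

    immutable handle = table.add(&value);
    assert(*table.resolve(handle) == 42);

    assert(table.release(handle));
    assert(table.resolve(handle) is null); // stale handles fail safely
}
```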

The simplest solution to all of this, which is what I would recommend, is to simply keep a shared library loaded but detach it internally. So if you mess up you are not risking a program crash. Just don't subvert your API that controls attachment and it should work safely.

This approach takes care of both read only memory (functions, globals, constant literals) as well as heap allocated memory.

Initializing Your Shared Library

TLDR: A shared library that allows you to borrow resources it owns, and borrows from another is full of failure modes that may not be avoidable.

All platforms worth mentioning here support some method to run initializers and deinitializers in your shared library after load and before unload with priorities. In D this can be hooked using the pragma(crt_constructor) and pragma(crt_destructor). However we do not support priorities.
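A minimal sketch of the two pragmas, shown in an executable for brevity (in a shared library they would run after load and before unload). The function names are hypothetical. Both functions run outside of druntime's lifetime, so they stick to C facilities.

```d
import core.stdc.stdio : printf;

// A plain global that can be touched without druntime being initialized.
__gshared int initCount;

// Runs once the C runtime is ready, before druntime is initialized.
// Must be extern(C), take no parameters and not rely on the GC.
pragma(crt_constructor)
extern(C) void myLibInit()
{
    initCount += 1;
}

// Runs on the way out, after druntime has been terminated.
pragma(crt_destructor)
extern(C) void myLibFini()
{
    printf("myLibFini ran, initCount = %d\n", initCount);
}

void main()
{
    // By the time main runs the constructor has already fired once.
    assert(initCount == 1);
}
```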

Windows has some additional support of initialization callbacks via the DllMain function, however this will be covered in the sub heading TLS Hooking.

When a shared library is designed to work in isolation and not take ownership of any resource it did not create for its own internal use, there should be minimal concerns surrounding its initialization and deinitialization, as long as they were never exposed to other code, nor other code exposed to it.
See my prior point in Unloading regarding handles.

On the other hand when you have a shared library similar to druntime that:

  • Does not define its own initialization/deinitialization functions that are automatically run (you must explicitly run them).
  • Owns threads that you can request and borrow, and on which it sets up its own internal state.
  • Can be informed of threads you own, but does not allow you to add its internal state onto it (not necessarily required but there is no function that you are supposed to call to make it happen).
  • Owns memory (GC) that you can borrow at your request.
  • Borrows memory that it scans for GC memory.
  • Runs other people's code (module (de)constructors, unittests, destructors) at potentially indeterminate times.

Every single one of these things could be the cause of your program's corruption. Best case scenario is a segfault, but silent program corruption is just as possible.

TLS Hooking

TLDR: Only Windows offers hooking of threads, which supports zero or more DllMain's and for druntime should be automatically injected.

Having knowledge of when a thread is created or destroyed is quite useful if your goal is to register threads with a shared library, or to construct or destroy your state.

Windows has this capacity in the form of a function called DllMain. This maps into a section inside of the PE-COFF binary for TLS callback functions, and enables a compiler to provide as many hook functions as desired for load/unload of binaries as well as for creation and destruction of threads.

This leads to a concern about the existence of a mixin template in druntime called SimpleDllMain. When druntime is built as a shared library on Windows, it'll automatically be included. However if you build a shared library that has druntime as a static library this will not be handled for you, and it could be, without using the DllMain function up.
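For that second case, this is roughly what you have to write yourself today. It is a sketch of a library source file (no main, meant to be built as a DLL with druntime linked in statically); the module name mylib and the exported function are hypothetical.

```d
module mylib;

version (Windows)
{
    import core.sys.windows.dll : SimpleDllMain;

    // Generates a DllMain that initializes and terminates druntime on
    // process attach/detach and handles thread attach/detach.
    // This uses up the one DllMain the DLL can have.
    mixin SimpleDllMain;
}

// Something for the host process to call once the DLL is loaded.
export extern(C) int mylibAnswer()
{
    return 42;
}
```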

If we offered a pragma to set a function as a TLS callback function, we could let druntime have its own and remove the need for SimpleDllMain entirely.

Although in the above I say only Windows supports it, in recent years C++ has introduced thread local variables and with that destructor support. This might be hookable, although this would not solve the on thread creation hook and for that reason it should be considered Windows only for the time being.

Scenario: Your Own Memory Allocator

TLDR: The order of deinitialization can matter between sibling shared libraries, if you can avoid letting a sibling shared library borrow resources from you, you should avoid it.

Scenario: you have a shared library sitting side by side as a sibling to druntime, that has been told that druntime exists via registration (see dub's injectSourceFiles as a way to do this automatically) and you have your own memory allocator.

You want to tell the GC about any memory you allocate, because of course somebody might want to put GC memory into it and you don't want to let it get freed.

So you tell the GC all about it by adding it as a range, no problem right? You're being a good person! And you would be rather mistaken when it comes time to do unloading...

See, it is totally possible that your shared library gets deinitialized after druntime does. And of course when you deinitialize, you gotta tell druntime to remove those ranges! This is one way to get a crash deep inside of druntime's GC without a way of knowing why.

Please do not ask me how I know about this, it wasn't a fun time to debug this one.

A workaround to this is to make an additional initialization and deinitialization call to druntime. The initialization call increases the counter internally, so druntime stays alive until you make your deinitialization call, at which point it can die properly. This makes it so all your state in it is gone, and all its state about you is also gone.

Note: this works with the C constructor/destructor, so this is running outside of the user start function.
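A minimal sketch of that pairing, assuming a hypothetical allocator with a single pool. The attach and detach functions are called from main here so that the example runs on its own; in a shared library they would be called from the C constructor and destructor instead. rt_init and rt_term are the reference counted C entry points declared in core.runtime.

```d
import core.memory : GC;
import core.runtime : rt_init, rt_term;
import core.stdc.stdlib : free, malloc;

enum poolSize = 4096;
__gshared void* pool; // memory owned by our allocator, not by the GC

extern(C) void myAllocatorAttach()
{
    // Take our own reference on druntime so it cannot shut down before us.
    rt_init();

    pool = malloc(poolSize);
    assert(pool !is null, "malloc failed");

    // Let the GC scan our pool for pointers to GC memory.
    GC.addRange(pool, poolSize);
}

extern(C) void myAllocatorDetach()
{
    // druntime is guaranteed to still be alive here because of our reference.
    GC.removeRange(pool);
    free(pool);
    pool = null;

    // Give the reference back; the last rt_term call does the real shutdown.
    rt_term();
}

void main()
{
    myAllocatorAttach();
    assert(pool !is null);
    myAllocatorDetach();
    assert(pool is null);
}
```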

Scenario: Your Own Threads

TLDR: If you're going to do your own threads, don't forget to register them with druntime and handle cyclic registration to and from.

So you have decided to create your own thread abstraction, you wrote it and it worked first time, well done! And now you have gotten a user to try it; the program crashed once run. The horror!

Out of pure curiosity did you register the thread and then run the thread initialization code for module constructors and TLS? Yes? Why of course you didn't, you didn't even know that druntime was loaded in process. See Scenario: Your Own Memory Allocator section for more information on registering druntime.

Okay now that you have done it and it runs, great job!

So tell me, has druntime registered its threads with you also? No? Curious, that you wanted to build a thread abstraction library but you only cared enough to write the code regarding the threads that you wanted. Still at least no other threads are interacting with your code. What? That isn't the case? Oh no...

Okay so the needful has been done, you have a module constructor and destructor that informs you of thread creation and destruction by druntime. Super. But why are you getting stack overflows now?

See you did the most intelligent thing possible, you registered your thread with druntime, and druntime registered its thread with your abstraction. Isn't that how it's meant to be? Why yes, yes it is meant to be like that. Except you created a bit of a loop there...

After all that work, now it starts to work without failures, assuming of course you didn't mess up an implementation detail someplace like I did. It's always fun to have to debug code where an object gets deallocated and the same pointer gets allocated for the same thing and you wonder why the state keeps changing on you!
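One way to break that loop is a per-thread guard that tells your callback "I started this registration myself". The sketch below is a C model only: `rt_attach_thread` stands in for the runtime's registration, `lib_on_rt_attach` for the notification it sends back, and all names and counters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static int lib_known_threads = 0;  /* threads the library has recorded */
static int rt_known_threads = 0;   /* threads the runtime model has recorded */

/* True while this thread is being registered by the library itself. */
static _Thread_local bool registering = false;

/* Callback from the runtime: "a thread has been attached". */
void lib_on_rt_attach(void) {
    if (registering)
        return;           /* we started this registration; do not record twice */
    lib_known_threads++;  /* a thread created by the runtime, new to us */
}

/* Hypothetical runtime side: records the thread, then notifies the library. */
void rt_attach_thread(void) {
    rt_known_threads++;
    lib_on_rt_attach();
}

/* Library side: called on a thread that the library created. */
void lib_register_own_thread(void) {
    assert(!registering);  /* must not be re-entered on the same thread */
    registering = true;
    lib_known_threads++;
    rt_attach_thread();    /* triggers lib_on_rt_attach, which the guard skips */
    registering = false;
}
```

With the guard in place each side records a thread exactly once, whichever side created it. Without it, a callback that forwards to the library's own registration routine would bounce between the two until the stack runs out.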

Where Is Thy Runtime?

TLDR: Did you follow my advice in Unloading, no? Well good luck with that. If you have a runtime loaded don't have duplicates of it, stick to a single shared library build of it.

I tried... I really did, I spent an entire day trying to write this section. Fact is what this section was meant to talk about is when multiple copies of a runtime are loaded into a process with no knowledge of each other.

If the rule that a shared library's owned resources never cross the boundary into other people's code were followed, as I recommended in Unloading, then this section wouldn't matter. But of course nobody does that, see SDL, SQLite or should I say pretty much EVERY C LIBRARY IN ACTIVE USE. Oh and for anyone in doubt, how about that COM eh? Ya know the C++ based remote process communication, that uses heap allocated classes that underpins a pretty significant portion of the Windows shell and Microsoft products' extension capabilities.

Okay rant over, hopefully everyone who has made it this far can see that there is a risk here that I am trying to educate about.

So you have a library, a runtime of sorts. Let's call it druntime. This runtime owns and loans out memory, and has callbacks registered into it (destructors, module destructors etc.) as well as memory registered into it (ModuleInfo, TypeInfo). Not only that but it also has system resources such as locks and threads that it owns and loans out to other code. Sometimes it even knows about system resources that other code has created such as threads!

So this "druntime", you build it as a shared library and you have multiple binaries depending upon it loaded into your process. You load and unload, register and unregister all correctly. No segfaults happen on start up and shutdown. Good job, I'm sure that you have followed all of my advice that I have detailed in the other sections of the article.

Alternatively you could have built this "druntime" into an executable or shared library and you end up having a mix leading you to have multiple copies loaded into your process. Only they know nothing of each other. This is unfortunately a very real possibility, after all where will you register your runtime into?

Which one do you think is going to cause problems at indeterminate points in time?

The second of course! Okay I lie it could be either but the second one is almost guaranteed to result in problems that are impossible to debug for the novice.

Problem is each "druntime" is silo'd: it has no knowledge of the other, nor the ability to communicate with it. But let's say you did have the ability to communicate, which is a rather big if. Have you really got all the state ready to be communicated between them? What happens when it is time to unload? Different versions bring size mismatches, behavior changes, changed fields etc. This of course doesn't answer questions like whose memory allocator do you use from that point on, who ends up owning threads, and how do you detect ROM that will no longer exist (i.e. TypeInfo). You are just asking for trouble trying to merge them.

In Is a Dynamic Link Library a Shared Library? I explain the difference between a library that has been silo'd versus isolated. Where the latter is sandboxed and the former is merely ignorant of what else is in the process.

So you should accept that they are silo'd, because anything else is a development nightmare even if you have been successful in aggregating state so that it can be passed back and forth. Now the question has become, have you crossed resources (even if it was done accidentally) that are owned from one "druntime" to another "druntime" instance? Of course you did, because who wouldn't? It's not like there is any protection from doing it. Go ahead propose exploding the number of pointer types... See where that gets ya.

You put one bit of memory into another bit of memory with each being owned by a different GC, which of course doesn't know about the other. Naturally the memory that went into the other has no other references and its GC has gone ahead and collected it. Not long after that you accessed it, oh hey segfault! What did you expect? This is too easy to do by accident.
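The failure is easy to reproduce in a toy model. The sketch below is not how druntime's GC is implemented; it assumes two hypothetical `Runtime` instances, each with a tiny heap, a single root, and a mark-and-sweep that only follows pointers into memory it owns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 4

typedef struct Block {
    bool live;          /* still allocated? */
    bool marked;        /* reached during the last collection? */
    struct Block *ref;  /* one outgoing pointer, for brevity */
} Block;

typedef struct {
    Block blocks[MAX_BLOCKS];  /* memory owned by this runtime instance */
    Block *root;               /* the only root this instance scans */
} Runtime;

Block *rt_alloc(Runtime *rt) {
    assert(rt != NULL);
    for (size_t i = 0; i < MAX_BLOCKS; i++) {
        if (!rt->blocks[i].live) {
            rt->blocks[i] = (Block){ .live = true };
            return &rt->blocks[i];
        }
    }
    return NULL;  /* heap exhausted */
}

/* Does this block live in this runtime's heap? */
static bool owns(const Runtime *rt, const Block *b) {
    for (size_t i = 0; i < MAX_BLOCKS; i++)
        if (&rt->blocks[i] == b) return true;
    return false;
}

/* Mark from the root, but stop at any pointer that leaves our own heap. */
void rt_collect(Runtime *rt) {
    assert(rt != NULL);
    for (size_t i = 0; i < MAX_BLOCKS; i++)
        rt->blocks[i].marked = false;
    for (Block *b = rt->root; b && owns(rt, b) && !b->marked; b = b->ref)
        b->marked = true;
    for (size_t i = 0; i < MAX_BLOCKS; i++)
        if (!rt->blocks[i].marked) rt->blocks[i].live = false;  /* sweep */
}
```

Allocate a holder from runtime A and root it there, allocate a payload from runtime B, then store the payload pointer in the holder. B has no root for the payload, so `rt_collect` on B frees it while A still holds the pointer: the holder now dangles, and neither instance can tell.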

If you are going to have a runtime that has resources it owns exposed to other code (RAM, handles such as a thread or lock) don't duplicate that runtime. You are asking for trouble. Use a shared library for this, not a mix of static libraries with shared library builds of it.

Who Needs a Scope Anyway?

TLDR: Go ahead be smart! Don't use shared libraries or static libraries, go import only! See how quickly you kill off that scope that depends on having state.

So you wanna be smart, you think that your project having any binary is just a big ball of problems, so you're going import only! Well aren't you clever!

Just to clarify some things first:

  • Does it have any state? Threads, locks, globals, inter-thread communication?
  • Does it need any giant lookup tables, that should be in read only memory and shared throughout a process?
  • Will there be any symbols that cannot be templated? Or should I have said will be a right pain to use if it were templated?
  • Are you linking against a non-D library?

If you answered no to all of these questions, well congratulations you can go import only!

What? You didn't answer no to all of these questions? What are you trying to build, a whole new standard library or something?

Limiting yourself to import only requires you to limit your scope. Goodbye event loops, windowing, anything asynchronous. While you can do these things, you will be limiting yourself severely enough that your code will not look familiar to others. So it's up to you: listen to my advice and use a shared library so the state can be shared, or don't and put a copy into every binary, which might be fine if all you have is a single executable.

Either way, good luck with that PhobosV3 event loop whilst still being import only!

May 07

On Monday, 6 May 2024 at 03:28:47 UTC, Richard (Rikki) Andrew Cattermole wrote:

> This post is meant to be a highly enlightening and entertaining explanation (or should I say it shouldn't cure anyones insomnia) of just how many things can go wrong with shared libraries if they are not worked with right regardless of platform.
>
> [...]

Thanks for the write-up. It's going to take a while and probably several re-reads for me to get through this and write down some notes, but I think it's a valuable use of time.

May 08

On Tuesday, 7 May 2024 at 23:50:17 UTC, Atila Neves wrote:

> On Monday, 6 May 2024 at 03:28:47 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> This post is meant to be a highly enlightening and entertaining explanation (or should I say it shouldn't cure anyones insomnia) of just how many things can go wrong with shared libraries if they are not worked with right regardless of platform.
>>
>> [...]
>
> Thanks for the write-up. It's going to take a while and probably several re-reads for me to get through this and write down some notes, but I think it's a valuable use of time.

Will echo Atila's comments -- thanks for taking the time to write this up! It may be nice to have a version of this on the blog, glad it's archived here at the least!

May 07
Thanks for writing this.

Are you writing solely about DLLs on Windows? They don't have much in common with shared libraries on OSX and Posix.
May 08
On 08/05/2024 3:08 PM, Walter Bright wrote:
> Thanks for writing this.

I'm happy to do it.

I do sincerely hope it raises enough awareness of the situations that you can get into if your builds are even slightly "interesting" that we can have this be fully solved with some form of finality.

> Are you writing solely about DLLs on Windows?

No. Although that is where I found the vast majority of problems, it isn't the source of them.

There is mention of ``RPATH`` and a difference in behavior of the loader between POSIX, OSX and Linux.

Porting my code base was fairly straightforward, as the only things specific to Linux I had to deal with were featured in the ``TLS Hooking`` and ``It is Loaded, Works Yes?`` headings. Everything else was just system library differences basically and applying existing solutions to known problems found on Windows.

> They don't have much in common with shared libraries on OSX and Posix.

They do have plenty in common, this is a misconception I really want to get you off of. There is a dedicated heading for this ``Is a Dynamic Link Library a Shared Library?``.

The base level of how the linkers and loader on Windows work is still the traditional model that you are an expert in. External symbols to be found elsewhere and internal symbols found in a given binary.

If this were not the case, Optlink would not have the ability to produce DLLs that still work on Windows today.

Microsoft of course wasn't happy with that model and placed a bunch of extra behavior on top of it that I call tunings as part of their linker.

No other platform has such extreme tunings, but others do have tunings, which is why we no longer use the traditional model at the compiler level for any platform. See LLVM's IR documentation (it's referenced), ``DllImport``, ``DllExport``, ``Internal`` (there are variations of it, but we'll just simplify it down to internal).
May 08
On 08/05/2024 11:50 AM, Atila Neves wrote:
> On Monday, 6 May 2024 at 03:28:47 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> This post is meant to be a highly enlightening and entertaining explanation (or should I say it shouldn't cure anyones insomnia) of just how many things can go wrong with shared libraries if they are not worked with right regardless of platform.
>>
>> [...]
> 
> Thanks for the write-up. It's going to take a while and probably several re-reads for me to get through this and write down some notes, but I think it's a valuable use of time.

Thank you for saying that. I do appreciate that you are going to take the time to read it, it should be quite an interesting jumping off point for you with all the references!

There is mention of your last DConf talk wrt. private, and why someone (such as myself) would not appreciate export being a visibility modifier. See ``Common Mistakes`` heading.
May 08
On 08/05/2024 2:04 PM, Mike Shah wrote:
> On Tuesday, 7 May 2024 at 23:50:17 UTC, Atila Neves wrote:
>> On Monday, 6 May 2024 at 03:28:47 UTC, Richard (Rikki) Andrew Cattermole wrote:
>>> This post is meant to be a highly enlightening and entertaining explanation (or should I say it shouldn't cure anyones insomnia) of just how many things can go wrong with shared libraries if they are not worked with right regardless of platform.
>>>
>>> [...]
>>
>> Thanks for the write-up. It's going to take a while and probably several re-reads for me to get through this and write down some notes, but I think it's a valuable use of time.
> 
> Will echo Atila's comments -- thanks for taking the time to write this up! It may be nice to have a version of this on the blog, glad it's archived here at the least!

Thanks!

I wasn't sure it would fully fit into a single N.G. post.

6500 words, 30k characters, took four days to write.

I could almost write a master's thesis with this as a base!
May 08
On Wednesday, 8 May 2024 at 03:08:15 UTC, Walter Bright wrote:
> Thanks for writing this.
>
> Are you writing solely about DLLs on Windows? They don't have much in common with shared libraries on OSX and Posix.

That is confusing me as well. DLLs share concepts with shared libraries on other platforms, but they have subtle differences. The ones that come to my mind:

- Shared libraries export everything by default. DLLs export nothing by default. This relates to the non-standard __declspec(dllexport) declaration supported by MSVC to mark exported symbols.

- Unix system linkers take shared libraries as input files directly. Windows linkers require import libraries. These import libraries contain thunks that jump to the real code in the DLL. Those thunks can be avoided if the compiler knows a symbol comes from a DLL. This is why __declspec(dllimport) exists in MSVC (as a performance optimization).

- DllMain() is a Windows only construct. If it is present, it is invoked for a lot of different events (PROCESS_ATTACH, THREAD_ATTACH...). Some Unix/Posix OSes support callbacks for loading/unloading libraries at most. The mechanisms are not equivalent.

- And then there are all the funny ways in which static initialization in C++ can break in combination with Unix shared libraries. There are some fun, really opaque pitfalls like static constructors getting executed multiple times (and at times when you probably wouldn't expect). I don't think the same is true on Windows.

These differences result in a number of things that are possible in one model and not the other. On Unix, it's legal to have name collisions between symbols exported from different libraries. Typically, the first encountered symbol wins. This allows mechanisms like LD_PRELOAD to work, so you can run a program with a replacement malloc() implementation, for example. There is no Windows equivalent for this. You'd have to provide a shim DLL in the search path that provides all symbols.
May 08
On 08/05/2024 5:13 PM, Gregor Mückl wrote:
> On Wednesday, 8 May 2024 at 03:08:15 UTC, Walter Bright wrote:
>> Thanks for writing this.
>>
>> Are you writing solely about DLLs on Windows? They don't have much in common with shared libraries on OSX and Posix.
> 
> That is confusing me as well. DLLs share concepts with shared libraries on other platforms, but they have subtle differences. The ones that come to my mind:
> 
> - Shared libraries export everything by default. DLLs export nothing by default. This relates to the non-standard __declspec(dllexport) declaration supported by MSVC to mark exported symbols.

It is a convention on POSIX systems to export everything by default (negative annotation).

On Windows you have the 64k exported symbol limit, so from a practical standpoint you have to go positive instead.

About a year ago deadalnix told me that he thought that this was changing for some Linux distros (unconfirmed) and it makes sense why the desire might be there.

Anytime you export a symbol you are pinning it into existence. You are preventing both compiler and linker from performing optimizations. It also makes your binaries larger and increases your load times.

Positive annotation might be a bit annoying and requires you to understand how symbols are represented, but using it regardless of platform is a much better default. This is something both Walter and I agree on, although I am unsure what information he used to come to that conclusion, so I cannot speak for him on that.
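For readers who haven't seen positive annotation in C, the usual shape is a single macro that marks the few symbols you want pinned, with everything else left hidden. This is a sketch under assumptions: `MYLIB_API`, `MYLIB_BUILDING` and both functions are made-up names, and on GCC or Clang it presumes the library is compiled with `-fvisibility=hidden`.

```c
#include <assert.h>

/* This translation unit plays the role of the library itself. */
#define MYLIB_BUILDING 1

#if defined(_WIN32)
#  if defined(MYLIB_BUILDING)
#    define MYLIB_API __declspec(dllexport)  /* building the DLL */
#  else
#    define MYLIB_API __declspec(dllimport)  /* consuming the DLL */
#  endif
#elif defined(__GNUC__)
#  define MYLIB_API __attribute__((visibility("default")))
#else
#  define MYLIB_API
#endif

/* Not annotated: stays out of the export table on Windows, and is hidden
 * on GCC/Clang when the library is built with -fvisibility=hidden. */
int mylib_internal_step(int total, int i) {
    return total + i;
}

/* Annotated: the one symbol this library promises to keep in existence. */
MYLIB_API int mylib_sum_to(int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; i++)
        total = mylib_internal_step(total, i);
    return total;
}
```

Only `mylib_sum_to` is pinned; the compiler and linker remain free to inline or discard `mylib_internal_step`.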

> - Unix system linkers take shared libraries as input files directly. Windows linkers require import libraries. These import libraries contain thunks that jump to the real code in the DLL. Those thunks can be avoided if the compiler knows a symbol comes from a DLL. This is why __declspec(dllimport) exists in MSVC (as a performance optimization).

That is mostly correct, but your conclusion is wrong.

It's only a performance optimization for functions. For anything else you're stuck with going into ``DllImport`` mode explicitly. Such as an array that's in ROM like our ``.init`` symbol or ``TypeInfo`` instances; so being explicit about symbol modes is quite important to D, without the explicitness D simply won't load.

As for externs into a DLL, on Windows it's pretty common for the exports to be missing from the DLL itself, hence you need the extra file for static linking. The tradeoffs that Microsoft picked here must have an interesting origin. I don't think it will be purely because Windows 95a was distributed on 30 floppy disks (or thereabouts, I'd have to count).

> - DllMain() is a Windows only construct. If it is present, it is invoked for a lot of different events (PROCESS_ATTACH, THREAD_ATTACH...). Some Unix/Posix OSes support callbacks for loading/unloading libraries at most. The mechanisms are not equivalent.

I covered this in the ``TLS Hooking`` heading. But basically inside of PE-COFF there is a TLS section that allows providing as many of these functions as you like. The name might be special (as to indicate the purpose is meant for user-code not library code such as druntime), but the purpose is not special.

I have no idea why POSIX hasn't added this as a feature to pthread. As far as I'm aware there is no legitimate reason why it shouldn't exist. It seems like an "ewww Microsoft did it so we won't copy their good idea" kind of thing.

> - And then there are all the funny ways in which static initialization in C++ can break in combination with Unix shared libraries. There are some fun, really opaque pitfalls like static constructors getting executed multiple times (and at times when you probably wouldn't expect). I don't think the same is true on Windows.

See ``TLS Hooking``, but one thing I did find is as part of glibc it'll hook the thread death and run all the thread destructors.

Did I mention that those destructors can be run multiple times? Yeah it's a mess.

> These differences result in a number of things that are different in one model and not the other. On Unix, it's legal to have name collisions between symbols exported from different libraries. Typically, the first encountered symbol wins. This allows mechanisms like LD_PRELOAD to work and and use a program with a replacement malloc() implementation, for example. There is no Windows equivalent for this. You'd have to provide a shim DLL in the search path that provides all symbols.

I've done a quick look, and it seems it's allowed to have duplicate symbols on Windows as well, which makes sense, otherwise things like plugins wouldn't exactly work right (and could lead to failures for stuff like REPLs).

https://learn.microsoft.com/en-us/windows/win32/api/dbghelp/nc-dbghelp-psymbol_registered_callback64

That function callback/struct is used as part of the Windows image introspection library for both loaded and not yet loaded binaries so it must be possible to get into that situation.

As for stuff like ``LD_PRELOAD``, I don't think there is anything to prevent it from existing; it's just that Microsoft decided not to support it. In some ways its existence is a security concern, so I can understand that they didn't want to implement it.

Unfortunately the Windows loader is pretty badly documented. The only place I know of that documents it is the Windows Internals books, and I'm a couple of versions behind (I don't remember 5 mentioning duplicate symbols).

After more reading, there is something akin to ``LD_PRELOAD``, which, shock and horror, is not recommended and is disabled when secure boot is enabled.

https://devblogs.microsoft.com/oldnewthing/20071213-00/?p=24183
May 08
On 5/7/2024 8:45 PM, Richard (Rikki) Andrew Cattermole wrote:
>> They don't have much in common with shared libraries on OSX and Posix.
> 
> They do have plenty in common, this is a misconception I really want to get you off of. There is a dedicated heading for this ``Is a Dynamic Link Library a Shared Library?``.

Isn't it true that DLLs on Windows share their global data segment with all users of the DLL? While Linux shared libraries have a separate data segment for each process?

This is a very major difference.
