Windows vs. Unix: Linking dynamic load modules

by Chris Phoenix

Coming from a Unix background, I found the Windows way of dynamic loading quite alien, and it took me a while even to figure out just how alien it was. So I've written up some descriptions of the differences, and how to create dynamically loadable files on Windows. Feel free to include these in any FAQ, documentation, or whatever, with or without attribution or editing.

Thanks to David M. Beazley (the SWIG documentation) and especially Gordon McMillan (several very useful emails) for much of this information. Any mistakes are my responsibility.

Unix vs. Windows Dynamic Loading

Unix and Windows use completely different paradigms for run-time loading of code. Before you try to build a module that can be dynamically loaded, be aware of how your system works.

In Unix, a shared object (.so) file contains code to be used by the program, and also the names of functions and data that it expects to find in the program. When the file is joined to the program, all references to those functions and data in the file's code are changed to point to the actual locations in the program where the functions and data are placed in memory. This is basically a link operation.

In Windows, a dynamic-link library (.dll) file has no dangling references. Instead, an access to functions or data goes through a lookup table. So the DLL code does not have to be fixed up at runtime to refer to the program's memory; instead, the code already uses the DLL's lookup table, and the lookup table is modified at runtime to point to the functions and data.
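The lookup-table indirection described above can be imitated in plain C. This is only a toy model, not how the Windows loader is actually implemented: the names import_table, dll_imports, host_add, dll_add_three and fake_load are all hypothetical, and the "loader" is just a function that fills in a pointer.

```c
#include <stddef.h>

/* Toy model of a DLL's lookup table: one slot per imported function.
   Every name here is hypothetical and exists only for illustration. */
typedef int (*add_fn)(int, int);

struct import_table {
    add_fn add;   /* slot filled in when the pretend DLL is "loaded" */
};

/* The table belonging to the pretend DLL; empty until load time. */
struct import_table dll_imports = { NULL };

/* A function that lives in the host program. */
int host_add(int a, int b)
{
    return a + b;
}

/* "DLL" code: it never names host_add directly, only its own table,
   so this code does not have to be patched when it is loaded. */
int dll_add_three(int a, int b, int c)
{
    return dll_imports.add(dll_imports.add(a, b), c);
}

/* Stand-in for the loader: it modifies the table, not the DLL code. */
void fake_load(struct import_table *table)
{
    table->add = host_add;
}
```

A caller would run fake_load(&dll_imports) once and could then call dll_add_three(1, 2, 3). Calling dll_add_three before the table is filled in would dereference a null pointer, which is roughly the situation the real loader exists to prevent.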

In Unix, there is only one type of library file (.a) which contains code from several object files (.o). During the link step to create a .so, the linker may find that it doesn't know where an identifier is defined. The linker will look for it in the object files in the libraries; if it finds it, it will include all the code from that object file.

In Windows, there are two types of library, a static library and an import library (both called .lib). A static library is like a Unix .a file; it contains code to be included as necessary. An import library is basically used only to reassure the linker that a certain identifier is legal, and will be present in the program when the .dll is loaded. So the linker uses the information from the import library to build the lookup table for using identifiers that aren't included in the .dll. When an application or a .dll is linked, an import library may be generated, which will need to be used for all future .dll's that depend on the symbols in the application or .dll.

Suppose you're building two dynamic-load modules, B and C, which should share another block of code A. In Unix, you would *not* pass A.a to the linker for B.so and C.so; that would cause it to be included twice, so that B and C would each have their own copy. In Windows, building A.dll will also build A.lib. You *do* pass A.lib to the linker for B and C. A.lib does not contain code; it just contains information which will be used at runtime to access A's code.

In Windows, using an import library is sort of like using "import spam"; it gives you access to spam's names, but does not create a separate copy. In Unix, linking with a library is more like "from spam import *"; it does create a separate copy.

Windows Dynamic Loading

Windows Python was built in Microsoft Visual C++; using other compilers may or may not work (though Borland seems to). The rest of this section is MSVC-specific.

When creating DLLs in Windows, you must pass "python15.lib" to the linker. To build two DLLs, spam and ni (which uses C functions found in spam), you could use these commands:

cl /LD /I/python/include spam.c ../libs/python15.lib
cl /LD /I/python/include ni.c spam.lib ../libs/python15.lib

The first command created three files: spam.obj, spam.dll and spam.lib. Spam.dll does not contain any Python functions (such as PyArg_ParseTuple), but it does know how to find the Python code thanks to python15.lib.

The second command created ni.dll (and .obj and .lib), which knows how to find the necessary functions from spam, and also from the Python executable.

Not every identifier is exported to the lookup table. If you want any other modules (including Python) to be able to see your identifiers, you have to say "__declspec(dllexport)", as in "void __declspec(dllexport) initspam(void)" or "PyObject __declspec(dllexport) *NiGetSpamData(void)".
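For code that must build on both Windows and Unix, the export marker is often hidden behind a macro. Here is a minimal sketch, assuming MSVC on Windows. SPAM_EXPORT, spam_get_count and spam_internal_helper are hypothetical names chosen for illustration; they are not part of Python or of the spam and ni modules above.

```c
/* SPAM_EXPORT is a hypothetical macro name chosen for this sketch. */
#if defined(_WIN32)
#  define SPAM_EXPORT __declspec(dllexport)  /* add the symbol to the DLL's export table */
#else
#  define SPAM_EXPORT                        /* Unix: non-static symbols are visible by default */
#endif

/* Hypothetical function that another module, such as ni, could call. */
SPAM_EXPORT int spam_get_count(void)
{
    return 3;
}

/* Not marked for export: when built into a Windows DLL without a .def
   file, other modules cannot link against this function. */
int spam_internal_helper(void)
{
    return spam_get_count() + 1;
}
```

The macro keeps the compiler-specific keyword in one place, so the function definitions themselves read the same on both systems.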

Developer Studio will throw in a lot of import libraries that you don't really need, adding about 100K to your executable. To get rid of them, according to Gordon McMillan:

>That's just DevStudio's transparent UI! The c runtime is specified 
>under compiler options (that "Debug Multithreaded DLL" list box), and
>is implicit in the link options. If you specify "ignore default 
>libs", you need to add the correct msvcrtxx.lib to the libs list.

Helpful comments from comp.lang.python readers:

You can also pass a linker definition file, e.g. "ni.def", which catalogs the
exported symbols. For cross-platform code, this extra piece of effort saves a
bunch of ugly preprocessor magic to work around the non-standard linkage

Tres Seaver    713-523-6582
Palladion Software

Even the Unix story is slightly hairier than this correct outline. Putting aside a few older variants which support dynamic loading in only a primitive way, specialists still need to be aware of 1. AIX, where Paul Duffin describes <URL:> the challenge of sensible dynamic loading, as well as 2. the complications of loading objects, which often construct themselves only with difficulty <URL:>, and 3. a few cases where loaders don't correctly resolve backward references without explicit hints <URL:>.

Cameron Laird    +1 281 996 8546 FAX
