In C++, pre/post-main function call can be implemented using a global class instance. Its constructor and destructor are invoked automatically before and after the main function. But in C, no such mechanism. Actually, there’s a glib implementation that can help. You may want to read my previous post about CRT sections of MSVC. I just copy the code and do some renaming:

One limitation in glib code is the lack of support for VS2003 and early versions. #pragma code_seg() is used to implement the same function:

Output from msvc/gcc:

This post provides a detailed view of the MSDN article CRT Initialization. Just paste some content here:

The CRT obtains the list of function pointers from the Visual C++ compiler. When the compiler sees a global initializer, it generates a dynamic initializer in the .CRT$XCU section (where CRT is the section name and XCU is the group name). To obtain a list of those dynamic initializers run the command dumpbin /all main.obj, and then search the .CRT$XCU section (when main.cpp is compiled as a C++ file, not a C file).

The CRT defines two pointers:
– __xc_a in .CRT$XCA
– __xc_z in .CRT$XCZ

Both groups do not have any other symbols defined except __xc_a and __xc_z. Now, when the linker reads various .CRT groups, it combines them in one section and orders them alphabetically. This means that the user-defined global initializers (which the Visual C++ compiler puts in .CRT$XCU) will always come after .CRT$XCA and before .CRT$XCZ.

So, the CRT library uses both __xc_a and __xc_z to determine the start and end of the global initializers list because of the way in which they are laid out in memory after the image is loaded.

Let’s run our VS debugger to further investigate the CRT implementation. I’m using VS2010, and a global instance of class A is declared and initialized:

Now set the breakpoints in the constructor and destructor, and start debugging. I’ve tried exe/dll and dynamic/static CRT combinations to view the call stacks:

_initterm is defined as follow. It is used to walk through __xc_a and __xc_z mentioned above:

__xc_a, __xc_z and other section groups are defined as:

gcc uses similar technology to deal with pre/post-main stuff. The section names are .init and .fini .

Code snippet:

Output:

Exception safety is ensured, when using shared_ptr. Memory allocated by m_a is freed even when an exception is thrown. The trick is: the destructor of class shared_ptr is invoked after the destructor of class C.

The HelloWorld application is much simpler than the Windows one. Just put parameters into registers from %eax to %edx, and trigger a 0x80 interrupt.

There are hundreds of documents telling how Windows implements its system call, using int 2e or sysenter. But I can find no code to run to learn how exactly it works. And I managed to write it for my own.

The C code requires only SDK to compile, for I have copied all DDK definitions inline. It opens a C:\test.txt file and write Hello World! to it. Quite simple. I’ve tried a HelloWorld console application. But its call sequence is far more complex than I have expected, after I have made some reverse engineering and read some code from ReactOS project(Wine does not help, since it does not implement a Win32 compatible call sequence in the console case). The code is the basis of our further investigation. It invokes NtCreateFile(), NtWriteFile() and NtClose() in ntdll.dll with dynamic loading:

I found the handle value and all three function pointers are fixed, at least on my Windows XP(SP3). It may be caused by the preferred base address of ntdll.dll. The code should work on all Windows platforms, since it has no hardcoded values.

Now, translate the C code into assembly. Error handling is ommitted:

Compile the code with:

The assembly code of NtCreateFile(), NtWriteFile() and NtClose() are copied directly from ntdll.dll. For NtCreate(), 25h is the system service number that will be used to index into the KiServiceTable(SSDT, System Service Dispatch Table) to locate the kernel function that handles the call.

System service numbers vary between Windows versions. This is why they are not recommend to be used directly to invoke system calls. I only demonstrate the approach here. For Windows XP, the values of the three numbers are 25h, 112h and 19h. While for Windows 7, they are 42h, 18ch and 32h. Change them yourself if you’re running Windows 7. For a complete list of system service numbers, refer here or dissemble your ntdll.dll manually :). The output executable is a tiny one, only 3KB in size, since it eliminates the usage of CRT. Moreover, it has an empty list of import functions!

At 7ffe0300h is a pointer to the following code:

NOTE: The assembly code may work only when compiled to a 32-bit application. 64-bit mode is not tested and need modification to work.

One last point, it seems the STR_HELLO string is required to be aligned to 8 byte border. Otherwise, you will get 0x80000002 error code(STATUS_DATATYPE_MISALIGNMENT).