First, from Intel’s manuals 3A 9.1.4:

The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0H. This address is 16 bytes below the processor’s uppermost physical address. The EPROM containing the software-initialization code must be located at this address.

The address FFFFFFF0H is beyond the 1-MByte addressable range of the processor while in real-address mode. The processor is initialized to this starting address as follows. The CS register has two parts: the visible segment selector part and the hidden base address part. In real-address mode, the base address is normally formed by shifting the 16-bit segment selector value 4 bits to the left to produce a 20-bit base address. However, during a hardware reset, the segment selector in the CS register is loaded with F000H and the base address is loaded with FFFF0000H. The starting address is thus formed by adding the base address to the value in the EIP register (that is, FFFF0000 + FFF0H = FFFFFFF0H).

The first time the CS register is loaded with a new value after a hardware reset, the processor will follow the normal rule for address translation in real-address mode(that is, [CS base address = CS segment selector * 16]). To insure that the base address in the CS register remains unchanged until the EPROM based software-initialization code is completed, the code must not contain a far jump or far call or allow an interrupt to occur (which would cause the CS selector value to be changed).

Two screenshots showing instructions in address FFFFFFF0H and FFFF0H(Shadow BIOS, see below) and their jumps. The first one is showing a AMI BIOS, while the second Phoenix BIOS. High BIOS of AMI directly jumps to the shadowed one, and both high and shadowed one jump to the same address. But High BIOS of Phoenix just keeps running in high addresses. The first instruction of both BIOS after all jumps is FAh, say cli(disable interrupts). I’m not going to do more reverse engineering. 🙂
biosboot_ami
biosboot_phoenix

NOTE: Main memory is not initialized yet at this time. From here:

The motherboard ensures that the instruction at the reset vector is a jump to the memory location mapped to the BIOS entry point. This jump implicitly clears the hidden base address present at power up. All of these memory locations have the right contents needed by the CPU thanks to the memory map kept by the chipset. They are all mapped to flash memory containing the BIOS since at this point the RAM modules have random crap in them.

The reset vector is simply FFFFFFF0h. Now, POST is started as described here:

POST stands for Power On Self Test. It’s a series of individual functions or routines that perform various initialization and tests of the computers hardware. BIOS starts with a series of tests of the motherboard hardware. The CPU, math coprocessor, timer IC’s, DMA controllers, and IRQ controllers. The order in which these tests are performed varies from motherboard to motherboard. Next, the BIOS will look for the presence of video ROM between memory locations C000:000h and C780:000h. If a video BIOS is found, It’s contents will be tested with a checksum test. If this test is successful, the BIOS will initialize the video adapter. It will pass controller to the video BIOS, which will inturn initialize itself and then assume controller once it’s complete. At this point, you should see things like a manufacturers logo from the video card manufacturer video card description or the video card BIOS information. Next, the BIOS will scan memory from C800:000h to DF800:000h in 2KB increments. It’s searching for any other ROM’s that might be installed in the computer, such as network adapter cards or SCSI adapter cards. If a adapter ROM is found, it’s contents are tested with a checksum test. If the tests pass, the card is initialized. Controller will be passed to each ROM for initialization then the system BIOS will resume controller after each BIOS found is done initializing. If these tests fail, you should see a error message displayed telling you “XXXX ROM Error”. The XXXX indicates the segment address where the faulty ROM was detected. Next, BIOS will begin checking memory at 0000:0472h. This address contains a flag which will tell the BIOS if the system is booting from a cold boot or warm boot. A value of 1234h at this address tells the BIOS that the system was started from a warm boot. This signature value appears in Intel little endian format, that is, the least significant byte comes first, they appear in memory as the sequence 3412. In the event of a warm boot, the BIOS will will skip the POST routines remaining. If a cold start is indicated, the remaining POST routines will be run.

NOTE: Main memory is initialized in POST. Main part of memory initialization code is complicated, and is directly provided by Intel which is known as MRC(Memory Reference Code).

There’s one step in POST called BIOS Shadowing:

Shadowing refers to the technique of copying BIOS code from slow ROM chips into faster RAM chips during boot-up so that any access to BIOS routines will be faster. DOS and other operating systems may access BIOS routines frequently. System performance is greatly improved if the BIOS is accessed from RAM rather than from a slower ROM chip.

A DRAM control register PAM0(Programmable Attribute Map) makes it possible to independently redirect reads and writes in the BIOS ROM area to main memory. The idea is to allow for RAM shadowing which allows read-access for ROMs to come from main memory whereas writes will continue to go to ROMs. Refer to Intel’s MCH datasheet for details:

This register controls the read, write, and shadowing attributes of the BIOS area from 0F0000h–0FFFFFh. The (G)MCH allows programmable memory attributes on 13 Legacy memory segments of various sizes in the 768 KB to 1 MB address range. Seven Programmable Attribute Map (PAM) Registers are used to support these features. Cacheability of these areas is controlled via the MTRR registers in the processor.

Big real mode(or unreal mode) is used to address more memory beyond 1M, as BIOS ROMs becomes larger and larger. In big real mode, one or more data segment registers have been loaded with 32-bit addresses and limits, but code segment stays unchanged:

Real Mode Big Real Mode Protected Mode
Code segment(cs) 1M 1M 4G
Data segments(ds, es, fs, gs) 1M 4G 4G

Protected mode can also refer 4G memory. But BIOS is mainly written for real mode, big real mode is a better choice for addressing.

Then, BIOS continues to  find a bootable device, see wikipedia:

The BIOS selects candidate boot devices using information collected by POST and configuration information from EEPROM, CMOS RAM or, in the earliest PCs, DIP switches. Option ROMs may also influence or supplant the boot process defined by the motherboard BIOS ROM. The BIOS checks each device in order to see if it is bootable. For a disk drive or a device that logically emulates a disk drive, such as a USB Flash drive or perhaps a tape drive, to perform this check the BIOS attempts to load the first sector (boot sector) from the disk to address 7C00 hexadecimal, and checks for the boot sector signature 0x55 0xAA in the last two bytes of the sector. If the sector cannot be read (due to a missing or blank disk, or due to a hardware failure), or if the sector does not end with the boot signature, the BIOS considers the disk unbootable and proceeds to check the next device. Another device such as a network adapter attempts booting by a procedure that is defined by its option ROM (or the equivalent integrated into the motherboard BIOS ROM). The BIOS proceeds to test each device sequentially until a bootable device is found, at which time the BIOS transfers control to the loaded sector with a jump instruction to its first byte at address 7C00 hexadecimal (1 KiB below the 32 KiB mark).

After all of above, BIOS initialization is finished. It’s your turn to take control of your system from address 0000:7c00!!

Why this address? It’s not defined by Intel nor Microsoft. It was decided by IBM PC 5150 BIOS developer team(David Bradley). See here:

BIOS developer team decided 0x7C00 because:

– They wanted to leave as much room as possible for the OS to load itself within the 32KB.
– 8086/8088 used 0x0 – 0x3FF for interrupts vector, and BIOS data area was after it.
– The boot sector was 512 bytes, and stack/data area for boot program needed more 512 bytes.
– So, 0x7C00, the last 1024B of 32KB was chosen.

Macintosh really has a fantastic UI. I once installed OSX 10.3 successfully using pearpc, but it was awfully slow, since it need to emulate PowerPC via software layer. And now, I just successfully installed OSX 10.6 Snow Leopard in VMware player 5.0.2. The equivalent workstation version is 9.0.2. I tried virtualbox, but it just did not work. Now, please follow my steps:

1. Create a new VM and select the OS type as “FreeBSD”.

2. Close the VMware player. Open the *.vmx file find the line:

Change to:

Start VMware player again. The OS type is now set to “Mac OSX 10.6 Server”:

osx106_1

3. Modify VM: set Memory to 1G, check “Accelerate 3D Graphics”. Now, here’s the most _important_ step: Remove your existing hard disk, and add a new one, but choose SCSI as the virtual disk type. Change the CD/DVD device to also use SCSI type via the “Advanced” button. Without these steps, you will encounter the famous “still waiting for root device” error. Seems OSX cannot handle IDE devices correctly. 🙁

osx106_2

4. I used EmpireEFI v1085 to boot and install OSX 10.6, please find both images for your own. When EmpireEFI boots finishes, mount the OSX 10.6 image and press F5 to refresh. VMware player 5.0.2 supports *.dmg file directly, please select all files to find the image:

osx106_3

5. Here we go, just press enter and you will be booted into OSX installer:

osx106_4

6. If the disk drive doesn’t appear under “Select the disk where you want to Install Mac OSX”, go to menu Utility –> Disk Utilities and erase the whole disk:

osx106_5

7. The disk should now appear. You may want to customize the installation by clicking the button in left-bottom corner. Then let’s move on:

osx106_6

8. When finished, the system will reboot automatically. And it will fail. We must still use EmpireEFI to boot. But we select to boot OSX this time:

osx106_7

9. After some simple configuration, you will finally have your OSX desktop. Cheers!

osx106_8

Updated Aug 23:
The EmpireEFI did not work after I upgraded to 10.6.8. Kernel panic appeared like:

I used iBoot 3.3 to replace EmpireEFI, and booted successfully.

osx106_9

You will have App Store(available in 10.6.6+) in your menu after upgrade. I also installed Xcode 3.2.6 which can still be downloaded from Apple. It requires 10.6.6 too.

I’ve googled a lot to find the answer. But none really solve the problem simply and gracefully, even on stackoverflow. So we’ll do ourselves here 🙂

Actually, std::string supports operation using multibytes characters. This is the base of our solution:

g_cs is a Chinese word(“你好” which means hello) encoded in UTF-8. The code works under both Windows(WinXP+VS2005) and Linux(Ubuntu12.04+gcc4.6). You may wanna open a.txt to check whether the string is correctly written.

NOTE: Under Linux, we print the string directly since the default console encoding is UTF-8, and we can view the string. While under Window, the console DOES NOT support UTF-8(codepage 65001) encoding. Printing to it simply causes typo. We just convert it to a std::wstring and use MessageBox() API to check the result. I will cover the encoding issue in windows console in my next post, maybe.

I began to investigate the problem, since I cannot find a solution to read/write a UTF-8 string to XML file using boost::property_tree. Actually, it’s a bug and is already fixed in boost 1.47 and later versions. Unfortunately, Ubuntu 12.04 came with boost 1.46.1. When reading non-ASCII characters, some bytes are incorrectly skipped. The failure function is boost::property_tree::detail::rapidxml::internal::get_index(). My test code looks like:

Almost the same structure with the previous function. And finally the utf8_to_ucs2() function:

Please add header files yourselves to make it compile 🙂

There’s a series of C++ FAQ: http://www.parashift.com/c++-faq/pointers-to-members.html. And one of it addresses some technical details:

Pointers to member functions and pointers to data are not necessarily represented in the same way. A pointer to a member function might be a data structure rather than a single pointer. Think about it: if it’s pointing at a virtual function, it might not actually be pointing at a statically resolvable pile of code, so it might not even be a normal address – it might be a different data structure of some sort.

Let’s write some demo code:

Output on WinXP/VS2005:

Output on Ubuntu12.04/gcc4.6:

Both are run on 32bit systems. Sizes of pointer-to-member-function are not confirmed to be equal to sizeof(void *). And you are not allowed to convert a pointer-to-member-function to void * or a plain pointer-to-function type. Only equality comparisons(=, !=) are supported. Thus, it can be used to implement the “Safe Bool Idiom” here: http://www.artima.com/cppsource/safebool.html.

Perhaps the best-known use of this technique comes from the C++ Standard the conversion that allows the state of iostreams to be queried uses it.

But at least in gcc/libstdc++, its implementation uses bool and void * conversion operations. In basic_ios class:

This allows std::cout to even be deleted. Hmmm…Just write down some details here.

Continue with last article, only paste code here:

IcmpSendEcho() is used to send ICMP messages which does not require administrator privilege. I summarize all cases in which raw socket, icmp api or system ping approach may fail:

raw socket icmp api system ping
Windows XP Administrator OK OK OK
User WSAEACCES in sendto() OK OK
Guest WSAEACCES in sendto() OK OK
Windows 7 Administrator WSAEACCES in socket() OK OK
User WSAEACCES in socket() OK OK
Guest WSAEACCES in socket() ERROR_ACCESS_DENIED in IcmpCreateFile() Unable to contact IP driver. General failure.
Run as Administrator OK OK OK

You may ask what’s the difference between “Administrator” and “Run as Administrator”, the answer comes from stackoverflow:

– When an user from the administrator group logs on, the user is allocated two tokens: a token with all privileges, and a token with reduced privileges. When that user creates a new process, the process is by default handed the reduced privilege token. So, although the user has administrator rights, she does not exercise them by default. This is a “Good Thing”.

– To exercise those rights the user must start the process with elevated rights. For example, by using the “Run as administrator” verb. When she does this, the full token is handed to the new process and the full range of rights can be exercised.