A good compiler with bad defaults

01 Mar 2017

Visual C++ 6 is a venerable compiler, but when I first saw it I was shocked at how badly it dealt with what should be fairly trivial cases. Take hello world, dynamically linked against the CRT, built with both Visual C++ 5 and Visual C++ 6:

C:\TEMP>type hw.c
#include <windows.h>
#include <stdio.h>

int main(int argc, char * argv[])
{
    printf("Hello world from C version %i\n", _MSC_VER);
    return 0;
}

C:\TEMP>cl /MD hw.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 11.00.7022 for 80x86
Copyright (C) Microsoft Corp 1984-1997. All rights reserved.

hw.c
Microsoft (R) 32-Bit Incremental Linker Version 5.00.7022
Copyright (C) Microsoft Corp 1992-1997. All rights reserved.

/out:hw.exe
hw.obj

C:\TEMP>hw.exe
Hello world from C version 1100

C:\TEMP>sdir -cw40 hw*|more
------------+------------+-------------
hw.c    158b|hw.exe 3072b|hw.obj  473b
------------+------------+-------------
 3 files, 0 dirs, 3703b used, 4094m vol size, 2044m vol free

Visual C++ 5 produces a 3Kb hello world. Now for Visual C++ 6:

C:\TEMP>cl /MD hw.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 12.00.8168 for 80x86
Copyright (C) Microsoft Corp 1984-1998. All rights reserved.

hw.c
Microsoft (R) Incremental Linker Version 6.00.8168
Copyright (C) Microsoft Corp 1992-1998. All rights reserved.

/out:hw.exe
hw.obj

C:\TEMP>hw.exe
Hello world from C version 1200

C:\TEMP>sdir -cw40 hw*|more
------------+------------+-------------
hw.c    158b|hw.exe 16.0k|hw.obj  533b
------------+------------+-------------
 3 files, 0 dirs, 16.6k used, 4094m vol size, 2044m vol free

The Visual C++ 6 executable is 16Kb - more than 5 times larger. For what is otherwise a minor upgrade, that seems pretty serious. How did it go so badly wrong?

The answer lies in the layout of the executable file itself. Below is the output of "link /dump /headers" on the two executables. For ease of comparison I'm using the tools from Visual C++ 5 for both, with the program generated by Visual C++ 5 (hw5.exe) shown first and the one generated by Visual C++ 6 (hw6.exe) second:

Microsoft (R) COFF Binary File Dumper Version 5.00.7022
Copyright (C) Microsoft Corp 1992-1997. All rights reserved.


Dump of file hw5.exe

PE signature found

File Type: EXECUTABLE IMAGE

FILE HEADER VALUES
     14C machine (i386)
       4 number of sections
58A01292 time date stamp Sat Feb 11 23:45:22 2017
       0 file pointer to symbol table
       0 number of symbols
      E0 size of optional header
     10F characteristics
            Relocations stripped
            Executable
            Line numbers stripped
            Symbols stripped
            32 bit word machine

OPTIONAL HEADER VALUES
     10B magic #
    5.00 linker version
     200 size of code
     600 size of initialized data
       0 size of uninitialized data
    1020 address of entry point
    1000 base of code
    2000 base of data
         ----- new -----
  400000 image base
    1000 section alignment
     200 file alignment
       3 subsystem (Windows CUI)
    4.00 operating system version
    0.00 image version
    4.00 subsystem version
    5000 size of image
     400 size of headers
       0 checksum
  100000 size of stack reserve
    1000 size of stack commit
  100000 size of heap reserve
    1000 size of heap commit
       0 [       0] address [size] of Export Directory
    4000 [      28] address [size] of Import Directory
       0 [       0] address [size] of Resource Directory
       0 [       0] address [size] of Exception Directory
       0 [       0] address [size] of Security Directory
       0 [       0] address [size] of Base Relocation Directory
       0 [       0] address [size] of Debug Directory
       0 [       0] address [size] of Description Directory
       0 [       0] address [size] of Special Directory
       0 [       0] address [size] of Thread Storage Directory
       0 [       0] address [size] of Load Configuration Directory
       0 [       0] address [size] of Bound Import Directory
    4064 [      3C] address [size] of Import Address Table Directory
       0 [       0] address [size] of Reserved Directory
       0 [       0] address [size] of Reserved Directory
       0 [       0] address [size] of Reserved Directory


SECTION HEADER #1
   .text name
     1CC virtual size
    1000 virtual address
     200 size of raw data
     400 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read

SECTION HEADER #2
  .rdata name
       C virtual size
    2000 virtual address
     200 size of raw data
     600 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40000040 flags
         Initialized Data
         (no align specified)
         Read Only

SECTION HEADER #3
   .data name
      5C virtual size
    3000 virtual address
     200 size of raw data
     800 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0000040 flags
         Initialized Data
         (no align specified)
         Read Write

SECTION HEADER #4
  .idata name
     176 virtual size
    4000 virtual address
     200 size of raw data
     A00 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0000040 flags
         Initialized Data
         (no align specified)
         Read Write

     Summary

        1000 .data
        1000 .idata
        1000 .rdata
        1000 .text

Microsoft (R) COFF Binary File Dumper Version 5.00.7022
Copyright (C) Microsoft Corp 1992-1997. All rights reserved.


Dump of file hw6.exe

PE signature found

File Type: EXECUTABLE IMAGE

FILE HEADER VALUES
     14C machine (i386)
       3 number of sections
58A01351 time date stamp Sat Feb 11 23:48:33 2017
       0 file pointer to symbol table
       0 number of symbols
      E0 size of optional header
     10F characteristics
            Relocations stripped
            Executable
            Line numbers stripped
            Symbols stripped
            32 bit word machine

OPTIONAL HEADER VALUES
     10B magic #
    6.00 linker version
    1000 size of code
    2000 size of initialized data
       0 size of uninitialized data
    101A address of entry point
    1000 base of code
    2000 base of data
         ----- new -----
  400000 image base
    1000 section alignment
    1000 file alignment
       3 subsystem (Windows CUI)
    4.00 operating system version
    0.00 image version
    4.00 subsystem version
    4000 size of image
    1000 size of headers
       0 checksum
  100000 size of stack reserve
    1000 size of stack commit
  100000 size of heap reserve
    1000 size of heap commit
       0 [       0] address [size] of Export Directory
    204C [      28] address [size] of Import Directory
       0 [       0] address [size] of Resource Directory
       0 [       0] address [size] of Exception Directory
       0 [       0] address [size] of Security Directory
       0 [       0] address [size] of Base Relocation Directory
       0 [       0] address [size] of Debug Directory
       0 [       0] address [size] of Description Directory
       0 [       0] address [size] of Special Directory
       0 [       0] address [size] of Thread Storage Directory
       0 [       0] address [size] of Load Configuration Directory
       0 [       0] address [size] of Bound Import Directory
    2000 [      3C] address [size] of Import Address Table Directory
       0 [       0] address [size] of Reserved Directory
       0 [       0] address [size] of Reserved Directory
       0 [       0] address [size] of Reserved Directory


SECTION HEADER #1
   .text name
     15C virtual size
    1000 virtual address
    1000 size of raw data
    1000 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         (no align specified)
         Execute Read

SECTION HEADER #2
  .rdata name
     186 virtual size
    2000 virtual address
    1000 size of raw data
    2000 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40000040 flags
         Initialized Data
         (no align specified)
         Read Only

SECTION HEADER #3
   .data name
      5C virtual size
    3000 virtual address
    1000 size of raw data
    3000 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0000040 flags
         Initialized Data
         (no align specified)
         Read Write

     Summary

        1000 .data
        1000 .rdata
        1000 .text

What this output shows is that Visual C++ 5 generated four sections that each occupy 0x200 bytes (512 bytes) in the file, so 2Kb of sections, plus 0x400 bytes (1Kb) of headers, for a 3Kb executable. Visual C++ 6 generated three sections that each occupy 0x1000 bytes (4Kb), plus 0x1000 bytes of headers, resulting in a 16Kb executable. (It has one fewer section because the import data now lives inside .rdata rather than in a separate .idata section.) The virtual size values are small and similar between the two - the difference is that by default Visual C++ 6 aligns all sections within the file on a 4Kb boundary, which is the "file alignment" value of 1000 rather than 200.

This behavior is optional and can be turned off with the "/OPT:NOWIN98" linker switch. With that switch specified, the result is three 0x200 byte sections plus 0x400 bytes of headers, resulting in a 2.5Kb executable - 512 bytes smaller than the Visual C++ 5 version, and less than 1/6th the size produced by default. The user just needs to make the logical leap that the solution to large executable file sizes is related to Windows 98 optimization in order to discover this switch.

The reason file alignment matters in Windows executables is that each section needs to be laid out on its own page in memory, so when the program is run the result will be three or four 4Kb pages, regardless of how compact the file is on disk. This happens because each section has slightly different page permissions - in the Visual C++ 6 case, one section is readable and executable, one is read-only, and one is readable and writable - and the only way to enforce these permissions is at the page level. So even when the executable is only 2.5Kb, more than one 4Kb IO may be needed to read it, and more work may be needed to lay it out correctly when the disk representation doesn't match the memory representation.

What I don't know (and don't think I ever will know) is why Windows 98 was special. The costs referred to above exist for any platform that can properly execute Windows executables. Why would Windows 98 have costs that Windows 95 did not, or that Windows NT did not? If those costs are substantial, the choice from the Visual C++ team makes sense - the two products shipped at a similar time. But optimizing for best Windows 98 performance arguably stopped being the right default within a few years, and Visual C++ 6 remained in use much longer than that.

The postscript to this is that Visual C++ 2005 reverted to the same behavior as Visual C++ 5 by default, with 512 byte file alignment.