                                                        Joseph Hlavaty
                     Copyright (C) J. Hlavaty, 1995
                          All rights reserved.

                 MapMan:  Rolling Your Own Symbols for 16-bit
                              Windows Environments

        Almost any Windows programmer has been in a situation where
he or she has wished for more symbols than those, for example, that
are shipped with the Windows SDK.  Some programmers might have even
found themselves in a situation where they need to debug another
application due to a conflict with their own app.  In this article,
I'll show you a useful tool that will permit you to build .SYM files for
any 16-bit Windows executable, including the DLLs that make up Windows
itself!   This tool is called MapMan, the Windows map file manager.

Environment

        MapMan works on any 16-bit executable for the Windows
operating environment.  I have used many Windows 3.1 binaries in my
test suite, including Write, ProgMan, ClipBrd, Notepad, Krnl386 and
User.  Of course the application also runs under Win-OS/2 2.x and 3.0.
(It can even be used on 16-bit OS/2 executables such as EPM.EXE in OS/2 Warp).

        No undocumented features of Windows are needed, simply an
understanding of the Windows New Executable (NE) format that is the
basis of all 16-bit Windows executables (32-bit Windows apps use a
different format that will not be discussed in this article).  This
article will explain a number of the data structures built into an
NE header, including the Resident and Non-Resident Names tables, the
Segment Table and the Entry Table.

        The MapMan application is a real-mode DOS program.  I fully
intend to port the application to Windows (as we'll discuss below), but
decided that this version would only be a DOS app due to space and time
constraints.

Background

        As most Windows programmers are aware, there is a need to export
any procedure that will be called from outside the application.  One example
of such an exported procedure is an application window procedure (or wndproc).
A wndproc is not called directly by the application, but rather by Windows.
As we'll see, exporting a function simply adds it to a few internal tables of
the NE header and so makes the function accessible to any module, either by
name or ordinal.  This process
of exporting is much like adding a chapter title to the table of contents of
a book; without the chapter title in the table of contents it might be
impossible to find the chapter by simply skimming the book.  Even if you were
successful, you would probably find that your search would be a very
time-consuming process.  Likewise, if Windows were not able to look up your
application's wndproc in your module's list of exported functions, it would
be difficult or impossible for Windows to call the procedure (to send it a
message, for example).

        A programmer can export functions by placing the
function names to be exported in the Exports section of an executable's
module definition (.DEF) file.  Often there is also a compiler-dependent keyword
that can be used in a function definition (e.g., _export) to export
a function without requiring a .DEF file entry.  My source code uses
the compiler-independent Exports section method to define exported
entry points.  Let's take a look at the sample .DEF in Table One for a moment,
as this file contains exported functions.

        You may have noticed a Name field, a Description field and an Exports
section, but you probably didn't realize how they interrelate.  Each of these
pieces is actually part of one of two tables in the NE header (the resident
or the non-resident names tables).  The CODE and DATA keywords, of course,
define the application's segments, found in the segment table in the NE
header.  And, yes, flags such as the EXETYPE keyword as well as other
keywords such as HEAPSIZE and STACKSIZE all resolve to sections of
the NE header.

Terminology

        Now let's examine some of the terms we'll need to understand for this
article.  Please refer to the MMNEHEAD.H file in Table Two for more information
on new executable data structures.

The DOS Executable Header

        A DOS executable (or MZ) header is the data structure found at the
beginning of all 16-bit Windows executables.  It is actually the same header used
for DOS .EXE files.  (We should note that .COM files have no header; they
are a straight binary image.  DOS device drivers (.SYS files) have a different
header from either COM or EXE files.)  This DOS MZ header permits DOS to recognize a
Windows app or library as an executable.  If you attempt to run a Windows
executable from a DOS command line, the DOS loader will launch the DOS stub
found just after the MZ header.  The default stub writes a message that the
application requires Microsoft Windows and returns to DOS.  Windows programs
use a slightly-modified version of this MZ header to locate their new
executable header in the executable image.  Windows will read this
New Executable header to launch any given Windows application or library.
While they won't be discussed in this article, many other header types
also use the same slightly-modified MZ header to locate their headers in
an executable.  See the section OTHER HEADER FORMATS below.

The New Executable Header

        A new executable (or NE) header is a revised header structure used
by Windows applications (and also used by OS/2 1.x applications such as EPM).
It is much larger than its DOS counterpart, due to the greater complexity of
these 'newer' apps.  The function of an NE header is to permit the Windows
loader to load individual pieces of the executable (for example, a code
segment) without having to physically load the entire application into
memory.

        The NE header functions as a giant index into the application binary.
For example, if Windows later needs to load another segment found in this app
(due to a SegmentNotPresent fault, for example), then the Windows loader can
extract the individual segment's offset out of the NE and load that portion
of the file containing just that segment into memory.

        As the Windows kernel refers constantly to the NE data structure, it
is loaded into memory along with the application, albeit in a slightly
modified form.  This in-memory representation of an NE is known as a module
database (or MDB).  See pp. 319-339 in Undocumented Windows for more information on
module databases.  For purposes of this article, we'll only concern ourselves
with the file-based representation of the NE header.

Other Header Formats

        You may also see three other common types of 32-bit file headers.  The
first header, with an LE signature, is used for Windows VxDs.  Try dumping
a file with a .386 file extension from your Windows system directory.  The term
LE stands for Linear Executable where linear means non-segmented (also called
"flat").

        The second type of header, with an LX signature, is used for modern
OS/2 executables.  Most modules in OS/2 Warp use this file format.  Some that
still contain NE headers are EPM.EXE and PMTKT.DLL.  The term LX also stands
for Linear Executable.

        The third type of header, with a PE signature, is used for Win32 apps.
The Win32s test application FREECELL.EXE has such a header.  The term PE stands
for portable executable.
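
        As a minimal sketch of telling these formats apart, the routine
below classifies a new-style header by the two signature bytes at its
start.  The names HeaderKind and ClassifyHeader() are hypothetical and
are not part of MapMan; the caller is assumed to have located the header
already.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not taken from MapMan's source. */
typedef enum { HDR_UNKNOWN, HDR_NE, HDR_LE, HDR_LX, HDR_PE } HeaderKind;

/* Classify a new-style header by the two signature bytes at its start. */
HeaderKind ClassifyHeader(const unsigned char *sig)
{
    assert(sig != NULL);
    if (sig[0] == 'N' && sig[1] == 'E') return HDR_NE;   /* 16-bit Windows, OS/2 1.x */
    if (sig[0] == 'L' && sig[1] == 'E') return HDR_LE;   /* Windows VxDs */
    if (sig[0] == 'L' && sig[1] == 'X') return HDR_LX;   /* 32-bit OS/2 */
    if (sig[0] == 'P' && sig[1] == 'E') return HDR_PE;   /* Win32 */
    return HDR_UNKNOWN;
}
```

MapMan itself only needs the HDR_NE case; anything else would simply be
reported to the user as unsupported.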

The Names Tables

There are two names tables in an NE header, the resident names table and the
non-resident names table.  Each entry in these tables consists of a
textual string (the API name) and an ordinal value.  Windows executables
can access an exported entry in these names tables either by name (such as
CreateWindow) or by ordinal (entry 41).  The ordinal value is an index
into the entry table discussed below.  For this reason, your application's
ordinal values should always be distinct within a single executable module.
The first entry in each of these tables is reserved, as documented below,
and contains an ordinal value of 0.

The Resident Names Table

The resident names table contains those exported
entry points that should remain resident, such as __GP
in GDI, and the WEP routine found in libraries.  The resident names table also
contains the module name for the current executable as its first entry.
You would use the RESIDENTNAME keyword in your .DEF file to make your exported
function a member of the resident names table.  By default, exported entry
points are placed in the non-resident names table.

The Module Name

        The module name is often (but not necessarily) the executable file
name of an application.  In reality, it is the value following the NAME or
LIBRARY line in the .DEF file.  It is the module name that is used by Windows
to identify an application once it has been loaded into memory.  For example,
all display drivers regardless of their file name have the module name DISPLAY.
Similarly both KRNL386.EXE and KRNL286.EXE share the module name KERNEL.  In
any case, a module name must be distinct in a single running session of
Windows.  As you won't be running two KERNEL modules or more than one
DISPLAY driver, this module name overloading is not usually a problem.

The Non-Resident Names Table

        The non-resident names table is structurally very similar to the
resident names table.  The first entry of the non-resident names table is
the module description from the .DEF file.  Additionally, as we'll see
below when we cover this table in more detail, the offset of this table in
the NE is NOT based on the start of the NE itself, unlike most NE pointers,
but is rather based off of the start of the MZ header.

The Segment Table

        The segment table contains an entry for each segment in the
application.  It contains information such as the segment size, the
segment type (code or data, for example) and so on.  Windows can use this
information to calculate what kind of selector needs to be allocated for
the segment and also how big a block of memory will be needed to fit
the information found in a particular segment of the executable.

The Entry Table

        The entry table contains a list of segment:offsets (for example
8:f1b in the case of the Windows 3.1 function CreateWindow) and some
additional information such as whether or not the entry point is exported,
and the like.

        The entry table contains externalized routines that can be
called from another application.  Do not confuse this with the relocation
information used to patch (or "fixup") an app code segment with a call to another
application (such as a call to the USER module's CreateWindow API made
in your application) or with a call to a function in another segment
within a particular app.  The entry table contains functions in a
single module (such as CreateWindow in the USER module) that may
be called from other modules.

The Map File

        A map file is an ASCII text file containing information that
maps (or identifies) pieces of a module by symbolic value to addresses in
the module's segments.  Consider the map file found in Table Three, which was
generated by the Microsoft 5.1 linker using object modules built with
debugging information using the -Zi option.  This map file is linker
specific; other linkers may generate different .MAP files.  See Table 5
for an overview of the sections of a Microsoft linker .MAP file.

        At the top of the .MAP file, you'll notice the module name,
in this case TRAPMAN.  The second section of the .MAP file contains
a description of the segments in the application.  It has been edited
to make it more compact, as the original had divided this small application's
two segments into over thirty pieces!  The number to the left of the ':' is
the segment number (in hex).  You can see that this application has two
segments, one of type CODE, the other of type DATA.  The first segment has
2bf2h bytes (decimal 11250) which is 25d0h -- the start of the last
section -- + 622h -- the length of the last section.  The second segment
has 0b3ah bytes (decimal 2874), which is 930h + 20ah as outlined above.
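
        The segment-size arithmetic above can be checked with a few
lines of C.  This is only a sketch: MapPiece and SegmentSize() are
hypothetical names, and the size of a segment is taken to be the highest
start-plus-length value among the pieces the linker lists for it.

```c
#include <assert.h>
#include <stddef.h>

/* One piece of a segment as listed in the .MAP segment section
   (hypothetical type, for illustration only). */
typedef struct {
    unsigned int start;
    unsigned int length;
} MapPiece;

/* Segment size = highest (start + length) over all of its pieces. */
unsigned int SegmentSize(const MapPiece *pieces, size_t count)
{
    unsigned int size = 0;
    size_t i;

    assert(pieces != NULL || count == 0);
    for (i = 0; i < count; i++) {
        unsigned int end = pieces[i].start + pieces[i].length;
        if (end > size)
            size = end;
    }
    return size;
}
```

With a last piece of { 0x25d0, 0x622 } the code segment comes out at
0x2bf2 (11250 decimal), and with { 0x930, 0x20a } the data segment comes
out at 0xb3a (2874 decimal), matching the figures quoted above.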

        The third section of the .MAP file gives the DGROUP of the module.
Normally all code segments of an application will be shared across multiple
instances (Windows will only load one copy of the code and read-only data
for any and all copies of the application in memory at any one time).
In an application using the DATA MULTIPLE keyword in its .DEF file, however,
each instance of the application will have its own private DGROUP
distinct from that of all other instances (as the DGROUP is both readable and
writeable).  This field denotes which segment number (in hex)
is the DGROUP.  Note that by Microsoft convention, the last segment in a
module is the DGROUP.

        The fourth section of the .MAP file is the list of exported functions
found in this module.  This application has two, one for the About box and
one for the main window.  Both are WndProcs and must be exported so that
Windows can call them.  Again, all offsets are in hex, so the About box
routine is actually 1564 bytes into the first segment of the application.

        The fifth section of the .MAP file contains the public symbols
sorted by name.  Public symbols can be thought of as those symbols that
are known to the linker.  In other words, a public symbol can be used within
any executable module, and not just in the source file that it is defined in.
In C, for example, functions normally have such external linkage.

        As mentioned previously, this application was built with debug
information so the linker had much more information than it normally
would to put in the .MAP file.  The About box routine is Pascal, as
Windows requires an exported entry to be (the lack of a preceding
underscore hints at this), and is in the first segment at offset 61ch.

        Now would be a good time to mention that there is nothing about
exporting a function that requires the Pascal calling convention.  A CDecl
function can be exported.  If you export a CDecl wndproc, however,
Windows will be unable to call the function successfully.  Windows normally
assumes that exported functions have the Pascal calling convention.  The
Pascal calling convention holds that the function called will clean its
parameters off the stack (usually with a RET N instruction).  If the function
is CDecl, then it assumes that the caller will clear parameters from the stack.
The stack will be left in an unstable state if the called function has other
than a void parameter list.  It is worth noting that there are
even a few Windows exported functions that are CDecl by necessity
(e.g., wsprintf()), as the Pascal calling convention doesn't support
variable argument functions.

        The next line contains 0:0 for an address.  The null value denotes
this as a far pointer requiring "fixup" as discussed above.  At link
time, the linker has no idea at what segment:offset value the MESSAGEBOX routine
will be found, just that it is in the USER module with ordinal 1.  It is
the responsibility of the Windows loader to replace occurrences of MESSAGEBOX
in the application with the appropriate selector:offset to the function
in memory.

        The MYFARPROC and MYODS functions are actually assembly language
functions (note that they are all upper-case with no leading underscore).
They were marked as public symbols with the PUBLIC keyword in the assembly
source file.

        The function _DPMIAllocateLDTDescriptors is C code (note the leading
underscore and the fact that case is preserved in the symbol name).

        Lastly we have the symbol __astart.  If you look at the last line of
the .MAP file, you'll notice that it is the __astart routine that is the
actual entry point (i.e., the first piece of code executed by Windows when
it launches a new instance of the application).  As the double leading
underscore indicates, this is part of the C runtime library for my compiler.
Double underscores are used for public C library functions to avoid the
risk of name collisions with non-library source code.  The exceptions to
this rule are the "standard library" functions in C (such as strlen()), which
would only have a single underscore in this table.  For more information, see
your specific compiler documentation or The C Programming Language, 2nd edition.

        Notice that the About box routine also appeared above in the list of
exported functions.  All exported functions can be thought of as public
symbols and so we find the exports in this list also.

        The sixth section of the .MAP file contains the identical symbols
in section five, but sorted by address (although the section is called
Publics by Value in the .MAP file).  It's nice of the linker to give it
to us both ways.  If you've broken into a debugger because your app has just
trapped, and you're staring at CS:IP=103f:0886, it's nice to know that part
of your .MAP file is sorted by address.  If you're trying to find the segment
and offset of one of your symbols to set a breakpoint in that same debugger,
you'll appreciate the fact a section of the .MAP file is also sorted by name.
Who said that you can't have your cake and eat it, too?
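
        To illustrate why the by-address ordering is handy, here is a
rough sketch of the lookup a debugger (or a person reading the .MAP
file) performs: find the last symbol at or below a faulting address.
PublicSym and NearestSymbol() are hypothetical names, and the sketch
assumes that the selector shown by the debugger has already been
translated to the logical segment number used in the .MAP file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one line of the Publics by Value section. */
typedef struct {
    unsigned int seg;
    unsigned int off;
    const char  *name;
} PublicSym;

/*
 * Given symbols sorted by (seg, off), return the last symbol in segment
 * `seg` whose offset is <= `off`, or NULL if no symbol precedes it.
 */
const PublicSym *NearestSymbol(const PublicSym *syms, size_t count,
                               unsigned int seg, unsigned int off)
{
    const PublicSym *best = NULL;
    size_t i;

    assert(syms != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (syms[i].seg == seg && syms[i].off <= off)
            best = &syms[i];    /* later entries have higher offsets */
    }
    return best;
}
```

A linear scan keeps the sketch short; because the table is sorted, a
binary search would work just as well for a large symbol list.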

        My version of MapSym, however, only considers the 'Publics by Value'
section to be essential.  For reasons I haven't looked into, removing the
'Publics by Name' section makes the resulting .SYM file slightly smaller, and
MapSym doesn't complain.  Remove the 'Publics by Value' section and MapSym
will refuse to build a .SYM file, giving a message to relink the executable.
Since the two tables should be identical, and differ only in order, there's
probably no reason why MapSym should need to look at both, in all fairness.

        The seventh section of the .MAP file contains the line:  "Program
entry point at 0001:25E1".  As explained above, this segment:offset is that
of the __astart() library function for this application.  Our WinMain() is not
called directly by Windows.  It will be called by the C library code that
in this case is the Windows entry point for the application.

Introduction

        MapMan was built with the Microsoft C compiler version 6, and
generates .MAP files compatible with Microsoft linkers and the Microsoft
MapSym utility used to create symbol files.  You may need to modify
MapMan's output to match your particular linker's output in order to
create symbol files for a different linker.  This should not be a
time-consuming task, as the largest piece of work in MapMan
consists of the functions specific to the NE header, which are compiler and
linker independent (at least for our purposes).

Portability of Source Code

        I intend to create a Windows version of this application which
will reuse all source code except for the platform specific code (in
mapdutil.c).  For this reason, you'll see that I have my own TYPEDEFs for
BOOL, WORD and other standard Windows types (this is a DOS app, of course!).
I also have wrapper functions around all calls to the standard C library, such
as printf().  This is because the Windows version will not call printf(), but
some other function (probably a file-system related function, but it may simply
append to a buffer in memory).  It's a given that it won't be writing to STDOUT
with printf()!  I intend to use this DOS version of MAPMAN.EXE as the stub
for the Windows version, making an app that truly will run in either DOS
or Windows!
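
        The idea can be sketched as follows.  This is not the code in
mapdutil.c; MapPrintf(), MapSetOutput() and g_out are hypothetical names
chosen for illustration, and the type sizes in the comments are those of
a 16-bit compiler.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Local stand-ins for the Windows types; a DOS build has no windows.h. */
typedef int            BOOL;
typedef unsigned char  BYTE;
typedef unsigned short WORD;    /* 16 bits */
typedef unsigned long  DWORD;   /* 32 bits on a 16-bit compiler */
#define TRUE  1
#define FALSE 0

/* Hypothetical output hook: NULL means "write to stdout". */
static FILE *g_out = NULL;

void MapSetOutput(FILE *fp)
{
    g_out = fp;
}

/* All output goes through this wrapper, so a port replaces one function. */
int MapPrintf(const char *fmt, ...)
{
    va_list args;
    int n;

    assert(fmt != NULL);
    va_start(args, fmt);
    n = vfprintf(g_out ? g_out : stdout, fmt, args);
    va_end(args);
    return n;
}
```

A Windows build could keep the same MapPrintf() signature and change only
its body, for example to append to a memory buffer.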

The Problem At Hand

        Our task is to create a .MAP file similar in form and function to that
generated by a Microsoft linker and acceptable to the Microsoft MapSym symbol
file generator.  The process for doing this is shown in Table 6.  As
executables are binary files, we'll need to open the executable for binary
read (just say NO! to character translation), and then parse the MZ and NE
headers (if any) found.

Base Functions for MapMan

        The overall flow of the MapMan executable is simple, as is evidenced
by the main() routine.  First any arguments given by the user are processed.
Then, if a name was given, we allocate a buffer and attempt to load
a file by that name into our buffer for processing.  Finally, we free
our buffer, and return to DOS.

        The LoadExe() routine, which actually loads the file and processes it,
is not any more complicated.  It opens the file (as binary, for reasons already
given), and loads it into the buffer we allocated above.  At this step,
pBuffer (a pointer to the start of the buffer that we allocated) points to
the beginning of the file.  If the file is a valid Windows executable, then
we'll find at the start of the buffer an MZ (old-style) executable header.
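
        A minimal sketch of the loading step, using only the standard C
library, might look like the following.  LoadFile() is a hypothetical
name, not MapMan's LoadExe(); the sketch also ignores the 64K limits
that a real-mode DOS program has to respect.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read a whole file into a malloc'd buffer.  Returns NULL on failure
 * (including an empty file); on success the caller owns the buffer and
 * must free() it.  Hypothetical helper, for illustration only.
 */
unsigned char *LoadFile(const char *path, size_t *pLen)
{
    FILE *fp;
    long size;
    unsigned char *buf = NULL;

    assert(path != NULL && pLen != NULL);
    fp = fopen(path, "rb");             /* "b": no character translation */
    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0L, SEEK_END) == 0 && (size = ftell(fp)) > 0 &&
        fseek(fp, 0L, SEEK_SET) == 0) {
        buf = (unsigned char *)malloc((size_t)size);
        if (buf != NULL && fread(buf, 1, (size_t)size, fp) != (size_t)size) {
            free(buf);                  /* short read: treat as failure */
            buf = NULL;
        }
        if (buf != NULL)
            *pLen = (size_t)size;
    }
    fclose(fp);
    return buf;
}
```

Opening the file in text mode instead would let the C library translate
carriage return/line feed pairs and stop at a Ctrl-Z byte on DOS, either
of which would corrupt the image we are trying to parse.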

        We call the SetMZ() function to set our pointer to the MZ header
(pMZ), and validate the new pointer by checking its signature.  If SetMZ()
returns FALSE, then the file loaded has no valid MZ header.  It might be
a .COM file, or simply a data file.  In any case, we can do no further
processing and exit, after warning the user.

        If we have a valid MZ header, then we must also check that we have
additionally loaded a valid Windows executable.  We do that by verifying that an NE
header exists after the MZ header.  If the MZ header relocation table address
is less than 0x40 (64 decimal), then no Windows header exists.  Once again,
we exit after warning the user.

        Otherwise we call SetNE() to set and validate our pointer to the NE
header (pNE) that we'll use for further processing.  If this routine returns
FALSE, then no valid NE header exists and we exit, again after warning the
user.  If we do in fact have a valid NE header, then we can begin processing
the header to create the internal structures needed to build our map file.
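
        The three checks above can be sketched in one routine.  This is
not the source of SetMZ() or SetNE(); CheckWindowsExe() and ExeStatus
are hypothetical names.  The offsets used (0x18 for the relocation table
address and 0x3C for the new header offset) are assumptions taken from
the published MZ header layout and should be checked against MMNEHEAD.H.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EXE_OK, EXE_NO_MZ, EXE_NO_NEW_HEADER, EXE_NOT_NE } ExeStatus;

/* Executable headers store multi-byte values in little-endian order. */
static unsigned int rd16(const unsigned char *p)
{
    return (unsigned int)p[0] | ((unsigned int)p[1] << 8);
}

static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)rd16(p) | ((unsigned long)rd16(p + 2) << 16);
}

/* Validate the MZ header, then the NE header it points to. */
ExeStatus CheckWindowsExe(const unsigned char *buf, size_t len,
                          unsigned long *pNEOffset)
{
    unsigned long off;

    assert(buf != NULL && pNEOffset != NULL);
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return EXE_NO_MZ;               /* .COM file, data file, ... */
    if (rd16(buf + 0x18) < 0x40)        /* relocation table address */
        return EXE_NO_NEW_HEADER;       /* plain DOS executable */
    off = rd32(buf + 0x3C);             /* file offset of new header */
    if (off >= len - 1 || buf[off] != 'N' || buf[off + 1] != 'E')
        return EXE_NOT_NE;              /* LE, LX, PE or garbage */
    *pNEOffset = off;
    return EXE_OK;
}
```

Each non-zero status corresponds to one of the warnings described above,
so the caller can print a specific message before exiting.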

How to Generate a Map File

        Map file generation is a two-stage process:  first, we create internal
representations of the structures that we'll need from the NE; then we build
a .MAP file from them.  The first stage of the above process is found in the
calls to BuildResidentNamesTable(), BuildNonResidentNamesTable(),
BuildEntryTable(), and BuildSegmentTable().  The first three routines create
and modify the list of entry points pointed to by pEntryHead.  The last
routine creates a list of application segments pointed to by pSegmentHead.
With just these two pointers, we will have most of the information that
we will need to build a .MAP file.

BuildResidentNamesTable()

        The resident names table is pointed to by the pResident pointer in
the NE header structure.  This pointer is based off the start of the NE
header.  We can say that the resident names table begins pResident bytes
after pNE.  Remember, however, that the rules of C pointer addition
require us to assign pNE to a pointer to char in order to add pResident to
it.  See section 6.4 in Kernighan and Ritchie for more information on
pointers to structures.
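
        A short sketch shows why the cast matters.  MiniNE below is a
deliberately cut-down, hypothetical structure, not the real NE header
layout from MMNEHEAD.H; only the pointer arithmetic is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical header: NOT the real NE layout. */
typedef struct {
    unsigned char  signature[2];
    unsigned short pResident;       /* offset from the start of the header */
} MiniNE;

/* Return a pointer to the table that starts pResident bytes after pNE. */
unsigned char *ResidentTable(MiniNE *pNE)
{
    assert(pNE != NULL);
    /* pNE + n would advance n * sizeof(MiniNE) bytes, so cast first. */
    return (unsigned char *)pNE + pNE->pResident;
}
```

Without the cast, adding pResident to pNE would scale the offset by the
size of the structure and land far past the table.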

        The first entry added to our entry point list is the first entry
from the resident names table.  This is the module name we talked about
above.  We parse the resident names table first so that the module name
(which will be needed later to generate the .MAP file) is the first element
of our entry point list.  This is a convenience, but not a requirement.

        Often Windows applications have no resident APIs.  In this case,
the resident names table still exists; its first and only entry contains
the module name of the executable.  But remember that an API will only
show up in one table:  it is either in the resident names table or else
it is in the non-resident names table.

        Unfortunately, any structure that we'd write to map a
resident names entry is inherently unusable.  This is because there is
a variable length structure right in the middle of it!  The structure
begins with a length byte 'n' followed by 'n' characters.  This string
(although it is not a valid C string, as it is not terminated
by a null character) names the function that is exported.  A word follows
the name which gives the ordinal value (the number of the entry table entry
that corresponds to this exported function).

        For each entry in the table, we create a new entry point node
by calling MakeEntryPointElement().  An entry point node contains the
following fields:  a pointer to a null-terminated function name, an API
ordinal value, a segment number and an offset.  Any value that is not
yet known is set to INVALID_VALUE so that we do not use it.  Currently
segment number and offset are invalid as these values will not be known
until we parse the entry table later.  The resident names table ends when a
length byte of 0 is found, and in this case the name and ordinal fields do
not exist.  The same holds for the other tables that we'll discuss in this
article -- a length byte of 0 represents the end of the current table, with
no usual fields following.
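
        The table walk just described can be sketched as below.  Unlike
MapMan, which builds a linked list with MakeEntryPointElement(), this
sketch copies each name into a caller-supplied array; NameEntry and
ReadNamesTable() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME 255                /* a length byte cannot exceed this */

typedef struct {
    char         name[MAX_NAME + 1];    /* copied and null-terminated */
    unsigned int ordinal;
} NameEntry;

/*
 * Decode up to `max` entries from a names table.  Each entry is a length
 * byte n, n name bytes, and a little-endian ordinal word; a length byte
 * of 0 ends the table.  Returns the number of entries decoded.
 */
size_t ReadNamesTable(const unsigned char *p, const unsigned char *end,
                      NameEntry *out, size_t max)
{
    size_t count = 0;

    assert(p != NULL && end != NULL && out != NULL);
    while (p < end && *p != 0 && count < max) {
        unsigned int n = *p++;
        if ((size_t)(end - p) < n + 2u)     /* truncated entry */
            break;
        memcpy(out[count].name, p, n);
        out[count].name[n] = '\0';
        p += n;
        out[count].ordinal = (unsigned int)p[0] | ((unsigned int)p[1] << 8);
        p += 2;
        count++;
    }
    return count;
}
```

For a resident names table, entry 0 of the result is the module name
with ordinal 0, and any remaining entries are exported functions.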

        There is only one non-standard part to this function.  We create
a null-terminated API name by overwriting the first byte of the ordinal
number in our buffer after first saving the value of the ordinal in our entry point
element list.  Of course, this requires that the pointer saved also point
to the first character of the string (the length byte must be skipped).
If we do this, we won't have to reallocate a second piece of memory to
hold a new null-terminated character string (also known as an ASCIIZ string).
We need an ASCIIZ string so that standard library functions in DOS can
be called -- those that take strings require their strings to be
terminated with a null character.
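
        The in-place trick can be sketched as follows, under the
assumption that the buffer holding the table is writable and that the
ordinal is saved before its low byte is overwritten.
TerminateInPlace() is a hypothetical name, not a MapMan function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Turn the entry at `p` into an ASCIIZ string in place.  Stores the name
 * pointer in *pName and the ordinal in *pOrdinal, and returns a pointer
 * to the next entry, or NULL at the end-of-table marker.
 * The ordinal's low byte in the buffer is destroyed.
 */
unsigned char *TerminateInPlace(unsigned char *p, char **pName,
                                unsigned int *pOrdinal)
{
    unsigned int n;

    assert(p != NULL && pName != NULL && pOrdinal != NULL);
    n = *p;
    if (n == 0)
        return NULL;                    /* end of table */
    *pName = (char *)(p + 1);           /* skip the length byte */
    p += 1 + n;                         /* now at the ordinal word */
    *pOrdinal = (unsigned int)p[0] | ((unsigned int)p[1] << 8);
    p[0] = '\0';                        /* ordinal saved; terminate name */
    return p + 2;
}
```

The price of avoiding a second allocation is that the table in the
buffer can be walked only once, since the ordinals no longer survive.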

BuildNonResidentNamesTable()

        The non-resident names table is pointed to by the pNRes pointer
in the NE header structure.  The  BuildNonResidentNamesTable()
function is nearly identical to the BuildResidentNamesTable() function
above.  The only difference is that the non-resident names table offset
is based on the start of the MZ header (i.e., the start of the executable).
This must have been done to facilitate easy access.  Once the operating
system (or OS) has saved the pNRes offset to the table, the OS can
reload the table by opening the file again and seeking pNRes bytes into it.
In this case there is no need to find the NE as our pointer takes us
directly to the table we want.
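
        The difference between the two bases is easy to get wrong, so
here is a small sketch of the offset calculation.  NamesTablePos and
LocateNamesTables() are hypothetical names; the three arguments are
assumed to have been read from the MZ and NE headers already.

```c
#include <assert.h>

typedef struct {
    unsigned long residentPos;      /* file offset of resident names */
    unsigned long nonResidentPos;   /* file offset of non-resident names */
} NamesTablePos;

/* Convert the two header fields into offsets from the start of the file. */
NamesTablePos LocateNamesTables(unsigned long neOffset,
                                unsigned int pResident, unsigned long pNRes)
{
    NamesTablePos pos;

    assert(neOffset >= 0x40);           /* NE header follows the MZ header */
    pos.residentPos    = neOffset + pResident;  /* relative to NE header */
    pos.nonResidentPos = pNRes;                 /* relative to file start */
    return pos;
}
```

With the whole file in a buffer, as in MapMan, these values are simply
added to pBuffer; with an open file they are the arguments to a seek.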

        Like the function before it, BuildNonResidentNamesTable() adds
entries to our entry point list.  Once this function completes (as described
above), all exported entry points in the Windows module have been placed on
our entry point list with their ordinals.  The first entry in the non-resident
names table is the module description.  Non-entry points, like module name
and module description, have a 0 ordinal and will not have an entry
in the entry table.

BuildEntryTable()

        The entry table is pointed to by the pEntry pointer in the NE header
structure.  The pEntry pointer is based off the start of the NE header.  It is
this table that associates the name and number by which an application can
export a function to other modules with an actual segment:offset in the
exporting module.  The NE header also has the cbEntry field, which gives the
size in bytes of the entry table.

        An entry table entry can consist of various combinations of four
structures.  An entry table entry begins with a single byte, which is the
count of records in this entry (or 0 at the end of the table).  This is
followed by a byte that describes the records in this entry.

        Note that this second byte can have a number of meanings.  If the
second byte is zero, then the first byte is a count of entries to skip.
This means that this count is to be added to the current entry number
to calculate a new entry number.  This is an optimization to reduce the
size of the entry table by replacing anywhere from 1-255 entries with
only two bytes.  The VGA.DRV that comes with my copy of Windows 3.1 has
a number of these skip counts in its entry table entries.

        The second reserved value is 0xFE (254 decimal).  This value
denotes that this entry is a data value.  See Undocumented Windows, Chapter
5, for a description of such constants in the Windows kernel (e.g.,
__WINFLAGS).  Note that these data values must be extracted via GetProcAddress()
and will not be available via symbolic name.  (Rather, they will be, but they
won't be right!)  See the section of Windows Internals that discusses the
data values initialized during the Windows boot process for more information
as to why these symbolic values won't work by themselves.

        The last reserved value is 0xFF (decimal 255).  This value marks this
group of entry records as belonging to a moveable segment.  A moveable
entry table entry contains a byte of flags, an int 3fh instruction, the
actual segment number, and the offset of the particular entry.

        One of the interesting things about this structure is the Int 3fh.
This software interrupt was used by the Microsoft overlay manager.
Perhaps the real-mode version of Windows also used this interrupt as a loader
for discarded segments.  If so, things must be much easier now in protected
mode, with not-present selectors and segment-not-present faults on their
access!  For our purposes, we only care about the segment number and the
offset.  The software interrupt is just so much real mode baggage...

        Any other values in the second byte give the segment number of
the segment containing these elements.  This segment is a fixed
(i.e., not moveable) segment, and the entry structure contains only
flags and an offset.

        Once we've got all this information, we call GetEntryPointElement()
to find the name associated with the current entry table index (remember
that this is the same as the API ordinal), and update our entry element list.
We continue processing entry table records until the count is zero or we've
read more bytes than the cbEntry field in the NE header.  You should note
that the count of records is often greater than 1 (there can be multiple
records in a single grouping depending on how the ordinals are laid out).
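
        Putting the cases above together, an entry table walk can be
sketched as below.  This is not MapMan's BuildEntryTable(); EntryPoint
and ReadEntryTable() are hypothetical names, and the sketch fills an
array instead of updating the entry point list.  It assumes, following
the published NE format, that ordinals start at 1, that a fixed record
is 3 bytes (flags, offset) and that a moveable record is 6 bytes (flags,
INT 3Fh, segment, offset).

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned int ordinal;
    unsigned int seg;
    unsigned int off;
} EntryPoint;

/* Decode an entry table of cbEntry bytes; returns entry points stored. */
size_t ReadEntryTable(const unsigned char *p, size_t cbEntry,
                      EntryPoint *out, size_t max)
{
    const unsigned char *end;
    unsigned int ordinal = 1;           /* assumed: ordinals start at 1 */
    size_t found = 0;

    assert(p != NULL && out != NULL);
    end = p + cbEntry;
    while (end - p >= 2 && p[0] != 0) {
        unsigned int count = p[0];      /* records in this group */
        unsigned int kind  = p[1];      /* 0, 0xFF, or a segment number */
        unsigned int recSize;

        p += 2;
        if (kind == 0) {                /* skip marker: no records follow */
            ordinal += count;
            continue;
        }
        recSize = (kind == 0xFF) ? 6u : 3u;
        if ((size_t)(end - p) < (size_t)count * recSize)
            break;                      /* truncated table */
        while (count-- > 0) {
            if (found < max) {
                out[found].ordinal = ordinal;
                if (kind == 0xFF) {     /* flags, INT 3Fh, seg, offset */
                    out[found].seg = p[3];
                    out[found].off = p[4] | ((unsigned int)p[5] << 8);
                } else {                /* flags, offset; kind is segment */
                    out[found].seg = kind;
                    out[found].off = p[1] | ((unsigned int)p[2] << 8);
                }
                found++;
            }
            ordinal++;
            p += recSize;
        }
    }
    return found;
}
```

Groups marked 0xFE come back with a segment value of 0xFE; a caller
would need to treat those as data values rather than code addresses.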

BuildSegmentTable()

        We'll also need to build a segment table for the .MAP file.  To do
this, we'll need to walk the segment table in the NE.  This table is pointed
to by pSegment (based on the start of the NE header), and has a size of
cbSegment * 8.  Each segment record has four fields:  offset, length, flags and
minimum size, and there are cbSegment segment records in the segment table.

        We allocate a block of memory large enough for the segment table,
and the pSegmentHead variable points to the start of that block.  We'll
then walk the block later by considering the pointer to point to an array
of cbSegment segment records.  Now we have all the information that we
need to generate a .MAP file, so let's take a look at how we do it...
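
        As a rough sketch of the segment table walk, the routine below
decodes cbSegment eight-byte records into an array.  SegmentRecord and
ReadSegmentTable() are hypothetical names rather than MapMan's own, and
the SEG_FLAG_DATA bit is an assumption taken from the published NE
format that should be checked against MMNEHEAD.H.

```c
#include <assert.h>
#include <stddef.h>

#define SEG_RECORD_SIZE 8       /* four 16-bit words per record */
#define SEG_FLAG_DATA   0x0001  /* assumed: set for DATA, clear for CODE */

typedef struct {
    unsigned int offset;        /* where the segment lives in the file */
    unsigned int length;        /* size of the segment in the file */
    unsigned int flags;
    unsigned int minSize;       /* minimum size to allocate in memory */
} SegmentRecord;

/* NE header fields are little-endian 16-bit words. */
static unsigned int rd16(const unsigned char *p)
{
    return (unsigned int)p[0] | ((unsigned int)p[1] << 8);
}

/* Decode cbSegment records starting at p into out[0..cbSegment-1]. */
void ReadSegmentTable(const unsigned char *p, unsigned int cbSegment,
                      SegmentRecord *out)
{
    unsigned int i;

    assert(p != NULL && out != NULL);
    for (i = 0; i < cbSegment; i++, p += SEG_RECORD_SIZE) {
        out[i].offset  = rd16(p);
        out[i].length  = rd16(p + 2);
        out[i].flags   = rd16(p + 4);
        out[i].minSize = rd16(p + 6);
    }
}
```

The length and flags fields are what the segment section of the .MAP
file needs: one line per segment giving its size and its CODE or DATA
class.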

Building a .MAP File

        The process of constructing a .MAP file is found in BuildMapFile().
This process is actually fairly complicated; the complications arise
from the demands of making a .MAP file that Microsoft's MapSym symbol file
generator will accept, and not from any conceptual complexity.  As an example,
MapSym will be unable to find the entry point of a program if the .MAP file
used does not contain "Program entry point at " at the beginning of a line.
Likewise a section with the heading "Publics by Value" must be in the .MAP
file or MapSym will not accept it as valid for symbol file generation.  Note
that MapMan does not sort the Publics by Name or Publics by Value sections
of the .MAP file in the interest of simplicity.
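
        To give a feel for how fussy the format is, here is a sketch of
two line formatters.  FormatPublicLine() and FormatEntryPoint() are
hypothetical names, and the column spacing is an assumption modeled on
the sample lines in this article, so compare it against a .MAP file
from your own linker.  Note that snprintf() is a C99 function; with an
older compiler you would use sprintf() and a generously sized buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Format one Publics by Value line, e.g. " 0004:021C       myhExeHead". */
int FormatPublicLine(char *out, size_t size, unsigned int seg,
                     unsigned int off, const char *name)
{
    assert(out != NULL && name != NULL);
    return snprintf(out, size, " %04X:%04X       %s", seg, off, name);
}

/* Format the entry point line; the text must start in column one. */
int FormatEntryPoint(char *out, size_t size, unsigned int seg,
                     unsigned int off)
{
    assert(out != NULL);
    return snprintf(out, size, "Program entry point at %04X:%04X", seg, off);
}
```

Keeping each fixed string in exactly one place makes it easier to adapt
the output when a different linker's format is wanted.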

        As already noted in Table Five, the map file requires the following
sections in a specific format:  module name, segment table, DGROUP,
exported entry points, public symbols by name, public symbols by value and
application entry point.

        See Table Four for an example .MAP file generated by MapMan for the
same application whose .MAP file is given in Table Three.

Generating Map Files For Another Compiler

        To generate map files in another compiler's format, you'll need
to update five functions.  Four are in mapllist.c (DumpEntryPointList(),
DumpSegmentList(), DumpPublicsByName() and DumpPublicsByValue()), as they
refer to private data structures found in that module.  The fifth, and the
main one, is BuildMapFile() in mapmkmap.c.  Have fun and let me know how
it goes!

Extending A Generated Map File

        You can add your own symbols to these .MAP files and have them
available in your .SYM file.  Just be sure to put your new symbols and
addresses in the Publics by Value section if you want MapSym to see them!

        For example, I took a few items from the THHook section of
Undocumented Windows, namely hExeHead and CurTDB, and added them to
a .MAP file I generated for KRNL386.EXE with MapMan.  We know from
the exports table that THHook is at 4:0218.  From Undocumented Windows,
we find that hExeHead is at THHook + 0x04, or 4:021C, and CurTDB is at
THHook + 0x10, or 4:0228.  So I added the following lines (again, to the
Publics by Value section) to my map file, rebuilt the .SYM file with
MapSym, and ran my debugger.

                      0004:021C       myhExeHead
                      0004:0228       myCurTDB

I could now dump hExeHead and CurTDB with complete symbols, using the names
that I had given in my generated .MAP file.

Discussion of References for the NE Header

        The Duncan book has an excellent description of the New Executable
format.  My copy has been missing since the last time I moved, but I hope
that I'll find it in a box somewhere!

        Microsoft's Programmer's Reference, Volume 4:  Resources is also a
good source.  In addition to the NE header, it covers other Windows file
types.

        Tom Swan's Inside Windows File Formats is the best file format book
that I've seen yet for the 16-bit Windows platform.  This is one of those
books that I keep close to my computer.  It was invaluable both as a check
on the Microsoft documentation and in its own right.

Closing

        I hope that you've found this article to be interesting reading.
I also hope that you'll find MapMan to be a useful tool in your collection
of Windows debugging tools.  But the fact that the functionality found in
MapMan does not exist in most debuggers or C compilers brings up a few
questions.

        Why don't OS/2's kernel debugger and WDeb386 give you exported
entry points for free?  Why did I need to write a tool such as MapMan?
It's obviously not a lack of skill on the part of the developers of
Windows and OS/2.  Their debugger and loader programmers should certainly
understand the NE format well enough to add this sort of function.
Soft-ICE for Windows (by Nu-Mega Technologies) has had it for years.
It can only be that the Windows and OS/2 teams haven't had the time to do
so.  If so, I have no objection if they use my source code...  If your
Windows debugger doesn't give you exported entry points, then ask its
vendor to add the feature!  (And then get a copy of Soft-ICE for
Windows.)  I'm not associated in any way
with Nu-Mega, of course, but I find their products extremely useful for
Windows software development.

        Why have C compilers stopped shipping startup source code?  It's
a great way to learn how your compiler works!  If your compiler does ship
it (see your documentation), then consider yourself lucky.  In any case,
learn what your startup code is doing for you.  Often you're given enough
source to modify and rebuild the startup code so that you can create
your own specialized version.  If your compiler doesn't ship startup code,
then ask the vendor to do so!

             Table One -- A Sample .DEF File

NAME             TRAPMAN
DESCRIPTION      'J. Hlavaty:  Windows GP handler for debugging'
EXETYPE          WINDOWS

PROTMODE

CODE             LOADONCALL NONDISCARDABLE
DATA             PRELOAD    MULTIPLE


HEAPSIZE         1024
STACKSIZE        8096

EXPORTS
                 MainWndProc   @1
                 About         @2


             Table Two -- The New Executable header

( include mmnehead.h)

             Table Three -- A Sample Map File

 TRAPMAN

 Start     Length     Name                   Class
 0001:0000 013AEH     TRAPMAN_TEXT           CODE
 0001:13AE 001A4H     DPMI_TEXT              CODE
 0001:1560 01069H     HANDLER                CODE
 0001:25CA 00000H     TRAPDATA_TEXT          CODE
 0001:25D0 00622H     _TEXT                  CODE
 0002:0000 00000H     DATA                   DATA
(portions removed to conserve space)
 0002:0930 0020AH     c_common               BSS

 Origin   Group
 0002:0   DGROUP

 Address   Export                  Alias

 0001:061C About                   About
 0001:0322 MainWndProc             MainWndProc

  Address         Publics by Name

 0001:061C       About
 0000:0000  Imp  MESSAGEBOX           (USER.1)
 0001:16C2       MYFARPROC
 0001:2595       MYODS
 0001:1422       _DPMIAllocateLDTDescriptors
 0001:25E1       __astart

  Address         Publics by Value

(removed to conserve space)

Program entry point at 0001:25E1

             Table Four -- MapMan Generated .MAP file For the Same Application

 TRAPMAN

 Start     Length     Name                   Class
 0001:0000 02BF2H     Seg1_TEXT              CODE
 0002:0000 00B3AH     Seg2_DATA              DATA

 Origin   Group
 0002:0   DGROUP

 Address   Export                  Alias

 0001:0322 MAINWNDPROC             MAINWNDPROC
 0001:061C ABOUT                   ABOUT

  Address         Publics by Name

 0001:0322       MAINWNDPROC
 0001:061C       ABOUT

  Address         Publics by Value

 0001:0322       MAINWNDPROC
 0001:061C       ABOUT

Program entry point at 0001:25E1

                    Table Five -- Microsoft Linker Map File Sections

        Section         Contents
        =======         ========
           1            Module name of executable
           2            Segment table
           3            DGROUP
           4            Exported entry points
           5            Public symbols by name
           6            Public symbols by value
           7            Entry point

Note that spacing, case and terminology in many of the section headers
     are mandatory!  For example, MapSym will reject the .MAP file with
     an obscure "No public symbols" error message if it cannot find the
     string "Publics by Value".

              Table Six -- The Map File Creation Process

        1. Load header information of executable into memory or exit (on failure)
        2. Verify valid MZ header or exit
        3. Verify valid NE header or exit
        4. Build internal representation of Resident Names entries
        5. Build internal representation of Non-Resident Names entries
        6. Add Entry Table information to internal representation from 4 and 5
        7. Build internal representation of Segment Table
        8. Write .MAP file format to STDOUT (can be redirected)

========================================================

References

Duncan, Ray.  Advanced OS/2 Programming.  (Redmond, WA:  Microsoft Press,
   1989).

Kernighan, B. and Ritchie, D.  The C Programming Language, 2nd edition.
   (Englewood Cliffs, NJ:  Prentice Hall, 1988).

Microsoft Corporation.  Microsoft Windows SDK Programmer's Reference,
   Volume 4:  Resources.  (Redmond, WA:  Microsoft Corporation, 1992).

Pietrek, Matt.  Windows Internals.  (Reading, MA:  Addison-Wesley, 1993).

Schulman, A., Maxey, D., and Pietrek, M.  Undocumented Windows.
   (Reading, MA:  Addison-Wesley, 1992).

Swan, Tom.  Inside Windows File Formats.  (Indianapolis, IN:  Sams
   Publishing, 1993).

