             
                           WORDFUN 2.0

           Freeware DOS word utilities.  These programs may be 
     freely distributed.  David M. Dibble,  July, 2001

          SPELL, ANAGRAM, FIND - all using 
              WORDS.DAT dictionary
          WC - word count
          VIEW, VIEW.DOC
          WORDFUN.TXT, README.TXT

                      [Advanced options]
          MAKE  - compile user or replacement dictionary
          CHECK - remove duplicate words from sorted list,
                  preliminary to running MAKE
          EXTRACT.BAT - automate word list generation

          WORDFUN 2.0 contains a 118,000 word list which is more 
     than 96% compatible with The American Heritage Dictionary of 
     the English Language.  The fast, noninteractive SPELL 
     CHECKER will proof an entire book in less than a minute, 
     writing to file unmatched words for each chapter.  Spell 
     check short text files without loading a word processor.  
     Includes ANAGRAM and FIND for use with Scrabble, Word 
     Jumble, or crosswords--or simply for quick word look ups.  
     Does WORD FREQUENCY counts, with the output lists ranked by 
     number or sorted alphabetically.  WORD COUNT accepts wild 
     cards and will give total word counts for an entire book in 
     a matter of seconds.  Includes an enhanced MORE replacement.

           What's New:  Added spell checking capability.  Word 
     frequency will now output alphabetic lists, with or without 
     the frequency numbers.  Increased speed.  Files now have an 
     embedded path to the WORDS.DAT dictionary, so they can be 
     used in any directory--that is, put the tiny SPELL file in 
     your DOS path, and spell check anywhere on your hard disk.  
     WORDS.DAT data path may be modified.  Included are the 
     advanced utilities MAKE and CHECK.

          These utilities are free for use, and may be freely 
     distributed, including sale on CD-ROM collections.  They are 
     to remain free.  This compilation represents years of my 
     life.  Use of this word list in commercial ventures is 
     expressly forbidden without a contractual license from me-- 
     i.e. I have the right to sue you under the existing copyright 
     laws, both for royalties and for damages.  If you want to use 
     the list in a spelling checker you have to give it away.



          THE DATA PATH -- C:\SPELLING

          The tiny programs ANAGRAM, FIND, and SPELL all require 
     access to the WORDS.DAT dictionary in order to function.  To 
     find the dictionary, they will first look in the current 
     directory.  Failing to find it there, they will look for 
     WORDS.DAT in the C:\SPELLING directory.  If you want access 
     to these programs from anywhere on your hard disk, then 
     create the directory C:\SPELLING and place WORDS.DAT there.
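
          As a rough illustration, the search order above can be 
     sketched in Python.  This is not the code inside the 
     programs; the function name find_dictionary and its 
     parameters are assumptions made for the example.

```python
import os

def find_dictionary(current_dir, fallback_dir, name="WORDS.DAT"):
    # Documented order: the current directory first, then the
    # fallback directory (C:\SPELLING in the programs as shipped).
    for directory in (current_dir, fallback_dir):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None   # dictionary not found in either place
```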

          The data path is easy to modify, so that you can put 
     WORDS.DAT and related files in a logical subdirectory within 
     the existing structure of your hard disk.  See the later 
     section on patching the files.  But the easiest way to try 
     out everything is to create C:\SPELLING, unzip all the files 
     into it, and proceed.



          SPELL    =============================================

          SPELL does both word frequency and spelling checks.  
     Type SPELL /? to see:

          Usage> SPELL {/P}{/F}{/A}{/L} Filespec {out_filespec}

          Spelling:  compare file words to WORDS.DAT and output 
                     only words not found in dictionary.

          /P = Prompt (y/n) before processing each file
          /F = word Frequency - compile word list, rank by use
          /A = word frequency, Alphabetic sort by word
          /L = alphabetic word List, no numbers

          Filespec may contain wild cards.  If no output is 
     specified, extents .SPL or .FRQ are used.  Input files must be 
     plain ASCII text, and should not contain hyphens except for 
     compound adjectives.  With professional writing, no editor 
     wants to see text with hyphenated words at the end of lines; 
     all it does is disrupt the revision process.

          The /P prompt option may be used in combination with 
     the other options.

                    SPELL GOOD.TXT

          No options specified, so default is spelling.  SPELL 
     parses GOOD.TXT, and compares the file words to WORDS.DAT 
     dictionary.  Only those words for which no match was found 
     in the dictionary will be output to the file GOOD.SPL.  The 
     spelling check ignores case.
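
          In outline, the default spelling mode behaves something 
     like the Python sketch below.  How SPELL really parses words 
     is not documented here, so the tokenizing rule (runs of 
     letters and apostrophes) and the name unmatched_words are 
     assumptions, and the sketch lowercases its output for 
     simplicity.

```python
import re

def unmatched_words(text, dictionary):
    # Case is ignored, so compare everything in lowercase.
    known = {entry.lower() for entry in dictionary}
    missing = set()
    for word in re.findall(r"[A-Za-z']+", text):
        word = word.strip("'").lower()   # drop stray quote marks
        if word and word not in known:
            missing.add(word)
    # One entry per unmatched word, in alphabetical order.
    return sorted(missing)
```

     For example, checking "The quick brwon fox" against a 
     dictionary holding the, quick, and fox leaves only brwon.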

          Spelling turned out to be quite effective, especially 
     considering my initial negative expectations.  The WORDS.DAT 
     dictionary was modified, with thousands of proper nouns and 
     names added to make it more suited for spell checking (the 
     World Almanac is handy for this).  WORDS.DAT was able to 
     uncover errors in the spell checker that came with my word 
     processor.  I confirmed that this commercial spell checker 
     was letting pass misspellings that did not fool the WORDS.DAT 
     dictionary.

          The spell check is reasonably fast.  Using a 486DX-66 
     computer and files TEST1, TEST2, TEST3, each file of about 
     5,000 words, each yielding a list of about 1,400 words to 
     check, I entered the command:

                    SPELL T*

          Total elapsed time to read in, parse, and spell check 
     all three files was 5 seconds.

          I further tested the program with a number of 
     manuscripts.  I converted a book to ASCII (takes about 1 
     second), then:

                    SPELL CHAP*

     The program took forty seconds to finish.  I then combined 
     the .SPL files and ran SPELL on the result:

                    COPY *.SPL ALL
                    SPELL ALL

     This removed duplicates and put everything in alphabetical 
     order.  Note that all of these files had been double checked 
     with the commercial spelling checker that came with my word 
     processor.  They should not have contained any misspelled 
     words.  Yet I found mistakes like "psychoanalyse."

          The point is that SPELL made it easy to scan hundreds 
     of thousands of words to find these problems.  The fact that 
     it is not interactive, which originally seemed such a minus, 
     turned out to have its advantages.  When used alone for 
     small text files, or in combination with a regular spelling 
     checker, SPELL can definitely be useful.

                    SPELL /F GOOD.TXT 

     will cause SPELL to read in GOOD.TXT and output a list of 
     words, each preceded by the number of times it occurs in the 
     document, to a file GOOD.FRQ.  The words in GOOD.FRQ will be 
     ranked with most used words first, and words appearing only 
     once last.

                    SPELL /A GOOD.TXT

     as above, but with /A the frequency word list is sorted 
     alphabetically, rather than ranked by word occurrence.

                    SPELL /L GOOD.TXT

     Identical to /A, except no frequency numbers are given.   /L 
     produces a straight, alphabetized list of all the words in 
     GOOD.TXT, written to GOOD.FRQ.

          In this list possessives such as everybody's and one's 
     are listed as separate words.  But in a spelling check the 
     apostrophe-S is ignored, with the root word being compared 
     to the WORDS.DAT dictionary.  This means that the display 
     stats of unique and duplicate words are slightly different 
     between spelling checks (possessives truncated) and word 
     frequency (words remain whole).
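
          The three frequency outputs, and the different handling 
     of possessives, can be sketched in Python as follows.  The 
     function names, the tokenizing rule, and the alphabetical 
     tie-break in the ranked list are assumptions made for the 
     example, not a description of SPELL's internals.

```python
import re
from collections import Counter

def count_words(text):
    # Frequency mode keeps possessives such as "one's" whole.
    pattern = r"[a-z]+(?:'[a-z]+)?"
    return Counter(w.lower() for w in re.findall(pattern, text, re.I))

def ranked(counts):
    # Like /F: most used words first (ties alphabetical, an assumption).
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def alphabetic(counts):
    # Like /A: sorted by word, frequency numbers kept.
    return sorted(counts.items())

def word_list(counts):
    # Like /L: the words alone, in alphabetical order.
    return sorted(counts)

def spelling_key(word):
    # Spelling mode ignores a trailing apostrophe-S before the lookup.
    return word[:-2] if word.lower().endswith("'s") else word
```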

          In all these lists, and in spelling, capitalization 
     follows the all-or-none rule.  Every occurrence of the word 
     in a file must be capitalized; a single exception and the 
     word becomes lowercase.
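
          A small Python sketch of the all-or-none rule follows.  
     It assumes that "capitalized" means an initial capital 
     letter; the name display_forms is invented for the example.

```python
def display_forms(words):
    listed = {}                      # lowercase key -> form to list
    for word in words:
        key = word.lower()
        if not word[:1].isupper():
            listed[key] = key        # one lowercase use forces lowercase
        elif key not in listed:
            listed[key] = word       # capitalized every time so far
    return listed
```

     Given Paris, The, the, Paris, the sketch lists "Paris" 
     (always capitalized) but "the" (one lowercase occurrence).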

          A final technical note:  the speed of the spell tests 
     really depends very little on the file word count; what 
     matters is the size of SPELL's internal word list.  In the 
     above test, I could have combined all the chapters into a 
     single file and run SPELL on that.  The total file word count 
     would be the same.  But the internal word list generated from 
     a single megabyte file would soar.  If the list is 9,000 
     words, then each new word must be compared to that list and, 
     if unique, added to its end.  One can easily double the time 
     required for a spelling check by working with a single 
     megabyte-sized file instead of many chapter-sized files.  On 
     the other hand, if your 25,000-word novelette reduces to a 
     list of only 3,000 words, then spell checking it as one large 
     file will hardly slow things down.  Those with gigahertz 
     machines may not care about this and may merrily spell check 
     two- or three-megabyte files.  In that case you should 
     realize that SPELL's internal word list can't exceed 18,000 
     words, and possibly fewer, depending on your configuration.  
     This buffer size may be a lot bigger than you realize; it is 
     probably large enough to spell check a hefty Stephen King 
     novel as a single file--and while waiting for SPELL to finish 
     you could always read the book.
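
          The effect of list size can be illustrated with a toy 
     model in Python.  It assumes, as the description above 
     suggests, that the internal list is searched from the start 
     for every word read; the real program may well differ, and 
     the name comparisons_for is made up for the example.

```python
def comparisons_for(words):
    unique = []          # the growing internal word list
    comparisons = 0
    for word in words:
        for known in unique:
            comparisons += 1
            if known == word:
                break            # already listed, stop searching
        else:
            unique.append(word)  # not found: add to the end
    return comparisons, len(unique)
```

     In this model two files with different vocabularies cost 
     fewer comparisons when checked separately than when joined, 
     because each file starts with an empty list.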



          ANAGRAM   ============================================

          For help with Scrabble or a Word Jumble.  To use, type 
     ANAGRAM at the DOS prompt, followed by the letters you wish 
     to unscramble:  ANAGRAM letters 

          Remember that WORDS.DAT must be present in the current 
     directory, or in C:\SPELLING.

          For the Scrabble blank tile, or wild card, use a 
     question mark.  Some examples: 

                    ANAGRAM  costnoy 
                    ANAGRAM  satire? 
                    ANAGRAM  ?retina 
                    ANAGRAM  aa??? 

          The last example would output all five-letter words 
     that have two a's.  This will quickly scroll off the screen, 
     so you may wish to redirect output to a file, or through 
     MORE--or better yet, through VIEW /R, which is a MORE 
     replacement.  

                    ANAGRAM aa??? | view /r 

          Now you will be able to page back and forth, and use Home 
     and End to examine the output.  Any time you use wild cards the 
     programs ANAGRAM and FIND will go through the entire 118,000 
     word dictionary.  If you dislike typing ANAGRAM, then rename 
     the program to something like AN.COM.
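
          One way to picture the matching, blank tiles included, 
     is the Python sketch below.  It assumes that a match must 
     use every letter entered (which is what the aa??? example 
     implies) and that case is ignored; the function names are 
     invented, and this is not the method used by ANAGRAM itself.

```python
from collections import Counter

def anagram_matches(letters, word):
    if len(word) != len(letters):
        return False
    blanks = letters.count("?")
    have = Counter(letters.lower().replace("?", ""))
    need = Counter(word.lower())
    # Letters of the word that the blank tiles would have to supply.
    shortfall = sum((need - have).values())
    return shortfall <= blanks

def anagram(letters, words):
    return [word for word in words if anagram_matches(letters, word)]
```

     With a word list holding llama, hello, and tycoons, the 
     pattern aa??? selects llama and costnoy selects tycoons.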



          FIND     =============================================

          For help with Scrabble or with crosswords.  If there is a 
     name conflict with the DOS FIND program, then rename my 
     utility to LOOKUP or something.  To use, type FIND at the DOS 
     prompt, followed by a letter pattern you wish to resolve:  
     FIND letters 

          WORDS.DAT must be present in the current directory, or 
     in C:\SPELLING.

          Unlike ANAGRAM, the letter positions are locked as 
     entered.  Once again, for the Scrabble blank tile use a 
     question mark.  This will be replaced with a single letter.  
     With FIND, however, you may use an asterisk (*) to denote 
     any combination or number of letters, or even no letters.  
     For example:
          
                    FIND *A*E*I*O*U* 

     will display all words in WORDS.DAT which contain the vowels 
     in order (abstemious, facetious, and so forth).  Similarly, 
     FIND *a*a*a*a*a*  would locate all words that have five or 
     more a's.  

                    FIND *EIGHT* 

     will display all words which contain the word "eight," such as 
     eightvo, heighten, pennyweight etc.  The asterisk may be 
     replaced with a number of letters, or by nothing, but the 
     question mark will have a one to one replacement.  How to find 
     all words that have two consecutive i's? 

                    FIND *ii* 
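
          The two wild cards can be modeled in Python by turning 
     the pattern into a regular expression.  This is only a 
     sketch: the function names are invented, and treating the 
     match as case-insensitive is an assumption based on the 
     mixed-case examples above.

```python
import re

def pattern_to_regex(pattern):
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")       # exactly one character
        elif ch == "*":
            parts.append(".*")      # any run of characters, or none
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def find(pattern, words):
    regex = pattern_to_regex(pattern)
    # fullmatch locks every letter to its position in the pattern.
    return [word for word in words if regex.fullmatch(word)]
```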
 
          As a special case, mostly for Scrabble play, FIND will 
     accept single input (FIND X), which outputs all lowercase 
     words, of 8 letters or fewer, which contain the designated 
     letter.  This is a subset of FIND *X* which would output all 
     words containing the letter X, including those that were 
     capitalized, and long words of more than 8 letters.  Note 
     that single input can be redirected to a file, or piped, as 
     in two of the examples below: 

          FIND  f??h*           fashion, fishmonger, foxhound ...  
          FIND  *a              words ending in an A 
          FIND  tr*ia           get spelling:  triskaidekaphobia 
          FIND  q >Q.TXT        redirect Q words to file 
          FIND  z | VIEW /R     piping, page text for Z words 
          FIND  *               entire dictionary 

     The output from FIND * can be redirected to a file, to browse 
     the dictionary.  Enter:  FIND *  >COMPLETE.DIC

          Better, and taking less disk space, is EXTRACT.BAT.  Use 
     this batch file to extract the original word lists from 
     WORDS.DAT in order to modify them and recompile them into a 
     personal version of WORDS.DAT.  Please read the later section 
     on creating replacement dictionaries.



          WC -- Word Count  ====================================

          There are many ways to get a word count, including 
     running a spelling checker.  I found myself writing the 
     figures down, then using a hand calculator to add everything 
     up.  I needed the word count not just for a chapter, but for 
     an entire book.  Curiously, I could find nothing online.  
     Hence this word count utility, which accepts wild cards.  
     Files must be in plain ASCII for accurate totals.  To use: 

                    WC Filespec 

     where Filespec may contain wild cards.  Thus: 

                    WC *.DIC 

     when run in a directory that contains A.DIC, B.DIC, C.DIC 
     etc., will not only show the word count for each file, but 
     will display the cumulative word count total for the complete 
     dictionary.  

          For a book:   WC CHAP* 

          This gives a word count for every chapter, as well as 
     a cumulative word count for the complete book.  Remember the     
     chapters must be in plain ASCII.  No extent was specified, 
     so no .BAK files will be included in the totals.
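
          The per-file and cumulative totals can be sketched in 
     Python as below.  Counting whitespace-separated runs as 
     words is an assumption (WC may count differently), and the 
     name word_counts is invented.  Note also that Python's glob 
     is not DOS:  there CHAP* would match CHAP1.BAK as well.

```python
import glob

def word_counts(filespec):
    per_file = {}
    for path in sorted(glob.glob(filespec)):
        with open(path, encoding="ascii", errors="replace") as handle:
            # A "word" here is any run of non-blank characters.
            per_file[path] = len(handle.read().split())
    return per_file, sum(per_file.values())
```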



          PATCHING THE PROGRAMS  (make a backup copy!)

          As initially stated, the tiny programs ANAGRAM, FIND, 
     and SPELL all require access to the WORDS.DAT dictionary in 
     order to function.  To find the dictionary, they first look 
     in the current directory, then for C:\SPELLING\WORDS.DAT.  
     If you want access to these programs from anywhere on your 
     hard disk, then create the directory C:\SPELLING and place 
     WORDS.DAT there.

          The data path is easy to modify.  The patch procedure 
     for ANAGRAM, FIND, and SPELL is identical.  In each of them 
     the data path C:\SPELLING\WORDS.DAT is listed at 10 hex (in 
     DEBUG, where COM files start at 100 hex, the address is 
     110).  Using a hex editor, you will see that the data path 
     starts on the second line.  You have three full lines of 16 
     bytes, or 48 bytes total, to work with.  In the ASCII 
     display, type over C:\SPELLING\WORDS.DAT with the full 
     subdirectory path where you have the WORDFUN files.  The 
     entry must terminate in a null.  Since the field is filled 
     with nulls, you needn't bother if you enter a longer data 
     path.  If you enter a shorter path, then flip to the hex 
     portion of the display, and terminate WORDS.DAT with hex 00.

          If you must use DEBUG:
               DEBUG SPELL.COM
               N MYSPELL.COM
               D
               E 110 'C:\YOUR\DATA\PATH\WORDS.DAT',0
               D 100
               W
               Q

     The N MYSPELL.COM renames the modified file, since you forgot 
     to make a backup before doing any patching.  In the fourth 
     line, you need the quote marks to enter a string into DEBUG.
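
          For those who would rather script the patch, the same 
     edit can be sketched in Python.  The offset (10 hex) and the 
     field size (48 bytes) come from the description above; the 
     function name patch_data_path is invented, and the sketch 
     works on bytes in memory, so try it on a copy of the program 
     first.

```python
PATH_OFFSET = 0x10    # file offset of the data path (110 hex in DEBUG)
FIELD_SIZE = 48       # three lines of 16 bytes

def patch_data_path(image, new_path):
    encoded = new_path.encode("ascii")
    if len(encoded) >= FIELD_SIZE:
        raise ValueError("path must leave room for the terminating null")
    # Pad with nulls so that a shorter path is properly terminated.
    field = encoded + b"\x00" * (FIELD_SIZE - len(encoded))
    return image[:PATH_OFFSET] + field + image[PATH_OFFSET + FIELD_SIZE:]
```

     Read the program with open(name, "rb"), pass the bytes 
     through patch_data_path, and write the result to a new file 
     name, much as N MYSPELL.COM does in DEBUG.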



          PERSONAL DICTIONARIES  ================================

          PART I:  Using MAKE.COM.

          WORDS.DAT provides excellent basic spell checking, but 
     you will probably want to add personal names, street 
     addresses, company names, special interests, and so forth.

          You can create a personal dictionary to use as an 
     adjunct to WORDS.DAT (or, indeed, to use as a replacement), 
     but some strict rules must be adhered to.  Your word list 
     must be in alphabetical order, without duplicates, one word 
     to a line (NO spaces or hyphens, although apostrophes are 
     allowed), the lines ending in a carriage return.  The last 
     word of the file must also end with a carriage return.
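
          These rules are easy to check by machine before running 
     MAKE.  The Python sketch below is one hedged way to do it; 
     the name list_problems is invented, and comparing words in 
     lowercase with plain string ordering is an assumption about 
     what "alphabetical order" means to MAKE.

```python
def list_problems(data):
    # data is the whole word list as one string.
    problems = []
    if data and not data.endswith(("\r", "\n")):
        problems.append((0, "last line lacks a line ending"))
    previous = None
    for number, line in enumerate(data.splitlines(), start=1):
        if not line or " " in line or "-" in line:
            problems.append((number, "blank line, space, or hyphen"))
        key = line.lower()
        if previous is not None:
            if key == previous:
                problems.append((number, "duplicate"))
            elif key < previous:
                problems.append((number, "out of order"))
        previous = key
    return problems
```

     An empty result means the list passed every test in the 
     sketch; line number 0 refers to the file as a whole.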

          Conveniently enough, this is precisely the output 
     format of a *.SPL file:  dictionary sort, no duplicates, one 
     word to a line (no spaces or hyphens), all lines, including 
     the last, ending in a carriage return.  The unmatched words, 
     those not in the WORDS.DAT dictionary, are what you want for 
     your personal dictionary.  Use SPELL's wild cards on all 
     your manuscripts, letters, reports, and articles, then 
     combine the .SPL files with:
                    COPY *.SPL ALL
                    SPELL ALL PERSONAL.DCT
     The second command takes all the combined, unmatched words 
     and puts them in alphabetical order, removing duplicates.  
     The file PERSONAL.DCT now needs to be pruned of accented or 
     misspelled words.

          Go through and remove any words with accented letters, 
     such as attaché, résumé, or café (my commercial spelling 
     checker can't handle accented words either).  At best such 
     entries are useless, and they may cause problems.  Don't 
     introduce single letters to the word list--if you are using 
     the output from SPELL there will be none.  If you are using 
     the output from /L there will be single letters, but MAKE 
     will ignore them.  Remove misspelled words from PERSONAL.DCT, 
     and anything that you don't want to compile.  Be very careful 
     to confirm the spelling of words that you include; now is the 
     time to get things right.  To create a dictionary:

                    MAKE  PERSONAL.DCT  WORDS.TOO

     Any output name can be specified, but WORDS.TOO is convenient 
     (see option 3 below).  If no output name is specified, the 
     default WORDS.DAT is used, which can overwrite and destroy 
     the main dictionary, so take care.  Since MAKE inserts a 128- 
     byte header, the compression ratio for a small file will be 
     negligible.  Keep PERSONAL.DCT on hand, so that you can later 
     amend or add to it, and recompile it.

          There is no particular limit to the size of your 
     personal dictionary:  the more words, the more efficient the 
     spelling check.  If PERSONAL.DCT grows to 120K, split it into 
     two pieces such as PERSONAL.A-L and PERSONAL.M-Z.  This is 
     necessary to avoid overflowing MAKE's compression buffer.  These 
     can be compiled with:

                    MAKE  PERSONAL.*  WORDS.TOO

     The files will be selected in the right order, A-L first, 
     then M-Z.  

          There is an easy way to check everything once your 
     personal dictionary gets too large to scan quickly, even if 
     it contains 10,000 words.  It is best to work on a RAM disk 
     away from other files.  Enter MAKE PERSONAL.DCT, to create a 
     small version of WORDS.DAT.  Then expand this dictionary with 
     FIND /e * >MY.DCT.  Now use a compare utility on the files 
     MY.DCT and PERSONAL.DCT to confirm that they are identical 
     (for one thing, they both better have the same number of 
     bytes).  Any difference between input and output is likely 
     due to accented words, which can be located and removed or 
     edited.



          PART II:  Using Dictionaries.

          Programming for the use of personal dictionaries shows 
     an economy of effort on my part.  This means one needs 
     various workarounds to use them.  Remember that SPELL first 
     looks for WORDS.DAT in the current directory then, failing to 
     find it, in the C:\SPELLING\ directory.  Below are several 
     different options.

          1.)  Put WORDS.TOO in the directory where you do your 
     writing and editing.  In it, or in your path, have a SPL.BAT 
     file:

               SPELL  %1  *.$$$
               REN  WORDS.TOO  WORDS.DAT
               SPELL  *.$$$
               REN  WORDS.DAT  WORDS.TOO
               DEL  *.$$$

          Use the BAT file by typing:  SPL filename.ext
          SPELL will parse the file, and output to FILENAME.$$$.  
     Then your personal dictionary will be renamed WORDS.DAT.  
     Since this is in the current directory, that is what SPELL 
     will next use on FILENAME.$$$ (output to FILENAME.SPL.)  In 
     the last two lines restore WORDS.TOO and delete the temporary 
     .$$$ file.

          Such a BAT file could be useful for multiple users, each 
     with their own personal dictionary, by replacing WORDS.TOO 
     with %2.  Use by typing SPL followed by the file name to 
     spell check and then your dictionary:  SPL file dictionary

               SPELL  %1  *.$$$
               REN  %2  WORDS.DAT
               SPELL  *.$$$              <==output to FILE.SPL
               REN  WORDS.DAT  %2
               DEL  *.$$$

          2.)  As an alternative, name your personal dictionary 
     WORDS.DAT, but keep it in a subdirectory or parent directory.  
     Send the output from SPELL there by specifying a path for the 
     output.  Switch to that directory to SPELL using your 
     personal dictionary.  Or send the output from SPELL to a RAM 
     disk, then copy your personal dictionary there (renaming it 
     to WORDS.DAT in the process), and switch to the RAM disk.

          3.)  Make a copy of SPELL.COM called SPELL2.COM.  Patch 
     SPELL2 as described above by changing .DAT to .TOO.  That 
     is, instead of C:\SPELLING\WORDS.DAT followed by a null at 
     10 hex, the entry reads C:\SPELLING\WORDS.TOO followed by a 
     null.

          Put WORDS.TOO in the C:\SPELLING\ directory along with 
     WORDS.DAT.  Now SPELL2 can be run anywhere on the hard disk, 
     and it uses your personal dictionary.  Use SPELL normally.  
     Whenever the output is overlong, run SPELL2 on it.  Or 
     create a SPL.BAT file (which could optionally specify a path 
     to the SPELL and SPELL2 files):

               SPELL  %1  *.SP$
               SPELL2  *.SP$  *.SP2
               DEL  *.SP$

          Use the BAT file by typing:  SPL filename

          The ideas from these various options should let you set 
     up something that suits your work habits.



          THE WHOLE SHEBANG  ====================================

          It takes me two months to make my way through the 
     dictionary, and this is a vast improvement over the time it 
     took back when many words had to be added.  Do you really 
     want to mess with the original dictionary?

          The main dictionary, WORDS.DAT, was compiled using MAKE 
     on a list of sorted files.  If you are wondering why I even 
     bothered--there are both anagram and spell programs that use 
     plain ASCII text files--the answer is that a compressed 
     dictionary easily doubles the speed of the programs ANAGRAM, 
     FIND, and SPELL.  To be of manageable length, the letter 
     divisions of the word files were as follows:

             A                G                 PO-Q
             B                H                 R
             C                I                 S
             CO               J-KL              SO
             D                M                 T
             E                N-O               U
             F                P                 V-Z

          Some letters, such as words starting with J, K, L, were 
     grouped together.  Some letters, such as C, P, S, were split 
     in two, the division always at the letter O.  (Nothing 
     special about the letter O; that was where these divisions 
     fell naturally.)  You can automatically recreate these files 
     by using EXTRACT.BAT.  It is convenient to keep all of them 
     in their own subdirectory.

          Note that S plus SO combined equals 130K, the largest 
     letter file.  This compresses to 41K in WORDS.DAT, well 
     within MAKE's 60K compression buffer.  So the letter S 
     doesn't need to be split to compile correctly.  But doubling 
     the size of a file means that the time required for a sort 
     goes up by a factor of four.  The sort delay becomes 
     noticeable with a 130K file.  The real reason for splitting 
     letters, though, was that large files are exhausting when 
     one is trying to work through the dictionary.

          Putting words into alphabetical order is best left to 
     computers.  I seldom manually insert words in their proper 
     order, but instead tack words on to the end of a file, and 
     then do a sort using Vernon D. Buerg's SORTF237 utility 
     [available at www.simtel.net] and a one-line SORT.BAT:

                    SORTF %1 %1.DIC /C /+1,23

     SORTF doesn't accept wild cards, but one can use the DOS FOR 
     command and, since this is a batch file, a CALL, to sort all 
     21 files:   for %f in (*) do call sort.bat %f

     Or else:    for %f in (*) do sortf %f %f.dic /c /+1,23

          Clean up the directory, delete unsorted files, and 
     rename:  REN *.DIC  *

          Duplicate words can cause problems.  The included CHECK 
     utility can be run on sorted files such as those above by 
     entering at the DOS prompt:

                         CHECK *

     Duplicates will be removed.  CHECK considers "Joseph" and 
     "joseph" to be duplicates.  Because I was interested in 
     Scrabble play, CHECK will retain the lowercase word.  There 
     should not be any duplicates, but it is a good idea to be 
     certain, and CHECK takes only a second or two to run.
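
          The duplicate rule can be sketched in a few lines of 
     Python.  The sketch assumes the input is already sorted so 
     that duplicates sit next to each other; the function name 
     remove_duplicates is invented, and this is not CHECK's 
     actual code.

```python
def remove_duplicates(sorted_words):
    kept = []
    for word in sorted_words:
        if kept and kept[-1].lower() == word.lower():
            if word.islower():
                kept[-1] = word    # retain the lowercase spelling
            continue               # drop the duplicate
        kept.append(word)
    return kept
```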

          After having sorted and checked the word lists, the 21 
     files can be compiled with:

                         MAKE *

     MAKE will automatically take the files in alphabetical order 
     and compile them into WORDS.DAT--don't destroy your old copy 
     of WORDS.DAT!  On a large RAM disk, away from my current copy 
     of WORDS.DAT, compiling the 118,000 word dictionary takes 
     only a second.

          You may want to keep your old copy of WORDS.DAT.  
     Running a spell check on its expanded words will create a 
     list of all the words which have been removed between 
     versions.  Similarly, by making the old WORDS.DAT active 
     (putting it into the current directory), and running a spell 
     check on the new dictionary, a list can be compiled of all 
     the latest additions.
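
          The same comparison can be pictured as two set 
     differences.  The Python sketch below works on expanded word 
     lists (such as the output of FIND *), ignores case as SPELL 
     does, and uses an invented function name.

```python
def version_changes(old_words, new_words):
    old = {word.lower() for word in old_words}
    new = {word.lower() for word in new_words}
    removed = sorted(old - new)   # in the old list only
    added = sorted(new - old)     # in the new list only
    return removed, added
```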



          AFTERWORD   ==========================================

          Anagram and crossword utility programs number in the 
     dozens, if not hundreds.  Yet years ago I was dissatisfied 
     with every one that I looked at.  Some had a word list or 
     dictionary that was riddled with errors (often traceable to a 
     single source).  Others were so all inclusive that they 
     overwhelmed with words far beyond any normal usage.  

          Going through the dictionary to compile a word list is 
     grueling work.  I have done it--three times--going page by 
     page through the American Heritage Dictionary.  While doing 
     so I eliminated racial epithets that the dictionary termed 
     offensive.  If I missed any I apologize.  Most obscene words 
     were discarded (they are also missing from spelling 
     checkers).  The resultant word list is not prudish, as any 
     inquisitive searcher will soon find, but such paring does 
     make WORDS.DAT more suitable for a game of family Scrabble.  

          I was surprised at the degree to which compiling a word 
     list is subjective.  WORDS.DAT is aimed at American usage, 
     so to what degree does one exclude words that are chiefly 
     British?  Anyone who reads a lot probably finds British 
     usage second nature.  And what about common variants?  The 
     dictionary can indicate whether a variant has nearly equal 
     status to the prime listing, or whether it is a clear second 
     best, but a word list is mute on preference.  You may find 
     that I list the root and only the root of a variant, while 
     the preferred word is taken through the dozens of forms of 
     its full grammatical range.  The aim is to improve one's 
     vocabulary with useful, legitimate words--NOT to win at 
     Scrabble by dredging up obscure or archaic spellings that 
     have no place in the real world.  

          Having said that, I should add that I am not above 
     using a Scrabble book, or even browsing through a twenty- 
     volume OED, and adding words that catch my fancy to 
     WORDS.DAT.  Probably 96% of WORDS.DAT is from the American 
     Heritage Dictionary (standard edition; the college edition 
     is too limited).  Exceptions are not necessarily bizarre or 
     uncommon words.  You will find words like "eyedrops," which 
     I located by going online to www.dictionary.com, then 
     clicking on the link to the Merriam-Webster Dictionary.  At 
     the same time I typed in "mudflap," but was told the closest 
     match was "mudflat."  I later found "mudflap" in the New 
     Oxford Dictionary of English (1998).  I am also partial to 
     the Oxford English Reference Dictionary.  WORDS.DAT may 
     have omissions and inconsistencies, but then so do most 
     dictionaries.  Also, words like "qi" are included.  Both 
     the _New York Times_ and _Discover_ magazine used the word 
     qi in discussing the life force which is brought into 
     harmony by Chinese acupuncture.  If it's good enough for 
     them . . .  You may also note that many seemingly proper 
     nouns are lowercase in WORDS.DAT.  Lowercase words are 
     permitted in Scrabble, proper nouns are not.  Thus "joseph," 
     as in the coat, and "venetian," as in blinds, are correctly 
     listed in lowercase.  On the other hand, if you are used 
     to playing "gouda," you are in for a surprise.  Gouda, Edam 
     and Muenster are all capitalized, and therefore forbidden 
     Scrabble plays.  Most cheeses, with several notable 
     exceptions (which I leave for you to find) are proper nouns      
     according to the American Heritage Dictionary, and that is 
     how they appear in WORDS.DAT.

          Lastly, but importantly, I have followed a convention of 
     my spelling checker.  Words like "label" may be parsed as 
     "labeled, labeling, labels" or "labelled, labelling, labels."  
     The single final letter is the preferred modern usage, and 
     that is the form which appears in WORDS.DAT.  This, and 
     related choices, means that the word list is ten thousand 
     words shorter than it might have been.  On the other hand, a 
     word such as "labelled" cannot actually be classed as an 
     error.  If you want to use WORDS.DAT as an arbiter for 
     Scrabble play, this is a point you may have to resolve before 
     a game begins.

          As a spelling dictionary, WORDS.DAT actually has too 
     many words.  Spell checkers want fewer words, not more.  A 
     word like "stelar"--related to stele--which is included, will 
     prevent one from finding a typo for "stellar," which would be 
     the desired word in almost every case.  Still, WORDS.DAT does 
     contain a core list of words in common with spelling checkers.



          APPENDIX:  Hyphens and Hobgoblins  

          For this second version of the dictionary about 2,800 
     words were removed, and 5,800 words were added, bringing the 
     115,000-word dictionary up to 118,000 words.  Many Scrabble- 
     type words, or archaic terms, were deleted, while entries 
     like hypercomputer and nanotechnology were added.  Thousands 
     of proper nouns were added to make WORDS.DAT suitable as a 
     spelling checker.  Many of these decisions were judgment 
     calls.

          To my surprise, I found hundreds of questionable words 
     during the revision process.  Except for several typos, all 
     of these could be traced to my commercial spelling checker.  
     I had relied on this while proofing a 70,000-word spell 
     checker for personal use, then used that 70,000-word list as 
     the basis for a later expanded dictionary.  Once questionable 
     words entered the list they tended to remain.  I discovered 
     that this commercial spell checker contains words like 
     "suspendible" [the word which means this is "suspensible"] or 
     "coiffed" [no dictionary has a double-F; the word is 
     "coifed"] or "cogniscible" [cognizable?  There is a 
     "cognoscible"].

          This may seem alarming, but the fact is one could use a 
     109,000-word list which contains 10,000 misspelled words and 
     never notice (it helps if one is a good speller).  Most 
     business users only need a 50,000-word spell checker.  For 
     creative writing I found 70,000 words to be more than 
     adequate.  Shakespeare wrote all of his plays, including all 
     of the names of people and places, using only 29,000 words.  
     (Shakespeare and James Joyce are considered to have unusually 
     large vocabularies.)  Nonetheless, I would prefer a master 
     list that is totally free of errors.  Since this effort is 
     the work of one person, that is probably impossible, but it 
     seems a suitable goal.

          A natural evolution of the English language is for 
     two-word descriptions first to become hyphenated, then later 
     to merge into a single word:  wild life, wild-life, wildlife.  
     Sixty years ago Eric Ambler could write "to-morrow."  Forty 
     years ago Ian Fleming could use "non-committally."  It is 
     possible that many of today's hyphens will look as quaint as 
     these in the future.  Even ten years ago John Le Carré, in 
     The Night Manager, was using "techno-babble," a word that now 
     always appears joined without the hyphen.

          How do we know whether a word takes a hyphen?  The 
     common answer would be to look in the dictionary, but this 
     is by no means foolproof.  Dictionaries, like critics, do 
     not lead, but record a phenomenon after it has occurred.  
     Language is determined by the people, both by thoughtful 
     writers and by everyday use.  Thus "absent-mindedly" appears 
     in the fourth edition of The American Heritage Dictionary, 
     and you could not be faulted for using hyphenation.  But my 
     reading tells me that "absentmindedly" has long been 
     acceptable as a single word (Le Carré uses it as such in the 
     above-quoted book).  More and more I have relied on the web 
     site www.dictionary.com, which performs a search through the 
     fourth edition of The American Heritage Dictionary (my 
     preferred source) and through an unabridged Webster's 
     Dictionary (less attractive, since once one gets into 
     unabridged dictionaries the word base is overwhelming), and 
     through WordNet from Princeton.  WordNet is rather lax, so I 
     am often uneasy relying on it, but it does support my 
     decision to include "absentmindedly" in WORDS.DAT.  
     Similarly, WordNet lists "closeup" as acceptable, but the 
     words "face-up, write-up, sit-up, tune-up" require 
     hyphenation [yes, facedown is one word, but face-up is 
     hyphenated].  TV Guide, which surely represents the common 
     language, uses "Editor's Close-up," so perhaps here the 
     hyphen is still the safer bet.  Many words absolutely 
     require hyphenation at this time:  "after-shave, cold-
     bloodedly, point-blank, anti-Semitic, mind-boggling."  So if 
     you use "aftershave" or "antisemitism," and wonder why 
     WORDS.DAT doesn't contain such entries, you might consider 
     that the words may be hyphenated.  But always bear in mind 
     my opening comments about current hyphens looking terribly 
     quaint in the future, and that that future may be a lot 
     nearer than we think.
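
          If a joined word is missing, one way to check for a 
     hyphenated entry is to try a hyphen at each position.  The 
     Python sketch below is an assumption of mine for 
     illustration only; FIND and SPELL do not necessarily work 
     this way, and the names hyphen_variants, lookup and SAMPLE 
     are hypothetical.

```python
def hyphen_variants(word):
    """Yield every way of inserting one hyphen inside the word."""
    for i in range(1, len(word)):
        yield word[:i] + "-" + word[i:]

def lookup(word, dictionary):
    """Return the matching entry, trying hyphenated forms second."""
    if word in dictionary:
        return word
    for variant in hyphen_variants(word):
        if variant in dictionary:
            return variant
    return None  # no joined or single-hyphen form is listed

# Hypothetical sample entries, not the real WORDS.DAT contents.
SAMPLE = {"after-shave", "point-blank", "mind-boggling", "wildlife"}

print(lookup("aftershave", SAMPLE))  # after-shave
print(lookup("wildlife", SAMPLE))    # wildlife
```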

          Only eight years separated the third edition of the 
     American Heritage Dictionary (my original source) from the 
     fourth edition.  The third edition listed "bingo, bingoes."  
     After some research I saw no way out but to include "bingos" 
     in WORDS.DAT.  Now the fourth edition bears me out, listing 
     the plural as "bingos."  I included "catsuit" as it entered 
     the language [Jeri Ryan, Seven of Nine, could talk about 
     wearing her catsuit].  Now, of course, the fourth edition 
     lists this entry, omitted from the third.  Any word list 
     which is not constantly monitored becomes obsolete.  I am 
     always open to suggestions concerning words that should be 
     deleted or added to WORDS.DAT.
