\  PARATAG.S
\  This table tags the end of each paragraph in uncoded ASCII text.
\  It is a good idea to run COMPRESS.S on your file before running this
\  table on it. That makes it easier for this table to identify
\  paragraphs.

\  Throughout these equations, flag 0 is turned ON if the type of line ending
\  has not been resolved, and turned OFF if it has. This allows other
\  equations to do further checking as to what kind of line ending should be
\  attached.

\ *************************************************************************

\  The first two equations handle the case where a line is no more than 40
\  characters long. We want to tag such lines with an end of paragraph.

\v\^*(39)\v\0d\0a*00=\p0\^1<EP>\0d\0a*00
\v\^*(39)\v\0d\0a*01=<EP>\0d\0a\p0\^1<EP>\0d\0a*00

\  In the above two equations, the first character is a printable wild card
\  code because search equations can't begin with a variable-length string.
\  Technically, then, these equations will match only lines that have at
\  least 1 printable character in them (a space, a letter, a number, etc.).
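
\  As an illustration only (a made-up input line, with [CRLF] standing for
\  the \0d\0a pair), when flag 0 is OFF the 11-character line
\      Chapter One[CRLF]
\  should leave the first equation as
\      Chapter One<EP>[CRLF]
\  When flag 0 is ON, meaning the line before it is still unresolved, the
\  second equation should close that earlier line first, giving
\      <EP>[CRLF]Chapter One<EP>[CRLF]
\  In both cases flag 0 is OFF afterwards.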

\ *************************************************************************

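\  The next two equations handle lines that start with 40 printable
\  characters and run up to 80 characters long. Such lines are left untagged
\  and flag 0 is turned ON to indicate that the line ending has not been
\  resolved. Note that the replacement side leaves out the CRLF; the
\  equation that later resolves the line supplies either a word space or an
\  <EP> plus CRLF. If an unresolved line is followed by another line of this
\  kind, the preceding line is resolved with an end of paragraph.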

*(40)\v\^*(40)\v\0d\0a*00=*(40)\p0\^1*01
*(40)\v\^*(40)\v\0d\0a*01=<EP>\0d\0a*(40)\p0\^1*01

\  Notice in the above two equations that we check for exactly 40 printable
\  characters before we check for up to 40 more. This is because an equation
\  that reads \^*(80)\v would match the same lines as \^*(40)\v for lines
\  under 40 characters. Why? Because the variable-length code \^ means "UP TO
\  (xx) characters", and a 30-character line is under 80 characters just as
\  it is under 40 characters, so the two are indistinguishable for short
\  lines.

\ *************************************************************************

\  The next equations handle the case where a long line is followed by a lower
\  case letter. Such lines are continued as part of the same paragraph, unless
\  a period or closing parenthesis follows the letter, in which case it is
\  assumed to be the beginning of an outline-style point such as a) or a.

\y*01= \p0*00     \ an unresolved line followed by a lower case letter
\0d\0a\y= \p2     \ a carriage return followed by a lower case letter

\y. *01=<EP>\0d\0a\p0\p1\p2*00    \ unless it looks like an outline
\0d\0a\y. =<EP>\0d\0a\p2\p3\p4
\y) *01=<EP>\0d\0a\p0\p1\p2*00
\0d\0a\y) =<EP>\0d\0a\p2\p3\p4

\ *************************************************************************

\  The following equations handle lines that begin with an upper case or
\  numbered outline point. Such lines are new paragraphs, so the preceding
\  line must be tagged.

\n. *01=<EP>\0d\0a\p0\p1\p2*00
\u. *01=<EP>\0d\0a\p0\p1\p2*00
\0d\0a\n. =<EP>\0d\0a\p2\p3\p4
\0d\0a\u. =<EP>\0d\0a\p2\p3\p4

\n) *01=<EP>\0d\0a\p0\p1\p2*00
\u) *01=<EP>\0d\0a\p0\p1\p2*00
\0d\0a\n) =<EP>\0d\0a\p2\p3\p4
\0d\0a\u) =<EP>\0d\0a\p2\p3\p4

\  A run of capital characters followed by a colon is a new paragraph.

\u\^*(20)\x: *01=<EP>\0d\0a\p0\^1: *00

\ *************************************************************************

\  These equations tag an unresolved line followed by any printable character
\  (except a lower case letter, which is handled separately above) with an end
\  of paragraph. You can change the meaning of the equations by removing the
\  <EP> codes and replacing them with a word space. This makes all unresolved
\  lines default to a continuation.

\v*01=<EP>\0d\0a\p0*00
\u*01=<EP>\0d\0a\p0*00     \ *** CHANGE THESE TWO IF YOU ARE GETTING
\n*01=<EP>\0d\0a\p0*00     \     TOO MANY MISMARKED PARAGRAPH ENDINGS ***

\ *************************************************************************

\  And finally, a CRLF by itself is always a paragraph end.

\0d\0a=<EP>\0d\0a*00

