Guy Macon's Documentation Standards
Revision 1.02
Guy Macon
02 April 2003 <-- A bunch has changed since 2003; I really need to rewrite this from scratch.


The purpose of this document is to define the data formats to be used in
all engineering projects managed by Guy Macon.


In most cases this document is advisory only.  This document is mandatory
only in cases where an applicable Requirements document makes conformance
to this document a requirement.


Engineering documentation should be usable for a minimum of twenty years
so that we can support older equipment.  In the case of patent litigation,
sufficient engineering documentation to prove prior art may be required
for as long as the patent holder exists.  This requirement cannot be met
if we store our documents in data formats that are proprietary to a
specific vendor, have secret file structures, or are nonstandard in any
way.  Our documents must survive even if our favorite software vendors go
out of business.  (Care to try to edit a document stored in Wang,
Electric Pencil, WordStar, WordPerfect for DOS, or Lotus 1-2-3 formats?
All of them were market dominators in their day...)  This document was
inspired by my experience with a company that has over 30,000 carefully
worded and formatted documents in WordPerfect 5.1/DOS format.

Bad Formats:

The following formats are known to be proprietary, nonstandard, rare, too
new, or otherwise undesirable:

Graphics Interchange Format (.gif):

GIF files are covered by a Unisys patent.  After years of allowing free
use, Unisys/CompuServe started charging GIF users.  This should be a
lesson for anyone who chooses a proprietary format as a "standard".

Microsoft Word (.doc), Excel (.xls), Write (.wri), Rich Text (.rtf),
Bitmap (.bmp), and other Microsoft proprietary formats.

Microsoft has a history of making new versions of their software that are
not backward and forward compatible with documents saved with older
versions of the same software.  Converters are available for many (but not
all) older formats, but the conversions are one way.  If John Doe using
Microsoft Write (.wri) under Windows 3.11 sends a document to Jane Doe who
uses Windows NT, Jane cannot edit the document.  Instead, a conversion
takes place and the document is converted into WordPad format.  WordPad
cannot save in Write format.  This means that Jane cannot make a
correction to John's document without making the document unreadable on
John's computer.  These situations can be triggered by an upgrade to an
application or even by applying the latest Service Pack.  Word, Excel,
Access, etc., all lack forward/backward compatibility with older versions
of the same format.  In many cases there was no apparent reason for
changing the file format except as a method of forcing users to upgrade.
Conversion programs are purposely made to be imperfect and formatting
is often changed during conversion.

Non-standard HTML, also known as "Tag Soup"

Certain browser vendors are motivated to fill the Internet with documents that
are best viewed with their software.  Long-term archivability is not a
priority - in fact they have a business model that profits from forcing document
authors to follow shifting standards.  See the section on XHTML below.

Extensible Markup Language (.xml) WWW Consortium Recommendation.

XML is a fine format, but for the purposes of engineering documentation the
XHTML subset will usually suffice.  If XML is to be used, a document that
defines the structure and which contains an archiving plan is required.

Standard Generalized Markup Language (.sgml).

SGML is another fine format, but it can be used to create a wide variety
of incompatible file formats.  For our purposes the HTML subset of XML will
usually suffice.

Adobe Acrobat (.pdf) format

PDF lacks forward and backward compatibility - you need the latest Acrobat
Reader to read the latest .pdf files.  In many cases the latest version of
Acrobat Reader will not run on older hardware.  A .pdf document created with
the Chinese version of Acrobat cannot be opened with the English version of
Acrobat.  Alas, there is no current standard that will work as a direct
replacement for the Adobe proprietary PDF format.  Suggestions welcome.

Joint Photographic Experts Group, all implementations except JFIF

Strictly speaking, JPEG refers only to a family of compression algorithms;
it does *not* refer to a specific image file format.  The JPEG committee
was prevented from defining a file format by turf wars within the
international standards organizations.  In the absence of official
standards, a number of JPEG program writers have written incompatible
implementations and as a result their programs aren't 100% compatible
with anyone else's.  The closest thing we have to a standard JPEG format
is JFIF (JPEG File Interchange Format).

TIFF (.tif)

The TIFF 6.0 spec for incorporating JPEG is not widely implemented, partly
because it has some serious design flaws.  NextStep systems are the only
ones making any significant use of TIFF 6.0 style TIFF/JPEG.  TIFF is far
more complex than JFIF, and is generally less transportable because
different vendors often implement slightly different, non-overlapping
subsets of TIFF.

Any document that contains macros, scripts, Java objects, or any other
executable objects.

Macros, scripts, etc.  mean that you must not only read the file, but also
execute some code.  This decreases the chances that the document will be
usable years from now.  In addition, macros, scripts, etc.  are a security
problem, and may contain worms, viruses, or data bombs.  Any document that
has executable content must display all content when displayed on a system
that lacks the ability to execute the executable content.

Any document that loses information when displayed, printed, or copied on
a monochrome device.

Meeting this requirement can be as simple as putting a small print "RED"
in the corner of a red colored box.  Failure to make documents monochrome
friendly causes problems for users who lack color copy machines, and is a
violation of the Americans with Disabilities Act (ADA) as it applies to
the colorblind and visually impaired.

Good Formats:

The following formats are known to be standard, and are acceptable for
use in engineering documentation.

ASCII (American Standard Code for Information Interchange) as defined in
ANSI_X3.4-1968, ANSI_X3.110-1983, ISO-IR-99, CSA_T500-1983 and ISO 8859-1.
This is the preferred format for purely textual information, and is the
most universal of all standards.  Use it whenever possible.

5.2.0 ASCII Rules:

The subset of ASCII to be used uses only the following characters.  NULL
and TAB characters are interpreted differently on different systems and
shall not be used.

Table 1: Allowable ASCII characters with ANSI names

Char  Hex Dec  Name                  Char  Hex Dec  Name
      0A  010  LINE FEED (LF)               0D  013  CARRIAGE RETURN (CR)
[ ]   20  032  SPACE                  [!]   21  033  EXCLAMATION MARK
["]   22  034  QUOTATION MARK         [#]   23  035  OCTOTHORPE
[$]   24  036  DOLLAR SIGN            [%]   25  037  PERCENT SIGN
[&]   26  038  AMPERSAND              [']   27  039  APOSTROPHE
[(]   28  040  LEFT PARENTHESIS       [)]   29  041  RIGHT PARENTHESIS
[*]   2A  042  ASTERISK               [+]   2B  043  PLUS SIGN
[,]   2C  044  COMMA                  [-]   2D  045  HYPHEN-MINUS
[.]   2E  046  PERIOD                 [/]   2F  047  FORWARD SLASH
[0-9] 30-39 048-057 NUMBERS           [:]   3A  058  COLON
[;]   3B  059  SEMICOLON              [<]   3C  060  LESS-THAN SIGN
[=]   3D  061  EQUALS SIGN            [>]   3E  062  GREATER-THAN SIGN
[?]   3F  063  QUESTION MARK          [@]   40  064  COMMERCIAL AT
[A-Z] 41-5A 065-090 CAPITAL LETTERS   [[]   5B  091  LEFT SQUARE BRACKET
[\]   5C  092  REVERSE SLASH          []]   5D  093  RIGHT SQUARE BRACKET
[^]   5E  094  CIRCUMFLEX ACCENT      [_]   5F  095  LOW LINE
[`]   60  096  GRAVE ACCENT           [a-z] 61-7A 097-122 LOWER CASE LETTERS
[{]   7B  123  LEFT CURLY BRACKET     [|]   7C  124  VERTICAL LINE
[}]   7D  125  RIGHT CURLY BRACKET    [~]   7E  126  TILDE

Please note that ASCII does not have characters such as "Underscore"
(underscore is an attribute like Bold or Italic.  It resembles Low
Line when applied to a Space character) or "Dash" (en-dash is as wide
as the letter "n", em-dash is as wide as the letter "M" - concepts that
only apply to proportional fonts.  ASCII is a character set, not a Font).
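
A minimal sketch of a conformance check for Table 1 follows.  It is
not part of the standard; the names ALLOWED and find_disallowed are
made up for illustration.  It assumes the file is read as raw bytes
and treats LF, CR, and the range 20 through 7E hex as the whole
allowed set, which is what the table lists.

```python
# Allowed byte values per Table 1: LF (0A), CR (0D), and 20 through 7E hex.
ALLOWED = frozenset([0x0A, 0x0D]) | frozenset(range(0x20, 0x7F))


def find_disallowed(data: bytes):
    """Return (offset, byte value) pairs for every byte outside Table 1."""
    return [(i, b) for i, b in enumerate(data) if b not in ALLOWED]
```

An empty result means the file uses only the allowed subset.  TAB (09)
and NULL (00) are reported along with anything above 7E.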

In released documents each line shall be less than 78 printable characters
long (75 is preferred), left justified, and without hyphenation.  Please
note that modern text editors such as UltraEdit will convert this format
to and from long line format (no line feed or carriage return until end of
paragraph) for more convenient editing, and that the long line format is
preferred for unreleased drafts.
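
For those without such an editor, the conversion from long line format
to released format can be sketched with the Python standard library.
This is an illustration only; the function name to_released_format is
an assumption, and it handles one paragraph at a time.

```python
import textwrap


def to_released_format(paragraph: str, width: int = 75) -> str:
    """Re-wrap one long-line paragraph to at most `width` characters."""
    return textwrap.fill(
        paragraph,
        width=width,
        break_long_words=False,   # never split a word across lines
        break_on_hyphens=False,   # never break at an existing hyphen
    )
```

Because words are never split, a single word longer than the width
(a long URL, for example) will still produce an overlong line and
needs to be handled by hand.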

Each line shall be terminated with one of the following sequences,
in descending order of preference:

Carriage Return followed by Line Feed (DOS/Windows, 0D 0A)
Line Feed alone (UNIX/Linux, 0A)
Carriage Return alone (Macintosh, 0D)

The preferences are based on the fact that UNIX/Linux and Macintosh
software is generally better at reading the other systems' line endings
than DOS/Windows software is.

Software developers are encouraged to write code that outputs lines with
Carriage Return followed by Line Feed, and that accepts input with any of
the above line termination sequences.
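
One way to follow this advice is sketched below.  The function name
to_crlf is hypothetical, and the sketch assumes the whole text fits in
memory as a string.

```python
import re

# CR LF is tried first so a DOS/Windows pair is not counted as two line ends.
_LINE_END = re.compile(r"\r\n|\r|\n")


def to_crlf(text: str) -> str:
    """Accept CR LF, LF, or CR line ends and emit CR LF throughout."""
    return _LINE_END.sub("\r\n", text)
```

When reading or writing files in Python, open them with newline="" so
that the runtime does not translate line ends behind your back.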

If possible, ASCII documents should be printed with a non-proportional
font (10, 11, or 12 point Courier, Courier New, Andale/monotype.com,
or OCR-B preferred) without bold, italic, or underline attributes.

Extensible HyperText Markup Language, also known as XHTML (.htm .html).
(See [ http://www.w3.org ] for details.) XHTML is a cleaned up version
of HTML.  For engineering documents published on the Internet or on
local networks that are meant to be accessed with web browsers, good
XHTML is ideal.  For an example of a web site written in XHTML, see
[ http://www.GuyMacon.com ].
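
Because XHTML is XML, a first sanity check is whether the document is
well formed.  The sketch below uses the XML parser in the Python
standard library; the function name is_well_formed is an assumption.
Note the limits: well formed is not the same as valid against the
XHTML DTD, and this parser rejects named entities such as &nbsp;
unless they are declared, so treat it as a rough filter and use a
real validator for release.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if `markup` parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True
```

Tag soup such as an unclosed <br> or overlapping tags fails this
check, while properly nested and closed markup passes.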

HyperText Markup Language, or HTML:

The approved version of HTML is the version defined by the W3C
Network Working Group.  HTML is an application of ISO Standard
8879:1986 Information Processing Text and Office Systems; Standard
Generalized Markup Language (SGML).

Nonstandard extensions and implementations such as the ones used in
Netscape Navigator and Microsoft Internet Explorer are not allowed
in engineering documents.

Independent JPEG Group JFIF (commonly misnamed "JPEG") as defined in
Annex B of ISO DIS 10918-1.

JPEG is not a file format.  JPEG is a compression scheme.  JPEG File
Interchange Format (JFIF) is a file format which allows JPEG images
to be exchanged between a wide variety of platforms and applications.
Nearly all files on the Internet that are called "JPEG" are really JFIF
files.
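
A quick way to tell whether a file called "JPEG" is JFIF is to look
at its first bytes.  A JFIF file begins with the start-of-image
marker (FF D8), then an APP0 marker (FF E0), two length bytes, and
the identifier "JFIF" followed by a zero byte.  The sketch below
checks only that header; the function name looks_like_jfif is made
up, and passing the check does not prove the rest of the file is
sound.

```python
def looks_like_jfif(data: bytes) -> bool:
    """Check for SOI, an APP0 marker, and the "JFIF" identifier."""
    return (
        len(data) >= 11
        and data[0:2] == b"\xff\xd8"     # SOI (start of image)
        and data[2:4] == b"\xff\xe0"     # APP0 marker
        and data[6:11] == b"JFIF\x00"    # identifier, after 2 length bytes
    )
```

Only the first 11 bytes need to be read from the file to run it.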

Computer Graphics Metafile (.cgm) as defined in ANSI X3.122-1986 and
ISO 8632:1992 Parts 1-4, Version 1 CGM only.

Computer Graphics Metafile can handle vector graphics and images.  It
stores pictures in a way which is independent of any particular software,
computer or graphics device.  CGM offers the first standard method of
storing a vector-based image type.  Vector images allow for greater detail
and clarity at multiple zoom levels.  Also, they are usually much more
compact than the equivalent bitmap.  CGM is used by a number of other
standards including the Office Document Architecture (ODA) standard, ATA
within the commercial aviation industry, J2008 within the automotive
industry and the CALS (Computer-aided Acquisition and Logistics Support)
US Dept of Defense specification.

PNG (Portable Network Graphics) Specification

PNG is defined in W3C Recommendation RFC-2083 and is on a standards track
under the purview of ISO/IEC JTC 1 SC 24 and is expected to be released
eventually as ISO/IEC International Standard 15948.  It is the intent of
the standards bodies to maintain backward compatibility with the current
PNG specification.  PNG seems to be better designed than most other
formats - if and when it becomes a standard, I will likely make it my
preferred graphics format.  For now, it is advised to archive the image
file in multiple formats.

Revision Control:

All engineering documents shall be archived on a network file server
in such a way that no single or double failure will make the document
permanently unavailable.  The electronic file on the file server shall
be considered to be the current version.  All paper copies are to be
considered to be out of date and obsolete at the moment of printing.
The best that you can say about the paper copy is that it was the latest
version several seconds ago.

All printed copies shall contain automatically computer generated time
stamps.  Any changes to a document will automatically result in a new
time stamp.  Updating of revision numbers is at the discretion of the
author - readers shall consider the time stamp authoritative and the
revision level to be advisory only.
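
One possible way to generate such a time stamp is to take it from the
file's last-modified time, so that it changes whenever the file does.
The sketch below assumes that the file server keeps reliable
modification times; the function name time_stamp_line and the layout
of the line are illustrative only.

```python
import os
from datetime import datetime, timezone


def time_stamp_line(path: str) -> str:
    """Build a footer line from the file's last-modified time, in UTC."""
    modified = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    name = os.path.basename(path)
    return f"{name}  last modified {modified:%Y-%m-%d %H:%M:%S} UTC"
```

The line would be added to the page header or footer by whatever
program does the printing, never typed in by hand.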

Working with nonconforming file formats:

It is inevitable that some engineers will use applications that have
nonstandard file formats.  There are too many useful tools that are
proprietary.  When nonstandard file formats are used, the engineer shall
do the following before releasing the document:

Save the file in as many formats as possible. Use other software to make
conversions that are not on the exportable format list.

Give serious consideration to creating a standard document that shows the
same information.

Store the document on as many media as possible.  An ASCII file on
an 8 inch disk formatted for Kaypro CP/M is hard to read on a PC
with CD-ROM and 3.5" floppy.  Be sure to store copies on as many
servers as possible, as server data tends to be backed up and to
get transferred as new computers and media formats are added to the

Make nine hard copies on acid free paper (vellum or Mylar preferred) with
archive quality ink and store three copies in each of three different
locations.  Make microfilm copies if you can.
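
The electronic half of the steps above, storing copies in several
places, can be sketched as follows.  This is an illustration under
assumptions: the destinations are directories that are already
reachable (mounted server shares, for example), the function names
sha256_of and replicate are made up, and files are small enough to
read into memory.  Each copy is checked against the original by
digest so that a bad copy is noticed now rather than in twenty years.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Iterable, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def replicate(source: Path, destinations: Iterable[Path]) -> List[Path]:
    """Copy `source` into each directory and verify every copy by digest."""
    expected = sha256_of(source)
    copies = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        copy = Path(shutil.copy2(source, directory))
        if sha256_of(copy) != expected:
            raise IOError(f"copy at {copy} does not match {source}")
        copies.append(copy)
    return copies
```

Recording the digest alongside each copy also gives a later reader a
way to confirm that the file has not been damaged in storage.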

8.0 Copyright:

Copyright (c) 2000 by Guy Macon.  All rights reserved.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1
or any later version published by the Free Software Foundation.

It is the author's intent to grant permission to use this document for
any purpose, as long as proper credit is given on any copy that is posted
to the Internet, and to give permission for corporate users to use this
or modified versions of this without credit as if it was their own work.
I am no lawyer, so this is just an expression of my intent - the
actual legal copyright is defined by the GNU Free Documentation License,
not this paragraph.  If you make changes, I would appreciate it if you
emailed me a copy - I might want to add your changes to my copy.
My email address may change, but you can always find it at
[ http://www.GuyMacon.com ].

9.0 Things I haven't decided on yet - discussion welcome!

PostScript and Encapsulated PostScript.

TeX and LaTeX

A method of enciphering data that is likely to be available 20
years from now, plus a way of not losing or revealing the
pass phrase over the 20 year period.