Document file format

From Wikipedia, the free encyclopedia

A document file format is a text or binary file format for storing documents on a storage medium, especially for use by computers. A multitude of incompatible document file formats currently exists. The file format currently used by Microsoft Word (.doc) is arguably the most widespread de facto standard.

A rough consensus has emerged that XML will be the basis for future document file formats. Open XML-based standards include DocBook and, more recently, OpenDocument and Office Open XML. OpenDocument became an ISO standard on May 3, 2006. Microsoft, however, will use the Office Open XML document format, which has been standardized through Ecma International and is currently undergoing standardization in ISO.
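As a rough illustration of what an XML basis makes possible, the sketch below reads a tiny DocBook-style fragment with Python's standard XML library and extracts its title and paragraph text. The sample fragment and the helper name extract_text are invented for illustration; real DocBook documents are far richer and usually declare a DTD or namespace, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified DocBook-style fragment (no DTD, no namespace).
SAMPLE = """<article>
  <title>Document file formats</title>
  <para>XML-based formats can be read with generic XML tools.</para>
  <para>No format-specific binary parser is required.</para>
</article>"""


def extract_text(xml_source):
    """Return (title, list of paragraph texts) from a DocBook-style string."""
    root = ET.fromstring(xml_source)
    # findtext looks for a direct child named "title" of the root element.
    title = root.findtext("title", default="")
    # iter walks the whole tree, so nested <para> elements are found as well.
    paragraphs = [para.text or "" for para in root.iter("para")]
    return title, paragraphs


title, paragraphs = extract_text(SAMPLE)
print(title)
for text in paragraphs:
    print("-", text)
```

The same generic parser would work on any well-formed XML document; only the element names being searched for are specific to the format.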

In 1993, the ITU-T tried to establish a standard for document file formats, known as the Open Document Architecture (ODA), which was intended to replace all competing document file formats. It is described in ITU-T documents T.411 through T.421, which are equivalent to ISO 8613. The effort did not succeed.

Page description languages such as PostScript and PDF have become the de facto standard for documents that a typical user should only be able to read, not edit.

Common document file formats

  • AmigaGuide
  • DNL (DNAML's page-turning format)
  • CHM (Microsoft's help format)
  • .doc for Microsoft Word (Format revised and altered in new software versions; structural binary format identical since Word 97; specifications available from Microsoft upon request)
  • DocBook (an XML format for technical documentation)
  • HLP
  • HTML (.html, .htm), possibly in combination with the image files it refers to; Internet Explorer can also combine these into a single MHT file that represents a web page
  • Office Open XML (Ecma's open, XML-based standard for office documents)
  • OpenDocument (open, XML-based standard for office documents)
  • PalmDoc (de facto document standard for handheld devices)
  • Plucker (widely used navigable document format for handheld devices)
  • PDF (many people can read PDF files, since the viewer is free; fewer can create and edit them)
  • RTF (a textual encoding of the data in a Word DOC; many programs' Word export filters actually write RTF as RTF is much easier to generate reliably)
  • SYmbolic LinK (SYLK)
  • TEI (an XML format for digital publication)
  • TeX
  • Troff
  • TXT (plain text)
  • WordPerfect (.wpd; older versions also used .doc, which can be confused with the Word format extension)
  • XML
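
Because extensions such as .doc are shared between unrelated formats, software often identifies a document by its leading bytes instead. The following is a minimal sketch of that idea, assuming a small hand-picked table of well-known signatures; the names SIGNATURES and sniff_format are hypothetical, the table is far from complete, and container signatures (OLE2, ZIP) only narrow the candidates rather than identify one format.

```python
# Leading-byte signatures for a few of the formats listed above.
# The table is illustrative, not exhaustive.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"%!PS", "PostScript"),
    (b"{\\rtf", "RTF"),
    (b"\xffWPC", "WordPerfect"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",
     "OLE2 compound file (e.g. Word 97 and later .doc)"),
    (b"PK\x03\x04",
     "ZIP container (e.g. OpenDocument, Office Open XML)"),
    (b"<?xml", "XML"),
]


def sniff_format(head):
    """Guess a format name from the first bytes of a file."""
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return "unknown"


def sniff_file(path):
    """Read only the first few bytes of a file and guess its format."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(16))


print(sniff_format(b"%PDF-1.4\n"))
print(sniff_format(b"{\\rtf1\\ansi"))
print(sniff_format(b"plain text has no signature"))
```

Plain text, HTML and TeX sources carry no fixed signature, so a real detector would fall back on the extension or on content heuristics for them.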
