poi/src/contrib/src/org/apache/poi/contrib/poibrowser/package.html

<!doctype html public "-//W3C//DTD HTML 4.0//EN">
<!-- Copyright (C) 2004 The Apache Software Foundation. All rights reserved. -->
<html>
<head>
<title></title>
</head>
<body>
<div>
<p>The <strong>POI Browser</strong> is a very simple Swing GUI tool that
displays the internal structure of a Microsoft Office file. It concentrates
on streams in the <em>Horrible Property Set Format (HPSF)</em>. In order to
access these streams the POI Browser uses the package
<tt>org.apache.poi.hpsf</tt>.</p>
<p>A file in Microsoft's Office format can be seen as a filesystem within a
file. For example, a Word document like <var>sample.doc</var> is just a
simple file from the operating system's point of view. However, internally
it is organized into various directories and files. For example,
<var>sample.doc</var> might consist of the three internal files (or
"streams", as Microsoft calls them) <tt>\001CompObj</tt>,
<tt>\005SummaryInformation</tt>, and <tt>WordDocument</tt>. (In these names
\001 and \005 denote the unprintable characters with the character codes 1
and 5, respectively.) A more complicated Word file typically contains a
directory named <tt>ObjectPool</tt> with more directories and files nested
within it.</p>
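<p>Because stream names such as <tt>\001CompObj</tt> begin with an
unprintable character, a tool has to escape them before showing them to the
user. The following is a minimal sketch of one way to do that. The class
<tt>StreamNames</tt> and its method <tt>printable</tt> are hypothetical names
chosen for illustration and are not part of POI; the sketch also assumes that
every character with a code below 32 should be rendered as a backslash
followed by three octal digits.</p>

```java
// Hypothetical helper for illustration only; not part of POI.
final class StreamNames {

    private StreamNames() {
    }

    /**
     * Returns the name with every control character (code below 32)
     * replaced by a backslash and its three-digit octal code.
     */
    static String printable(String rawName) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < rawName.length(); i++) {
            char c = rawName.charAt(i);
            if (c < 32) {
                sb.append('\\').append(String.format("%03o", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\001" and "\005" are Java octal escapes for character codes 1 and 5.
        System.out.println(printable("\001CompObj"));
        System.out.println(printable("\005SummaryInformation"));
        System.out.println(printable("WordDocument"));
    }
}
```

<p>Names without control characters, such as <tt>WordDocument</tt>, pass
through this sketch unchanged.</p>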
<p>The POI Browser makes these internal structures visible. It takes one or
more Microsoft Office files as input on the command line and shows their
directories and files in a tree-like structure. At the top level, the POI
Browser displays the
(operating system) filenames. An internal file (i.e. a "stream" or a
"document") is shown with its name, its size and a hexadecimal dump of its
first bytes.</p>
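<p>The kind of one-line summary described above (name, size, and a
hexadecimal dump of the first bytes) could be produced roughly as follows.
This is only a sketch: the class <tt>HexPreview</tt>, its methods, and the
output layout are assumptions made for illustration and do not reproduce the
POI Browser's actual code or display format.</p>

```java
// Hypothetical helper for illustration only; not part of POI.
final class HexPreview {

    private HexPreview() {
    }

    /** Formats at most 'limit' leading bytes as two-digit hex values separated by spaces. */
    static String firstBytes(byte[] data, int limit) {
        int n = Math.min(limit, data.length);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so that negative bytes print as 80..FF.
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    /** Builds a summary line: stream name, total size, and a dump of the first bytes. */
    static String describe(String name, byte[] data, int limit) {
        return name + " (" + data.length + " bytes): " + firstBytes(data, limit);
    }

    public static void main(String[] args) {
        // Arbitrary sample bytes, not taken from a real document.
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, 0x00, 0x10};
        System.out.println(describe("WordDocument", sample, 2));
    }
}
```

<p>Limiting the dump to a fixed number of bytes keeps each tree entry short
even when the stream itself is large.</p>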
<p>The POI Browser pays special attention to property set streams. For
example, the <tt>\005SummaryInformation</tt> stream contains information
like title and author of the document. The POI Browser opens every stream
in a POI filesystem. If it encounters a property set stream, it does not
just display its first bytes but analyses the whole stream and displays its
contents in a more or less readable manner.</p>
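<p>To treat property set streams specially, a tool first has to recognise
them. The sketch below shows a simplified check based on the first four bytes
of a stream. It assumes that a property set stream starts with the 16-bit
byte-order mark 0xFFFE followed by a 16-bit format version of 0 or 1, both
stored in little-endian order. The class <tt>PropertySetSniffer</tt> is a
hypothetical name and not part of POI, and a full implementation such as the
one in <tt>org.apache.poi.hpsf</tt> would presumably validate more of the
header than this.</p>

```java
// Hypothetical helper for illustration only; not part of POI.
final class PropertySetSniffer {

    private PropertySetSniffer() {
    }

    /** Reads an unsigned 16-bit little-endian value starting at 'offset'. */
    private static int readUShortLE(byte[] data, int offset) {
        return (data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8);
    }

    /**
     * Simplified check (an assumption of this sketch): the stream begins
     * with byte order 0xFFFE and a format version of 0 or 1.
     */
    static boolean looksLikePropertySet(byte[] head) {
        if (head.length < 4) {
            return false;
        }
        int byteOrder = readUShortLE(head, 0);
        int version = readUShortLE(head, 2);
        return byteOrder == 0xFFFE && (version == 0 || version == 1);
    }

    public static void main(String[] args) {
        byte[] candidate = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x00};
        System.out.println(looksLikePropertySet(candidate));
    }
}
```

<p>A stream that passes such a check would then be handed to a full parser,
while any other stream would only get the short hexadecimal preview.</p>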
</div>
</body>
</html>
<!-- Keep this comment at the end of the file
Local variables:
sgml-default-dtd-file:"HTML_4.0_Strict.ced"
mode: html
sgml-omittag:t
sgml-shorttag:nil
sgml-namecase-general:t
sgml-general-insert-case:lower
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
-->