Make the build work (someone forgot to run ./build.sh clean docs before they committed invalid xml)
PR:
Obtained from:
Submitted by:
Reviewed by:
git-svn-id: https://svn.apache.org/repos/asf/jakarta/poi/trunk@352866 13f79535-47bb-0310-9956-ffa450edef68
parent 7ee81420f4
commit 3775c3e5bf
@@ -1,5 +1,5 @@
|
|||||||
<?xml version="1.0" encoding="UTF-8"?>
|
<?xml version="1.0" encoding="UTF-8"?>
|
||||||
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "../dtd/document-v10.dtd">
|
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd">
|
||||||
|
|
||||||
<document>
|
<document>
|
||||||
<header>
|
<header>
|
||||||
@@ -11,7 +11,7 @@
|
|||||||
</header>
|
</header>
|
||||||
|
|
||||||
<body>
|
<body>
|
||||||
<s1 title="The Word 97 File Format in semi-plain English">
|
<section title="The Word 97 File Format in semi-plain English">
|
||||||
|
|
||||||
<p>The purpose of this document is to give a brief high level overview of the
|
<p>The purpose of this document is to give a brief high level overview of the
|
||||||
HDF document format. This document does not go into in-depth technical
|
HDF document format. This document does not go into in-depth technical
|
||||||
@@ -20,19 +20,19 @@
|
|||||||
<p>The OLE file format is not discussed in this document. It is assumed that
|
<p>The OLE file format is not discussed in this document. It is assumed that
|
||||||
the reader has a working knowledge of the POIFS API. </p>
|
the reader has a working knowledge of the POIFS API. </p>
|
||||||
|
|
||||||
<s2 title="Word file structure">
|
<section title="Word file structure">
|
||||||
<p>A Word file is made up of the document text and data structures
|
<p>A Word file is made up of the document text and data structures
|
||||||
containing formatting information about the text. Of course, this is a
|
containing formatting information about the text. Of course, this is a
|
||||||
very simplified illustration. There are fields and macros and other
|
very simplified illustration. There are fields and macros and other
|
||||||
things that have not been considered. At this stage, HDF is mainly
|
things that have not been considered. At this stage, HDF is mainly
|
||||||
concerned with formatted text.</p>
|
concerned with formatted text.</p>
|
||||||
</s2>
|
</section>
|
||||||
<s2 title="Reading Word files">
|
<section title="Reading Word files">
|
||||||
<p>The entry point for HDF's reading of a Word file is the File Information
|
<p>The entry point for HDF's reading of a Word file is the File Information
|
||||||
Block (FIB). This structure is the entry point for the locations and size
|
Block (FIB). This structure is the entry point for the locations and size
|
||||||
of a document's text and data structures. The FIB is located at the
|
of a document's text and data structures. The FIB is located at the
|
||||||
beginning of the main stream.</p>
|
beginning of the main stream.</p>
|
||||||
<s3 title="Text">
|
<section title="Text">
|
||||||
<p>The document's text is also located in the main stream. Its starting
|
<p>The document's text is also located in the main stream. Its starting
|
||||||
location is given as FIB.fcMin and its length is given in bytes by
|
location is given as FIB.fcMin and its length is given in bytes by
|
||||||
FIB.ccpText. These two values are not very useful in getting the text
|
FIB.ccpText. These two values are not very useful in getting the text
|
||||||
@@ -46,9 +46,9 @@
|
|||||||
If the piece uses unicode, the file offset is masked with a certain bit.
|
If the piece uses unicode, the file offset is masked with a certain bit.
|
||||||
Then you have to unmask the bit and divide by 2 to get the real file
|
Then you have to unmask the bit and divide by 2 to get the real file
|
||||||
offset. </p>
|
offset. </p>
|
||||||
</s3>
|
</section>
|
||||||
<s3 title="Text Formatting">
|
<section title="Text Formatting">
|
||||||
<s4 title="Stylesheet">
|
<section title="Stylesheet">
|
||||||
<p>All text formatting is based on styles contained in the StyleSheet.
|
<p>All text formatting is based on styles contained in the StyleSheet.
|
||||||
The StyleSheet is a data structure containing among other things, style
|
The StyleSheet is a data structure containing among other things, style
|
||||||
descriptions. Each style description can contain a paragraph style and
|
descriptions. Each style description can contain a paragraph style and
|
||||||
@@ -57,8 +57,8 @@
|
|||||||
from another style.</p>
|
from another style.</p>
|
||||||
<p>Eventually, you have to chain back to the nil style which is an
|
<p>Eventually, you have to chain back to the nil style which is an
|
||||||
imaginary style with certain implied values.</p>
|
imaginary style with certain implied values.</p>
|
||||||
</s4>
|
</section>
|
||||||
<s4 title="Paragraph and Character styles">
|
<section title="Paragraph and Character styles">
|
||||||
<p>Paragraph and Character formatting properties for a document's text are
|
<p>Paragraph and Character formatting properties for a document's text are
|
||||||
stored on file as deltas from some base style in the Stylesheet. The
|
stored on file as deltas from some base style in the Stylesheet. The
|
||||||
deltas are used to create a complete uncompressed style in memory.</p>
|
deltas are used to create a complete uncompressed style in memory.</p>
|
||||||
@@ -75,8 +75,8 @@
|
|||||||
compressed properties for that interval. The compressed PAPX is based on
|
compressed properties for that interval. The compressed PAPX is based on
|
||||||
its base style in the StyleSheet. The compressed CHPX is based on the
|
its base style in the StyleSheet. The compressed CHPX is based on the
|
||||||
enclosing paragraph's base style in the Stylesheet.</p>
|
enclosing paragraph's base style in the Stylesheet.</p>
|
||||||
</s4>
|
</section>
|
||||||
<s4 title="Uncompressing styles and other data structures">
|
<section title="Uncompressing styles and other data structures">
|
||||||
<p>All compressed properties(CHPX, PAPX, SEPX) contain a grpprl. A grpprl
|
<p>All compressed properties(CHPX, PAPX, SEPX) contain a grpprl. A grpprl
|
||||||
is an array of sprms. A sprm defines a delta from some base property.
|
is an array of sprms. A sprm defines a delta from some base property.
|
||||||
There is a table of possible sprms in the Word 97 spec. Each sprm is a
|
There is a table of possible sprms in the Word 97 spec. Each sprm is a
|
||||||
@@ -85,10 +85,10 @@
|
|||||||
the base style. After every sprm in the grpprl is performed on the base
|
the base style. After every sprm in the grpprl is performed on the base
|
||||||
style you will have the style for the paragraph, character run,
|
style you will have the style for the paragraph, character run,
|
||||||
section, etc.</p>
|
section, etc.</p>
|
||||||
</s4>
|
</section>
|
||||||
</s3>
|
</section>
|
||||||
</s2>
|
</section>
|
||||||
</s1>
|
</section>
|
||||||
</body>
|
</body>
|
||||||
</document>
|
</document>
|
||||||
|
|
||||||
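The change in this diff is mechanical: the DOCTYPE moves from the V1.0 to the V1.1 document DTD, and every `s1` through `s4` element becomes a nested `section` element. A minimal sketch of that rewrite follows, assuming the files only use the forms seen above. The function name `migrate_to_v11` and the regular expressions are hypothetical illustrations, not part of the POI build or of `./build.sh`.

```python
import re

# Hypothetical helper for illustration only; not part of the POI build.
# Matches opening tags <s1 ...> through <s4 ...>, capturing any attributes.
_OPEN_TAG = re.compile(r"<s[1-4](\s[^>]*)?>")
# Matches closing tags </s1> through </s4>.
_CLOSE_TAG = re.compile(r"</s[1-4]>")

# DOCTYPE identifiers copied from the old and new sides of the diff.
_OLD_DOCTYPE = '"-//APACHE//DTD Documentation V1.0//EN" "../dtd/document-v10.dtd"'
_NEW_DOCTYPE = '"-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd"'


def migrate_to_v11(xml_text: str) -> str:
    """Rewrite v1.0-style markup (s1..s4) to v1.1-style nested sections."""
    # Point the document at the V1.1 DTD.
    xml_text = xml_text.replace(_OLD_DOCTYPE, _NEW_DOCTYPE)
    # Replace each numbered opening tag, keeping its attributes unchanged.
    xml_text = _OPEN_TAG.sub(
        lambda m: "<section" + (m.group(1) or "") + ">", xml_text
    )
    # Replace each numbered closing tag.
    return _CLOSE_TAG.sub("</section>", xml_text)


if __name__ == "__main__":
    sample = (
        '<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" '
        '"../dtd/document-v10.dtd">\n'
        '<s1 title="Outer"><s2 title="Inner"><p>text</p></s2></s1>'
    )
    print(migrate_to_v11(sample))
```

Because the v1.1 form relies on nesting rather than numbered tags to express depth, the sketch does not need to track heading levels. It only works if the numbered tags in the input were already properly nested.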