HPBF docs update

git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@690739 13f79535-47bb-0310-9956-ffa450edef68
Nick Burch 2008-08-31 17:24:10 +00:00
parent 7ba82ba657
commit bbe5c8d867
2 changed files with 22 additions and 1 deletions

@ -171,6 +171,27 @@ PL 62 1a 00 00 48 00 00 00 // PL from: 1a62 (6754), len: 48 (72)
think that the second 4 bytes of text describes the format think that the second 4 bytes of text describes the format
of data block at the offset. The format of the text block of data block at the offset. The format of the text block
is easy, but we're still trying to figure out the others.</p> is easy, but we're still trying to figure out the others.</p>
<section><title>Structure of TEXT bit</title>
<p>This is very simple. All the text for the document is
stored in a single bit of the Quill CONTENTS. The text
is stored as little endian 16 bit unicode strings.</p>
</section>
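<p>The TEXT layout described above can be sketched as follows. This is a minimal illustration, not the HPBF implementation: the class and method names are hypothetical, and it assumes the caller has already located the offset and length of the TEXT bit inside the Quill CONTENTS.</p>

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of POI: decodes the raw bytes of a TEXT bit.
final class QuillTextSketch {
    private QuillTextSketch() {}

    // Interprets length bytes starting at offset as little endian
    // 16 bit unicode (UTF-16LE), as described for the TEXT bit.
    static String decodeText(byte[] data, int offset, int length) {
        if (length % 2 != 0) {
            throw new IllegalArgumentException("Expected an even number of bytes, got " + length);
        }
        return new String(data, offset, length, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Two 16 bit code units, low byte first: 'H' (0x0048), 'i' (0x0069)
        byte[] raw = { 0x48, 0x00, 0x69, 0x00 };
        System.out.println(decodeText(raw, 0, raw.length));
    }
}
```

<p>Rejecting odd lengths is a choice made for this sketch. A real extractor might prefer to truncate the trailing byte instead.</p>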
<section><title>Structure of PLC bit</title>
<p>The first four bytes seem to hold the count of the
entries in the bit, and the second four bytes seem to hold
the type. There is then some pre-data, and then data for
each of the entries, the exact format dependent on the type.</p>
<p>Type 0 has four 2 byte unsigned ints, then a pair of 2 byte
unsigned ints for each entry.</p>
<p>Type 4 has four 2 byte unsigned ints, then a pair of 4 byte
unsigned ints for each entry.</p>
<p>Type 8 has seven 2 byte unsigned ints, then a pair of 4 byte
unsigned ints for each entry.</p>
<p>Type 12 holds hyperlinks, and is very much more complex.
See <code>org.apache.poi.hpbf.model.qcbits.QCPLCBit</code>
for our best guess as to how the contents match up.</p>
</section>
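<p>The layouts for types 0, 4 and 8 can be sketched as below. This is an illustration under stated assumptions rather than the code in <code>QCPLCBit</code>: the class name and fields are hypothetical, all integers are assumed to be little endian (the text only states this for the TEXT bit), and the count and type are read as 4 byte values. Type 12 is deliberately not handled.</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical reader for PLC bits of type 0, 4 and 8; not part of POI.
final class PlcBitSketch {
    final int type;
    final int[] preData;     // the 2 byte unsigned ints before the entries
    final long[][] entries;  // one pair of unsigned ints per entry

    private PlcBitSketch(int type, int[] preData, long[][] entries) {
        this.type = type;
        this.preData = preData;
        this.entries = entries;
    }

    static PlcBitSketch parse(byte[] data) {
        // Assumption: little endian throughout
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int count = buf.getInt(); // first four bytes: number of entries
        int type = buf.getInt();  // second four bytes: type
        if (count < 0) {
            throw new IllegalArgumentException("Implausible entry count " + count);
        }

        int preCount;
        boolean fourByteEntries;
        switch (type) {
            case 0: preCount = 4; fourByteEntries = false; break;
            case 4: preCount = 4; fourByteEntries = true;  break;
            case 8: preCount = 7; fourByteEntries = true;  break;
            default:
                throw new IllegalArgumentException("Unsupported PLC type " + type);
        }

        int[] pre = new int[preCount];
        for (int i = 0; i < preCount; i++) {
            pre[i] = buf.getShort() & 0xFFFF; // mask to read as unsigned
        }

        long[][] pairs = new long[count][2];
        for (int i = 0; i < count; i++) {
            for (int j = 0; j < 2; j++) {
                pairs[i][j] = fourByteEntries
                        ? (buf.getInt() & 0xFFFFFFFFL)
                        : (buf.getShort() & 0xFFFF);
            }
        }
        return new PlcBitSketch(type, pre, pairs);
    }
}
```

<p>A truncated buffer surfaces as a <code>BufferUnderflowException</code> from <code>ByteBuffer</code>, which seems acceptable for a format that is still being reverse engineered.</p>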
</section> </section>
</body> </body>
</document> </document>

@ -41,7 +41,7 @@
lots of offsets to other parts of the file.</p> lots of offsets to other parts of the file.</p>
<p>Our initial aim is to provide a text extractor for the format <p>Our initial aim is to provide a text extractor for the format
(now done), and be able to extract hyperlinks from within (now done), and be able to extract hyperlinks from within
the document (not yet supported). Additional low level the document (partly supported). Additional low level
code to process the file format may follow, if there code to process the file format may follow, if there
is demand and developer interest warrant it.</p> is demand and developer interest warrant it.</p>
<p>At this time, there is no <em>usermodel</em> api or similar. <p>At this time, there is no <em>usermodel</em> api or similar.