From bbe5c8d867807b05e7e4c78e03c8bc37e603fe50 Mon Sep 17 00:00:00 2001 From: Nick Burch Date: Sun, 31 Aug 2008 17:24:10 +0000 Subject: [PATCH] HPBF docs update git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@690739 13f79535-47bb-0310-9956-ffa450edef68 --- .../content/xdocs/hpbf/file-format.xml | 21 +++++++++++++++++++ .../content/xdocs/hpbf/index.xml | 2 +- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/src/documentation/content/xdocs/hpbf/file-format.xml b/src/documentation/content/xdocs/hpbf/file-format.xml index e08ebbac0..ed0185828 100644 --- a/src/documentation/content/xdocs/hpbf/file-format.xml +++ b/src/documentation/content/xdocs/hpbf/file-format.xml @@ -171,6 +171,27 @@ PL 62 1a 00 00 48 00 00 00 // PL from: 1a62 (6754), len: 48 (72) think that the second 4 bytes of text describes the format of data block at the offset. The format of the text block is easy, but we're still trying to figure out the others.

+ +
Structure of TEXT bit +

This is very simple. All the text for the document is + stored in a single bit of the Quill CONTENTS. The text + is stored as little-endian 16-bit Unicode strings.
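A minimal sketch of decoding such a bit, assuming the raw bytes of the TEXT bit have already been located and copied out of the Quill CONTENTS stream. The class and method names here are hypothetical and are not part of POI's API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not POI's own code.
class TextBitSketch {
    // Interprets the bytes of a TEXT bit as little-endian 16-bit Unicode.
    static String decodeTextBit(byte[] data) {
        return new String(data, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Two characters, each stored low byte first.
        byte[] sample = { 'H', 0, 'i', 0 };
        System.out.println(decodeTextBit(sample));
    }
}
```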

+
+
Structure of PLC bit +

The first four bytes seem to hold the count of the + entries in the bit, and the second four bytes seem to hold + the type. There is then some pre-data, and then data for + each of the entries, the exact format dependent on the type.

+

Type 0 has four 2-byte unsigned ints, then a pair of 2-byte + unsigned ints for each entry.

+

Type 4 has four 2-byte unsigned ints, then a pair of 4-byte + unsigned ints for each entry.

+

Type 8 has seven 2-byte unsigned ints, then a pair of 4-byte + unsigned ints for each entry.
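The layouts of types 0, 4 and 8 could be read roughly as follows. This is only a sketch of the description above, not POI's implementation: the class name and fields are hypothetical, little-endian byte order is an assumption carried over from the rest of the format, and the meaning of the pre-data and entry values is still unknown.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical holder for a parsed PLC bit of type 0, 4 or 8.
class PlcBitSketch {
    final int type;
    final int[] preData;     // unsigned 2-byte values before the entries
    final long[][] entries;  // one pair of unsigned values per entry

    PlcBitSketch(int type, int[] preData, long[][] entries) {
        this.type = type;
        this.preData = preData;
        this.entries = entries;
    }

    static PlcBitSketch parse(byte[] data) {
        // Assumption: all fields are little endian.
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int count = buf.getInt();  // first four bytes: number of entries
        int type = buf.getInt();   // second four bytes: type

        int preCount;   // how many 2-byte pre-data values to read
        boolean wide;   // true if entry values are 4 bytes, false if 2
        switch (type) {
            case 0: preCount = 4; wide = false; break;
            case 4: preCount = 4; wide = true;  break;
            case 8: preCount = 7; wide = true;  break;
            default:
                throw new IllegalArgumentException("Unsupported PLC type: " + type);
        }

        int[] pre = new int[preCount];
        for (int i = 0; i < preCount; i++) {
            pre[i] = buf.getShort() & 0xFFFF;  // mask to treat as unsigned
        }

        long[][] entries = new long[count][2];
        for (int i = 0; i < count; i++) {
            for (int j = 0; j < 2; j++) {
                entries[i][j] = wide
                        ? (buf.getInt() & 0xFFFFFFFFL)
                        : (buf.getShort() & 0xFFFF);
            }
        }
        return new PlcBitSketch(type, pre, entries);
    }
}
```

Type 12 is deliberately rejected here, since its hyperlink data does not follow this simple pre-data plus pairs layout.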

+

Type 12 holds hyperlinks, and is considerably more complex. + See org.apache.poi.hpbf.model.qcbits.QCPLCBit + for our best guess as to how the contents match up.

+
diff --git a/src/documentation/content/xdocs/hpbf/index.xml b/src/documentation/content/xdocs/hpbf/index.xml index 01f49f061..84d6948fd 100755 --- a/src/documentation/content/xdocs/hpbf/index.xml +++ b/src/documentation/content/xdocs/hpbf/index.xml @@ -41,7 +41,7 @@ lots of offsets to other parts of the file.

Our initial aim is to provide a text extractor for the format (now done), and be able to extract hyperlinks from within - the document (not yet supported). Additional low level + the document (partly supported). Additional low level code to process the file format may follow, if demand and developer interest warrant it.

At this time, there is no usermodel api or similar.