diff --git a/src/documentation/content/xdocs/hpsf/how-to.xml b/src/documentation/content/xdocs/hpsf/how-to.xml
index 0073126c9..aadf753a4 100644
--- a/src/documentation/content/xdocs/hpsf/how-to.xml
+++ b/src/documentation/content/xdocs/hpsf/how-to.xml
@@ -92,6 +92,12 @@
properties. Chances are that you will find here what you need and don't have to read the other sections.
+
If all you are interested in is getting the textual content of
+ all the document properties, for example for full-text indexing, then
+ take a look at
+ org.apache.poi.hpsf.extractor.HPSFPropertiesExtractor
. However,
+ if you want full access to the properties, please read on!
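
As a rough illustration of the extractor route described above, the following sketch prints the textual form of all document properties. It assumes the class is spelled `HPSFPropertiesExtractor` (matching the `hpsf` package name), that it can be constructed from a `POIFSFileSystem`, and that the POI jar is on the classpath; the class name `PropertiesTextDump` and the command-line path are hypothetical.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hpsf.extractor.HPSFPropertiesExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical demo class: dumps all document properties as plain text.
public class PropertiesTextDump {
    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path of an OLE2 Office file (.doc, .xls, .ppt)
        try (InputStream in = new FileInputStream(args[0])) {
            // Open the internal filesystem of the Office file
            POIFSFileSystem fs = new POIFSFileSystem(in);

            // The extractor renders the property sets as text
            HPSFPropertiesExtractor extractor = new HPSFPropertiesExtractor(fs);

            // Suitable for feeding into a full-text indexer
            System.out.println(extractor.getText());
        }
    }
}
```

This only yields text; reading or changing individual property values still needs the property-set API covered in the rest of the how-to.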
The first thing you should understand is that a Microsoft Office file is not one large bunch of bytes but has an internal filesystem structure with files and directories. You can access these files and directories using
diff --git a/src/documentation/content/xdocs/hwpf/quick-guide.xml b/src/documentation/content/xdocs/hwpf/quick-guide.xml
index bf046258e..d717b0ef0 100644
--- a/src/documentation/content/xdocs/hwpf/quick-guide.xml
+++ b/src/documentation/content/xdocs/hwpf/quick-guide.xml
@@ -55,13 +55,25 @@
can then get text and other properties.
+To get at the headers and footers of a Word document, first create a
+org.apache.poi.hwpf.HWPFDocument
. Next, you need to create a
+org.apache.poi.hwpf.usermodel.HeaderStories
, passing it your
+HWPFDocument. Finally, the HeaderStories gives you access to the headers and
+footers, including first / even / odd page ones if defined in your
+document. Additionally, HeaderStories provides a method for removing
+any macros in the text, which is helpful as many headers and footers
+end up with macros in them.
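
The steps above might look roughly like the sketch below. It assumes the class is `org.apache.poi.hwpf.usermodel.HeaderStories` from the POI scratchpad jar, that `setAreFieldsStripped(true)` is the option the text refers to as removing macros (the setter itself speaks of fields), and that the getters return the header or footer text as a `String`. The class name `HeaderFooterDump` and the input path are hypothetical.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.HeaderStories;

// Hypothetical demo class: prints the headers and footers of a .doc file.
public class HeaderFooterDump {
    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path of a binary Word (.doc) file
        try (InputStream in = new FileInputStream(args[0])) {
            // Step 1: load the document
            HWPFDocument doc = new HWPFDocument(in);

            // Step 2: wrap it in a HeaderStories
            HeaderStories stories = new HeaderStories(doc);

            // Assumption: this is the cleanup option the text describes
            stories.setAreFieldsStripped(true);

            // Step 3: read the first / even / odd variants, if the document defines them
            System.out.println("First header: " + stories.getFirstHeader());
            System.out.println("Even header:  " + stories.getEvenHeader());
            System.out.println("Odd header:   " + stories.getOddHeader());
            System.out.println("First footer: " + stories.getFirstFooter());
            System.out.println("Even footer:  " + stories.getEvenFooter());
            System.out.println("Odd footer:   " + stories.getOddFooter());

            // Convenience lookup: the header that applies to a given page (1-based)
            System.out.println("Header on page 1: " + stories.getHeader(1));
        }
    }
}
```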
It is possible to change the text via
insertBefore()
and insertAfter()
on a Range
object (either a Range
,
Paragraph
or CharacterRun
).
- It is also possible to delete a Range
, but this
- code is know to have bugs in it.
+ It is also possible to delete a Range
.
+ This code works in many cases, but not all, and patches to
+ improve it are gratefully received!
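
A minimal sketch of the editing calls mentioned above, assuming `insertBefore(String)`, `insertAfter(String)` and `delete()` are available on `Range` and its subclasses and that `HWPFDocument.write(OutputStream)` saves the result. The class name `RangeEditDemo`, the inserted strings and the two file paths are hypothetical, and, as noted above, `delete()` may not behave correctly on every document.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;

// Hypothetical demo class: edits the text of a .doc file and saves a copy.
public class RangeEditDemo {
    public static void main(String[] args) throws IOException {
        // args[0]: input .doc path, args[1]: output .doc path (both assumed)
        HWPFDocument doc;
        try (InputStream in = new FileInputStream(args[0])) {
            doc = new HWPFDocument(in);
        }

        // The overall Range covers the whole main document text
        Range range = doc.getRange();

        if (range.numParagraphs() > 0) {
            // A Paragraph is itself a Range, so the same calls apply to it
            Paragraph first = range.getParagraph(0);
            first.insertBefore("[reviewed] ");
            first.insertAfter(" [end of first paragraph]");
        }

        if (range.numParagraphs() > 1) {
            // Deleting a Range: works in many cases, but check the output
            Paragraph last = range.getParagraph(range.numParagraphs() - 1);
            last.delete();
        }

        try (OutputStream out = new FileOutputStream(args[1])) {
            doc.write(out);
        }
    }
}
```

Writing to a separate output file keeps the original intact, which makes it easier to compare results when a delete goes wrong.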