diff --git a/src/documentation/content/xdocs/book.xml b/src/documentation/content/xdocs/book.xml index c803b12d6..13538d3e9 100644 --- a/src/documentation/content/xdocs/book.xml +++ b/src/documentation/content/xdocs/book.xml @@ -21,7 +21,8 @@ - + + diff --git a/src/documentation/content/xdocs/hslf/book.xml b/src/documentation/content/xdocs/hslf/book.xml new file mode 100644 index 000000000..cc92cdb1c --- /dev/null +++ b/src/documentation/content/xdocs/hslf/book.xml @@ -0,0 +1,18 @@ + + + + + + + + + + + + + + + + diff --git a/src/documentation/content/xdocs/hslf/index.xml b/src/documentation/content/xdocs/hslf/index.xml new file mode 100755 index 000000000..5d176fa8c --- /dev/null +++ b/src/documentation/content/xdocs/hslf/index.xml @@ -0,0 +1,33 @@ + + + + + +
+ POI-HSLF - Java API To Access Microsoft PowerPoint Format Files + Overview + + + +
+ + +
+ Overview + +

HSLF is the POI Project's pure Java implementation of the PowerPoint file format.

+

HSLF provides a way to read PowerPoint presentations and extract text from them. + It also provides some (currently limited) editing capabilities. +

+ This code currently lives in the scratchpad area of the POI CVS repository. + Ensure that you have the scratchpad jar or the scratchpad build area in your + classpath before experimenting with this code. + +

The quick guide documentation provides + information on using this API. Comments and fixes gratefully accepted on the POI + dev mailing lists.

+ + +
+ +
diff --git a/src/documentation/content/xdocs/hslf/quick-guide.xml b/src/documentation/content/xdocs/hslf/quick-guide.xml new file mode 100644 index 000000000..5f6525232 --- /dev/null +++ b/src/documentation/content/xdocs/hslf/quick-guide.xml @@ -0,0 +1,58 @@ + + + + + +
+ POI-HSLF - A Quick Guide + Overview + + + +
+ + +
Basic Text Extraction +

For basic text extraction, make use of +org.apache.poi.hslf.extractor.PowerPointExtractor. It accepts a file or an input +stream. The getText() method can be used to get the text from the slides, +from the notes, or from both. +

+
+ +
Specific Text Extraction +

To get specific bits of text, first create an org.apache.poi.hslf.usermodel.SlideShow +(from an org.apache.poi.hslf.HSLFSlideShow, which accepts a file or an input +stream). Use getSlides() and getNotes() to get the slides and notes. +These can be queried to get their page ID (though they should be returned +in the right order). You can also call getTextRuns() on these to get their +blocks of text. From a TextRun, you can extract the text and check +what type of text it is (e.g. Body, Title). +

+
+ +
Changing Text +

It is possible to change the text via TextRun.setText(String). However, if +the length of the text is changed, things will break, because PowerPoint stores +internal file references as byte offsets, and not all of these are updated yet when +the size changes. +
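The offset problem described above can be illustrated with a toy model. This is a minimal sketch only: the class OffsetDemo and its methods are hypothetical, the "file" is just ASCII records laid end to end, and none of it is POI code or the real PowerPoint record layout. It shows how a stored byte offset goes stale when an earlier record grows, and how shifting the later offsets by the size difference repairs it.

```java
import java.nio.charset.StandardCharsets;

/** Toy model only: not the PowerPoint format and not POI code. */
final class OffsetDemo {
    /** Concatenates the records into one ASCII byte stream. */
    static byte[] pack(String[] records) {
        StringBuilder sb = new StringBuilder();
        for (String r : records) {
            sb.append(r);
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    /** Computes the absolute byte offset at which each record starts. */
    static int[] offsets(String[] records) {
        int[] out = new int[records.length];
        int pos = 0;
        for (int i = 0; i < records.length; i++) {
            out[i] = pos;
            pos += records[i].length(); // ASCII: one byte per char
        }
        return out;
    }

    /** Reads length bytes at offset, as a reader following a stored reference would. */
    static String readAt(byte[] data, int offset, int length) {
        return new String(data, offset, length, StandardCharsets.US_ASCII);
    }

    /** Returns a copy of the offsets with every entry after changedIndex moved by delta. */
    static int[] shiftAfter(int[] offsets, int changedIndex, int delta) {
        int[] out = offsets.clone();
        for (int i = changedIndex + 1; i < out.length; i++) {
            out[i] += delta;
        }
        return out;
    }
}
```

With records "Title", "Body" and "Notes", the stored offsets are 0, 5 and 9. If the first record is changed to "Long Title" (five bytes longer) and the offsets are left alone, reading four bytes at offset 5 returns "Titl" rather than "Body". Calling shiftAfter(offsets, 0, 5) gives 0, 10 and 14, and the same read at offset 10 returns "Body" again.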

+
+ +
Guide to key classes +
    +
  • org.apache.poi.hslf.HSLFSlideShow + Handles reading in and writing out files. Generates a tree of the records + in the file +
  • +
  • org.apache.poi.hslf.usermodel.SlideShow + Builds up model entries from the records, and presents a user-facing + view of the file +
  • +
  • org.apache.poi.hslf.extractor.PowerPointExtractor + Uses the model code to allow extraction of text from files +
  • +
+
+ +
\ No newline at end of file