diff --git a/src/documentation/content/xdocs/poifs/how-to.xml b/src/documentation/content/xdocs/poifs/how-to.xml
index e1bc8c2c6..5e107070d 100644
--- a/src/documentation/content/xdocs/poifs/how-to.xml
+++ b/src/documentation/content/xdocs/poifs/how-to.xml
@@ -27,12 +27,9 @@
How To Use the POIFS APIs -

This document describes how to use the POIFS APIs to read, write, and modify files that employ a POIFS-compatible data structure to organize their content.

-
Revision History - -
+

This document describes how to use the POIFS APIs to read, write, + and modify files that employ a POIFS-compatible data structure to + organize their content.

Target Audience

This document is intended for Java developers who need to use the POIFS APIs to read, write, or modify files that employ a POIFS-compatible data structure to organize their content. It is not necessary for developers to understand the POIFS data structures, and an explanation of those data structures is beyond the scope of this document. It is expected that the members of the target audience will understand the rudiments of a hierarchical file system, and familiarity with the event pattern employed by Java APIs such as AWT would be helpful.

@@ -87,7 +84,7 @@ Disadvantages - Conventional Reading + Conventional Reading (POIFSFileSystem) Simpler API similar to reading a conventional file system.
Can read documents in any order. @@ -96,6 +93,19 @@ All files are resident in memory, whether your application needs them or not. + + New NIO driven Reading (NPOIFSFileSystem) + + Simpler API similar to reading a conventional file system.
+ Can read documents in any order.
+ Lower memory than POIFSFileSystem + + + If created from an InputStream, all files are resident in memory. + (If created from a File, only certain key structures are)
+ Currently doesn't support writing + + Event-Driven Reading @@ -135,9 +145,8 @@ DirectoryEntry root = fs.getRoot();

Once the file system has been loaded into memory and the root directory has been obtained, the root directory can be read. The following code fragment shows how to read the entries in an org.apache.poi.poifs.filesystem.DirectoryEntry instance:

 // dir is an instance of DirectoryEntry ...
-for (Iterator iter = dir.getEntries(); iter.hasNext(); )
+for (Entry entry : dir)
 {
-    Entry entry = (Entry)iter.next();
     System.out.println("found entry: " + entry.getName());
     if (entry instanceof DirectoryEntry)
     {
@@ -197,6 +206,56 @@ DocumentInputStream stream = new DocumentInputStream(document);
+
NIO Reading using NPOIFSFileSystem +

In this technique for reading, certain key structures are loaded + into memory, and the entire directory tree can be walked by the + application, reading specific documents at leisure.

+

If you create an NPOIFSFileSystem instance from a File, the memory + footprint is very small. However, if you create an NPOIFSFileSystem + instance from an InputStream, then the whole contents must be + buffered into memory to allow random access. As such, you should + budget for memory use of up to 20% of the file size when using a File, + or up to 120% of the file size when using an InputStream.
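
The budgeting rule above can be turned into a quick estimate. The following is a minimal sketch in plain Java that does not use the POIFS APIs; the class and method names (`PoifsMemoryBudget`, `budgetForFile`, `budgetForStream`) are hypothetical and exist only for this illustration. The 20% and 120% figures are the upper bounds quoted above, not measured values.

```java
public class PoifsMemoryBudget {

    // Upper bound when opening from a File: 20% of the file size, rounded up.
    // (Figure taken from the guidance above; it is an estimate, not a guarantee.)
    static long budgetForFile(long fileSizeBytes) {
        return (fileSizeBytes + 4) / 5;
    }

    // Upper bound when opening from an InputStream: 120% of the file size,
    // i.e. the fully buffered contents plus the same 20% overhead.
    static long budgetForStream(long fileSizeBytes) {
        return fileSizeBytes + budgetForFile(fileSizeBytes);
    }

    public static void main(String[] args) {
        // Hypothetical 50 MB document, used only as an example input.
        long size = 50L * 1024 * 1024;
        System.out.println("File budget (bytes):        " + budgetForFile(size));
        System.out.println("InputStream budget (bytes): " + budgetForStream(size));
    }
}
```

For a file that is already on disk, an estimate like this suggests that opening it from a File is usually the cheaper choice, and that wrapping it in a FileInputStream first gives up that advantage.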

+
Preparation +

Before an application can read a file from the file system, the + file system needs to be opened and core parts processed. This is done + using the org.apache.poi.poifs.filesystem.NPOIFSFileSystem + class. Once the file system has been loaded into memory, the + application may need the root directory. The following code fragment + will accomplish this preparation stage:

+
+// This is the most memory efficient way to open the FileSystem
+try
+{
+    NPOIFSFileSystem fs = new NPOIFSFileSystem(new File(filename));
+    // obtain the root directory only once the file system has opened
+    DirectoryEntry root = fs.getRoot();
+}
+catch (IOException e)
+{
+    // an I/O error occurred, or the File did not contain a compatible
+    // POIFS data structure
+}
+
+
+// Using an InputStream requires more memory than using a File
+try
+{
+    NPOIFSFileSystem fs = new NPOIFSFileSystem(inputStream);
+    // obtain the root directory only once the file system has opened
+    DirectoryEntry root = fs.getRoot();
+}
+catch (IOException e)
+{
+    // an I/O error occurred, or the InputStream did not provide a compatible
+    // POIFS data structure
+}
+
+

Assuming no exception was thrown, the file system can then be read.

+

Once the NPOIFSFileSystem is open, you can manipulate it just like + a POIFSFileSystem.

+
+
Event-Driven Reading

The event-driven API for reading documents is a little more complicated and requires that your application know, in advance, which files it wants to read. The benefit of using this API is that each document is in memory just long enough for your application to read it, and documents that you never read at all are not in memory at all. When you're finished reading the documents you wanted, the file system has no data structures associated with it at all and can be discarded.

Preparation