2003-04-23 20:53:41 -04:00
|
|
|
<?xml version="1.0" encoding="UTF-8"?>
|
2007-01-15 18:11:09 -05:00
|
|
|
<!--
|
|
|
|
====================================================================
|
|
|
|
Licensed to the Apache Software Foundation (ASF) under one or more
|
|
|
|
contributor license agreements. See the NOTICE file distributed with
|
|
|
|
this work for additional information regarding copyright ownership.
|
|
|
|
The ASF licenses this file to You under the Apache License, Version 2.0
|
|
|
|
(the "License"); you may not use this file except in compliance with
|
|
|
|
the License. You may obtain a copy of the License at
|
|
|
|
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
|
|
|
|
Unless required by applicable law or agreed to in writing, software
|
|
|
|
distributed under the License is distributed on an "AS IS" BASIS,
|
|
|
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
|
|
See the License for the specific language governing permissions and
|
|
|
|
limitations under the License.
|
|
|
|
====================================================================
|
|
|
|
-->
|
2008-03-21 08:59:42 -04:00
|
|
|
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.3//EN" "./dtd/document-v13.dtd">
|
2003-04-23 20:53:41 -04:00
|
|
|
|
|
|
|
<document>
|
|
|
|
<header>
|
2009-11-19 16:22:21 -05:00
|
|
|
<title>Apache POI - Component Overview</title>
|
2003-04-23 20:53:41 -04:00
|
|
|
<authors>
|
2003-07-19 21:42:17 -04:00
|
|
|
<person id="AO" name="Andrew C. Oliver" email="acoliver@apache.org"/>
|
2005-01-25 15:08:18 -05:00
|
|
|
<person id="RK" name="Rainer Klute" email="klute@apache.org"/>
|
2009-11-19 16:22:21 -05:00
|
|
|
<person id="DF" name="David Fisher" email="dfisher@jmlafferty.com"/>
|
2003-04-23 20:53:41 -04:00
|
|
|
</authors>
|
|
|
|
</header>
|
|
|
|
<body>
|
2009-11-19 16:22:21 -05:00
|
|
|
<section><title>Apache POI Project Components</title>
|
|
|
|
<section><title>POIFS for OLE 2 Documents</title>
|
|
|
|
<p>
|
|
|
|
POIFS is the oldest and most stable part of the project. It is our port of the OLE 2 Compound Document Format to
|
|
|
|
pure Java. It supports both read and write functionality. All of our components ultimately rely on it by
|
|
|
|
definition. Please see <link href="./poifs/index.html">the POIFS project page</link> for more information.
|
|
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section><title>HSSF and XSSF for Excel Documents</title>
|
|
|
|
<p>
|
|
|
|
HSSF is our port of the Microsoft Excel 97(-2007) file format (BIFF8) to pure
|
|
|
|
Java. XSSF is our port of the Microsoft Excel XML (2007+) file format (OOXML) to
|
|
|
|
pure Java. SS is a package that provides common support for both formats with a common API.
|
|
|
|
They both support read and write capability. Please see
|
|
|
|
<link href="./spreadsheet/index.html">the HSSF+XSSF project page</link> for more
|
|
|
|
information.
|
|
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section><title>HWPF and XWPF for Word Documents</title>
|
|
|
|
<p>
|
|
|
|
HWPF is our port of the Microsoft Word 97 (-2003) file format to pure
|
2010-06-30 13:40:33 -04:00
|
|
|
Java. It supports read, and limited write capabilities. It also provides
|
|
|
|
simple text extraction support for the older Word 6 and Word 95 formats.
|
|
|
|
Please see <link href="./hwpf/index.html">the HWPF project page for more
|
2009-11-19 16:22:21 -05:00
|
|
|
information</link>. This component remains in early stages of
|
|
|
|
development. It can already read and write simple files.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
We are also working on the XWPF for the WordprocessingML (2007+) format from the
|
2010-06-30 13:40:33 -04:00
|
|
|
OOXML specification. This provides read and write support for simpler
|
|
|
|
files, along with text extraction capabilities.
|
2009-11-19 16:22:21 -05:00
|
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section><title>HSLF and XSLF for PowerPoint Documents</title>
|
|
|
|
<p>
|
|
|
|
HSLF is our port of the Microsoft PowerPoint 97(-2003) file format to pure
|
|
|
|
Java. It supports read and write capabilities. Please see <link
|
|
|
|
href="./slideshow/index.html">the HSLF project page for more
|
|
|
|
information</link>.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
We are also working on the XSLF for the PresentationML (2007+) format from the
|
|
|
|
OOXML specification.
|
|
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section><title>HPSF for OLE 2 Document Properties</title>
|
|
|
|
<p>
|
|
|
|
HPSF is our port of the OLE 2 property set format to pure
|
|
|
|
Java. Property sets are mostly use to store a document's properties
|
|
|
|
(title, author, date of last modification etc.), but they can be used
|
|
|
|
for application-specific purposes as well.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
HPSF supports both reading and writing of properties.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
Please see <link href="./hpsf/index.html">the HPSF project
|
|
|
|
page</link> for more information.
|
|
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section><title>HDGF for Visio Documents</title>
|
|
|
|
<p>
|
|
|
|
HDGF is our port of the Microsoft Viso 97(-2003) file format to pure
|
|
|
|
Java. It currently only supports reading at a very low level, and
|
|
|
|
simple text extraction. Please see <link
|
|
|
|
href="./hdgf/index.html">the HDGF project page for more
|
|
|
|
information</link>.
|
|
|
|
</p>
|
|
|
|
</section>
|
|
|
|
<section><title>HPBF for Publisher Documents</title>
|
|
|
|
<p>
|
|
|
|
HPBF is our port of the Microsoft Publisher 98(-2007) file format to pure
|
|
|
|
Java. It currently only supports reading at a low level for around
|
|
|
|
half of the file parts, and simple text extraction. Please see <link
|
|
|
|
href="./hpbf/index.html">the HPBF project page for more
|
|
|
|
information</link>.
|
|
|
|
</p>
|
|
|
|
</section>
|
2010-12-22 00:04:19 -05:00
|
|
|
<section><title>HMEF for TNEF (winmail.dat) Outlook Attachments</title>
|
|
|
|
<p>
|
|
|
|
HMEF is our port of the Microsoft TNEF (Transport Neutral Encoding
|
|
|
|
Format) file format to pure Java. TNEF is sometimes used by Outlook
|
|
|
|
for encoding the message, and will typically come through as
|
|
|
|
winmail.dat. HMEF currently only supports reading at a low level, but
|
|
|
|
we hope to add text and attachment extraction shortly. Please see <link
|
|
|
|
href="./hmef/index.html">the HMEF project page for more
|
|
|
|
information</link>.
|
|
|
|
</p>
|
|
|
|
</section>
|
2009-11-19 16:22:21 -05:00
|
|
|
<section><title>HSMF for Outlook Messages</title>
|
|
|
|
<p>
|
|
|
|
HSMF is our port of the Microsoft Outlook message file format to pure
|
2010-06-30 13:40:33 -04:00
|
|
|
Java. It currently only some of the textual content of MSG files, and
|
|
|
|
some attachments. Further support and documentation is coming in slowly.
|
2009-11-19 16:22:21 -05:00
|
|
|
For now, users are advised to consult the unit tests for example use.
|
|
|
|
Please see <link href="./hsmf/index.html">the HPBF project page for more
|
|
|
|
information</link>.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
Microsoft has recently added the Outlook file format to its OSP. More information
|
|
|
|
is now available making implementation of this API an easier task.
|
|
|
|
</p>
|
|
|
|
</section>
|
|
|
|
</section>
|
2003-04-23 20:53:41 -04:00
|
|
|
<section><title>What is it?</title>
|
2009-11-19 16:22:21 -05:00
|
|
|
<p>The Apache POI project is the master project for developing pure
|
2003-04-23 20:53:41 -04:00
|
|
|
Java ports of file formats based on Microsoft's OLE 2 Compound
|
|
|
|
Document Format. OLE 2 Compound Document Format is used by
|
|
|
|
Microsoft Office Documents, as well as by programs using MFC
|
|
|
|
property sets to serialize their document objects.
|
|
|
|
</p>
|
2009-11-19 16:22:21 -05:00
|
|
|
<p>Apache POI is also the master project for developing pure
|
|
|
|
Java ports of file formats based on Office Open XML (ooxml.)
|
|
|
|
OOXML is part of an ECMA / ISO standardisation effort. This
|
|
|
|
documentation is quite large, but you can normally find the bit you
|
|
|
|
need without too much effort!
|
|
|
|
<link href="http://www.ecma-international.org/publications/standards/Ecma-376.htm">ECMA-376 standard is here</link>,
|
|
|
|
and is also under the <link href="http://www.microsoft.com/interop/osp">Microsoft OSP</link>.
|
|
|
|
</p>
|
2003-04-23 20:53:41 -04:00
|
|
|
</section>
|
2009-11-19 16:22:21 -05:00
|
|
|
<section id="components"><title>Component Map</title>
|
2003-04-23 20:53:41 -04:00
|
|
|
<p>
|
2009-11-19 16:22:21 -05:00
|
|
|
The Apache POI distribution consists of support for many document file formats. This support is provided
|
|
|
|
in several Jar files. Not all of the Jars are needed for every format. The following tables
|
|
|
|
show the relationships between POI components, Maven repository tags, and the project's Jar files.
|
|
|
|
</p>
|
|
|
|
<table>
|
|
|
|
<tr>
|
|
|
|
<th>Component</th>
|
|
|
|
<th>Application type</th>
|
|
|
|
<th>Maven artifactId</th>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./poifs/index.html">POIFS</link></td>
|
|
|
|
<td>OLE2 Filesystem</td>
|
|
|
|
<td>poi</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./hpsf/index.html">HPSF</link></td>
|
|
|
|
<td>OLE2 Property Sets</td>
|
|
|
|
<td>poi</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./spreadsheet/index.html">HSSF</link></td>
|
|
|
|
<td>Excel XLS</td>
|
|
|
|
<td>poi</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./slideshow/index.html">HSLF</link></td>
|
|
|
|
<td>PowerPoint PPT</td>
|
|
|
|
<td>poi-scratchpad</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./hwpf/index.html">HWPF</link></td>
|
|
|
|
<td>Word DOC</td>
|
|
|
|
<td>poi-scratchpad</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./hdgf/index.html">HDGF</link></td>
|
|
|
|
<td>Visio VSD</td>
|
|
|
|
<td>poi-scratchpad</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./hpbf/index.html">HPBF</link></td>
|
|
|
|
<td>Publisher PUB</td>
|
|
|
|
<td>poi-scratchpad</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./hsmf/index.html">HSMF</link></td>
|
|
|
|
<td>Outlook MSG</td>
|
|
|
|
<td>poi-scratchpad</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./spreadsheet/index.html">XSSF</link></td>
|
|
|
|
<td>Excel XLSX</td>
|
|
|
|
<td>poi-ooxml</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./slideshow/index.html">XSLF</link></td>
|
|
|
|
<td>PowerPoint PPTX</td>
|
|
|
|
<td>poi-ooxml</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./hwpf/index.html">XWPF</link></td>
|
|
|
|
<td>Word DOCX</td>
|
|
|
|
<td>poi-ooxml</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td><link href="./oxml4j/index.html">OpenXML4J</link></td>
|
|
|
|
<td>OOXML</td>
|
2009-12-13 13:04:02 -05:00
|
|
|
<td>poi-ooxml-schemas, ooxml-schemas</td>
|
2009-11-19 16:22:21 -05:00
|
|
|
</tr>
|
|
|
|
</table>
|
|
|
|
<p>
|
|
|
|
This table maps artifacts into the jar file name. "version-yyyymmdd" is the POI version stamp. For the latest release it is
|
2009-12-15 04:27:44 -05:00
|
|
|
3.6-20091214.
|
2009-11-19 16:22:21 -05:00
|
|
|
</p>
|
|
|
|
<table>
|
|
|
|
<tr>
|
|
|
|
<th>Maven artifactId</th>
|
2009-12-13 13:04:02 -05:00
|
|
|
<th>Prerequisites</th>
|
2009-11-19 16:22:21 -05:00
|
|
|
<th>JAR</th>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td>poi</td>
|
2009-12-13 13:04:02 -05:00
|
|
|
<td>commons-logging, log4j</td>
|
2009-11-19 16:22:21 -05:00
|
|
|
<td>poi-version-yyyymmdd.jar</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td>poi-scratchpad</td>
|
2009-12-13 13:04:02 -05:00
|
|
|
<td>poi</td>
|
2009-11-19 16:22:21 -05:00
|
|
|
<td>poi-scratchpad-version-yyyymmdd.jar</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td>poi-ooxml</td>
|
2009-12-13 13:04:02 -05:00
|
|
|
<td>poi, poi-ooxml-schemas, dom4j</td>
|
2009-11-19 16:22:21 -05:00
|
|
|
<td>poi-ooxml-version-yyyymmdd.jar</td>
|
|
|
|
</tr>
|
2009-12-13 13:04:02 -05:00
|
|
|
<tr>
|
|
|
|
<td>poi-ooxml-schemas (*)</td>
|
|
|
|
<td>xmlbeans, geronimo-stax-api_1.0_spec</td>
|
|
|
|
<td>poi-ooxml-schemas-version-yyyymmdd.jar</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td>poi-examples (*)</td>
|
|
|
|
<td>poi, poi-scratchpad, poi-ooxml</td>
|
|
|
|
<td>poi-examples-version-yyyymmdd.jar</td>
|
|
|
|
</tr>
|
2009-11-19 16:22:21 -05:00
|
|
|
<tr>
|
|
|
|
<td>ooxml-schemas</td>
|
2010-08-12 09:59:45 -04:00
|
|
|
<td>xmlbeans, geronimo-stax-api_1.0_spec</td>
|
2010-06-14 11:20:19 -04:00
|
|
|
<td>ooxml-schemas-1.1.jar</td>
|
2009-11-19 16:22:21 -05:00
|
|
|
</tr>
|
|
|
|
</table>
|
|
|
|
<p>
|
2009-12-13 13:04:02 -05:00
|
|
|
(*) starting with 3.6-beta1-20091124.
|
|
|
|
</p>
|
|
|
|
<p>
|
2010-06-14 11:20:19 -04:00
|
|
|
poi-ooxml requires poi-ooxml-schemas. This is a substantially smaller
|
|
|
|
version of the ooxml-schemas jar (ooxml-schemas-1.1.jar for POI 3.7 or later, ooxml-scheams-1.0.jar for POI 3.5 and 3.6).
|
|
|
|
The larger ooxml-schemas jar is <link href="faq.html">normally</link>
|
|
|
|
only required for development
|
2003-04-23 20:53:41 -04:00
|
|
|
</p>
|
|
|
|
</section>
|
2009-11-19 16:22:21 -05:00
|
|
|
<section><title>Examples</title>
|
|
|
|
<p>
|
|
|
|
Small sample programs using the POI API are available in the
|
2005-01-25 15:08:18 -05:00
|
|
|
<em>src/examples</em> directory of the source distribution. Before
|
|
|
|
studying the source code you might want to have a look at the
|
2009-11-19 16:22:21 -05:00
|
|
|
"Examples" section of the <link href="apidocs/overview-summary.html">POI API
|
|
|
|
documentation</link>.
|
|
|
|
</p>
|
2009-12-13 13:04:02 -05:00
|
|
|
<p>
|
|
|
|
Also note that we now include all of the examples in the distribution.
|
|
|
|
</p>
|
2005-01-25 15:08:18 -05:00
|
|
|
</section>
|
|
|
|
<section><title>Contributed Software</title>
|
2009-11-19 16:22:21 -05:00
|
|
|
<p>
|
|
|
|
Besides the "official" components outlined above there is some further
|
|
|
|
software distributed with POI. This is called "contributed" software. It
|
2005-01-25 15:08:18 -05:00
|
|
|
is not explicitly recommended or even maintained by the POI team, but
|
2009-11-19 16:22:21 -05:00
|
|
|
it might still be useful to you.
|
|
|
|
</p>
|
|
|
|
<section><title>POI Browser</title>
|
|
|
|
<p>
|
|
|
|
The POI Browser is a very simple Swing GUI tool that displays the
|
2005-01-25 15:08:18 -05:00
|
|
|
internal structure of a Microsoft Office file and especially the
|
|
|
|
property set streams. Further information and instructions how to
|
|
|
|
execute it can be found in the <link
|
2009-11-19 16:22:21 -05:00
|
|
|
href="apidocs/org/apache/poi/contrib/poibrowser/package-summary.html#package_description">POI
|
|
|
|
Browser package description</link>.
|
|
|
|
</p>
|
2005-01-25 15:08:18 -05:00
|
|
|
</section>
|
|
|
|
</section>
|
2003-04-23 20:53:41 -04:00
|
|
|
</body>
|
|
|
|
<footer>
|
|
|
|
<legal>
|
2009-11-19 16:22:21 -05:00
|
|
|
Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
|
2003-04-23 20:53:41 -04:00
|
|
|
</legal>
|
|
|
|
</footer>
|
|
|
|
</document>
|