poi/src/documentation/xdocs/hssf/how-to.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd">

<document>
    <header>
        <title>The New Halloween Document</title>
        <authors>
            <person email="acoliver2@users.sourceforge.net" name="Andrew C. Oliver" id="AO"/>
            <person email="glens@apache.org" name="Glen Stampoultzis" id="GJS"/>
        </authors>
    </header>
  <body>
    <section title="How to use the HSSF prototype API">

        <section title="Capabilities">
            <p>This release of the how-to outlines functionality for the 1.5 release.
            Those looking for information on previous releases should
            look in the documentation distributed with that release.</p>
            <p>
              This release allows numeric and string cell values to be written to
              or read from an XLS file as well as reading and writing dates. Also
              in this release is row and column sizing, cell styling (bold,
              italics, borders,etc), and support for built-in data formats.  New
              to this release is an event-based API for reading XLS files.
              It differs greatly from the read/write API
              and is intended for intermediate developers who need a smaller
              memory footprint.  It will also serve as the basis for the HSSF
              Generator.</p>
        </section>
    <section title="General Use">
        <section title="User API">
                <section title="Writing a new one">

                    <p>The high level API (package: org.apache.poi.hssf.usermodel)
                    is what most people should use. Usage is very simple.
                    </p>
                    <p>Workbooks are created by creating an instance of
                    org.apache.poi.hssf.usermodel.HSSFWorkbook.
                    </p>
                    <p>Sheets are created by calling createSheet() from an existing
                    instance of HSSFWorkbook, the created sheet is automatically added in
                    sequence to the workbook. Sheets do not in themselves have a sheet
                    name (the tab at the bottom); you set
                    the name associated with a sheet by calling
                    HSSFWorkbook.setSheetName(sheetindex,&quot;SheetName&quot;).</p>
                    <p>Rows are created by calling createRow(rowNumber) from an existing
                    instance of HSSFSheet. Only rows that have cell values should be
                    added to the sheet. To set the row's height, you just call
                    setRowHeight(height) on the row object. The height must be given in
                    twips, or 1/20th of a point. If you prefer, there is also a
                    setRowHeightInPoints method.
                    </p>
                    <p>Cells are created by calling createCell(column, type) from an
                    existing HSSFRow. Only cells that have values should be added to the
                    row. Cells should have their cell type set to either
                    HSSFCell.CELL_TYPE_NUMERIC or HSSFCell.CELL_TYPE_STRING depending on
                    whether they contain a numeric or textual value. Cells must also have
                    a value set. Set the value by calling setCellValue with either a
                    String or double as a parameter. Individual cells do not have a
                    width; you must call setColumnWidth(colindex, width) (use units of
                    1/256th of a character) on the HSSFSheet object. (You can't do it on
                    an individual basis in the GUI either).</p>
                    <p>Cells are styled with HSSFCellStyle objects which in turn contain
                    a reference to an HSSFFont object. These are created via the
                    HSSFWorkbook object by calling createCellStyle() and createFont().
                    Once you create the object you must set its parameters (colors,
                    borders, etc). To set a font for an HSSFCellStyle call
                    setFont(fontobj).
                    </p>
                    <p>Once you have generated your workbook, you can write it out by
                    calling write(outputStream) from your instance of Workbook, passing
                    it an OutputStream (for instance, a FileOutputStream or
                    ServletOutputStream). You must close the OutputStream yourself. HSSF
                    does not close it for you.
                    </p>
                    <p>Here is some example code (excerpted and adapted from
                    org.apache.poi.hssf.dev.HSSF test class):</p>
<source><![CDATA[        short rownum;

        // create a new file
        FileOutputStream out = new FileOutputStream("workbook.xls");
        // create a new workbook
        HSSFWorkbook wb = new HSSFWorkbook();
        // create a new sheet
        HSSFSheet s = wb.createSheet();
        // declare a row object reference
        HSSFRow r = null;
        // declare a cell object reference
        HSSFCell c = null;
        // create 3 cell styles
        HSSFCellStyle cs = wb.createCellStyle();
        HSSFCellStyle cs2 = wb.createCellStyle();
        HSSFCellStyle cs3 = wb.createCellStyle();
        // create 2 fonts objects
        HSSFFont f = wb.createFont();
        HSSFFont f2 = wb.createFont();

        //set font 1 to 12 point type
        f.setFontHeightInPoints((short) 12);
        //make it red
        f.setColor((short) HSSFCellStyle.RED);
        // make it bold
        //arial is the default font
        f.setBoldweight(f.BOLDWEIGHT_BOLD);

        //set font 2 to 10 point type
        f2.setFontHeightInPoints((short) 10);
        //make it the color at palette index 0xf (white)
        f2.setColor((short) HSSFCellStyle.WHITE);
        //make it bold
        f2.setBoldweight(f2.BOLDWEIGHT_BOLD);

        //set cell stlye
        cs.setFont(f);
        //set the cell format see HSSFDataFromat for a full list
        cs.setDataFormat(HSSFDataFormat.getFormat("($#,##0_);[Red]($#,##0)"));

        //set a thin border
        cs2.setBorderBottom(cs2.BORDER_THIN);
        //fill w fg fill color
        cs2.setFillPattern((short) HSSFCellStyle.SOLID_FOREGROUND);
        // set foreground fill to red
        cs2.setFillForegroundColor((short) HSSFCellStyle.RED);

        // set the font
        cs2.setFont(f2);

        // set the sheet name to HSSF Test
        wb.setSheetName(0, "HSSF Test");
        // create a sheet with 300 rows (0-299)
        for (rownum = (short) 0; rownum < 300; rownum++)
        {
            // create a row
            r = s.createRow(rownum);
            // on every other row
            if ((rownum % 2) == 0)
            {
                // make the row height bigger  (in twips - 1/20 of a point)
                r.setHeight((short) 0x249);
            }

            //r.setRowNum(( short ) rownum);
            // create 50 cells (0-49) (the += 2 becomes apparent later
            for (short cellnum = (short) 0; cellnum < 50; cellnum += 2)
            {
                // create a numeric cell
                c = r.createCell(cellnum);
                // do some goofy math to demonstrate decimals
                c.setCellValue(rownum * 10000 + cellnum
                        + (((double) rownum / 1000)
                        + ((double) cellnum / 10000)));

                // on every other row
                if ((rownum % 2) == 0)
                {
                    // set this cell to the first cell style we defined
                    c.setCellStyle(cs);
                }

                // create a string cell (see why += 2 in the
                c = r.createCell((short) (cellnum + 1));

                // set the cell's string value to "TEST"
                c.setCellValue("TEST");
                // make this column a bit wider
                s.setColumnWidth((short) (cellnum + 1), (short) ((50 * 8) / ((double) 1 / 20)));

                // on every other row
                if ((rownum % 2) == 0)
                {
                    // set this to the white on red cell style
                    // we defined above
                    c.setCellStyle(cs2);
                }

            }
        }

        //draw a thick black border on the row at the bottom using BLANKS
        // advance 2 rows
        rownum++;
        rownum++;

        r = s.createRow(rownum);

        // define the third style to be the default
        // except with a thick black border at the bottom
        cs3.setBorderBottom(cs3.BORDER_THICK);

        //create 50 cells
        for (short cellnum = (short) 0; cellnum < 50; cellnum++)
        {
            //create a blank type cell (no value)
            c = r.createCell(cellnum);
            // set it to the thick black border style
            c.setCellStyle(cs3);
        }

        //end draw thick black border


        // demonstrate adding/naming and deleting a sheet
        // create a sheet, set its title then delete it
        s = wb.createSheet();
        wb.setSheetName(1, "DeletedSheet");
        wb.removeSheetAt(1);
        //end deleted sheet

        // write the workbook to the output stream
        // close our file (don't blow out our file handles
        wb.write(out);
        out.close();
        ]]></source>
        </section>
                <section title="Reading or modifying an existing file">

<p>Reading in a file is equally simple. To read in a file, create a
new instance of org.apache.poi.poifs.Filesystem, passing in an open InputStream, such as a FileInputStream
for your XLS, to the constructor. Construct a new instance of
org.apache.poi.hssf.usermodel.HSSFWorkbook passing the
Filesystem instance to the constructor. From there you have access to
all of the high level model objects through their assessor methods
(workbook.getSheet(sheetNum), sheet.getRow(rownum), etc).
</p>
<p>Modifying the file you have read in is simple. You retrieve the
object via an assessor method, remove it via a parent object's remove
method (sheet.removeRow(hssfrow)) and create objects just as you
would if creating a new xls. When you are done modifying cells just
call workbook.write(outputstream) just as you did above.</p>
<p>An example of this can be seen in
<link href="http://cvs.apache.org/viewcvs/~checkout~/jakarta-poi/src/java/org/apache/poi/hssf/dev/HSSF.java?rev=1.1">org.apache.poi.hssf.dev.HSSF</link>.</p>
         </section>
        </section>
                <section title="Event API">

                    <p>The event API is brand new.  It is intended for intermediate
                    developers who are willing to learn a little bit of the low level API
                    structures.  Its relatively simple to use, but requires a basic
                    understanding of the parts of an Excel file (or willingness to
                    learn).  The advantage provided is that you can read an XLS with a
                    relatively small memory footprint.
                    </p>
                    <p>To use this API you construct an instance of
                    org.apache.poi.hssf.eventmodel.HSSFRequest. Register a class you
                    create that supports the
                    org.apache.poi.hssf.eventmodel.HSSFListener interface using the
                    HSSFRequest.addListener(yourlistener, recordsid).  The recordsid
                    should be a static reference number (such as BOFRecord.sid) contained
                    in the classes in org.apache.poi.hssf.record.  The trick is you
                    have to know what these records are.  Alternatively you can call
                    HSSFRequest.addListenerForAllRecords(mylistener).  In order to learn
                    about these records you can either read all of the javadoc in the
                    org.apache.poi.hssf.record package or you can just hack up a
                    copy of org.apache.poi.hssf.dev.EFHSSF and adapt it to your
                    needs.  TODO: better documentation on records.</p>
                    <p>Once you've registered your listeners in the HSSFRequest object
                    you can construct an instance of
                    org.apache.poi.poifs.filesystem.FileSystem (see POIFS howto) and
                    pass it your XLS file inputstream.  You can either pass this, along
                    with the request you constructed, to an instance of HSSFEventFactory
                    via the HSSFEventFactory.processWorkbookEvents(request, Filesystem)
                    method, or you can get an instance of DocumentInputStream from
                    Filesystem.createDocumentInputStream(&quot;Workbook&quot;) and pass
                    it to HSSFEventFactory.processEvents(request, inputStream).  Once you
                    make this call, the listeners that you constructed receive calls to
                    their processRecord(Record) methods with each Record they are
                    registered to listen for until the file has been completely read.
                    </p>
                    <p>A code excerpt from org.apache.poi.hssf.dev.EFHSSF (which is
                    in CVS or the source distribution) is reprinted below with excessive
                    comments:</p>
<source><![CDATA[
/**
 * This example shows how to use the event API for reading a file.
 */
public class EventExample
        implements HSSFListener
{
    private SSTRecord sstrec;

    /**
     * This method listens for incoming records and handles them as required.
     * @param record    The record that was found while reading.
     */
    public void processRecord(Record record)
    {
        switch (record.getSid())
        {
            // the BOFRecord can represent either the beginning of a sheet or the workbook
            case BOFRecord.sid:
                BOFRecord bof = (BOFRecord) record;
                if (bof.getType() == bof.TYPE_WORKBOOK)
                {
                    System.out.println("Encountered workbook");
                    // assigned to the class level member
                } else if (bof.getType() == bof.TYPE_WORKSHEET)
                {
                    System.out.println("Encountered sheet reference");
                }
                break;
            case BoundSheetRecord.sid:
                BoundSheetRecord bsr = (BoundSheetRecord) record;
                System.out.println("New sheet named: " + bsr.getSheetname());
                break;
            case RowRecord.sid:
                RowRecord rowrec = (RowRecord) record;
                System.out.println("Row found, first column at "
                        + rowrec.getFirstCol() + " last column at " + rowrec.getLastCol());
                break;
            case NumberRecord.sid:
                NumberRecord numrec = (NumberRecord) record;
                System.out.println("Cell found with value " + numrec.getValue()
                        + " at row " + numrec.getRow() + " and column " + numrec.getColumn());
                break;
                // SSTRecords store a array of unique strings used in Excel.
            case SSTRecord.sid:
                sstrec = (SSTRecord) record;
                for (int k = 0; k < sstrec.getNumUniqueStrings(); k++)
                {
                    System.out.println("String table value " + k + " = " + sstrec.getString(k));
                }
                break;
            case LabelSSTRecord.sid:
                LabelSSTRecord lrec = (LabelSSTRecord) record;
                System.out.println("String cell found with value "
                        + sstrec.getString(lrec.getSSTIndex()));
                break;
        }
    }

    /**
     * Read an excel file and spit out what we find.
     *
     * @param args      Expect one argument that is the file to read.
     * @throws IOException  When there is an error processing the file.
     */
    public static void main(String[] args) throws IOException
    {
        // create a new file input stream with the input file specified
        // at the command line
        FileInputStream fin = new FileInputStream(args[0]);
        // create a new org.apache.poi.poifs.filesystem.Filesystem
        POIFSFileSystem poifs = new POIFSFileSystem(fin);
        // get the Workbook (excel part) stream in a InputStream
        InputStream din = poifs.createDocumentInputStream("Workbook");
        // construct out HSSFRequest object
        HSSFRequest req = new HSSFRequest();
        // lazy listen for ALL records with the listener shown above
        req.addListenerForAllRecords(new EventExample());
        // create our event factory
        HSSFEventFactory factory = new HSSFEventFactory();
        // process our events based on the document input stream
        factory.processEvents(req, din);
        // once all the events are processed close our file input stream
        fin.close();
        // and our document input stream (don't want to leak these!)
        din.close();
        System.out.println("done.");
    }
}
]]></source>
     </section>
                <section title="Low Level APIs">

<p>The low level API is not much to look at. It consists of lots of
&quot;Records&quot; in the org.apache.poi.hssf.record.* package,
and set of helper classes in org.apache.poi.hssf.model.*. The
record classes are consistent with the low level binary structures
inside a BIFF8 file (which is embedded in a POIFS file system). You
probably need the book: &quot;Microsoft Excel 97 Developer's Kit&quot;
from Microsoft Press in order to understand how these fit together
(out of print but easily obtainable from Amazon's used books). In
order to gain a good understanding of how to use the low level APIs
should view the source in org.apache.poi.hssf.usermodel.* and
the classes in org.apache.poi.hssf.model.*. You should read the
documentation for the POIFS libraries as well.</p>
     </section>
                <section title="HSSF Class/Test Application">

<p>The HSSF application is nothing more than a test for the high
level API (and indirectly the low level support). The main body of
its code is repeated above. To run it:
</p>
<ul>
	<li>download the poi-alpha build and untar it (tar xvzf
	tarball.tar.gz)
	</li>
	<li>set up your classpath as follows:
	<code>export HSSFDIR={wherever you put HSSF's jar files}
export LOG4JDIR={wherever you put LOG4J's jar files}
export CLASSPATH=$CLASSPATH:$HSSFDIR/hssf.jar:$HSSFDIR/poi-poifs.jar:$HSSFDIR/poi-util.jar:$LOG4JDIR/jog4j.jar</code>
	</li><li>type:
	<code>java org.apache.poi.hssf.dev.HSSF ~/myxls.xls write</code></li>
</ul>
<p></p>
<p>This should generate a test sheet in your home directory called <code>&quot;myxls.xls&quot;</code>.  </p>
<ul>
	<li>Type:
	<code>java org.apache.poi.hssf.dev.HSSF ~/input.xls output.xls</code>
    <br/>
    <br/>
This is the read/write/modify test.  It reads in the spreadsheet, modifies a cell, and writes it back out.
Failing this test is not necessarily a bad thing.  If HSSF tries to modify a non-existant sheet then this will
most likely fail.  No big deal.  </li>
</ul>
     </section>
                <section title="HSSF Logging Facility">
<p>
    POI has a small amount of logging code embedded within it.  Defining the system property
    poi.logging will enable logging to standard out.
</p>
     </section>
                <section title="HSSF Developer's Tools">

<p>HSSF has a number of tools useful for developers to debug/develop
stuff using HSSF (and more generally XLS files). We've already
discussed the app for testing HSSF read/write/modify capabilities;
now we'll talk a bit about BiffViewer. Early on in the development of
HSSF, it was decided that knowing what was in a record, what was
wrong with it, etc. was virtually impossible with the available
tools. So we developed BiffViewer. You can find it at
org.apache.poi.hssf.dev.BiffViewer. It performs two basic
functions and a derivative.
</p>
<p>The first is &quot;biffview&quot;. To do this you run it (assumes
you have everything setup in your classpath and that you know what
you're doing enough to be thinking about this) with an xls file as a
parameter. It will give you a listing of all understood records with
their data and a list of not-yet-understood records with no data
(because it doesn't know how to interpret them). This listing is
useful for several things. First, you can look at the values and SEE
what is wrong in quasi-English. Second, you can send the output to a
file and compare it.
</p>
<p>The second function is &quot;big freakin dump&quot;, just pass a
file and a second argument matching &quot;bfd&quot; exactly. This
will just make a big hexdump of the file.
</p>
<p>Lastly, there is &quot;mixed&quot; mode which does the same as
regular biffview, only it includes hex dumps of certain records
intertwined. To use that just pass a file with a second argument
matching &quot;on&quot; exactly.</p>
<p>In the next release cycle we'll also have something called a
FormulaViewer. The class is already there, but its not very useful
yet. When it does something, we'll document it.</p>

     </section>
                <section title="What's Next?">

<p>This release contains code that supports &quot;internationalization&quot;
or more accurately non-US/UK languages; however, it has not been
tested with the new API changes (please help us with this). We've
shifted focus a bit for this release in recognition of the
international support we've gotten. We're going to focus on western
European languages for our first beta. We're more than happy to
accept help in supporting non-Western European languages if someone
who knows what they're doing in this area is willing to pitch in!
(There is next to no documentation on what is necessary to support
such a move and its really hard to support a language when you don't even
know the alphabet).</p>
<p>This release of HSSF does not yet support Formulas. I've been
focusing on the requests I've gotten in. That being said, if we get
more user feedback on what is most useful first we'll aim for that.
As a general principal, HSSF's goal is to support HSSF-Serializer
(meaning an emphasis on write). We would like to hear from you! How
are you using HSSF/POIFS? How would you like to use it? What features
are most important first?
</p>
     </section>

</section>

</section>
</body>
</document>