Busy Developers' Guide to HSSF and XSSF Features
Want to use HSSF and XSSF to read and write spreadsheets in a hurry? This
guide is for you. If you're after more in-depth coverage of the HSSF and
XSSF user-APIs, please consult the HOWTO
guide, as it contains actual descriptions of how to use this stuff.
Index of Features
How to create a new workbook
How to create a sheet
How to create cells
How to create date cells
Working with different types of cells
Iterate over rows and cells
Text Extraction
Aligning cells
Working with borders
Fills and color
Merging cells
Working with fonts
Custom colors
Reading and writing
Use newlines in cells
Create user defined data formats
Fit Sheet to One Page
Set print area for a sheet
Set page numbers on the footer of a sheet
Shift rows
Set a sheet as selected
Set the zoom magnification for a sheet
Create split and freeze panes
Repeating rows and columns
Headers and Footers
Drawing Shapes
Styling Shapes
Shapes and Graphics2d
Outlining
Images
Named Ranges and Named Cells
How to set cell comments
How to adjust column width to fit the contents
Hyperlinks
Features
New Workbook
New Sheet
Creating Cells
Creating Date Cells
Working with different types of cells
Demonstrates various alignment options
Working with borders
Iterate over rows and cells
Sometimes, you'd like to just iterate over all the rows in
a sheet, or all the cells in a row. This is possible with
a simple for loop.
HSSFRow defines a
CellIterator inner class to handle iterating over
the cells (get one with a call to row.cellIterator()),
and HSSFSheet provides a rowIterator() method to
give an iterator over all the rows.
(Unfortunately, Java 5 foreach loops rely on
java.lang.Iterable, which does not exist in Java 1.4, so
it isn't possible to use them on a codebase
that supports Java 1.4, as POI does.)
Iterate over rows and cells using Java 1.5 foreach loops - OOXML Branch Only
Sometimes, you'd like to just iterate over all the rows in
a sheet, or all the cells in a row. If you are using Java
5 or later, then this is especially handy, as it allows the
new foreach loop support to work.
Both HSSFSheet and HSSFRow
implement java.lang.Iterable to allow foreach
loops. For HSSFRow this gives access to the
CellIterator inner class to handle iterating over
the cells, and for HSSFSheet it gives the
rowIterator() to iterate over all the rows.
This only works on the OOXML branch of POI.
Text Extraction
For most text extraction requirements, the standard
ExcelExtractor class should provide all you need.
For very fancy text extraction, XLS to CSV etc,
take a look at
/src/scratchpad/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java
Fills and colors
Merging cells
Working with fonts
Note that the maximum number of unique fonts in a workbook is limited to 32767
(the maximum positive short). You should re-use fonts in your applications instead of
creating a font for each cell.
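One way to make reuse automatic is a small cache keyed by the font's attributes, so that asking twice for the same look returns the same object. The sketch below is illustrative only: FontCache, its string key format and the factory argument are hypothetical names, not part of POI. In real code the factory would wrap whatever font-creation call your workbook offers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper: hands out one shared font object per distinct description. */
final class FontCache<F> {
    private final Map<String, F> fontsByKey = new HashMap<>();
    private int created;

    /** Returns the cached font for key, calling factory only the first time the key is seen. */
    F get(String key, Supplier<F> factory) {
        F font = fontsByKey.get(key);
        if (font == null) {
            font = factory.get();   // in real code: create and configure the workbook font here
            fontsByKey.put(key, font);
            created++;
        }
        return font;
    }

    /** Number of fonts actually created, which is what counts against the workbook limit. */
    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        FontCache<Object> cache = new FontCache<>();   // Object stands in for a font type
        for (int cell = 0; cell < 10000; cell++) {
            cache.get("Arial/10/bold", Object::new);
        }
        System.out.println(cache.createdCount());      // prints 1
    }
}
```

However many cells ask for the same font, only one is created, so the count stays far below the 32767 limit.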
Examples:
Wrong:
Correct:
Custom colors
Reading and Rewriting Workbooks
Using newlines in cells
Data Formats
Fit Sheet to One Page
Set Print Area
Set Page Numbers on Footer
Using the Convenience Functions
The convenience functions live in contrib and provide
utility features such as setting borders around merged
regions and changing style attributes without explicitly
creating new styles.
Shift rows up or down on a sheet
Set a sheet as selected
Set the zoom magnification
The zoom is expressed as a fraction. For example, to
express a zoom of 75%, use 3 for the numerator and
4 for the denominator.
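If your zoom level starts out as a percentage, it can be reduced to a numerator and denominator like this. This is a minimal sketch in plain Java; ZoomFraction and fromPercent are hypothetical names used for illustration, not POI classes.

```java
/** Sketch: reduce a whole-number zoom percentage to a numerator/denominator pair. */
final class ZoomFraction {
    private ZoomFraction() {}

    /** Greatest common divisor via Euclid's algorithm. */
    static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    /** Returns {numerator, denominator} for a zoom given in whole percent. */
    static int[] fromPercent(int percent) {
        if (percent <= 0) {
            throw new IllegalArgumentException("percent must be positive");
        }
        int g = gcd(percent, 100);
        return new int[] { percent / g, 100 / g };
    }

    public static void main(String[] args) {
        int[] f = fromPercent(75);
        System.out.println(f[0] + "/" + f[1]);   // prints 3/4
    }
}
```

The two values returned are the ones you would pass as the numerator and denominator of the zoom.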
Splits and freeze panes
There are two types of panes you can create; freeze panes and split panes.
A freeze pane is split by columns and rows. You create
a freeze pane using the following mechanism:
sheet1.createFreezePane( 3, 2, 3, 2 );
The first two parameters are the columns and rows you
wish to split by. The second two parameters indicate
the cells that are visible in the bottom right quadrant.
Split panes appear differently. The split area is
divided into four separate work areas. The split
occurs at the pixel level and the user is able to
adjust the split by dragging it to a new position.
The first parameter is the x position of the split.
This is in 1/20th of a point. A point in this case
seems to equate to a pixel. The second parameter is
the y position of the split, again in 1/20th of a point.
The last parameter indicates which pane currently has
the focus. This will be one of HSSFSheet.PANE_LOWER_LEFT,
PANE_LOWER_RIGHT, PANE_UPPER_RIGHT or PANE_UPPER_LEFT.
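Because the split positions are given in 1/20th of a point, a tiny conversion helper can make call sites easier to read. This is a sketch in plain Java; SplitPaneUnits and its methods are hypothetical names, not part of POI.

```java
/** Sketch: convert between points and the 1/20-point units used for split positions. */
final class SplitPaneUnits {
    static final int UNITS_PER_POINT = 20;

    private SplitPaneUnits() {}

    /** Converts a position in points to 1/20-point units, rounding to the nearest unit. */
    static int fromPoints(double points) {
        if (points < 0) {
            throw new IllegalArgumentException("points must not be negative");
        }
        return (int) Math.round(points * UNITS_PER_POINT);
    }

    /** Converts a position in 1/20-point units back to points. */
    static double toPoints(int units) {
        return units / (double) UNITS_PER_POINT;
    }

    public static void main(String[] args) {
        System.out.println(fromPoints(100));   // prints 2000
        System.out.println(toPoints(2000));    // prints 100.0
    }
}
```

A split 100 points from the left edge would therefore be passed as 2000 for the x position.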
Repeating rows and columns
It's possible to set up repeating rows and columns in
your printouts by using the setRepeatingRowsAndColumns()
function in the HSSFWorkbook class.
This function takes five parameters.
The first parameter is the index of the sheet (0 = first sheet).
The second and third parameters specify the range of columns to repeat.
To stop the columns from repeating, pass in -1 as the start and end column.
The fourth and fifth parameters specify the range of rows to repeat.
To stop the rows from repeating, pass in -1 as the start and end row.
Headers and Footers
Example is for headers but applies directly to footers.
Drawing Shapes
POI supports drawing shapes using the Microsoft Office
drawing tools. Shapes on a sheet are organized in a
hierarchy of groups and shapes. The top-most shape
is the patriarch. This is not visible on the sheet
at all. To start drawing you need to call createPatriarch
on the HSSFSheet class. This has the
effect of erasing any other shape information stored
in that sheet. By default POI will leave shape
records alone in the sheet unless you make a call to
this method.
To create a shape you have to go through the following
steps:
Create the patriarch.
Create an anchor to position the shape on the sheet.
Ask the patriarch to create the shape.
Set the shape type (line, oval, rectangle, etc.).
Set any other style details concerning the shape (e.g.
line thickness).
Text boxes are created using a different call:
It's possible to use different fonts to style parts of
the text in the textbox. Here's how:
Just as can be done manually using Excel, it is possible
to group shapes together. This is done by calling
createGroup() and then creating the shapes
using those groups.
It's also possible to create groups within groups.
Any group you create should contain at least two
other shapes or subgroups.
Here's how to create a shape group:
If you're being observant you'll notice that the shapes
that are added to the group use a new type of anchor:
the HSSFChildAnchor. What happens is that
the created group has its own coordinate space for
shapes that are placed into it. POI defaults this to
(0,0,1023,255) but you are able to change it as desired.
Here's how:
If you create a group within a group it's also going
to have its own coordinate space.
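To see what a group-local coordinate space means in practice, the sketch below maps fractions of a group's width and height onto that group's coordinates, which are the numbers a child anchor would be given. It is plain Java for illustration only: GroupCoordinates and its methods are hypothetical names, and only the default box (0,0,1023,255) comes from the text above.

```java
/** Sketch: translate fractions of a group's extent into that group's own coordinates. */
final class GroupCoordinates {
    private final int x1, y1, x2, y2;

    GroupCoordinates(int x1, int y1, int x2, int y2) {
        this.x1 = x1;
        this.y1 = y1;
        this.x2 = x2;
        this.y2 = y2;
    }

    /** The default coordinate space described in the guide. */
    static GroupCoordinates poiDefault() {
        return new GroupCoordinates(0, 0, 1023, 255);
    }

    /** X coordinate lying the given fraction (0.0 to 1.0) of the way across the group. */
    int x(double fraction) {
        return x1 + (int) Math.round(fraction * (x2 - x1));
    }

    /** Y coordinate lying the given fraction (0.0 to 1.0) of the way down the group. */
    int y(double fraction) {
        return y1 + (int) Math.round(fraction * (y2 - y1));
    }

    public static void main(String[] args) {
        // A child covering the middle half of a group with a 500 x 500 coordinate space.
        GroupCoordinates group = new GroupCoordinates(0, 0, 500, 500);
        System.out.println(group.x(0.25) + "," + group.y(0.25)
                + " to " + group.x(0.75) + "," + group.y(0.75));   // prints 125,125 to 375,375
    }
}
```

The same fractions yield different numbers under the default space, which is why child coordinates only make sense relative to the group that owns them.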
Styling Shapes
By default shapes can look a little plain. It's possible
to apply different styles to the shapes however. The
sorts of things that can currently be done are:
Change the fill color.
Make a shape with no fill color.
Change the thickness of the lines.
Change the style of the lines. Eg: dashed, dotted.
Change the line color.
Here's an example of how this is done:
Shapes and Graphics2d
While the native POI shape drawing commands are the
recommended way to draw shapes on a sheet, it's sometimes
desirable to use a standard API for compatibility with
external libraries. With this in mind we created some
wrappers for Graphics and Graphics2d.
It's important to note, however, before continuing that
Graphics2d is a poor match to the capabilities
of the Microsoft Office drawing commands. The older
Graphics class offers a closer match but is
still a square peg in a round hole.
All Graphics commands are issued into an HSSFShapeGroup.
Here's how it's done:
The first thing we do is create the group and set its coordinates
to match what we plan to draw. Next we calculate a reasonable
fontSizeMultipler then create the EscherGraphics object.
Since what we really want is a Graphics2d
object we create an EscherGraphics2d object and pass in
the graphics object we created. Finally we call a routine
that draws into the EscherGraphics2d object.
The vertical points per pixel deserves some more explanation.
One of the difficulties in converting Graphics calls
into escher drawing calls is that Excel does not have
the concept of absolute pixel positions. It measures
its cell widths in 'characters' and the cell heights in points.
Unfortunately it's not defined exactly what type of character it's
measuring. Presumably this is due to the fact that Excel will be
using different fonts on different platforms or even within the same
platform.
Because of this constraint we've had to implement the concept of a
verticalPointsPerPixel. This is the amount the font should be scaled by when
you issue commands such as drawString(). To calculate this value
use the following formula:
The height of the group is calculated fairly simply by taking the
difference between the y coordinates of the bounding box of the shape. The
height of the group in points can be calculated by using a convenience method called
HSSFClientAnchor.getAnchorHeightInPoints().
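The calculation described above can be sketched in plain Java. The formula itself is not shown in this text, so the sketch assumes, going by the surrounding description, that the value is the anchor's height in points divided by the group's height in its own coordinates. Treat that reading, and the class and parameter names, as assumptions.

```java
/** Sketch: scale factor relating a group's coordinate height to its height in points. */
final class VerticalPointsPerPixel {
    private VerticalPointsPerPixel() {}

    /**
     * @param anchorHeightInPoints height of the group's anchor on the sheet, in points
     * @param groupY1 top y coordinate of the group's coordinate space
     * @param groupY2 bottom y coordinate of the group's coordinate space
     */
    static float compute(float anchorHeightInPoints, int groupY1, int groupY2) {
        int heightOfGroup = Math.abs(groupY2 - groupY1);
        if (heightOfGroup == 0) {
            throw new IllegalArgumentException("group must have a non-zero height");
        }
        return anchorHeightInPoints / heightOfGroup;
    }

    public static void main(String[] args) {
        // A group 276 units tall drawn into an anchor that is 138 points tall.
        System.out.println(compute(138f, 0, 276));   // prints 0.5
    }
}
```

A result of 0.5 means each unit of group height covers half a point on the sheet, so text drawn into the group is scaled by that factor.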
Many of the functions supported by the graphics classes
are not complete. Here are some of the functions that are known
to work:
fillRect()
fillOval()
drawString()
drawOval()
drawLine()
clearRect()
Functions that are not supported will return and log a message
using the POI logging infrastructure (disabled by default).
Outlining
Outlines are great for grouping sections of information
together and can be added easily to columns and rows
using the POI API. Here's how:
To collapse (or expand) an outline use the following calls:
The row/column you choose should contain an already
created group. It can be anywhere within the group.
Images
Images are part of the drawing support. To add an image just
call createPicture() on the drawing patriarch.
At the time of writing the following types are supported:
PNG
JPG
DIB
It should be noted that any existing drawings may be erased
once you add an image to a sheet.
Creating an image and setting its anchor to the actual width and height:
or
HSSFPicture.resize() works only for JPEG and PNG. Other formats are not yet supported.
Reading images from a workbook:
Named Ranges and Named Cells
A Named Range is a way to refer to a group of cells by a name. A Named Cell is a
degenerate case of a Named Range in that the 'group of cells' contains exactly one
cell. You can create as well as refer to cells in a workbook by their named range.
When working with Named Ranges, the classes org.apache.poi.hssf.util.CellReference and
org.apache.poi.hssf.util.AreaReference are used.
Creating Named Range / Named Cell
Reading from Named Range / Named Cell
Reading from non-contiguous Named Ranges
Cell Comments
In Excel a comment is a kind of a text shape,
so inserting a comment is very similar to placing a text box in a worksheet:
Reading cell comments
Adjust column width to fit the contents
To calculate column width, HSSFSheet.autoSizeColumn uses Java2D classes
that throw an exception if a graphical environment is not available. If a graphical environment
is not available, you must tell Java that you are running in headless mode by
setting the following system property: java.awt.headless=true
(either via the -Djava.awt.headless=true startup parameter or via System.setProperty("java.awt.headless", "true")).
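When setting the property from code, the order matters. The sketch below sets it at the very start of the program and then asks the JVM whether it considers itself headless. HeadlessSetup is a hypothetical class name. The sketch assumes the setting is read when AWT is first used, so it should be in place before any column is auto-sized.

```java
import java.awt.GraphicsEnvironment;

/** Sketch: switch the JVM to headless mode before any Java2D work is done. */
final class HeadlessSetup {
    private HeadlessSetup() {}

    /** Sets the system property named in the guide; call this before touching AWT or Java2D. */
    static void enableHeadless() {
        System.setProperty("java.awt.headless", "true");
    }

    public static void main(String[] args) {
        enableHeadless();
        // Reports whether the JVM now considers itself headless.
        System.out.println(GraphicsEnvironment.isHeadless());
        // ... open the workbook and auto-size columns after this point ...
    }
}
```

Passing -Djava.awt.headless=true on the command line avoids the ordering concern altogether, since the property is then set before any application code runs.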
How to read hyperlinks
How to create hyperlinks