<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=iso-8859-1">
<TITLE></TITLE>
<META NAME="GENERATOR" CONTENT="StarOffice/5.2 (Linux)">
<META NAME="AUTHOR" CONTENT=" ">
<META NAME="CREATED" CONTENT="20010728;10223600">
<META NAME="CHANGEDBY" CONTENT="Marc Johnson">
<META NAME="CHANGED" CONTENT="20010810;13415800">
<STYLE>
<!--
@page { margin-left: 1.25in; margin-right: 1.25in; margin-top: 1in; margin-bottom: 1in }
H1 { margin-bottom: 0.08in; font-size: 16pt }
TD P { margin-bottom: 0.08in }
H2 { margin-bottom: 0.08in; font-size: 14pt; font-style: italic }
H3 { margin-bottom: 0.08in }
H4 { margin-bottom: 0.08in; font-size: 11pt; font-style: italic }
P { margin-bottom: 0.08in }
-->
</STYLE>
</HEAD>
<BODY>
<H1>POI Filesystem format</H1>
<H2>Introduction</H2>
<P STYLE="margin-bottom: 0in; font-weight: medium">
The POI file format is essentially an archive wrapper
around files. It is intended to mimic a filesystem. For
the remainder of this document it is referred to as a
filesystem in order to avoid confusion with the
"files" it contains.
</P>
<P STYLE="margin-bottom: 0in; font-weight: medium; text-decoration: none">
POI filesystems are compatible with those document formats
used by a well-known software company's popular office
productivity suite and programs outputting compatible
data. Because the POI filesystem does not provide
compression, encryption or any other worthwhile feature,
it's not a good choice unless you require interoperability
with these programs.
</P>
<P STYLE="margin-bottom: 0in; font-weight: medium">
The POI filesystem does not encode the documents
themselves. For example, if you had a word processor file
with the extension ".doc", you would actually
have a POI filesystem with a document file archived inside
the filesystem.
</P>
<H2>Document Conventions</H2>
<P STYLE="margin-bottom: 0in">
This document uses the numeric types as described by
the Java Language Specification, which can be found at
java.sun.com. In short:
</P>
<UL>
<LI>
<P STYLE="margin-bottom: 0in">
a byte is an 8 bit signed integer ranging from
(-128) to 127.
</P>
</LI>
<LI>
<P STYLE="margin-bottom: 0in">
a short is a 16 bit signed integer ranging from
(-32768) to 32767.
</P>
</LI>
<LI>
<P STYLE="margin-bottom: 0in">
an int is a 32 bit signed integer ranging from
(-2147483648) to 2147483647 (roughly -2.15e+9 to 2.15e+9).
</P>
</LI>
<LI>
<P STYLE="margin-bottom: 0in">
a long is a 64 bit signed integer ranging from
(-9223372036854775808) to 9223372036854775807 (roughly
-9.22e+18 to 9.22e+18).
</P>
</LI>
</UL>
<P STYLE="margin-bottom: 0in">
The Java Language Specification spells out a number of
other types that are not referred to by this document.
</P>
<P STYLE="margin-bottom: 0in">
Where this document refers to "endian
conversion" it is referring to the byte order of
stored numbers. Numbers in "little-endian order"
are stored with the LEAST significant byte first. To
read a short, for example, you read two bytes, shift the
second byte 8 bits to the left, and then <CODE>or</CODE>
it with the first byte after masking off the sign
extension of the first byte. The following code
illustrates this method:
</P>
<P STYLE="text-decoration: none">
<FONT FACE="Courier, monospace"><FONT SIZE=2><B>public int getShort(byte[] rec) {</B></FONT></FONT>
</P>
<P>
<FONT FACE="Courier, monospace"><FONT SIZE=2><B>return ((rec[1] << 8) | (rec[0] & 0xff));</B></FONT></FONT>
</P>
<P>
<FONT FACE="Courier, monospace"><FONT SIZE=2><B>}</B></FONT></FONT>
</P>
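<P STYLE="margin-bottom: 0in">
The same technique extends to the other numeric types. The
following is a minimal sketch, not part of any POI API: the
class name <CODE>LittleEndianSketch</CODE> and its method
signatures are hypothetical, chosen only to show how a short,
an int, and a long can be assembled from bytes stored least
significant byte first.
</P>
<PRE>
```java
// Hypothetical helper class; names and signatures are illustrative only.
final class LittleEndianSketch {
    private LittleEndianSketch() {
    }

    // Reads a signed 16-bit value whose least significant byte is data[offset].
    static short getShort(byte[] data, int offset) {
        return (short) ((data[offset + 1] << 8) | (data[offset] & 0xff));
    }

    // Reads a signed 32-bit value, least significant byte first.
    static int getInt(byte[] data, int offset) {
        int value = 0;
        // Start with the most significant byte, which is stored last.
        for (int i = 3; i >= 0; i--) {
            value = (value << 8) | (data[offset + i] & 0xff);
        }
        return value;
    }

    // Reads a signed 64-bit value, least significant byte first.
    static long getLong(byte[] data, int offset) {
        long value = 0;
        for (int i = 7; i >= 0; i--) {
            value = (value << 8) | (data[offset + i] & 0xffL);
        }
        return value;
    }
}
```
</PRE>
<P STYLE="margin-bottom: 0in">
For example, passing the eight bytes D0 CF 11 E0 A1 B1 1A E1
to <CODE>getLong</CODE> yields 0xE11AB1A1E011CFD0L.
</P>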
<H2>Filesystem Introduction</H2>
<P STYLE="margin-bottom: 0in">
POI filesystems are essentially normal files stored on a
Java-compatible platform's native filesystem. They are
identified by names ending in a four-character suffix (a
period followed by a three-letter extension) noting what
type of data they contain. For example, a file
ending in ".xls" would likely contain
spreadsheet data, and a file ending in ".doc"
would probably contain a word processing document. POI
filesystems are called "filesystems" because
they contain multiple embedded files in a manner similar
to traditional filesystems. Along functional lines, it
would be more accurate to call them POI archives.
</P>
<P STYLE="margin-bottom: 0in">
POI filesystems do not provide encryption, compression, or
any other feature of a modern archive and are therefore a
poor choice for implementing new file formats. They are
most useful for interoperability with legacy applications
that use a compatible file format.
</P>
<H2>Filesystem Walkthrough</H2>
<P STYLE="margin-bottom: 0in">
This is a walkthrough of a POI filesystem and how it is
put together. It is not intended to be a precise
specification but to give a "big picture" of the
general structure and how it is interpreted.
</P>
<P STYLE="margin-bottom: 0in">
A POI filesystem begins with a <A
HREF="#HeaderBlock"><B><I>header</I></B></A>. This header
identifies locations in the file by function and provides
a sanity check identifying a native filesystem file as
indeed a POI filesystem.
</P>
<P STYLE="margin-bottom: 0in">
The first 64 bits of the header compose a <B><I>magic
number identifier.</I></B> This identifier is the
"sanity check": it tells the client software that this is
indeed a POI filesystem, and not some other format, and
that it should be treated as such. The header also
contains an <B><I>array
of block numbers</I></B>. These block numbers refer to
blocks in the file. When these blocks are read together
they form the <A HREF="#BAT"><B><I>Block Allocation
Table</I></B></A>. The header also contains a pointer to
the first element in the <A
HREF="#PropertyTable"><B><I>property table</I></B></A>,
also known as the <A HREF="#RootEntry"><B><I>root
element</I></B></A>, and a pointer to the <B>small Block
Allocation Table (SBAT)</B>.
</P>
<P STYLE="margin-bottom: 0in">
The <A HREF="#BAT"><B><I>block allocation
table</I></B></A>, or <B><I>BAT</I></B>, and the <A
HREF="#PropertyTable"><B><I>property table</I></B></A>
together specify which blocks in the filesystem belong to
which files. The Block Allocation Table is somewhat hard
to conceptualize at first. It is essentially an array of
integers that point at each other, and these elements
form chains.
</P>
<P STYLE="margin-bottom: 0in">
To follow a file through the <A HREF="#BAT"><B><I>block allocation
table</I></B></A> you must first read the <B><I>start
block</I></B> of the file from the <A
HREF="#PropertyTable"><B><I>property
table</I></B></A>. This value is both the index of the first
block in your file and the index of the element in the
<B><I>BAT</I></B> array that names the next block. For instance: if
the <B><I>start block</I></B> from your file's property is
0 then you read block 0 (the first block after the header)
from your filesystem as the first block of your file. You
also read element 0 from the <B><I>BAT array</I></B>.
Supposing this element has a value equal to 2, you'd read
block 2 from your filesystem as the next block of your
file and element 2 from your <B><I>BAT array</I></B>.
This is covered further later in this document.
</P>
<P STYLE="margin-bottom: 0in">
The <A HREF="#PropertyTable"><B><I>Property
Table</I></B></A> is essentially the directory structure
for the filesystem. Each entry consists of the name of the
file or directory, its <B><I>start block</I></B> in both the
filesystem and <B><I>BAT</I></B>, and its actual size.
The first property in the <A
HREF="#PropertyTable">property table</A> is the <A
HREF="#RootEntry"><B><I>root element</I></B></A>. Its real
purpose is to hold the start block for the <B><I>small
blocks.</I></B>
</P>
<H3>Filesystem Structure</H3>
<P STYLE="margin-bottom: 0in; font-weight: medium">
All values in the POI filesystem are stored in
"little-endian" order, meaning the least significant byte
comes first and you must reverse the order of the bytes
relative to the way values are written in this document.
Assume the values you see below are stored backwards in
the file.
</P>
<P STYLE="margin-bottom: 0in; font-weight: medium">
The POI filesystem is divided into 512 byte blocks. Each
block has an implicit block-type. The order and
description of these block types are given below.
</P>
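<P STYLE="margin-bottom: 0in">
Because block 0 is the first block after the header (as noted
in the walkthrough), a block index translates to a byte
position in the file by skipping the 512 byte header. A minimal
sketch of that arithmetic follows; the class and method names
are hypothetical and exist only for illustration.
</P>
<PRE>
```java
// Hypothetical helper; illustrates the block-index arithmetic only.
final class BlockOffsetSketch {
    static final int BIG_BLOCK_SIZE = 512;

    private BlockOffsetSketch() {
    }

    // Returns the byte offset in the file at which the given block starts.
    // The header occupies the first 512 bytes, so block 0 starts at byte 512.
    static long fileOffsetOfBlock(int blockIndex) {
        if (blockIndex < 0) {
            throw new IllegalArgumentException("not a real block index: " + blockIndex);
        }
        return (blockIndex + 1L) * BIG_BLOCK_SIZE;
    }
}
```
</PRE>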
<A NAME="HeaderBlock"><H3>Header Block</H3></A>
<P STYLE="margin-bottom: 0in; font-weight: medium">
The POI filesystem begins with a <B><I>header
block</I></B>. The first 64 bits of the header form a long
<B><I>file type id</I></B> or <B><I>magic number
identifier</I></B> of
<CODE>0xE11AB1A1E011CFD0L</CODE> (stored in the file, least
significant byte first, as the bytes
<CODE>D0 CF 11 E0 A1 B1 1A E1</CODE>). This is basically a
sanity check. If this isn't the first thing in the header
(and consequently the filesystem) then this is not a POI
filesystem and should be read with some other library.
</P>
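<P STYLE="margin-bottom: 0in">
The sanity check can be sketched as below. This is an
illustration under stated assumptions rather than the way any
particular library does it: <CODE>MagicNumberSketch</CODE> and
<CODE>looksLikePoiFilesystem</CODE> are hypothetical names, and
the caller is assumed to have already read at least the first
eight bytes of the file into a byte array.
</P>
<PRE>
```java
// Hypothetical helper; checks only the 64-bit magic number.
final class MagicNumberSketch {
    static final long MAGIC = 0xE11AB1A1E011CFD0L;

    private MagicNumberSketch() {
    }

    // Returns true if the first eight bytes hold the magic number,
    // stored least significant byte first.
    static boolean looksLikePoiFilesystem(byte[] header) {
        if (header == null || header.length < 8) {
            return false;
        }
        long value = 0;
        for (int i = 7; i >= 0; i--) {
            value = (value << 8) | (header[i] & 0xffL);
        }
        return value == MAGIC;
    }
}
```
</PRE>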
<P STYLE="margin-bottom: 0in; font-weight: medium">
The most important parts of the header are discussed in
the rest of this section.
</P>
<H4>BATs</H4>
<P STYLE="margin-bottom: 0in">
At offset <B>0x2c</B> is an int specifying the number of
elements in the <B><I>BAT array</I></B>. The array itself
starts at offset <B>0x4c</B> and is an array of ints. This
array contains the block index of each block making up the
<A HREF="#BAT">Block Allocation Table</A>.
</P>
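<P STYLE="margin-bottom: 0in">
As a rough illustration, the two header fields just described
could be read as follows. The names
<CODE>HeaderBatSketch</CODE>, <CODE>readInt</CODE>, and
<CODE>batBlockIndices</CODE> are hypothetical. The sketch
assumes the 512 byte header block is already in memory, and it
reads at most 109 entries, the capacity of the header's BAT
array; any further BAT blocks are listed in XBAT blocks, which
this sketch does not handle.
</P>
<PRE>
```java
// Hypothetical helper; reads the BAT array stored in the header block.
final class HeaderBatSketch {
    static final int BAT_COUNT_OFFSET = 0x2c;
    static final int BAT_ARRAY_OFFSET = 0x4c;
    static final int HEADER_BAT_CAPACITY = 109;

    private HeaderBatSketch() {
    }

    // Reads a little-endian 32-bit signed integer.
    static int readInt(byte[] data, int offset) {
        return (data[offset] & 0xff)
                | ((data[offset + 1] & 0xff) << 8)
                | ((data[offset + 2] & 0xff) << 16)
                | (data[offset + 3] << 24);
    }

    // Returns the block indices of the BAT blocks listed in the header.
    static int[] batBlockIndices(byte[] header) {
        int count = readInt(header, BAT_COUNT_OFFSET);
        if (count < 0) {
            throw new IllegalArgumentException("negative BAT count: " + count);
        }
        int inHeader = Math.min(count, HEADER_BAT_CAPACITY);
        int[] indices = new int[inHeader];
        for (int i = 0; i < inHeader; i++) {
            indices[i] = readInt(header, BAT_ARRAY_OFFSET + 4 * i);
        }
        return indices;
    }
}
```
</PRE>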
<H4><I><B>XBATs</B></I></H4>
<P STYLE="margin-bottom: 0in">
Very large POI archives may have more blocks than can be
addressed by the BAT blocks enumerated in the header
block. How large? Well, the BAT array in the header can
contain up to 109 BAT block indices; each BAT block
references up to 128 blocks, and each block is 512 bytes,
so we're talking about 109 * 128 * 512 = 7,143,424 bytes,
or about 6.8MB. That's a
pretty respectable document! But, you could have much more
data than that, and in today's world of cheap gigabyte
drives, why not? So, the BAT may be extended in that
event. The integer value at offset <B>0x44</B> of the
header is the index of the first <B><I>extended BAT (XBAT)
block</I></B>. At offset <B>0x48</B> of the header, there
is an int value that specifies how many XBAT blocks there
are. The XBAT blocks begin at the specified index into the
array of blocks making up the POI filesystem, and continue
in sequence for the specified count of XBAT blocks.
</P>
<P>
Each XBAT block contains the indices of up to 128 BAT
blocks, so the document size can be expanded by another
8MB (128 * 128 * 512 bytes) for each XBAT block. The BAT
blocks indexed by an XBAT
block are appended to the end of the list of BAT blocks
enumerated in the header block. Thus the BAT blocks
enumerated in the header block are BAT blocks 0 through
108, the BAT blocks enumerated in the first XBAT block are
BAT blocks 109 through 236, the BAT blocks enumerated in
the second XBAT block are BAT blocks 237 through 364, and
so on.
</P>
<P>
Through the use of XBAT blocks, the limit on the overall
document size is that imposed by the 4-byte block indices;
if the indices are unsigned ints, the maximum file size is
2 terabytes (2^32 blocks of 512 bytes), or 1 terabyte if
the indices are treated as
signed ints. Either way, I have yet to see a disk drive
large enough to accommodate such a file on the shelves at
the local office supply stores.
</P>
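<P STYLE="margin-bottom: 0in">
The size figures quoted in this section can be checked with a
few lines of arithmetic. The sketch below simply restates the
numbers given in the text (109 header entries, 128 entries per
BAT or XBAT block, 512 byte blocks); the class and method names
are hypothetical.
</P>
<PRE>
```java
// Hypothetical helper; reproduces the capacity arithmetic from the text.
final class CapacitySketch {
    static final long BLOCK_SIZE = 512;
    static final long ENTRIES_PER_BLOCK = BLOCK_SIZE / 4; // 128 four-byte ints
    static final long HEADER_BAT_ENTRIES = 109;

    private CapacitySketch() {
    }

    // Bytes addressable using only the BAT blocks listed in the header.
    static long bytesAddressableFromHeader() {
        return HEADER_BAT_ENTRIES * ENTRIES_PER_BLOCK * BLOCK_SIZE;
    }

    // Extra bytes made addressable by one XBAT block, using the text's
    // figure of 128 BAT block indices per XBAT block.
    static long bytesAddedPerXbatBlock() {
        return ENTRIES_PER_BLOCK * ENTRIES_PER_BLOCK * BLOCK_SIZE;
    }

    // Upper bound implied by 4-byte block indices.
    static long maxBytes(boolean unsignedIndices) {
        long blocks = unsignedIndices ? (1L << 32) : (1L << 31);
        return blocks * BLOCK_SIZE;
    }
}
```
</PRE>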
<H4>SBATs</H4>
<P STYLE="margin-bottom: 0in">
If a file contained in a POI archive is smaller than 4096
bytes, it is stored in small blocks. Small blocks are 64
bytes in length and are contained within big blocks, up to
8 to a big block. Just as the main BAT is used to navigate
the array of big blocks, the <B><I>small block allocation
table</I></B> (SBAT) is used to navigate the array of small
blocks. The SBAT's start block index is found at offset
<B>0x3C</B> of the header block, and the remaining blocks
constituting the SBAT are found by walking the main BAT as
if the SBAT were an ordinary file in the POI filesystem (this
process is described below).
</P>
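<P STYLE="margin-bottom: 0in">
The relationship between small and big blocks can be sketched
as follows. This is an illustration only: the names are
hypothetical, and it assumes that small blocks are numbered
consecutively across the big blocks that hold them, so that
small block n lives in the (n / 8)th of those big blocks.
</P>
<PRE>
```java
// Hypothetical helper; locates a small block inside the big blocks holding it.
final class SmallBlockSketch {
    static final int BIG_BLOCK_SIZE = 512;
    static final int SMALL_BLOCK_SIZE = 64;
    static final int SMALL_BLOCKS_PER_BIG_BLOCK = BIG_BLOCK_SIZE / SMALL_BLOCK_SIZE; // 8
    static final int SMALL_FILE_LIMIT = 4096;

    private SmallBlockSketch() {
    }

    // Files shorter than 4096 bytes are stored in small blocks.
    static boolean storedInSmallBlocks(int fileSize) {
        return fileSize < SMALL_FILE_LIMIT;
    }

    // Position, within the chain of big blocks holding small blocks,
    // of the big block that contains the given small block.
    static int bigBlockPosition(int smallBlockIndex) {
        return smallBlockIndex / SMALL_BLOCKS_PER_BIG_BLOCK;
    }

    // Byte offset of the small block inside that big block.
    static int offsetWithinBigBlock(int smallBlockIndex) {
        return (smallBlockIndex % SMALL_BLOCKS_PER_BIG_BLOCK) * SMALL_BLOCK_SIZE;
    }
}
```
</PRE>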
<H4>Property Table Start Index</H4>
<P STYLE="margin-bottom: 0in">
An integer at offset <B>0x30</B> specifies the start
index of the <A HREF="#PropertyTable">property
table</A>. This integer is specified as a
<B><I>"block index". </I></B>The <A
HREF="#PropertyTable">Property Table</A> is stored, as is
almost everything in a POI filesystem, in big blocks and
walked via the BAT. The <A HREF="#PropertyTable">Property
Table</A> is described below.
</P>
<A NAME="PropertyTable"><H3>Property Table</H3></A>
<P STYLE="margin-bottom: 0in">
The property table is essentially nothing more than the
directory system. Properties are 128 byte records
contained within the 512 byte blocks, four to a block. The
first property
is always the <A HREF="#RootEntry">Root Entry</A>. The
following applies to individual properties within a
property table:
</P>
<P STYLE="margin-bottom: 0in">
At offset <B>0x00</B> in the property is the
"<B><I>name</I></B>". This is stored as an
uncompressed 16 bit Unicode string. In short, for names
made up of "ASCII" characters, every other byte holds a
character and the bytes in between are zero. The
size of this string is stored at offset <B>0x40</B>
(<B><I>string size</I></B>) as a short.
</P>
<P STYLE="margin-bottom: 0in">
At offset <B>0x42</B> is the <B><I>property type</I></B>
(byte). The type is 1 for a directory, 2 for a file, or 5
for the Root Entry.
</P>
<P STYLE="margin-bottom: 0in">
At offset <B>0x43</B> is the <B><I>node color</I></B>
(byte). The color is either 1 (black) or 0 (red).
Properties are apparently meant to be arranged in a
red-black binary tree, subject to the following rules:
<A name="node_rules"></A>
</P>
<OL>
<LI>The root of the tree is always black.</LI>
<LI>Two consecutive nodes (a parent and its child) cannot
both be red.</LI>
<LI>A property is less than another property if its
name length is less than the other property's name
length.</LI>
<LI>If two properties have the same name length, the
sort order is determined by the sort order of the
properties' names.</LI>
</OL>
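<P STYLE="margin-bottom: 0in">
The two ordering rules can be expressed as a comparison
function. The sketch below is hedged: the class and method
names are hypothetical, and because this document does not say
how two names of equal length are ordered, the sketch assumes
plain <CODE>String.compareTo</CODE> ordering for that case.
</P>
<PRE>
```java
// Hypothetical helper; orders property names as described in the rules above.
final class PropertyOrderSketch {
    private PropertyOrderSketch() {
    }

    // Returns a negative number if a sorts before b, zero if they are
    // equal, and a positive number if a sorts after b.
    static int compareNames(String a, String b) {
        // Rule: a shorter name is always "less than" a longer one.
        if (a.length() != b.length()) {
            return a.length() < b.length() ? -1 : 1;
        }
        // Assumption: equal-length names fall back to String ordering.
        return a.compareTo(b);
    }
}
```
</PRE>
<P STYLE="margin-bottom: 0in">
Under these rules a name such as "Workbook" sorts before
"SummaryInformation" because it is shorter, even though it
would come later alphabetically.
</P>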
<P STYLE="margin-bottom: 0in">
At offset <B>0x44</B> is the index (int) of the
<B><I>previous property</I></B>.
</P>
<P STYLE="margin-bottom: 0in">
At offset <B>0x48</B> is the index (int) of the <B><I>next
property</I></B>.
</P>
<P STYLE="margin-bottom: 0in">
At offset <B>0x4C</B> is the index (int) of the
<B><I>first directory entry</I></B> (the child property).
</P>
<P STYLE="margin-bottom: 0in">
At offset <B>0x74</B> is an integer giving the <B><I>start
block</I></B> for the file described by this
property. This value is both the index of the first block
in the file and the index of the element of the Block
Allocation Table (or the Small Block Allocation Table)
that names the next block.
</P>
<P STYLE="margin-bottom: 0in">
At offset <B>0x78</B> is an integer giving the total
<B><I>actual size</I></B> of the file pointed at by this
property. If the file size is less than 4096 bytes, the
file is
stored in small blocks and the SBAT is used to walk the
small blocks making up the file. If the file size is 4096
bytes or larger, the file is stored in big blocks and the
main
BAT is used to walk the big blocks making up the file. The
exception to this rule is the <B><I>Root Entry</I></B>,
which, regardless of its size, is ALWAYS stored in big
blocks and the main BAT is used to walk the big blocks
making up this special file.
</P>
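<P STYLE="margin-bottom: 0in">
Putting the field offsets above together, a single 128 byte
property record might be decoded along the following lines.
This is a sketch under stated assumptions, not an actual POI
class: <CODE>PropertySketch</CODE> and its field names are
hypothetical, and the name is read up to its null terminator
rather than by relying on the string size field. It uses the
standard <CODE>java.nio.ByteBuffer</CODE> class in
little-endian mode.
</P>
<PRE>
```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical holder for the fields of one 128-byte property record.
final class PropertySketch {
    static final int ROOT_ENTRY_TYPE = 5;
    static final int SMALL_FILE_LIMIT = 4096;

    final String name;
    final int type;
    final int color;
    final int previous;
    final int next;
    final int child;
    final int startBlock;
    final int size;

    PropertySketch(byte[] record) {
        if (record.length < 128) {
            throw new IllegalArgumentException("a property record is 128 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(record).order(ByteOrder.LITTLE_ENDIAN);
        // The name is null-terminated and holds at most 32 16-bit characters.
        StringBuilder chars = new StringBuilder();
        for (int i = 0; i < 32; i++) {
            char c = buf.getChar(2 * i);
            if (c == 0) {
                break;
            }
            chars.append(c);
        }
        name = chars.toString();
        type = buf.get(0x42);
        color = buf.get(0x43);
        previous = buf.getInt(0x44);
        next = buf.getInt(0x48);
        child = buf.getInt(0x4C);
        startBlock = buf.getInt(0x74);
        size = buf.getInt(0x78);
    }

    // Small blocks are used for files under 4096 bytes, except the Root Entry.
    boolean usesSmallBlocks() {
        return type != ROOT_ENTRY_TYPE && size < SMALL_FILE_LIMIT;
    }
}
```
</PRE>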
<A NAME="RootEntry"><H3>Root Entry</H3></A>
<P STYLE="margin-bottom: 0in">
The <B><I>Root Entry</I></B> in the <A
HREF="#PropertyTable"><B><I>Property Table</I></B></A>
contains the information necessary to read and write small
files, which are files shorter than 4096 bytes. The
start block field of the Root Entry is the start index of
the <B><I>Small Block Array</I></B>, which is read like
any other file in the POI filesystem. Since the SBAT
cannot be used without the Small Block Array, the Root
Entry MUST be read or written using the <A
HREF="#BAT"><B><I>Block Allocation Table</I></B></A>. The
blocks making up the Small Block Array are divided into
64-byte small blocks, up to the size indicated in the Root
Entry (which should always be a multiple of 64).
</P>
<H3>Walking the Nodes of the <A HREF="#PropertyTable">Property
Table</A></H3>
<P STYLE="margin-bottom: 0in">
The individual properties form a directory tree, with the
<B><I>Root Entry</I></B> as the directory tree's root, as
shown in the accompanying drawing. Note the numbers in
parentheses in each node; they represent the node's index
in the array of properties. The <B>NEXT_PROP</B>,
<B>PREVIOUS_PROP</B>, and <B>CHILD_PROP</B> fields hold
these indices, and are used to navigate the tree.
</P>
<P>
<IMG SRC="PropertySet.jpg">
</P>
<P STYLE="margin-bottom: 0in">
Each <A NAME="directoryEntry">directory entry</A> (i.e., a
property whose type is <B><I>directory</I></B> or
<B><I>root entry</I></B>) uses its <B>CHILD_PROP</B> field
to point to one of its subordinate (child) properties. It
doesn't seem to matter which of its children it points
to. Thus in the previous drawing, the Root Entry's
CHILD_PROP field may contain 1, 4, or the index of one of
its other children. Similarly, the directory node (index
1) may have, in its CHILD_PROP field, 2, 3, or the index
of one of its other children.
</P>
<P STYLE="margin-bottom: 0in">
The children of a given <A
HREF="#directoryEntry">directory property</A> point to
each other in a similar fashion by using their
<B>NEXT_PROP</B> and <B>PREVIOUS_PROP</B> fields. The
ordering of the children is governed by rules described <a
href="#node_rules">here</a>.
</P>
<P STYLE="margin-bottom: 0in">
Unused <B>NEXT_PROP</B>, <B>PREVIOUS_PROP</B>, and
<B>CHILD_PROP</B> fields contain the marker value of
-1. For example, all file properties have a value of -1
in their CHILD_PROP fields.
</P>
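<P STYLE="margin-bottom: 0in">
One way to enumerate the children of a directory is sketched
below. It is an illustration under assumptions this document
does not confirm: that PREVIOUS_PROP leads to properties that
sort lower and NEXT_PROP to properties that sort higher (as in
an ordinary binary search tree), and that the links contain no
cycles. The class name, method names, and the use of three
parallel int arrays to hold the fields are all hypothetical.
</P>
<PRE>
```java
import java.util.Arrays;

// Hypothetical helper; walks PREVIOUS_PROP, NEXT_PROP and CHILD_PROP links.
final class PropertyTreeSketch {
    static final int UNUSED = -1;

    private PropertyTreeSketch() {
    }

    // In-order walk of the sibling tree rooted at index. Appends each
    // visited index to out and returns the updated count.
    static int collect(int index, int[] previous, int[] next, int[] out, int count) {
        if (index == UNUSED) {
            return count;
        }
        count = collect(previous[index], previous, next, out, count);
        out[count++] = index;
        return collect(next[index], previous, next, out, count);
    }

    // Returns the indices of all children of the given directory property.
    static int[] childrenOf(int directory, int[] previous, int[] next, int[] child) {
        int[] out = new int[previous.length];
        int count = collect(child[directory], previous, next, out, 0);
        return Arrays.copyOf(out, count);
    }
}
```
</PRE>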
<A NAME="BAT"><H3>Block Allocation Table</H3></A>
<P STYLE="margin-bottom: 0in">
The <B><I>BAT blocks</I></B> are pointed at by the BAT
array contained in the <A HREF="#HeaderBlock">header</A>
and supplemented, if necessary, by the <B><I>XBAT
blocks</I></B>. Together these blocks form a large table of
integers, each of which is a block number. The
<B><I>Block Allocation Table</I></B> holds chains of
these integers, and each chain is terminated with -2. The
elements in these chains refer to blocks in the files. The
starting block of a file is NOT specified in the BAT; it
is specified by the <B><I>property</I></B> for a given
file. Each element in the BAT is both the block number
(within the file minus the header) AND the number of the
next BAT element in the chain. This can be thought of as a
linked list of blocks. The BAT array contains the links
from one block to the next, including the end of chain
marker.
</P>
<P>
Here's an example: Let's assume that the BAT begins as
follows:
</P>
<P STYLE="margin-bottom: 0in">
<FONT FACE="Courier, monospace"><B>BAT[ 0 ] = 2</B></FONT>
</P>
<P STYLE="margin-bottom: 0in">
<FONT FACE="Courier, monospace"><B>BAT[ 1 ] = 5</B></FONT>
</P>
<P STYLE="margin-bottom: 0in">
<FONT FACE="Courier, monospace"><B>BAT[ 2 ] = 3</B></FONT>
</P>
<P STYLE="margin-bottom: 0in">
<FONT FACE="Courier, monospace"><B>BAT[ 3 ] = 4</B></FONT>
</P>
<P STYLE="margin-bottom: 0in">
<FONT FACE="Courier, monospace"><B>BAT[ 4 ] = 6</B></FONT>
</P>
<P STYLE="margin-bottom: 0in">
<FONT FACE="Courier, monospace"><B>BAT[ 5 ] = -2</B></FONT>
</P>
<P STYLE="margin-bottom: 0in">
<FONT FACE="Courier, monospace"><B>BAT[ 6 ] = 7</B></FONT>
</P>
<P STYLE="margin-bottom: 0in">
<FONT FACE="Courier, monospace"><B>BAT[ 7 ] = -2</B></FONT>
</P>
<P STYLE="margin-bottom: 0in">
<B>...</B>
</P>
<P STYLE="margin-bottom: 0in">
Now, if we have a file whose <A
HREF="#PropertyTable">Property Table</A> entry says it
begins with index 0, we walk the BAT array and see that
the file consists of blocks 0 (because the start block is
0), 2 (because BAT[ 0 ] is 2), 3 (BAT[ 2 ] is 3), 4 (BAT[
3 ] is 4), 6 (BAT[ 4 ] is 6), and 7 (BAT[ 6 ] is 7). It
ends at block 7 because BAT[ 7 ] is -2, which is the end
of chain marker.
</P>
<P STYLE="margin-bottom: 0in">
Similarly, a file beginning at index 1 consists of
blocks 1 and 5.
</P>
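<P STYLE="margin-bottom: 0in">
The walk just described can be sketched in a few lines. The
class and method names here are hypothetical, and the sketch
assumes the whole BAT has already been loaded into a single int
array. It also stops with an error if a chain leaves the table
or runs longer than the table itself, which would indicate a
corrupt or cyclic chain.
</P>
<PRE>
```java
import java.util.Arrays;

// Hypothetical helper; follows one chain through the BAT.
final class BatChainSketch {
    static final int END_OF_CHAIN = -2;

    private BatChainSketch() {
    }

    // Returns, in order, the indices of the blocks making up the file
    // whose property names startBlock as its start block.
    static int[] blocksOfFile(int[] bat, int startBlock) {
        int[] blocks = new int[bat.length];
        int count = 0;
        int current = startBlock;
        while (current != END_OF_CHAIN) {
            if (current < 0 || current >= bat.length || count == bat.length) {
                throw new IllegalStateException("corrupt chain at block " + current);
            }
            blocks[count++] = current;
            // The BAT element for this block names the next block.
            current = bat[current];
        }
        return Arrays.copyOf(blocks, count);
    }
}
```
</PRE>
<P STYLE="margin-bottom: 0in">
With the example table above, a start block of 0 gives blocks
0, 2, 3, 4, 6, and 7, and a start block of 1 gives blocks 1
and 5.
</P>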
<P STYLE="margin-bottom: 0in">
Other special numbers in a BAT array are:
</P>
<UL>
<LI>
<P STYLE="margin-bottom: 0in">
-1, which indicates an unused block
</P>
</LI>
<LI>
<P STYLE="margin-bottom: 0in">
-3, which indicates a "special" block,
such as a block used to make up the Small Block
Array, the <A HREF="#PropertyTable">Property
Table</A>, the main BAT, or the SBAT
</P>
</LI>
</UL>
<H2>Filesystem Structures</H2>
<P>
The following outlines the basic filesystem structures.
</P>
<H3>Header (first block in the file) -- 512 (0x200) bytes</H3>
<TABLE BORDER=0 CELLPADDING=4 CELLSPACING=0>
<TR VALIGN=TOP>
<TD><B>Field</B></TD>
<TD><B>Description</B></TD>
<TD><B>Offset</B></TD>
<TD><B>Length</B></TD>
<TD><B>Default value or const</B></TD>
</TR>
<TR VALIGN=TOP>
<TD>FILETYPE</TD>
<TD>Magic number identifying this as a POI
filesystem.</TD>
<TD>0x0000</TD>
<TD>Long</TD>
<TD>0xE11AB1A1E011CFD0</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK1</TD>
<TD>Unknown Constant</TD>
<TD>0x0008</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK2</TD>
<TD>Unknown Constant</TD>
<TD>0x000C</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK3</TD>
<TD>Unknown Constant</TD>
<TD>0x0014</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK4</TD>
<TD>Unknown Constant (revision?)</TD>
<TD>0x0018</TD>
<TD>Short</TD>
<TD>0x003B</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK5</TD>
<TD>Unknown Constant (version?)</TD>
<TD>0x001A</TD>
<TD>Short</TD>
<TD>0x0003</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK6</TD>
<TD>Unknown Constant</TD>
<TD>0x001C</TD>
<TD>Short</TD>
<TD>-2</TD>
</TR>
<TR VALIGN=TOP>
<TD>LOG_2_BIG_BLOCK_SIZE</TD>
<TD>Log, base 2, of the big block size</TD>
<TD>0x001E</TD>
<TD>Short</TD>
<TD>9 (2 ^ 9 = 512 bytes)</TD>
</TR>
<TR VALIGN=TOP>
<TD>LOG_2_SMALL_BLOCK_SIZE</TD>
<TD>Log, base 2, of the small block size</TD>
<TD>0x0020</TD>
<TD>Integer</TD>
<TD>6 (2 ^ 6 = 64 bytes)</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK7</TD>
<TD>Unknown Constant</TD>
<TD>0x0024</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK8</TD>
<TD>Unknown Constant</TD>
<TD>0x0028</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>BAT_COUNT</TD>
<TD>Number of elements in the BAT array</TD>
<TD>0x002C</TD>
<TD>Integer</TD>
<TD>required</TD>
</TR>
<TR VALIGN=TOP>
<TD>PROPERTIES_START</TD>
<TD>Block index of the first block of the <A
HREF="#PropertyTable">property table</A></TD>
<TD>0x0030</TD>
<TD>Integer</TD>
<TD>required</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK9</TD>
<TD>Unknown Constant</TD>
<TD>0x0034</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK10</TD>
<TD>Unknown Constant</TD>
<TD>0x0038</TD>
<TD>Integer</TD>
<TD>0x00001000</TD>
</TR>
<TR VALIGN=TOP>
<TD>SBAT_START</TD>
<TD>Block index of first big block containing the
small block allocation table (SBAT)</TD>
<TD>0x003C</TD>
<TD>Integer</TD>
<TD>-2</TD>
</TR>
<TR VALIGN=TOP>
<TD>UK11</TD>
<TD>Unknown Constant</TD>
<TD>0x0040</TD>
<TD>Integer</TD>
<TD>1</TD>
</TR>
<TR VALIGN=TOP>
<TD>XBAT_START</TD>
<TD>Block index of the first block in the Extended
Block Allocation Table (XBAT)</TD>
<TD>0x0044</TD>
<TD>Integer</TD>
<TD>-2</TD>
</TR>
<TR VALIGN=TOP>
<TD>XBAT_COUNT</TD>
<TD>Number of elements in the Extended Block
Allocation Table (to be added to the BAT)</TD>
<TD>0x0048</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>BAT_ARRAY</TD>
<TD>Array of block indices constituting the <A
HREF="#BAT">Block Allocation Table (BAT)</A></TD>
<TD>0x004C, 0x0050, 0x0054 ... 0x01FC</TD>
<TD>Integer[ ]</TD>
<TD>-1 for unused elements, at least first element
must be filled.</TD>
</TR>
<TR VALIGN=TOP>
<TD>N/A</TD>
<TD>Header block data not otherwise described in this
table</TD>
<TD>N/A</TD>
<TD>N/A</TD>
<TD>-1</TD>
</TR>
</TABLE>
<A HREF="#BAT"><H3><B>Block Allocation Table Block -- 512
(0x200) bytes</B></H3></A>
<TABLE BORDER=0 CELLPADDING=4 CELLSPACING=0>
<TR VALIGN=TOP>
<TD><B>Field</B></TD>
<TD><B>Description</B></TD>
<TD><B>Offset</B></TD>
<TD><B>Length</B></TD>
<TD><B>Default value or const</B></TD>
</TR>
<TR VALIGN=TOP>
<TD>BAT_ELEMENT</TD>
<TD>Any given element in the BAT block</TD>
<TD>0x0000, 0x0004, 0x0008, ... 0x01FC</TD>
<TD>Integer</TD>
<TD>-1 = unused<BR>
-2 = end of chain<BR>
-3 = special (e.g., BAT block)<BR>
All other values point to the next element in the
chain and the next index of a block composing the
file.</TD>
</TR>
</TABLE>
<H3>Property Block -- 512 (0x200) byte block</H3>
<TABLE BORDER=0 CELLPADDING=4 CELLSPACING=0>
<TR VALIGN=TOP>
<TD><B>Field</B></TD>
<TD><B>Description</B></TD>
<TD><B>Offset</B></TD>
<TD><B>Length</B></TD>
<TD><B>Default value or const</B></TD>
</TR>
<TR VALIGN=TOP>
<TD>Properties[ ]</TD>
<TD>This block contains the properties.</TD>
<TD>0x0000, 0x0080, 0x0100, 0x0180</TD>
<TD>128 bytes</TD>
<TD>All unused space is set to -1.</TD>
</TR>
</TABLE>
<H3>Property -- 128 (0x80) byte block</H3>
<TABLE BORDER=0 CELLPADDING=4 CELLSPACING=0>
<TR VALIGN=TOP>
<TD><B>Field</B></TD>
<TD><B>Description</B></TD>
<TD><B>Offset</B></TD>
<TD><B>Length</B></TD>
<TD><B>Default value or const</B></TD>
</TR>
<TR VALIGN=TOP>
<TD>NAME</TD>
<TD>A Unicode null-terminated uncompressed 16-bit
string (discard the high bytes to obtain ASCII)
containing the name of the property.</TD>
<TD>0x00, 0x02, 0x04, ... 0x3E</TD>
<TD>Short[ ]</TD>
<TD>0x0000 for unused elements, field required, 32
(0x20) elements max (64, or 0x40, bytes)</TD>
</TR>
<TR VALIGN=TOP>
<TD>NAME_SIZE</TD>
<TD>Number of characters in the NAME field</TD>
<TD>0x40</TD>
<TD>Short</TD>
<TD>Required</TD>
</TR>
<TR VALIGN=TOP>
<TD>PROPERTY_TYPE</TD>
<TD>Property type (directory, file, or root)</TD>
<TD>0x42</TD>
<TD>Byte</TD>
<TD>1 (directory), 2 (file), or 5 (root entry)</TD>
</TR>
<TR VALIGN=TOP>
<TD>NODE_COLOR</TD>
<TD>Node color</TD>
<TD>0x43</TD>
<TD>Byte</TD>
<TD>0 (red) or 1 (black)</TD>
</TR>
<TR VALIGN=TOP>
<TD>PREVIOUS_PROP</TD>
<TD>Previous property index</TD>
<TD>0x44</TD>
<TD>Integer</TD>
<TD>-1</TD>
</TR>
<TR VALIGN=TOP>
<TD>NEXT_PROP</TD>
<TD>Next property index</TD>
<TD>0x48</TD>
<TD>Integer</TD>
<TD>-1</TD>
</TR>
<TR VALIGN=TOP>
<TD>CHILD_PROP</TD>
<TD>First child property index</TD>
<TD>0x4C</TD>
<TD>Integer</TD>
<TD>-1</TD>
</TR>
<TR VALIGN=TOP>
<TD>SECONDS_1</TD>
<TD>Seconds component of the created timestamp?</TD>
<TD>0x64</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>DAYS_1</TD>
<TD>Days since epoch component of the created
timestamp?</TD>
<TD>0x68</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>SECONDS_2</TD>
<TD>Seconds component of the modified timestamp?</TD>
<TD>0x6C</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>DAYS_2</TD>
<TD>Days since epoch component of the modified
timestamp?</TD>
<TD>0x70</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
<TR VALIGN=TOP>
<TD>START_BLOCK</TD>
<TD>Starting block of the file, used as the first
block in the file and the pointer to the next
block from the BAT</TD>
<TD>0x74</TD>
<TD>Integer</TD>
<TD>Required</TD>
</TR>
<TR VALIGN=TOP>
<TD>SIZE</TD>
<TD>Actual size of the file this property points
to. (used to truncate the blocks to the real
size).</TD>
<TD>0x78</TD>
<TD>Integer</TD>
<TD>0</TD>
</TR>
</TABLE>
</BODY>
</HTML>