<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "../dtd/document-v10.dtd">
<document>
|
|
<header>
|
|
<title>HPSF Internals: The Horrible Property Set Format</title>
|
|
<authors>
|
|
<person name="Rainer Klute" email="klute@rainer-klute.de"/>
|
|
</authors>
|
|
</header>
|
|
<body>
|
|
<s1 title="HPSF Internals">
|
|
|
|
<s2 title="Introduction">
|
|
|
|
<p>A Microsoft Office document is internally organized like a filesystem
with directories and files. Microsoft calls these files
<strong>streams</strong>. A document can have properties attached to it,
such as author, title, number of words etc. This metadata is not stored in
the main stream of, say, a Word document, but instead in a dedicated
stream with a special format. Usually this stream's name is
<code>\005SummaryInformation</code>, where <code>\005</code> represents
the character with a decimal value of 5.</p>
|
|
|
|
<p>A single piece of information in the stream is called a
|
|
<strong>property</strong>, for example the document title. Each property
|
|
has an integral <strong>ID</strong> (e.g. 2 for title), a
|
|
<strong>type</strong> (telling that the title is a string of bytes) and a
|
|
<strong>value</strong> (what this is should be obvious). A stream
|
|
containing properties is called a
|
|
<strong>property set stream</strong>.</p>
|
|
|
|
<p>This document describes the internal structure of a property set stream,
|
|
i.e. the <strong>Horrible Property Set Format (HPSF)</strong>. It does not
|
|
describe how a Microsoft Office document is organized internally and how
|
|
to retrieve a stream from it. See the <link
|
|
href="../poifs/index.html">POIFS documentation</link> for that kind of
|
|
stuff.</p>
|
|
|
|
<p>The Horrible Property Set Format is not only used in the Summary
|
|
Information stream in the top-level document of a Microsoft Office
|
|
document. Often there is also a property set stream named
|
|
<code>\005DocumentSummaryInformation</code> with additional properties.
|
|
Embedded documents may have their own property set streams. You cannot
|
|
tell by a stream's name whether it is a property set stream or not.
|
|
Instead you have to open the stream and look at its bytes.</p>
|
|
</s2>
|
|
|
|
|
|
|
|
<s2 title="Data Types">
|
|
|
|
<p>Before delving into the details of the property set stream format we
have to take a short look at data types. Integral values are stored in the
so-called <strong>little endian</strong> format. In this format the bytes
that make up an integral value are stored in the "wrong" order. For
example, the decimal value 4660 is 0x1234 in hexadecimal notation. If
you think this should be represented by a byte 0x12 followed by another
byte 0x34, you are right. This is called the <strong>big endian</strong>
format. In the little endian format, however, this order is reversed and
the low-value byte comes first: 0x34 followed by 0x12.</p>
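<p>The byte order described above can be checked with a few lines of
Python. This is only an illustrative sketch using the standard
<code>struct</code> module; the variable names are made up for this
example.</p>
<source><![CDATA[
```python
import struct

value = 0x1234  # decimal 4660

# "<H" packs an unsigned 2-byte integer in little endian order,
# ">H" packs the same integer in big endian order.
little = struct.pack("<H", value)
big = struct.pack(">H", value)

print(little.hex())  # 3412 (low-value byte first)
print(big.hex())     # 1234 (high-value byte first)

# Decoding the little endian bytes gives back the original value.
decoded = struct.unpack("<H", little)[0]
print(decoded)       # 4660
```
]]></source>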
|
|
|
|
<p>The following table gives an overview of some important data
types:</p>
|
|
|
|
<table>
|
|
|
|
<tr>
|
|
<th>
|
|
<p>Name</p>
|
|
</th>
|
|
<th>
|
|
<p>Length</p>
|
|
</th>
|
|
<th>
|
|
<p>Example (Little Endian)</p>
|
|
</th>
|
|
<th>
|
|
<p>Example (Big Endian)</p>
|
|
</th>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td><p><strong>Byte</strong></p></td>
|
|
<td>
|
|
<p>1 byte</p>
|
|
</td>
|
|
<td><p><code>0x12</code></p></td>
|
|
<td><p><code>0x12</code></p></td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td><p><strong>Word</strong></p></td>
|
|
<td>
|
|
<p>2 bytes</p>
|
|
</td>
|
|
<td><p><code>0x1234</code></p></td>
|
|
<td><p><code>0x3412</code></p></td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td><p><strong>DWord</strong></p></td>
|
|
<td>
|
|
<p>4 bytes</p>
|
|
</td>
|
|
<td><p><code>0x12345678</code></p></td>
|
|
<td><p><code>0x78563412</code></p></td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p><strong>ClassID</strong></p>
|
|
<p>A sequence of one DWord, two Words and eight Bytes</p>
|
|
</td>
|
|
|
|
<td>
|
|
<p>16 bytes</p>
|
|
</td>
|
|
|
|
<td>
|
|
<p><code>0xE0859FF2F94F6810AB9108002B27B3D9</code> resp.
|
|
<code>E0859FF2-F94F-6810-AB-91-08-00-2B-27-B3-D9</code></p>
|
|
</td>
|
|
|
|
<td>
|
|
<p><code>0xF29F85E04FF91068AB9108002B27B3D9</code> resp.
|
|
<code>F29F85E0-4FF9-1068-AB-91-08-00-2B-27-B3-D9</code></p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td></td>
|
|
<td></td>
|
|
<td>
|
|
<p>The ClassID examples are given here in two different notations. The
|
|
second notation without the "0x" at the beginning and with dashes
|
|
inside shows the internal grouping into one DWord, two Words and eight
|
|
Bytes.</p>
|
|
</td>
|
|
<td>
|
|
<p><em>Watch out:</em> Microsoft documentation and tools show class IDs
slightly differently, for example
<code>F29F85E0-4FF9-1068-AB91-08002B27B3D9</code>.
However, that representation is (intentionally?) misleading with
respect to endianness.</p>
|
|
</td>
|
|
</tr>
|
|
</table>
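<p>As a rough sketch of how the ClassID grouping works, the following
Python snippet decodes 16 bytes as one DWord, two Words and eight Bytes.
It assumes that the bytes appear in the stream in the order shown in the
little endian column of the table; the function name
<code>read_class_id</code> is hypothetical and chosen for this example
only.</p>
<source><![CDATA[
```python
import struct
import uuid


def read_class_id(data, offset=0):
    """Decode 16 bytes as one DWord, two Words and eight Bytes."""
    # The DWord and the two Words are little endian integers.
    dword, word1, word2 = struct.unpack_from("<IHH", data, offset)
    # The remaining eight Bytes are single bytes and keep their order.
    rest = data[offset + 8:offset + 16]
    return "%08X-%04X-%04X-%s-%s" % (
        dword, word1, word2,
        rest[:2].hex().upper(), rest[2:].hex().upper())


# Bytes as given in the little endian example column.
raw = bytes.fromhex("E0859FF2F94F6810AB9108002B27B3D9")
print(read_class_id(raw))  # F29F85E0-4FF9-1068-AB91-08002B27B3D9

# Python's uuid module applies the same grouping via bytes_le.
print(str(uuid.UUID(bytes_le=raw)).upper())
```
]]></source>
<p>The output uses the grouping Microsoft tools display, which is why the
first three groups look reversed compared to the stored bytes.</p>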
|
|
</s2>
|
|
|
|
|
|
|
|
<s2 title="HPSF Overview">
|
|
|
|
<p>A property set stream consists of three main parts:</p>
<ol>
<li>
<p>the <strong>header</strong>,</p>
</li>
<li>
<p>the <strong>section list</strong> and</p>
</li>
<li>
<p>the <strong>section(s)</strong> containing the properties.</p>
</li>
</ol>
|
|
</s2>
|
|
|
|
|
|
|
|
<s2 title="The Header">
|
|
|
|
<p>The first bytes in a property set stream form the <strong>header</strong>.
It has a fixed length and looks like this:</p>
|
|
|
|
<table>
|
|
<tr>
|
|
<th>
|
|
<p>Offset</p>
|
|
</th>
|
|
<th>
|
|
<p>Type</p>
|
|
</th>
|
|
<th>
|
|
<p>Contents</p>
|
|
</th>
|
|
<th>
|
|
<p>Remarks</p>
|
|
</th>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>0</p>
|
|
</td>
|
|
<td>
|
|
<p>Word</p>
|
|
</td>
|
|
<td><p><code>0xFFFE</code></p></td>
|
|
<td>
|
|
<p>If the first four bytes of a stream do not contain these values, the
|
|
stream is not a property set stream.</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>2</p>
|
|
</td>
|
|
<td>
|
|
<p>Word</p>
|
|
</td>
|
|
<td><p><code>0x0000</code></p></td>
|
|
<td></td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>4</p>
|
|
</td>
|
|
<td>
|
|
<p>DWord</p>
|
|
</td>
|
|
<td>
|
|
<p>Denotes the operating system and the OS version under which this
|
|
stream was created. The operating system ID is in the DWord's higher
|
|
word (after little endian decoding): <code>0x0000</code> for Win16,
|
|
<code>0x0001</code> for Macintosh and <code>0x0002</code> for Win32 - that's
|
|
all. The reader is most likely aware of the fact that there are some
|
|
more operating systems. However, Microsoft does not seem to know.</p>
|
|
</td>
|
|
<td></td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>8</p>
|
|
</td>
|
|
<td>
|
|
<p>ClassID</p>
|
|
</td>
|
|
<td><p><code>0x00000000000000000000000000000000</code></p></td>
|
|
<td>
|
|
<p>Most property set streams have this value but this is not
|
|
required.</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>24</p>
|
|
</td>
|
|
<td>
|
|
<p>DWord</p>
|
|
</td>
|
|
<td>
|
|
<p><code>0x01000000</code> or greater</p>
|
|
</td>
|
|
<td>
|
|
<p>Section count. This field's value should be equal to 1 or greater.
|
|
Microsoft claims that this is a "reserved" field, but it seems to tell
|
|
how many sections (see below) are following in the stream. This would
|
|
really make sense because otherwise you could not know where and how
|
|
far you should read section data.</p>
|
|
</td>
|
|
</tr>
|
|
</table>
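<p>The header layout in the table above could be read along the following
lines. This is a minimal sketch, not the HPSF implementation: the names
<code>parse_header</code>, <code>HEADER</code> and <code>OS_NAMES</code>
are hypothetical, the first Word is assumed to decode to
<code>0xFFFE</code> as a little endian integer, and treating the lower
word of the OS DWord as the OS version is an assumption as well.</p>
<source><![CDATA[
```python
import struct

# Word, Word, DWord, ClassID (16 bytes), DWord: 28 bytes in total.
HEADER = struct.Struct("<HHI16sI")

OS_NAMES = {0x0000: "Win16", 0x0001: "Macintosh", 0x0002: "Win32"}


def parse_header(stream):
    """Return the header fields of a property set stream as a dict."""
    if len(stream) < HEADER.size:
        raise ValueError("stream is shorter than the fixed-length header")
    byte_order, fmt, os_dword, class_id, section_count = \
        HEADER.unpack_from(stream, 0)
    # The first four bytes identify a property set stream.
    if byte_order != 0xFFFE or fmt != 0x0000:
        raise ValueError("not a property set stream")
    return {
        # The operating system ID is in the DWord's higher word.
        "os": OS_NAMES.get(os_dword >> 16, "unknown"),
        # Assumption: the lower word carries the OS version.
        "os_version": os_dword & 0xFFFF,
        "class_id": class_id,
        "section_count": section_count,
    }


# A made-up header: Win32, all-zero ClassID, one section.
sample = HEADER.pack(0xFFFE, 0x0000, 0x00020005, bytes(16), 1)
header = parse_header(sample)
print(header["os"], header["section_count"])
```
]]></source>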
|
|
</s2>
|
|
|
|
|
|
|
|
<s2 title="Section List">
|
|
|
|
<p>Following the header is the section list. This is an array of pairs, each
consisting of a section format ID and an offset. The array has as many
pairs of ClassID and DWord fields as the section count field in the
header says. The Summary Information stream contains a single section, the
Document Summary Information stream contains two.</p>
|
|
|
|
<table>
|
|
<tr>
|
|
<th>
|
|
<p>Type</p>
|
|
</th>
|
|
<th>
|
|
<p>Contents</p>
|
|
</th>
|
|
<th>
|
|
<p>Remarks</p>
|
|
</th>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>ClassID</p>
|
|
</td>
|
|
<td>
|
|
<p>Section format ID</p>
|
|
</td>
|
|
<td>
|
|
<p><code>0xF29F85E04FF91068AB9108002B27B3D9</code> for the single section
|
|
in the Summary Information stream.</p>
|
|
|
|
<p><code>0xD5CDD5022E9C101B939708002B2CF9AE</code> for the first
|
|
section in the Document Summary Information stream.</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>DWord</p>
|
|
</td>
|
|
<td>
|
|
<p>Offset</p>
|
|
</td>
|
|
<td>
|
|
<p>The number of bytes between the beginning of the stream and the
|
|
beginning of the section within the stream.</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>ClassID</p>
|
|
</td>
|
|
<td>
|
|
<p>Section format ID</p>
|
|
</td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>DWord</p>
|
|
</td>
|
|
<td>
|
|
<p>Offset</p>
|
|
</td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
</tr>
|
|
</table>
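<p>Reading the section list might look like the sketch below. It assumes
that the list starts directly after the 28-byte header and that the
section count has already been read from the header;
<code>parse_section_list</code> is a hypothetical helper name.</p>
<source><![CDATA[
```python
import struct

SECTION_LIST_START = 28          # header: 2 + 2 + 4 + 16 + 4 bytes
ENTRY = struct.Struct("<16sI")   # ClassID followed by a DWord offset


def parse_section_list(stream, section_count):
    """Return a list of (section format ID bytes, section offset)."""
    entries = []
    for i in range(section_count):
        position = SECTION_LIST_START + i * ENTRY.size
        format_id, offset = ENTRY.unpack_from(stream, position)
        entries.append((format_id, offset))
    return entries


# Made-up stream: an empty 28-byte header area plus one list entry
# whose section starts 48 bytes after the beginning of the stream.
format_id = bytes.fromhex("E0859FF2F94F6810AB9108002B27B3D9")
stream = bytes(28) + ENTRY.pack(format_id, 48)
print(parse_section_list(stream, 1))
```
]]></source>
<p>The offsets are counted from the beginning of the stream, so they can
be used directly as positions when reading the sections.</p>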
|
|
</s2>
|
|
|
|
|
|
|
|
<s2 title="Section">
|
|
|
|
<p>A section is divided into three parts: the section header (with the
section length and the number of properties in the section), the
properties list (with ID and offset of each property), and the
properties themselves. Here are the details:</p>
|
|
|
|
<table>
|
|
<tr>
|
|
<th>
|
|
<p> </p>
|
|
</th>
|
|
<th>
|
|
<p>Type</p>
|
|
</th>
|
|
<th>
|
|
<p>Contents</p>
|
|
</th>
|
|
<th>
|
|
<p>Remarks</p>
|
|
</th>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>Section header</p>
|
|
</td>
|
|
|
|
<td>
|
|
<p>DWord</p>
|
|
</td>
|
|
<td>
|
|
<p>Length</p>
|
|
</td>
|
|
<td>
|
|
<p>The length of the section in bytes.</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td></td>
|
|
<td>
|
|
<p>DWord</p>
|
|
</td>
|
|
<td>
|
|
<p>Property count</p>
|
|
</td>
|
|
<td>
|
|
<p>The number of properties in the section.</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td>
|
|
<p>Properties list</p>
|
|
</td>
|
|
|
|
<td>
|
|
<p>DWord</p>
|
|
</td>
|
|
<td>
|
|
<p>Property ID</p>
|
|
</td>
|
|
<td>
|
|
<p>The property ID tells what the property means. For example, an ID of
|
|
<code>0x0002</code> in the Summary Information stands for the document's
|
|
title. See the <link href="#property_ids">Property IDs</link>
|
|
chapter below for more details.</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td></td>
|
|
<td>
|
|
<p>DWord</p>
|
|
</td>
|
|
<td>
|
|
<p>Offset</p>
|
|
</td>
|
|
<td>
|
|
<p>The number of bytes between the beginning of the section and the
|
|
property.</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td></td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td>
|
|
<p>Properties</p>
|
|
</td>
|
|
|
|
<td>
|
|
<p>DWord</p>
|
|
</td>
|
|
<td>
|
|
<p>Property type ("variant")</p>
|
|
</td>
|
|
<td>
|
|
<p>This is the property's data type, e.g. an integer value, a byte
|
|
string or a Unicode string. See the
|
|
<link href="#property_types"><em>Property Types</em></link> chapter for
|
|
details!</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td></td>
|
|
<td><p><em>Field length depends on the property type
|
|
("variant")</em></p></td>
|
|
<td>
|
|
<p>Property value</p>
|
|
</td>
|
|
<td>
|
|
<p>This field's length depends on the property's type. These are the
bytes that make up the DWord, the byte string or some other data of
fixed or variable length.</p>
<p>The property value is always stored in an area whose length is a
multiple of 4. If the value is shorter, e.g. a byte
string of 13 bytes, the remaining bytes are filled with <code>0x00</code>
bytes.</p>
|
|
</td>
|
|
</tr>
|
|
|
|
<tr>
|
|
<td></td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
<td>
|
|
<p>...</p>
|
|
</td>
|
|
</tr>
|
|
</table>
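<p>The section layout above can be walked through as in the following
sketch. It only locates each property and reads its type field; decoding
the value itself depends on the type and is left out. The helper names
<code>parse_section</code> and <code>padded_length</code> are
hypothetical, and the sample bytes are made up for illustration.</p>
<source><![CDATA[
```python
import struct


def parse_section(stream, section_offset):
    """Return (section length, [(property ID, type, value position)])."""
    # Section header: length in bytes and number of properties.
    length, count = struct.unpack_from("<II", stream, section_offset)
    properties = []
    for i in range(count):
        # Properties list: pairs of property ID and offset.
        entry_position = section_offset + 8 + i * 8
        pid, offset = struct.unpack_from("<II", stream, entry_position)
        # The offset is relative to the beginning of the section.
        (vtype,) = struct.unpack_from("<I", stream, section_offset + offset)
        properties.append((pid, vtype, section_offset + offset + 4))
    return length, properties


def padded_length(n):
    """Round a value length up to the next multiple of 4."""
    return (n + 3) & ~3


# One property: ID 14, type 3 (VT_I4), value 7, section at position 4.
section = struct.pack("<IIIIIi", 24, 1, 14, 16, 3, 7)
stream = bytes(4) + section
length, properties = parse_section(stream, 4)
pid, vtype, value_position = properties[0]
value = struct.unpack_from("<i", stream, value_position)[0]
print(pid, vtype, value)
print(padded_length(13))  # 16
```
]]></source>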
|
|
</s2>
|
|
|
|
|
|
|
|
<s2 title="Property IDs">
|
|
<anchor id="property_ids"/>
|
|
|
|
<p>As seen above, a section holds a property list: an array with property
|
|
IDs and offsets. The property ID gives each property a meaning. For
|
|
example, in the Summary Information stream the property ID 2 says that
|
|
this property is the document's title.</p>
|
|
|
|
<p>If you want to know a property ID's meaning, it is not sufficient to
|
|
know the ID itself. You must also know the
|
|
<strong>section format ID</strong>. For example, in the Document Summary
|
|
Information stream the property ID 2 means not the document's title but
|
|
its category. Due to Microsoft's infinite wisdom the section format ID is
|
|
not part of the section. Thus if you have only a section without the
|
|
stream it is in, you cannot make any sense of the properties because you
|
|
do not know what they mean.</p>
|
|
|
|
<p>So each section format ID has its own name space of property IDs.
|
|
Microsoft defined some "well-known" property IDs for the Summary
|
|
Information and the Document Summary Information streams. You can extend
|
|
them by your own additional IDs. This will be described below.</p>
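<p>One way to picture these separate name spaces is a lookup keyed first
by section format ID and then by property ID. The sketch below uses a few
of the ID strings from the tables in this chapter; the dictionary layout
and the function <code>property_name</code> are assumptions made for this
example, not part of the format.</p>
<source><![CDATA[
```python
SUMMARY_INFORMATION = "F29F85E04FF91068AB9108002B27B3D9"
DOCUMENT_SUMMARY_INFORMATION = "D5CDD5022E9C101B939708002B2CF9AE"

# Each section format ID has its own name space of property IDs.
PROPERTY_NAMES = {
    SUMMARY_INFORMATION: {
        2: "PID_TITLE",
        3: "PID_SUBJECT",
        4: "PID_AUTHOR",
    },
    DOCUMENT_SUMMARY_INFORMATION: {
        2: "PID_CATEGORY",
        3: "PID_PRESFORMAT",
        4: "PID_BYTECOUNT",
    },
}


def property_name(section_format_id, property_id):
    """Return the ID string, or None if the combination is unknown."""
    names = PROPERTY_NAMES.get(section_format_id.upper(), {})
    return names.get(property_id)


print(property_name(SUMMARY_INFORMATION, 2))           # PID_TITLE
print(property_name(DOCUMENT_SUMMARY_INFORMATION, 2))  # PID_CATEGORY
```
]]></source>
<p>The same property ID 2 resolves to two different meanings, which is
why the section format ID has to be kept together with the section.</p>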
|
|
|
|
<s3 title="Property IDs in The Summary Information Stream">
|
|
|
|
<p>The Summary Information stream has a single section with a section
|
|
format ID of <code>0xF29F85E04FF91068AB9108002B27B3D9</code>. The following
|
|
table defines the meaning of its property IDs. Each row associates a
|
|
property ID with a <em>name</em> and an <em>ID string</em>. (The property
<em>type</em> is given here for informational purposes only. As we have
seen above, the type is always given along with the value.)</p>
|
|
|
|
<p>The property <em>name</em> is a readable string which could be
|
|
displayed to the user. However, this string is useful only for users who
|
|
understand English. The property name does not help with other
|
|
languages.</p>
|
|
|
|
<p>The property <em>ID string</em> is about the same but looks more
technical and is nothing a user should bother with. You could take the ID
string and map it to an appropriate display string in a particular
language. Of course you could do that with the property ID as well and
with less overhead, but people (including software developers) tend to be
better at remembering symbolic constants than remembering numbers.</p>
|
|
|
|
<table>
|
|
<tr>
|
|
<th><p>Property ID</p></th>
|
|
<th><p>Property Name</p></th>
|
|
<th><p>Property ID String</p></th>
|
|
<th><p>Property Type</p></th>
|
|
</tr>
|
|
<tr>
|
|
<td><p>2</p></td>
|
|
<td><p>Title</p></td>
|
|
<td><p>PID_TITLE</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>3</p></td>
|
|
<td><p>Subject</p></td>
|
|
<td><p>PID_SUBJECT</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>4</p></td>
|
|
<td><p>Author</p></td>
|
|
<td><p>PID_AUTHOR</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>5</p></td>
|
|
<td><p>Keywords</p></td>
|
|
<td><p>PID_KEYWORDS</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>6</p></td>
|
|
<td><p>Comments</p></td>
|
|
<td><p>PID_COMMENTS</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>7</p></td>
|
|
<td><p>Template</p></td>
|
|
<td><p>PID_TEMPLATE</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>8</p></td>
|
|
<td><p>Last Saved By</p></td>
|
|
<td><p>PID_LASTAUTHOR</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>9</p></td>
|
|
<td><p>Revision Number</p></td>
|
|
<td><p>PID_REVNUMBER</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>10</p></td>
|
|
<td><p>Total Editing Time</p></td>
|
|
<td><p>PID_EDITTIME</p></td>
|
|
<td><p>VT_FILETIME</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>11</p></td>
|
|
<td><p>Last Printed</p></td>
|
|
<td><p>PID_LASTPRINTED</p></td>
|
|
<td><p>VT_FILETIME</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>12</p></td>
|
|
<td><p>Create Time/Date</p></td>
|
|
<td><p>PID_CREATE_DTM</p></td>
|
|
<td><p>VT_FILETIME</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>13</p></td>
|
|
<td><p>Last Saved Time/Date</p></td>
|
|
<td><p>PID_LASTSAVE_DTM</p></td>
|
|
<td><p>VT_FILETIME</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>14</p></td>
|
|
<td><p>Number of Pages</p></td>
|
|
<td><p>PID_PAGECOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>15</p></td>
|
|
<td><p>Number of Words</p></td>
|
|
<td><p>PID_WORDCOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>16</p></td>
|
|
<td><p>Number of Characters</p></td>
|
|
<td><p>PID_CHARCOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>17</p></td>
|
|
<td><p>Thumbnail</p></td>
|
|
<td><p>PID_THUMBNAIL</p></td>
|
|
<td><p>VT_CF</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>18</p></td>
|
|
<td><p>Name of Creating Application</p></td>
|
|
<td><p>PID_APPNAME</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>19</p></td>
|
|
<td><p>Security</p></td>
|
|
<td><p>PID_SECURITY</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
</table>
|
|
</s3>
|
|
|
|
|
|
|
|
<s3 title="Property IDs in The Document Summary Information Stream">
|
|
|
|
<p>The Document Summary Information stream has two sections with a section
|
|
format ID of <code>0xD5CDD5022E9C101B939708002B2CF9AE</code> for the first
|
|
one. The following table defines the meaning of the property IDs in the
|
|
first section. See the preceding section for interpreting the table.</p>
|
|
|
|
<table>
|
|
<tr>
|
|
<th><p>Property ID</p></th>
|
|
<th><p>Property name</p></th>
|
|
<th><p>Property ID string</p></th>
|
|
<th><p>VT type</p></th>
|
|
</tr>
|
|
<tr>
|
|
<td><p>2</p></td>
|
|
<td><p>Category</p></td>
|
|
<td><p>PID_CATEGORY</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>3</p></td>
|
|
<td><p>PresentationTarget</p></td>
|
|
<td><p>PID_PRESFORMAT</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>4</p></td>
|
|
<td><p>Bytes</p></td>
|
|
<td><p>PID_BYTECOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>5</p></td>
|
|
<td><p>Lines</p></td>
|
|
<td><p>PID_LINECOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>6</p></td>
|
|
<td><p>Paragraphs</p></td>
|
|
<td><p>PID_PARCOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>7</p></td>
|
|
<td><p>Slides</p></td>
|
|
<td><p>PID_SLIDECOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>8</p></td>
|
|
<td><p>Notes</p></td>
|
|
<td><p>PID_NOTECOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>9</p></td>
|
|
<td><p>HiddenSlides</p></td>
|
|
<td><p>PID_HIDDENCOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>10</p></td>
|
|
<td><p>MMClips</p></td>
|
|
<td><p>PID_MMCLIPCOUNT</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>11</p></td>
|
|
<td><p>ScaleCrop</p></td>
|
|
<td><p>PID_SCALE</p></td>
|
|
<td><p>VT_BOOL</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>12</p></td>
|
|
<td><p>HeadingPairs</p></td>
|
|
<td><p>PID_HEADINGPAIR</p></td>
|
|
<td><p>VT_VARIANT | VT_VECTOR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>13</p></td>
|
|
<td><p>TitlesofParts</p></td>
|
|
<td><p>PID_DOCPARTS</p></td>
|
|
<td><p>VT_LPSTR | VT_VECTOR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>14</p></td>
|
|
<td><p>Manager</p></td>
|
|
<td><p>PID_MANAGER</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>15</p></td>
|
|
<td><p>Company</p></td>
|
|
<td><p>PID_COMPANY</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>16</p></td>
|
|
<td><p>LinksUpToDate</p></td>
|
|
<td><p>PID_LINKSDIRTY</p></td>
|
|
<td><p>VT_BOOL</p></td>
|
|
</tr>
|
|
</table>
|
|
</s3>
|
|
</s2>
|
|
|
|
|
|
|
|
<s2 title="Property Types">
|
|
<anchor id="property_types"/>
|
|
|
|
<p>A property consists of a DWord <em>type field</em> followed by the
property value. The property type is an integer value and tells how the
data bytes following it are to be interpreted. In the Microsoft world it is
also known as the <em>variant</em>.</p>
|
|
|
|
<p>The <em>Usage</em> column says where a variant type may occur. Only
those marked with a [P] are allowed in a property set.
<strong>[V]</strong> - may appear in a VARIANT, <strong>[T]</strong> - may
appear in a TYPEDESC, <strong>[P]</strong> - may appear in an OLE property
set, <strong>[S]</strong> - may appear in a Safe Array.</p>
|
|
|
|
<table>
|
|
<tr>
|
|
<th><p>Variant ID</p></th>
|
|
<th><p>Variant Type</p></th>
|
|
<th><p>Usage</p> </th>
|
|
<th><p>Description</p></th>
|
|
</tr>
|
|
<tr>
|
|
<td><p>0</p></td>
|
|
<td><p>VT_EMPTY</p></td>
|
|
<td><p>[V] [P]</p></td>
|
|
<td><p>nothing</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>1</p></td>
|
|
<td><p>VT_NULL</p></td>
|
|
<td><p>[V] [P]</p></td>
|
|
<td><p>SQL style Null</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>2</p></td>
|
|
<td><p>VT_I2</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>2 byte signed int</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>3</p></td>
|
|
<td><p>VT_I4</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>4 byte signed int</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>4</p></td>
|
|
<td><p>VT_R4</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>4 byte real</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>5</p></td>
|
|
<td><p>VT_R8</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>8 byte real</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>6</p></td>
|
|
<td><p>VT_CY</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>currency</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>7</p></td>
|
|
<td><p>VT_DATE</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>date</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>8</p></td>
|
|
<td><p>VT_BSTR</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>OLE Automation string</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>9</p></td>
|
|
<td><p>VT_DISPATCH</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>IDispatch *</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>10</p></td>
|
|
<td><p>VT_ERROR</p></td>
|
|
<td><p>[V] [T] [S]</p></td>
|
|
<td><p>SCODE</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>11</p></td>
|
|
<td><p>VT_BOOL</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>True=-1, False=0</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>12</p></td>
|
|
<td><p>VT_VARIANT</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>VARIANT *</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>13</p></td>
|
|
<td><p>VT_UNKNOWN</p></td>
|
|
<td><p>[V] [T] [S]</p></td>
|
|
<td><p>IUnknown *</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>14</p></td>
|
|
<td><p>VT_DECIMAL</p></td>
|
|
<td><p>[V] [T] [S]</p></td>
|
|
<td><p>16 byte fixed point</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>16</p></td>
|
|
<td><p>VT_I1</p></td>
|
|
<td><p>[T]</p></td>
|
|
<td><p>signed char</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>17</p></td>
|
|
<td><p>VT_UI1</p></td>
|
|
<td><p>[V] [T] [P] [S]</p></td>
|
|
<td><p>unsigned char</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>18</p></td>
|
|
<td><p>VT_UI2</p></td>
|
|
<td><p>[T] [P]</p></td>
|
|
<td><p>unsigned short</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>19</p></td>
|
|
<td><p>VT_UI4</p></td>
|
|
<td><p>[T] [P]</p></td>
|
|
<td><p>unsigned long</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>20</p></td>
|
|
<td><p>VT_I8</p></td>
|
|
<td><p>[T] [P]</p></td>
|
|
<td><p>signed 64-bit int</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>21</p></td>
|
|
<td><p>VT_UI8</p></td>
|
|
<td><p>[T] [P]</p></td>
|
|
<td><p>unsigned 64-bit int</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>22</p></td>
|
|
<td><p>VT_INT</p></td>
|
|
<td><p>[T]</p></td>
|
|
<td><p>signed machine int</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>23</p></td>
|
|
<td><p>VT_UINT</p></td>
|
|
<td><p>[T]</p></td>
|
|
<td><p>unsigned machine int</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>24</p></td>
|
|
<td><p>VT_VOID</p></td>
|
|
<td><p>[T]</p></td>
|
|
<td><p>C style void</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>25</p></td>
|
|
<td><p>VT_HRESULT</p></td>
|
|
<td><p>[T]</p></td>
|
|
<td><p>Standard return type</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>26</p></td>
|
|
<td><p>VT_PTR</p></td>
|
|
<td><p>[T]</p></td>
|
|
<td><p>pointer type</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>27</p></td>
|
|
<td><p>VT_SAFEARRAY</p></td>
|
|
<td><p>[T]</p></td>
|
|
<td><p>(use VT_ARRAY in VARIANT)</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>28</p></td>
|
|
<td><p>VT_CARRAY</p></td>
|
|
<td><p>[T]</p></td>
|
|
<td><p>C style array</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>29</p></td>
|
|
<td><p>VT_USERDEFINED</p></td>
|
|
<td><p>[T]</p></td>
|
|
<td><p>user defined type</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>30</p></td>
|
|
<td><p>VT_LPSTR</p></td>
|
|
<td><p>[T] [P]</p></td>
|
|
<td><p>null terminated string</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>31</p></td>
|
|
<td><p>VT_LPWSTR</p></td>
|
|
<td><p>[T] [P]</p></td>
|
|
<td><p>wide null terminated string</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>64</p></td>
|
|
<td><p>VT_FILETIME</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>FILETIME</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>65</p></td>
|
|
<td><p>VT_BLOB</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>Length prefixed bytes</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>66</p></td>
|
|
<td><p>VT_STREAM</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>Name of the stream follows</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>67</p></td>
|
|
<td><p>VT_STORAGE</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>Name of the storage follows</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>68</p></td>
|
|
<td><p>VT_STREAMED_OBJECT</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>Stream contains an object</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>69</p></td>
|
|
<td><p>VT_STORED_OBJECT</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>Storage contains an object</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>70</p></td>
|
|
<td><p>VT_BLOB_OBJECT</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>Blob contains an object</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>71</p></td>
|
|
<td><p>VT_CF</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>Clipboard format</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>72</p></td>
|
|
<td><p>VT_CLSID</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>A Class ID</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>0x1000</p></td>
|
|
<td><p>VT_VECTOR</p></td>
|
|
<td><p>[P]</p></td>
|
|
<td><p>simple counted array</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>0x2000</p></td>
|
|
<td><p>VT_ARRAY</p></td>
|
|
<td><p>[V]</p></td>
|
|
<td><p>SAFEARRAY*</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>0x4000</p></td>
|
|
<td><p>VT_BYREF</p></td>
|
|
<td><p>[V]</p></td>
|
|
<td><p>void* for local use</p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>0x8000</p></td>
|
|
<td><p>VT_RESERVED</p></td>
|
|
<td><p><br/></p></td>
|
|
<td><p><br/></p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>0xFFFF</p></td>
|
|
<td><p>VT_ILLEGAL</p></td>
|
|
<td><p><br/></p></td>
|
|
<td><p><br/></p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>0xFFF</p></td>
|
|
<td><p>VT_ILLEGALMASKED</p></td>
|
|
<td><p><br/></p></td>
|
|
<td><p><br/></p></td>
|
|
</tr>
|
|
<tr>
|
|
<td><p>0xFFF</p></td>
|
|
<td><p>VT_TYPEMASK</p></td>
|
|
<td><p><br/></p></td>
|
|
<td><p><br/></p></td>
|
|
</tr>
|
|
</table>
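<p>Some entries in the property ID tables combine two variant IDs, for
example <code>VT_LPSTR | VT_VECTOR</code>. The following sketch shows how
such a type field could be split into its base type and its modifier bits
using <code>VT_TYPEMASK</code>. The constants are taken from the table
above; the function <code>split_variant_type</code> is a hypothetical
helper.</p>
<source><![CDATA[
```python
VT_VARIANT = 12
VT_LPSTR = 30
VT_VECTOR = 0x1000
VT_ARRAY = 0x2000
VT_BYREF = 0x4000
VT_TYPEMASK = 0xFFF

MODIFIERS = (
    ("VT_VECTOR", VT_VECTOR),
    ("VT_ARRAY", VT_ARRAY),
    ("VT_BYREF", VT_BYREF),
)


def split_variant_type(type_field):
    """Return (base variant ID, list of modifier names)."""
    base = type_field & VT_TYPEMASK
    modifiers = [name for name, bit in MODIFIERS if type_field & bit]
    return base, modifiers


print(split_variant_type(VT_LPSTR | VT_VECTOR))  # (30, ['VT_VECTOR'])
print(split_variant_type(VT_LPSTR))              # (30, [])
```
]]></source>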
|
|
</s2>
|
|
|
|
|
|
|
|
<s2 title="References">
|
|
|
|
<p>In order to assemble the HPSF description I used only information publicly
available on the Internet. The references given below have been very
helpful. If you have any amendments or corrections, please let us know!
Thank you!</p>
|
|
|
|
<ol>
|
|
|
|
<li>
|
|
<p>In
<link href="http://www.kyler.com/pubs/ddj9894.html"><em>Understanding OLE
documents</em></link>, Ken Kyler gives an introduction to OLE2
documents and especially to property sets. He lists the property names,
types, and IDs of the Summary Information and Document Summary Information
streams.</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>The
|
|
<link href="http://www.dwam.net/docs/oleref/"><em>ActiveX Programmer's
|
|
Reference</em></link> at
|
|
<link href="http://www.dwam.net/docs/oleref/">http://www.dwam.net/docs/oleref/</link>
|
|
seems a little outdated, but that's what I have found.</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>An overview of the <code>VT_</code> types is in
|
|
<link href="http://www.marin.clara.net/COM/variant_type_definitions.htm"><em>Variant
|
|
Type Definitions</em></link>.</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>What is a <code>FILETIME</code>? The answer can be found for example under
|
|
<link href="http://www.vbapi.com/ref/f/filetime.html">http://www.vbapi.com/ref/f/filetime.html</link>
|
|
or
|
|
<link href="http://www.cs.rpi.edu/courses/fall01/os/FILETIME.html">http://www.cs.rpi.edu/courses/fall01/os/FILETIME.html</link>. In
|
|
short:
|
|
<em>The FILETIME structure holds a date and time associated with a file.
|
|
The structure identifies a 64-bit integer specifying the number of
|
|
100-nanosecond intervals which have passed since January 1, 1601. This
|
|
64-bit value is split into the two dwords stored in the
|
|
structure.</em></p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>This documentation originates from the <link href="http://www.rainer-klute.de/~klute/Software/poibrowser/doc/HPSF-Description.html">HPSF description</link> available at <link href="http://www.rainer-klute.de/~klute/Software/poibrowser/doc/HPSF-Description.html">http://www.rainer-klute.de/~klute/Software/poibrowser/doc/HPSF-Description.html</link>.</p>
|
|
</li>
|
|
</ol>
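<p>The <code>FILETIME</code> definition quoted in the references above can
be turned into a calendar date with a small calculation. The sketch below
assumes that the low DWord is stored before the high DWord and ignores
time zones; the function names are hypothetical.</p>
<source><![CDATA[
```python
import struct
from datetime import datetime, timedelta

FILETIME_EPOCH = datetime(1601, 1, 1)


def filetime_to_datetime(low_dword, high_dword):
    """Combine the two DWords and convert them to a datetime."""
    # Number of 100-nanosecond intervals since January 1, 1601.
    ticks = (high_dword << 32) | low_dword
    # Ten such intervals make one microsecond.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def read_filetime(data, offset=0):
    """Read a FILETIME stored as two little endian DWords."""
    low, high = struct.unpack_from("<II", data, offset)
    return filetime_to_datetime(low, high)


# 116444736000000000 intervals after the epoch is January 1, 1970.
ticks = 116444736000000000
raw = struct.pack("<II", ticks & 0xFFFFFFFF, ticks >> 32)
print(read_filetime(raw))  # 1970-01-01 00:00:00
```
]]></source>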
|
|
</s2>
|
|
</s1>
|
|
</body>
|
|
</document>
<!-- Keep this comment at the end of the file
|
|
Local variables:
|
|
mode: xml
|
|
sgml-omittag:nil
|
|
sgml-shorttag:nil
|
|
sgml-namecase-general:nil
|
|
sgml-general-insert-case:lower
|
|
sgml-minimize-attributes:nil
|
|
sgml-always-quote-attributes:t
|
|
sgml-indent-step:1
|
|
sgml-indent-data:t
|
|
sgml-parent-document:nil
|
|
sgml-exposed-tags:nil
|
|
sgml-local-catalogs:nil
|
|
sgml-local-ecat-files:nil
|
|
End:
|
|
-->