HPSF: codepage support added

git-svn-id: https://svn.apache.org/repos/asf/jakarta/poi/trunk@353460 13f79535-47bb-0310-9956-ffa450edef68
Rainer Klute 2003-12-02 17:46:01 +00:00
parent 6385296f3f
commit 131bb9d0bd
15 changed files with 276 additions and 103 deletions

@ -12,7 +12,11 @@
<person id="MJ" name="Marc Johnson" email="mjohnson@apache.org"/> <person id="MJ" name="Marc Johnson" email="mjohnson@apache.org"/>
<person id="NKB" name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/> <person id="NKB" name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
<person id="POI-DEVELOPERS" name="POI Developers" email="poi-dev@jakarta.apache.org"/> <person id="POI-DEVELOPERS" name="POI Developers" email="poi-dev@jakarta.apache.org"/>
<person id="RK" name="Rainer Klute" email="klute@apache.org"/>
</devs> </devs>
<release version="2.0-pre3" date="unreleased">
<action dev="RK" type="add">HPSF: Much better codepage support</action>
</release>
<release version="2.0-pre1" date="unreleased"> <release version="2.0-pre1" date="unreleased">
<action dev="POI-DEVELOPERS" type="add">Patch applied for deep cloning of worksheets was provided</action> <action dev="POI-DEVELOPERS" type="add">Patch applied for deep cloning of worksheets was provided</action>
<action dev="POI-DEVELOPERS" type="add">Patch applied to allow sheet reordering</action> <action dev="POI-DEVELOPERS" type="add">Patch applied to allow sheet reordering</action>

@ -708,8 +708,9 @@ No property set stream: "/1Table"</source>
<td>The property's value is the number of a <strong>codepage</strong>, <td>The property's value is the number of a <strong>codepage</strong>,
i.e. a mapping from character codes to characters. All strings in the i.e. a mapping from character codes to characters. All strings in the
section containing this property must be interpreted using this section containing this property must be interpreted using this
codepage. Typical property values are 1252 (8-bit "western" characters) codepage. Typical property values are 1252 (8-bit "western" characters,
or 1200 (16-bit Unicode characters).</td> ISO-8859-1), 1200 (16-bit Unicode characters, UTF-16), or 65001 (8-bit
Unicode characters, UTF-8).</td>
</tr> </tr>
</table> </table>
</section> </section>
@ -833,18 +834,34 @@ No property set stream: "/1Table"</source>
</section> </section>
<section><title>Codepage support</title> <section><title>Codepage support</title>
<fixme author="Rainer Klute">Improve codepage support!</fixme>
<p>The property with ID 1 holds the number of the codepage which was used <p>The property with ID 1 holds the number of the codepage which was used
to encode the strings in this section. The present HPSF codepage support to encode the strings in this section. If this property is not available
is still very limited: When reading property value strings, HPSF in a section, the platform's default character encoding will be
distinguishes between 16-bit characters and 8-bit characters. 16-bit used. This works fine as long as the document being read has been written
characters should be Unicode characters and thus be okay. 8-bit on a platform with the same default character encoding. However, if you
characters are interpreted according to the platform's default character receive a document from another region of the world and the codepage is
set. This is fine as long as the document being read has been written on undefined, you are in trouble.</p>
a platform with the same default character set. However, if you receive a
document from another region of the world and want to process it with <p>HPSF's codepage support is as good as the character encoding support of
HPSF you are in trouble - unless the creator used Unicode, of course.</p> the Java Virtual Machine (JVM) the application runs on. If HPSF
encounters a codepage number it assumes that the JVM has a character
encoding with a corresponding name. For example, if the codepage is 1252,
HPSF uses the character encoding "cp1252" to read or write strings. If
the JVM does not have that character encoding installed or if the
codepage number is illegal, an UnsupportedEncodingException will be
thrown.</p>
<p>There are two exceptions to the rule that a character encoding's name
is derived from the codepage number by prepending the string "cp" to
it:</p>
<dl>
<dt>Codepage 1200</dt>
<dd>is mapped to the character encoding "UTF-16".</dd>
<dt>Codepage 65001</dt>
<dd>is mapped to the character encoding "UTF-8".</dd>
</dl>
</section> </section>
</section> </section>
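The mapping rule described in the codepage section above can be sketched in plain Java as follows. This is a minimal illustration, not the HPSF code itself: the class name `CodepageDemo` and the method name `toEncodingName` are hypothetical, and the sketch assumes the JVM ships an encoding that answers to the name "cp1252".

```java
import java.io.UnsupportedEncodingException;

class CodepageDemo {
    /**
     * Maps a codepage number to a Java encoding name using the rule from
     * the text: 1200 -> "UTF-16", 65001 -> "UTF-8", otherwise "cp" + number.
     * (Hypothetical helper for illustration only.)
     */
    static String toEncodingName(int codepage)
            throws UnsupportedEncodingException {
        if (codepage <= 0) {
            throw new UnsupportedEncodingException(
                    "Codepage number may not be " + codepage);
        }
        switch (codepage) {
            case 1200:
                return "UTF-16";
            case 65001:
                return "UTF-8";
            default:
                return "cp" + codepage;
        }
    }

    public static void main(String[] args) throws Exception {
        // 0xE4 0xF6 0xFC are the letters a-umlaut, o-umlaut, u-umlaut
        // in codepage 1252.
        byte[] raw = {(byte) 0xE4, (byte) 0xF6, (byte) 0xFC};
        // new String(byte[], String) throws UnsupportedEncodingException
        // if the JVM does not know the encoding name.
        String s = new String(raw, toEncodingName(1252));
        System.out.println(s.length() + " characters decoded");
    }
}
```

Decoding the same three bytes with a different encoding name would yield different characters, which is why a missing codepage property makes the result platform dependent.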

@ -944,6 +944,60 @@
<section>
<title>The Dictionary</title>
<p>What a dictionary is good for is explained in the <link
href="how-to.html">HPSF HOW-TO</link>. This chapter explains how it is
organized internally.</p>
<p>The dictionary has a simple header consisting of a single UInt value. It
tells how many entries the dictionary comprises:</p>
<table>
<tr>
<th>Name</th>
<th>Data type</th>
<th>Description</th>
</tr>
<tr>
<td>nrEntries</td>
<td>UInt</td>
<td>Number of dictionary entries</td>
</tr>
</table>
<p>The dictionary entries follow the header. Each one looks like this:</p>
<table>
<tr>
<th>Name</th>
<th>Data type</th>
<th>Description</th>
</tr>
<tr>
<td>key</td>
<td>UInt</td>
<td>The unique number of this property, i.e. the PID</td>
</tr>
<tr>
<td>length</td>
<td>UInt</td>
<td>The length of the property name associated with the key</td>
</tr>
<tr>
<td>value</td>
<td>String</td>
<td>The property's name, terminated with a 0x00 character</td>
</tr>
</table>
<p>The entries are not aligned, i.e. each one follows its predecessor
without any gap or fill characters.</p>
</section>
<section><title>References</title> <section><title>References</title>
<p>In order to assemble the HPSF description I used information publically <p>In order to assemble the HPSF description I used information publically
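The dictionary layout documented in this hunk (a UInt entry count, then key, length and value per entry, with no padding between entries) could be read roughly as shown below. This is a sketch under stated assumptions, none of which the text confirms: a UInt is taken to be 4 bytes in little-endian order, the length field is taken to count the terminating 0x00 byte, and names are decoded as ISO-8859-1. The class `DictionarySketch` is hypothetical and not part of HPSF.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class DictionarySketch {
    /** Parses a dictionary starting at the given offset (illustration only). */
    static Map<Long, String> parse(byte[] src, int offset) {
        // Assumption: UInt values are 4 bytes, little-endian.
        ByteBuffer buf = ByteBuffer.wrap(src).order(ByteOrder.LITTLE_ENDIAN);
        buf.position(offset);

        long nrEntries = buf.getInt() & 0xFFFFFFFFL;   // header
        Map<Long, String> dictionary = new LinkedHashMap<>();
        for (long i = 0; i < nrEntries; i++) {
            long key = buf.getInt() & 0xFFFFFFFFL;     // the PID
            // Assumption: length includes the terminating 0x00 byte.
            int length = buf.getInt();
            byte[] name = new byte[length];
            buf.get(name);
            // Strip the terminating 0x00 byte(s) before decoding.
            int end = length;
            while (end > 0 && name[end - 1] == 0) {
                end--;
            }
            // Assumption: 8-bit names, decoded as ISO-8859-1.
            dictionary.put(key,
                    new String(name, 0, end, StandardCharsets.ISO_8859_1));
            // Entries are not aligned, so the next entry starts right here.
        }
        return dictionary;
    }
}
```

If the length field turned out not to count the terminator, the loop would lose track of the entry boundaries after the first entry, so that assumption should be checked against real property set streams.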

@ -20,11 +20,6 @@
easily writing summary information streams and document summary easily writing summary information streams and document summary
information streams. information streams.
</li> </li>
<li>
Add codepage support: Presently the bytes making out the string in a
property's value are interpreted using the platform's default character
set.
</li>
<li> <li>
Add resource bundles to Add resource bundles to
<code>org.apache.poi.hpsf.wellknown</code> to ease <code>org.apache.poi.hpsf.wellknown</code> to ease
@ -38,8 +33,8 @@
arrays. arrays.
</li> </li>
<li> <li>
Add WMF to <code>java.awt.Image</code> example code in <link Add WMF to <code>java.awt.Image</code> example code in the <link
href="thumbnails.html">Thumbnail HOW TO</link>. href="thumbnails.html">Thumbnail HOW-TO</link>.
</li> </li>
</ol> </ol>
</section> </section>

@ -558,7 +558,10 @@ public class CopyCompare
* exists. However, since we have full control about directory * exists. However, since we have full control about directory
* creation we can ensure that this will never happen. */ * creation we can ensure that this will never happen. */
ex.printStackTrace(System.err); ex.printStackTrace(System.err);
throw new RuntimeException(ex); throw new RuntimeException(ex.toString());
/* FIXME (2): Replace the previous line by the following once we
* no longer need JDK 1.3 compatibility. */
// throw new RuntimeException(ex);
} }
} }
} }

@ -444,7 +444,10 @@ public class WriteAuthorAndTitle
* exists. However, since we have full control about directory * exists. However, since we have full control about directory
* creation we can ensure that this will never happen. */ * creation we can ensure that this will never happen. */
ex.printStackTrace(System.err); ex.printStackTrace(System.err);
throw new RuntimeException(ex); throw new RuntimeException(ex.toString());
/* FIXME (2): Replace the previous line by the following once we
* no longer need JDK 1.3 compatibility. */
// throw new RuntimeException(ex);
} }
} }
} }

@ -80,19 +80,20 @@ public class MutableProperty extends Property
* <p>Writes the property to an output stream.</p> * <p>Writes the property to an output stream.</p>
* *
* @param out The output stream to write to. * @param out The output stream to write to.
* @param codepage The codepage to use for writing non-wide strings
* @return the number of bytes written to the stream * @return the number of bytes written to the stream
* *
* @exception IOException if an I/O error occurs * @exception IOException if an I/O error occurs
* @exception WritingNotSupportedException if a variant type is to be * @exception WritingNotSupportedException if a variant type is to be
* written that is not yet supported * written that is not yet supported
*/ */
public int write(final OutputStream out) public int write(final OutputStream out, final int codepage)
throws IOException, WritingNotSupportedException throws IOException, WritingNotSupportedException
{ {
int length = 0; int length = 0;
long variantType = getType(); long variantType = getType();
length += TypeWriter.writeUIntToStream(out, variantType); length += TypeWriter.writeUIntToStream(out, variantType);
length += VariantSupport.write(out, variantType, getValue()); length += VariantSupport.write(out, variantType, getValue(), codepage);
return length; return length;
} }

@ -420,16 +420,16 @@ public class MutableSection extends Section
/* If the property ID is not equal 0 we write the property and all /* If the property ID is not equal 0 we write the property and all
* is fine. However, if it equals 0 we have to write the section's * is fine. However, if it equals 0 we have to write the section's
* dictionary which does not have a type but just a value. */ * dictionary which has an implicit type only and an explicit
* value. */
if (id != 0) if (id != 0)
/* Write the property and update the position to the next /* Write the property and update the position to the next
* property. */ * property. */
position += p.write(propertyStream); position += p.write(propertyStream, getCodepage());
else else
{ {
final Integer codepage = final int codepage = getCodepage();
(Integer) getProperty(PropertyIDMap.PID_CODEPAGE); if (codepage == -1)
if (codepage == null)
throw new IllegalPropertySetDataException throw new IllegalPropertySetDataException
("Codepage (property 1) is undefined."); ("Codepage (property 1) is undefined.");
position += writeDictionary(propertyStream, dictionary); position += writeDictionary(propertyStream, dictionary);

@ -62,9 +62,11 @@
*/ */
package org.apache.poi.hpsf; package org.apache.poi.hpsf;
import java.io.UnsupportedEncodingException;
import java.util.HashMap; import java.util.HashMap;
import java.util.Map; import java.util.Map;
import org.apache.poi.util.HexDump;
import org.apache.poi.util.LittleEndian; import org.apache.poi.util.LittleEndian;
/** /**
@ -161,9 +163,13 @@ public class Property
* @param length The property's type/value pair's length in bytes. * @param length The property's type/value pair's length in bytes.
* @param codepage The section's and thus the property's * @param codepage The section's and thus the property's
* codepage. It is needed only when reading string values. * codepage. It is needed only when reading string values.
*
* @exception UnsupportedEncodingException if the specified codepage is not
* supported
*/ */
public Property(final long id, final byte[] src, final long offset, public Property(final long id, final byte[] src, final long offset,
final int length, final int codepage) final int length, final int codepage)
throws UnsupportedEncodingException
{ {
this.id = id; this.id = id;
@ -183,7 +189,7 @@ public class Property
try try
{ {
value = VariantSupport.read(src, o, length, (int) type); value = VariantSupport.read(src, o, length, (int) type, codepage);
} }
catch (UnsupportedVariantTypeException ex) catch (UnsupportedVariantTypeException ex)
{ {
@ -382,8 +388,27 @@ public class Property
b.append(getID()); b.append(getID());
b.append(", type: "); b.append(", type: ");
b.append(getType()); b.append(getType());
final Object value = getValue();
b.append(", value: "); b.append(", value: ");
b.append(getValue()); b.append(value.toString());
if (value instanceof String)
{
final String s = (String) value;
final int l = s.length();
final byte[] bytes = new byte[l * 2];
for (int i = 0; i < l; i++)
{
final char c = s.charAt(i);
final byte high = (byte) ((c & 0x00ff00) >> 8);
final byte low = (byte) ((c & 0x0000ff) >> 0);
bytes[i * 2] = high;
bytes[i * 2 + 1] = low;
}
final String hex = HexDump.dump(bytes, 0L, 0);
b.append(" [");
b.append(hex);
b.append("]");
}
b.append(']'); b.append(']');
return b.toString(); return b.toString();
} }
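The new `toString()` code above splits each character of a string value into a high and a low byte before dumping them. The same idea can be sketched without POI's `HexDump` class; the class `CharHexSketch` and its `toHex` method are hypothetical and use only `String.format` from the standard library.

```java
class CharHexSketch {
    /** Renders each char of s as two hex bytes, high byte first. */
    static String toHex(String s) {
        StringBuilder b = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            int high = (c >> 8) & 0xFF;  // upper 8 bits of the 16-bit char
            int low = c & 0xFF;          // lower 8 bits
            if (i > 0) {
                b.append(' ');
            }
            b.append(String.format("%02X %02X", high, low));
        }
        return b.toString();
    }
}
```

Such a dump makes it easy to see whether a string was decoded with the wrong codepage, because the wrong characters show up as unexpected byte pairs.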

@ -56,6 +56,7 @@ package org.apache.poi.hpsf;
import java.io.IOException; import java.io.IOException;
import java.io.InputStream; import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList; import java.util.ArrayList;
import java.util.List; import java.util.List;
@ -300,9 +301,11 @@ public class PropertySet
* @param length The length of the stream data. * @param length The length of the stream data.
* @throws NoPropertySetStreamException if the byte array is not a * @throws NoPropertySetStreamException if the byte array is not a
* property set stream. * property set stream.
*
* @exception UnsupportedEncodingException if the codepage is not supported
*/ */
public PropertySet(final byte[] stream, final int offset, final int length) public PropertySet(final byte[] stream, final int offset, final int length)
throws NoPropertySetStreamException throws NoPropertySetStreamException, UnsupportedEncodingException
{ {
if (isPropertySetStream(stream, offset, length)) if (isPropertySetStream(stream, offset, length))
init(stream, offset, length); init(stream, offset, length);
@ -321,8 +324,11 @@ public class PropertySet
* complete byte array contents is the stream data. * complete byte array contents is the stream data.
* @throws NoPropertySetStreamException if the byte array is not a * @throws NoPropertySetStreamException if the byte array is not a
* property set stream. * property set stream.
*
* @exception UnsupportedEncodingException if the codepage is not supported
*/ */
public PropertySet(final byte[] stream) throws NoPropertySetStreamException public PropertySet(final byte[] stream)
throws NoPropertySetStreamException, UnsupportedEncodingException
{ {
this(stream, 0, stream.length); this(stream, 0, stream.length);
} }
@ -435,6 +441,7 @@ public class PropertySet
* @param length Length of the property set stream. * @param length Length of the property set stream.
*/ */
private void init(final byte[] src, final int offset, final int length) private void init(final byte[] src, final int offset, final int length)
throws UnsupportedEncodingException
{ {
/* FIXME (3): Ensure that at most "length" bytes are read. */ /* FIXME (3): Ensure that at most "length" bytes are read. */
@ -651,7 +658,7 @@ public class PropertySet
final PropertySet ps = (PropertySet) o; final PropertySet ps = (PropertySet) o;
int byteOrder1 = ps.getByteOrder(); int byteOrder1 = ps.getByteOrder();
int byteOrder2 = getByteOrder(); int byteOrder2 = getByteOrder();
ClassID classId1 = ps.getClassID(); ClassID classID1 = ps.getClassID();
ClassID classID2 = getClassID(); ClassID classID2 = getClassID();
int format1 = ps.getFormat(); int format1 = ps.getFormat();
int format2 = getFormat(); int format2 = getFormat();
@ -660,7 +667,7 @@ public class PropertySet
int sectionCount1 = ps.getSectionCount(); int sectionCount1 = ps.getSectionCount();
int sectionCount2 = getSectionCount(); int sectionCount2 = getSectionCount();
if (byteOrder1 != byteOrder2 || if (byteOrder1 != byteOrder2 ||
!classId1.equals(classID2) || !classID1.equals(classID2) ||
format1 != format2 || format1 != format2 ||
osVersion1 != osVersion2 || osVersion1 != osVersion2 ||
sectionCount1 != sectionCount2) sectionCount1 != sectionCount2)

@ -54,6 +54,7 @@
*/ */
package org.apache.poi.hpsf; package org.apache.poi.hpsf;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList; import java.util.ArrayList;
import java.util.Collections; import java.util.Collections;
import java.util.Iterator; import java.util.Iterator;
@ -193,8 +194,12 @@ public class Section
* @param src Contains the complete property set stream. * @param src Contains the complete property set stream.
* @param offset The position in the stream that points to the * @param offset The position in the stream that points to the
* section's format ID. * section's format ID.
*
* @exception UnsupportedEncodingException if the section's codepage is not
* supported.
*/ */
public Section(final byte[] src, final int offset) public Section(final byte[] src, final int offset)
throws UnsupportedEncodingException
{ {
int o1 = offset; int o1 = offset;
@ -638,4 +643,18 @@ public class Section
return dictionary; return dictionary;
} }
/**
* <p>Gets the section's codepage, if any.</p>
*
* @return The section's codepage if one is defined, else -1.
*/
public int getCodepage()
{
final Integer codepage =
(Integer) getProperty(PropertyIDMap.PID_CODEPAGE);
return codepage != null ? codepage.intValue() : -1;
}
} }

@ -185,7 +185,8 @@ public class TypeWriter
* @exception IOException if an I/O error occurs * @exception IOException if an I/O error occurs
*/ */
public static void writeToStream(final OutputStream out, public static void writeToStream(final OutputStream out,
final Property[] properties) final Property[] properties,
final int codepage)
throws IOException, UnsupportedVariantTypeException throws IOException, UnsupportedVariantTypeException
{ {
/* If there are no properties don't write anything. */ /* If there are no properties don't write anything. */
@ -207,7 +208,7 @@ public class TypeWriter
final Property p = (Property) properties[i]; final Property p = (Property) properties[i];
long type = p.getType(); long type = p.getType();
writeUIntToStream(out, type); writeUIntToStream(out, type);
VariantSupport.write(out, (int) type, p.getValue()); VariantSupport.write(out, (int) type, p.getValue(), codepage);
} }
} }

@ -64,6 +64,7 @@ package org.apache.poi.hpsf;
import java.io.IOException; import java.io.IOException;
import java.io.OutputStream; import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.util.Date; import java.util.Date;
import java.util.LinkedList; import java.util.LinkedList;
import java.util.List; import java.util.List;
@ -163,17 +164,21 @@ public class VariantSupport extends Variant
* @param length The length of the variant including the variant * @param length The length of the variant including the variant
* type field * type field
* @param type The variant type to read * @param type The variant type to read
* @param codepage The codepage to use to read non-wide strings
* @return A Java object that corresponds best to the variant * @return A Java object that corresponds best to the variant
* field. For example, a VT_I4 is returned as a {@link Long}, a * field. For example, a VT_I4 is returned as a {@link Long}, a
* VT_LPSTR as a {@link String}. * VT_LPSTR as a {@link String}.
* @exception ReadingNotSupportedException if a property is to be read * @exception ReadingNotSupportedException if a property is to be read
* whose variant type HPSF does not yet support * whose variant type HPSF does not yet support
* @exception UnsupportedEncodingException if the specified codepage is not
* supported
* *
* @see Variant * @see Variant
*/ */
public static Object read(final byte[] src, final int offset, public static Object read(final byte[] src, final int offset,
final int length, final long type) final int length, final long type,
throws ReadingNotSupportedException final int codepage)
throws ReadingNotSupportedException, UnsupportedEncodingException
{ {
Object value; Object value;
int o1 = offset; int o1 = offset;
@ -221,18 +226,18 @@ public class VariantSupport extends Variant
* Read a byte string. In Java it is represented as a * Read a byte string. In Java it is represented as a
* String object. The 0x00 bytes at the end must be * String object. The 0x00 bytes at the end must be
* stripped. * stripped.
*
* FIXME (2): Reading an 8-bit string should pay attention
* to the codepage. Currently the byte making out the
* property's value are interpreted according to the
* platform's default character set.
*/ */
final int first = o1 + LittleEndian.INT_SIZE; final int first = o1 + LittleEndian.INT_SIZE;
long last = first + LittleEndian.getUInt(src, o1) - 1; long last = first + LittleEndian.getUInt(src, o1) - 1;
o1 += LittleEndian.INT_SIZE; o1 += LittleEndian.INT_SIZE;
final int rawLength = (int) (last - first + 1);
while (src[(int) last] == 0 && first <= last) while (src[(int) last] == 0 && first <= last)
last--; last--;
value = new String(src, (int) first, (int) (last - first + 1)); final int l = (int) (last - first + 1);
value = codepage != -1 ?
new String(src, (int) first, l,
codepageToEncoding(codepage)) :
new String(src, (int) first, l);
break; break;
} }
case Variant.VT_LPWSTR: case Variant.VT_LPWSTR:
@ -298,6 +303,38 @@ public class VariantSupport extends Variant
/**
* <p>Turns a codepage number into the equivalent character encoding's
* name.</p>
*
* @param codepage The codepage number
*
* @return The character encoding's name. If the codepage number is 1200,
* the encoding name is "UTF-16"; if it is 65001, the encoding name is
* "UTF-8". All other positive numbers are mapped to "cp" followed by the
* number, e.g. if the codepage number is 1252 the returned character
* encoding name will be "cp1252".
*
* @exception UnsupportedEncodingException if the specified codepage is
* less than or equal to zero.
*/
public static String codepageToEncoding(final int codepage)
throws UnsupportedEncodingException
{
if (codepage <= 0)
throw new UnsupportedEncodingException
("Codepage number may not be " + codepage);
switch (codepage)
{
case 1200:
return "UTF-16";
case 65001:
return "UTF-8";
default:
return "cp" + codepage;
}
}
/** /**
* <p>Writes a variant value to an output stream. This method ensures that * <p>Writes a variant value to an output stream. This method ensures that
* always a multiple of 4 bytes is written.</p> * always a multiple of 4 bytes is written.</p>
@ -305,6 +342,7 @@ public class VariantSupport extends Variant
* @param out The stream to write the value to. * @param out The stream to write the value to.
* @param type The variant's type. * @param type The variant's type.
* @param value The variant's value. * @param value The variant's value.
* @param codepage The codepage to use to write non-wide strings
* @return The number of entities that have been written. In many cases an * @return The number of entities that have been written. In many cases an
* "entity" is a byte but this is not always the case. * "entity" is a byte but this is not always the case.
* @exception IOException if an I/O exception occurs * @exception IOException if an I/O exception occurs
@ -312,7 +350,7 @@ public class VariantSupport extends Variant
* whose variant type HPSF does not yet support * whose variant type HPSF does not yet support
*/ */
public static int write(final OutputStream out, final long type, public static int write(final OutputStream out, final long type,
final Object value) final Object value, final int codepage)
throws IOException, WritingNotSupportedException throws IOException, WritingNotSupportedException
{ {
int length = 0; int length = 0;
@ -330,16 +368,13 @@ public class VariantSupport extends Variant
} }
case Variant.VT_LPSTR: case Variant.VT_LPSTR:
{ {
length = TypeWriter.writeUIntToStream final byte[] bytes =
(out, ((String) value).length() + 1); (codepage == -1 ?
char[] s = Util.pad4((String) value); ((String) value).getBytes() :
/* FIXME (2): The following line forces characters to bytes. ((String) value).getBytes(codepageToEncoding(codepage)));
* This is generally wrong and should only be done according to length = TypeWriter.writeUIntToStream(out, bytes.length + 1);
* a codepage. Alternatively Unicode could be written (see final byte[] b = new byte[bytes.length + 1];
* Variant.VT_LPWSTR). */ System.arraycopy(bytes, 0, b, 0, bytes.length);
byte[] b = new byte[s.length + 1];
for (int i = 0; i < s.length; i++)
b[i] = (byte) s[i];
b[b.length - 1] = 0x00; b[b.length - 1] = 0x00;
out.write(b); out.write(b);
length += b.length; length += b.length;
@ -419,12 +454,13 @@ public class VariantSupport extends Variant
} }
} }
/* Add 0x00 character to write a multiple of four bytes: */ /* Add 0x00 characters to write a multiple of four bytes: */
while (length % 4 != 0) // FIXME (1) Try this!
{ // while (length % 4 != 0)
out.write(0); // {
length++; // out.write(0);
} // length++;
// }
return length; return length;
} }

@ -357,7 +357,10 @@ public class TestWrite extends TestCase
catch (Exception ex) catch (Exception ex)
{ {
ex.printStackTrace(); ex.printStackTrace();
throw new RuntimeException(ex); throw new RuntimeException(ex.toString());
/* FIXME (2): Replace the previous line by the following
* one once we no longer need JDK 1.3 compatibility. */
// throw new RuntimeException(ex);
} }
} }
}, },
@ -398,37 +401,40 @@ public class TestWrite extends TestCase
public void testVariantTypes() public void testVariantTypes()
{ {
Throwable t = null; Throwable t = null;
final int codepage = -1;
/* FIXME (2): Add tests for various codepages! */
try try
{ {
check(Variant.VT_EMPTY, null); check(Variant.VT_EMPTY, null, codepage);
check(Variant.VT_BOOL, new Boolean(true)); check(Variant.VT_BOOL, new Boolean(true), codepage);
check(Variant.VT_BOOL, new Boolean(false)); check(Variant.VT_BOOL, new Boolean(false), codepage);
check(Variant.VT_CF, new byte[]{0}); check(Variant.VT_CF, new byte[]{0}, codepage);
check(Variant.VT_CF, new byte[]{0, 1}); check(Variant.VT_CF, new byte[]{0, 1}, codepage);
check(Variant.VT_CF, new byte[]{0, 1, 2}); check(Variant.VT_CF, new byte[]{0, 1, 2}, codepage);
check(Variant.VT_CF, new byte[]{0, 1, 2, 3}); check(Variant.VT_CF, new byte[]{0, 1, 2, 3}, codepage);
check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4}); check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4}, codepage);
check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5}); check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5}, codepage);
check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}); check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
check(Variant.VT_I2, new Integer(27)); codepage);
check(Variant.VT_I4, new Long(28)); check(Variant.VT_I2, new Integer(27), codepage);
check(Variant.VT_FILETIME, new Date()); check(Variant.VT_I4, new Long(28), codepage);
check(Variant.VT_LPSTR, ""); check(Variant.VT_FILETIME, new Date(), codepage);
check(Variant.VT_LPSTR, "ä"); check(Variant.VT_LPSTR, "", codepage);
check(Variant.VT_LPSTR, "äö"); check(Variant.VT_LPSTR, "ä", codepage);
check(Variant.VT_LPSTR, "äöü"); check(Variant.VT_LPSTR, "äö", codepage);
check(Variant.VT_LPSTR, "äöüÄ"); check(Variant.VT_LPSTR, "äöü", codepage);
check(Variant.VT_LPSTR, "äöüÄÖ"); check(Variant.VT_LPSTR, "äöüÄ", codepage);
check(Variant.VT_LPSTR, "äöüÄÖÜ"); check(Variant.VT_LPSTR, "äöüÄÖ", codepage);
check(Variant.VT_LPSTR, "äöüÄÖÜß"); check(Variant.VT_LPSTR, "äöüÄÖÜ", codepage);
check(Variant.VT_LPWSTR, ""); check(Variant.VT_LPSTR, "äöüÄÖÜß", codepage);
check(Variant.VT_LPWSTR, "ä"); check(Variant.VT_LPWSTR, "", codepage);
check(Variant.VT_LPWSTR, "äö"); check(Variant.VT_LPWSTR, "ä", codepage);
check(Variant.VT_LPWSTR, "äöü"); check(Variant.VT_LPWSTR, "äö", codepage);
check(Variant.VT_LPWSTR, "äöüÄ"); check(Variant.VT_LPWSTR, "äöü", codepage);
check(Variant.VT_LPWSTR, "äöüÄÖ"); check(Variant.VT_LPWSTR, "äöüÄ", codepage);
check(Variant.VT_LPWSTR, "äöüÄÖÜ"); check(Variant.VT_LPWSTR, "äöüÄÖ", codepage);
check(Variant.VT_LPWSTR, "äöüÄÖÜß"); check(Variant.VT_LPWSTR, "äöüÄÖÜ", codepage);
check(Variant.VT_LPWSTR, "äöüÄÖÜß", codepage);
} }
catch (Exception ex) catch (Exception ex)
{ {
@ -466,20 +472,22 @@ public class TestWrite extends TestCase
* @throws UnsupportedVariantTypeException if the variant is not supported. * @throws UnsupportedVariantTypeException if the variant is not supported.
* @throws IOException if an I/O exception occurs. * @throws IOException if an I/O exception occurs.
*/ */
private void check(final long variantType, final Object value) private void check(final long variantType, final Object value,
final int codepage)
throws UnsupportedVariantTypeException, IOException throws UnsupportedVariantTypeException, IOException
{ {
final ByteArrayOutputStream out = new ByteArrayOutputStream(); final ByteArrayOutputStream out = new ByteArrayOutputStream();
VariantSupport.write(out, variantType, value); VariantSupport.write(out, variantType, value, codepage);
out.close(); out.close();
final byte[] b = out.toByteArray(); final byte[] b = out.toByteArray();
final Object objRead = final Object objRead =
VariantSupport.read(b, 0, b.length + LittleEndian.INT_SIZE, VariantSupport.read(b, 0, b.length + LittleEndian.INT_SIZE,
variantType); variantType, -1);
if (objRead instanceof byte[]) if (objRead instanceof byte[])
{ {
final int diff = diff(org.apache.poi.hpsf.Util.pad4 // final int diff = diff(org.apache.poi.hpsf.Util.pad4
((byte[]) value), (byte[]) objRead); // ((byte[]) value), (byte[]) objRead);
final int diff = diff((byte[]) value, (byte[]) objRead);
if (diff >= 0) if (diff >= 0)
fail("Byte arrays are different. First different byte is at " + fail("Byte arrays are different. First different byte is at " +
"index " + diff + "."); "index " + diff + ".");