HPSF: codepage support added
git-svn-id: https://svn.apache.org/repos/asf/jakarta/poi/trunk@353460 13f79535-47bb-0310-9956-ffa450edef68
parent
6385296f3f
commit
131bb9d0bd
@ -12,7 +12,11 @@
|
||||
<person id="MJ" name="Marc Johnson" email="mjohnson@apache.org"/>
|
||||
<person id="NKB" name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/>
|
||||
<person id="POI-DEVELOPERS" name="POI Developers" email="poi-dev@jakarta.apache.org"/>
|
||||
<person id="RK" name="Rainer Klute" email="klute@apache.org"/>
|
||||
</devs>
|
||||
<release version="2.0-pre3" date="unreleased">
|
||||
<action dev="RK" type="add">HPSF: Much better codepage support</action>
|
||||
</release>
|
||||
<release version="2.0-pre1" date="unreleased">
|
||||
<action dev="POI-DEVELOPERS" type="add">Patch applied for deep cloning of worksheets was provided</action>
|
||||
<action dev="POI-DEVELOPERS" type="add">Patch applied to allow sheet reordering</action>
|
||||
|
@ -708,8 +708,9 @@ No property set stream: "/1Table"</source>
|
||||
<td>The property's value is the number of a <strong>codepage</strong>,
|
||||
i.e. a mapping from character codes to characters. All strings in the
|
||||
section containing this property must be interpreted using this
|
||||
codepage. Typical property values are 1252 (8-bit "western" characters)
|
||||
or 1200 (16-bit Unicode characters).</td>
|
||||
codepage. Typical property values are 1252 (8-bit "western" characters,
|
||||
ISO-8859-1), 1200 (16-bit Unicode characters, UTF-16), or 65001 (8-bit
|
||||
Unicode characters, UTF-8).</td>
|
||||
</tr>
|
||||
</table>
|
||||
</section>
|
||||
@ -833,18 +834,34 @@ No property set stream: "/1Table"</source>
|
||||
</section>
|
||||
|
||||
<section><title>Codepage support</title>
|
||||
<fixme author="Rainer Klute">Improve codepage support!</fixme>
|
||||
|
||||
<p>The property with ID 1 holds the number of the codepage which was used
|
||||
to encode the strings in this section. The present HPSF codepage support
|
||||
is still very limited: When reading property value strings, HPSF
|
||||
distinguishes between 16-bit characters and 8-bit characters. 16-bit
|
||||
characters should be Unicode characters and thus be okay. 8-bit
|
||||
characters are interpreted according to the platform's default character
|
||||
set. This is fine as long as the document being read has been written on
|
||||
a platform with the same default character set. However, if you receive a
|
||||
document from another region of the world and want to process it with
|
||||
HPSF you are in trouble - unless the creator used Unicode, of course.</p>
|
||||
to encode the strings in this section. If this property is not available
|
||||
in a section, the platform's default character encoding will be
|
||||
used. This works fine as long as the document being read has been written
|
||||
on a platform with the same default character encoding. However, if you
|
||||
receive a document from another region of the world and the codepage is
|
||||
undefined, you are in trouble.</p>
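Why an undefined codepage is a problem can be illustrated with a minimal sketch. It is not part of HPSF; the class name `DefaultEncodingPitfall` and the helper `decode` are hypothetical. The same byte yields different characters depending on which character encoding the reader assumes:

```java
import java.nio.charset.Charset;

/** Hypothetical illustration, not part of HPSF. */
class DefaultEncodingPitfall
{
    /** Decodes the bytes using the named character encoding. */
    static String decode(final byte[] bytes, final String encoding)
    {
        return new String(bytes, Charset.forName(encoding));
    }

    public static void main(final String[] args)
    {
        /* A single byte as it might appear in an 8-bit property string. */
        final byte[] raw = {(byte) 0x80};

        /* Windows codepage 1252 maps 0x80 to the euro sign (U+20AC) ... */
        final String western = decode(raw, "windows-1252");
        /* ... while ISO-8859-1 maps it to the control character U+0080. */
        final String latin1 = decode(raw, "ISO-8859-1");

        System.out.println("Same text? " + western.equals(latin1));
    }
}
```

A reader that falls back to its platform default encoding therefore recovers the writer's text only if both platforms happen to share that default.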
|
||||
|
||||
<p>HPSF's codepage support is as good as the character encoding support of
|
||||
the Java Virtual Machine (JVM) the application runs on. If HPSF
|
||||
encounters a codepage number it assumes that the JVM has a character
|
||||
encoding with a corresponding name. For example, if the codepage is 1252,
|
||||
HPSF uses the character encoding "cp1252" to read or write strings. If
|
||||
the JVM does not have that character encoding installed or if the
|
||||
codepage number is illegal, an UnsupportedEncodingException will be
|
||||
thrown.</p>
|
||||
|
||||
<p>There are two exceptions to the rule that a character encoding's name
|
||||
is derived from the codepage number by prepending the string "cp" to
|
||||
it:</p>
|
||||
|
||||
<dl>
|
||||
<dt>Codepage 1200</dt>
|
||||
<dd>is mapped to the character encoding "UTF-16".</dd>
|
||||
<dt>Codepage 65001</dt>
|
||||
<dd>is mapped to the character encoding "UTF-8".</dd>
|
||||
</dl>
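The naming rule and its two exceptions can be sketched as follows. This is a simplified illustration under stated assumptions, not HPSF's own code: the class `CodepageNameSketch` and its methods are hypothetical, and it throws unchecked exceptions where HPSF is described as throwing an UnsupportedEncodingException.

```java
import java.nio.charset.Charset;

/** Hypothetical sketch of the codepage-to-encoding rule described above. */
class CodepageNameSketch
{
    /** Maps a codepage number to a character encoding name. */
    static String encodingName(final int codepage)
    {
        if (codepage <= 0)
            throw new IllegalArgumentException
                ("Codepage number may not be " + codepage);
        switch (codepage)
        {
            case 1200:
                return "UTF-16";        /* first exception to the rule */
            case 65001:
                return "UTF-8";         /* second exception to the rule */
            default:
                return "cp" + codepage; /* general rule: prepend "cp" */
        }
    }

    /** Decodes 8-bit string data according to the given codepage. */
    static String decode(final byte[] bytes, final int codepage)
    {
        /* Charset.forName throws an unchecked exception if the JVM
         * does not provide the requested encoding. */
        return new String(bytes, Charset.forName(encodingName(codepage)));
    }

    public static void main(final String[] args)
    {
        /* In codepage 1252 these bytes are U+00E4, U+00F6 and U+00FC. */
        final byte[] raw = {(byte) 0xE4, (byte) 0xF6, (byte) 0xFC};
        System.out.println(encodingName(1252) + ": " + decode(raw, 1252));
    }
}
```

Whether a name such as "cp1252" resolves depends on the character encodings the JVM ships with, which is why the sketch leaves the lookup to the JVM instead of keeping its own table.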
|
||||
</section>
|
||||
</section>
|
||||
|
||||
|
@ -944,6 +944,60 @@
|
||||
|
||||
|
||||
|
||||
<section>
|
||||
<title>The Dictionary</title>
|
||||
|
||||
<p>What a dictionary is good for is explained in the <link
|
||||
href="how-to.html">HPSF HOW-TO</link>. This chapter explains how it is
|
||||
organized internally.</p>
|
||||
|
||||
<p>The dictionary has a simple header consisting of a single UInt value. It
|
||||
tells how many entries the dictionary comprises:</p>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Name</th>
|
||||
<th>Data type</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>nrEntries</td>
|
||||
<td>UInt</td>
|
||||
<td>Number of dictionary entries</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<p>The dictionary entries follow the header. Each one looks like this:</p>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<th>Name</th>
|
||||
<th>Data type</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>key</td>
|
||||
<td>UInt</td>
|
||||
<td>The unique number of this property, i.e. the PID</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>length</td>
|
||||
<td>UInt</td>
|
||||
<td>The length of the property name associated with the key</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>value</td>
|
||||
<td>String</td>
|
||||
<td>The property's name, terminated with a 0x00 character</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<p>The entries are not aligned, i.e. each one follows its predecessor
|
||||
without any gap or fill characters.</p>
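The layout described above can be read with a few lines of code. The following is a minimal sketch, not HPSF's implementation. It assumes that a UInt is a 4-byte little-endian unsigned integer, that the length field counts bytes and includes the terminating 0x00, and that names are stored in an 8-bit character encoding. The class `DictionarySketch` and its methods are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of reading a dictionary; not part of HPSF. */
class DictionarySketch
{
    /** Parses header and entries into a map from PID to property name. */
    static Map<Long, String> parse(final byte[] src, final Charset charset)
    {
        final ByteBuffer in =
            ByteBuffer.wrap(src).order(ByteOrder.LITTLE_ENDIAN);
        /* Header: a single UInt holding the number of entries. */
        final long nrEntries = in.getInt() & 0xFFFFFFFFL;
        final Map<Long, String> dictionary = new LinkedHashMap<>();
        for (long i = 0; i < nrEntries; i++)
        {
            final long key = in.getInt() & 0xFFFFFFFFL; /* the PID */
            final int length = in.getInt();  /* assumed to include 0x00 */
            final byte[] name = new byte[length];
            in.get(name);
            /* Drop the terminating 0x00 before decoding. The next entry
             * follows immediately, without padding. */
            dictionary.put(key,
                new String(name, 0, Math.max(0, length - 1), charset));
        }
        return dictionary;
    }

    /** Builds a one-entry dictionary mapping PID 2 to "Color". */
    static byte[] sample()
    {
        final byte[] name = "Color".getBytes(StandardCharsets.US_ASCII);
        final ByteBuffer out = ByteBuffer.allocate(12 + name.length + 1)
            .order(ByteOrder.LITTLE_ENDIAN);
        out.putInt(1);               /* nrEntries */
        out.putInt(2);               /* key */
        out.putInt(name.length + 1); /* length including terminator */
        out.put(name);
        out.put((byte) 0x00);        /* terminator */
        return out.array();
    }

    public static void main(final String[] args)
    {
        System.out.println(parse(sample(), StandardCharsets.US_ASCII));
    }
}
```

Because the entries are not aligned, the parser simply continues at the byte after each terminator; no skipping of fill bytes is needed under these assumptions.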
|
||||
</section>
|
||||
|
||||
|
||||
|
||||
<section><title>References</title>
|
||||
|
||||
<p>In order to assemble the HPSF description I used information publicly
|
||||
|
@ -21,25 +21,20 @@
|
||||
information streams.
|
||||
</li>
|
||||
<li>
|
||||
Add codepage support: Presently the bytes making out the string in a
|
||||
property's value are interpreted using the platform's default character
|
||||
set.
|
||||
</li>
|
||||
<li>
|
||||
Add resource bundles to
|
||||
<code>org.apache.poi.hpsf.wellknown</code> to ease
|
||||
localizations. This would be useful for mapping standard property IDs to
|
||||
localized strings. Example: The property ID 4 could be mapped to "Author"
|
||||
in English or "Verfasser" in German.
|
||||
Add resource bundles to
|
||||
<code>org.apache.poi.hpsf.wellknown</code> to ease
|
||||
localizations. This would be useful for mapping standard property IDs to
|
||||
localized strings. Example: The property ID 4 could be mapped to "Author"
|
||||
in English or "Verfasser" in German.
|
||||
</li>
|
||||
<li>
|
||||
Implement reading functionality for those property types that are not
|
||||
yet supported. HPSF should return proper Java types instead of just byte
|
||||
arrays.
|
||||
yet supported. HPSF should return proper Java types instead of just byte
|
||||
arrays.
|
||||
</li>
|
||||
<li>
|
||||
Add WMF to <code>java.awt.Image</code> example code in <link
|
||||
href="thumbnails.html">Thumbnail HOW TO</link>.
|
||||
Add WMF to <code>java.awt.Image</code> example code in the <link
|
||||
href="thumbnails.html">Thumbnail HOW-TO</link>.
|
||||
</li>
|
||||
</ol>
|
||||
</section>
|
||||
|
@ -558,7 +558,10 @@ public class CopyCompare
|
||||
* exists. However, since we have full control over directory
|
||||
* creation we can ensure that this will never happen. */
|
||||
ex.printStackTrace(System.err);
|
||||
throw new RuntimeException(ex);
|
||||
throw new RuntimeException(ex.toString());
|
||||
/* FIXME (2): Replace the previous line by the following once we
|
||||
* no longer need JDK 1.3 compatibility. */
|
||||
// throw new RuntimeException(ex);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -444,7 +444,10 @@ public class WriteAuthorAndTitle
|
||||
* exists. However, since we have full control over directory
|
||||
* creation we can ensure that this will never happen. */
|
||||
ex.printStackTrace(System.err);
|
||||
throw new RuntimeException(ex);
|
||||
throw new RuntimeException(ex.toString());
|
||||
/* FIXME (2): Replace the previous line by the following once we
|
||||
* no longer need JDK 1.3 compatibility. */
|
||||
// throw new RuntimeException(ex);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -80,19 +80,20 @@ public class MutableProperty extends Property
|
||||
* <p>Writes the property to an output stream.</p>
|
||||
*
|
||||
* @param out The output stream to write to.
|
||||
* @param codepage The codepage to use for writing non-wide strings
|
||||
* @return the number of bytes written to the stream
|
||||
*
|
||||
* @exception IOException if an I/O error occurs
|
||||
* @exception WritingNotSupportedException if a variant type is to be
|
||||
* written that is not yet supported
|
||||
*/
|
||||
public int write(final OutputStream out)
|
||||
public int write(final OutputStream out, final int codepage)
|
||||
throws IOException, WritingNotSupportedException
|
||||
{
|
||||
int length = 0;
|
||||
long variantType = getType();
|
||||
length += TypeWriter.writeUIntToStream(out, variantType);
|
||||
length += VariantSupport.write(out, variantType, getValue());
|
||||
length += VariantSupport.write(out, variantType, getValue(), codepage);
|
||||
return length;
|
||||
}
|
||||
|
||||
|
@ -420,16 +420,16 @@ public class MutableSection extends Section
|
||||
|
||||
/* If the property ID is not equal to 0 we write the property and all
|
||||
* is fine. However, if it equals 0 we have to write the section's
|
||||
* dictionary which does not have a type but just a value. */
|
||||
* dictionary which has an implicit type only and an explicit
|
||||
* value. */
|
||||
if (id != 0)
|
||||
/* Write the property and update the position to the next
|
||||
* property. */
|
||||
position += p.write(propertyStream);
|
||||
position += p.write(propertyStream, getCodepage());
|
||||
else
|
||||
{
|
||||
final Integer codepage =
|
||||
(Integer) getProperty(PropertyIDMap.PID_CODEPAGE);
|
||||
if (codepage == null)
|
||||
final int codepage = getCodepage();
|
||||
if (codepage == -1)
|
||||
throw new IllegalPropertySetDataException
|
||||
("Codepage (property 1) is undefined.");
|
||||
position += writeDictionary(propertyStream, dictionary);
|
||||
|
@ -62,9 +62,11 @@
|
||||
*/
|
||||
package org.apache.poi.hpsf;
|
||||
|
||||
import java.io.UnsupportedEncodingException;
|
||||
import java.util.HashMap;
|
||||
import java.util.Map;
|
||||
|
||||
import org.apache.poi.util.HexDump;
|
||||
import org.apache.poi.util.LittleEndian;
|
||||
|
||||
/**
|
||||
@ -161,9 +163,13 @@ public class Property
|
||||
* @param length The property's type/value pair's length in bytes.
|
||||
* @param codepage The section's and thus the property's
|
||||
* codepage. It is needed only when reading string values.
|
||||
*
|
||||
* @exception UnsupportedEncodingException if the specified codepage is not
|
||||
* supported
|
||||
*/
|
||||
public Property(final long id, final byte[] src, final long offset,
|
||||
final int length, final int codepage)
|
||||
throws UnsupportedEncodingException
|
||||
{
|
||||
this.id = id;
|
||||
|
||||
@ -183,7 +189,7 @@ public class Property
|
||||
|
||||
try
|
||||
{
|
||||
value = VariantSupport.read(src, o, length, (int) type);
|
||||
value = VariantSupport.read(src, o, length, (int) type, codepage);
|
||||
}
|
||||
catch (UnsupportedVariantTypeException ex)
|
||||
{
|
||||
@ -382,8 +388,27 @@ public class Property
|
||||
b.append(getID());
|
||||
b.append(", type: ");
|
||||
b.append(getType());
|
||||
final Object value = getValue();
|
||||
b.append(", value: ");
|
||||
b.append(getValue());
|
||||
b.append(value.toString());
|
||||
if (value instanceof String)
|
||||
{
|
||||
final String s = (String) value;
|
||||
final int l = s.length();
|
||||
final byte[] bytes = new byte[l * 2];
|
||||
for (int i = 0; i < l; i++)
|
||||
{
|
||||
final char c = s.charAt(i);
|
||||
final byte high = (byte) ((c & 0x00ff00) >> 8);
|
||||
final byte low = (byte) ((c & 0x0000ff) >> 0);
|
||||
bytes[i * 2] = high;
|
||||
bytes[i * 2 + 1] = low;
|
||||
}
|
||||
final String hex = HexDump.dump(bytes, 0L, 0);
|
||||
b.append(" [");
|
||||
b.append(hex);
|
||||
b.append("]");
|
||||
}
|
||||
b.append(']');
|
||||
return b.toString();
|
||||
}
|
||||
|
@ -56,6 +56,7 @@ package org.apache.poi.hpsf;
|
||||
|
||||
import java.io.IOException;
|
||||
import java.io.InputStream;
|
||||
import java.io.UnsupportedEncodingException;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
|
||||
@ -300,9 +301,11 @@ public class PropertySet
|
||||
* @param length The length of the stream data.
|
||||
* @throws NoPropertySetStreamException if the byte array is not a
|
||||
* property set stream.
|
||||
*
|
||||
* @exception UnsupportedEncodingException if the codepage is not supported
|
||||
*/
|
||||
public PropertySet(final byte[] stream, final int offset, final int length)
|
||||
throws NoPropertySetStreamException
|
||||
throws NoPropertySetStreamException, UnsupportedEncodingException
|
||||
{
|
||||
if (isPropertySetStream(stream, offset, length))
|
||||
init(stream, offset, length);
|
||||
@ -321,8 +324,11 @@ public class PropertySet
|
||||
* complete byte array contents is the stream data.
|
||||
* @throws NoPropertySetStreamException if the byte array is not a
|
||||
* property set stream.
|
||||
*
|
||||
* @exception UnsupportedEncodingException if the codepage is not supported
|
||||
*/
|
||||
public PropertySet(final byte[] stream) throws NoPropertySetStreamException
|
||||
public PropertySet(final byte[] stream)
|
||||
throws NoPropertySetStreamException, UnsupportedEncodingException
|
||||
{
|
||||
this(stream, 0, stream.length);
|
||||
}
|
||||
@ -435,6 +441,7 @@ public class PropertySet
|
||||
* @param length Length of the property set stream.
|
||||
*/
|
||||
private void init(final byte[] src, final int offset, final int length)
|
||||
throws UnsupportedEncodingException
|
||||
{
|
||||
/* FIXME (3): Ensure that at most "length" bytes are read. */
|
||||
|
||||
@ -651,7 +658,7 @@ public class PropertySet
|
||||
final PropertySet ps = (PropertySet) o;
|
||||
int byteOrder1 = ps.getByteOrder();
|
||||
int byteOrder2 = getByteOrder();
|
||||
ClassID classId1 = ps.getClassID();
|
||||
ClassID classID1 = ps.getClassID();
|
||||
ClassID classID2 = getClassID();
|
||||
int format1 = ps.getFormat();
|
||||
int format2 = getFormat();
|
||||
@ -660,7 +667,7 @@ public class PropertySet
|
||||
int sectionCount1 = ps.getSectionCount();
|
||||
int sectionCount2 = getSectionCount();
|
||||
if (byteOrder1 != byteOrder2 ||
|
||||
!classId1.equals(classID2) ||
|
||||
!classID1.equals(classID2) ||
|
||||
format1 != format2 ||
|
||||
osVersion1 != osVersion2 ||
|
||||
sectionCount1 != sectionCount2)
|
||||
|
@ -54,6 +54,7 @@
|
||||
*/
|
||||
package org.apache.poi.hpsf;
|
||||
|
||||
import java.io.UnsupportedEncodingException;
|
||||
import java.util.ArrayList;
|
||||
import java.util.Collections;
|
||||
import java.util.Iterator;
|
||||
@ -193,8 +194,12 @@ public class Section
|
||||
* @param src Contains the complete property set stream.
|
||||
* @param offset The position in the stream that points to the
|
||||
* section's format ID.
|
||||
*
|
||||
* @exception UnsupportedEncodingException if the section's codepage is not
|
||||
* supported.
|
||||
*/
|
||||
public Section(final byte[] src, final int offset)
|
||||
throws UnsupportedEncodingException
|
||||
{
|
||||
int o1 = offset;
|
||||
|
||||
@ -638,4 +643,18 @@ public class Section
|
||||
return dictionary;
|
||||
}
|
||||
|
||||
|
||||
|
||||
/**
|
||||
* <p>Gets the section's codepage, if any.</p>
|
||||
*
|
||||
* @return The section's codepage if one is defined, else -1.
|
||||
*/
|
||||
public int getCodepage()
|
||||
{
|
||||
final Integer codepage =
|
||||
(Integer) getProperty(PropertyIDMap.PID_CODEPAGE);
|
||||
return codepage != null ? codepage.intValue() : -1;
|
||||
}
|
||||
|
||||
}
|
||||
|
@ -185,7 +185,8 @@ public class TypeWriter
|
||||
* @exception IOException if an I/O error occurs
|
||||
*/
|
||||
public static void writeToStream(final OutputStream out,
|
||||
final Property[] properties)
|
||||
final Property[] properties,
|
||||
final int codepage)
|
||||
throws IOException, UnsupportedVariantTypeException
|
||||
{
|
||||
/* If there are no properties don't write anything. */
|
||||
@ -207,7 +208,7 @@ public class TypeWriter
|
||||
final Property p = (Property) properties[i];
|
||||
long type = p.getType();
|
||||
writeUIntToStream(out, type);
|
||||
VariantSupport.write(out, (int) type, p.getValue());
|
||||
VariantSupport.write(out, (int) type, p.getValue(), codepage);
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -64,6 +64,7 @@ package org.apache.poi.hpsf;
|
||||
|
||||
import java.io.IOException;
|
||||
import java.io.OutputStream;
|
||||
import java.io.UnsupportedEncodingException;
|
||||
import java.util.Date;
|
||||
import java.util.LinkedList;
|
||||
import java.util.List;
|
||||
@ -163,17 +164,21 @@ public class VariantSupport extends Variant
|
||||
* @param length The length of the variant including the variant
|
||||
* type field
|
||||
* @param type The variant type to read
|
||||
* @param codepage The codepage to use to read non-wide strings
|
||||
* @return A Java object that corresponds best to the variant
|
||||
* field. For example, a VT_I4 is returned as a {@link Long}, a
|
||||
* VT_LPSTR as a {@link String}.
|
||||
* @exception ReadingNotSupportedException if a property is to be read
|
||||
* whose variant type HPSF does not yet support
|
||||
* @exception UnsupportedEncodingException if the specified codepage is not
|
||||
* supported
|
||||
*
|
||||
* @see Variant
|
||||
*/
|
||||
public static Object read(final byte[] src, final int offset,
|
||||
final int length, final long type)
|
||||
throws ReadingNotSupportedException
|
||||
final int length, final long type,
|
||||
final int codepage)
|
||||
throws ReadingNotSupportedException, UnsupportedEncodingException
|
||||
{
|
||||
Object value;
|
||||
int o1 = offset;
|
||||
@ -221,18 +226,18 @@ public class VariantSupport extends Variant
|
||||
* Read a byte string. In Java it is represented as a
|
||||
* String object. The 0x00 bytes at the end must be
|
||||
* stripped.
|
||||
*
|
||||
* FIXME (2): Reading an 8-bit string should pay attention
|
||||
* to the codepage. Currently the byte making out the
|
||||
* property's value are interpreted according to the
|
||||
* platform's default character set.
|
||||
*/
|
||||
final int first = o1 + LittleEndian.INT_SIZE;
|
||||
long last = first + LittleEndian.getUInt(src, o1) - 1;
|
||||
o1 += LittleEndian.INT_SIZE;
|
||||
final int rawLength = (int) (last - first + 1);
|
||||
while (first <= last && src[(int) last] == 0)
|
||||
last--;
|
||||
value = new String(src, (int) first, (int) (last - first + 1));
|
||||
final int l = (int) (last - first + 1);
|
||||
value = codepage != -1 ?
|
||||
new String(src, (int) first, l,
|
||||
codepageToEncoding(codepage)) :
|
||||
new String(src, (int) first, l);
|
||||
break;
|
||||
}
|
||||
case Variant.VT_LPWSTR:
|
||||
@ -298,6 +303,38 @@ public class VariantSupport extends Variant
|
||||
|
||||
|
||||
|
||||
/**
|
||||
* <p>Turns a codepage number into the equivalent character encoding's
|
||||
* name.</p>
|
||||
*
|
||||
* @param codepage The codepage number
|
||||
*
|
||||
* @return The character encoding's name. If the codepage number is 1200,
|
||||
* the encoding name is "UTF-16"; if it is 65001, the encoding name is
|
||||
* "UTF-8". All other positive numbers are mapped to "cp" followed by the
|
||||
* number, e.g. codepage 1252 yields the encoding name "cp1252".
|
||||
*
|
||||
* @exception UnsupportedEncodingException if the specified codepage is
|
||||
* less than or equal to zero.
|
||||
*/
|
||||
public static String codepageToEncoding(final int codepage)
|
||||
throws UnsupportedEncodingException
|
||||
{
|
||||
if (codepage <= 0)
|
||||
throw new UnsupportedEncodingException
|
||||
("Codepage number may not be " + codepage);
|
||||
switch (codepage)
|
||||
{
|
||||
case 1200:
|
||||
return "UTF-16";
|
||||
case 65001:
|
||||
return "UTF-8";
|
||||
default:
|
||||
return "cp" + codepage;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* <p>Writes a variant value to an output stream. This method ensures that
|
||||
* always a multiple of 4 bytes is written.</p>
|
||||
@ -305,6 +342,7 @@ public class VariantSupport extends Variant
|
||||
* @param out The stream to write the value to.
|
||||
* @param type The variant's type.
|
||||
* @param value The variant's value.
|
||||
* @param codepage The codepage to use to write non-wide strings
|
||||
* @return The number of entities that have been written. In many cases an
|
||||
* "entity" is a byte but this is not always the case.
|
||||
* @exception IOException if an I/O exception occurs
|
||||
@ -312,7 +350,7 @@ public class VariantSupport extends Variant
|
||||
* whose variant type HPSF does not yet support
|
||||
*/
|
||||
public static int write(final OutputStream out, final long type,
|
||||
final Object value)
|
||||
final Object value, final int codepage)
|
||||
throws IOException, WritingNotSupportedException
|
||||
{
|
||||
int length = 0;
|
||||
@ -330,16 +368,13 @@ public class VariantSupport extends Variant
|
||||
}
|
||||
case Variant.VT_LPSTR:
|
||||
{
|
||||
length = TypeWriter.writeUIntToStream
|
||||
(out, ((String) value).length() + 1);
|
||||
char[] s = Util.pad4((String) value);
|
||||
/* FIXME (2): The following line forces characters to bytes.
|
||||
* This is generally wrong and should only be done according to
|
||||
* a codepage. Alternatively Unicode could be written (see
|
||||
* Variant.VT_LPWSTR). */
|
||||
byte[] b = new byte[s.length + 1];
|
||||
for (int i = 0; i < s.length; i++)
|
||||
b[i] = (byte) s[i];
|
||||
final byte[] bytes =
|
||||
(codepage == -1 ?
|
||||
((String) value).getBytes() :
|
||||
((String) value).getBytes(codepageToEncoding(codepage)));
|
||||
length = TypeWriter.writeUIntToStream(out, bytes.length + 1);
|
||||
final byte[] b = new byte[bytes.length + 1];
|
||||
System.arraycopy(bytes, 0, b, 0, bytes.length);
|
||||
b[b.length - 1] = 0x00;
|
||||
out.write(b);
|
||||
length += b.length;
|
||||
@ -419,12 +454,13 @@ public class VariantSupport extends Variant
|
||||
}
|
||||
}
|
||||
|
||||
/* Add 0x00 character to write a multiple of four bytes: */
|
||||
while (length % 4 != 0)
|
||||
{
|
||||
out.write(0);
|
||||
length++;
|
||||
}
|
||||
/* Add 0x00 characters to write a multiple of four bytes: */
|
||||
// FIXME (1) Try this!
|
||||
// while (length % 4 != 0)
|
||||
// {
|
||||
// out.write(0);
|
||||
// length++;
|
||||
// }
|
||||
return length;
|
||||
}
|
||||
|
||||
|
@ -357,7 +357,10 @@ public class TestWrite extends TestCase
|
||||
catch (Exception ex)
|
||||
{
|
||||
ex.printStackTrace();
|
||||
throw new RuntimeException(ex);
|
||||
throw new RuntimeException(ex.toString());
|
||||
/* FIXME (2): Replace the previous line by the following
|
||||
* one once we no longer need JDK 1.3 compatibility. */
|
||||
// throw new RuntimeException(ex);
|
||||
}
|
||||
}
|
||||
},
|
||||
@ -398,37 +401,40 @@ public class TestWrite extends TestCase
|
||||
public void testVariantTypes()
|
||||
{
|
||||
Throwable t = null;
|
||||
final int codepage = -1;
|
||||
/* FIXME (2): Add tests for various codepages! */
|
||||
try
|
||||
{
|
||||
check(Variant.VT_EMPTY, null);
|
||||
check(Variant.VT_BOOL, new Boolean(true));
|
||||
check(Variant.VT_BOOL, new Boolean(false));
|
||||
check(Variant.VT_CF, new byte[]{0});
|
||||
check(Variant.VT_CF, new byte[]{0, 1});
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2});
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2, 3});
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4});
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5});
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
|
||||
check(Variant.VT_I2, new Integer(27));
|
||||
check(Variant.VT_I4, new Long(28));
|
||||
check(Variant.VT_FILETIME, new Date());
|
||||
check(Variant.VT_LPSTR, "");
|
||||
check(Variant.VT_LPSTR, "ä");
|
||||
check(Variant.VT_LPSTR, "äö");
|
||||
check(Variant.VT_LPSTR, "äöü");
|
||||
check(Variant.VT_LPSTR, "äöüÄ");
|
||||
check(Variant.VT_LPSTR, "äöüÄÖ");
|
||||
check(Variant.VT_LPSTR, "äöüÄÖÜ");
|
||||
check(Variant.VT_LPSTR, "äöüÄÖÜß");
|
||||
check(Variant.VT_LPWSTR, "");
|
||||
check(Variant.VT_LPWSTR, "ä");
|
||||
check(Variant.VT_LPWSTR, "äö");
|
||||
check(Variant.VT_LPWSTR, "äöü");
|
||||
check(Variant.VT_LPWSTR, "äöüÄ");
|
||||
check(Variant.VT_LPWSTR, "äöüÄÖ");
|
||||
check(Variant.VT_LPWSTR, "äöüÄÖÜ");
|
||||
check(Variant.VT_LPWSTR, "äöüÄÖÜß");
|
||||
check(Variant.VT_EMPTY, null, codepage);
|
||||
check(Variant.VT_BOOL, new Boolean(true), codepage);
|
||||
check(Variant.VT_BOOL, new Boolean(false), codepage);
|
||||
check(Variant.VT_CF, new byte[]{0}, codepage);
|
||||
check(Variant.VT_CF, new byte[]{0, 1}, codepage);
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2}, codepage);
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2, 3}, codepage);
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4}, codepage);
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5}, codepage);
|
||||
check(Variant.VT_CF, new byte[]{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
|
||||
codepage);
|
||||
check(Variant.VT_I2, new Integer(27), codepage);
|
||||
check(Variant.VT_I4, new Long(28), codepage);
|
||||
check(Variant.VT_FILETIME, new Date(), codepage);
|
||||
check(Variant.VT_LPSTR, "", codepage);
|
||||
check(Variant.VT_LPSTR, "ä", codepage);
|
||||
check(Variant.VT_LPSTR, "äö", codepage);
|
||||
check(Variant.VT_LPSTR, "äöü", codepage);
|
||||
check(Variant.VT_LPSTR, "äöüÄ", codepage);
|
||||
check(Variant.VT_LPSTR, "äöüÄÖ", codepage);
|
||||
check(Variant.VT_LPSTR, "äöüÄÖÜ", codepage);
|
||||
check(Variant.VT_LPSTR, "äöüÄÖÜß", codepage);
|
||||
check(Variant.VT_LPWSTR, "", codepage);
|
||||
check(Variant.VT_LPWSTR, "ä", codepage);
|
||||
check(Variant.VT_LPWSTR, "äö", codepage);
|
||||
check(Variant.VT_LPWSTR, "äöü", codepage);
|
||||
check(Variant.VT_LPWSTR, "äöüÄ", codepage);
|
||||
check(Variant.VT_LPWSTR, "äöüÄÖ", codepage);
|
||||
check(Variant.VT_LPWSTR, "äöüÄÖÜ", codepage);
|
||||
check(Variant.VT_LPWSTR, "äöüÄÖÜß", codepage);
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
@ -466,20 +472,22 @@ public class TestWrite extends TestCase
|
||||
* @throws UnsupportedVariantTypeException if the variant is not supported.
|
||||
* @throws IOException if an I/O exception occurs.
|
||||
*/
|
||||
private void check(final long variantType, final Object value)
|
||||
private void check(final long variantType, final Object value,
|
||||
final int codepage)
|
||||
throws UnsupportedVariantTypeException, IOException
|
||||
{
|
||||
final ByteArrayOutputStream out = new ByteArrayOutputStream();
|
||||
VariantSupport.write(out, variantType, value);
|
||||
VariantSupport.write(out, variantType, value, codepage);
|
||||
out.close();
|
||||
final byte[] b = out.toByteArray();
|
||||
final Object objRead =
|
||||
VariantSupport.read(b, 0, b.length + LittleEndian.INT_SIZE,
|
||||
variantType);
|
||||
variantType, -1);
|
||||
if (objRead instanceof byte[])
|
||||
{
|
||||
final int diff = diff(org.apache.poi.hpsf.Util.pad4
|
||||
((byte[]) value), (byte[]) objRead);
|
||||
// final int diff = diff(org.apache.poi.hpsf.Util.pad4
|
||||
// ((byte[]) value), (byte[]) objRead);
|
||||
final int diff = diff((byte[]) value, (byte[]) objRead);
|
||||
if (diff >= 0)
|
||||
fail("Byte arrays are different. First different byte is at " +
|
||||
"index " + diff + ".");
|
||||
|
BIN
src/testcases/org/apache/poi/hpsf/data/TestChineseProperties.doc
Normal file
Binary file not shown.