Commit Graph

662 Commits

Author SHA1 Message Date
Maxim Valyanskiy
22730f9a12 HWPF: better fix for TextPieceTable.getCharIndex()
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@960922 13f79535-47bb-0310-9956-ffa450edef68
2010-07-06 15:45:36 +00:00
Maxim Valyanskiy
78b0c18ade HWPF: Improve reading of auto-saved ("complex") document
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@960587 13f79535-47bb-0310-9956-ffa450edef68
2010-07-05 12:56:02 +00:00
Nick Burch
256e73d16d More Word 6 / Word 95 Support
HWPFOldDocument now processes a few more table sections, and so we can fake up some basic Ranges. This allows us to do paragraph level text extraction
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@960102 13f79535-47bb-0310-9956-ffa450edef68
2010-07-02 20:59:30 +00:00
Nick Burch
01ec911b74 Fix generics warnings
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@960094 13f79535-47bb-0310-9956-ffa450edef68
2010-07-02 20:01:42 +00:00
Nick Burch
6ee6d9095f Enable Word6Extractor in ExtractorFactory
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@959360 13f79535-47bb-0310-9956-ffa450edef68
2010-06-30 16:08:10 +00:00
Nick Burch
30848a80aa Basic text extraction support for old Word 6 and Word 95 documents via some HWPF extensions
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@959346 13f79535-47bb-0310-9956-ffa450edef68
2010-06-30 15:13:10 +00:00
Nick Burch
0910eb1ab5 Fix generics warnings
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@959335 13f79535-47bb-0310-9956-ffa450edef68
2010-06-30 14:41:03 +00:00
Nick Burch
ad33151624 Better handling of Outlook messages in HSMF when there's no recipient email address
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@954476 13f79535-47bb-0310-9956-ffa450edef68
2010-06-14 13:47:22 +00:00
Nick Burch
05ddf6a51e Fix for bug #48245 - tweak HWPF table cell detection to work across more files
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@953694 13f79535-47bb-0310-9956-ffa450edef68
2010-06-11 13:29:44 +00:00
Nick Burch
bf4e6ff464 Add additional RevisionMarkAuthorTable test
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@953343 13f79535-47bb-0310-9956-ffa450edef68
2010-06-10 15:02:05 +00:00
Yegor Kozlov
8c4341facf cleaned javadoc warnings
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@951920 13f79535-47bb-0310-9956-ffa450edef68
2010-06-06 18:19:08 +00:00
Nick Burch
d29d1d7d9b Apply with tweaks the patch from bug #45269 - improve replaceText on HWPF ranges
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@951498 13f79535-47bb-0310-9956-ffa450edef68
2010-06-04 17:19:31 +00:00
Nick Burch
45c4b6bf8f Tweak @link reference to avoid compiler issues
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@951055 13f79535-47bb-0310-9956-ffa450edef68
2010-06-03 16:23:40 +00:00
Nick Burch
f9fa636e6d Remove un-used imports
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@951053 13f79535-47bb-0310-9956-ffa450edef68
2010-06-03 16:21:41 +00:00
Nick Burch
65d7431a9f Parse the HSMF headers chunk if present, and use it to find Dates in text extraction if needed
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@951034 13f79535-47bb-0310-9956-ffa450edef68
2010-06-03 15:33:54 +00:00
Nick Burch
cee16bc83b List attachment names in the output of OutlookTextExtractor (to get attachment contents, use ExtractorFactory as normal)
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@950595 13f79535-47bb-0310-9956-ffa450edef68
2010-06-02 15:24:11 +00:00
Yegor Kozlov
6ee427ddf9 fixed construction of the DIB picture header, see Bugzilla 43161
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@949483 13f79535-47bb-0310-9956-ffa450edef68
2010-05-30 06:56:32 +00:00
Yegor Kozlov
55c924c5d2 removed deprecation warnings to keep javac quiet
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@949434 13f79535-47bb-0310-9956-ffa450edef68
2010-05-29 18:31:04 +00:00
Nick Burch
0df94e6be8 Apply patch from bug #48924 - Allow access of the HWPF DateAndTime underlying date values
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@948455 13f79535-47bb-0310-9956-ffa450edef68
2010-05-26 14:40:25 +00:00
Nick Burch
6666c539da Add a simple testcase for the new RevisionMarkAuthorTable.java
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@948445 13f79535-47bb-0310-9956-ffa450edef68
2010-05-26 14:22:49 +00:00
Nick Burch
9798e24fd2 Apply patch from bug #48926 - Initial support for the HWPF revision marks authors list
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@948435 13f79535-47bb-0310-9956-ffa450edef68
2010-05-26 14:17:15 +00:00
Nick Burch
4c1d86e5de Apply patches from Peter Kutak from bugs 49334 and 49242 - HSSFChart improvements by tracking more records
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@948080 13f79535-47bb-0310-9956-ffa450edef68
2010-05-25 15:59:26 +00:00
Nick Burch
a3899a57d8 Resolve bug #49139 - don't assume that the block size is always 512 bytes. Instead of hard coding this value in, pass around the new POIFSBigBlockSize object that holds the size and various helper subsizes. Should now be possible to open 4k block files without error.
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@937834 13f79535-47bb-0310-9956-ffa450edef68
2010-04-25 17:35:56 +00:00
Yegor Kozlov
fe048df54e Fixed locale-sensitive formatters in PackagePropertiesPart, see Bugzilla 49138
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@935896 13f79535-47bb-0310-9956-ffa450edef68
2010-04-20 12:57:27 +00:00
Nick Burch
f1371227be Remove old .cvsignore files
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@933963 13f79535-47bb-0310-9956-ffa450edef68
2010-04-14 14:11:13 +00:00
Maxim Valyanskiy
fc53ead4ca bugfix: ClassCastException in PicturesTable.getAllPictures(): UnknownEscherRecord cannot be cast to EscherBlipRecord
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@931111 13f79535-47bb-0310-9956-ffa450edef68
2010-04-06 12:12:03 +00:00
Yegor Kozlov
639bf94c6f propagate parent to parent-aware records decoded from Escher, also ensure that TextShape and EscherTextboxWrapper hold the same cached sets of records, see Bugzilla 48916
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@930525 13f79535-47bb-0310-9956-ffa450edef68
2010-04-03 14:44:39 +00:00
Nick Burch
918f1a496d Fix an issue with the HSMF tests working on some machines but not others - Make poifs.filesystem.DirectoryNode preserve the original ordering of its files, which HSMF needs to be able to correctly match up chunks
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@911878 13f79535-47bb-0310-9956-ffa450edef68
2010-02-19 17:55:32 +00:00
Nick Burch
943d3d19e1 Add a disabled test for bug #44501, which still remains, plus fix a generics warning
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@904062 13f79535-47bb-0310-9956-ffa450edef68
2010-01-28 12:28:29 +00:00
Nick Burch
9bbf3ef4d0 Fix generics warnings, and fix up tests to handle the extra bit of text being extracted now
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@904060 13f79535-47bb-0310-9956-ffa450edef68
2010-01-28 12:20:32 +00:00
Nick Burch
3aef368b71 Apply patch from Jukka from bug #43670 to improve HDGF v11 Separator detection, and handle short strings better, hopefully solving the Negative length of ChunkHeader issue
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@904052 13f79535-47bb-0310-9956-ffa450edef68
2010-01-28 12:05:13 +00:00
Nick Burch
2880d934f9 Improve error message, and fix generics warnings
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@904049 13f79535-47bb-0310-9956-ffa450edef68
2010-01-28 12:00:38 +00:00
Maxim Valyanskiy
ed3cae95f8 PowerPoint OLEShape: extract last version of embedded ole object
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@901215 13f79535-47bb-0310-9956-ffa450edef68
2010-01-20 14:33:58 +00:00
Nick Burch
545f2e1119 Improved how HSMF handles multiple recipients
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@898295 13f79535-47bb-0310-9956-ffa450edef68
2010-01-12 12:02:18 +00:00
Nick Burch
6e97a360a3 Add PublisherTextExtractor support to ExtractorFactory
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897887 13f79535-47bb-0310-9956-ffa450edef68
2010-01-11 14:55:43 +00:00
Nick Burch
5621bb0800 Make it possible to return null on missing chunks, rather than the exception
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897847 13f79535-47bb-0310-9956-ffa450edef68
2010-01-11 12:19:42 +00:00
Nick Burch
5ad8301c2a Add embedded (attachment) support to the outlook text extractor
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897258 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 16:44:08 +00:00
Nick Burch
98cea49eb5 Rename the outlook extractor to be more consistent with other extractors
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897249 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 16:18:52 +00:00
Nick Burch
cefe4e1d28 Wire up the new HSMFTextExtractor to the ExtractorFactory
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897246 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 16:14:27 +00:00
Nick Burch
bd2f63c721 Add a text extractor to HSMF for simpler extraction of text from .msg files
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897242 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 16:04:15 +00:00
Nick Burch
a6e7575999 Fix generics warnings
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897239 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 16:02:35 +00:00
Nick Burch
7ae86fab09 More work on the recipient related chunks, including a helper method to do best-effort finding of the recipient's email address
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897213 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 14:26:27 +00:00
Nick Burch
52695c1811 Quick bit of refactoring to save parsing the type and id twice
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897205 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 13:49:09 +00:00
Nick Burch
ff94e5c61b Support fetching the message date from the submission id
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897201 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 13:42:53 +00:00
Nick Burch
58806414fc Tweak a few tests, and add in a few more chunk types
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897185 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 12:08:35 +00:00
Nick Burch
795ed3ce55 Complete chunk parser tests, and make more chunk groups available
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897172 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 11:37:37 +00:00
Nick Burch
0e368a23da Fix some chunk types, fix the directory descent, fix the Msg2txt example, and start on fixing core tests
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@897167 13f79535-47bb-0310-9956-ffa450edef68
2010-01-08 11:14:58 +00:00
Nick Burch
6afb781730 Shuffle where some of the HSMF tests live to better match package names, and stub out a few more tests
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@896923 13f79535-47bb-0310-9956-ffa450edef68
2010-01-07 16:47:09 +00:00
Nick Burch
2bb376f55b Start on major HSMF refactoring. Should compile, but not quite all tests pass as a little bit of work is left
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@896914 13f79535-47bb-0310-9956-ffa450edef68
2010-01-07 16:15:20 +00:00
Nick Burch
e5884f2f66 Add a couple more HSMF chunk types, and use Generics in a few places
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@896868 13f79535-47bb-0310-9956-ffa450edef68
2010-01-07 12:56:39 +00:00