Can POIFS convert PDF to OLE

by Helmut Ziegler

Hi,

I want to embed PDFs programmatically in a Word 2003 XML file.
This Word 2003 XML file serves as a kind of compound document.

I've done this with Word by hand (wrote some text and imported two PDF files into a Word document). It seems that one big OLE object is created for the PDF documents.
I thought there must be an API to accomplish this task, but I didn't find any.
As the POIFS documentation says that it can work with OLE objects, I wonder whether I can use it to generate the OLE object.

Can any of you tell me if POIFS can do that?

Cheers,
Helmut


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@...
For additional commands, e-mail: user-help@...


Re: Can POIFS convert PDF to OLE

by Helmut Ziegler

Hi,

Hm, maybe no one knows an answer to this question.

I decoded the OLE objects of the imported PDFs piece by piece.
But I still run into several problems doing the round trip (i.e. building the compound file programmatically). I might be able to solve some of them, but I don't know if all are manageable.

Does anyone have another suggestion for how to generate Word files that contain several other files?

Cheers,
Helmut



-------- Original Message --------
> Date: Mon, 21 Jul 2008 14:54:26 +0200
> From: "Helmut Ziegler" <scruffytech@...>
> To: user@...
> Subject: Can POIFS convert PDF to OLE


Re: Can POIFS convert PDF to OLE

by David Fisher

Hi,

Maybe no one quite understood your question. Let me see if I understand
it and can restate it in a way that people understand.

Do you want to create a Word document that carries one or more PDFs
as file attachments?

or,

Do you want to create a compound document that has a Word file and a
PDF rendition of that file?

or,

Are you inserting the PDFs as pictures - viewing the first page of
each PDF in the Word document?

Regards,
Dave

On Jul 23, 2008, at 1:43 PM, Helmut Ziegler wrote:

> Does anyone have another suggestion for how to generate Word
> files that contain several other files?


Re: Can POIFS convert PDF to OLE

by Helmut Ziegler

Hi David,


> Do you want to create a Word document that carries one or more PDFs
> as file attachments?

Yes, I mean this one ;-)
Actually, the Word document should also carry other documents, such as other Word files.

Cheers,
Helmut




Re: Can POIFS convert PDF to OLE

by Nick Burch

On Thu, 24 Jul 2008, Helmut Ziegler wrote:
> Actually the Word document should also carry other documents like other
> word files.

I'd suggest dumping out the stream(s), and looking at them with things
like org.apache.poi.poifs.dev.POIFSViewer.

Start by seeing if you can change one bit of one file in the poifs stream,
and have the change noticed. If that works, but adding a new poifs stream
doesn't, then there are extra things in the poifs stream that need to be
set up. I think you're probably going to need to run diff quite a bit,
across two files (one that works, one that doesn't) and see what's
different.
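
For the "run diff across two files" step, a plain text diff is of limited use on binary data. The following is a minimal sketch, not part of POI, that compares two files block by block and reports which blocks differ. The class name BlockDiff and the block size of 512 bytes are assumptions made for illustration; 512 bytes is the usual sector size of OLE2 compound files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: lists the indices of fixed-size blocks that differ.
class BlockDiff {

    static List<Integer> differingBlocks(byte[] a, byte[] b, int blockSize) {
        List<Integer> result = new ArrayList<>();
        int longest = Math.max(a.length, b.length);
        for (int start = 0, index = 0; start < longest; start += blockSize, index++) {
            // A block that is missing or shorter in one file counts as different.
            if (!Arrays.equals(slice(a, start, blockSize), slice(b, start, blockSize))) {
                result.add(index);
            }
        }
        return result;
    }

    private static byte[] slice(byte[] data, int start, int length) {
        if (start >= data.length) {
            return new byte[0];
        }
        return Arrays.copyOfRange(data, start, Math.min(data.length, start + length));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: BlockDiff <working file> <broken file>");
            return;
        }
        byte[] working = Files.readAllBytes(Paths.get(args[0]));
        byte[] broken = Files.readAllBytes(Paths.get(args[1]));
        for (int block : differingBlocks(working, broken, 512)) {
            System.out.printf("block %d (offset 0x%X) differs%n", block, block * 512);
        }
    }
}
```

Knowing which blocks differ narrows down where to look with a hex editor or with POIFSViewer.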

Nick



Re: Can POIFS convert PDF to OLE

by Helmut Ziegler

Hi Nick,

Thanks for your response!
I didn't use POIFSViewer, but I now know the structure of my PDF OLE object. Unfortunately this isn't enough ...

Here is what I did:

First of all I created a Word 2003 XML file with Word and imported a PDF file. The PDF is recognized as a package (not as a PDF file) because there was no program to handle PDF files on that computer.
These are the important parts:
<w:docOleData>
<w:binData w:name="oledata.mso">
0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/
...
</w:binData></w:docOleData>

<o:OLEObject Type="Embed" ProgID="Package" ShapeID="_x0000_i1025" DrawAspect="Content" ObjectID="_1277043057"/>

In the Word XML file the OLE object is Base64-encoded.
I decoded it and wrote a binary file (OleObject.bin) that I inspected (first with 7-Zip, later with POIFS).
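
The decoding step can be sketched with the Java standard library alone. The example below decodes only the first line of the payload quoted above and checks it against the 8-byte signature that every OLE2 compound file starts with. The class and method names are made up for illustration, and extracting the text of <w:binData> from the XML is left out.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper for the Base64 step; not part of POI.
class OleDataDecoder {

    // The first 8 bytes of an OLE2 compound file.
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Decodes the text content of <w:binData>. The MIME decoder is used
    // because it skips the line breaks inside the element.
    static byte[] decode(String binDataText) {
        return Base64.getMimeDecoder().decode(binDataText);
    }

    static boolean looksLikeOle2(byte[] data) {
        return data.length >= OLE2_MAGIC.length
            && Arrays.equals(Arrays.copyOf(data, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    public static void main(String[] args) {
        // Only the first line of the payload shown in this message.
        byte[] head = decode("0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/");
        System.out.println("OLE2 signature present: " + looksLikeOle2(head));
    }
}
```

If the check succeeds, the decoded bytes can be written to disk as OleObject.bin and opened with POIFS.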

The structure of OleObject.bin is as follows:
+ Root entry
++ _1277043057
+++[3]ObjInfo
+++[1]Ole10Native
+++[1]Ole
+++[1]CompObj

Ole10Native represents my PDF with a custom header that Word attached.
To get to this content I had to:
1. Create a POIFSFileSystem based on OleObject.bin
2. Get the entry "_1277043057" and write it to the hard disk (as "_1277043057").
3. Strip the first 4 bytes of "_1277043057"
4. Use the inflate algorithm to decompress it as "_1277043057_decompressed"
5. Create a POIFSFileSystem again, based on the decompressed "_1277043057_decompressed"
6. Write the contents listed above to the hard disk.
==> I could then open my PDF file.
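
Steps 3 and 4 need nothing beyond java.util.zip. The sketch below assumes the data after the 4-byte prefix is a zlib-wrapped deflate stream; if it turns out to be raw deflate, new Inflater(true) would be needed instead. The class name and the sampleStream helper are invented for this example, and the helper only produces synthetic test data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for steps 3 and 4; not part of POI.
class OleStreamUnpacker {

    // Number of leading bytes stripped in step 3.
    static final int PREFIX_LENGTH = 4;

    // Skips the prefix and inflates the remainder of the stream.
    static byte[] unpack(byte[] stream) {
        if (stream.length <= PREFIX_LENGTH) {
            throw new IllegalArgumentException("stream too short");
        }
        // Assumption: zlib-wrapped deflate data.
        Inflater inflater = new Inflater();
        inflater.setInput(stream, PREFIX_LENGTH, stream.length - PREFIX_LENGTH);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Builds a synthetic stream for testing: 4 placeholder bytes plus zlib data.
    static byte[] sampleStream(byte[] payload) {
        Deflater deflater = new Deflater();
        deflater.setInput(payload);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(new byte[PREFIX_LENGTH], 0, PREFIX_LENGTH);
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] payload = "stand-in for the inner compound file".getBytes(StandardCharsets.US_ASCII);
        byte[] restored = unpack(sampleStream(payload));
        System.out.println("round trip ok: " + Arrays.equals(payload, restored));
    }
}
```

The result of unpack is what step 5 feeds into a second POIFSFileSystem.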

So far, so good. Then I tried it the other way round. After packaging the content again, I tried to open the file in Word, and Word complained that it can't open the file because
"The server application, the source file, or the element wasn't found" (this is only a translation).
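
For the reverse direction, the compression step might look like the sketch below. The thread does not say what the 4 stripped bytes contain, so they are passed in by the caller; for a first experiment they could be copied from the stream Word wrote, which is only valid if their meaning does not depend on the payload. The zlib wrapper and the default compression level are assumptions too, and the class name is made up.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical helper for the repackaging direction; not part of POI.
class OleStreamPacker {

    // Deflates the compound file and puts a caller-supplied 4-byte prefix in front.
    static byte[] pack(byte[] prefix, byte[] compoundFile) {
        if (prefix.length != 4) {
            throw new IllegalArgumentException("prefix must be 4 bytes");
        }
        // Assumption: zlib wrapper, default compression level.
        Deflater deflater = new Deflater();
        deflater.setInput(compoundFile);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(prefix, 0, prefix.length);
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] prefix = {0, 0, 0, 0}; // placeholder only, the real value is unknown here
        byte[] packed = pack(prefix, new byte[2048]);
        System.out.println("packed length: " + packed.length);
    }
}
```

A byte-for-byte match with Word's compressed stream should not be expected, since different deflate implementations can produce different output for the same input; what matters is that the inflated content is the same.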

Then I looked for the step that fails.
Steps 1 to 4 also worked in the other direction, but creating "_1277043057_decompressed" did not seem to work.
When I compared the original "_1277043057_decompressed" to the generated one, there were many similarities (file size and most of the content). But in the first part of the original file there is more information.
I had a look at it in a text editor. The information is some kind of metadata:
1. The alphabet
2. The structure of the OLE object. "R.o.o.t. .E.n.t.r.y .... O.l.e. ... C.o.m.p.O.b.j...."
3. The kind of OLE object "P.a.c.k.a.g.e"


Does anyone know how I can get this information into my file?

Cheers,
Helmut

P.S. The reverse engineering is based on this excellent article:
http://www.trustedsource.org/download/research_publications/CAlme_VBOct06.pdf



-------- Original Message --------
> Date: Thu, 24 Jul 2008 11:42:10 +0100 (BST)
> From: Nick Burch <nick@...>
> To: POI Users List <user@...>
> Subject: Re: Can POIFS convert PDF to OLE


Re: Can POIFS convert PDF to OLE

by Yury Batrakov

Hi Helmut,

As far as I remember, this is the OLE header. I decoded OLE embeds from
RTF and they looked similar to yours. The Microsoft RTF spec says:
"When the object is an OLE embedded or linked object, the data part of
the object is the structure produced by the OleSaveToStream function".
I tried to reverse-engineer the format and read Wine's source for
OleSaveToStream and OleLoadFromStream, but was soon defeated, as this
feature wasn't mandatory in our product.

I hope this helps you somehow. Good luck, and please keep this
mailing list posted on any progress.




Re: Can POIFS convert PDF to OLE

by Helmut Ziegler

Hi,

I haven't made progress, but I know a bit more about the "upper part" (see below).

> But in the first part of the original file there is more information.
> I had a look at it in a text editor. The information is some kind of
> metadata:
> 1. The alphabet
> 2. The structure of the ole object. "R.o.o.t. .E.n.t.r.y .... O.l.e. ...
> C.o.m.p.O.b.j...."
> 3. The kind of ole object "P.a.c.k.a.g.e"

The compound object "_1277043057_decompressed_generated" that I generated from
[3]ObjInfo
[1]Ole10Native
[1]Ole
[1]CompObj
has a slightly different structure than the original "_1277043057_decompressed_original".
The "metadata" is just in another place. I think it's the directory structure of the compound file and the other objects [3]ObjInfo, [1]Ole, [1]CompObj.

In the original file the structure is similar to this:
1. Part (512 bytes): Header (probably addresses of the other parts, with the rest padded)
2. Part (512 bytes): The alphabet in this form "A...B...C..."
3. Part (512 bytes): The directory structure "R.o.o.t. .E.n.t.r.y....O.b.j.I.n.f.o." (without Ole10Native!)
4. Part (512 bytes): Unknown block (maybe the first part signals the end of the directory structure, and the rest is padding)
5. Part (512 bytes): This seems to be the first content block, as it holds the content of [1]Ole, [3]ObjInfo and [1]CompObj (every content part is padded with "00").
6. Part (512 bytes): Here comes a directory structure again, but now only with "O.l.e.1.0.N.a.t.i.v.e"
7. Part (512 bytes): Unknown block (again, it may signal the end of the file structure)
8. Part (rest of file): The content of [1]Ole10Native ==> the PDF

The structure of the file that was generated using POIFS:
1. Part (512 bytes): Header (probably addresses of the other parts, with the rest padded)
2. Part (.... bytes): The content of [1]Ole10Native ==> the PDF
3. Part (512 bytes): The directory structure "R.o.o.t. .E.n.t.r.y....O.b.j.I.n.f.o." (without Ole10Native)
4. Part (512 bytes): The directory structure for "O.l.e.1.0.N.a.t.i.v.e"
5. Part (512 bytes): This is the content block for the content parts [1]Ole, [3]ObjInfo and [1]CompObj (every content part is padded with "FF", in contrast to the original file).
6. Part: Unknown part (mostly padded with FF)
7. Part: The alphabet in this form "A...B...C..."
8. Part: Unknown part (seems to be part 7 of the original)

So the main differences are:
a) the directory structure is split in the original (Word-generated) file
b) Ole10Native comes before all other objects, and even before the directory structure, in the POIFS-generated file
c) content parts are normally padded with 00 in the original file and with FF in the POIFS-generated one
Maybe some of these differences aren't a problem, but I still can't open the OLE object I generated with POIFS in Word...

Cheers,
Helmut

-------- Original Message --------
> Date: Thu, 24 Jul 2008 15:40:44 +0200
> From: "Helmut Ziegler" <scruffytech@...>
> To: "POI Users List" <user@...>
> Subject: Re: Can POIFS convert PDF to OLE


Re: Can POIFS convert PDF to OLE

by Nick Burch

On Thu, 24 Jul 2008, Helmut Ziegler wrote:
>> But in the first part of the original file there is more information.
>> I had a look at it in a text editor. The information is some kind of
>> metadata:
>> 1. The alphabet
>> 2. The structure of the ole object. "R.o.o.t. .E.n.t.r.y .... O.l.e. ...
>> C.o.m.p.O.b.j...."
>> 3. The kind of ole object "P.a.c.k.a.g.e"

Try using org.apache.poi.poifs.dev.POIFSViewer on the file parts. I think
that'll give you output that's much easier to compare and make sense of
than the raw bytes :)

org.apache.poi.poifs.dev.POIFSLister might be worth checking too.
That'll show you what files you have, without the full contents of
POIFSViewer, which'll help you spot if bits go missing.

Nick



Re: Can POIFS convert PDF to OLE

by Helmut Ziegler

Hi,

I used POIFSLister and POIFSViewer.

POIFSLister shows me the list of files in an OLE object, and POIFSViewer seems to show me what a hex editor would show me. I'm working with a hex editor now ...

I didn't think I would have to go down to the byte level, but I didn't see another possibility.
Yesterday evening we came to the conclusion that POIFS creates something that isn't compatible with Word (because it's not easy to rebuild the interior of a black box).
Then a colleague (with C++ knowledge) wrote a program based on ole32.dll. With it there was nearly(!) no difference from the original file created by Word (same structure, etc.), except for two small differences in the directory structure and in the first content part, which holds Ole, CompObj and ObjInfo.

I think the problem might be in the directory structure. The "Root Entry" entry of the generated file is different from the one in the original file.
It's incredible how minimal the differences are.
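
To compare directory entries field by field instead of byte by byte, the 128-byte entries can be decoded as in the sketch below. It uses only the standard library; the class name is made up, and the offsets follow the published compound file layout. One field that is easy to overlook in a text editor is the 16-byte class ID stored in each storage entry. Whether that is the difference seen here is not stated in the thread, so treat it only as something worth checking.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical decoder for one 128-byte directory entry; not part of POI.
class CfbDirectoryEntry {
    static final int SIZE = 128;

    final String name;     // stored as UTF-16LE, which is why editors show "R.o.o.t. .E.n.t.r.y"
    final int type;        // 1 = storage, 2 = stream, 5 = root storage
    final String clsid;    // 16 raw bytes in file order, as hex
    final int startSector;
    final long streamSize;

    CfbDirectoryEntry(byte[] data, int offset) {
        if (offset < 0 || data.length - offset < SIZE) {
            throw new IllegalArgumentException("need 128 bytes at the given offset");
        }
        ByteBuffer buf = ByteBuffer.wrap(data, offset, SIZE).slice().order(ByteOrder.LITTLE_ENDIAN);
        // The stored length is in bytes and includes the 2-byte terminator.
        int nameBytes = Math.min(64, buf.getShort(64) & 0xFFFF);
        name = new String(data, offset, Math.max(0, nameBytes - 2), StandardCharsets.UTF_16LE);
        type = buf.get(66) & 0xFF;
        StringBuilder hex = new StringBuilder();
        for (int i = 80; i < 96; i++) {
            hex.append(String.format("%02X", buf.get(i) & 0xFF));
        }
        clsid = hex.toString();
        startSector = buf.getInt(116);
        streamSize = buf.getLong(120);
    }

    String describe() {
        return "name=" + name + " type=" + type + " clsid=" + clsid
            + " startSector=" + startSector + " size=" + streamSize;
    }

    public static void main(String[] args) {
        // Synthetic entry for demonstration; real entries come from the directory sectors.
        byte[] entry = new byte[SIZE];
        byte[] nameBytes = "Root Entry".getBytes(StandardCharsets.UTF_16LE);
        System.arraycopy(nameBytes, 0, entry, 0, nameBytes.length);
        entry[64] = (byte) (nameBytes.length + 2);
        entry[66] = 5;
        System.out.println(new CfbDirectoryEntry(entry, 0).describe());
    }
}
```

Printing describe() for each entry of the original and the generated file, and diffing the two text outputs, should point to the exact field that differs.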

The result is the same as with the POIFS-generated file: Word says "The server application.... was not found"
:-(

Cheers,
Helmut

-------- Original Message --------
> Date: Thu, 24 Jul 2008 22:54:31 +0100 (BST)
> From: Nick Burch <nick@...>
> To: POI Users List <user@...>
> Subject: Re: Can POIFS convert PDF to OLE