PdfReader String

PdfReader String

by Stéphane Bansard

Hi all,

I'm using a DMS that gives me access to the PDF contents of a file, but not to the PDF file itself. I want to add a cover page to this file. I can do that with a PDF file on disk, using PdfReader(fileName) together with PdfCopy and PdfStamper.

But when it comes to reading the PDF contents that I have as a String, I can't do it! I converted my String to a byte array (using java.lang.String.getBytes()) and read it with PdfReader(byte[]), but the resulting document is always empty (the content is lost).

I suspect that getting the bytes from this PDF String content doesn't make sense. Could you tell me what to try?
Thanks in advance,
Best regards
Stéphane


-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
iText-questions mailing list
iText-questions@...
https://lists.sourceforge.net/lists/listinfo/itext-questions

Do you like iText?
Buy the iText book: http://www.1t3xt.com/docs/book.php
Or leave a tip: https://tipit.to/itexttipjar

Re: PdfReader String

by 1T3XT info

Stéphane Bansard wrote:
> Hi all,
>
> using a DMS, I can have access to pdf contents of a file (not to the pdf
> file).

What do you mean by "contents"?
The content stream without the resources?
A raster image of the contents?
The text that should be shown on a page?
You should clarify.

> I want to add a cover page to this file. I can do that on a pdf
> file, using PdfReader(fileName) and use PdfCopy and PdfStamper.

Can you? That surprises me.
First you say you don't have access to the file,
now you say you can use PdfReader(fileName).
That's confusing.

> But, when it comes to read the pdf contents I have -as String, I cannot
> do it!

I repeat my question: what do you mean by contents?

> I transformed my String to a Byte Array (using
> java.String.getBytes()) and read this with PdfReader(byte[]), but the
> resulting document is always empty (the content is lost).

If by 'contents' you mean the text that should be shown
on a page, you are completely off track. PdfReader needs
a proper PDF file, with a header, a body, a cross reference
table and a trailer.

> I'm just thinking trying to get the bytes from this pdf String content
> doesn't make sense. Could you tell what to try out?

I have no idea what you are talking about.
You really should clarify.
--
This answer is provided by 1T3XT BVBA

Re: PdfReader String

by Stéphane Bansard

Hi,

Thanks a lot for your quick answer, and sorry for not being clear at all...

>> using a DMS, I can have access to pdf contents of a file (not to the pdf file).

>What do you mean by "contents"?
>The content stream without the resources?
>A raster image of the contents?
>The text that should be shown on a page?

>You should clarify.

This kind of PDF String (I think this is what's called the content stream?):
%PDF-1.4
%âãÏÓ
2 0 obj <</Length 74/Filter/FlateDecode>>stream
[...]

>> I want to add a cover page to this file. I can do that on A pdf
>> file, using PdfReader(fileName) and use PdfCopy and PdfStamper.

>Can you? That surprises me.
>First you say you don't have access to the file,
>now you say you can use PdfReader(fileName).
>That's confusing.
Sorry for the confusion. I only meant that I did it on a "regular" file for some simple tests, outside of the DMS.

>> But, when it comes to read the pdf contents I have -as String, I cannot
>> do it!

>I repeat my question: what do you mean by contents?
Hopefully, things are now clearer!

>> I transformed my String to a Byte Array (using
>> java.String.getBytes()) and read this with PdfReader(byte[]), but the
>> resulting document is always empty (the content is lost).

>If by 'contents' you mean the text that should be shown
>on a page, you are completely off track. PdfReader needs
>a proper PDF file, with a header, a body, a cross reference
>table and a trailer.

>> I'm just thinking trying to get the bytes from this pdf String content
>> doesn't make sense. Could you tell what to try out?

>I have no idea what you are talking about.
>You really should clarify.

So I guess my question can be summed up as: "Is it possible to use PdfReader on a PDF content stream held in a String, and add a page to it?"

Thanks again,
Best regards
Stéphane


Re: PdfReader String

by 1T3XT info

Stéphane Bansard wrote:
>>You should clarify.
>
> this kind of pdf String (I think it's what is called the content stream?):
>
>     %PDF-1.4
>     %âãÏÓ
>     2 0 obj <</Length 74/Filter/FlateDecode>>stream
> [...]

If it ends with %%EOF, it's a complete PDF file.
When I talk about a content stream, I refer to the stuff that is
inside a stream object, like your object "2 0 obj". Such a content
stream uses the Adobe Imaging Model to define the content of a page
(or an XObject). If you were only able to get the content stream of
a page without the rest of its resources, you would be out of luck.
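
A rough way to tell the two cases apart from Java is to look at the
shape of the bytes. The sketch below is only a heuristic, not a
validator: the class and method names are made up for illustration,
and it assumes the header sits at the very start and the trailer
within the last 1024 bytes, which real-world files do not always
respect.

```java
import java.nio.charset.StandardCharsets;

class PdfShapeCheck {

    /**
     * Heuristic: true if the data starts with "%PDF-" and its tail
     * contains "startxref" followed by "%%EOF".
     */
    static boolean looksLikeCompletePdf(byte[] data) {
        // ISO-8859-1 maps every byte to exactly one char, so it is
        // safe to search binary data through this String view.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        if (!text.startsWith("%PDF-")) {
            return false; // no header: probably a bare content stream
        }
        // The trailer lives at the end of the file; inspect only the tail.
        String tail = text.substring(Math.max(0, text.length() - 1024));
        int startXref = tail.lastIndexOf("startxref");
        int eof = tail.lastIndexOf("%%EOF");
        return startXref >= 0 && eof > startXref;
    }

    public static void main(String[] args) {
        byte[] fragment = "BT (Hello) Tj ET".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(looksLikeCompletePdf(fragment));
    }
}
```

If the check fails on what the DMS returns, the data is either
truncated or not a full PDF file in the first place.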

> Sorry for introducing confusion. I'm just saying I did it on a "regular" file
> for some simple tests, outside of the DMS.

OK, there is no protection whatsoever on those regular files.
However, if the file is protected, there will probably be some
proprietary stuff inside the PDF that prevents you from ripping it.
Or at least, that's what I'd expect. But let's forget that for
a moment. Suppose that your DRM gives you the unprotected file.

>>> But, when it comes to read the pdf contents I have -as String, I cannot
>>> do it!

Yuck. PDF files are binary files. If you get them as a String,
what about the encoding? If you treat the characters as ASCII,
every page content stream will be corrupted, and often this
results in the 'blank page' problem as described in chapter 17
of the book "iText in Action". Is that what you mean when you
say "the document is empty (the content is lost)"? Everything
seems to work fine (because the document structure is OK), but
when you open the resulting PDF the pages are blank (because
the content streams of the pages are corrupted).
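
To make the encoding point concrete, here is a small sketch using
only the JDK. It takes the five bytes behind the "%âãÏÓ" line of the
PDF quoted above and pushes them through a String and back. The
class and method names are illustrative, and which charset the DMS
actually used is unknown; the example only shows that some charsets
lose bytes and one does not.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PdfStringRoundTrip {

    /** Decodes the bytes into a String and encodes them again. */
    static byte[] roundTrip(byte[] original, Charset charset) {
        return new String(original, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // '%' followed by four bytes above 0x7F, as in the binary
        // comment on the second line of the quoted PDF.
        byte[] marker = {0x25, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        // US-ASCII cannot represent bytes above 0x7F: they are lost.
        byte[] viaAscii = roundTrip(marker, StandardCharsets.US_ASCII);
        // ISO-8859-1 maps each byte to one char and back without loss.
        byte[] viaLatin1 = roundTrip(marker, StandardCharsets.ISO_8859_1);

        System.out.println("ASCII preserved bytes:      " + Arrays.equals(marker, viaAscii));
        System.out.println("ISO-8859-1 preserved bytes: " + Arrays.equals(marker, viaLatin1));
    }
}
```

A compressed (FlateDecode) stream is full of such bytes, so a lossy
round trip damages the page content while leaving the plain-text
document structure readable.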

>>> I'm just thinking trying to get the bytes from this pdf String content

Yes, but are you respecting the bytes?
Or are you treating them as ASCII characters?

> So, I guess my question could be summed up with "is it possible to use
> PdfReader from some pdf content stream, and add a page on that String"?

Yes, you can do so. Proof can be found in examples such
as HelloWorldStampCopy on this page:
http://www.1t3xt.info/examples/browse/?page=toc&id=7
In these examples ByteArrayOutputStream.toByteArray() is used
to feed PdfStamper with a byte[].

I suggest that before you go on with iText, you convert your PDF
String to bytes and then write these bytes to a file. I'm 99%
sure the file will be corrupt too.
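
That test could look roughly like the sketch below. It assumes,
purely for illustration, that the String was built by decoding the
original bytes as ISO-8859-1; if the DMS used another charset, or
already lost bytes while building the String, no choice made here
can bring them back. The class name, method name and stand-in String
are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class PdfStringDump {

    /** Writes the String to the target file, one byte per char. */
    static Path dump(String pdfString, Path target) throws IOException {
        // Assumption: each char in the String stands for exactly one
        // original byte, which is what ISO-8859-1 encoding writes out.
        byte[] bytes = pdfString.getBytes(StandardCharsets.ISO_8859_1);
        return Files.write(target, bytes);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the String returned by the DMS.
        String pdfString = "%PDF-1.4\n%âãÏÓ\n";
        Path target = Files.createTempFile("from-dms", ".pdf");
        dump(pdfString, target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Open the written file in a PDF viewer. If the pages are blank there
too, the damage happened before iText was involved.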
--
This answer is provided by 1T3XT BVBA