How to find document size

How to find document size

by Jasmin_Mehta

Hi,

My requirement is to keep adding elements to one XML document object until it reaches a certain size. Which JDOM API can tell me the size of a newly created org.jdom.Document?

Thanks
Jasmin

******************************************************************************
ATTENTION ATTENTION ATTENTION ATTENTION ATTENTION
Our domain name is changing.  Please take note of the sender's
e-Mail address and make changes to your personal address list,
if needed.  Both domains will continue to work, only for a limited
time.
******************************************************************************
This email and any files transmitted with it are intended solely for
the use of the individual or agency to whom they are addressed.
If you have received this email in error please notify the Navy
Exchange Service Command e-mail administrator. This footnote
also confirms that this email message has been scanned for the
presence of computer viruses.

Thank You!           
******************************************************************************


_______________________________________________
To control your jdom-interest membership:
http://www.jdom.org/mailman/options/jdom-interest/youraddr@...

Re: How to find document size

by Jasmin_Mehta

I do not want to do any I/O while I am creating the XML document from one database and sending it to another after processing it on the fly. This will be a batch process and may handle a huge number of records on each run.

I need to know the XML document size as it is when it comes out of Oracle's 'XMLTYPE' datatype. I am appending result-set records one after another to one XML document until it reaches a certain size. For this purpose I need to know the size of the accumulated document.

The algorithm is something like:

1)        List<Document> finalXmlDocumentList = new ArrayList<Document>();

2)        org.jdom.Document accumulatedXmlDoc = new Document();

3)
for (int i = 0; i < newXmlDocs.size(); i++)
{
        sizeOfDocument = FIND SIZE OF accumulatedXmlDoc

        // maxDocumentSize stands for the size limit; the name is a placeholder
        if (sizeOfDocument < maxDocumentSize)
        {
            // the actual JDOM API is used here to concatenate the two XML docs
            accumulatedXmlDoc = accumulatedXmlDoc + newXmlDocs.get(i);
        }
        else
        {
            finalXmlDocumentList.add(accumulatedXmlDoc);
            accumulatedXmlDoc = new Document();
            // the actual JDOM API is used here to concatenate the two XML docs
            accumulatedXmlDoc = accumulatedXmlDoc + newXmlDocs.get(i);
        }
}
// keep the last, partially filled document as well
finalXmlDocumentList.add(accumulatedXmlDoc);

4)        use finalXmlDocumentList further


I would like to know the size of the XML in one-line format. You mentioned storing it in a byte[]; how would I do that? Would it create a performance issue?

Thanks
Jasmin



frode@... wrote on 01/17/2008 04:20 PM:
To: Jasmin_Mehta@...
Subject: Re: [jdom-interest] How to find document size

Hello.

The physical size of the file produced when outputting the document will vary depending on formatting (pretty-format
takes more space than a one-liner) and on encoding (UTF-8 uses two or more bytes for non-ASCII characters).

The file size will also depend on the platform (Windows/Linux) because line breaks are represented differently (CRLF
versus LF). This matters especially with pretty-print, since it generates a lot of line breaks.

The only way to know exactly is to actually write it to disk and check the size. A system-independent size can be
obtained by outputting to a byte[] and checking its length.
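
A minimal sketch of the byte[] approach, using only the standard library. The names SerializedSize and sizeInBytes are illustrative and not part of JDOM, and the sample strings stand in for serialized XML. It shows how the same markup gives different byte counts depending on encoding and line breaks:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class SerializedSize {
    /** Encodes the XML text into a byte[] and returns the array length. */
    static int sizeInBytes(String xml, Charset encoding) {
        return xml.getBytes(encoding).length;
    }

    public static void main(String[] args) {
        // The first letter of the name is non-ASCII (written as a Unicode escape):
        // it takes one byte in ISO-8859-1 and two bytes in UTF-8.
        String compact = "<r><n>\u00C5se</n></r>";
        String pretty = "<r>\r\n  <n>\u00C5se</n>\r\n</r>";

        System.out.println(sizeInBytes(compact, StandardCharsets.ISO_8859_1)); // 17
        System.out.println(sizeInBytes(compact, StandardCharsets.UTF_8));      // 18
        System.out.println(sizeInBytes(pretty, StandardCharsets.UTF_8));       // 24
    }
}
```

With JDOM 1.x the text would typically come from new XMLOutputter(Format.getCompactFormat()).outputString(doc), or the document can be written straight into a java.io.ByteArrayOutputStream with XMLOutputter.output(doc, out) and measured with size(). Either way the whole serialized document is held in memory while it is measured.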

You could also create the document with zero, one and two empty elements (all tags included, but no content). Take the
size of the zero-element document, add the difference between the one-element and two-element sizes multiplied by the
number of elements, and calculate the length of each element's content on the fly, adding it up as you go along. It is
a lot of code, but if the elements are all the same with different content this should not be an awful lot of work.
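
The template idea above could be sketched as follows. This is only an illustration under stated assumptions: the class name TemplateSizeEstimator is made up, the elements are identical and one level deep, empty elements are serialized in expanded form (<r></r> rather than <r />), output is compact UTF-8, and the content has no characters that need escaping.

```java
import java.nio.charset.StandardCharsets;

final class TemplateSizeEstimator {
    private final long perElementBytes; // markup cost of one element without content
    private long totalBytes;            // running estimate for the whole document

    /** The three sizes are measured once, by serializing small template documents. */
    TemplateSizeEstimator(long emptyDocBytes, long oneElementBytes, long twoElementBytes) {
        this.perElementBytes = twoElementBytes - oneElementBytes;
        this.totalBytes = emptyDocBytes;
    }

    /** Accounts for one more element holding the given text content. */
    void addElement(String content) {
        totalBytes += perElementBytes + content.getBytes(StandardCharsets.UTF_8).length;
    }

    long estimatedBytes() {
        return totalBytes;
    }

    public static void main(String[] args) {
        // Template sizes for <rs></rs>, <rs><r></r></rs> and <rs><r></r><r></r></rs>.
        TemplateSizeEstimator estimator = new TemplateSizeEstimator(9, 16, 23);
        estimator.addElement("abc");
        estimator.addElement("de");
        // Same as the length of <rs><r>abc</r><r>de</r></rs>
        System.out.println(estimator.estimatedBytes()); // 28
    }
}
```

In JDOM 1.x, Format.setExpandEmptyElements(true) should make the templates come out in the expanded form assumed here.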

Frode




                                                                                                                       
In reply to Jasmin_Mehta@..., sent 17.01.2008 21:34 to <jdom-interest@...>, subject "[jdom-interest] How to find document size"


Re: How to find document size

by Laurent Bihanic

Hi,

To avoid useless memory allocation, I know of no other method than serializing
the document (using XMLOutputter) to a special OutputStream implementation
that counts the number of bytes being written and throws the data away.

But be careful, because XML serialization has a cost, CPU-wise.
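
A minimal sketch of such a counting stream, using only the standard library. The class name CountingOutputStream is chosen here for illustration; it is not a JDOM class.

```java
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Discards everything written to it and only remembers how many bytes there were. */
final class CountingOutputStream extends OutputStream {
    private long count;

    @Override
    public void write(int b) {
        count++; // a single byte
    }

    @Override
    public void write(byte[] b) {
        count += b.length;
    }

    @Override
    public void write(byte[] b, int off, int len) {
        count += len; // the data itself is thrown away
    }

    long getCount() {
        return count;
    }

    public static void main(String[] args) {
        CountingOutputStream counter = new CountingOutputStream();
        byte[] data = "<root><a>1</a></root>".getBytes(StandardCharsets.UTF_8);
        counter.write(data);
        counter.write('\n');
        System.out.println(counter.getCount()); // 22
    }
}
```

With JDOM 1.x, new XMLOutputter().output(doc, counter) followed by counter.getCount() would give the serialized size without buffering the output. The output method declares IOException, so the call needs a try/catch even though this stream never throws.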

Regards,

Laurent



Re: How to find document size

by Tatu Saloranta

--- Jasmin_Mehta@... wrote:

> I need to know the xml document size as it is when it comes out from
> Oracle's 'XMLTYPE' datatype. I am appending resultset records one after
> another in one xml document until it reaches certain size. For this
> purpose I need to know the size of

There is no efficient way to get the exact size, either
of the serialized form or of the memory Java uses for
the JDOM tree. To know the exact size you have to
serialize the document. Even with a dummy OutputStream
there is a linear cost for doing that, and with
incremental calls (re-serializing after every append) it
has N^2 complexity (essentially dog slow for sizable docs).
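
One way around the quadratic cost is to measure each incoming record once and keep a running total, rather than re-serializing the accumulated document after every append. The sketch below assumes each record's serialized form is already available as a String and ignores the fixed size of the wrapping root element; SizeBatcher and its method are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SizeBatcher {
    /** Groups fragments so that each batch stays within maxBytes where possible. */
    static List<List<String>> batch(List<String> fragments, long maxBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;

        for (String fragment : fragments) {
            // Each fragment is measured exactly once, so the total work is linear.
            long size = fragment.getBytes(StandardCharsets.UTF_8).length;
            if (!current.isEmpty() && currentBytes + size > maxBytes) {
                batches.add(current);          // close the batch that is full
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(fragment);             // an oversized fragment gets a batch of its own
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);              // do not lose the last, partial batch
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> records = List.of("<a/>", "<bb/>", "<c/>");
        System.out.println(batch(records, 9)); // [[<a/>, <bb/>], [<c/>]]
    }
}
```

The same bookkeeping works if the per-record size comes from an estimate instead of an exact byte count.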

But you can approximate the size yourself; this would
give a rough idea of either the serialized size or the
amount of memory used to store the JDOM tree (the latter
is usually 3x - 4x the former, depending on the kind of
document).

The main question is: what is this to be used for?
To avoid getting a DB exception from Oracle (there's a
max size for a column), to avoid a denial-of-service
attack (memory used on the server side for processing
the doc), or something else?
As long as an approximate size is enough (a size that's
within, say, 2x of the actual size) it should be
doable quite efficiently.
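
As one possible building block for such an approximation, the sketch below estimates how many UTF-8 bytes a piece of text content would occupy once serialized, without allocating a byte[]. It assumes the serializer escapes &, < and > in text and that the text is well-formed (no unpaired surrogates); the class and method names are made up for this example.

```java
final class TextSizeEstimate {
    /** Estimated UTF-8 size of text content after XML escaping. */
    static long utf8BytesEscaped(String text) {
        long bytes = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '&') {
                bytes += 5;            // &amp;
            } else if (cp == '<' || cp == '>') {
                bytes += 4;            // &lt; or &gt;
            } else if (cp < 0x80) {
                bytes += 1;            // ASCII
            } else if (cp < 0x800) {
                bytes += 2;
            } else if (cp < 0x10000) {
                bytes += 3;
            } else {
                bytes += 4;            // supplementary characters
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        System.out.println(utf8BytesEscaped("Tom & Jerry")); // 15
    }
}
```

Adding a per-element markup cost (roughly twice the tag name length plus five bytes for the angle brackets and slash) to this figure gives a serialized-size estimate for simple elements without attributes.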

-+ Tatu +-




Re: How to find document size

by Jasmin_Mehta

How can I find the amount of memory Java uses for the JDOM tree?

And how can I serialize it?



In reply to Tatu Saloranta <cowtowncoder@...>, sent 01/18/2008 12:51 PM to jdom interest <jdom-interest@...>, subject "Re: [jdom-interest] How to find document size"


RE: How to find document size

by Michael Kay

Tatu asked you a perfectly reasonable question (how accurate does your information need to be?), and explained why he was asking. So it seems rather discourteous to come back with more questions rather than answering him.
 
Michael Kay
 
 


From: jdom-interest-bounces@... [mailto:jdom-interest-bounces@...] On Behalf Of Jasmin_Mehta@...
Sent: 18 January 2008 19:05
To: Tatu Saloranta
Cc: jdom interest
Subject: Re: [jdom-interest] How to find document size


How can I find the amount of memory Java uses for the JDOM tree?

And how can I serialize it?


