Speed of using Derby DB

View: New views
20 Messages — Rating Filter:   Alert me  
< Prev | 1 - 2 | Next >

Speed of using Derby DB

by vodarus vodarus :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hello.

I tried to use Derby and compare it with Oracle. I thought that Derby can have the same performance as Oracle on easy procedures.

Purpose of bench-mark test: use Derby as local db and get better performance for local data-manipulations.

DB schema:

create table TESTBIG
(
    CLIENT       int not null,
    ORDER_ID     int not null,
    ORDER_AMOUNT int not null
);

alter table TESTBIG add constraint TESTBIG_PK primary key (CLIENT, ORDER_ID);
   
create table TESTTOTALS
(
    CLIENT       int not null,
    CLIENT_TOTAL int
);

alter table TESTTOTALS add constraint TESTTOTALS_PK primary key (CLIENT);

We populating TESTBIG table with 1 000 000 rows, then stored procedure calculates TESTTOTAL: CLIENT_TOTAL is SUMM of all ORDER_AMOUNT for that CLIENT.

i wrote stored procedure for Derby in Java language:

static public void calculateTotal() {

        int totalAmount = 0;
        int lastClient = 0;

        try {
            Connection connection = DriverManager.getConnection("jdbc:default:connection");
            Statement s = connection.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            ResultSet rs = s.executeQuery("SELECT CLIENT, ORDER_ID, ORDER_AMOUNT FROM TESTBIG");

            PreparedStatement updateData = connection.prepareStatement("UPDATE testtotals SET " +
                    " client_total = client_total + ? " +
                    " WHERE client = ?");
            PreparedStatement insertData = connection.prepareStatement("INSERT INTO testtotals " +
                    " (client, client_total) " +
                    " VALUES (?, ?) ");

            while (rs.next()) {
                int client = rs.getInt(1);
                int order_amount = rs.getInt(3);

                if (lastClient == 0) {
                    lastClient = client;
                    totalAmount = 0;
                }

                if (lastClient != client) {
                   
                    // System.out.println("MERGE amount" + lastClient + ":" + totalAmount);
                    updateData.setInt(1, totalAmount);
                    updateData.setInt(2, lastClient);
                    int sqlRowCount = updateData.executeUpdate();

                    if (sqlRowCount == 0) {
                        insertData.setInt(1, lastClient);
                        insertData.setInt(2, totalAmount);
                        sqlRowCount = insertData.executeUpdate();
                    }

                    lastClient = client;
                    totalAmount = order_amount;
                } else {
                    totalAmount = totalAmount + order_amount;
                }
            }

            updateData.setInt(1, totalAmount);
            updateData.setInt(2, lastClient);
            int sqlRowCount = updateData.executeUpdate();

            if (sqlRowCount == 0) {
                insertData.setInt(1, lastClient);
                insertData.setInt(2, totalAmount);
                sqlRowCount = insertData.executeUpdate();
            }


            rs.close();
            s.close();
            connection.commit();
            connection.close();
        } catch (SQLException ex) {
            Logger.getLogger(CalculateTotal.class.getName()).log(Level.SEVERE, null, ex);
        }
    }


it takes up to 12 seconds to calculate TESTTOTALS. Oracle PL/SQL procedure with the same algorithm need 1,5 second.

How can i improve performance? Or Derby is so slow because of Java / JVM issues???

Thanks.

Re: Speed of using Derby DB

by oysteing :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

vodarus vodarus wrote:
> it takes up to 12 seconds to calculate TESTTOTALS. Oracle PL/SQL
> procedure with the same algorithm need 1,5 second.
>
> *How can i improve performance? Or Derby is so slow because of Java /
> JVM issues???*
>
> Thanks.

I do not have any experience with performance of stored procedures, but
note that Derby has a pretty high overhead for query compilation since
it compiles queries directly into Java byte code.  Hence, comparison on
single executions of statements will usual not be in favor of Derby.
However, if same statements are executed many times efficiency improves
(both due to reuse of compiled plans and hot-spot compilation of the
Java code).

For you particular problem, why use stored procedures when you can do
the same with the following query:

insert into testtotals
select client, sum(order_amount) from testbig group by client;

--
Øystein

Re: Speed of using Derby DB

by vodarus vodarus :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hello.

2008/6/11 Øystein Grøvlen <Oystein.Grovlen@...>:
vodarus vodarus wrote:
it takes up to 12 seconds to calculate TESTTOTALS. Oracle PL/SQL procedure with the same algorithm need 1,5 second.

*How can i improve performance? Or Derby is so slow because of Java / JVM issues???*

Thanks.

I do not have any experience with performance of stored procedures, but note that Derby has a pretty high overhead for query compilation since it compiles queries directly into Java byte code.  Hence, comparison on single executions of statements will usual not be in favor of Derby. However, if same statements are executed many times efficiency improves (both due to reuse of compiled plans and hot-spot compilation of the Java code).

I run stored procedure for many times:
for (int i = 0; i < 5; i++) {
                s = conn.createStatement();

                sql = "DELETE FROM testtotals";
                s.execute(sql);

                begin = new Date();
                sql = "CALL calculateTotal()";
                s.execute(sql);
                end = new Date();
                System.out.println("Timing " + (end.getTime() - begin.getTime()));
                conn.commit();
                System.out.println("Committed the transaction");
            }


For you particular problem, why use stored procedures when you can do the same with the following query:

insert into testtotals
select client, sum(order_amount) from testbig group by client;

Because it's just example. There should be any function that can not be replased with pure-SQL. of course SQL can be used in this test, but test purpose is "check speed of server-side data analysing with big amount of data".

And what tools can you use to analyse big amount of relation-data? SQL can not be used in many cases. So people usually use stored procedures in RDBMS - PL/SQL in Oracle. I have idea: replase Oracle with Java application. And that Java application have RDBMS inside (like Derby in Embedded mode).


PS does derby / javaDB have it's own forum?


Re: Speed of using Derby DB

by oysteing :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> And what tools can you use to analyse big amount of relation-data? SQL
> can not be used in many cases. So people usually use stored procedures
> in RDBMS - PL/SQL in Oracle. I have idea: replase Oracle with Java
> application. And that Java application have RDBMS inside (like Derby in
> Embedded mode).
>

If you are using Derby in embedded mode, I do not think there will be a
performance advantage of using a stored procedure over executing the
code directly in the application.  It may even be faster to skip the
stored procedure.  It would be interesting if you could try it out and
report the results.

> PS does derby / javaDB have it's own forum?
>

Do you mean outside this mailing list?

--
Øystein

Re: Speed of using Derby DB

by vodarus vodarus :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message



2008/6/11 Øystein Grøvlen <Oystein.Grovlen@...>:

If you are using Derby in embedded mode, I do not think there will be a performance advantage of using a stored procedure over executing the code directly in the application.  It may even be faster to skip the stored procedure.  It would be interesting if you could try it out and report the results.

I thought tt using stored procedure will avoid transferring data from DataEngine to Application.
Using Java + Oracle will be slower than usage of PL/SQL code. Application will spend a lot of resources to transfer data from DBMS to external code and back.

Do you mean usage of Java application itself and avoid usage of stored procedure?


PS does derby / javaDB have it's own forum?


Do you mean outside this mailing list?

Yes. I think forum is more visual tool for communication than mail-list.


Re: Speed of using Derby DB

by oysteing :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

vodarus vodarus wrote:
> I thought tt using stored procedure will avoid transferring data from
> DataEngine to Application.

That cost is very little when using embedded where both are executing in
the same VM.

> Using Java + Oracle will be slower than usage of PL/SQL code.

But Java + Oracle implies client/server and transfer of data between
different processes.

> Application will spend a lot of resources to transfer data from DBMS to
> external code and back.
>
> Do you mean usage of Java application itself and avoid usage of stored
> procedure?

Yes.

>
>
>         PS does derby / javaDB have it's own forum?
>
>
>     Do you mean outside this mailing list?
>
>
> Yes. I think forum is more visual tool for communication than mail-list.

(I would think that depends on your mail client.)

Nabble and several other sites provide a way to access the mailing list
in as a forum. http://www.nabble.com/Apache-Derby-Users-f93.html

--
Øystein

RE: Speed of using Derby DB

by Derby-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Some parts of this message have been removed. Learn more about Nabble's security policy.

Hi,

 

You really can’t compare Derby against Oracle.

 

They are two different beasts.

 

You can’t embed Oracle in your application. You can embed Derby.

You spend $$$$ on an Oracle solution. You spend $ on Derby. TANSTAAFL!

 

Oracle has a paid support staff of developers. Derby? Their support staff is paid by IBM and SUN. Ooops! IBM stopped supporting Cloudscape, no? Or else community members support Derby outside of their $Dayjob.

 

Oracle big, written in C/C++ etc.

Derby? 100% Java, small.

 

If you want a better comparison, write a Java Stored Procedure in Oracle and then compare the time. Even then you will have differences that will effect your performance.

Oracle can take advantage of partitioning database tables, certain caching, and of course query optimization will vary.

 

I’m not sure if  Informix’s Standard Engine supported SPL, but if it did, it would be a closer comparison, although SE is C based and should perform faster.

 

HTH

 

-G


From: vodarus vodarus [mailto:vodarus@...]
Sent: Wednesday, June 11, 2008 3:31 AM
To: derby-user@...
Subject: Speed of using Derby DB

 

Hello.

I tried to use Derby and compare it with Oracle. I thought that Derby can have the same performance as Oracle on easy procedures.

Purpose of bench-mark test: use Derby as local db and get better performance for local data-manipulations.

DB schema:

create table TESTBIG
(
    CLIENT       int not null,
    ORDER_ID     int not null,
    ORDER_AMOUNT int not null
);

alter table TESTBIG add constraint TESTBIG_PK primary key (CLIENT, ORDER_ID);
   
create table TESTTOTALS
(
    CLIENT       int not null,
    CLIENT_TOTAL int
);

alter table TESTTOTALS add constraint TESTTOTALS_PK primary key (CLIENT);

We populating TESTBIG table with 1 000 000 rows, then stored procedure calculates TESTTOTAL: CLIENT_TOTAL is SUMM of all ORDER_AMOUNT for that CLIENT.

i wrote stored procedure for Derby in Java language:

static public void calculateTotal() {

        int totalAmount = 0;
        int lastClient = 0;

        try {
            Connection connection = DriverManager.getConnection("jdbc:default:connection");
            Statement s = connection.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            ResultSet rs = s.executeQuery("SELECT CLIENT, ORDER_ID, ORDER_AMOUNT FROM TESTBIG");

            PreparedStatement updateData = connection.prepareStatement("UPDATE testtotals SET " +
                    " client_total = client_total + ? " +
                    " WHERE client = ?");
            PreparedStatement insertData = connection.prepareStatement("INSERT INTO testtotals " +
                    " (client, client_total) " +
                    " VALUES (?, ?) ");

            while (rs.next()) {
                int client = rs.getInt(1);
                int order_amount = rs.getInt(3);

                if (lastClient == 0) {
                    lastClient = client;
                    totalAmount = 0;
                }

                if (lastClient != client) {
                   
                    // System.out.println("MERGE amount" + lastClient + ":" + totalAmount);
                    updateData.setInt(1, totalAmount);
                    updateData.setInt(2, lastClient);
                    int sqlRowCount = updateData.executeUpdate();

                    if (sqlRowCount == 0) {
                        insertData.setInt(1, lastClient);
                        insertData.setInt(2, totalAmount);
                        sqlRowCount = insertData.executeUpdate();
                    }

                    lastClient = client;
                    totalAmount = order_amount;
                } else {
                    totalAmount = totalAmount + order_amount;
                }
            }

            updateData.setInt(1, totalAmount);
            updateData.setInt(2, lastClient);
            int sqlRowCount = updateData.executeUpdate();

            if (sqlRowCount == 0) {
                insertData.setInt(1, lastClient);
                insertData.setInt(2, totalAmount);
                sqlRowCount = insertData.executeUpdate();
            }


            rs.close();
            s.close();
            connection.commit();
            connection.close();
        } catch (SQLException ex) {
            Logger.getLogger(CalculateTotal.class.getName()).log(Level.SEVERE, null, ex);
        }
    }


it takes up to 12 seconds to calculate TESTTOTALS. Oracle PL/SQL procedure with the same algorithm need 1,5 second.

How can i improve performance? Or Derby is so slow because of Java / JVM issues???

Thanks.


Re: Speed of using Derby DB

by Thomas Nielsen - Sun Microsystems - Trondheim, Norway :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

vodarus vodarus wrote:
>         PS does derby / javaDB have it's own forum?
>
>
>     Do you mean outside this mailing list?
>
>
> Yes. I think forum is more visual tool for communication than mail-list.
>

If so, you can read the list using nabble (www.nabble.com) :)

Cheers,
Thomas

PS:
It *may* be that the reason for the slow execution is outdated index
cardinality statistics. Check out "CALL
SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE" either on nabble or in the
derby docs.

--
Thomas Nielsen

Re: Speed of using Derby DB

by vodarus vodarus :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message



2008/6/11 <derby@...>:

Hi,

You really can't compare Derby against Oracle.

They are two different beasts.

You can't embed Oracle in your application. You can embed Derby.

You spend $$$$ on an Oracle solution. You spend $ on Derby. TANSTAAFL!

Yes, i thought that Derby can win this comparsion if it will work with the same speed. I have really easy procedure, so all perfomance cost should be in I/O (also i have more RAM than database take on HDD)

Oracle has a paid support staff of developers. Derby? Their support staff is paid by IBM and SUN. Ooops! IBM stopped supporting Cloudscape, no? Or else community members support Derby outside of their $Dayjob.

Oracle big, written in C/C++ etc.

Derby? 100% Java, small.

 

If you want a better comparison, write a Java Stored Procedure in Oracle and then compare the time. Even then you will have differences that will effect your performance.

it's more easy to write code on PL/ SQL than use Java. Of course you can try to use SQLJ but...

Let's compare:
PL/SQL code:
            UPDATE testtotals SET
client_total = iTotalAmt
WHERE client = iLastClient;
the same code in Java:
PreparedStatement update = connection.prepareStatement("UPDATE testtotals SET " +
" client_total = client_total + ? " +
" WHERE client = ?");
update.setInt(1, totalAmount);
update.setInt(2, lastClient);
int sqlRowCount = update.executeUpdate();
Also there is no support for SQL in java, so SQL live as Strings without any checks. SQLJ can help here, but java refuse usage of preprocessor, so IDE's not understanding SQLJ dialect.

Oracle can take advantage of partitioning database tables, certain caching, and of course query optimization will vary.


I camparing only speed of data analysis, and only Oracle PL/SQL and Java. I prefer Java, but it have many disadvantages when you need to work with big amount of data.


Re: Speed of using Derby DB

by vodarus vodarus :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

If you want a better comparison, write a Java Stored Procedure in Oracle and then compare the time. Even then you will have differences that will effect your performance.

I do not need "Java Stored Procedure in Oracle". I need fast application that will analyse big amount of relation data. This analysis can not be done via SQL, so necessary to use some server side stored procedure.

When i use Java with Derby - perfomance fall is 12 times for 1 iteration, but necessary to do 100 of analysis for each amount of data. So performance will fall 12*100 times. So it will be better to buy Oracle than use Derby and buy more computers.

In this comparsion PL/SQL wins because it have better syntax AND better perfomance.

But may be i have done something wrong and together we can fix it :)

Re: Speed of using Derby DB

by vodarus vodarus :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

If you are using Derby in embedded mode, I do not think there will be a performance advantage of using a stored procedure over executing the code directly in the application.  It may even be faster to skip the stored procedure.  It would be interesting if you could try it out and report the results.


I change this function and execute it from Java application directly, not as stored procudere.

Results looks like the same. Near 12-14 seconds.

Re: Speed of using Derby DB

by Kristian Waagan-4 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Thomas Nielsen wrote:

> vodarus vodarus wrote:
>>         PS does derby / javaDB have it's own forum?
>>
>>
>>     Do you mean outside this mailing list?
>>
>>
>> Yes. I think forum is more visual tool for communication than mail-list.
>>
>
> If so, you can read the list using nabble (www.nabble.com) :)
>
> Cheers,
> Thomas
>
> PS:
> It *may* be that the reason for the slow execution is outdated index
> cardinality statistics. Check out "CALL
> SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE" either on nabble or in the
> derby docs.

Vodarus,

Did you get around to try out the tip from Thomas?

If you first create the index and then load the data, the statistics can
be badly outdated. If you first created the table, loaded the data and
finally created the index you should be fine though.

Just curious, as we have seen reports of severe performance degradations
because of this issue, which causes the optimizer to choose a bad plan.



regards,
--
Kristian


>


Re: Speed of using Derby DB

by Mike Matrigali :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

vodarus vodarus wrote:

> Hello.
>
> I tried to use Derby and compare it with Oracle. I thought that Derby
> can have the same performance as Oracle on easy procedures.
>
> Purpose of bench-mark test: use Derby as local db and get better
> performance for local data-manipulations.
>
> DB schema:
>
> create table TESTBIG
> (
>     CLIENT       int not null,
>     ORDER_ID     int not null,
>     ORDER_AMOUNT int not null
> );
>
> alter table TESTBIG add constraint TESTBIG_PK primary key (CLIENT,
> ORDER_ID);
>    
> create table TESTTOTALS
> (
>     CLIENT       int not null,
>     CLIENT_TOTAL int
> );
>
> alter table TESTTOTALS add constraint TESTTOTALS_PK primary key (CLIENT);
>
> We populating TESTBIG table with 1 000 000 rows, then stored procedure
> calculates TESTTOTAL: CLIENT_TOTAL is SUMM of all ORDER_AMOUNT for that
> CLIENT.
>
> i wrote stored procedure for Derby in Java language:
>
> static public void calculateTotal() {
>
>         int totalAmount = 0;
>         int lastClient = 0;
>
>         try {
>             Connection connection =
> DriverManager.getConnection("jdbc:default:connection");
>             Statement s =
> connection.createStatement(ResultSet.TYPE_FORWARD_ONLY,
> ResultSet.CONCUR_READ_ONLY);
>             ResultSet rs = s.executeQuery("SELECT CLIENT, ORDER_ID,
> ORDER_AMOUNT FROM TESTBIG");
>
>             PreparedStatement updateData =
> connection.prepareStatement("UPDATE testtotals SET " +
>                     " client_total = client_total + ? " +
>                     " WHERE client = ?");
>             PreparedStatement insertData =
> connection.prepareStatement("INSERT INTO testtotals " +
>                     " (client, client_total) " +
>                     " VALUES (?, ?) ");
>
>             while (rs.next()) {
>                 int client = rs.getInt(1);
>                 int order_amount = rs.getInt(3);
>
>                 if (lastClient == 0) {
>                     lastClient = client;
>                     totalAmount = 0;
>                 }
>
>                 if (lastClient != client) {
>                    
>                     // System.out.println("MERGE amount" + lastClient +
> ":" + totalAmount);
>                     updateData.setInt(1, totalAmount);
>                     updateData.setInt(2, lastClient);
>                     int sqlRowCount = updateData.executeUpdate();
>
>                     if (sqlRowCount == 0) {
>                         insertData.setInt(1, lastClient);
>                         insertData.setInt(2, totalAmount);
>                         sqlRowCount = insertData.executeUpdate();
>                     }
>
>                     lastClient = client;
>                     totalAmount = order_amount;
>                 } else {
>                     totalAmount = totalAmount + order_amount;
>                 }
>             }
>
>             updateData.setInt(1, totalAmount);
>             updateData.setInt(2, lastClient);
>             int sqlRowCount = updateData.executeUpdate();
>
>             if (sqlRowCount == 0) {
>                 insertData.setInt(1, lastClient);
>                 insertData.setInt(2, totalAmount);
>                 sqlRowCount = insertData.executeUpdate();
>             }
>
>
>             rs.close();
>             s.close();
>             connection.commit();
>             connection.close();
>         } catch (SQLException ex) {
>            
> Logger.getLogger(CalculateTotal.class.getName()).log(Level.SEVERE, null,
> ex);
>         }
>     }
>
>
> it takes up to 12 seconds to calculate TESTTOTALS. Oracle PL/SQL
> procedure with the same algorithm need 1,5 second.
>
> *How can i improve performance? Or Derby is so slow because of Java /
> JVM issues???*
>
> Thanks

For derby the default is autocommit=true, which means it will do a
synchronous log commit write for every statement.  My reading of the app
and the
call to commit at the end seems to indicate you want to only commit
at the end.  Depending on the data distribution (ie. how many times you
call executeUpdate this may be a big issue.  When you run the test do
you see a lot of idle time or is 1 cpu at 100% for 12-14 secs? The
suggestion for using aggregates seems like a better way to do the app.

What kind of page cache does oracle have when you compare?  Are you
interested in performance of a query when no page is in cache or when
all pages are in cache?  This size table is bigger than the default
derby page cache (1000 pages) so it basically is going to be an I/O test
of how fast
stuff can be read from disk each time.  If you want to try a cached
test try setting derby page cache bigger - probably 10,000 pages will
fit that table (probably less - didn't do exact math).


Re: Speed of using Derby DB

by vodarus vodarus :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


PS:
It *may* be that the reason for the slow execution is outdated index cardinality statistics. Check out "CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE" either on nabble or in the derby docs.

Vodarus,

Did you get around to try out the tip from Thomas?

If you first create the index and then load the data, the statistics can be badly outdated. If you first created the table, loaded the data and finally created the index you should be fine though.

Just curious, as we have seen reports of severe performance degradations because of this issue, which causes the optimizer to choose a bad plan.

regards,
Hello.

I created example of application. All app merged into one file. All in this topic can run app and get the same results.

final private WORK_TYPE work_type should be set to DATA_FROM_SCRATCH at first run.
Than use DATA_DROP_AND_CREATE for recreating data or DATA_USE_OLD to use old data.

Anyone can use app and get the same results as i. I am interested in high speed of calculateTotalStored or calculateTotalClient methods ONLY!!!


PS this app for testing purposes only, so code re-usability and other things like it are not important.


FatTest.java (17K) Download Attachment

Re: Speed of using Derby DB

by vodarus vodarus :: Rate this Message: