RE: [eclipselink-users] JPA performance over JDBC

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]

RE: [eclipselink-users] JPA performance over JDBC

From: "Markus Karg" <karg@xxxxxxxxx>
Date: Mon, 2 Aug 2010 08:31:09 +0200
Delivered-to: eclipselink-users@xxxxxxxxxxx
List-archive: <https://dev.eclipse.org/mailman/private/eclipselink-users>
List-help: <mailto:eclipselink-users-request@eclipse.org?subject=help>
List-subscribe: <https://dev.eclipse.org/mailman/listinfo/eclipselink-users>, <mailto:eclipselink-users-request@eclipse.org?subject=subscribe>
List-unsubscribe: <https://dev.eclipse.org/mailman/listinfo/eclipselink-users>, <mailto:eclipselink-users-request@eclipse.org?subject=unsubscribe>
Thread-index: Acsx13cn77vDbheiRCSmfCA7h2strAANEBtw
Thread-topic: [eclipselink-users] JPA performance over JDBC

Yes I did such tests years back and had similar results, but that Microtesting is somewhat tricky... As JPA internally uses JDBC it is technically impossible that it actually is faster. What you are measuring is not JPA's performance, but EclipseLink's cache efficiency (EclipseLink stores rows in its own cache instead of loading it a second time). If you switch of caches (even sometimes if you don't) you'll notice that JPA is actually slower than JDBC, as it need JDBC's time plus an overhead. For example, with deep trees you'll be much faster with JDBC than with JPA, as JPA in some cases runs thousands of SQLs while actually a single JOIN would be enough. If you want to learn about the actual performance, there is no way around than doing the actual queries you app needs in the typical environment of your app. Everything else wouldn't be a fair comparison.

From: eclipselink-users-bounces@xxxxxxxxxxx [mailto:eclipselink-users-bounces@xxxxxxxxxxx] On Behalf Of Daoqi(Doug) Yang
Sent: Montag, 2. August 2010 02:13
To: eclipselink-users@xxxxxxxxxxx
Subject: [eclipselink-users] JPA performance over JDBC

I did some performance comparison on JPA queries and JDBC queries using EclipseLink 2.1.

My one test result on an Intel processor shows that JPA queries are 15% faster than functionally equivalent, but slightly simplified JDBC queries , while another test result on an AMD processor says JPA queries are the same as JDBC queries.

In both tests, JPA named queries, JPA criteria queries, and JPA Native queries are equally fast.

My question is: Has any one done similar tests and observed different results from my mine? How to explain EclipseLink is faster on one machine architecture over the other?

See below for more testing details..

Thanks,
Daoqi

************ more details **************************************

Since a JPA implementation is a framework built on the top of JDBC, a valid question to ask is: how much overhead does a JPA implementation introduce over using JDBC directly. In the following, I use EclipseLink JPA 2.1 to do some performance testing and compare it to JDBC.

A typical insert, update, and delete operation is performed on one row at a time and to one database table only. Thus I am not worried about possible performance overhead from JPA. My test is focused on query performance between JPA and JDBC. My test scenario is constructed as follows:

Use some well-known entities: Order, LineItem, and Customer with some obvious relationships and mapping.

Perform JPA JPQL queries, JPA criteria queries, and JPA native queries, and JDBC queries on separate JVM runs. These queries are functionally equivalent with JPA native queries and JDBC queries perform almost exactly the same SQL queries. The JDBC queries are slightly simplified versions of the SQL queries generated from JPA JPQL queries by EclipseLink.

They simulate a typical web application, in which a max number of records is retrieved from all queries, and the values in the JDBC result set are programmatically converted into a list of Java objects. JPA queries return a list of Java objects automatically. These objects are potentially used in a framework such as Struts or JSF for display in the browser.

A test run using Derby 10.4.2 database on a Windows XP computer with Intel Core Duo CPU, 2.66 GHz, and 3.5 GB RAM gives the following results:

Query Types	Time in seconds
JPA JPQL Queries	53
JPA Criteria Queries	53
JPA Native Queries	53
JDBC Queries	62

This test run has 11984 records in the Order table, 23966 records in the LineItem table, and 47935 records in the Customer table. It shows that JPA queries are 15% faster than their equivalent JDBC queries. Also, JPA criteria queries are as fast as JPQL queries, and JPA native queries have the same efficiency as JPQL queries.

Another test run on the same queries using Derby 10.4.2 database on a Windows XP computer with 2.20 GHz AMD Athlon 64 processor 3500+ and 2.5GB RAM gives the following results:

Query Types	Time in seconds
JPA JPQL Queries	42
JPA Criteria Queries	42
JPA Native Queries	42
JDBC Queries	42

This test run has 11008 records in the Order table, 22008 records in the LineItem table, and 44000 records in the Customer table. It shows that JPA queries are as fast as their equivalent JDBC queries.

Next, I show the three queries that are run multiple times during each test.

Query 1

The first JPQL named query involves a two table inner join and reads:

SELECT o FROM Order o JOIN o.lineItems i WHERE i.price > 5

EclipseLink JPA generates the following SQL query during execution:

SELECT t1.ORDER_ID_PK, t1.CREATE_TIME, t1.CREATE_USER, t1.PRICE,

t1.STATUS, t1.UPDATE_TIME, t1.ORDER_TIME, t1.VERSION,

t1.UPDATE_USER, t1.CUSTOMER_ID_FK

FROM jpatest.LINE_ITEM t0, jpatest.ORDERS t1

WHERE ((t0.PRICE > ?) AND (t0.ORDER_ID_FK = t1.ORDER_ID_PK))

The corresponding first JDBC query is based on the generated SQL above but with fewer data fields in the SELECT clause:

SELECT t1.ORDER_ID_PK, t1.PRICE, t1.STATUS, t1.ORDER_TIME

FROM jpatest.LINE_ITEM t0, jpatest.ORDERS t1

WHERE ((t0.PRICE > ?) AND (t0.ORDER_ID_FK = t1.ORDER_ID_PK))

The first JPA native query executes the same SQL query as the JDBC query above, and the first JPA criteria query corresponds exactly to the JPQL query above.

Query 2

The second JPQL query involves a two table outer join and reads:

SELECT o FROM Order o LEFT OUTER JOIN o.lineItems i

WHERE o.orderTime < :today

EclipseLink JPA generates the following SQL query during execution:

SELECT t1.ORDER_ID_PK, t1.CREATE_TIME, t1.CREATE_USER, t1.PRICE,

t1.STATUS, t1.UPDATE_TIME, t1.ORDER_TIME, t1.VERSION,

t1..UPDATE_USER, t1.CUSTOMER_ID_FK

FROM jpatest.ORDERS t1 LEFT OUTER JOIN jpatest.LINE_ITEM t0

ON (t0.ORDER_ID_FK = t1.ORDER_ID_PK)

WHERE (t1.ORDER_TIME < ?)

The corresponding second JDBC query is based on the generated SQL above but with fewer data fields in the SELECT clause:

SELECT t1.ORDER_ID_PK, t1.PRICE, t1.STATUS, t1.ORDER_TIME

FROM jpatest.ORDERS t1 LEFT OUTER JOIN jpatest.LINE_ITEM t0

ON (t0.ORDER_ID_FK = t1.ORDER_ID_PK)

WHERE (t1.ORDER_TIME < ?)

The second JPA native query executes the same SQL query as the JDBC query above, and the second JPA criteria query corresponds exactly to the JPQL query above.

Query 3

The third JPQL query involves third tables with a subquery and reads:

SELECT c FROM Customer c

WHERE (

SELECT COUNT(o) FROM c.orders o WHERE SIZE(o.lineItems) >= 1

) > :count

EclipseLink JPA generates the following SQL query during execution:

SELECT t0.CUSTOMER_ID_PK, t0.CUSTOMER_TYPE, t0.CREATE_TIME,

t0.UPDATE_TIME, t0.VERSION, t0.PICTURE, t0.CREATE_USER,

t0.INCOME, t0.NAME, t0.UPDATE_USER, t0.ADDRESS_ID_FK,

t0.ACCOUNT_NUMBER, t0.BANK_NAME, t0.ROUTING_NUMBER,

t1.CUSTOMER_ID_FPK, t1.CREDIT_LIMIT, t1.CARD_NUMBER,

t2.CUSTOMER_ID_FPK, t2.EXPIRATION_DATE, t2.DISCOUNT_RATE

FROM jpatest.CUSTOMER t0 LEFT OUTER JOIN jpatest.GOLD_CUSTOMER t1

ON (t1.CUSTOMER_ID_FPK = t0.CUSTOMER_ID_PK)

LEFT OUTER JOIN jpatest.PREFERRED_CUSTOMER t2

ON (t2.CUSTOMER_ID_FPK = t0.CUSTOMER_ID_PK)

WHERE ((SELECT COUNT(t3.ORDER_ID_PK) FROM jpatest.CUSTOMER t4,

jpatest.ORDERS t3 WHERE ((((SELECT COUNT(t5.LINE_ITEM_ID_PK)

FROM jpatest.LINE_ITEM t5

WHERE (t5..ORDER_ID_FK = t3.ORDER_ID_PK)) >= ?)

AND (t0.CUSTOMER_ID_PK = t4.CUSTOMER_ID_PK))

AND (t3.CUSTOMER_ID_FK = t4.CUSTOMER_ID_PK))) > ?)

EclipseLink JPA binds the first parameter with the value of 1.

The corresponding third JDBC query is based on the generated SQL above but with fewer data fields in the SELECT clause:

SELECT t0.CUSTOMER_ID_PK, t0.CUSTOMER_TYPE, t0.PICTURE, t0.INCOME,

t0.NAME, t0.UPDATE_USER, t0.ADDRESS_ID_FK, t0.ACCOUNT_NUMBER,

t0.BANK_NAME, t0.ROUTING_NUMBER,

t1.CUSTOMER_ID_FPK, t1.CREDIT_LIMIT, t1.CARD_NUMBER,

t2.CUSTOMER_ID_FPK, t2.EXPIRATION_DATE, t2.DISCOUNT_RATE

FROM jpatest.CUSTOMER t0 LEFT OUTER JOIN jpatest.GOLD_CUSTOMER t1

ON (t1.CUSTOMER_ID_FPK = t0.CUSTOMER_ID_PK)

LEFT OUTER JOIN jpatest.PREFERRED_CUSTOMER t2

ON (t2.CUSTOMER_ID_FPK = t0.CUSTOMER_ID_PK)

WHERE ((SELECT COUNT(t3.ORDER_ID_PK) FROM jpatest.CUSTOMER t4,

jpatest.ORDERS t3

WHERE ((((SELECT COUNT(t5.LINE_ITEM_ID_PK)

FROM jpatest.LINE_ITEM t5

WHERE (t5.ORDER_ID_FK = t3.ORDER_ID_PK)) >= ?)

AND (t0.CUSTOMER_ID_PK = t4.CUSTOMER_ID_PK))

AND (t3.CUSTOMER_ID_FK = t4.CUSTOMER_ID_PK))) > ?)

The first parameter in this JDBC query is also bound with value 1.

The third JPA native query executes the same SQL query as the JDBC query above, and the third JPA criteria query corresponds exactly to the JPQL query above.

During the test run for JPQL queries, the following Java code is executed multiple times and running time is measured:

protected void queryUsingJPQL() {

EntityManager em = emf.createEntityManager();

try {

// JPQL query 1: join two tables

{

Query q = em.createQuery(jpqlQuery1);

q.setMaxResults(maxRows);

q.setFirstResult(0);

List resultList = q.getResultList();

for (Object o : resultList) {

double price = ((Order) o).getPrice();

if (debug) {

System.out.println("jpql inner join o: " + price);

}

// JPQL query 2: join two tables with outer join

{

. . .

}

// JPQL query 3: join two tables with a subquery

{

. .

}

} catch (Throwable t) {

t.printStackTrace();

}

em.close();

}

Note that a fixed number of records is retrieved for each query to simulate a typical web application in which a page of data is displayed on the screen each time.

During the test run for JDBC queries, the following Java code is executed multiple times and running time is measured:

protected void queryUsingJDBC() {

try {

Connection conn = DriverManager.getConnection(databaseURL,

databaseUsername, databasePassword);

// JDBC query 1: join two tables

{

PreparedStatement ps = conn.prepareStatement(jdbcQuery1);

ps.setDouble(1, 5.0);

ps.setMaxRows(maxRows);

ResultSet rs = ps.executeQuery();

List<Order> orderList = new ArrayList<Order>();

while (rs.next()) {

Order order = new Order();

rs.getInt(1);

order.setPrice(rs.getDouble(2));

int status = rs.getInt(3);

order.setStatus(OrderStatus.fromInt(status));

order.setOrderTime(rs.getTimestamp(4));

orderList.add(order);

if (debug) {

System.out.println("jdbc inner join o: " + order.getPrice());

}

// JDBC query 2: join two tables with outer join

{

. . .

}

// JDBC query 3: join two tables with a subquery

{

. . .

}

conn.close();

} catch (Throwable t) {

t.printStackTrace();

}

To make the JDBC and JPQL queries functionally equivalent, the values in the JDBC result set are converted into Java objects. These objects are typically used by frameworks such as Struts or JSF in the presentation layer. This will make it an apple for apple comparison..

References:
- [eclipselink-users] JPA performance over JDBC
  - From: Daoqi(Doug) Yang

Prev by Date: Re: [eclipselink-users] Eclipse link connection timeout
Next by Date: RE: [eclipselink-users] Having issues with CopyGroup
Previous by thread: [eclipselink-users] JPA performance over JDBC
Next by thread: Re: [eclipselink-users] JPA performance over JDBC
Index(es):
- Date
- Thread

Breadcrumbs