Re: [eclipselink-users] Insert 2 billion rows in db

Ideally, invoking the existing API from a bulk-insertion client should not have the overhead of the entity manager holding the objects in memory, since the transaction begins and ends with the method call; if required, we can even spawn a few threads that create and use different instances of the services.
If the EM still holds references to the entities, then that would be a bug in the persistence provider, since the same mechanism works in an EJB container.
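
For example, a bulk-insertion client could be as simple as the sketch below. OrderService and its insert method are only stand-ins for whatever service API you already expose; the point is that each call is its own transaction and each worker thread uses its own service instance:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: OrderService stands in for your existing service API. Each
// insert() call is assumed to be its own container-managed transaction.
public class BulkInsertClient {

    interface OrderService {
        void insert(long id, String name);
    }

    // Placeholder for a JNDI lookup / remote proxy to the real service.
    static OrderService lookupService() {
        return new OrderService() {
            public void insert(long id, String name) {
                // no-op stub so the sketch compiles
            }
        };
    }

    public static void main(String[] args) throws Exception {
        final int threads = 4;
        final long rowsPerThread = 1000000L;          // tune to your test volume
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            final long offset = t * rowsPerThread;
            pool.submit(new Runnable() {
                public void run() {
                    OrderService service = lookupService();  // one instance per worker
                    for (long i = 0; i < rowsPerThread; i++) {
                        service.insert(offset + i, "row-" + (offset + i));
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.DAYS);
    }
}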

This not only helps us load-test the application, but also lets us reuse the existing code base and keep a single infrastructure for unit testing, load testing, and performance testing. We can get average execution times for all the CRUD activities by utilizing the existing API in most kinds of test cases.

If this doesn't work for you, then I would say stored procedures/functions are an ideal candidate for quickly loading bulk data; in fact, we maintain a bundle of SQL stored procedures that completely mimic all the CRUD operations EclipseLink performs for us. When I compare the execution speed of the stored procedures with the ORM-based approach, I sometimes feel that we Java developers are so obsessed with object-oriented programming that we compromise performance for portability/simplicity.
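
Calling such a procedure from JDBC is then trivial; in the sketch below the connection URL and the BULK_INSERT_ORDERS procedure (and its two parameters) are entirely made up, but the shape is the same for any bulk-load procedure:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

// Sketch only: substitute your own JDBC URL and bulk-load procedure.
public class StoredProcLoader {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:yourdb://host/db", "user", "password");
             CallableStatement call =
                conn.prepareCall("{call BULK_INSERT_ORDERS(?, ?)}")) {
            call.setLong(1, 2000000000L);    // number of rows to generate
            call.setString(2, "perf-test");  // tag for the generated data
            call.execute();
        }
    }
}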


Regards,
Samba

On Mon, Dec 28, 2009 at 8:17 AM, Tim Hollosy <hollosyt@xxxxxxxxx> wrote:
You'll want to look at the JDBC Batch Updates API on the wiki.
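
Roughly, that boils down to switching on EclipseLink's JDBC batch writing for your persistence unit, for example like this (the persistence unit name and the batch size are just examples):

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

// The two eclipselink.jdbc.batch-writing properties are the standard
// EclipseLink ones; they can equally go in persistence.xml.
public class BatchWritingSetup {
    public static EntityManagerFactory create() {
        Map<String, String> props = new HashMap<String, String>();
        props.put("eclipselink.jdbc.batch-writing", "JDBC");       // group INSERTs via JDBC batching
        props.put("eclipselink.jdbc.batch-writing.size", "1000");  // statements per batch
        return Persistence.createEntityManagerFactory("perf-test-pu", props);
    }
}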

But I strongly, strongly, strongly suggest not inserting this many
rows through an ORM. This is not what ORMs are made for. You can
search the list and see that lots of people have had to resort to
little hacks like getting a new EM every 100,000 inserts, etc.
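
For illustration only, one common shape of that hack is to commit and clear every chunk (the entity and persistence unit name below are placeholders; recreating the EntityManager each chunk is the other common variant):

import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Id;
import javax.persistence.Persistence;

@Entity
class PerfRow {                          // minimal placeholder entity
    @Id long id;
    PerfRow() { }
    PerfRow(long id) { this.id = id; }
}

// Commit and clear every BATCH rows so the persistence context never holds
// more than one chunk of managed instances.
public class ChunkedInsert {
    private static final int BATCH = 100000;
    private static final long TOTAL = 2000000000L;

    public static void main(String[] args) {
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("perf-test-pu");
        EntityManager em = emf.createEntityManager();
        em.getTransaction().begin();
        for (long i = 0; i < TOTAL; i++) {
            em.persist(new PerfRow(i));
            if (i > 0 && i % BATCH == 0) {
                em.getTransaction().commit();   // push this chunk to the database
                em.clear();                     // drop the managed instances
                em.getTransaction().begin();
            }
        }
        em.getTransaction().commit();
        em.close();
        emf.close();
    }
}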

Unless you have a very good reason to use EclipseLink for this (it
will save you tons of coding time or something), I would just
populate your database using raw JDBC with batch updates, or a
DB-specific loading tool/SQL script.
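
A bare-bones raw JDBC version would look something like this (the connection URL, table, and columns are placeholders for your own schema):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Bare-bones raw JDBC batch insert sketch.
public class RawJdbcLoader {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:yourdb://host/db", "user", "password")) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO PERF_ROW (ID, NAME) VALUES (?, ?)")) {
                final int batchSize = 1000;
                for (long i = 0; i < 2000000000L; i++) {
                    ps.setLong(1, i);
                    ps.setString(2, "row-" + i);
                    ps.addBatch();
                    if (i % batchSize == 0) {
                        ps.executeBatch();      // send the accumulated statements
                        conn.commit();
                    }
                }
                ps.executeBatch();              // flush the final partial batch
                conn.commit();
            }
        }
    }
}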

./tch



On Sun, Dec 27, 2009 at 5:35 PM, harshavardhan786
<harshavardhan786@xxxxxxxxx> wrote:
>
> Hi,
>
> We are doing some performance tests. One of the tests is to check how the
> system behaves with a populated database. A populated database in our
> application has 2 billion rows (2 billion = the sum of the rows in all tables).
> We use EclipseLink.
>
> I was wondering if we can fill this many entries using EclipseLink? Is there
> any EclipseLink-specific API that does inserts quickly?
> How much time will it take?
>
> Regards,
> Harsha
> --
> View this message in context: http://old.nabble.com/Insert-2-billion-rows-in-db-tp26937805p26937805.html
> Sent from the EclipseLink - Users mailing list archive at Nabble.com.
>
_______________________________________________
eclipselink-users mailing list
eclipselink-users@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/eclipselink-users

