Bug 176833 - change relational mapping of EMap
Summary: change relational mapping of EMap
Status: VERIFIED FIXED
Alias: None
Product: EMF
Classification: Modeling
Component: Teneo (show other bugs)
Version: unspecified   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: Martin Taal CLA
QA Contact:
URL:
Whiteboard:
Keywords: contributed
Depends on:
Blocks:
 
Reported: 2007-03-09 11:42 EST by Jean-Denis Boudreault CLA
Modified: 2008-06-13 05:59 EDT (History)
1 user (show)

See Also:


Attachments
The patch of the changes for the bug (17.93 KB, patch)
2007-03-11 13:04 EDT, Jean-Denis Boudreault CLA
no flags Details | Diff
The new updated code and lazy loaded emaps (39.07 KB, patch)
2007-03-14 21:17 EDT, Jean-Denis Boudreault CLA
no flags Details | Diff
The newly refactored EMap (42.10 KB, patch)
2007-03-15 21:02 EDT, Jean-Denis Boudreault CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jean-Denis Boudreault CLA 2007-03-09 11:42:50 EST
After a good discussion with Martin,

it is believed that there is a possibility to change the current mapping of EMaps to avoid the KeyToValueEntity table and map it as a real map. 

this would also add the possibility to enable the @MapKey EJB annotation since the child table would be the entity directly, with all its available columns, and not a 'middleman' entity.

i herby add the log of the conversation on the newsgroup on the topic:

Hi Jean-Denis,
No problem of being verbose :-).
Just to react on point in your description: you say that the developer manually has to create/handle the keyToValueEntry class. However these entry classes are also generated by emf. The entry classes are hidden for the developer using the emf objects. So the developer does not need create/handle these classes manually. Or am I maybe missing your point here?

For me the real issue is in the current database mapping. From a relational-mapping perspective the issue with the current approach is that the resulting database model has redundant information: the key is repeated in a separate field while it can be present in the value-entity. Also the current approach will always create a join table while this is not always required. In addition the entry class will always get a synthetic id and version which does not work in case of client-server scenarios.

Your proposal is interesting and certainly something to try out. I can also look again at mapping an emap as a real map.
Can you enter a separate bugzilla for this (change relational mapping of emap) and copy the contents of these posting into it? Then I can take it from there. This is some work so it will probably take some time (number of weeks) before I can spend time on it.

gr. Martin

	Jean-Denis Boudreault wrote:
	
	Hello again,


	i deeply apologize for the verbose, but i think this can actually be very useful. This post may be long and detailed, but i want to make sure it is clear.


	So, i went a head and took a look at the BasicEList and BasicEMap implementation files of EMF 2.3. So, now i understand what you were saying about EMaps. it is in reality just a list, of the type map.entry. Where we get an object by key in the EMap, it basically loops the list until it finds the entry for that key. (I guess we should create an EMF feature request for a EHashmap then since this is not very efficient for large lists).


	but, back to teneo. lets take this POJO scenario:

	<code>
	public class parent
	{
	  int ID;

	  // the key is the GUID
	  public EMap<String, child>
	}

	public class child
	{
	  int ID;
	  string GUID;
	  String name;
	}
	</code>

	basically, a parent class holds a collection of child class, and to accelerate lookup, keeps a map where the key is the unique string GUID of the parent class. the datbase tables might look like this:

	<dbschema>
	--------------------
	|      PARENT      |
	--------------------
	| Int     |   ID   |
	--------------------



	--------------------
	|      CHILD       |
	--------------------
	| Int     |   ID   |
	| Int     |parentID|
	| varchar |  GUID  |
	| varchar |  name  |
	--------------------

	where there is a join in ID/ParentID

	</dbschema>

	Now, if we created a EList collection instead of a EMap, it would be a simple thing, we would simply add a manyToOne/OneToMany hibernate relationship on the collection to get it going.

	but with emaps, its slightly different. to go with EMF, what teneo currently does (if i correctly understand Martin's explanation) is to add a third table, which is a 'middleman' joining table that represents the map.Entry. this table has a field for the key and the value. this im not sure, but i assume the value could also be a oneToOne join to the child table. For this reason, the @MapKey is not supported, since there is no other field to select the key from; just the defined field 'KEY' in that table.

	here is my suggestion: (i hope it makes sense).

	when we think of it, a map really is just a list, with a key that simplifies each entries lookup (hence the EMap design decision). for this reason, pure hibernate allows for parent/*childs collections to be mapped either as lists, or as maps where we can define which column can be used as the key (hence @MapKey).

	i belive it could be possible to allow teneo to function the same way, by eliminating the burden on the developper of maintaining manually the map.Entry 'KeyToValueEntry' class. Teneo could consider the EMap mapping to be a EList, with a key column. The table mapping could remain the same as EList.  when rebuilding the list at load time, Teneo could dynamically create a map.Entry<K,V> for each list entry, and then use the key column defined to set the key value. this is basically what hibernate does too.

	so, we could have in the database:

	[parent table]
	ID: 1

	[child table]
	ID: 1; ParentID: 1l GUID: 'whatever', name: 'name1'
	ID: 2; ParentID: 1l GUID: 'whatever2', name: 'name2'
	ID: 3; ParentID: 1l GUID: 'whatever3', name: 'name3'

	then, the runtime loaded class could have a list of childs, and the key set at the GUID (defiend by @MapKey annotation).

	we could then search by key (here GUID) like this:
	child chld = parent.childs.Get("whatever2"); // returns the child with ID=2


	so, basically, my suggestion is to keep closer to the hibernate methodology and map it to EMF, rather than map EMF to hibernate. This way is just as valid, but removes the need to manually create / handle the keyToValueEntry classes as well as map them to the table schema, complicating map like options such as defining the key.


	so, what do you think?


	thanks and i hope this helps


		Hi,
		
		i was just thinking about this and i think i get your point.
		
		so basically if i get what your saying, the mapentry class is defined in a
		table of its own (entity), and the key is probably the unique ID of this
		class, is that it? the value itself is either a column type or a onetoone
		reference to another entity table.
		
		so, following this logic, there would be no way to make direct relationships
		to other entity table, without going through this 'intermediary' entity
		being the mapentry.
		
		If this is so, could it be possible to eliminate the need to explicitely
		define a mapEntry entity and let teneo generate one in the background? foe
		example:
		
		
		KeyToValueEntry<K, V>.
		
		then, there would be no need to have an intermediary table, and we could
		directly specify the key column and the entity type. so we could create:
		
		<pseudocode>
		ClassA
		{
		  public EMap<Integer, ClassB> field;
		}
		</pseudocode>
		
		and with anotations define the column of the entity that is used as the Key
		with @MapKey. in the background, teneo could generate this
		KeyToValueEntry<K, V> list and dynamically rebuild an EMap.  In the xml.hbm
		file, there would be not intermediary entity, just a map from EnityA to
		EntityB with a key column defined.
		
		how does that sound?   (i could be right off, sorry if i am, jsut trying to
		help)
		
			"Jean-Denis Boudreault" <jdboudreault@xxxxxxxx> wrote in message
			news:esq552$ekj$1@xxxxxxxxe.org...
		
			Hi Martin, thanks for your very detailed answers!
		
			I guess where i am confused is how a list can behaves like a map, since
			how does it provide the key/value functionality. i explain:
		
			I just found on the EMF FAQ
			(http://wiki.eclipse.org/index.php/EMF-FAQ#How_do_i_create_a_Map_in_EMF.3F)
			where they explain about EMaps. it surely concurs with your sayings. I
			have two questions from there (i hope these questions dont annoy you, but
			if i better understand, perhaps i can help too :))
		
			1. In my ecore diagram,  i create an EAttribute of EType EMap <K, V>
			[java.util.Map]. When i generate the code, i get a good old java.util.Map.
			I guess this is where i get confused. with a java.util.Map,  can't we get
			hibernate to function like it always did?  and from there, wouldn't it be
			a good idea to implement the @MapKey annotation?
		
			2. Im pretty sure im not understanding well and you will explain to my why
			the java.util.Map thing is wrong. then, if the EMap is a list of custom
			mapEntries class, what happens behind the scenes when i do: myMap.ge(2)
			specifying a key value?  is there a hidden foreach in the implementation
			that iterates the list of entries and returns the found result?  and if
			so, what prevents us from mapping a column in a oneToMany relationship as
			the key to the mapEntries?  there, the @MapKey annotation would be useful,
			to define which column in the entity is to be used for key, instead of
			forcing the ID column.  am i being clear?
		
			basically, if i get this right, we define a EClass for the EMap entries,
			right?  so, suppose we take my database example from before:
		
			<scenario>
			TableA
		
			[ID: 1, Name: "entry1"]
		
		
			TableB (joins table A on ID:ParentID)
		
			[ID: 1, ParentID: 1, name="entry1", someUniqueField=233]
			[ID: 2, ParentID: 1, name="entry2", someUniqueField=34456]
			[ID: 3 ParentID: 1, name="entry3", someUniqueField=2342]
		
			</scenario>
		
			Then, if we want a map of type tableB with the key being someUniqueField
			column, we could have:
		
			SomeUniqueFieldToTableBEntry<Integer, TableBClass>
		
			with this, it would make sense to have @MapKey defined to select the
			column to be used to set the key.
		
			am i making any sense?
		
			anyways, i hope im getting this right, since perhaps i really dont know
			what goes under the hood.
		
			thanks for your help, this is much appreciated!  :)
		
				"Martin Taal" <mtaal@xxxxxxxx> wrote in message
				news:espslr$uoj$1@xxxxxxxxe.org...
		
				Hi JD,
				Teneo maps from emf to hibernate (so the other way around). This means
				that it maps the emap to hibernate in correspondence to the emf approach.
				For an emap entry emf (so not teneo but emf itself) will generate a
				separate class (for example stringtostringemapentry or something). The
				emf emap is not a real map but a list of these map entries (so not a
				hashmap with map entries). Because it is a list it can not be mapped
				easily to a real map-like model in the db.
				Teneo takes the separately generated map entry class and creates a
				mapping for that class as if it is in the list of the owner. The entry
				class is modeled as a separate entity with its own primary key (for
				example a generated id). This key has nothing to do with the map key (it
				does not really need it).
				Teneo just follows the emf approach here.
		
				gr. Martin
		
				JD wrote:
		
				thanks for your answer.
	
				im not sure if it get it all, perhaps i dont understand very well how
				teneo maps from the hibernate engine to the EMF datatypes.
	
				But if i get it right, Teneo creates a oneToMany/ManyToOne relationship
				to the owner object,is that it? then, how dies it define the key? is it
				possible to perform this kind of mapping in teneo and define the column
				used for the key? (or perhaps the key column is defined automatically?)
	
				<scenario>
				TableA
	
				[ID: 1, Name: "entry1"]
	
	
				TableB (joins table A on ID:ParentID)
	
				[ID: 1, ParentID: 1, name="entry1"]
				[ID: 2, ParentID: 1, name="entry2"]
				[ID: 3 ParentID: 1, name="entry3"]
	
	
				then, in code:
	
				class TableAEntity
				{
					public Map<Integer, TableBEntity> childs;
				}
				</scenario>
	
				where the key of the map is mapped to the ID column of TableB and the
				value of the map is the TableB entity for this ID.
	
				thanks for your time!
Comment 1 Jean-Denis Boudreault CLA 2007-03-11 13:04:33 EDT
Created attachment 60517 [details]
The patch of the changes for the bug

no unit tests
Comment 2 Jean-Denis Boudreault CLA 2007-03-11 13:07:53 EDT
OK, here it is. in short:

Since the keyToValueEntry EMap mechanism is so engrained in EMF, there was no way to change it.  so basically, i changed the way teneo sends the map data to hibernate and how it loads it back.  The keyToValueEntry class now becomes an EMF wrapper, but it is not serialized ot the database anymore. hibernate sees the map as being parent->child directly. this also means that we can select any column of the child table to be the key, hence @MapKey is now enabled.

change log: 

1. When a EMap reference field is defined in the ECore model, Teneo will not create a <list> anymore in the hbm.xml but rather a <map>. i enabled the @MapKey annotation on the <map>, so this annotation now works. we can now set any column in the target entity as the key. (note, i also enabled the @Where annotation).

2. when teneo sends the delegate list to hibernate, it unwraps the inner entities from the keyToValueEntry into a java.util.map that hibernate understands.

3. When teneo loads from hibernate, it wraps the java.map.map.entry items into the keyToValueEntry wrapper defined in the ecore model.


4. i did not do any unit test in the test framework. i deeply appologize, but i was missing many libraries and didn't have time to make it compile. i hope you dont minde. if i do any more contributions, i will get it to work.


hope this helps!  :)

NB: patch is attached.
Comment 3 Martin Taal CLA 2007-03-11 13:14:28 EDT
Okay, thanks I'll look at it. You did run a testcase for yourselve to test if it worked?

gr. Martin
Comment 4 Jean-Denis Boudreault CLA 2007-03-14 09:18:53 EDT
Hi Martin,


i was thinking about it and i may be able to improve the patch.  does teneo support lazy loading of emaps? and if so, can you tell me on which event it actually physically "loads" the data?

i could check to see if i can move the delegate data wrapping which is currently in the HibernatePersistableEmap constructor to that lazy loaded event.

i also though of an improvement in the wrapping to check if an entry already exists and avoid rebuilding the map at each hibernate save.

in any case, please put a hold on the evaluation of the patch, i will upload a new one tonight based on these improvements and what your answer will be. (i will take the opportunity to improve the coding style, for the sake of it)

thanks
Comment 5 Martin Taal CLA 2007-03-14 09:30:28 EDT
Hi J-D,
I think the lazy loading should work fine. Just refering to the delegate orm list does not load it. Only when the delegate orm list is actually accessed then it is loaded. The persistableemap has a doload method which forces the load (in a transaction). 
But testing if this actually works is ofcourse of great value.

gr. Martin
Comment 6 Jean-Denis Boudreault CLA 2007-03-14 10:30:19 EDT
Hi,


ok, i get it. i guess i will move the wrapping code from the constructor to the doLoad method, where it is not at the moment. 

as for lazy loading, i got this line from HibernatePersistableEMap class javadoc: "Note an emap is not loaded lazily!"; and of course as ou said, doLoad is invoked in the constructor.  is there a reason for not lazy loading EMaps? I ask as lazy loading is a desirable feature for large models, why force it out? and can I add lazy loading in HibernatePersistableEMap?

thanks
Comment 7 Martin Taal CLA 2007-03-14 10:35:23 EDT
Actually I had forgotten that doload was in the constructor :-) and I do not remember why I put it there. As a first thought I would say that a emap should also be lazy loadable just as an elist. You can look at the elist to see how this is done (override the delegatesize, delegateindexof etc. methods).

gr. Martin
Comment 8 Jean-Denis Boudreault CLA 2007-03-14 10:57:41 EDT
great, thanks.

i will ensure emaps are lazy loaded.  stay tuned for a patch update...
Comment 9 Jean-Denis Boudreault CLA 2007-03-14 21:16:38 EDT
ok, i made some changes to the code to support lazy loading on EMaps; performances should be greatly enhanced. i also somewhat cleanup some code and comments.

everything should work fine, my own tests worked correctly, although they may not cover all possible scenarios. please try with teneo unit tests.
Comment 10 Jean-Denis Boudreault CLA 2007-03-14 21:17:16 EDT
Created attachment 60885 [details]
The new updated code and lazy loaded emaps

The new updated code and lazy loaded emaps
Comment 11 Jean-Denis Boudreault CLA 2007-03-15 11:53:45 EDT
Comment on attachment 60885 [details]
The new updated code and lazy loaded emaps

just noticed i could make it a little better. new patch to come soon.
Comment 12 Jean-Denis Boudreault CLA 2007-03-15 21:02:16 EDT
Created attachment 61032 [details]
The newly refactored EMap

This new version is a vast refactor of the original map. it now supports lazy loading, size without loading a lazy loaded collection, etc.

it was tested through my own tests, but should be run against teneo's unit tests to be sure. im not sure if it all works, but it is definitively a better Map form.
Comment 13 Martin Taal CLA 2007-03-18 15:18:08 EDT
Hi Jean-Denis,
Thanks for the contribution. I divided the map implementation in two parts: 1) the old method (using list-mapping), and 2) the new method (using the real map-mapping). 1 is implemented in the PersistableEmap and HibernatePersistableEMap). 2 is implementated in the MapPersistableEMap and MapHibernatePersistableEMap.
I re-used most of your patch to support scenario 2.
Scenario 2 will be default, with an option a user can use the old scenario. 
I tested this with the teneo test case and it all worked.
It has all been checked in in cvs. 
Can you also try (you need to get a getlatest for all projects)?
Thanks!

gr. Martin
Comment 14 Martin Taal CLA 2007-03-21 18:24:51 EDT
handled in release: 0.8.0 I200703211738
Comment 15 Nick Boldt CLA 2008-01-28 16:56:55 EST
Move to verified as per bug 206558.
Comment 16 Martin Taal CLA 2008-06-13 05:59:15 EDT
Fix available in HEAD: 1.0.0RC4 (S200806111928).