Re: [linuxtools-dev] TmfEventContent data storage

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]

Re: [linuxtools-dev] TmfEventContent data storage

From: Francois Chouinard <fchouinard@xxxxxxxxx>
Date: Sun, 8 Nov 2009 23:42:27 -0500
Delivered-to: linuxtools-dev@xxxxxxxxxxx
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=gac/I1f4VkXwIvgu62q1ME5zCVi7HiJ5vQdijnpUeLS32GUbF0m9Ctv7BTVrvECi8z kV8uZT/eHddCP834EJC9H5GxkPUFTSjYzDfsURTUfxlxoUinDKd2HjJXM7QOZCny+HYa caYp9Feq7y08Wdz0jisd2gaSVmd9ASqzfXWVs=
List-archive: <https://dev.eclipse.org/mailman/private/linuxtools-dev>
List-help: <mailto:linuxtools-dev-request@eclipse.org?subject=help>
List-subscribe: <https://dev.eclipse.org/mailman/listinfo/linuxtools-dev>, <mailto:linuxtools-dev-request@eclipse.org?subject=subscribe>
List-unsubscribe: <https://dev.eclipse.org/mailman/listinfo/linuxtools-dev>, <mailto:linuxtools-dev-request@eclipse.org?subject=unsubscribe>

Hi Franz,

> The problem is that the content now parses the fields in the constructor
> which means that the map of fields is generated even for events that are
> only used to index the trace.

Agree. I put the parseContent() call back in the getField*() methods so it is only called on demand and if needed. This should speed up the indexing.

> Also I think that the map of field id’s to field values that is stored
> in each event is a performance issue. Why do you not keep the access
> function with the integral id? When used from a GUI side it would also
> be much faster getting the field values by an integer than by a String.
> Because a GUI would normally get a list of labels for the columns and
> then fill it with fields. It makes no sense from my point of view having
> an accessor function by a field label which would end up in a map-lookup.

The event content is not handled solely by GUI tables. Some (LTTng) components need to access fields individually and storage in a Map structure, indexed by field ID, is more versatile in that case.

Also, it seems to me that it is safer to have the method parseContent() return a Map of the parsed fields instead of an array. This way the FW doesn't have to worry about [1] variable list of fields (missing or optional), [2] positional errors, [3] incompatible array sizes (number of fields different than number of parameters). All this can very well be handled by the parseContent() implementor but it definitely imposes an extra constraints on him/her.

Finally, I like the possibility of augmenting the set of attributes/flags of the event as it navigates through the application. A map gives us more flexibility for this.

It is expected that there will be relatively very few events displayed in tabular format at any time (a few tens, not millions) so the extra map lookup time should not even be perceivable form the user standpoint. However, I agree that removing the index-based field access function wasn't a good idea and I put it back.

> And finally - why is data stored that often in the TMF class structure?
> The content holds a list of field labels and a reference to the type.
> The type holds a list of field labels. The content holds a map of field
> label to values. And finally the field itself holds a copy of the field
> label. That means that each field label is stored on 4 different places
> within the same object. For the content I think it would be much faster
> to store a type reference. The list of field names can then be retrieved
> by asking the type. And the map should be replaced by a list of fields
> (like it was before) that is lazy-parsed. This way indexing will speed-up
> a lot.

I removed the list of labels from TmfEventContent and used the TmfEventType reference. This adds an indirection but only if individual fields need to be accessed.

From a modelling perspective, I think that it is not a bad thing if a field knows its ID. In the big scheme of things, the field might initially be accessed through Event, then Content, but it could potentially be handled out of context and it wouldn't be easy to find out its ID if we had to navigate through the structure, figure out its position, etc.

I changed the type of the fields map to <String, Object>. If you don't care about the field ID then you only need to store the field content. If you need the field ID, it can be stored as a TmfEventField instead (holding ID and content).

I posted the updated patch in https://bugs.eclipse.org/bugs/show_bug.cgi?id=287562

.

Regards,
/fc

Follow-Ups:
- [linuxtools-dev] First event missing in Experiment
  - From: Polzer, Franz
- RE: [linuxtools-dev] TmfEventContent data storage
  - From: Polzer, Franz

References:
- [linuxtools-dev] TmfEventContent data storage
  - From: Polzer, Franz
- Re: [linuxtools-dev] TmfEventContent data storage
  - From: Francois Chouinard
- AW: [linuxtools-dev] TmfEventContent data storage
  - From: Polzer, Franz
- Re: [linuxtools-dev] TmfEventContent data storage
  - From: Francois Chouinard
- RE: [linuxtools-dev] TmfEventContent data storage
  - From: Polzer, Franz

Prev by Date: Re: [linuxtools-dev] TmfEventContent data storage
Next by Date: RE: [linuxtools-dev] TmfEventContent data storage
Previous by thread: RE: [linuxtools-dev] TmfEventContent data storage
Next by thread: RE: [linuxtools-dev] TmfEventContent data storage
Index(es):
- Date
- Thread

Breadcrumbs