Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
[linuxtools-dev] Re: TMF indexing performance


- What are the events ranked N to M?
- What are the events that occurred in time range [T1,T2]?

The timestamp criteria is useful for time-based analysis while the rank criteria is useful for tabular display.

Yes, analysis is "time based" but tabular display gets us locally into rank based access. For time based access, we have per CPU tracefiles, each being a sorted array of fixed size blocks, each block being filled with variable size events. The access time is O(log n) because of the binary search among blocks. The average number of events in a block is a constant anyway. Moreover, the number of blocks accessed is more or less the number of disk accesses, for a very large trace, which dominates the elapsed time as compared to searching linearly within a block.

The interesting thing is that this works well even when we "merge" several traces together. We have one tracefile per CPU, we may have different "channels" to trace different event types on the same system or we may want to view traces from several distributed computers interacting together. To view the events at time t, we seek each tracefile to time t and then we retrieve events in time order using a heap of the next events available from each tracefile; this is basically an external merge sort. You simply stop when you have filled your display table.

If you want to get the events from N to M in a tracefile, we could envision noting the number of elements in each block and the rank of the first event in each block. Either this is computed as we trace (atomic increment of the number of events in a buffer) or a post-processing step will be required. Given that information, we could get events N to M as efficiently as our time based queries.

The problem is when we merge several traces together. The rank is on a "sorted time basis" and is known only as we merge sort the events. Indeed, getting 1 000 000 events later may require 9000 000 events from one file and only 100 000 from the other tracefile. If we remove a trace from a traceset and bring in another one for the analysis (i.e. look at nodes 1, 2, 4 versus 1, 3, 5), the "merged" ranking is to be recomputed!

We had similar problems with filters a few years back. If you want to fill the display with the 20 next events from time t after applying a filter (e.g. show events of type interrupt where irq number is 12) you have no idea how many blocks you may need to look into before filling the 20 slots.

I dont have an easy solution for general rank based processing.

Back to the top