
Re: [linuxtools-dev] [TMF] Advanced statistics view

>> this method only requires a single "attribute" to store the cumulative time and whether or not the process is in an active interval. Its value is changed at schedule in and at schedule out. However, you need to do additional queries. For instance, if you want to show the CPU usage per process for an interval t0,t1 (a pixel on screen), you need to query at t0 and t1 and, for each process, you also need the value at its respective interval.startTime - 1, leading to an additional query per process...
> Yes, this is an issue. I think since the number of processes is
> relatively small (under 65535, or in reality under 1k) and, for a given
> time slice, most threads/processes would not change, this may or may not
> be a performance issue. It should be benchmarked.

If you fall on a "real" interval, there are no additional queries. If you
fall on a null interval, you have to do one additional query, to get the
previous interval. If you want to query a range, you already have to do 2
queries, by definition. So in the worst case you end up doing 4 queries
instead of 2. I don't think this is a big problem from a performance point
of view (especially since those intervals are close in time, they will be
close in the history backend, perhaps even in the same block, in which
case it doesn't even have to go to disk).
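
Concretely, computing the value for one pixel with the single
cumulative-time attribute would look roughly like this. This is only a
sketch from memory of the state system API; cumulativeTimeAt() is a
helper name I made up, and the specific checked exceptions of the real
API are collapsed into a generic throws:

    /* Cumulative CPU time of one process at time t, using the single
     * "cumulative time" attribute discussed above.  Worst case: 2 queries. */
    static long cumulativeTimeAt(ITmfStateSystem ss, int quark, long t)
            throws Exception {
        ITmfStateInterval interval = ss.querySingleState(t, quark);
        ITmfStateValue value = interval.getStateValue();
        if (!value.isNull()) {
            /* "Real" interval: the process is scheduled out, so the stored
             * total is already exact at t.  No extra query needed. */
            return value.unboxLong();
        }
        /* Null interval: the process has been running since the interval's
         * start.  One extra query gives the total accumulated before that. */
        long runStart = interval.getStartTime();
        ITmfStateValue prev = ss.querySingleState(runStart - 1, quark).getStateValue();
        long before = prev.isNull() ? 0 : prev.unboxLong();
        return before + (t - runStart);
    }

    /* CPU time consumed by the process in [t0, t1] (one pixel on screen):
     * 2 queries in the best case, 4 in the worst. */
    long usage = cumulativeTimeAt(ss, quark, t1) - cumulativeTimeAt(ss, quark, t0);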

>> One possible alternative is to store the cumulative CPU time in one attribute, and the entryTime of the current interval in another when the process is scheduled in and thus running (or NULL when scheduled out). This would be 2 attributes instead of 1 in the current scheme, with 1 change at schedule in and 2 at schedule out (thus 3 changes instead of 2 in the history). However, you would not need any of the additional queries, and there should be no problem with partial history storage optimizations.
> Yes, this is an interesting idea. About partial history vs. full
> history, this is something where a partial history is IMO not at all
> beneficial, since the intervals are large and the state-changing events
> are few and far between, relatively speaking, on the kernel front. This
> state system empirically takes approx. 10 MB (20 MB for the new system)
> for every GB of the full state system, so trying compression to save
> space here is like trying to balance the US economy by cutting PBS's
> funding. Alex will probably disagree with me here.
>
> With very, very large traces, I can't tell if this state system will be
> larger than, say, a partial stats history tree. I think some
> investigation is needed.

If you can fit it in a partial history, it will most likely be smaller
than a full history, unless you use 1000 times more attributes ;)

An interesting point with the proposed method is that we already store
the status of each process in the standard kernel state system, which can
indicate whether the process was on or off a CPU at any given time. So we
could avoid duplicating this information.
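
For example, something along these lines would answer "was this thread on
a CPU at time t" from the existing kernel state system; the attribute path
and the StateValues constants are written from memory here, so treat them
as assumptions:

    /* ss = the kernel state system, tid = the thread id as a string. */
    int statusQuark = ss.getQuarkAbsolute("Threads", tid, "Status");
    int status = ss.querySingleState(t, statusQuark).getStateValue().unboxInt();
    boolean onCpu = (status == StateValues.PROCESS_STATUS_RUN_USERMODE
                  || status == StateValues.PROCESS_STATUS_RUN_SYSCALL);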

I still have to wrap my head around it (I'm not sure if it implies using
back-assignments or not), but it's definitely good to experiment.
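
To make the two-attribute variant quoted above a bit more concrete, the
sched_switch handler in the state provider could do roughly the following.
Again only a sketch: "RunningStart" and "CumulTime" are made-up attribute
names, and the real API's checked exceptions are collapsed into a generic
throws:

    /* Thread scheduled IN at time t: one state change. */
    static void onSchedIn(ITmfStateSystemBuilder ss, int threadQuark, long t)
            throws Exception {
        int startQuark = ss.getQuarkRelativeAndAdd(threadQuark, "RunningStart");
        ss.modifyAttribute(t, TmfStateValue.newValueLong(t), startQuark);
    }

    /* Thread scheduled OUT at time t: two state changes. */
    static void onSchedOut(ITmfStateSystemBuilder ss, int threadQuark, long t)
            throws Exception {
        int startQuark = ss.getQuarkRelativeAndAdd(threadQuark, "RunningStart");
        int cumulQuark = ss.getQuarkRelativeAndAdd(threadQuark, "CumulTime");

        long entry = ss.queryOngoingState(startQuark).unboxLong();
        ITmfStateValue cur = ss.queryOngoingState(cumulQuark);
        long cumul = cur.isNull() ? 0 : cur.unboxLong();

        /* Fold the run that just ended into the cumulative total... */
        ss.modifyAttribute(t, TmfStateValue.newValueLong(cumul + (t - entry)), cumulQuark);
        /* ...and mark the thread as scheduled out. */
        ss.modifyAttribute(t, TmfStateValue.nullValue(), startQuark);
    }

With that layout, a query at any time t needs no extra queries: either
RunningStart is null and CumulTime is already exact, or
CumulTime + (t - RunningStart) gives the value directly.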

