Re: [linuxtools-dev] [TMF] Advanced statistics view

Hello

I would like to raise a more general case: supporting "group by" queries over different attributes and resources.

Regarding the problem mentioned by François, suppose we want to calculate the per-CPU utilization of each process (select the CPU usage on each CPU separately, grouped by process):
> Process #1 : 25% total
>     -> CPU0 : 20%
>     -> CPU1 : 5%
>
> Process #2 : 10% total
>     -> CPU0 : 10%
>     -> CPU1 : 0%

At the same time, suppose we also want the reverse statistics: the per-process utilization of each CPU (select the CPU usage of each process separately, grouped by CPU):
> CPU0 : 30% total
>     -> Process #1 : 20%
>     -> Process #2 : 10%
>
> CPU1 : 5% total
>     -> Process #1 : 5%
>     -> Process #2 : 0%

As another example, suppose we want to calculate the IO throughput of processes and files, grouped by each one separately:

For IO throughput:
Process #1 : 25% total
    -> File0 : 10%  (quark: 1)
    -> File1 : 5%   (quark: 2)
Process #2 : 15% total
    -> File0 : 5%    (quark: 5)
    -> File1 : 10%   (quark: 6)
and
File0 : 12% total
    -> Process #1 : 8%   (quark: 10)
    -> Process #2 : 4%   (quark: 11)
File1 : 20% total
    -> Process #1 : 10%  (quark: 15)
    -> Process #2 : 10%  (quark: 16)

With the current organization of the attribute tree, we may need to duplicate the data and store it twice in the history tree, with a separate value for each attribute pair. For example, cpu1 --> process1 and process1 --> cpu1 have different quark values, so their identical statistics values must be stored separately, in different places of the history tree.
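
To make the duplication concrete, here is a minimal Python sketch. It is not the TMF API: the AttributeTree class, its get_quark method, and the path names are hypothetical stand-ins, and a plain dictionary plays the role of the history tree. It only assumes that each distinct attribute path gets its own quark.

```python
class AttributeTree:
    """Toy path-indexed attribute tree (hypothetical, not the TMF class)."""

    def __init__(self):
        self._quarks = {}  # path tuple -> quark

    def get_quark(self, *path):
        # Hand out the next free integer the first time a path is seen.
        return self._quarks.setdefault(path, len(self._quarks))


tree = AttributeTree()
store = {}  # quark -> value, standing in for the history tree

# The same measurement (Process #1 on CPU0 = 20%) is reachable by two paths.
by_process = tree.get_quark("Processes", "1", "CPUs", "0")
by_cpu = tree.get_quark("CPUs", "0", "Processes", "1")

# Two different quarks, so the identical value has to be written twice.
store[by_process] = 20
store[by_cpu] = 20
```

Both entries hold the same number, but because the quarks differ, nothing in the store ties them together.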

However, it may be useful to relax the definition of the attribute tree somewhat and let different applications define their own organization of the attributes.


For instance, I suggest a new organization for managing the statistics:


1- First, create a separate hierarchy for each kind of resource.
Processes 
    -> Process #1
    -> Process #2

CPUs 
    -> CPU #1
    -> CPU #2

Files 
    -> File #1
    -> File #2

2- Then, define metric nodes between the different resources and assign each one its own quark value. For example, we define a "CPU usage" metric node between each process and each CPU:
    -> Process #2
              ---> CPU usage    (quark: 1)
    -> CPU #1

or an "IO" metric node between each file and each process:
    -> Process #1
              ---> IO           (quark: 2)
    -> File #3

This organization avoids duplication in the history tree: for each tuple (e.g. a process and a CPU), only one value is stored.
Furthermore, it supports different "group by" queries, aggregation functions, etc.
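
As a rough illustration of the idea, the Python sketch below keeps one entry per (process, CPU) pair and derives both views from it. The dictionary key stands in for the single quark of a metric node, the group_by helper is a hypothetical name, and the values are the ones from the CPU example earlier in this message. It is only a sketch of the proposed organization, not an implementation on top of the history tree.

```python
from collections import defaultdict

# One metric node per (process, cpu) pair: each value is stored exactly once.
cpu_usage = {
    ("Process #1", "CPU0"): 20,
    ("Process #1", "CPU1"): 5,
    ("Process #2", "CPU0"): 10,
    ("Process #2", "CPU1"): 0,
}


def group_by(metric, index):
    """Aggregate a pair-keyed metric by one side of the pair.

    index 0 groups by process, index 1 groups by CPU.
    Returns (totals, details) as plain dictionaries.
    """
    totals = defaultdict(int)
    details = defaultdict(dict)
    for pair, value in metric.items():
        key, other = pair[index], pair[1 - index]
        totals[key] += value
        details[key][other] = value
    return dict(totals), {k: dict(v) for k, v in details.items()}


per_process, per_process_detail = group_by(cpu_usage, 0)
per_cpu, per_cpu_detail = group_by(cpu_usage, 1)
```

Here per_process and per_cpu reproduce the totals of the two CPU listings above, computed from the same four stored values.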


Thanks,
Naser
