Bug 578055 - Add JFR Events (Java Flight Recording)
Summary: Add JFR Events (Java Flight Recording)
Status: CLOSED WONTFIX
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Runtime (show other bugs)
Version: 4.22   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: platform-runtime-inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: investigate
Depends on:
Blocks:
 
Reported: 2022-01-05 06:51 EST by Jörg Kubitz CLA
Modified: 2023-10-09 05:41 EDT (History)
9 users (show)

See Also:


Attachments
Example file events - IDE startup.png (177.18 KB, image/png)
2022-01-05 06:51 EST, Jörg Kubitz CLA
no flags Details
Example Job events - IDE startup.png (144.31 KB, image/png)
2022-01-05 06:51 EST, Jörg Kubitz CLA
no flags Details
jfr event patches.zip (8.52 KB, application/x-zip-compressed)
2022-02-02 04:47 EST, Jörg Kubitz CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jörg Kubitz CLA 2022-01-05 06:51:08 EST
Created attachment 287778 [details]
Example file events - IDE startup.png

I would like to know if others would like to see Eclipse specific JFR Events in Eclipse too. They do not offer any value for the user. But they would allow eclipse developers to get a better understanding of what is going on in eclipse. Hopefully help to understand performance issues or optimization possibilities.
JFR adds stacktrace, thread name, timestamps and duration for free. 
Additional events would only help if they provide additional information like a name of something.
Events do add information how often the events occurred for how long with a very fine time resolution. While sampling only reveals a coarse total amount time. And profiling is often a no go  due to the big overhead.
Unlike logging the Events are well structured and can be sorted / grouped with Tools like JDK Mission Control.

Possible Events:
1. File: name, length
 Most of performance issues are OS related (SWT or File access). Since many of eclipses file method are native they can not be observed with the JDK's standards Events. Seeing which files where accessed hopefully helps to eliminate such calls. Example: Currently at IDE start the same files/directories are accessed multiple times.

2. Build: name, reason
 Build time for big workspaces is a problem. I'd like to understand why which projects is build (for us sometimes it feels like they are unnecessary build again and again) and which project takes longest.

3. Job: name, class, result, (locking) rule, ...
 Sometimes Jobs lock workspace or project for some time. I don't know yet how it may help but there is a lot of information available that may give a deeper insight.

There will be a (very) small performance overhead - but only until the methods are compiled: after compilation it is known that the code is completely optimized away if the events are disabled.
To enable JFR Events one need to start the JVM with special parameters:

-XX:StartFlightRecording:filename=dump.jfr,dumponexit=true -XX:FlightRecorderOptions=stackdepth=-1

Or with custom settings like:

-XX:StartFlightRecording:filename=dump.jfr,dumponexit=true,settings=C:/setting.jfc
Comment 1 Jörg Kubitz CLA 2022-01-05 06:51:36 EST
Created attachment 287779 [details]
Example Job events - IDE startup.png
Comment 2 Eclipse Genie CLA 2022-01-05 06:53:55 EST
New Gerrit change created: https://git.eclipse.org/r/c/platform/eclipse.platform.resources/+/189305
Comment 3 Eclipse Genie CLA 2022-01-05 06:53:58 EST
New Gerrit change created: https://git.eclipse.org/r/c/platform/eclipse.platform.runtime/+/189306
Comment 4 Lars Vogel CLA 2022-01-05 07:08:48 EST
I think this is a really cool idea. Alex, any opinion here? IIRC you used the JFR Event a lot in the past.
Comment 5 Andrey Loskutov CLA 2022-01-09 03:56:09 EST
I see it as kind of concurrent idea to already existing "Tracing" functionality, built in into many plugins. Also job/resources/jdt etc already post lot of tracing information which may be also interesting to jfr.

The differences to jfr are:
- tracing is really off by default (no extra objects are created if tracing is off, in jfr they are always created)
- tracing can be enabled/disabled per bundle / sub-module, jfr is either "on" or "off"
- the result is a plain text file
- no high sophisticated event life cycle like create/commit etc 
- no visual representation of the data

The similarities are:
- if not enabled, *should* not affect application
- if enabled, debug code may run that may slow down application
- only interesting for service providers but not for end users
- in many cases data is hierarchical / structured in the way that helps to understand some module specific aspects

I wonder if the tracing / jfr could be unified and already existing code that adds valuable tracing info may provide it either via tracing or via jfr entries.

The most interesting part of jfr that is missing in tracing is the ability to efficiently persist and visually post process the data. I also find it very useful that jfr can collect the "last X events" data in memory without persisting on disk and persist it on demand / on shutdown only, which is useful for applications running for a long time and producing gazillions of events.

I see that JFR can be enabled without the extra JVM flag for already running application (see https://bell-sw.com/announcements/2021/01/29/JDK-Flight-Recorder-The-Programmatic-Way/ and https://docs.oracle.com/en/java/javase/11/docs/api/jdk.jfr/jdk/jfr/FlightRecorder.html#getFlightRecorder()).

This could be used on the Preferences/Tracing to control the JFR in the same way it is done for tracing - the platform would pre-select log location with workspace log directory and enable "dump on shutdown" etc.

This all is for sure more than just few patches, especially if the pattern is supposed to be re-used by other bundles based on the platform.

I think you should post your comment 0 (and may be the idea of unifying tracing / jfr approach) on general platform & cross-project lists asking for opinions. Not many people watch to platform resources bugzilla component.

Once we agreed on the proposal (if there will be an agreement), I would create dedicated bugs for concrete modules to work on.
Comment 6 Jörg Kubitz CLA 2022-01-10 05:29:57 EST
(In reply to Andrey Loskutov from comment #5)
> The differences to jfr are:
> - tracing is really off by default (no extra objects are created if tracing
> is off, in jfr they are always created)

That's even more true for JFR: even though you see the new() and commit() in the code the compiler eliminates the code if disabled: See https://youtu.be/plYESjZ12hM?t=1606 how that happens.

> - tracing can be enabled/disabled per bundle / sub-module, jfr is either
> "on" or "off"

One benefit of JFR is, that we would not need a eclipse specific configuration dialog because JDK Mission Control already has a Dialog to enable/disable the events and or stacktraces. Even based on a threshold (only record events that lasted longer then configurable ammount of time). That is totally generic and more comfortable then the current eclipse tracing.

> - no high sophisticated event life cycle like create/commit etc 

I don't see JFR events as more complex. Instead of log.trace("hallo") one could also new TraceEvent("hallo").commit() - done. the begin() before commit() is only needed for defining a block between to make a duration measurement (which is then however much smaller then the custom millisecond calculations). What is more sophisticated: One should not use it that generic way but create an Event Class for every single usecase to get the additional information in separated fields. That's unfamiliar and looks like boiler code. That overhead makes one think twice if that Event is really worth or if the information could instead obtained by sampling/profiling. 

> I wonder if the tracing / jfr could be unified and already existing code
> that adds valuable tracing info may provide it either via tracing or via jfr
> entries.

I'd rather suggest to convert tracing with JFR Events as i do not not a single benefit of tracing over JFR. Only that you get no .log file but a .jfr file - which is unfamiliar but much more space efficient.
Especially i recommend to convert every time measurement tracing to JFR. I would love to see that for JDT!
As far as i understand it requires Java 11 which may still not be available on all modules (like equinox which is still on 1.8). So by now we can not generally convert all tracings to JFR but could do it one by one.
Comment 7 Andrey Loskutov CLA 2022-01-10 11:30:39 EST
(In reply to Jörg Kubitz from comment #6)
> One benefit of JFR is, that we would not need a eclipse specific
> configuration dialog 

Unfortunately this is not possible to remove Tracing for a simple reason - too many code uses that API / UI / .options files already, also in commercial products.

So "Tracing" dialog is here to stay, like or not.

> because JDK Mission Control already has a Dialog to
> enable/disable the events and or stacktraces. 

Unlike built-in Eclipse dialog, JFR is not available if not installed. So you can't say to customer - open Tracing preference page, switch some options on / off and attach the generated log file to the ticket. Or use this ".options" file to start Eclipse and provide me xyz.log file after.
 
> > I wonder if the tracing / jfr could be unified and already existing code
> > that adds valuable tracing info may provide it either via tracing or via jfr
> > entries.
> 
> I'd rather suggest to convert tracing with JFR Events as i do not not a
> single benefit of tracing over JFR. 

So we are at "either or" case here. 

The question for me then: is it worth the effort to introduce a second way to do  almost same thing by knowing that both API's (tracing/JFR) will co-exist forever in Eclipse, once introduced?

Think of it from the maintenance / consistency perspective. Will be JFR there in 10 years? No idea. We had javafx, we had applets, SecurityManager etc. Now they are either outsourced or removed or deprecated. You will switch your employer and no one will update that code anymore, but it will be there and will need to be maintained *next* to the existing Tracing, and people will ask what the heck, why do we have two things of that kind and which one is to be used?

> As far as i understand it requires Java 11 which may still not be available
> on all modules (like equinox which is still on 1.8). So by now we can not
> generally convert all tracings to JFR but could do it one by one.

At least on *some* JDK JFR is available, I see it in Open JDK 1.8.0_282 from Red Hat.
Comment 8 Lars Vogel CLA 2022-01-11 05:02:16 EST
Maybe the tracing can "wrap" the JFR events. If JFR is active these JFR events are triggered and if not the regular trace logging is used.
Comment 9 Jörg Kubitz CLA 2022-01-11 07:36:24 EST
(In reply to Andrey Loskutov from comment #7)
> Unfortunately this is not possible to remove Tracing for a simple reason -
> too many code uses that API / UI / .options files already, also in
> commercial products.

There is no need to remove the tracing framework. We just do not need to use it.

> Unlike built-in Eclipse dialog, JFR is not available if not installed. 

JFR is included in every JDK>=11. JDK Mission Control is not. But your customers do not need to have Mission Control. You can just edit the preferences, export them as "setting.jfc" and then start eclipse with the preferences as argument like
eclipse.exe -vmargs -XX -XX:StartFlightRecording:filename=dump.jfr,dumponexit=true,settings=C:/setting.jfc
and let them send the dump.jfr to you.
However i recommend to have only such rare/meaningful events that there will be no need to disable any of them. Just enable them all and decide after recording which events you want to analyse - it's so much cheaper and better to filter then logging due to the structured information that the unwanted events do not harm.

JDK is even shipped with JFR.EXE so that you could configure/dump/print from commandline without Mission Control. Nevertheless i prefer the graphical user interface, and its free - everybody can use it.

If JFR would be dropped one could just remove the JFR events.

(In reply to Lars Vogel from comment #8)
> Maybe the tracing can "wrap" the JFR events. If JFR is active these JFR
> events are triggered and if not the regular trace logging is used.

The difference is that logging logs Strings while JFR only collects the data of the arguments. 
Convert strings to data has the runtime overhead for creating/analyzing the strings. 
While converting the data to String loses the structure. Can be done but what's the gain? Both does not sound helpful to me.
Comment 10 Alex Blewitt CLA 2022-01-18 17:40:20 EST
I think adding JFR events would be a valuable thing to do. They can be used to provide information about how the code is running, and additional information like the stack threads that generate them, and a threshold for not capturing information if the details are too short. (You could use this for determining whether the UI thread was hung, but not record really short UI events, for example.)

I wrote a BundleListner which can be used to generate information using JFR events to determine startup time:

https://github.com/alblue/com.bandlem.osgi.jfrbundlelistener

The only trick is to make sure that the events aren't triggered all the time, but only when you enable flight recording -- and even then, when you enable it, to try and have a suitable sub-set of the events. Having PDE/JDT support for launching a run with JFR tracing enabled, and a more user-friendly way of generating the JFR files that are needed to launch it, would make it much easier to run and collect data.

The only issue is making sure that the bundles you annotate are dependent on JavaSE-11 and above. I'm not sure whether the jdk.jfr is part of that profile or whether there needs to be an extension to add it to the classpath.

I'd also suggest, where possible, that the events are used to bound times -- for example, if you have a WorkspaceJob which holds the lock, then wrapping the execute job with a new Event() and then a .commit() afterwards will implicitly record how long the job took. Similarly, SWT could be augmented to know how long each event could take -- but I don't know if the SWT team would want to have that in the path, since it would mean that SWT apps would only run on Java 11.

I don't think the JFR and DebugOptions are ever going to play nicely with each other, and I don't think you could replace the latter with the former in any case. You might be able to trigger a DebugOptions setting to enable JFR on the fly, but the configuration format for JFR isn't really suited for toggling things on and off.

If you can make it work, I think you'll get a lot of valuable data out of it, but finding where to add it will be the key.
Comment 11 Matthias Sohn CLA 2022-01-24 11:07:54 EST
such Eclipse specific JFR events could also be used to implement performance tests using JfrUnit [1]

[1] https://github.com/moditect/jfrunit
https://youtu.be/OS7yZJQ2Q-0
Comment 12 Jörg Kubitz CLA 2022-02-02 04:47:51 EST
Created attachment 287950 [details]
jfr event patches.zip
Comment 13 Jörg Kubitz CLA 2022-02-02 04:50:24 EST
I still think those events make technical sense. But it does not make sense to add them to the code when nobody beside me will use them.

=> wontfix.

proposed patches attached, just in case of reopen. gerrits will be abandoned.
Comment 14 Marco Descher CLA 2023-10-09 05:41:03 EDT
Please see https://github.com/eclipse-platform/eclipse.platform/issues/734 for a requirement I have!

I give +1 that this would be good feature!