Bug 567819 - Add more flexibility for loading a DTFJ implementation
Summary: Add more flexibility for loading a DTFJ implementation
Status: CLOSED MOVED
Alias: None
Product: MAT
Classification: Tools
Component: Core (show other bugs)
Version: unspecified   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-10-12 18:50 EDT by Kevin Grigorenko CLA
Modified: 2024-05-08 15:44 EDT (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Kevin Grigorenko CLA 2020-10-12 18:50:51 EDT
This is a follow-on to bug 558984; although that was resolved with an updated IBM DTFJ plugin, there have been other issues related to loading Java 11 dumps and OpenJ9 dumps. The DTFJ plugin system does allow multiple plugins in theory, but that just has to brute force search for one that works, and issues often arise related to non-fatal parsing errors. Therefore, it would be nice to give users more flexibility in choosing which DTFJ is used to parse a J9 dump.

This bug proposes adding flexibility to the DTFJ plugin system:

1. For Java 8, allow users to specify explicit directories containing dtfj.jar and j9ddr.jar.
2. For Java > 8, allow users to load DTFJ from the running JVM rather than the DTFJ plugin.
3. For backwards compatibility, continue to prefer the IBM DTFJ plugin if available.
4. Allow users to load J9 dumps without the IBM DTFJ plugin if the JVM has DTFJ.
Comment 1 Eclipse Genie CLA 2020-10-12 18:56:11 EDT
New Gerrit change created: https://git.eclipse.org/r/c/mat/org.eclipse.mat/+/170676
Comment 2 Kevin Grigorenko CLA 2020-10-12 19:16:07 EDT
First proposed patch submitted to Gerrit. Major points:

1. New org.eclipse.mat.dtfj.bridge plugin added that has the majority of changes. This includes bringing in 34 interfaces and exception classes from Eclipse OpenJ9's DTFJ at https://github.com/eclipse/openj9/tree/master/jcl/src/openj9.dtfj/share/classes/com/ibm/dtfj/. I haven't started work on the works-with CQ yet because I wanted to first get feedback on the patch; however, I'm told by the J9 team that the CQ should not be difficult.

2. The org.eclipse.mat.dtfj project adds a dependency on org.eclipse.mat.dtfj.bridge to have access to the APIs needed in DTFJIndexBuilder and removes the dependency on com.ibm.dtfj.api

3. The DTFJ preferences are kept in org.eclipse.mat.dtfj so the additional two, new preference names are added to the bridge project and accessed from there but this is exported through DTFJBridgeConnector to org.eclipse.mat.dtfj so that it can draw those preferences.

4. The major change is in DTFJIndexBuilder.getDynamicDTFJDump to load the image factories from org.eclipse.mat.dtfj.bridge extension points instead of com.ibm.dtfj.api. However, for backwards compatibility, the default behavior will be for the bridge to load the factories from the IBM DTFJ plugin if available.

5. After the Eclipse content system loads a J9 dump through a concrete implementation of DelegatedImageFactory, that then creates a CustomClassLoader that will load the underlying image factory.

6. Most of the novel code is in the CustomClassLoader. The procedure is basically:

  a. If the user has specified explicit directories through the DTFJ preferences page containing dtfj.jar and j9ddr.jar, then search those for those JARs and add to the CustomClassLoader JAR URL list.

  b. Otherwise, if the user has not checked the "Skip IBM DTFJ feature (if installed)" preference, then search the Eclipse installation for the IBM DTFJ plugin and find the dtfj.jar and j9ddr.jar within it.

  c. If the "skip" option is selected or there have yet to be any DTFJ JARs found, search the java.home directory for the JARs.

  d. If any JARs have yet to be found, do not add any JARs to the JAR URL list of the CustomClassLoader and thus we will defer to ClassLoader.getSystemClassLoader

  e. For the 34 interfaces and exceptions that are used by org.eclipse.mat.dtfj, we explicitly check for those and load from the OSGi classloader.

7. All mvn tests pass with the changes:

[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  02:34 h
[INFO] Finished at: 2020-10-12T15:43:27-07:00

8. Tested with the old IBM DTFJ plugin compared to an explicit directory or a running JVM with the newer DTFJ to confirm that the types of bugs like 558984 would have been resolvable with this change.

9. Tested with a Java 11 JVM on a Java 11 core dump along with the "skip" option to confirm superior quality parsing compared to the current IBM DTFJ plugin.
Comment 3 Kevin Grigorenko CLA 2020-10-13 16:41:56 EDT
Create contribution questionnaire CQ22718: https://dev.eclipse.org/ipzilla/show_bug.cgi?id=22718
Comment 4 Andrew Johnson CLA 2020-10-25 11:28:46 EDT
I've tried the code and it solves a lot of the problems.

A remaining problem is how one installation of MAT could parse dumps from difference sources without being reconfigured. I'd like to see a plan for how this could be done (perhaps multiple concurrent implementations loaded inside dtfj.bridge or the dtfj plugin itself seeing multiple implementations and deciding at JavaRuntime.getVersion time).

Longer term I would like to know if it would be possible to read OpenJ9 Java 11 and later dumps from non-OpenJ9 VMs but perhaps that is going to be an OpenJ9 restriction.

It also seems a bit odd going inside the old IBM DTFJ bundles and reading the jars (if that is what happens) rather than using OSGi class loading on the bundle, but perhaps there are difficulties in getting an old DTFJ implementation to use the dtfj bridge DTFJ API classes.
Comment 5 Kevin Grigorenko CLA 2020-10-26 10:49:19 EDT
(In reply to Andrew Johnson from comment #4)
> I've tried the code and it solves a lot of the problems.

Thanks

> 
> A remaining problem is how one installation of MAT could parse dumps from
> difference sources without being reconfigured. I'd like to see a plan for
> how this could be done (perhaps multiple concurrent implementations loaded
> inside dtfj.bridge or the dtfj plugin itself seeing multiple implementations
> and deciding at JavaRuntime.getVersion time).

I agree. Ideally, a user should be able to configure a set of JREs, MAT starts by parsing out just the JRE version from the dump, and then switches to parsing with a matching JRE. However, in addition to keeping things simple at first, the main reason I haven't worked on this yet is the issue of Java >= 9 DTFJ. My initial tests showed that we could not dynamically load a Java >= 9 DTFJ from a directory using Java 8 as a MAT pre-req. So I think the next step towards this goal would be for MAT to pre-req Java 11 and then I can try out the above design. Here are my notes for posterity:

Loading openj9.dtfj.jmod doesn't work because the newer ImageFactory depends on jdk.internal.module.Modules:
https://github.com/eclipse/openj9/blob/v0.22.0-release/jcl/src/openj9.dtfj/share/classes/com/ibm/dtfj/image/j9/ImageFactory.java#L74
So the loose expanded classes cause the exception, "java.lang.IllegalAccessError: class com.ibm.dtfj.image.j9.ImageFactory (in unnamed module @0x3994900) cannot access class jdk.internal.module.Modules (in module java.base) because module java.base does not export jdk.internal.module to unnamed module @0x3994900"
In the future, when MAT pre-req's Java >= 9, we might be able to use some Module magic to allow ImageFactory to perform its Module magic. This might involve adding similar logic as ImageFactory and running with --add-exports java.base/jdk.internal.misc=ALL-UNNAMED

> 
> Longer term I would like to know if it would be possible to read OpenJ9 Java
> 11 and later dumps from non-OpenJ9 VMs but perhaps that is going to be an
> OpenJ9 restriction.

We'll have to discuss with Keith on the precise design and its implications.

> 
> It also seems a bit odd going inside the old IBM DTFJ bundles and reading
> the jars (if that is what happens) rather than using OSGi class loading on
> the bundle, but perhaps there are difficulties in getting an old DTFJ
> implementation to use the dtfj bridge DTFJ API classes.

Initially, I tried to use OSGi classloading but I couldn't figure out how to keep the classes sandboxed; after loading the implementation classes once, a subsequent load of alternative classes caused ClassCastExceptions. The current design sandboxes everything within the CustomClassLoader to avoid anything being loaded inside OSGi classloaders (other than the 34 interfaces and exceptions). I did find details on advanced OSGi dynamic classloading but I liked the approach of just using the loose JARs because that unifiied the classloading with the configured JAR option and made the code simpler.

The CQ is still in the approval pipeline and I've just followed up to see if there's any ETA.
Comment 6 Kevin Grigorenko CLA 2020-10-27 09:58:57 EDT
Hey Andrew, feedback was received on the CQ issue https://dev.eclipse.org/ipzilla/show_bug.cgi?id=22718:

> We have completed a high level triage of the attachment and you may now check
the content into your project’s source code repository. If a repository has not
yet been allocated, please connect with webmaster@eclipse.org to either create
a new repository or move your existing repository to the Eclipse Foundation. 
We will perform a comprehensive analysis at a later date based on IP work
queue.

Are you okay with moving forward with this patch? If so, I'll update the patch with documentation updates and submit for a final review.
Comment 7 Kevin Grigorenko CLA 2021-01-27 10:52:47 EST
Hey Andrew, I received a notice that the CQ has been approved.

Note also that I have created an internal build of MAT used by all of IBM support and added this patch to it. We can let this patch bake there for some time to see if any issues arise before considering it for core MAT.
Comment 8 Eclipse Genie CLA 2022-01-14 09:36:29 EST
New Gerrit change created: https://git.eclipse.org/r/c/mat/org.eclipse.mat/+/189638
Comment 9 Kevin Grigorenko CLA 2022-11-29 09:23:31 EST
I noticed an update to this bug and I read my last comment on it:

> Note also that I have created an internal build of MAT used by all of IBM
> support and added this patch to it. We can let this patch bake there for some
> time to see if any issues arise before considering it for core MAT.

I just wanted to clarify that I abandoned this approach and we currently have two separate launchers and the user must choose the right launcher based on the dump they plan to parse. I haven't had time to investigate a new patch that works with Java 11 and up.
Comment 10 Eclipse Webmaster CLA 2024-05-08 15:44:37 EDT
This issue has been migrated to https://github.com/eclipse-mat/org.eclipse.mat/issues/30.