Summary: | Move publication of the IWorkspace service to a separate thread, outside the activator | | |
---|---|---|---|
Product: | [Eclipse Project] Platform | Reporter: | Alex Blewitt <alex.blewitt> |
Component: | Resources | Assignee: | Alex Blewitt <alex.blewitt> |
Status: | NEW --- | QA Contact: | |
Severity: | enhancement | | |
Priority: | P3 | CC: | alexander.fedorov, Lars.Vogel, tjwatson |
Version: | 4.20 | | |
Target Milestone: | --- | | |
Hardware: | All | | |
OS: | All | | |
See Also: | https://bugs.eclipse.org/bugs/show_bug.cgi?id=571430 https://git.eclipse.org/r/c/egit/egit/+/176260 | | |
Whiteboard: | | | |
Bug Depends on: | 572128 | | |
Bug Blocks: | | | |
Attachments: | | | |
Description
Alex Blewitt
2021-02-27 18:05:26 EST
I should add that if we move the publication of the IWorkspace out to a separate thread, while we will trigger any outstanding DS components off the activation thread, we will still trigger them serially. So if we have 10 components waiting on the IWorkspace publication, we will move from having the main thread do resources + 10x DS components to having 2 threads: one for the resources and one for the remaining 10x DS components, which it still does serially.

If DS were able to use a bounded thread pool to do startup, then each DS component could stay in its own thread, or thread % pool size.

Having lots of components should be an embarrassingly parallel problem, but we are embarrassingly serial at the moment.

(In reply to Alex Blewitt from comment #1)

It sounds like "the optimization should be applied to DS first". In that case, moving the IWorkspace publication could have minimal impact. Am I right?

Maybe do both? Use a separate thread and (optionally?) allow DS components to be activated in parallel in Felix with a bounded thread pool. I think Equinox already supports optional parallel bundle activation; adding this for DS components as well would be great.
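To illustrate the point made in comment #1, here is a minimal sketch of why moving the publication to another thread does not by itself parallelise anything. It assumes a registry that delivers events synchronously on the publishing thread; `SyncPublishSketch`, `publishFromWorker` and the thread name `workspace-publisher` are hypothetical names for illustration and are not part of Equinox, Felix SCR or the resources bundle.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a registry that delivers service events synchronously. */
class SyncPublishSketch {
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();

    void addListener(Runnable listener) {
        listeners.add(listener);
    }

    /** Calls each listener in turn on the caller's thread. */
    void publish() {
        for (Runnable listener : listeners) {
            listener.run();
        }
    }

    /** Publishes from a separate thread and returns the names of the threads the listeners ran on. */
    static Set<String> publishFromWorker(int listenerCount) throws InterruptedException {
        SyncPublishSketch registry = new SyncPublishSketch();
        Set<String> threads = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < listenerCount; i++) {
            // Each listener stands in for one DS component waiting on the service.
            registry.addListener(() -> threads.add(Thread.currentThread().getName()));
        }
        Thread publisher = new Thread(registry::publish, "workspace-publisher");
        publisher.start();
        publisher.join();
        return threads;
    }

    public static void main(String[] args) throws InterruptedException {
        // All ten listeners report the same single thread name.
        System.out.println(publishFromWorker(10));
    }
}
```

The ten listeners leave the main thread, but they all run one after another on the single publishing thread, which is the "2 threads" situation described above.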
(In reply to Alex Blewitt from comment #1)

The SCR implementation just steals time on the thread that publishes the service event. Changing this has the potential to make start-levels meaningless if you simply move the activation of immediate components to be asynchronous. Right now, when bundles are activated, we know that SCR has processed all the components for the bundle. By the exit of Bundle.start, SCR will have published and enabled all the components it should in reaction to the bundle being started.

Equinox did enhance the start-level implementation so that it can activate bundles in parallel. This gives SCR more threads to do the work on for many bundles. But the thread that is starting a bundle is still used to fire the service event, and SCR "fully processes" that event in the bundle-starting thread. Equinox then makes sure all the parallel threads activating bundles for a particular start-level are done before moving on to the next start-level. This ensures the components from a previous start-level are "ready" before moving to the next start-level.

SCR cannot simply push that work to the background and allow Bundle.start to exit while it is still doing work. That would enable the framework to blast through all the start-levels and activate all the bundles while the SCR worker threads could still be processing events from start-level 1. SCR would need to "join" with the activating thread that published the event after it has done all the processing of the event in parallel threads. This really would only make sense for immediate components. Components that are not immediate should only get activated upon their first get from the service registry (lazily). This has to happen synchronously with the BundleContext.getService call.

Alex, is this what you are suggesting SCR should do? As each immediate component gets enabled, queue its activation for parallel work. Once it is determined that all the immediate components have been queued for activation, wait for them to complete before the SCR service listener implementation returns control to the framework.

This may be possible, but it may be a large effort to get it right, especially given that each component bundle has its own service listener tracking services (implemented and registered by SCR on behalf of the component bundle). Analysis will be needed on the activate code for immediate components to determine how difficult it would be to coordinate that.

I think this corresponds to this code in SCR:

https://github.com/apache/felix-dev/blob/org.apache.felix.scr-2.1.26/scr/src/main/java/org/apache/felix/scr/impl/manager/AbstractComponentManager.java#L758-L787

Someone could investigate in a prototype there to see if this is worthwhile.

(In reply to Thomas Watson from comment #4)
> Someone could investigate in a prototype there to see if this is worthwhile.

Maybe a good GSoC project?
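The "queue the activations, then join before returning to the framework" idea from comment #4 could be sketched roughly as follows. This is only an illustration under stated assumptions, not how Felix SCR is structured: `ParallelActivationSketch` and `activateAndJoin` are hypothetical names, each `Callable` stands in for one immediate component's activation, and the pool size is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; none of these names exist in Felix SCR. */
class ParallelActivationSketch {

    /**
     * Runs every activation task on a bounded pool and returns only once all
     * of them have finished, so the caller (standing in for the SCR service
     * listener) does not hand control back to the framework early.
     */
    static List<String> activateAndJoin(List<Callable<String>> activations, int poolSize)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<String> activated = new ArrayList<>();
            // invokeAll blocks until every task has completed: this is the "join".
            for (Future<String> result : pool.invokeAll(activations)) {
                activated.add(result.get()); // rethrows if an activation failed
            }
            return activated;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> activations = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            String name = "B" + i;
            activations.add(() -> {
                Thread.sleep(100); // stand-in for a slow activate method
                return name;
            });
        }
        System.out.println(activateAndJoin(activations, 4));
    }
}
```

Because the calling thread blocks in `invokeAll`, a Bundle.start built on this shape would still not return until the work is done, which is the property the start-level handling relies on. The sketch does not address the harder part raised above, namely coordinating the per-bundle service listeners.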
(In reply to Thomas Watson from comment #4)
> Alex, is this what you are suggesting SCR should do? As each immediate
> component gets enabled, queue its activation for parallel work. Once it is
> determined all the immediate components have been queued for activate work
> wait for them to complete before the SCR service listener implementation
> returns control back to the framework.

There are really two sorts of issues here.

Firstly, when a service gets published, it triggers synchronous setting of that service in other bundles. Typically this involves a simple 'set' call with a property setter on another class, but it may trigger a class load, which auto-triggers a Bundle.start.

Secondly, once a component is available to start (i.e. it's immediate and it has all of its dependencies satisfied), it will be started serially. If publication of service A causes components B1, B2, B3 to start, then even if all of them are independent they'll still start up in sequence.

On an embedded system with a single core, this makes sense; it's the most efficient way of doing it. However, modern systems have multiple cores, and we could have B1/B2/B3 starting up in parallel. We have a parallel classloader, but it doesn't help if the things that are initiating class loading are launched in series :)

I've put up a demo workspace at https://github.com/alblue/ResourceHog/archive/refs/heads/ds-slow.zip which has a bundle that creates a service that triggers 5 (otherwise identical) components to start. You can see that the 'set' calls for the service are serialised, as are the activation calls.