447546 – annotation processing during incremental build

Status:	NEW

Alias:	None

Product:	JDT
Classification:	Eclipse Project
Component:	APT
Version:	4.4
Hardware:	All All

Importance:	P3 normal with 3 votes
Target Milestone:	---
Assignee:	Generic inbox for the JDT-APT component
QA Contact:

URL:
Whiteboard:	stalebug
Keywords:

Depends on:
Blocks:

Reported:	2014-10-16 08:02 EDT by Igor Fedorenko
Modified:	2023-07-26 07:18 EDT
CC List:	3 users

See Also:

Attachments
small example to demonstrate the problem (8.62 KB, application/x-zip-compressed) 2014-10-16 08:02 EDT, Igor Fedorenko

Description Igor Fedorenko

2014-10-16 08:02:32 EDT

Created attachment 247919
small example to demonstrate the problem

Steps to reproduce

* Import the attached example as an existing project into an Eclipse workspace
* Observe that target/classes/META-INF/foo/bar.lst includes three lines, for classes project.C1, project.C2 and project.C3
* Edit project.C1 and remove @MyAnnotation
* Problem: target/classes/META-INF/foo/bar.lst is not updated to reflect the change to project.C1
* Edit project.C1 and reintroduce @MyAnnotation
* Problem: target/classes/META-INF/foo/bar.lst now has a line only for project.C1, not for project.C2 and project.C3

I believe this is a bug in the annotation processing API, and I'd like to start a discussion on how to evolve the API to work properly in an incremental build environment such as the Eclipse IDE.
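
The reproduction steps can be mimicked without a compiler. The following is a toy model only, assuming a processor that rebuilds the list from whatever annotated types it is handed in the current build and writes nothing when it sees none. The class and method names are hypothetical and are not taken from the attached example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a processor that rebuilds an aggregate file from the current build's inputs only. */
final class NaiveAggregateModel {
    /** Simulated content of target/classes/META-INF/foo/bar.lst. */
    private List<String> outputFile = new ArrayList<>();

    /**
     * Simulates one build. The argument lists only the annotated types that were
     * handed to the processor in this build, not all annotated types in the project.
     */
    void build(List<String> annotatedTypesInThisBuild) {
        // Assumption: the processor writes the file only when it saw annotated types.
        if (!annotatedTypesInThisBuild.isEmpty()) {
            // Overwrites everything that earlier builds had collected.
            outputFile = new ArrayList<>(annotatedTypesInThisBuild);
        }
    }

    /** Returns a copy of the simulated file content. */
    List<String> output() {
        return new ArrayList<>(outputFile);
    }
}
```

A full build passes all three types and yields three lines. An incremental build after removing the annotation from project.C1 passes an empty list and leaves the stale file untouched. Reintroducing the annotation passes only project.C1, which then replaces the whole file.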

Comment 1 Jay Arthanareeswaran

2014-12-03 05:46:49 EST

I don't agree this is a bug. With incremental compilation, only the required files are compiled, and only those are passed on for annotation processing. I don't see what we can do here other than picking up all the files and compiling them. But I don't think we want that, do we?

But if you have some idea, I would like to hear that.

Comment 2 Igor Fedorenko

2014-12-03 08:19:00 EST

As a user, I expect the output file to include entries for all sources annotated with @MyAnnotation and no other entries. This does not happen during an incremental build. From the user's perspective, the current behaviour is clearly incorrect.

I do agree with your other points. Only changed sources should be processed during an incremental build. Proper incremental build behaviour is not possible with the current annotation processing API, and processing all sources during an incremental build will most likely result in endless workspace builds (on top of being too slow).

The solution I have in mind is to introduce some sort of intermediate persistent data structure to maintain information about processed inputs between builds. Only modified inputs will be processed during each build, but the persistent data structure will include information from previous builds and will allow generation of complete output file.

We already use this approach for other builders, so we know it works. It will require changes to the annotation processing API, but I don't see a way around it.

I can provide a proof of concept implementation, if you are interested, but it will likely take me several weeks to put together and I'd need some sort of assurance that the work will not go to waste before I start on it.
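
The idea of keeping per-input information between builds can be sketched as follows. This is an illustration under assumptions, not the implementation referred to in this comment: the class and method names are hypothetical, and saving the map to disk between builds is left out.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Keeps per-source contributions across builds so the aggregate can be regenerated in full. */
final class IncrementalAggregator {
    // source name -> entries contributed by that source. A real builder would load this
    // map from disk before the build and save it afterwards (omitted in this sketch).
    private final Map<String, TreeSet<String>> state = new TreeMap<>();

    /** Records the result of (re)processing one modified source, replacing its old contribution. */
    void sourceProcessed(String source, Collection<String> entries) {
        if (entries.isEmpty()) {
            state.remove(source);
        } else {
            state.put(source, new TreeSet<>(entries));
        }
    }

    /** Drops the contribution of a deleted source. */
    void sourceRemoved(String source) {
        state.remove(source);
    }

    /** Regenerates the complete, sorted aggregate from all recorded contributions. */
    List<String> generate() {
        TreeSet<String> all = new TreeSet<>();
        for (TreeSet<String> entries : state.values()) {
            all.addAll(entries);
        }
        return new ArrayList<>(all);
    }
}
```

Only the modified source is passed to `sourceProcessed` in an incremental build; contributions from untouched sources stay in the map, so `generate` still produces the full list.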

Comment 3 Jay Arthanareeswaran

2014-12-05 00:17:00 EST

(In reply to Igor Fedorenko from comment #2)
> The solution I have in mind is to introduce some sort of intermediate
> persistent data structure to maintain information about processed inputs
> between builds. Only modified inputs will be processed during each build,
> but the persistent data structure will include information from previous
> builds and will allow generation of complete output file.

Sounds interesting to me.

Walter, are you still watching APT bugs? I would love to hear your opinion on this.

Comment 4 Walter Harley

2014-12-08 01:46:26 EST

I agree it's a bug in the API - that is, a shortcoming of the interface definition, not a bug in the implementation. We worked hard to get any support for incremental compilation into JSR-269 but even so the support is simply very poor. Certain specific use cases are supported but there are many that are not. 

In particular, generation of composite files - files that contain information from more than a single source file's annotations - is very poorly supported by JSR-269, and yet this is a very frequently requested use case.

I think there are three ways you could go with this:

1. Work through the JCP to propose a new JSR with enhanced API. Very hard, slow process, that would ultimately only give you something in Java 10 or so. But this is the official path.

2. Add something proprietary within the existing (javax.tools) API. This would most likely break Eclipse's Java compatibility, which is a bad thing to do for many reasons.

3. Add something that is Eclipse-specific, i.e., annotation processors would have to import org.eclipse.jdt.* packages in order to use it. There is no technical or legal obstacle to this, but it would mean that such processors would no longer be compatible with javac's inbuilt annotation processing (unless they used reflection to determine which environment they were running in).
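
For option 3, a processor that wants to stay usable under javac has to detect its host at run time. A minimal sketch, assuming the implementation class name of the processing environment is a good enough signal. The class and method names are hypothetical, and the `org.eclipse.jdt.` prefix check relies on implementation packages rather than on any published API, so it should be treated as a heuristic.

```java
/** Guesses the hosting compiler from the implementation class of the processing environment. */
final class CompilerEnvironment {
    private CompilerEnvironment() {
    }

    /** Returns true when the given object's class lives in a package under org.eclipse.jdt. */
    static boolean isEclipseJdt(Object processingEnv) {
        return processingEnv != null
                && processingEnv.getClass().getName().startsWith("org.eclipse.jdt.");
    }
}
```

A processor extending `AbstractProcessor` could call `CompilerEnvironment.isEclipseJdt(processingEnv)` in `init` and only then load, reflectively, whatever Eclipse-specific helper it needs.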

Number 3 is obviously the most appealing. However, I share Jay's concern about performance: anything that requires searching all package fragments, rather than just those being compiled at the moment, is simply not going to be performant in an incremental compile situation; and caching is also not a great option because there is no way to limit the size of the cache and no good time to load it (i.e., you don't know what annotations to cache, because annotation processors are free to access even the annotations that they do not "claim").

I fear the only way you'll get this to perform well is to treat the composition as a separate, post-compile, build step: i.e., each source file generates a separate intermediate file during annotation processing, and then in the final round or as a separate step, those intermediate files are composed. Obviously the composed result cannot itself contain further annotations to be processed. In fact, within the constraints of JSR-269 it's not even obvious to me that it can be a Java source or .class file; I think you might only be able to make things like XML files this way, which is pretty limiting.

I'm not optimistic, but I'm certainly willing to discuss further.
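
The two-step approach described in this comment (one intermediate file per source, then a separate composition step) might look roughly like this outside of any compiler. It is a sketch under assumptions: the class name, the `.part` suffix and the directory layout are hypothetical, and the processor side that would call `writeIntermediate` is not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Two-step aggregation: one intermediate file per source, then a separate composition step. */
final class TwoStepComposer {
    private final Path intermediateDir;

    TwoStepComposer(Path intermediateDir) {
        this.intermediateDir = intermediateDir;
    }

    /** Step 1: called per processed type; writes or deletes that type's intermediate file. */
    void writeIntermediate(String typeName, List<String> entries) {
        try {
            Files.createDirectories(intermediateDir);
            Path file = intermediateDir.resolve(typeName + ".part");
            if (entries.isEmpty()) {
                Files.deleteIfExists(file); // annotation removed: drop the contribution
            } else {
                Files.write(file, entries, StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Step 2: concatenates all intermediate files, in path order, into the composite file. */
    List<String> compose(Path composite) {
        List<String> lines = new ArrayList<>();
        try (Stream<Path> parts = Files.list(intermediateDir)) {
            List<Path> sorted = parts
                    .filter(p -> p.getFileName().toString().endsWith(".part"))
                    .sorted()
                    .collect(Collectors.toList());
            for (Path part : sorted) {
                lines.addAll(Files.readAllLines(part, StandardCharsets.UTF_8));
            }
            Files.write(composite, lines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

Because the intermediate files live on disk, the composition step does not need to revisit sources that were not recompiled, which is the property an incremental build needs.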

Comment 5 Igor Fedorenko

2014-12-08 09:33:52 EST

Yes, I agree that the "Number 3" approach, i.e. a new API in the org.eclipse.jdt namespace, is the only workable approach. I also think we must make this API backwards compatible with javac, and I have a rough idea how to do this. We may decide to go through the JCP once we have the new API and implementation ready, but, frankly, given that javac is not incremental, I am not sure how much value this will have.

I also agree about the two-step approach to composite output generation, and this is how our existing implementation works. First, we process all new/changed inputs and generate intermediate persistent data structures. Then, at the end of the build, we use these intermediate data structures to generate the composite output (although we call it "aggregate" output). We decided to provide simple build state persistence support as part of our incremental build API for performance and convenience reasons, but I am open to discussion.

Since we appear to agree on all/most points, I plan to propose a new API sometime in the near-ish future, say in the next two or three months (I have some other commitments I have to deliver first).

Comment 6 Walter Harley

2014-12-08 14:20:51 EST

Sounds like we're roughly in agreement.  I do think you should consider whether or not to use files, rather than in-memory structures, as the intermediate representation.  I think it comes down to performance of the aggregation step versus persistence of the build state.  It might also affect other implementations, e.g., command-line implementations.  And it might affect the ability to build projects a module at a time.

I'm not saying that I have an absolute preference, just that I think this is a point to carefully consider and discuss.

Comment 7 Igor Fedorenko

2014-12-08 14:36:02 EST

Maybe I was not clear. The build state will have to be persisted on the filesystem between builds, otherwise it will not be possible to provide correct behaviour after an Eclipse restart or when re-running a command-line build. The current implementation we have uses a single file per project, but the API provides a simple map-file interface, so we can change how we persist the state without affecting clients.

Also, we have a hard requirement to provide consistent behaviour between the IDE and the command-line Maven build, so the API I plan to propose will be validated in both environments.
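
One way to read the point about hiding persistence behind a map-style interface is sketched below. It is not the API referred to in this comment: `BuildState` and `SingleFileBuildState` are hypothetical names, and `java.util.Properties` is used only as a convenient single-file format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical map-style view of build state; clients never see how it is stored. */
interface BuildState {
    String get(String key);
    void put(String key, String value);
    void remove(String key);
    void save();
}

/** One possible backing store: a single properties file per project. */
final class SingleFileBuildState implements BuildState {
    private final Path file;
    private final Properties values = new Properties();

    /** Loads the previous build's state if the file exists, otherwise starts empty. */
    SingleFileBuildState(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                values.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    @Override public String get(String key) { return values.getProperty(key); }
    @Override public void put(String key, String value) { values.setProperty(key, value); }
    @Override public void remove(String key) { values.remove(key); }

    /** Writes the state back so the next build, or a restarted IDE, can pick it up. */
    @Override public void save() {
        try {
            if (file.getParent() != null) {
                Files.createDirectories(file.getParent());
            }
            try (OutputStream out = Files.newOutputStream(file)) {
                values.store(out, "build state");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since clients depend only on `BuildState`, the single-file store could later be replaced by, say, one file per source without changing them.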

Comment 8 Walter Harley

2014-12-08 19:58:24 EST

Got it.  Yes, that sounds right to me.  Looking forward to seeing what you come up with :-)

Comment 9 Jay Arthanareeswaran

2014-12-08 22:40:47 EST

Please note that any new API or API change must be agreed on and pushed by M6, around the third week of March.

Comment 10 Eclipse Genie

2021-08-04 03:48:23 EDT

This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.

Comment 11 Eclipse Genie

2023-07-26 07:18:18 EDT

This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.