I've noticed some inefficiencies in IDocument, IDocumentProvider, and ITextStore (or their implementations) which could be resolved very simply by adding overloads to the public void set(String) method. My suggestion is that extensions be provided declaring the following methods: public void set(char[]) and, optionally, public void set(byte[]).

This would work much better for large documents using the GapTextStore. Because the GapTextStore uses a char[] internally anyway, efficiency is lost (refer to the setDocumentContent(IDocument, InputStream, String) method): the input stream is read into a StringBuffer, the resulting String is passed to IDocument.set(), which passes it to the text store, which, in most cases, turns it back into a char[]. It would be faster and consume less memory if the String/StringBuffer were not created until absolutely necessary. To make a long story longer, if the input stream were simply read into a char[], and that char[] passed to IDocument.set(), then only those documents/text stores that actually need a String would have to construct one from the char[]. The GapTextStore would be able to use the char[] immediately, as is.

I would be more than willing to make these very simple changes if given permission to at least submit them for approval. Note that this really does become a performance issue with large files. My team is writing an Eclipse-based product. Our project manager tried opening a large (31 MB) file, which could not be opened until the heap size was set to over 512 MB. This memory overhead is directly related to the use of the StringBuffer/String when its use was simply not called for. If I could fix this issue, it would result in increased performance across the board for all Eclipse users.

Sincerely,
Keith McQueen (hopeful Eclipse developer)
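The stream-to-char[] path described above can be sketched as follows. This is a hypothetical helper, not Eclipse API: readFully is an illustrative name, and the sketch assumes a char[]-accepting set(...) overload would exist on the receiving side. It shows how a Reader can be drained into a char[] directly, with no intermediate StringBuffer or String.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;

public class CharArrayRead {

    // Read a Reader fully into a char[], growing the buffer geometrically,
    // without ever materializing an intermediate String or StringBuffer.
    static char[] readFully(Reader reader) throws IOException {
        char[] buf = new char[8192];
        int len = 0;
        int n;
        while ((n = reader.read(buf, len, buf.length - len)) != -1) {
            len += n;
            if (len == buf.length) {
                buf = Arrays.copyOf(buf, buf.length * 2); // double when full
            }
        }
        return Arrays.copyOf(buf, len); // trim to the actual length
    }

    public static void main(String[] args) throws IOException {
        char[] content = readFully(new StringReader("hello, world"));
        // A char[]-based text store could take 'content' directly; only
        // String-based stores would need new String(content).
        System.out.println(new String(content)); // prints "hello, world"
    }
}
```

The point of the sketch is that the single trailing Arrays.copyOf is the only copy made beyond the buffer itself, whereas the StringBuffer-then-String route copies the character data at least twice more.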
Keith, you can provide a patch which implements your suggestions. Things to take care of:

- Existing API cannot be removed and has to be supported.
- Ensure that all tests in the following test projects still run:
  - org.eclipse.core.filebuffers.tests
  - org.eclipse.jdt.text.tests
  - org.eclipse.jdt.ui.tests
  - org.eclipse.jdt.ui.tests.refactoring
  - org.eclipse.jface.text.tests
  - org.eclipse.text.tests
  - org.eclipse.ui.editors.tests
  - org.eclipse.ui.workbench.texteditor.tests
  - org.eclipse.ltk.core.refactoring.tests
  - org.eclipse.ltk.ui.refactoring.tests
- Write new test cases for the new functionality.

Of course, the benefit will only be achieved for clients that actually switch to the new API (a quick search found over 200 clients of IDocument.set(String), but most of them are in test cases).
I have been developing with Eclipse, for an application based on the Eclipse platform, but this would be my first foray into actually developing for Eclipse. Is there a protocol, or a set of guidelines to follow? I have managed to set up the Eclipse CVS repository, so I can check things out, but how exactly do I submit changes back to the CVS store? It appears that relatively few people have such privileges. Some more pointers would be nice. Thank you very much. Keith
You load one of the latest versions from CVS (e.g. the last I-build) and then apply your changes. For each project you then create a patch (select the project, context menu > Team > Create Patch...) and attach it to this PR for review.

> Is there a protocol, or a set of guidelines to follow?

Most important is to honor the current code style (e.g. don't reformat existing code).
Thank you for the pointers. I'm sorry to be so annoying, but I checked out the 200502010800 version of the org.eclipse.jface.text project, but it doesn't seem to build. I get errors stating that (among other things) the class DocumentRewriteSession cannot be resolved. Should I just try an earlier version or is there something else I need to do? I really do appreciate your assistance.
I'd first start by looking at the Platform Text architecture and especially what plug-ins belong to it. You are missing some dependent plug-ins. Dependent plug-ins are listed in the plugin.xml's required section.
Well, I've "hacked" around a bit in the eclipse code, and made my proposed changes, but, to my dismay (though I'm sure you're not really surprised) I didn't get the performance boost I had hoped for, in either speed or memory use. I'm not sure now what the best approach would be. It seems like the line tracker may have something to do with the large memory consumption, but I'm not really sure about that either. Do we have any recourse for handling large documents (in excess of 20M)? Our users are already complaining about performance, but now I don't really know how to address the issue. What do you think? Just FYI: I am working with a large text file around 31M. When opening the file in the default editor, the used heap grows to about 451M, nearly 15 times the size of the file. I can garbage collect, which brings the used heap to about 224M, but that is still nearly 7 times the size of the file. I get the same results with both the orginal (as is) eclipse code and my modified eclipse code. I would love see the used heap be around 3 times the file size. This is my quest. I know that one of the problems I have addressing this issue is that I don't really have a good java profiler at my disposal to determine precisely where the trouble is. I hate to be a bother, but your assistance would be greatly appreciated.
There are other things besides the Document that may use memory - for example, if you're not using the text editor but a custom editor, it may allocate all kinds of objects to track the document structure. Also, if you're looking into using the text infrastructure for files larger than 10 or 20 MB, it may be interesting to look at the memory-mapping features of java.nio. Have fun...
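The java.nio route mentioned above can be sketched like this. It is not Eclipse code, just a minimal standalone example: the file is mapped read-only, so the operating system pages its contents in on demand and the Java heap is never charged for the whole file at once.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedRead {

    // Memory-map a file read-only and scan it for a given byte.
    // Only the pages actually touched are resident; the heap holds
    // just the small buffer object, not a copy of the file.
    static long countBytes(String path, byte target) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer buf =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long count = 0;
            while (buf.hasRemaining()) {
                if (buf.get() == target) {
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countBytes(args[0], (byte) '\n') + " lines");
    }
}
```

Note the trade-off: a mapped buffer yields bytes, not chars, so an editor built on it would still need to decode (and usually copy) whatever region is actually displayed or edited.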
For a normal editor that does not use quick diff, it should be about once the file size. If quick diff is enabled, three times the file size is used upon opening, and two times once it is fully open. Seven times is definitely too much. Can you tell who's holding on to this memory?
In the case where I stated the used heap size for a 31 MB file, I was just using the default Eclipse text editor. I am not sure where all the memory is being consumed (or rather who is doing it), but the behavior I notice is that when opening the file, the memory only goes to a certain point (~192 MB, and sits there for a second), but then it all of a sudden skyrockets to the 451 MB level.

I was wondering if it wasn't the Abstract/DefaultLineTracker, which creates a list of Line objects. Obviously there are very many lines in this 31 MB file, so that would translate to a large collection of Line objects. I tried to have the line tracker not create all the lines at once, but that causes a number of problems as well. I considered having the lines built as needed (the ol' lazy-loading idea), but because I wasn't sure if that is really where the problem is, it didn't seem worth the effort it would take.

So, to make a long story longer, I don't really know where the memory hog is. I don't have access to a good memory debugger at this point. The profiler(s) I do have for Eclipse don't really work for memory debugging, or at least they are not intuitive to me.
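For comparison with the per-line Line objects discussed above: this is not how Eclipse's DefaultLineTracker is implemented, just a sketch of the packed-offsets alternative, under the simplifying assumption that '\n' is the only delimiter. Storing one int per line instead of one object per line avoids the per-object header and pointer overhead that multiplies quickly in a file with millions of lines.

```java
import java.util.Arrays;

public class OffsetLineIndex {

    // Record only the start offset of each line in a packed int[].
    // Line lengths can be derived on demand as starts[i+1] - starts[i],
    // so no per-line object is ever allocated.
    static int[] lineStarts(CharSequence text) {
        int[] starts = new int[16];
        int count = 0;
        starts[count++] = 0; // line 0 always starts at offset 0
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                if (count == starts.length) {
                    starts = Arrays.copyOf(starts, count * 2);
                }
                starts[count++] = i + 1; // next line begins after the delimiter
            }
        }
        return Arrays.copyOf(starts, count); // trim to the actual line count
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lineStarts("ab\ncd\n")));
        // prints "[0, 3, 6]"
    }
}
```

The rough arithmetic: an int[] costs 4 bytes per line, while a small Line object on a typical 32-bit VM of that era costs at least 16 bytes plus the slot in the containing list, so for a million-line file this is tens of megabytes of difference.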
Please note that the memory shown in the Windows Task Manager and similar system tools is not very useful and only leads to speculation, because some VMs never give allocated heap memory back to the OS even after the objects in it are gone. The only way to see what's going on is to use a profiler. Having said that, if you think there's a memory leak or too much memory is used, then file a bug report and we can look at it.
I was using the org.eclipse.ui.tools.heapstatus plugin to monitor memory usage and force garbage collection. I don't know how accurate it really is, but it is much more accurate than the Windows Task Manager. I am concerned about memory usage because all the classes involved are Eclipse platform classes; none of our/my own classes are in play here (I am using the default Eclipse editor, document, document provider, text store, and line tracker). There really just does not seem to be support for large documents in Eclipse.
Created attachment 18191 [details] Zip file containing LARGE text file

I know the attachment is large, but that's the point. It includes a large (IMHO) text file of about 15 MB (the original was around 31 MB, but I couldn't compress it enough to send it). I would appreciate it if you would just observe how Eclipse behaves when opening it in the default text editor. Files of this size are not uncommon for my users, but I really don't think that the Eclipse document/text framework scales (well) to files this large. I don't mean any disrespect; it's just that the contents of the file seem to be multiplied about 7 or 8 times in memory (at best, and sometimes as much as 20 times). Passing Strings around just ends up making more and more copies.
*** This bug has been marked as a duplicate of 75086 ***