A large number of tasks (about 500 to 1000) for one user greatly decreases performance and may even lead to an OutOfMemoryError. The main problem is that tasks are saved as files on disk each time the user starts a long-running operation on the server. Some steps have already been taken to keep the number of saved tasks from growing:
* old tasks (older than 30 days) are removed at server start
* tasks may be marked as idempotent, and these are removed after being consumed by the client
* "Auth fail" tasks are removed as soon as the authentication error is handled by the client
Even with these steps in place, Szymon managed to accumulate over 1000 saved tasks, which made his Operations View unusable.
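For illustration, the first step listed above (removing tasks older than 30 days at server start) could look roughly like the sketch below. This is not the Orion implementation: the class and method names are hypothetical, it assumes each task is a regular file in a single directory, it assumes the file's last-modified time is a good enough proxy for the task's age, and it uses java.time, so it needs a newer Java than the 1.6 runtime seen in this report.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Orion: deletes task files older than maxAge.
final class OldTaskCleaner {

    // Returns the number of task files that were deleted.
    static int removeOlderThan(Path taskDir, Duration maxAge, Instant now) throws IOException {
        Instant cutoff = now.minus(maxAge);
        int removed = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(taskDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and other non-task entries
                }
                // Assumption: last-modified time approximates the task's age.
                Instant modified = Files.getLastModifiedTime(file).toInstant();
                if (modified.isBefore(cutoff)) {
                    Files.delete(file);
                    removed++;
                }
            }
        }
        return removed;
    }
}
```

Called once at server start with Duration.ofDays(30), this only inspects file metadata and never parses the task JSON, so the cleanup itself should not hit the memory problem described in this bug.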
Szymon ran out of memory while the task list was being generated for display. The error was thrown while reading one of the tasks to add to the list:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Unknown Source) ~[na:1.6.0_02]
    at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source) ~[na:1.6.0_02]
    at java.lang.AbstractStringBuilder.append(Unknown Source) ~[na:1.6.0_02]
    at java.lang.StringBuffer.append(Unknown Source) ~[na:1.6.0_02]
    at org.json.JSONTokener.nextString(JSONTokener.java:279) ~[na:na]
    at org.json.JSONTokener.nextValue(JSONTokener.java:343) ~[na:na]
    at org.json.JSONObject.<init>(JSONObject.java:205) ~[na:na]
    at org.json.JSONTokener.nextValue(JSONTokener.java:346) ~[na:na]
    at org.json.JSONArray.<init>(JSONArray.java:124) ~[na:na]
    at org.json.JSONTokener.nextValue(JSONTokener.java:350) ~[na:na]
    at org.json.JSONObject.<init>(JSONObject.java:205) ~[na:na]
    at org.json.JSONTokener.nextValue(JSONTokener.java:346) ~[na:na]
    at org.json.JSONObject.<init>(JSONObject.java:205) ~[na:na]
    at org.json.JSONTokener.nextValue(JSONTokener.java:346) ~[na:na]
    at org.json.JSONObject.<init>(JSONObject.java:205) ~[na:na]
    at org.json.JSONObject.<init>(JSONObject.java:419) ~[na:na]
    at org.eclipse.orion.server.core.tasks.TaskInfo.fromJSON(TaskInfo.java:65) ~[na:na]
    at org.eclipse.orion.internal.server.core.tasks.TaskService.getTasks(TaskService.java:229) ~[na:na]
    at org.eclipse.orion.internal.server.servlets.task.TaskServlet.doGet(TaskServlet.java:160) ~[na:na]
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:735) ~[javax.servlet_3.0.0.v201112011016.jar:na]
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848) ~[javax.servlet_3.0.0.v201112011016.jar:na]
    at org.eclipse.equinox.http.registry.internal.ServletManager$ServletWrapper.service(ServletManager.java:180) ~[na:na]
    at org.eclipse.equinox.http.servlet.internal.ServletRegistration.service(ServletRegistration.java:61) ~[na:na]
    at org.eclipse.equinox.http.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:38) ~[na:na]
    at org.eclipse.orion.server.configurator.servlet.AuthorizedUserFilter.doFilter(AuthorizedUserFilter.java:84) ~[na:na]
    at org.eclipse.equinox.http.registry.internal.FilterManager$FilterWrapper.doFilter(FilterManager.java:173) ~[na:na]
    at org.eclipse.equinox.http.servlet.internal.FilterRegistration.doFilter(FilterRegistration.java:81) ~[na:na]
    at org.eclipse.equinox.http.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:35) ~[na:na]
    at org.eclipse.orion.internal.server.hosting.HostedSiteRequestFilter.doFilter(HostedSiteRequestFilter.java:50) ~[na:na]
    at org.eclipse.equinox.http.registry.internal.FilterManager$FilterWrapper.doFilter(FilterManager.java:173) ~[na:na]
    at org.eclipse.equinox.http.servlet.internal.FilterRegistration.doFilter(FilterRegistration.java:81) ~[na:na]
    at org.eclipse.equinox.http.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:35) ~[na:na]
Created attachment 211265 [details]
Szymon's tasks

This is the list of Szymon's tasks. Nothing really unusual, except that plenty of the operations here should be marked as idempotent. Sometimes an idempotent task may not be removed by the client, for example when the site was closed before the operation finished. It may also happen when some debugging was going on in the page, or when an uncaught exception was thrown while tracking the operation. Maybe we could add some server-side logic that removes an idempotent operation after an hour, just in case the client scheduled it but did not wait for the result. When an operation is tracked correctly, the client removes it 5 seconds after it was read, provided that it finished without errors.
I looked at orion.eclipse.org and everyone seems to have fewer than 10 tasks, which is probably why we never had this problem there.
Created attachment 211276 [details]
Removing idempotent tasks after 15 minutes.

Szymon, this patch contains a fix that removes idempotent tasks after 15 minutes if they have not been removed by the client. Could you apply it and use it for a while? This is the only "first aid" I can imagine for 0.4, but I still think it needs some testing before we apply it.
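To make the idea concrete, here is a minimal sketch of such an expiry rule. It is not the attached patch: the Task class, its fields and the method names are hypothetical, and it assumes that only idempotent tasks that are no longer running become eligible once they are at least 15 minutes old.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the code from the attachment.
final class IdempotentTaskExpiry {
    static final Duration MAX_AGE = Duration.ofMinutes(15);

    // Simplified stand-in for a stored task; field names are assumptions.
    static final class Task {
        final String id;
        final boolean idempotent;
        final boolean running;
        final Instant lastModified;

        Task(String id, boolean idempotent, boolean running, Instant lastModified) {
            this.id = id;
            this.idempotent = idempotent;
            this.running = running;
            this.lastModified = lastModified;
        }
    }

    // True when the client evidently never came back to remove the task.
    static boolean isExpired(Task task, Instant now) {
        return task.idempotent
                && !task.running
                && Duration.between(task.lastModified, now).compareTo(MAX_AGE) >= 0;
    }

    // Returns the tasks that should be kept; expired ones are dropped.
    static List<Task> retainLive(List<Task> tasks, Instant now) {
        List<Task> kept = new ArrayList<>();
        for (Task task : tasks) {
            if (!isExpired(task, now)) {
                kept.add(task);
            }
        }
        return kept;
    }
}
```

Keeping the check in a pure function of the task and the current time makes the 15-minute threshold easy to test and to change, for example to the one hour suggested earlier.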
Don't know if this is the right bug, but I wanted to share my concern about the "idempotent" criterion used for removing tasks. I can imagine a situation where I would like to check whether an idempotent task (e.g. git log for a large repo) has finished (and when bug 370967). Not seeing it in the Ops view doesn't mean it's done, does it?
(In reply to comment #5)
> Not seeing it in the Ops view doesn't mean
> it's done, does it?

It means that it's not running. It might have finished or been cancelled. Do you have a particular case where a user wants to check the status of a task like git log and uses the Operations view to do so, rather than the place where the log was actually going to be consumed?
Bug 374094 - first occurrence of out of memory.
(In reply to comment #7)
> Bug 374094 - first occurrence of out of memory.

Isn't that a dupe of this bug? Also, the severity of the other bug sounds higher than "normal" to me. It seems it is no longer just a performance issue.
(In reply to comment #8)
> (In reply to comment #7)
> > Bug 374094 - first occurrence of out of memory.
>
> Isn't that a dupe of this bug? Also, severity of the other bug sounds like
> higher than "normal" to me. It seems it is no longer a performance issue.

I'd say it's a subtask. I think I may find a solution to Bug 374094, but there will still be a lot to do in the area of task performance. You are right about the severity of Bug 374094, though; I will increase it.
Most of the work in this area was done by 2.0, so I see no point in keeping a generic bug open. I have opened and planned bugs for the particular changes that can still be made to operations performance, and I am closing this bug.
(In reply to comment #10)
> I have opened/planned bugs for particular
> changes that can be done in area of operations performance (...)

What are the bug ids?
(In reply to comment #11)
> What are the bug ids?

Bug 382303, Bug 398005, Bug 374362