Re: [cbi-dev] Some questions on CJE

Hi Karsten,

See my answers below.


- We have now seen builds killed several times due to lack of heap memory. How much heap does a job have, and can this somehow be configured for the pod instances?

If you don't specify anything, each container on CJE gets 4GiB of RAM. Note that this is specific to CJE as we don't have fine control over resource allocation there. With JIRO JIPPs, you will get what we provide to projects as specified on https://wiki.eclipse.org/CBI#What.27s_provided.3F.

The heap of your JVM(s) is defined by their ergonomics or by the Xmx and related options you can set at startup. If the total RAM consumed by all processes in your container goes above 4GiB, your container will be killed by the OOMKiller.
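
To see what heap limit a JVM actually ended up with, something along these lines could be run inside the build container. It is only a rough sketch: the class name HeapBudget and the 512MiB reserve for non-heap memory (metaspace, thread stacks, other processes) are assumptions for illustration, not values provided by CJE. Only the 4GiB container limit comes from the explanation above.

```java
public class HeapBudget {
    // Container limit described above: 4GiB per container on CJE.
    static final long CONTAINER_LIMIT_BYTES = 4L * 1024 * 1024 * 1024;

    // Convert a byte count to whole MiB.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // True if the heap plus a reserve for everything else fits under the limit.
    // Written as a subtraction so that a huge maxHeapBytes cannot overflow.
    static boolean fitsInContainer(long maxHeapBytes, long nonHeapReserveBytes) {
        return maxHeapBytes <= CONTAINER_LIMIT_BYTES - nonHeapReserveBytes;
    }

    public static void main(String[] args) {
        // Maximum heap this JVM will try to use (set by ergonomics or -Xmx).
        long maxHeap = Runtime.getRuntime().maxMemory();
        // Hypothetical reserve for non-heap memory; adjust to your build.
        long reserve = 512L * 1024 * 1024;

        System.out.println("Max heap: " + toMiB(maxHeap) + " MiB");
        System.out.println("Fits under 4GiB with reserve: "
                + fitsInContainer(maxHeap, reserve));
    }
}
```

On Java 11 or later this can be launched directly, for example with `java -Xmx3g HeapBudget.java`. Keep in mind that each JVM in the container (a Maven build, forked test JVMs, and so on) has its own heap, and it is the sum of all of them plus non-heap memory that counts against the 4GiB.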


- We had long build wait queues, and sometimes only 2 pods were started, sometimes a few more. Sometimes jobs were started but waited a long time for an available executor. How can the waiting time be optimised? Is there something we could do to mitigate that issue?

You should only be able to have 2 agents running at any given time. We don't have tight control over that on CJE, so sometimes you may get more. Jobs will stay in the queue as long as the 2 agents are busy.


- We tried to execute some build steps in parallel using parallel stages in Jenkinsfile. These jobs were killed due to OOM. We changed that now back to sequential stages. Is parallel building discouraged?

No, it's not discouraged. If the parallel stages are set to run on the same agent, then you will most probably be hit by OOM, because they all share that agent's memory limit. You can specify a different agent for each parallel stage, so each stage will run on a separate agent (it's all defined by labels). Of course, you won't be able to start more than 2 agents at the same time, as explained above.

I'm about to start writing a section on the wiki about how resource allocation happens on the new clustered infrastructure. Hopefully, it will provide enough info.

Cheers,

Mikaël Barbero 
Team Lead - Release Engineering | Eclipse Foundation
📱 (+33) 642 028 039 | 🐦 @mikbarbero
Eclipse Foundation: The Platform for Open Innovation and Collaboration


