Re: [jetty-dev] JETTY-1245 and buffer investigations


Simone,

I'm responding to the jetty-dev list, as this is relevant there.

On 4 March 2011 02:59, Simone Bordet <sbordet@xxxxxxxxxxx> wrote:
Hi,

while working on http://jira.codehaus.org/browse/JETTY-1245, I
stumbled upon a strange behavior of the CometD load client with SSL,
that we had already seen in the past, but did not investigate.
The behavior is that the client seems to be consuming more resources
than the server, in particular memory but also CPU.

After much fiddling, it turned out to be a problem caused by the
combination of buffer sizes and ThreadLocalBuffers.
What happens is that sometimes a buffer of size "application_buffer"
is requested, and at other times a buffer of size "packet_buffer" is
requested, and the two sizes are different.

However, since ThreadLocalBuffers checks for size equality before
reusing buffers (it only reuses buffers of the same size), unfortunate
combinations cause the buffers to be discarded in > 40% of cases for
the "header" buffer and > 25% of cases for the "other" buffer (I
counted the hits and misses).
This leads to an excessive creation of direct buffers, which causes
JETTY-1245. Basically we are caching a lot of buffers thread-locally,
but about 40% of the time they are the wrong size, so new ones
are created.

In a typical CometD load run, something like 25k buffers are created
(I know this from JDK 7's new MBean that exposes DirectBuffer
information), and since each is almost 17 KiB in size, that makes > 400 MiB
of direct buffer memory, which is dangerously close to the max direct
memory (roughly equal to the heap size) when the JVM is started with
-Xmx512m. Depending on the rate of messages, it's easy to run into an
OOME.
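The arithmetic behind that figure can be checked quickly (the exact per-buffer size is an assumption here, taking "almost 17k" as 17 KiB):

```java
// Sanity check of the estimate above: ~25k direct buffers of ~17 KiB each.
public class DirectMemoryEstimate {
    public static void main(String[] args) {
        long buffers = 25_000;              // buffers created in a load run
        long bytesEach = 17 * 1024;         // "almost 17k" per buffer (assumed)
        long totalMiB = buffers * bytesEach / (1024 * 1024);
        System.out.println(totalMiB + " MiB of direct memory");
        // > 400 MiB, uncomfortably close to a 512 MiB cap
    }
}
```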

I have written an alternative implementation of Buffers that uses a
single concurrent map, keyed on buffer size, holding a concurrent
queue of buffers; with it the miss rate is < 2% even on the first run
(and approaches zero on subsequent runs).
While both are non-blocking structures, the map is a single access point
shared by multiple threads, so perhaps we need some discussion of
pros and cons, or of how to improve ThreadLocalBuffers.
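A minimal sketch of such a size-keyed pool (this is an illustration, not the actual Buffers implementation; class and method names are invented, and it uses heap buffers for simplicity):

```java
import java.nio.ByteBuffer;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: a concurrent map from buffer size to a
// non-blocking queue of reusable buffers, so requests for different
// sizes never evict each other's buffers.
public class SizeKeyedBufferPool {
    private final ConcurrentMap<Integer, Queue<ByteBuffer>> pool =
        new ConcurrentHashMap<>();

    public ByteBuffer acquire(int size) {
        Queue<ByteBuffer> queue = pool.get(size);
        ByteBuffer buffer = (queue == null) ? null : queue.poll();
        if (buffer == null)
            return ByteBuffer.allocate(size); // miss: allocate a new one
        buffer.clear();                       // hit: recycle
        return buffer;
    }

    public void release(ByteBuffer buffer) {
        pool.computeIfAbsent(buffer.capacity(), k -> new ConcurrentLinkedQueue<>())
            .offer(buffer);
    }
}
```

Both ConcurrentHashMap and ConcurrentLinkedQueue are lock-free on the hot path, but they are still shared state, hence the contention question raised above.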

I am now also investigating why, even without SSL, the client
still requires more resources than the server, as I suspect it could be
related.

Simon
--
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz


The ThreadLocal buffer store was used because of contention on a central store.
But that predates the availability of concurrent maps.

Given the different types and sizes of buffers, it was always going to be a compromise.

SSL etc. are not using direct buffers, because the SSL engine needs to access them to encrypt/decrypt.

So I'm open to using a concurrent map - perhaps with a ThreadLocal to hold the last returned buffer, so if that covers the 60% case, then 60% of the time we'll avoid any spinning on the concurrent structures.
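That hybrid could look something like the following (again a hypothetical sketch with invented names, not code from Jetty; the per-thread slot serves the common same-size case, and the shared map is only touched on a miss):

```java
import java.nio.ByteBuffer;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: a ThreadLocal holding the last released buffer
// as a contention-free fast path, backed by a shared size-keyed pool.
public class HybridBufferPool {
    private final ThreadLocal<ByteBuffer> last = new ThreadLocal<>();
    private final ConcurrentMap<Integer, Queue<ByteBuffer>> shared =
        new ConcurrentHashMap<>();

    public ByteBuffer acquire(int size) {
        ByteBuffer cached = last.get();
        if (cached != null && cached.capacity() == size) {
            last.set(null);   // fast path: no shared state touched
            cached.clear();
            return cached;
        }
        Queue<ByteBuffer> queue = shared.get(size);
        ByteBuffer buffer = (queue == null) ? null : queue.poll();
        if (buffer == null)
            return ByteBuffer.allocate(size);
        buffer.clear();
        return buffer;
    }

    public void release(ByteBuffer buffer) {
        if (last.get() == null)
            last.set(buffer); // keep the most recent buffer per-thread
        else
            shared.computeIfAbsent(buffer.capacity(), k -> new ConcurrentLinkedQueue<>())
                  .offer(buffer);
    }
}
```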

cheers
