Re: [jetty-users] Jetty Client 9 for Iudex crawler

Hi,

On Wed, Dec 12, 2012 at 2:20 AM, David Kellum <dek94@xxxxxxxxxxxxx> wrote:
> I'm working on upgrading use of Jetty Client in the Iūdex web crawler to
> 9.0.0.M3.  Firstly, despite the work of absorbing a rewrite, client 9.x
> looks to simplify several aspects of my integration.  I like the callback
> pure-interfaces and move to use nio ByteBuffers (which matches my
> internals.)  Thanks very much for the open source!
>
> I have a couple of questions below, mostly related to the changes in client
> 9.x vs client 7.x.  My current Client code is here:
>
> https://github.com/dekellum/iudex/blob/jetty-9/iudex-jetty-httpclient/src/main/java/iudex/jettyhttpclient/Client.java
>
> Timeouts
>
> Client 7.x had settings for timeout, soTimeout, connectTimeout and
> idleTimeout.  Client 9.x only has idleTimeout and connectTimeout. My timeout
> related integration tests do appear to work: is idleTimeout essentially used
> as an soTimeout and (catch all) timeout in client 9.x?

Documentation link:
http://www.eclipse.org/jetty/documentation/current/http-client.html.
Please report any documentation sections you find missing, or anything
else you would like to see in the docs.

In 7.x soTimeout was basically unused, especially with the NIO
connector. With that timeout out of the picture, 9.x gives you the
same timeouts as 7.x.
The idleTimeout has the same semantics as soTimeout for sockets,
while the global timeout for the whole request/response conversation
is achieved either with the Future returned by send(), or via the
utility class TimedResponseListener for asynchronous usage.
So, idleTimeout and the global timeout have two different meanings: you
can have a slow connection that sends 1 byte every second, so the
idleTimeout never fires, but the total timeout does fire.
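
To illustrate the difference between the two timeouts, here is a minimal
sketch that uses only JDK classes, not the Jetty API. The class
TotalTimeoutSketch and its methods slowExchange() and completesWithin() are
hypothetical names made up for this example; the slow exchange is simulated
with Thread.sleep() and stands in for a peer that keeps trickling data. The
gap between chunks stays small, so an idle timeout would never fire, yet a
bounded Future.get() still cuts the whole conversation short:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration class; nothing here is part of Jetty.
final class TotalTimeoutSketch {

    // Simulates a slow peer that delivers `chunks` pieces of data,
    // pausing gapMillis before each one. Returns the number delivered.
    static Callable<Integer> slowExchange(int chunks, long gapMillis) {
        return () -> {
            int delivered = 0;
            for (int i = 0; i < chunks; i++) {
                Thread.sleep(gapMillis); // short gap: an idle timeout would not fire
                delivered++;
            }
            return delivered;
        };
    }

    // Applies a total timeout to the whole exchange via Future.get().
    // Returns true if the exchange finished in time, false if it was cut off.
    static boolean completesWithin(Callable<Integer> exchange, long totalTimeoutMillis)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> future = executor.submit(exchange);
            try {
                future.get(totalTimeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException x) {
                future.cancel(true); // abandon the exchange, interrupting the worker
                return false;
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With 20 chunks at 50 ms each, the exchange needs about one second, so a
200 ms total timeout cuts it off even though no single pause is long.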

> Retry
>
> Client 7.x had a setting for maxRetries.  Is retry as a feature no longer
> supported in Client 9.x? Any plans to add it?

Not supported in 9.x.
Can you explain how you would use it, and how it would need to work to
be useful for you?

> Cookies
>
> I think I may have a use case for reuse of the new Cookie support,
> however, this being a crawler I need more control over it.  Is there a way
> to control what cookies are sent on a per-request basis while still using
> Jetty's pooled connections?  In other words I would like to introspect
> cookies from a response, possibly filtering, and then apply these to a
> subsequent request to the same registration-level domain but possibly via a
> different connection.   I believe this is not completely unlike how browsers
> behave.
>
> Short of this, is there a way to disable cookie storage and sending
> entirely, with pooled connections?

Cookie handling will change in M4, since we decided to base it on JDK
classes such as java.net.CookieStore, to avoid code duplication
and to ease integration with WebSocket.
You can inspect cookie headers in Response.HeadersListener callbacks,
by which point all headers have arrived.
To filter cookies that are stored, you can "wrap" the CookieStore
implementation in this way:

HttpClient client = ...;
client.setCookieStore(new FilteringCookieStore(new HttpCookieStore()));

HttpCookieStore is a utility class from the jetty-util module, and
FilteringCookieStore is a class you would write yourself that filters
cookies based on your logic, delegating to the inner CookieStore.

This is available in current master branch only, and shortly in M4.
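
As a rough sketch of what such a wrapper could look like, assuming only the
JDK interface java.net.CookieStore: the BiPredicate constructor parameter is
an assumption of this example (any filtering logic would do), and the
delegate can be any CookieStore implementation. Only add() is filtered;
everything else is passed straight through to the inner store.

```java
import java.net.CookieStore;
import java.net.HttpCookie;
import java.net.URI;
import java.util.List;
import java.util.function.BiPredicate;

// A user-written wrapper, not a Jetty class: stores a cookie only if the
// supplied predicate accepts it, and delegates all other operations.
final class FilteringCookieStore implements CookieStore {
    private final CookieStore delegate;
    private final BiPredicate<URI, HttpCookie> accept; // hypothetical filter hook

    FilteringCookieStore(CookieStore delegate, BiPredicate<URI, HttpCookie> accept) {
        this.delegate = delegate;
        this.accept = accept;
    }

    @Override
    public void add(URI uri, HttpCookie cookie) {
        // Silently drop cookies that the filter rejects.
        if (accept.test(uri, cookie)) {
            delegate.add(uri, cookie);
        }
    }

    @Override
    public List<HttpCookie> get(URI uri) {
        return delegate.get(uri);
    }

    @Override
    public List<HttpCookie> getCookies() {
        return delegate.getCookies();
    }

    @Override
    public List<URI> getURIs() {
        return delegate.getURIs();
    }

    @Override
    public boolean remove(URI uri, HttpCookie cookie) {
        return delegate.remove(uri, cookie);
    }

    @Override
    public boolean removeAll() {
        return delegate.removeAll();
    }
}
```

A predicate that always returns false, such as (uri, cookie) -> false, would
keep the store permanently empty, which is one way to approximate disabling
cookie storage.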

Thanks for the feedback !

Simon
--
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.
----
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

