Re: [jetty-users] SolrJ/Solr: HTTP protocol violation: Authentication challenge without WWW-Authenticate header

I usually look for low-effort changes to test assumptions and tug on the tangle from different directions when I'm stuck. In that spirit:

The queries gathering documents from the source are faster with a filter in place, so they feed data to the queue faster.  I think this is probably because it is only sorting a few million documents for each document batch instead of the full 30 million.

If you can turn off the sorting for a test, that would prove the assumption. 
 
I just looked up what SolrJ does. No insights, but now I'm curious how well the parallel indexing performs compared to PostgreSQL's single-threaded index builds. I wonder if it could speed up your bigger query's sort time.
 
Bill

--

Phobrain.com


On 2023-05-31 19:58, Shawn Heisey wrote:

On 5/31/23 17:48, Bill Ross wrote:
Can you swap in another httpclient to test? I assume swapping jetty server would be too much, given something works. :-)

I can't do anything about the Jetty server without upgrading Solr.  I really want to get them upgraded, but it's not up to me.

I tried to use the legacy SolrJ clients that utilize Apache HttpClient 4.x, but for an unknown reason I was not able to get those clients to work.  I am not using any http client directly, I use SolrJ.  Layers upon layers.  I am completely shielded from any direct interaction with the Jetty client by SolrJ.

Given the faster result with a smaller batch size: are you monitoring memory use? I'd try even smaller batches and look at the perf profile for clues.

The queries gathering documents from the source are faster with a filter in place, so they feed data to the queue faster.  I think this is probably because it is only sorting a few million documents for each document batch instead of the full 30 million.

I was running my program with a 1GB heap.  With a queue size of 100000 or 150000, that worked well.

I later bumped the queue size to 200000 and had to bump the heap because I got an OutOfMemoryError (OOME).  The space is consumed by the SolrInputDocument objects on the queue.  I set the heap to 2GB for the 200K queue size.  Now the max queue size is 500K and the heap is 5GB.  A larger queue evens out the transfer of data from the query thread to the indexing threads and keeps the migration from stalling.
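
For readers following along, the queue arrangement described above can be sketched roughly as below. This is only an illustration and not the actual migration program: the `Doc` class is a hypothetical stand-in for SolrInputDocument, `migrate` and its parameters are made-up names, and the "indexing" step just counts documents instead of sending them to Solr. It assumes a single query thread feeding a bounded `ArrayBlockingQueue` that several indexing threads drain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedQueueSketch {
    // Hypothetical stand-in for SolrInputDocument.
    static final class Doc {
        final int id;
        Doc(int id) { this.id = id; }
    }

    // Sentinel that tells an indexing thread to stop.
    private static final Doc POISON = new Doc(-1);

    // Moves totalDocs documents through a bounded queue and returns how many were "indexed".
    static int migrate(int totalDocs, int queueCapacity, int indexerThreads) {
        BlockingQueue<Doc> queue = new ArrayBlockingQueue<>(queueCapacity);
        AtomicInteger indexed = new AtomicInteger();
        List<Thread> indexers = new ArrayList<>();

        for (int i = 0; i < indexerThreads; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        Doc d = queue.take();          // blocks while the queue is empty
                        if (d == POISON) {
                            break;
                        }
                        indexed.incrementAndGet();     // real code would send the doc to the target
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            indexers.add(t);
        }

        try {
            // The calling thread plays the role of the query thread.
            for (int i = 0; i < totalDocs; i++) {
                queue.put(new Doc(i));                 // blocks while the queue is full
            }
            for (int i = 0; i < indexerThreads; i++) {
                queue.put(POISON);                     // one sentinel per indexing thread
            }
            for (Thread t : indexers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("migration interrupted", e);
        }
        return indexed.get();
    }

    public static void main(String[] args) {
        System.out.println(migrate(10_000, 500, 4));
    }
}
```

The queue capacity is what bounds memory: at most that many documents wait on the heap at once, and `put` makes the query thread pause when the indexers fall behind. The figures quoted above (2GB for 200K, 5GB for 500K) work out to roughly 10 KB of heap budgeted per queued document, which is a budget rather than a measured document size.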

I'm using ZGC to optimize for latency.  The code is compiled for Java 11.

Thanks,
Shawn
_______________________________________________
jetty-users mailing list
jetty-users@xxxxxxxxxxx
To unsubscribe from this list, visit https://www.eclipse.org/mailman/listinfo/jetty-users