[smila-dev] Lucene indexing performance

Hi all,

During an index build (over 150,000 documents) we noticed that indexing
speed drops as the index grows. In the second hour of the run, only 80%
as many documents were indexed as in the first hour.

I took a look at the Lucene integration code (by brox) and found that
for each index update (add or delete) a new IndexWriter is created and
closed. This ensures that the document is committed and visible to
IndexReaders and that the index is flushed, but I suspect it hurts
performance.

What were the reasons for implementing it that way? Wouldn't it be
possible to reuse an IndexWriter and flush the index based on either
memory usage or the number of documents added/deleted?
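
A minimal sketch of the reuse idea, under stated assumptions: BatchedIndexer
and its nested Writer interface are hypothetical names, not part of the
existing integration code or of Lucene. Writer stands in for a single
long-lived IndexWriter, and the byte count is only a rough estimate of
buffered data. The wrapper commits once a threshold of pending operations
or pending bytes is reached, instead of on every add or delete.

```java
// Hypothetical sketch: batch updates against one long-lived writer and
// commit only when a threshold is crossed.
class BatchedIndexer {

    /** Hypothetical stand-in for the long-lived index writer. */
    interface Writer {
        void add(String doc);
        void delete(String id);
        void commit();
    }

    private final Writer writer;
    private final int maxPendingOps;     // commit after this many adds/deletes
    private final long maxPendingBytes;  // or after this much buffered data
    private int pendingOps = 0;
    private long pendingBytes = 0;

    BatchedIndexer(Writer writer, int maxPendingOps, long maxPendingBytes) {
        this.writer = writer;
        this.maxPendingOps = maxPendingOps;
        this.maxPendingBytes = maxPendingBytes;
    }

    void add(String doc) {
        writer.add(doc);
        record(doc.length() * 2L); // rough estimate: 2 bytes per char
    }

    void delete(String id) {
        writer.delete(id);
        record(id.length() * 2L);
    }

    /** Commits pending changes, if any, so that new readers can see them. */
    void flush() {
        if (pendingOps == 0) {
            return; // nothing buffered, skip the commit
        }
        writer.commit();
        pendingOps = 0;
        pendingBytes = 0;
    }

    // Tracks one buffered operation and commits when a limit is reached.
    private void record(long bytes) {
        pendingOps++;
        pendingBytes += bytes;
        if (pendingOps >= maxPendingOps || pendingBytes >= maxPendingBytes) {
            flush();
        }
    }
}
```

The trade-off is that a change only becomes visible to newly opened readers
after the next commit, so flush() would also need to be called at the end of
a build and before the writer is closed.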

Bye,
Daniel
