Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
Re: [rdf4j-dev] potential iterator leak in repository handling?

I haven’t looked at the purpose of the SailModel, but I assumed it was more for convenience rather than performance, I could very well be wrong though. If you are just using it for the Sail config stuff then there shouldn’t be much data that would need to be materialised. 

As for making the SailModel AutoCloseable, I would assume that the code that creates the SailModel would be responsible for closing it. The SailConnection needs to closed anyways, so also making sure to close the SailModel would hopefully fit together with that. 

Håvard

On 12 Apr 2023, at 01:53, Jeen Broekstra <jeen@xxxxxxxxxxxx> wrote:




On Wed, 12 Apr 2023, at 08:14, Håvard Ottestad wrote:
We could make the SailModel materialise everything. E.g. a CloseableIteration from the SailConnection would be copied to a list, and then we return the list and close the iteration. That would consume more memory, but it would be much safer. 

Doesn't that kind of defeat the purpose of having the SailModel in the first place though? I think it was designed to be able to deal with collections that don't fit in memory. 


We could also track all things that are closeable in a concurrent list and add a close method to the SailModel that closes everything in the list. Essentially making the SailModel AutoCloseable.

That might solve part of the issue, but unless someone is aware they are using a SailModel (and therefore explicitly cast to that) they won't get any hints or notifications about leaks. 

I think we need some way to signal at the (top) interface level that something is (Auto)Closeable. The problem is that Model is not the right place for that (because we don't want Model to become harder to use for its intended usage). Sounds to me like really, SailModels should not be Models, but implement some other interface that is _almost_ like Model, except Autocloseable :) 

Jeen

 

Håvard

On 11 Apr 2023, at 21:26, Jeen Broekstra <jeen@xxxxxxxxxxxx> wrote:


Thanks Matthew, that makes sense. It's a known issue with SailModel and anything based on it, which is why in RDF4J at least we're careful to only use it internally, in code where we have control over what happens with the iterators (still not ideal though).

The problem is really what you say: the designs are fundamentally at odds. I have been wondering if we should have a different top interface for SailModel etc to derive from, to be able to keep the use cases separate.

Jeen 

On Wed, 12 Apr 2023, at 05:23, Matthew Nguyen wrote:
Hey folks, wanted to close the loop here.  It turns out we're using SailModel which takes a SailConnection in the constructor (https://github.com/eclipse/rdf4j/blob/main/core/sail/model/src/main/java/org/eclipse/rdf4j/sail/model/SailModel.java#L60) which can produce AutoCloseable iterators.  SailModel derives (invariably) from Model and implements a fair number of methods using AutoCloseable friendly schemes but it does miss a few things like getStatements()(https://github.com/eclipse/rdf4j/blob/main/core/model-api/src/main/java/org/eclipse/rdf4j/model/Model.java#L172)  which can devolve into AutoCloseable unaware code.  In short, these two classes aren't really compatable and either require a fix and/or better separation.


-----Original Message-----
From: Jeen Broekstra <jeen@xxxxxxxxxxxx>
To: rdf4j developer discussions <rdf4j-dev@xxxxxxxxxxx>
Sent: Thu, Apr 6, 2023 10:09 pm
Subject: Re: [rdf4j-dev] potential iterator leak in repository handling?


On Fri, 7 Apr 2023, at 11:54, Matthew Nguyen via rdf4j-dev wrote:

Think some context may have been lost along the way but I'm referring to models.getstatements() returning an iterator that is not AutoCloseable.

Ok this seems to be the crux of the thing.

First of all, a nitpick: model.getStatements does not return an iterator. It returns an Iterable. The intent is that it can be used in a for-each loop.

Implementations of Iterable are things like Model, or HashSet, or ArrayList. None of these implement AutoCloseable. If you are suggesting that getStatements should return an AutoCloseable, then Model itself should also be AutoCloseable, for consistency's sake.

It would be technically possible to do this of course, but it breaks one of the main design motivations for the Model API: ease of use for people famiiiar with the Java Collection framework. Suddenly their IDEs will start screaming at them at every point where they're creating/accessing a model that they have potential resource leaks and should wrap things in try-with-resources blocks, when in 99.9% of use cases there is no need for this, as all model implementation we provide for use in third party code are in-memory collections, and not database or file-backed.

Cheers,

Jeen

_______________________________________________
rdf4j-dev mailing list
To unsubscribe from this list, visit https://www.eclipse.org/mailman/listinfo/rdf4j-dev
_______________________________________________
rdf4j-dev mailing list
rdf4j-dev@xxxxxxxxxxx
To unsubscribe from this list, visit https://www.eclipse.org/mailman/listinfo/rdf4j-dev
_______________________________________________
rdf4j-dev mailing list
To unsubscribe from this list, visit https://www.eclipse.org/mailman/listinfo/rdf4j-dev


_______________________________________________
rdf4j-dev mailing list
rdf4j-dev@xxxxxxxxxxx
To unsubscribe from this list, visit https://www.eclipse.org/mailman/listinfo/rdf4j-dev

Back to the top