Re: [rdf4j-dev] SQL sail connection

Hi,

I’ve not used Halyard or HBase myself. My impression is that HBase is much lower level than an SQL database, and that it has lower latency.

As far as I remember, Jena SDB didn’t offload any logic to the SQL database. It might have had better performance if it had converted joins into SQL queries. 
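To illustrate what converting joins into SQL could look like, here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for a real SQL database. The `triples` table layout, the `ex:` sample data and the function names are assumptions made up for this example; this is not how Jena SDB or any RDF4J SAIL is implemented. The two triple patterns `?person ex:worksFor ?org . ?org ex:locatedIn ?city` are answered once as a single SQL self-join and once by fetching each pattern separately and joining in the client.

```python
import sqlite3


def build_store():
    """Create an in-memory triple table with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany(
        "INSERT INTO triples VALUES (?, ?, ?)",
        [
            ("ex:alice", "ex:worksFor", "ex:acme"),
            ("ex:bob", "ex:worksFor", "ex:initech"),
            ("ex:acme", "ex:locatedIn", "ex:oslo"),
            ("ex:initech", "ex:locatedIn", "ex:austin"),
            ("ex:carol", "ex:knows", "ex:alice"),
        ],
    )
    return conn


def join_in_database(conn):
    """Push the join down: one SQL statement, only joined rows come back."""
    sql = """
        SELECT t1.s, t2.o
        FROM triples AS t1
        JOIN triples AS t2 ON t2.s = t1.o
        WHERE t1.p = ? AND t2.p = ?
        ORDER BY t1.s
    """
    return conn.execute(sql, ("ex:worksFor", "ex:locatedIn")).fetchall()


def join_in_client(conn):
    """Fetch each pattern separately, then hash-join in application code."""
    pattern_sql = "SELECT s, o FROM triples WHERE p = ?"
    works_for = conn.execute(pattern_sql, ("ex:worksFor",)).fetchall()
    located_in = conn.execute(pattern_sql, ("ex:locatedIn",)).fetchall()

    # Build a lookup from organisation to its cities (the hash side).
    cities_by_org = {}
    for org, city in located_in:
        cities_by_org.setdefault(org, []).append(city)

    joined = [
        (person, city)
        for person, org in works_for
        for city in cities_by_org.get(org, [])
    ]
    rows_transferred = len(works_for) + len(located_in)
    return sorted(joined), rows_transferred
```

Both functions return the same pairs. The difference is that the client-side version has to pull every row matching each pattern out of the store first, which is where the network cost comes from on larger data.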

The ElasticsearchStore works in the same way. It stores triples in ElasticSearch, then retrieves them to perform the join using RDF4J. Not that ElasticSearch really has any support for joins. 

Halyard seems designed around the same principle, but I would think that HBase has lower latency. Halyard also places its RDF4J layer within each node and manages to run joins locally on each node. That’s a lot more efficient than moving all the data to a single node and performing the joins there, and it also allows for parallel joins.
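As a toy illustration of the general idea of joining locally on each node and in parallel, here is a small Python sketch. It is not Halyard's code and makes no claim about how Halyard or HBase actually partition data: the subject-hash partitioning, the function names and the sample data are all assumptions for the example. Because both patterns (`?x rdf:type ex:Person . ?x ex:name ?name`) share the subject, every triple needed for a match lives in the same partition, so each "node" can join on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def partition_by_subject(triples, node_count):
    """Assign each triple to a node based on a stable hash of its subject."""
    nodes = [[] for _ in range(node_count)]
    for s, p, o in triples:
        nodes[crc32(s.encode("utf-8")) % node_count].append((s, p, o))
    return nodes


def local_join(node_triples):
    """Join the two patterns using only the triples held by one node."""
    persons = {
        s for s, p, o in node_triples
        if p == "rdf:type" and o == "ex:Person"
    }
    return [
        (s, o) for s, p, o in node_triples
        if p == "ex:name" and s in persons
    ]


def distributed_join(triples, node_count=3):
    """Run the local joins concurrently and merge the partial results."""
    nodes = partition_by_subject(triples, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partial_results = list(pool.map(local_join, nodes))
    return sorted(row for part in partial_results for row in part)


SAMPLE_TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:acme", "rdf:type", "ex:Company"),
    ("ex:acme", "ex:name", "Acme"),
]
```

Only the already-joined rows are sent back for merging. A join whose patterns do not share the partitioning key would still need data to move between nodes, which this sketch does not cover.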

Håvard

On 29 Jun 2023, at 23:45, Dan S <danielms853@xxxxxxxxx> wrote:


Thank you so much, Håvard!

Do you happen to know if Halyard was any faster than Jena SDB (or similar) or Elasticsearch? I think most of our data would also be static. I was also thinking of using either an indexing scheme similar to Halyard's or multicolumn indexes to speed up triple lookups on the Postgres side. We don't have particularly high performance needs; our needs are mostly around stability and durability. We were hoping to use or develop an open source solution, so GraphDB doesn't seem like an option.
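A minimal sketch of the multicolumn-index idea, in Python with the standard-library sqlite3 module standing in for Postgres so that it runs without a server. The table layout, the index names (`idx_spo`, `idx_pos`, `idx_osp`) and the sample data are assumptions for illustration, not an existing RDF4J or Halyard schema. Each index orders the three columns differently, so that a triple pattern with some positions bound can be answered from the index whose leading columns match the bound positions.

```python
import sqlite3


def build_indexed_store():
    """Create a triple table with three multicolumn indexes (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (s TEXT NOT NULL, p TEXT NOT NULL, o TEXT NOT NULL)"
    )
    # One index per rotation of (subject, predicate, object).
    conn.execute("CREATE UNIQUE INDEX idx_spo ON triples (s, p, o)")
    conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
    conn.execute("CREATE INDEX idx_osp ON triples (o, s, p)")
    conn.executemany(
        "INSERT INTO triples VALUES (?, ?, ?)",
        [
            ("ex:alice", "rdf:type", "ex:Person"),
            ("ex:bob", "rdf:type", "ex:Person"),
            ("ex:acme", "rdf:type", "ex:Company"),
            ("ex:alice", "ex:worksFor", "ex:acme"),
        ],
    )
    return conn


SUBJECTS_SQL = "SELECT s FROM triples WHERE p = ? AND o = ? ORDER BY s"


def subjects_with(conn, predicate, obj):
    """Answer the pattern (?s, predicate, obj)."""
    return [row[0] for row in conn.execute(SUBJECTS_SQL, (predicate, obj))]


def query_plan(conn, sql, params):
    """Return SQLite's query plan text; the last column holds the description."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " ".join(row[-1] for row in rows)
```

Calling `query_plan(conn, SUBJECTS_SQL, ("rdf:type", "ex:Person"))` shows which index SQLite picks for this pattern. The same `CREATE INDEX ... ON triples (p, o, s)` form is valid in Postgres, though its planner makes its own choices and would need checking with `EXPLAIN` on real data.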

Thanks,

Daniel

On Thu, Jun 29, 2023 at 8:46 PM Håvard Ottestad <hmottestad@xxxxxxxxx> wrote:
Hi,

If you want a replicated database you could try GraphDB. It’s not SQL, but it’s ACID and supports replication. RDF4J NativeStore is also ACID but doesn’t support any replication, and it’s less battle tested compared to GraphDB.

My experience with Jena SDB has been that it had terrible performance and needed to move a lot of data on the network.

I’ve implemented the ElasticsearchStore, which would allow you to have replicated ElasticSearch instances. Performance is rather terrible too. It’s useful if you are already using ElasticSearch for something and just want to store a small amount of mostly static RDF. Which is how I’m using it.

I would recommend going with a database that already supports RDF and SPARQL.

Cheers,
Håvard M. Ottestad

> On 29 Jun 2023, at 20:05, Dan S <danielms853@xxxxxxxxx> wrote:
>
> 
> Hello,
>
> I was wondering whether there is any existing SQL SAIL implementation for RDF4J (particularly one targeting MySQL or Postgres), perhaps in the vein of Halyard.
>
> My team at work and I would be interested in an RDF4J implementation backed by a replicated store with ACID transactions (or, potentially, a Jena equivalent, but RDB seems to be deprecated).
>
> I would potentially be willing to contribute an implementation.
>
> Thank you,
>
> Daniel Scanteianu
>
> _______________________________________________
> rdf4j-dev mailing list
> rdf4j-dev@xxxxxxxxxxx
> To unsubscribe from this list, visit https://www.eclipse.org/mailman/listinfo/rdf4j-dev
