Re: [rdf4j-dev] SHACL sail refactor



On 06/04/18 04:18, Håvard Ottestad wrote:
Hi Jeen,

Thanks for helping out :) The code isn’t very clean, no :(

Some basic concepts that I’ve worked on:

- A need to convert the SHACL rules from RDF into a Java model with
methods that represent the de-normalized rules, i.e. the AST.

- Based on what has changed in a transaction, we need to generate a
plan for how to get the data to validate. These are the plan nodes.
The tracking is done with the two memory stores in the connection
(initially in two hash sets).

- A need to transport data and track where it came from, so we can
tell the user that they broke a SHACL rule because of some triples
added or removed. This is done by the Tuple class.
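
To illustrate the third point, here is a minimal sketch of a tuple that carries its own provenance. It is not the actual Tuple class from the SHACL sail; the names `TrackedTuple`, `Change` and `explain` are hypothetical, and triples are plain strings to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the Tuple class in the SHACL sail.
enum Change { ADDED, REMOVED }

final class TrackedTuple {
    private final List<String> values;  // the values being validated
    private final String causeTriple;   // the triple that triggered validation
    private final Change change;        // whether that triple was added or removed

    TrackedTuple(List<String> values, String causeTriple, Change change) {
        this.values = Collections.unmodifiableList(new ArrayList<>(values));
        this.causeTriple = causeTriple;
        this.change = change;
    }

    // Builds a message that tells the user which change broke which shape.
    String explain(String shape) {
        return "Shape " + shape + " violated by " + values
                + " because triple <" + causeTriple + "> was "
                + change.name().toLowerCase(Locale.ROOT);
    }
}
```

Because the cause travels with the data through the plan nodes, whichever node detects the violation can report it without looking anything up.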

All sounds very good.

I don’t want to mix data with rules. Regular SQL databases don’t
store their schemas inside the user’s database. The SHACL sail should
support multiple options for loading in rules and options for
updating rules. But updating rules should be done explicitly with an
explicit command (also a good reason for not having them inside the
user data). Updating rules is not currently supported.

+1 on supporting multiple options for loading in rules.

What I prefer is that at the level of the API, the rules are exposed as
a 'virtual' named graph, e.g. 'rdf4j:Shapes'. The idea would be that the
rules can be examined and updated through normal SPARQL / Repository API
operations that explicitly use this named graph identifier.
However, the ShaclSail would ensure that this named graph is not
included in the default graph, so "normal" queries/updates won't access
the rules data. This means we still have separation between rules and data.

There are several advantages to this approach. First of all, access and
manipulation is available via the Repository API directly, using generic
methods, so it won't involve any user code that needs to call special
custom methods at the level of the ShaclSail (this is an important
design consideration - user code should not be required to do any
operations at the level of the Sail beyond initial
config/initialization). It also gives us multiple upload/modification
options for the rules, for free. Supporting rule updates becomes easier
too, as the ShaclSail can detect changes to the rules in a named graph
more easily than if they are offered via a separate repository object.

Also note that in this setup, the decision on whether the rules data
lives in the same physical store as the user data is up to the Sail itself.
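
A toy model of this separation, in plain Java rather than real RDF4J API, might look like the sketch below. Everything in it is an assumption made for illustration: the class `ToyShaclStore`, its methods, the string-based statements, and the exact form of the graph identifier.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model only; none of this is RDF4J API.
final class ToyShaclStore {
    // Hypothetical identifier for the 'virtual' shapes graph.
    static final String SHAPES_GRAPH = "rdf4j:Shapes";

    private final Map<String, Set<String>> graphs = new HashMap<>();
    private int rulesVersion = 0;  // bumped whenever the shapes graph changes

    // Generic add: rules and data go through the same method.
    void add(String context, String triple) {
        boolean changed = graphs
                .computeIfAbsent(context, k -> new LinkedHashSet<>())
                .add(triple);
        if (changed && SHAPES_GRAPH.equals(context)) {
            rulesVersion++;  // the store notices rule updates by itself
        }
    }

    // "Normal" query: union of all graphs except the shapes graph.
    Set<String> queryDefault() {
        Set<String> result = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> e : graphs.entrySet()) {
            if (!SHAPES_GRAPH.equals(e.getKey())) {
                result.addAll(e.getValue());
            }
        }
        return result;
    }

    // Explicit query: rules are visible only when the graph is named.
    Set<String> queryGraph(String context) {
        return graphs.getOrDefault(context, Collections.emptySet());
    }

    int rulesVersion() {
        return rulesVersion;
    }
}
```

In this sketch, loading rules is just `add(ToyShaclStore.SHAPES_GRAPH, ...)`, and `queryDefault()` never returns them. A real implementation would have to make the same guarantee for SPARQL queries and updates.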

And, btw, I do have a branch where I’m working on supporting
sh:datatype, and someone else was working on some cleanup and support
for string-based restrictions. Can I maybe merge my branch into
yours? It’s mostly done, and it also contains cleanup of some of the
tests.

I haven't really made as much progress on my changes as I expected (I
blame jet lag), so it might in fact be easier if I just wait until
you're ready to merge and then pull the updates into my branch. Were you
working from master or are you treating this as feature dev work?

Cheers,

Jeen

