Re: [rdf4j-dev] SHACL sail refactor

From: Jeen Broekstra <jeen.broekstra@xxxxxxxxx>
Date: Sun, 8 Apr 2018 09:56:15 +1000
Delivered-to: rdf4j-dev@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0



On 06/04/18 04:18, Håvard Ottestad wrote:

Hi Jeen,

Thanks for helping out :) The code isn’t very clean, no :(

Some basic concepts that I’ve worked on:

- A need to convert the SHACL rules from RDF into a Java model with
methods that represent the de-normalized rules, i.e. the AST.

- Based on what has changed in a transaction, we need to generate a
plan for how to get the data to validate. These are the plan nodes. The
tracking is done with the two memory stores in the connection
(first in two hashsets).

- A need to transport data and track where it came from, so we can
tell the user that they broke a SHACL rule because of some triples
added or removed. This is done by the Tuple class.


All sounds very good.

I don’t want to mix data with rules. Regular SQL databases don’t
store their schemas inside the user's database. The SHACL sail should
support multiple options for loading rules and options for
updating rules. But updating rules should be done explicitly, with an
explicit command (also a good reason for not having them inside the
user data). Updating rules is not currently supported.


+1 on supporting multiple options for loading in rules.

What I prefer is that at the level of the API, the rules are exposed as
a 'virtual' named graph, e.g. 'rdf4j:Shapes'. The idea would be that the
rules can be examined and updated through normal SPARQL / Repository API
operations that explicitly use this named graph identifier.
However, the ShaclSail would ensure that this named graph is not
included in the default graph, so "normal" queries/updates won't access
the rules data. This means we still have separation between rules and data.

There are several advantages to this approach: first of all, access and
manipulation is available via the Repository API directly, using generic
methods, so it won't involve any user code that needs to call special
custom methods at the level of the ShaclSail (this is an important
design consideration - user code should not be required to do any
operations at the level of the Sail beyond initial
config/initialization). It also gives us multiple upload/modification
options for the rules, for free. Supporting rule updates becomes easier,
as the ShaclSail can detect changes to the rules in a named graph more
easily than if they are offered via a separate repository object.


Also note that in this setup, the decision on whether the rules data
lives in the same physical store as the user data is up to the Sail itself.
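
To make the idea a bit more concrete, here is a minimal sketch in plain
Java. It deliberately uses no RDF4J classes: the class name
ShapesGraphSketch, the string-based statements, the graph identifier
constant and the version counter are all assumptions made up for
illustration, not the actual ShaclSail design. It only shows the two
behaviours described above: reads without an explicit graph skip the
reserved shapes graph, and writes to that graph are easy to detect.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy quad store: a reserved shapes graph is hidden from default reads. */
class ShapesGraphSketch {

    // Hypothetical identifier; the mail only suggests something like 'rdf4j:Shapes'.
    static final String SHAPES_GRAPH = "rdf4j:Shapes";
    // Stand-in name for statements added without an explicit graph.
    static final String DEFAULT_GRAPH = "";

    private final Map<String, Set<String>> graphs = new HashMap<>();
    // Incremented whenever the shapes graph actually changes.
    private int shapesVersion = 0;

    /** Adds a statement to the given graph and records shape changes. */
    void add(String statement, String graph) {
        boolean changed = graphs
                .computeIfAbsent(graph, g -> new LinkedHashSet<>())
                .add(statement);
        if (changed && SHAPES_GRAPH.equals(graph)) {
            shapesVersion++;
        }
    }

    /** Adds a statement without naming a graph. */
    void add(String statement) {
        add(statement, DEFAULT_GRAPH);
    }

    /** Read without a graph: everything except the shapes graph. */
    List<String> getStatements() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : graphs.entrySet()) {
            if (!SHAPES_GRAPH.equals(entry.getKey())) {
                result.addAll(entry.getValue());
            }
        }
        return result;
    }

    /** Read that names a graph explicitly; this is the only way to see shapes. */
    List<String> getStatements(String graph) {
        Set<String> statements = graphs.getOrDefault(graph, Collections.<String>emptySet());
        return new ArrayList<>(statements);
    }

    /** Lets a validator notice that the rules changed since it last looked. */
    int shapesVersion() {
        return shapesVersion;
    }
}
```

In this sketch a caller that never mentions SHAPES_GRAPH never sees the
rules, while a caller that does name it reads and updates them through
the same add/getStatements methods as ordinary data. Whether the shapes
end up in the same physical store is hidden behind the class.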

And, btw, I do have a branch where I’m working on supporting
sh:datatype, and someone else was working on some cleanup and support
for string-based restrictions. Can I maybe merge my branch into
yours? It’s mostly done, and it also contains cleanup of some of the
tests.


I haven't really made as much progress on my changes as I expected (I
blame jet lag), so it might in fact be easier if I just wait until
you're ready to merge and then pull the updates into my branch. Were you
working from master or are you treating this as feature dev work?


Cheers,

Jeen
