Re: [rdf4j-dev] shac, statistics on classes etc

Hi Bart, does this tool support SHACL-SPARQL constraints? (https://www.w3.org/TR/shacl-af/)


-----Original Message-----
From: Bart Hanssens (BOSA) via rdf4j-dev <rdf4j-dev@xxxxxxxxxxx>
To: rdf4j developer discussions <rdf4j-dev@xxxxxxxxxxx>
Cc: Bart Hanssens (BOSA) <bart.hanssens@xxxxxxxxxxxx>
Sent: Thu, Apr 27, 2023 8:03 am
Subject: [rdf4j-dev] shac, statistics on classes etc

Hi,
 
Just a quick note and some thoughts.
 
I’m developing a stand-alone SHACL validator, nothing fancy, which is to be integrated in my data.gov.be toolchain.
 
Of course the SHACL part works like a charm, thanks Håvard 😉
Only a few minor issues that will either be solved in 4.3 (severity level),
or are arguably issues with the SHACL files on semic.eu (name on nodeshape, and nodeshapes with empty shacl:property)
 
I was wondering if it would be hard (or interesting for other people) to collect statistics on
  1. number of times a shape did _not_ have validation issues, or how many times a shape matched in total
  2. number of different classes/properties/object values in a dataset
 
The use case for (1) is mainly a data-quality metric (shape violations divided by total matches),
while (2) is useful for harmonizing data (e.g. reducing differences) and is probably useful for optimizing queries / data storage as well.
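 
A minimal sketch of both statistics, in plain Python rather than RDF4J, to show the counting involved. It assumes the data is available as (subject, predicate, object) tuples and that violation and match counts per shape have already been obtained from the validator. The function names and the sample triples are hypothetical, and treating the objects of non-rdf:type triples as "object values" is an assumption.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def dataset_stats(triples):
    """Count usage of classes, properties and object values.

    `triples` is any iterable of (subject, predicate, object) tuples.
    Returns three Counters; len() of each gives the number of distinct items.
    """
    classes, properties, objects = Counter(), Counter(), Counter()
    for _subject, predicate, obj in triples:
        properties[predicate] += 1
        if predicate == RDF_TYPE:
            classes[obj] += 1   # the object of rdf:type is a class
        else:
            objects[obj] += 1   # assumption: every other object is a "value"
    return classes, properties, objects


def violation_ratio(violations, matches):
    """Shape violations divided by total matches; None if the shape never matched."""
    return violations / matches if matches else None


# Hypothetical sample data, for illustration only.
triples = [
    ("ex:d1", RDF_TYPE, "dcat:Dataset"),
    ("ex:d2", RDF_TYPE, "dcat:Dataset"),
    ("ex:d1", "dct:language", "nl"),
    ("ex:d2", "dct:language", "NL"),
]
classes, properties, objects = dataset_stats(triples)
print(len(classes), len(properties), len(objects))
print(violation_ratio(3, 12))
```

In the sample, the two spellings "nl" and "NL" show up as separate object values, which is the kind of difference the harmonization use case is after.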
 
For the time being I’m (ab)using data cubes for publishing the statistics in TTL, but perhaps there is a better vocabulary.
And I’m guessing some data stores already collect some of this data.
 
Happy to look into it myself, though hints on how to get started would be appreciated 😊
 
 
Best regards,
 
Bart
 
 
