Re: [rdf4j-dev] Split RDF4J Project

On Tue, 2017-08-29 at 22:17 +0100, Jacek Grzebyta wrote:
> On 28 August 2017 at 14:39, James Leigh <james.leigh@xxxxxxxxxxxx>
> wrote:
> > I propose the following setup:
> > 
> >  * /rdf4j           rio, repository-api, and http client
> >  * /rdf4j-storage   repository-sail, and all sail impls
> >  * /rdf4j-tools     server, workbench, console, runtime, bom,
> > assembly
> >  * /rdf4j-testsuite benchmark and testsuites
> >  * /rdf4j-doc       remains the home of documentation guides
> > 
> I imagine the new 'units' would be treated as git submodules (see
> https://git-scm.com/book/en/v2/Git-Tools-Submodules) rather than
> java-level modules (am I right?), so I suggest the "/rdf4j" unit would
> be the main repository, the one with the .gitmodules file. If so, I
> guess there should be a separate "policy of changes" for that.
> 

I envision them as separate projects with separate release artifacts
and staggered release dates. Similar to the way HttpComponents has
HttpClient and HttpCore. The URLs would be:

https://github.com/eclipse/rdf4j
https://github.com/eclipse/rdf4j-storage
https://github.com/eclipse/rdf4j-tools
https://github.com/eclipse/rdf4j-testsuite
https://github.com/eclipse/rdf4j-doc


> The role of that module would be to simplify compilation of the whole
> project. I think client, server and http might be deeply related
> to a bunch of other modules (risk of spaghetti???) so IMHO they
> should be kept in the same unit.
>  

Actually, the separation is fairly clean, as both http-client and
http-server depend on a common http-protocol module for shared terms.
Furthermore, with the resolution of #870 there is now a clean
dependency separation between rdf4j-client and rdf4j-storage
(rdf4j-client does not depend on any module in rdf4j-storage).

The dependency footprint of http-server is much larger than that of
http-client, as http-server uses the Spring Framework and depends on
all the SAIL modules. This prevents it from being deployed to platforms
that require a smaller footprint (like Android). Below is the file size
breakdown to show the significance of the separation.

rdf4j-client.jar is 1.7M or 5.4M including all dependencies.
rdf4j-storage.jar is 1.4M or 26M in additional dependencies.



> > Compliance tests would be split up into the corresponding module.
> > Each
> > GitHub project would be a fork of the current one, and each would
> > have master/develop branches and be set up to be built
> > nightly.
> > 
> > Issues and Pull-Requests would also be separated, but github allows
> > referencing issues and PRs in separate projects to aid with
> > integration.
> > 
> All java modules have dedicated integration tests within the
> testsuite. Usually they are all kept in the same PR. If we separate the
> code changes from the integration tests, how do we relate at least two
> PRs (3 PRs for the combination code+testsuite+documentation)?
> 

If there are any unit tests in the testsuite module, they should be
migrated to their corresponding module. For the most part, though, the
test suites are for compliance testing, and many are based on W3C test
suites, which are maintained separately anyway. Documentation is
already in its own project (/rdf4j-doc) and wouldn't change. However,
there will be cases when a bug is revealed and a common test (against
all Repository implementations, for example) is appropriate, and that
would require two PRs (they can be linked, though). In that case, the
effects of the test might reach further than the implementations in
rdf4j, and it might be beneficial to separate the test code out so that
third parties have a chance to review and approve it in isolation.

Thanks for your questions, Jacek.

James


