Re: [rdf4j-dev] Split RDF4J Project

On 31 August 2017 at 05:18, James Leigh <james.leigh@xxxxxxxxxxxx> wrote:
> On Tue, 2017-08-29 at 22:17 +0100, Jacek Grzebyta wrote:
>> On 28 August 2017 at 14:39, James Leigh <james.leigh@xxxxxxxxxxxx>
>> wrote:
>> > I propose the following setup:
>> >
>> >  * /rdf4j           rio, repository-api, and http client
>> >  * /rdf4j-storage   repository-sail, and all sail impls
>> >  * /rdf4j-tools     server, workbench, console, runtime, bom, assembly
>> >  * /rdf4j-testsuite benchmark and testsuites
>> >  * /rdf4j-doc       remains the home of documentation guides
>> >
>> I imagine the new 'units' would be treated as git submodules (see
>> https://git-scm.com/book/en/v2/Git-Tools-Submodules) rather than
>> java-level modules (Am I right?)  so I suggest "/rdf4j" unit would be
>> the main repository - that one with .gitmodules file. If so I guess
>> there should be a separate "policy of changes" for that.
>>
>
> I envision them as separate projects with separate release artifacts
> and staggered release dates. Similar to the way HttpComponents has
> HttpClient and HttpCore. The URLs would be:
>
> https://github.com/eclipse/rdf4j
> https://github.com/eclipse/rdf4j-storage
> https://github.com/eclipse/rdf4j-tools
> https://github.com/eclipse/rdf4j-testsuite
> https://github.com/eclipse/rdf4j-doc

Managing each of the components separately based on function is great
for development (not so great for testing in the current state given
the general lack of unit tests and reliance on the testsuite modules
for coverage, but that can be improved).

My only objection is to staggered release dates, which would imply
that the most current version for each component won't be the same.
A user may not be able to update to the latest client code until
storage has been released, for instance, or hypothetically there may
be more patch releases for the client code than for storage, leading
to the version numbers being out of sync.

If releases are staggered, there also needs to be discussion about
where the rdf4j-bom (containing the latest versions) would be located.
It can't be located in rdf4j (client module) anymore, as the client
would then need to be released again each time a staggered release
date for storage or tools comes up, which would imply that the storage
and tools release cycles would still be affecting rdf4j (client
module). rdf4j-bom would need its own repository, possibly co-located
with the parent pom (or have the parent pom in a separate repository).

I still see it as essential that there is a single version number
common across the entire RDF4J module set for the most current
version, leading to non-staggered release dates as a conclusion. The
HttpClient and HttpCore model is not a good model for RDF4J IMO, as
even with its small number of modules (compared to RDF4J) it has given
me issues in deciding whether a given project has
compatible/up-to-date releases of each.

RDF4J generally keeps its API more stable than HttpClient/HttpCore,
which have been known to remove APIs in patch releases. However, the
key issue is that users would need to either rely fully on rdf4j-bom
(which raises the question of where and when the bom is released) or
know each of the staggered release dates, so they can tell when they
can actually pick up new code and know it will be compatible (not just
API compatibility, but also having consistent bug fixes across the
modules).
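
For illustration, relying fully on the bom would look roughly like the
following in a user's pom.xml. This is only a sketch: the
rdf4j.version property is a placeholder the user would define
themselves, and the two dependencies are just examples of modules that
would end up in different repositories under the proposed split.

```xml
<dependencyManagement>
  <dependencies>
    <!-- Import the bom so it supplies the versions of all RDF4J modules -->
    <dependency>
      <groupId>org.eclipse.rdf4j</groupId>
      <artifactId>rdf4j-bom</artifactId>
      <!-- placeholder property, defined by the user -->
      <version>${rdf4j.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <!-- No <version> elements here: they are resolved from the bom -->
  <dependency>
    <groupId>org.eclipse.rdf4j</groupId>
    <artifactId>rdf4j-rio-turtle</artifactId>
  </dependency>
  <dependency>
    <groupId>org.eclipse.rdf4j</groupId>
    <artifactId>rdf4j-sail-memory</artifactId>
  </dependency>
</dependencies>
```

With this setup the user only ever bumps the one bom version, which is
why it matters where the bom lives and when it is released relative to
the individual components.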

Cheers,

Peter

