AW: [smila-dev] Re: [Aperture-devel] Aperture bundlization for SMILA

Hi all,
some more comments inline. (It's getting confusing :-) )


> -----Original Message-----
> From: smila-dev-bounces@xxxxxxxxxxx [mailto:smila-dev-bounces@xxxxxxxxxxx]
> On Behalf Of Leo Sauermann
> Sent: Monday, 26 January 2009 14:42
> To: Stucky, Daniel, M-ED
> Cc: smila-dev@xxxxxxxxxxx; aperture-devel@xxxxxxxxxxxxxxxxxxxxx
> Subject: [smila-dev] Re: [Aperture-devel] Aperture bundlization for
> SMILA
> 
> Hi Daniel!
> 
> good work!
> 
> In general: I proposed that fine grained packages are the way to go in
> December, and I think we should
> document all these proposals and the decisions here:
> http://aperture.wiki.sourceforge.net/ApertureInOSGi
> 
> so Antoni, Daniel, please read on and say what is "the masterplan" now
> and then
> someone (= probably I) will change the ApertureInOSGi wikipage to show
> the masterplan.
> 
> answers within...
> 
> It was Daniel.Stucky@xxxxxxxxxxx who said at the right time 26.01.2009
> 11:08 the following words:
> > Hi all,
> >
> > with the fixes provided by Antoni I managed to get the "bundelized"
> > aperture to run in Smila.
> >
> hooray!
> 
> > In Smila we should refactor our two existing aperture integration
> > bundles into just one and also clean up the code and implement a
> > ProcessingService instead of a pipelet (Aperture OSGi services are used
> > now which "cries" to use DS)
> >
> no, I think you must not refactor these into one!
> from what I know about our architecture, refactoring the bundles would
> cause trouble:
> * aperture is separated into interfaces and implementations (framework
> <> implementation); bundling it into one would give the wrong impression
> to other developers, who would then think that aperture is a monolithic
> piece of .... , whereas "really" Aperture is a perfectly OSGi-conformant
> framework, similar to Eclipse extension points (= you would also not
> bundle all implementations of an extension point into the bundle
> defining the extension!).
> * if there are different binary releases for OSGi and on sourceforge,
> this would cause disaster.
> we intentionally want to have ONE RELEASE as java and osgi versions,
> repackaging it for Eclipse would break this.
> 
> did I miss something? does this help?

Sorry for the confusion. This has nothing to do with the current Aperture bundles, only with the Smila bundles that use Aperture functionality (which I unfortunately called "aperture bundles"). There we have to set some things straight (e.g. reducing them to just one bundle, as the other has become obsolete, and accessing Extractors using OSGi services), because there are some legacies from our first attempt to integrate the non-OSGi Aperture jars. So this is just Smila homework: nothing to do for the Aperture team, and no refactoring of the "real" Aperture bundles here.
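
To make "accessing Extractors using OSGi services" a bit more concrete, here is a minimal sketch in plain Java. It deliberately uses no OSGi or Aperture API: the class ExtractorRegistrySketch, the interface TextExtractor and the bind/unbind method names are all hypothetical. The assumption is only that a service runtime (such as DS) would call bind when an extractor service shows up and unbind when the bundle providing it goes away.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class ExtractorRegistrySketch {

    /** Hypothetical stand-in for an extractor service; NOT the Aperture interface. */
    public interface TextExtractor {
        String mimeType();
        String extract(byte[] data);
    }

    /** Trivial example extractor: decodes the bytes as UTF-8 text. */
    public static final class PlainTextExtractor implements TextExtractor {
        @Override public String mimeType() { return "text/plain"; }
        @Override public String extract(byte[] data) {
            return new String(data, StandardCharsets.UTF_8);
        }
    }

    // Extractors that are currently available, keyed by MIME type.
    private final Map<String, TextExtractor> extractors = new ConcurrentHashMap<>();

    /** Assumed to be called by the service runtime when an extractor appears. */
    public void bind(TextExtractor extractor) {
        extractors.put(extractor.mimeType(), extractor);
    }

    /** Assumed to be called when the providing bundle is stopped. */
    public void unbind(TextExtractor extractor) {
        // Only removes the entry if it still maps to this very extractor.
        extractors.remove(extractor.mimeType(), extractor);
    }

    /** Returns the extracted text, or empty if no extractor handles the type. */
    public Optional<String> extract(String mimeType, byte[] data) {
        return Optional.ofNullable(extractors.get(mimeType)).map(e -> e.extract(data));
    }

    public static void main(String[] args) {
        ExtractorRegistrySketch registry = new ExtractorRegistrySketch();
        registry.bind(new PlainTextExtractor());
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(registry.extract("text/plain", data));
        System.out.println(registry.extract("application/pdf", data));
    }
}
```

The point of the sketch is that the consuming Smila code only ever sees whatever extractors are bound at the moment, so a missing extractor is an empty result rather than a startup failure.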


> > Here is a list of all the bundles (and their License) required to run
> > "bundelized" aperture in Smila:
> >
> > com.drew.metadata_2.4.0.jar (Public Domain)
> > javax.activation_1.1.1.jar (CDDL)
> > javax.mail_1.4.1.jar (CDDL)
> > jcl104-over-slf4j-1.5.0.jar (MIT)
> > openrdf-sesame-2.2.1-onejar-osgi.jar (BSD)
> > org.apache.poi_3.2.0.jar (Apache License 2.0)
> > org.bouncycastle.bcmail_1.32.0.jar (MIT)
> > org.bouncycastle.bcprovider_1.32.0.jar (MIT)
> > org.fontbox_0.2.0.jar (BSD)
> > org.htmlparser_1.6.0.jar (CPL 1.0)
> > org.jempbox.xmp_0.2.0.jar (BSD)
> > org.pdfbox_0.7.4.jar (BSD)
> > org.semanticdesktop.aperture.safe_1.2.0.jar (BSD)
> > org.semanticdesktop.aperture_1.2.0.jar (BSD)
> > rdf2go.api-4.7.0.jar (BSD)
> > rdf2go.impl.sesame22-4.7.0.jar (BSD)
> > slf4j-api-1.5.0.jar (MIT)
> > slf4j-jdk14-1.5.0.jar (MIT)
> > com.sun.media.jai (Sun Binary Code License Agreement) required by
> > PDFBox. Did not publish this bundle yet, as we can't use it in Smila.
> >
> > License wise, the bundles are all EPL compatible except for
> > com.sun.media.jai.
> >
> Antoni is keeping an eye on PDFBox because of that.
> 
> > 1) bundle org.semanticdesktop.aperture.safe_1.2.0.jar imports packages
> > from org.pdfbox_0.7.4.jar which in turn imports packages from
> > com.sun.media.jai. As the latter can't be provided by Smila (because of
> > LGPL) the other two bundles cannot be started if these packages are
> > missing!!! So we should separate the Extractors relying on PDFBox from
> > the other Extractors (putting them in their own bundle).
> >
> yep, for now this solves the issue
> > It seems to be a good approach in general, to provide the Extractors not
> > in one bundle but on a "bundle per extractor" basis.
> 
> I made this masterplan back last year, where I said:
> 
> >
> > * one aperture core OSGi bundle
> > * one OSGi bundle for each Extractor (only for extractors that depend
> > on "Eclipse-Friendly" 3rd party libs)
> > * all remaining  crawlers & subcrawlers & extractors into an extra
> > OSGi package "the rest"
> >
> > Antoni, we already prepared all the fine-grained-activators for this,
> > so the task at hand is just to check the weird dependencies in the
> > core OSGi bundle (lib/applewrapper,  lib/aduna-commons-xml-2.0.jar)
> > and move - one by one - the most useful extractors into individual
> > OSGi bundles.
> >
> > Once we got some core Extractors out there, we can do a release and done.
> > Can we get these running quick?
> > *  Excel, Jpg, Office, OpenDocument, Pdf, Plaintext, Powerpoint, RTF
> > ... + all others that depend on POI
> > (PDF will be a beast because we have no official release of PDFBox)
> So we are halfway there - we still miss the individual bundles for each
> extractor.
> 
> A proper packaging must somehow be "one bundle per extractor" because of
> the 3rd party libs hassle.
> At the moment we have "all safe extractors into one bundle" which we
> call "contrib", which is a bit weird, because it is NOT what we have in
> the aperture-contrib project, but anyway, it works (tm)
> 
> As nobody objected back then, I assume this is still the masterplan!
> Daniel?

From our point of view this is the way to go: "one OSGi bundle for each Extractor".
So the current state with the "safe" bundle was just an intermediate step towards the final fine-grained solution, to see whether things would work.
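
To illustrate why the fine-grained layout helps with the missing com.sun.media.jai packages, here is a minimal sketch in plain Java. It is not the OSGi resolver: the class BundleResolutionSketch and its methods are hypothetical, each bundle is reduced to one simplified package label, and versions, optional imports and everything else a real framework considers are ignored. The bundle names "extractor.office" and "extractor.pdf" are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class BundleResolutionSketch {

    /** A bundle reduced to its name and the package labels it exports and imports. */
    public static final class Bundle {
        final String name;
        final Set<String> exports;
        final Set<String> imports;

        public Bundle(String name, Set<String> exports, Set<String> imports) {
            this.name = name;
            this.exports = exports;
            this.imports = imports;
        }
    }

    private static Set<String> setOf(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    /**
     * Returns the names of the bundles whose imports can all be satisfied.
     * Starts with every bundle and repeatedly drops those with an import that
     * no remaining bundle exports, until nothing changes any more.
     */
    public static Set<String> resolvable(List<Bundle> bundles) {
        List<Bundle> remaining = new ArrayList<>(bundles);
        boolean changed = true;
        while (changed) {
            Set<String> exported = new HashSet<>();
            for (Bundle b : remaining) {
                exported.addAll(b.exports);
            }
            changed = remaining.removeIf(b -> !exported.containsAll(b.imports));
        }
        Set<String> names = new TreeSet<>();
        for (Bundle b : remaining) {
            names.add(b.name);
        }
        return names;
    }

    /** Current state: all "safe" extractors in one bundle; the JAI bundle is absent. */
    public static List<Bundle> coarseLayout() {
        return Arrays.asList(
            new Bundle("org.apache.poi", setOf("org.apache.poi"), setOf()),
            new Bundle("org.pdfbox", setOf("org.pdfbox"), setOf("com.sun.media.jai")),
            new Bundle("org.semanticdesktop.aperture.safe", setOf(),
                       setOf("org.apache.poi", "org.pdfbox")));
    }

    /** One (hypothetical) bundle per extractor; the JAI bundle is still absent. */
    public static List<Bundle> fineLayout() {
        return Arrays.asList(
            new Bundle("org.apache.poi", setOf("org.apache.poi"), setOf()),
            new Bundle("org.pdfbox", setOf("org.pdfbox"), setOf("com.sun.media.jai")),
            new Bundle("extractor.office", setOf(), setOf("org.apache.poi")),
            new Bundle("extractor.pdf", setOf(), setOf("org.pdfbox")));
    }

    public static void main(String[] args) {
        System.out.println("coarse: " + resolvable(coarseLayout()));
        System.out.println("fine:   " + resolvable(fineLayout()));
    }
}
```

In the coarse layout the single extractor bundle is dragged down together with PDFBox, while in the fine layout only the PDF extractor is lost and the POI-based one keeps working. The same reasoning applies to a 3rd party bundle that is stuck in a CQ.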


> Antoni - should we change
> http://aperture.wiki.sourceforge.net/ApertureInOSGi to reflect what I
> said above?
> 
> 
> >  Even though the
> > Licenses of the other 3rd party bundles are OK, this does NOT mean that
> > the bundles will pass eclipse legal process ! One common problem is code
> > provenance. So if all Extractors remain in one bundle
> > org.semanticdesktop.aperture.safe_1.2.0.jar and just one 3rd party
> > bundle used by one Extractor does not pass its CQ, Aperture can't be
> > used in Smila until this CQ is resolved or the dependencies are removed.
> > Finer grained bundles will allow us to use Aperture with a subset of
> > available Extractors, adding additional extractors when their CQs are
> > completed.
> >
> ha in the beginning you said one bundle for whole aperture,
> now you follow the track of "one bundle for each extractor" ;-)
> I guess we are thinking the same direction :-)

I think so. Sometimes we just use different words to express the same thing :-)


> > 2) bundle org.semanticdesktop.aperture_1.2.0.jar contains 2 jar files
> > 	+ aduna-commons-xml.2.0.jar
> > 	+ applewrapper-0.2.jar
> >   We need to create CQs for both jars and according to
> > http://aperture.wiki.sourceforge.net/Dependencies applewrapper-0.2.jar
> > is LGPL !? Are there any alternatives ?
> >
> this is fucked up,
> but I think Antoni fixed it today.

YES. So the only open 3rd party issue is PDFBox with its dependencies, which we hope will be solved soon, too.

 
> 
> > 3) do we need all those bundles for just mimetype detection and
> > extractors ? (e.g. sesame ?) Or could some dependencies be removed,
> > perhaps also by finer grained bundles ?
> >
> aperture is a SEMANTIC framework (as the S in SMILA :-),
> so we build on RDF,
> = sesame has to stay in or nothing will work in aperture.
> 
> theoretically, it can be exchanged by Jena, because we are based on RDF2go,
> but you don't want to look into their own private hell of ~10mb of
> dependencies

I know that Aperture is based on RDF; I had hoped that no RDF store would be required just for using extractors. I still think it is not actually used, and that some implementations simply ship with this jar.
On the other hand, this gives Smila a ready-to-use RDF store, which will be needed in the near future. This is nice.


Bye,
Daniel



