[smila-dev] Re: [Aperture-devel] Aperture bundlization for SMILA
Hi Daniel!
good work!
In general: I proposed that fine grained packages are the way to go in
December, and I think we should
document all these proposals and the decisions here:
http://aperture.wiki.sourceforge.net/ApertureInOSGi
so Antoni, Daniel, please read on and say what is "the masterplan" now
and then
someone (= probably I) will change the ApertureInOSGi wikipage to show
the masterplan.
answers within...
It was Daniel.Stucky@xxxxxxxxxxx who said at the right time 26.01.2009
11:08 the following words:
Hi all,
with the fixes provided by Antoni I managed to get the "bundelized"
aperture to run in Smila.
hooray!
In Smila we should refactor our two existing aperture integration
bundles into just one and also clean up the code and implement a
ProcessingService instead of a pipelet (Aperture OSGi services are used
now which "cries" to use DS)
no, I think you must not refactor these into one!
from what I know about our architecture, refactoring the bundles would
cause trouble:
* aperture is separated into interfaces and implementations (framework
<> implementation), bundling it into one would give the wrong impression
to other developers who would then think that aperture is a monolithic
piece of .... . whereas "really" Aperture is a perfectly osgi conformant
framework, similar to eclipse extension points. (=you would also not
bundle all implementations of an extension point into the bundle
defining the extension!)
* if there are different binary releases for OSGi and on sourceforge,
this would cause disaster.
we intentionally want to have ONE RELEASE as java and osgi versions,
repackaging it for Eclipse would break this.
did I miss something? does this help?
Here is a list of all the bundles (and their License) required to run
"bundelized" aperture in Smila:
com.drew.metadata_2.4.0.jar (Public Domain)
javax.activation_1.1.1.jar (CDDL)
javax.mail_1.4.1.jar (CDDL)
jcl104-over-slf4j-1.5.0.jar (MIT)
openrdf-sesame-2.2.1-onejar-osgi.jar (BSD)
org.apache.poi_3.2.0.jar (Apache License 2.0)
org.bouncycastle.bcmail_1.32.0.jar (MIT)
org.bouncycastle.bcprovider_1.32.0.jar (MIT)
org.fontbox_0.2.0.jar (BSD)
org.htmlparser_1.6.0.jar (CPL 1.0)
org.jempbox.xmp_0.2.0.jar (BSD)
org.pdfbox_0.7.4.jar (BSD)
org.semanticdesktop.aperture.safe_1.2.0.jar (BSD)
org.semanticdesktop.aperture_1.2.0.jar (BSD)
rdf2go.api-4.7.0.jar (BSD)
rdf2go.impl.sesame22-4.7.0.jar (BSD)
slf4j-api-1.5.0.jar (MIT)
slf4j-jdk14-1.5.0.jar (MIT)
com.sun.media.jai (Sun Binary Code License Agreement) required by
PDFBox. Did not publish this bundle yet, as we can't use it in Smila.
License wise, the bundles are all EPL compatible except for
com.sun.media.jai.
Antoni is keeping a lookout on PDFBox because of that.
1) bundle org.semanticdesktop.aperture.safe_1.2.0.jar imports packages
from org.pdfbox_0.7.4.jar which in turn imports packages from
com.sun.media.jai. As the latter can't be provided by Smila (because of
LGPL) the other two bundles cannot be started if these packages are
missing!!! So we should separate the Extractors relying on PDFBox from
the other Extractors (putting them in their own bundle).
yep, for now this solves the issue
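To make the effect of the split concrete, here is a toy model of the resolution problem described in point 1. It is only a sketch: real OSGi resolution works on package imports and exports with version ranges, while this model uses plain bundle-to-bundle edges. The edges and the two per-extractor bundle names (`aperture.extractor.pdf`, `aperture.extractor.office`) are assumptions made up for illustration, not read from the real manifests.

```python
def resolvable_bundles(installed, requires):
    """Return the installed bundles whose requirements are all met transitively."""
    ok = set(installed)
    changed = True
    while changed:
        changed = False
        # Iterate over a sorted copy so we can drop entries from `ok` safely.
        for bundle in sorted(ok):
            if any(dep not in ok for dep in requires.get(bundle, [])):
                ok.discard(bundle)
                changed = True
    return ok


# Current layout (edges are simplified assumptions): one "safe" bundle
# holds all extractors and therefore needs PDFBox, which needs JAI.
ONE_BUNDLE = {
    "org.semanticdesktop.aperture": [],
    "org.semanticdesktop.aperture.safe": [
        "org.semanticdesktop.aperture", "org.pdfbox", "org.apache.poi"],
    "org.pdfbox": ["com.sun.media.jai"],
    "org.apache.poi": [],
}

# Proposed layout with hypothetical per-extractor bundles.
PER_EXTRACTOR = {
    "org.semanticdesktop.aperture": [],
    "aperture.extractor.pdf": ["org.semanticdesktop.aperture", "org.pdfbox"],
    "aperture.extractor.office": ["org.semanticdesktop.aperture", "org.apache.poi"],
    "org.pdfbox": ["com.sun.media.jai"],
    "org.apache.poi": [],
}

# com.sun.media.jai is never installed, because Smila cannot ship it.
print(sorted(resolvable_bundles(ONE_BUNDLE.keys(), ONE_BUNDLE)))
print(sorted(resolvable_bundles(PER_EXTRACTOR.keys(), PER_EXTRACTOR)))
```

In the first layout the missing JAI bundle takes down PDFBox and with it every extractor; in the second only the PDF extractor is lost and the office extractor still resolves.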
It seems to be a good approach in general, to provide the Extractors not
in one bundle but on a "bundle per extractor" basis.
I made this masterplan back last year, where I said:
* one aperture core OSGi bundle
* one OSGi bundle for each Extractor (only for extractors that depend
on "Eclipse-Friendly" 3rd party libs)
* all remaining crawlers & subcrawlers & extractors into an extra
OSGi package "the rest"
Antoni, we already prepared all the fine-grained-activators for this,
so the task at hand is just to check the weird dependencies in the
core OSGi bundle (lib/applewrapper, lib/aduna-commons-xml-2.0.jar)
and move - one by one - the most useful extractors into individual
OSGi bundles.
Once we got some core Extractors out there, we can do a release and done.
Can we get these running quick?
* Excel, Jpg, Office, OpenDocument, Pdf, Plaintext, Powerpoint, RTF
... + all others that depend on POI
(PDF will be a beast because we have no official release of PDFBox)
So we are halfway there - we still miss the individual bundles for each
extractor.
A proper packaging must somehow be "one bundle per extractor" because of
the 3rd party libs hassle.
At the moment we have "all safe extractors into one bundle" which we
call "contrib", which is a bit weird, because it is NOT what we have in
the aperture-contrib project, but anyway, it works (tm)
As nobody objected back then, I assume this is still the masterplan!
Daniel?
Antoni - should we change
http://aperture.wiki.sourceforge.net/ApertureInOSGi to reflect what I
said above?
Even though the
Licenses of the other 3rd party bundles are OK, this does NOT mean that
the bundles will pass eclipse legal process ! One common problem is code
provenance. So if all Extractors remain in one bundle
org.semanticdesktop.aperture.safe_1.2.0.jar and just one 3rd party
bundle used by one Extractor does not pass its CQ, Aperture can't be
used in Smila until this CQ is resolved or the dependencies are removed.
Finer grained bundles will allow us to use Aperture with a subset of
available Extractors, adding additional extractors when their CQs are
completed.
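As a rough illustration of that argument, the sketch below splits extractors into "usable now" and "blocked by a pending CQ". The extractor-to-library mapping and the set of approved CQs are hypothetical values chosen for the example; they do not come from the real Aperture manifests or from Eclipse IP records.

```python
# Hypothetical mapping of extractors to the third-party bundles they need.
EXTRACTOR_LIBS = {
    "Plaintext": [],
    "Excel": ["org.apache.poi"],
    "Powerpoint": ["org.apache.poi"],
    "Pdf": ["org.pdfbox", "org.fontbox"],
    "Jpg": ["com.drew.metadata"],
}


def split_by_cq(extractor_libs, approved):
    """Return (usable, blocked); blocked maps an extractor to its unapproved libs."""
    usable, blocked = [], {}
    for name, libs in extractor_libs.items():
        pending = [lib for lib in libs if lib not in approved]
        if pending:
            blocked[name] = pending
        else:
            usable.append(name)
    return usable, blocked


# Assumed CQ state: only POI and the metadata extractor library are approved.
usable, blocked = split_by_cq(
    EXTRACTOR_LIBS, approved={"org.apache.poi", "com.drew.metadata"})
print("usable:", usable)
print("blocked:", blocked)
```

With one bundle per extractor, everything in `usable` can ship right away, and each entry in `blocked` can follow once its listed CQs are through.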
ha in the beginning you said one bundle for whole aperture,
now you follow the track of "one bundle for each extractor" ;-)
I guess we are thinking the same direction :-)
2) bundle org.semanticdesktop.aperture_1.2.0.jar contains 2 jar files
+ aduna-commons-xml.2.0.jar
+ applewrapper-0.2.jar
We need to create CQs for both jars and according to
http://aperture.wiki.sourceforge.net/Dependencies applewrapper-0.2.jar
is LGPL !? Are there any alternatives ?
this is fucked up,
but I think Antoni fixed it today.
3) do we need all those bundles for just mimetype detection and
extractors ? (e.g. sesame ?) Or could some dependencies be removed,
perhaps also by finer grained bundles ?
aperture is a SEMANTIC framework (as the S in SMILA :-),
so we build on RDF,
= sesame has to stay in or nothing will work in aperture.
theoretically, it can be exchanged by Jena, because we are based on RDF2go,
but you don't want to look into their own private hell of ~10mb of
dependencies
best
Leo
Bye,
Daniel
--
____________________________________________________
DI Leo Sauermann http://www.dfki.de/~sauermann
Deutsches Forschungszentrum fuer
Kuenstliche Intelligenz DFKI GmbH
Trippstadter Strasse 122
P.O. Box 2080 Fon: +49 631 20575-116
D-67663 Kaiserslautern Fax: +49 631 20575-102
Germany Mail: leo.sauermann@xxxxxxx
Geschaeftsfuehrung:
Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
____________________________________________________