
[smila-dev] Re: [Aperture-devel] Aperture bundlization for SMILA

Hi Daniel!

good work!

In general: I proposed back in December that fine-grained packages are the way to go, and I think we should
document all these proposals and the decisions here:
http://aperture.wiki.sourceforge.net/ApertureInOSGi

So Antoni, Daniel, please read on and say what "the masterplan" is now; then someone (probably me) will update the ApertureInOSGi wiki page to show the masterplan.

answers within...

It was Daniel.Stucky@xxxxxxxxxxx who said at the right time 26.01.2009 11:08 the following words:
Hi all,

with the fixes provided by Antoni I managed to get the "bundelized"
aperture to run in Smila.
Hooray!

In Smila we should refactor our two existing aperture integration
bundles into just one and also clean up the code and implement a
ProcessingService instead of a pipelet (Aperture OSGi services are used
now, which "cries" for DS, i.e. OSGi Declarative Services)
No, I think you must not refactor these into one!
From what I know about our architecture, refactoring the bundles would cause trouble:
* Aperture is separated into interfaces and implementations (framework <> implementation). Bundling it into one would give other developers the wrong impression that Aperture is a monolithic piece of ..., whereas Aperture "really" is a perfectly OSGi-conformant framework, similar to Eclipse extension points (you would also not bundle all implementations of an extension point into the bundle defining the extension!).
* If there were different binary releases for OSGi and on SourceForge, this would cause disaster. We intentionally want to have ONE RELEASE as both the Java and the OSGi version; repackaging it for Eclipse would break this.

did I miss something? does this help?
Here is a list of all the bundles (and their License) required to run
"bundelized" aperture in Smila:

com.drew.metadata_2.4.0.jar (Public Domain)
javax.activation_1.1.1.jar (CDDL)
javax.mail_1.4.1.jar (CDDL)
jcl104-over-slf4j-1.5.0.jar (MIT)
openrdf-sesame-2.2.1-onejar-osgi.jar (BSD)
org.apache.poi_3.2.0.jar (Apache License 2.0)
org.bouncycastle.bcmail_1.32.0.jar (MIT)
org.bouncycastle.bcprovider_1.32.0.jar (MIT)
org.fontbox_0.2.0.jar (BSD)
org.htmlparser_1.6.0.jar (CPL 1.0)
org.jempbox.xmp_0.2.0.jar (BSD)
org.pdfbox_0.7.4.jar (BSD)
org.semanticdesktop.aperture.safe_1.2.0.jar (BSD)
org.semanticdesktop.aperture_1.2.0.jar (BSD)
rdf2go.api-4.7.0.jar (BSD)
rdf2go.impl.sesame22-4.7.0.jar (BSD)
slf4j-api-1.5.0.jar (MIT)
slf4j-jdk14-1.5.0.jar (MIT)
com.sun.media.jai (Sun Binary Code License Agreement) required by
PDFBox. Did not publish this bundle yet, as we can't use it in Smila.

License wise, the bundles are all EPL compatible except for
com.sun.media.jai.
Antoni is keeping an eye on PDFBox because of that.

1) bundle org.semanticdesktop.aperture.safe_1.2.0.jar imports packages
from org.pdfbox_0.7.4.jar which in turn imports packages from
com.sun.media.jai. As the latter can't be provided by Smila (because of
its license, see above), the other two bundles cannot be started if these
packages are missing! So we should separate the Extractors relying on
PDFBox from the other Extractors (putting them in their own bundle).
yep, for now this solves the issue
It seems to be a good approach in general, to provide the Extractors not
in one bundle but on a "bundle per extractor" basis.
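To make the effect described in point 1 concrete, here is a minimal Java sketch that simulates package resolution for the two packagings. It is not the real OSGi resolver (no versions, no optional imports) and it does not use the OSGi API. The class, the simplified package names and the `extractor.pdf` / `extractor.excel` bundle names are assumptions made up for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class BundleResolutionSketch {

    /** A bundle reduced to a name plus the packages it exports and imports. */
    public static final class Bundle {
        final String name;
        final Set<String> exports;
        final Set<String> imports;

        Bundle(String name, Set<String> exports, Set<String> imports) {
            this.name = name;
            this.exports = exports;
            this.imports = imports;
        }
    }

    private static Set<String> set(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    /**
     * Returns the names of all bundles whose imports can be satisfied,
     * directly or transitively, by the exports of other resolvable bundles.
     */
    public static Set<String> resolvable(List<Bundle> installed) {
        Set<String> resolved = new TreeSet<>();
        Set<String> availablePackages = new HashSet<>();
        boolean progress = true;
        while (progress) { // repeat until no further bundle can be resolved
            progress = false;
            for (Bundle b : installed) {
                if (!resolved.contains(b.name) && availablePackages.containsAll(b.imports)) {
                    resolved.add(b.name);
                    availablePackages.addAll(b.exports);
                    progress = true;
                }
            }
        }
        return resolved;
    }

    /** All safe extractors in one bundle; nobody exports com.sun.media.jai. */
    public static List<Bundle> monolithicLayout() {
        List<Bundle> bundles = new ArrayList<>();
        bundles.add(new Bundle("aperture.safe", set(),
                set("aperture.core", "org.pdfbox", "org.apache.poi")));
        bundles.add(new Bundle("pdfbox", set("org.pdfbox"), set("com.sun.media.jai")));
        bundles.add(new Bundle("poi", set("org.apache.poi"), set()));
        bundles.add(new Bundle("aperture", set("aperture.core"), set()));
        return bundles;
    }

    /** One (hypothetical) bundle per extractor; JAI is still missing. */
    public static List<Bundle> fineGrainedLayout() {
        List<Bundle> bundles = new ArrayList<>();
        bundles.add(new Bundle("extractor.pdf", set(), set("aperture.core", "org.pdfbox")));
        bundles.add(new Bundle("extractor.excel", set(), set("aperture.core", "org.apache.poi")));
        bundles.add(new Bundle("pdfbox", set("org.pdfbox"), set("com.sun.media.jai")));
        bundles.add(new Bundle("poi", set("org.apache.poi"), set()));
        bundles.add(new Bundle("aperture", set("aperture.core"), set()));
        return bundles;
    }

    public static void main(String[] args) {
        System.out.println("monolithic:   " + resolvable(monolithicLayout()));
        System.out.println("fine-grained: " + resolvable(fineGrainedLayout()));
    }
}
```

In this simplified model the single "safe" bundle is blocked together with PDFBox, so no extractor is available at all, while with one bundle per extractor only the PDF extractor is lost and the POI-based one still resolves.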

I made this masterplan back last year, where I said:


* one aperture core OSGi bundle
* one OSGi bundle for each Extractor (only for extractors that depend on "Eclipse-Friendly" 3rd party libs)
* all remaining crawlers & subcrawlers & extractors into an extra OSGi package "the rest"

Antoni, we already prepared all the fine-grained-activators for this,
so the task at hand is just to check the weird dependencies in the core OSGi bundle (lib/applewrapper, lib/aduna-commons-xml-2.0.jar) and move - one by one - the most useful extractors into individual OSGi bundles.

Once we have some core Extractors out there, we can do a release and be done.
Can we get these running quickly?
* Excel, Jpg, Office, OpenDocument, Pdf, Plaintext, Powerpoint, RTF ... + all others that depend on POI
(PDF will be a beast because we have no official release of PDFBox)
So we are halfway there - we are still missing the individual bundles for each extractor.

A proper packaging must somehow be "one bundle per extractor" because of the 3rd party libs hassle. At the moment we have "all safe extractors into one bundle" which we call "contrib", which is a bit weird, because it is NOT what we have in the aperture-contrib project, but anyway, it works (tm)

As nobody objected back then, I assume this is still the masterplan!
Daniel?

Antoni - should we change http://aperture.wiki.sourceforge.net/ApertureInOSGi to reflect what I said above?


 Even though the
Licenses of the other 3rd party bundles are OK, this does NOT mean that
the bundles will pass eclipse legal process ! One common problem is code
provenance. So if all Extractors remain in one bundle
org.semanticdesktop.aperture.safe_1.2.0.jar and just one 3rd party
bundle used by one Extractor does not pass its CQ, Aperture can't be
used in Smila until this CQ is resolved or the dependencies are removed.
Finer grained bundles will allow us to use Aperture with a subset of
available Extractors, adding additional extractors when their CQs are
completed.
Ha, in the beginning you said one bundle for the whole of Aperture,
now you follow the track of "one bundle for each extractor" ;-)
I guess we are thinking in the same direction :-)

2) bundle org.semanticdesktop.aperture_1.2.0.jar contains 2 jar files
	+ aduna-commons-xml.2.0.jar
	+ applewrapper-0.2.jar
  We need to create CQs for both jars and according to
http://aperture.wiki.sourceforge.net/Dependencies applewrapper-0.2.jar
is LGPL !? Are there any alternatives ?
this is fucked up,
but I think Antoni fixed it today.



3) do we need all those bundles for just mimetype detection and
extractors ? (e.g. sesame ?) Or could some dependencies be removed,
perhaps also by finer grained bundles ?
Aperture is a SEMANTIC framework (like the S in SMILA :-),
so we build on RDF,
which means Sesame has to stay in or nothing will work in Aperture.

Theoretically, it can be exchanged for Jena, because we are based on RDF2Go,
but you don't want to look into Jena's own private hell of ~10 MB of dependencies.

best
Leo



Bye,
Daniel



--
____________________________________________________
DI Leo Sauermann http://www.dfki.de/~sauermann Deutsches Forschungszentrum fuer Kuenstliche Intelligenz DFKI GmbH
Trippstadter Strasse 122
P.O. Box 2080           Fon:   +49 631 20575-116
D-67663 Kaiserslautern  Fax:   +49 631 20575-102
Germany                 Mail:  leo.sauermann@xxxxxxx

Geschaeftsfuehrung:
Prof.Dr.Dr.h.c.mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
____________________________________________________


