[news.eclipse.technology.corona] Knowledge Collection in Corona Workbench

My apologies, this is a long post. I'm looking for thoughts on Knowledge
Collection in an OSGi component-based environment. Let me know whether this
post makes sense and what problems you can foresee. If you have worked on a
Knowledge Base that is in any way similar, please take some time to read it.

- Thanks Glenn Everitt

Corona Project Knowledge
When we started thinking about Corona, we originally thought we would use
the project as the main object for collecting data. We later realized that we
want to be able to gather information about objects both larger and smaller 
than a "project". We also argued that we would want to gather information 
about ongoing operations that really shouldn't be thought of as a project 
that ultimately ends. So rather than using the term project we decided 
something more abstract would be appropriate. The term IContainer from the 
RCP context seemed close and we ended up with ICollaborationContainer. As 
you continue reading think of the CollaborationContainer as a space where 
users and components collaborate.

Corona CollaborationContainer Knowledge
A Corona CollaborationContainer gives us an identifiable context from which 
we can create a knowledge base. We want to be able to ask interesting 
questions about Corona CollaborationContainers. For example: What are the
most heavily used resources? Does my CollaborationContainer use a component
with a security flaw? We would also like to be able to ask qualitative
questions, such as which projects are the best run and who are the best
software developers working on them.

Gathering CollaborationContainer Knowledge
Corona CollaborationContainer knowledge is gathered by monitoring the events
that occur within a CollaborationContainer. These events include resource
events indicating which resources were added to or removed from a project.

Collaboration Event - ECF
The events also include collaboration events. We can gather information about
which groups of people are talking with one another. We can use information 
from the ECF Sessions to identify the people involved and then look up their 
role and work group. We can then infer whether groups such as software 
development are talking with software test.
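
As a rough sketch of that inference, assuming a session record carries only
participant ids and that some directory maps each id to a work group. Both
structures and all names below are hypothetical; nothing here is an ECF API.

```python
from itertools import combinations

# Hypothetical directory: user id -> work group. Not an ECF structure.
DIRECTORY = {
    "jim": "development",
    "ann": "test",
    "raj": "development",
}

def groups_in_contact(sessions, directory):
    """Return the set of work-group pairs that shared at least one session."""
    pairs = set()
    for participants in sessions:
        # Look up the work group of every participant we know about.
        groups = {directory[p] for p in participants if p in directory}
        # Every pair of distinct groups in one session counts as contact.
        for a, b in combinations(sorted(groups), 2):
            pairs.add((a, b))
    return pairs

# Two hypothetical sessions, each given as a list of participant ids.
sessions = [["jim", "ann"], ["jim", "raj"]]
contacts = groups_in_contact(sessions, DIRECTORY)
print(contacts)
```

A session between two members of the same group adds nothing, so only
cross-group conversations show up in the result.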

Status Events - ALF ?
What about status-update events: when is a project done, when is a task
complete? We could monitor ALF lifecycle events. This support would
require a common mechanism for identifying resources. If ALF lifecycle
events included a URI indicating which resource each lifecycle event
affects, we could determine what stage the project is at and add this
information. I need to check with the ALF Project to see whether they are
thinking about software lifecycle events the same way we are.
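
To make the idea concrete, here is a minimal sketch of deriving the current
stage per resource, assuming each lifecycle event carries a resource URI, a
stage name, and a timestamp. The field names and URIs are invented for
illustration and are not taken from ALF.

```python
# Hypothetical lifecycle events; the field names are assumptions, not ALF's schema.
PROJECT_X = "http://example.org/corona/projects/ProjectX"
PROJECT_Y = "http://example.org/corona/projects/ProjectY"

events = [
    {"resource": PROJECT_X, "stage": "design", "time": 1},
    {"resource": PROJECT_X, "stage": "test", "time": 3},
    {"resource": PROJECT_X, "stage": "development", "time": 2},
    {"resource": PROJECT_Y, "stage": "design", "time": 2},
]

def current_stage(events):
    """Map each resource URI to the stage named by its most recent event."""
    latest = {}
    # Process events in time order so later events overwrite earlier ones.
    for event in sorted(events, key=lambda e: e["time"]):
        latest[event["resource"]] = event["stage"]
    return latest

stages = current_stage(events)
print(stages)
```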

Process Events - BPEL (again ALF)
If we wanted to know whether a process within a project was complete, we
could also capture information about the status of BPEL processes. I think
BPEL processes started by the ALF Event Manager should also be able to send
ALF status events from the BPEL processes; again, verify with ALF.


Event Processing
Corona CollaborationContainer Owners should determine what types of
information they want to collect about their collaboration space. The
Collaboration Knowledge Monitor is a Corona Component that allows a
CollaborationContainer Owner to define the types of events they are
interested in. We think that event processing can use the existing OSGi
event processing. Should we define specific event topics for publish and
subscribe?
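
One way to picture topic-based publish and subscribe is the small in-memory
sketch below. It assumes slash-separated hierarchical topic names with an
optional trailing wildcard, which is a simplification and not the OSGi API
itself. The topic strings and the TopicBus class are invented for
illustration.

```python
def topic_matches(pattern, topic):
    """True if topic equals pattern, or pattern ends in '/*' and topic lies beneath it."""
    if pattern == "*":
        return True
    if pattern.endswith("/*"):
        # Keep the trailing slash so 'a/b/*' does not match 'a/bc/...'.
        return topic.startswith(pattern[:-1])
    return pattern == topic

class TopicBus:
    """Minimal in-memory bus; a stand-in for a real event service."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, pattern, handler):
        self._subscriptions.append((pattern, handler))

    def publish(self, topic, properties):
        for pattern, handler in self._subscriptions:
            if topic_matches(pattern, topic):
                handler(topic, properties)

received = []
bus = TopicBus()
# The Knowledge Monitor is only interested in resource events here.
bus.subscribe("corona/resource/*", lambda topic, props: received.append((topic, props)))
bus.publish("corona/resource/ADDED", {"resource": "build.xml"})
bus.publish("corona/collaboration/SESSION_STARTED", {"session": "s1"})
print(received)
```

Defining topics per event family would let a CollaborationContainer Owner
choose what to collect simply by choosing which patterns to subscribe to.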

Structuring CollaborationContainer Knowledge
Since we have gathered information from many different sources about the 
context of a given CollaborationContainer we need an organized way to save 
this information. We have chosen to formally define the information
structure of a CollaborationContainer. The Resource Description Framework (RDF)
and Web Ontology Language (OWL) allow us to define the structure we will use 
to hold the information gathered for our Corona CollaborationContainers.

CollaborationContainer Knowledge Structure
We are currently using a project-based ontology to define the knowledge
structure for our exemplary CollaborationContainer. See the RDF/OWL
definition of a Corona Project. This definition is from SemanticWeb.org, and
we plan to change this ontology to add extensions for the Corona
environment. The file project.rdf has some minor changes from the posted
definition so that it could be parsed by Jena 2.3 Ontology Repository
Project.

Adding CollaborationContainer Knowledge
Each monitored event type will have a Knowledge Monitor event adapter. The
Event Adapter will pull information from an event and convert it into an RDF
triple. The triple consists of a subject, a predicate, and an object. The
subject tells us what item we are talking about. The predicate tells us the
relationship the subject has to the object. Here is an example of a
triple: "Jim", "Works on", "ProjectX". "Jim" is the subject, "Works on" is
the predicate, and "ProjectX" is the object. So if we received an event
indicating that Jim was added to ProjectX, the Knowledge Monitor would create
the RDF triple and add it to the Knowledge Base. We need to investigate
the performance of this type of knowledge base.
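
A minimal sketch of such an adapter follows. The event is assumed to be a
plain dictionary with "member" and "project" fields, and the Knowledge Base
is just an in-memory set. The names are hypothetical, and plain strings stand
in for the URIs a real RDF store would use.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str

def member_added_adapter(event):
    """Convert a hypothetical 'member-added' event into a single triple."""
    return Triple(event["member"], "Works on", event["project"])

# Stand-in for the Knowledge Base; a set ignores duplicate triples.
knowledge_base = set()

event = {"type": "member-added", "member": "Jim", "project": "ProjectX"}
knowledge_base.add(member_added_adapter(event))
# Receiving the same event twice does not grow the knowledge base.
knowledge_base.add(member_added_adapter(event))
print(knowledge_base)
```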

Uniquely Identifying Project Resources
The RDF/OWL standards use a URI naming scheme. The subject and predicate of
a triple are named by URIs, and the object can be either a URI or a
literal. There already exist many URI-based vocabularies that build upon
existing vocabularies. This approach can only work because all of the
vocabulary definitions are namespaced via URIs. So, we can choose to add
more detail and expand the content held in the Corona Knowledge Base by
utilizing existing vocabulary definitions for resource types and resource
properties.
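
The sketch below shows why namespacing lets vocabularies be mixed safely. The
RDF and FOAF namespace URIs are the published ones; the Corona namespace and
its terms are made up for illustration, and the Namespace helper is a
hypothetical convenience rather than any library's class.

```python
class Namespace:
    """Builds full term URIs from a base URI, so short names cannot collide."""

    def __init__(self, base):
        self.base = base

    def term(self, name):
        return self.base + name

RDF = Namespace("http://www.w3.org/1999/02/22-rdf-syntax-ns#")
FOAF = Namespace("http://xmlns.com/foaf/0.1/")
CORONA = Namespace("http://example.org/corona#")  # hypothetical namespace

jim = CORONA.term("Jim")
triples = [
    # Reuse an existing vocabulary for the resource type and a property.
    (jim, RDF.term("type"), FOAF.term("Person")),
    (jim, FOAF.term("name"), "Jim"),  # the object here is a literal
    # Use our own vocabulary for the Corona-specific relationship.
    (jim, CORONA.term("worksOn"), CORONA.term("ProjectX")),
]
for triple in triples:
    print(triple)
```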

Asking Questions about Projects
We can ask questions about Projects through the use of SPARQL queries 
against the CollaborationContainer Knowledge Base. I believe in general 
these will be predefined queries. However, a web service SPARQL query could
be used to execute ad hoc queries.
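
To give a feel for the kind of question involved, here is a plain-Python
stand-in for a single SPARQL triple pattern. It is not a SPARQL engine, and
the namespace and data are invented. The comment shows roughly what the
equivalent query would look like.

```python
CORONA = "http://example.org/corona#"  # hypothetical namespace

kb = {
    (CORONA + "Jim", CORONA + "worksOn", CORONA + "ProjectX"),
    (CORONA + "Ann", CORONA + "worksOn", CORONA + "ProjectX"),
    (CORONA + "Raj", CORONA + "worksOn", CORONA + "ProjectY"),
}

def match(kb, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts like a SPARQL variable."""
    return sorted(
        t for t in kb
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )

# Roughly: SELECT ?who WHERE { ?who corona:worksOn corona:ProjectX }
who = [s for s, _, _ in match(kb, predicate=CORONA + "worksOn", obj=CORONA + "ProjectX")]
print(who)
```

A predefined query would fix the pattern in advance and only vary the bound
values, such as the project being asked about.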

Ontology Usage

Project Related Ontologies
Multiple ontologies will be used in the Corona environment. There will be
one ontology that describes a CollaborationContainer. This ontology will
describe very basic vocabulary and relationships regarding Tasks, Activities,
Components, Persons, etc. Closely related to the CollaborationContainer
Ontology is the Phase Ontology, which describes the time-based vocabulary
used for working with the CollaborationContainer. The Phase Ontology will
describe Lifecycle Events such as milestones and the design, development,
and test phases of the project, as well as Operational Events such as
Maintenance Windows and Shutdowns.

Corona Component Related Ontologies
The other area of ontology usage is in description of Corona/OSGi bundle 
capabilities. Each Corona Component should define its own ontology 
describing what it does and the information it produces. This ontology 
approach augments the Web Services Description Language (WSDL) definition.
The WSDL describes the interface to the component's services. The WSDL does
not indicate what the service does, or what artifact files it creates or
uses. The ontology also enhances the information available from the WSDM
management interface. The WSDM interface provides an API for retrieving
metadata about a web service. The API allows the specification of a metadata
description dialect. Our dialect will be explicitly defined via an RDF/OWL
ontology.

Ontology Extension
Our current thinking is that Corona will provide a very basic 
CollaborationContainer Ontology. This CollaborationContainer Ontology will 
be extended with more specific ontology information for a particular 
company's usage. We will provide an RDF definition indicating where we think
the basic CollaborationContainer Ontology should be extended. This property
will be named something like <ontology-extension-point>; it is just an
indication of where the ontology could be extended. In this way you can
think of the ontology extensions as Knowledge Base Extension Points. The
extension ontologies allow for customization for specific types of Projects.
[#1]more here ????


Relationships between Ontologies
When a project is created the CollaborationContainer Ontology should be
available. The ontology extensions will be based upon an enterprise's
standard project definition of activities and tasks. The next ontology added
would be the Phase Ontology, which will vary based upon the use of the
CollaborationContainer. When used for larger projects, which have more phases
and more time aggregation definitions, the Phase Ontology could be fairly
detailed.

A Project Ontology consists of the Collaboration Ontology with ontology
extension points specific to your enterprise's project process. The classes
defined in a Project Ontology will be referenced by the Phase Ontology. The
Phase Ontology will also be defined as an ontology extension to the base
Collaboration Ontology. The key to merging ontologies is careful scoping to
prevent overlap of class definitions. We anticipate the basic
CollaborationContainer Ontology being general enough to allow obvious
extension while still providing enough structure to be a useful guide to
where information should be placed.


Corona Component Ontologies
As a project gets underway, Corona Components that accomplish tasks will be
added to the Corona Workbench (need def of Corona Workbench). On the client
side, Corona Component functionality is added to a project via the Nature
extension points. Natures in the Corona environment would also provide
extension ontologies that indicate where a Corona Component's capability
fits within the Project environment. They should also indicate where the
artifacts produced by the Nature fit within the Project Ontology.

When Corona Components are added to a Project, the Corona Component Ontology
information for the Component is merged with the Project Ontology. This
process is called Corona Component Project Registration. The ontology
information allows us to infer information about resources that are added to
the Project. For example, if a Debug Nature is added to a project, we can
anticipate that the debugger will produce dump files. We can also know which
Corona Component produced the dump files and which product the dump files
are associated with. So, information about artifacts being produced and
consumed by the Corona Components can be added to the Knowledge Base.
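
A toy version of registration and the resulting inference might look like the
following, treating each ontology as a set of triples and merging by set
union. The predicate names ("produces", "providedBy") and the component names
are assumptions made up for this sketch, not a proposed vocabulary.

```python
# Hypothetical Project Ontology before any component is registered.
project_ontology = {
    ("Project", "hasArtifactType", "SourceFile"),
}

# Hypothetical Corona Component Ontology shipped with a Debug Nature.
debug_component_ontology = {
    ("DebugNature", "produces", "DumpFile"),
    ("DebugNature", "providedBy", "DebuggerComponent"),
}

def register_component(project_ontology, component_ontology):
    """Corona Component Project Registration, modeled as a set union."""
    return project_ontology | component_ontology

def producers_of(ontology, artifact_type):
    """Infer which components can produce a given artifact type."""
    natures = {s for s, p, o in ontology if p == "produces" and o == artifact_type}
    return sorted(o for s, p, o in ontology if p == "providedBy" and s in natures)

merged = register_component(project_ontology, debug_component_ontology)
print(producers_of(merged, "DumpFile"))
```

Before registration the Project Ontology knows nothing about dump files; after
the merge, a dump file appearing in the project can be traced back to the
component that produces that artifact type.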

When a Corona Component is used to accomplish a task in a project the Corona 
Component Ontology could define the phase (as defined by the Project Phase 
Ontology) when the Corona Components will be used.


Web Service Metadata
A Corona Component can expose web services and provide a WSDM management
interface. The WSDM management interface includes an API for retrieving
metadata about a web service. This API allows specification of a metadata
dialect. Corona Components will support three dialects:
  1. wsdl
  2. ontology
  3. semantic
The WSDL metadata dialect will provide a basic definition of the parameters
used in the web service call. It will also return the WSDM-defined component
capabilities.

The ontology dialect will return the Corona Component Ontology definition. 
The ontology as previously mentioned should be defined as an extension to 
the Project Ontology.

The semantic dialect will return information about how the Corona Component
is used, i.e. the context in which it can be used, the type of data it
produces, the type of data it consumes, and the artifacts that are produced.
Current thinking is that the Project Knowledge can be queried to retrieve
information about the context in which the component is being used. [2] [#2]
need more specifics


Event Information Harvesting
As a Project progresses within the Corona Workbench, Corona Components
generate artifacts and events indicating state changes to the project, such
as tasks completed, resources added or removed, and test success or failure.
These events, called Corona Collaboration Events, are published to a topic.
The Corona Knowledge Monitor subscribes to these topics and processes the
events. Information from an event may be ignored or enhanced from other
sources.

Non-Corona Events
Non-Corona events can be monitored by writing new Knowledge Monitors that 
listen for other event types. The events will still be converted into 
RDF/OWL triples and written into the Knowledge Base.

Decouple Event System from the Project Ontologies and Phase Ontologies
Events need a minimal amount of information to do their job. That job is to
deliver information from one component to another. So, they need a way to
identify:
  a. sender
  b. receiver
  c. event type (what happened)
  d. event properties (information about what happened)
  e. object identification (which objects were involved in the event)
By partitioning the information handled by the event into two classes we can
reduce the impact of changing Ontologies on the event system.
We can look to JMS as a model for this approach: it classifies information as
envelope information and content information. The information in the content
is not visible to the event system, whereas the envelope information is
visible. The event system uses the event envelope information to route
messages to destinations. Only the event creator (publisher) and the event
consumer care about the contents of the event (the data payload).
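
The split could be sketched as below. Which fields belong in the envelope is
an assumption on my part: here the sender, receiver, event type, and object
URI are visible to the event system, and the event properties travel as an
opaque payload. The class and function names are hypothetical.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Envelope:
    """Fields the event system may read in order to route the event."""
    sender: str
    receiver: str
    event_type: str
    object_uri: str

@dataclass(frozen=True)
class Event:
    envelope: Envelope
    payload: bytes  # opaque to the event system

def make_event(sender, receiver, event_type, object_uri, properties):
    """Publisher side: serialize the properties into the opaque payload."""
    payload = json.dumps(properties).encode("utf-8")
    return Event(Envelope(sender, receiver, event_type, object_uri), payload)

def route(event, inboxes):
    """Event system: deliver using envelope fields only, never decoding the payload."""
    inboxes.setdefault(event.envelope.receiver, []).append(event)

inboxes = {}
event = make_event(
    "debug-component",
    "knowledge-monitor",
    "artifact-created",
    "http://example.org/corona/artifacts/dump-1",
    {"sizeBytes": 2048},
)
route(event, inboxes)

# Consumer side: only the receiver decodes the payload.
delivered = inboxes["knowledge-monitor"][0]
properties = json.loads(delivered.payload.decode("utf-8"))
print(delivered.envelope.event_type, properties)
```

With this shape, a change to the Project or Phase Ontology only affects what
the publisher and consumer put in the payload; the routing code is untouched.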


Corona Identification
All objects in the Corona Workbench are identified by a URI. This approach
allows components to be uniquely identified, and it works well with Web
Services, WSDM, and RDF/OWL. However, there are identification schemes that
would require mapping from a non-URI naming domain to the URI naming domain.
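
One possible mapping, sketched under the assumption that a non-URI identifier
can be described by a naming domain plus a local id, percent-encodes both
parts under a fixed base URI. The base URI and the "cvs" domain are invented
for the example.

```python
from urllib.parse import quote, unquote

BASE = "http://example.org/corona/id/"  # hypothetical base URI

def to_uri(domain, local_id):
    """Map an identifier from a non-URI naming domain into the URI domain."""
    # safe='' also encodes '/', so the local id stays a single path segment.
    return f"{BASE}{quote(domain, safe='')}/{quote(local_id, safe='')}"

def from_uri(uri):
    """Recover the original (domain, local_id) pair from a mapped URI."""
    domain, local_id = uri[len(BASE):].split("/", 1)
    return unquote(domain), unquote(local_id)

uri = to_uri("cvs", "src/Main.java rev 1.4")
print(uri)
print(from_uri(uri))
```

Because the encoding is reversible, the Knowledge Base can store the URI while
a component that only understands the original scheme can still get its own
identifier back.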

Items requiring mapping from the Ontology to the Event System
  a. Object identity
  b. Object type






