Re: [hyades-dev] Data Collection protocol -- performance concerns and suggestions

I fear there may be a can of worms about to open here, or perhaps a can of flames.  Hopefully all the flames are contained within an M8-derived "sandbox" at the moment so they won't cause too many problems.

Antony's explanation to me of the way the choreography stuff can be made to work (and my understanding of how WSDL came out of CCS) is that you should think of it as one program rather than a bunch of agents.  In a program you always know the datatypes being passed across any interface; you don't parse the stack as you pass it into and out of an API call.  The BPEL is the same; it's just that the API calls happen to be distributed.

This means there shouldn't be any parsing cost in passing the data stream into and out of the choreography, because the WSDL/XSD says what is in the stream.  If the choreography engine passes the data from one external API (e.g. an agent) to another (e.g. a loader) and doesn't read or change the data, it won't even "open the packets"; it just sends them across the wire.  You can treat them like the parameters of a Java RMI call: you don't parse them in the transport layer.  Even if the choreography has to pull some data out of the XML using XPath, it's just a lookup.  However, where it is driving native interfaces (without a SOAP binding), the choreography engine doesn't need to go through XML at all.  It's a very direct implementation of the BPEL language.  It just pulls the native types from the agent through the binding, which recasts them into the XSD subset of Java types that the BPEL engine uses internally, where it can play with them.  When it needs to communicate them, it drops them into a virtual shared memory layer, which deals with them as chunks of memory with "dirty" bits.  In a pairwise communication, all that does is send them across the wire.  You can also do explicit port communication in the BPEL language without any requirement to parse XML.

I think it would be extremely valuable if Antony were to explain this on the data collection call.  He will doubtless correct any mistakes in the above.

The astute amongst you will have noticed that the choreography engine can make a connection between the agent and the loader without the RAC in place. However, you probably wouldn't do that.  If the model contains XSD types, and the engine transfers the XSD subset of Java types, why do you need to go into and out of XML in the loader?

Kaylor, Andrew wrote:

I have concerns about introducing too much XML parsing into the DCE protocol.  I'd like to keep things as streamlined as possible so that the DCE will remain fast and light, while at the same time adding flexibility and extensibility.  On the other hand, I recognize the need of some components to have WSDL support.

 

At the base level implementation, rather than require components to completely describe their supported command sets, I would like to see us use something based on defined interfaces.  In this case, the client would be able to determine at run-time whether or not a component supports a given interface.  Once the client and a component agreed on an interface, the client would send commands (and get responses) whose format depended on the commonly-known definition of that interface.

 

I would imagine that communication between agents would be substantially similar to communications between the client and the DCE.

 

For performance reasons, I would like to see common interfaces be very simple, requiring minimal processing.  All messages would need to have a common stub to be used for routing by the DCE.  Beyond that, the details of the interface could be customized to meet the specific needs involved.  If the client is more performance oriented, the interface would be in a binary format.  Is it possible to meet the needs of WSDL-seeking clients by layering something on top of a simple binary interface?

 

Also for performance reasons, I would like there to be some way for the client and the server to negotiate communication format issues such as ASCII/EBCDIC and big-endian/little-endian so that the necessary reformatting can be minimized.
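As a rough illustration of the byte-order half of that negotiation, here is a minimal Java sketch. The class name `FormatNegotiation` and the tie-break rule (the server's order wins when the two sides differ) are assumptions made for the example, not part of the proposal; character-set negotiation is left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: agrees on one byte order so only one side ever reformats. */
final class FormatNegotiation {
    /** Assumed rule: matching orders are kept; otherwise the server's order is used. */
    static ByteOrder agree(ByteOrder client, ByteOrder server) {
        return client.equals(server) ? client : server;
    }

    /** Encodes one int in the agreed order; ByteBuffer does any swapping needed. */
    static byte[] encodeInt(int value, ByteOrder agreed) {
        return ByteBuffer.allocate(Integer.BYTES).order(agreed).putInt(value).array();
    }
}
```

When both sides report the same native order, neither has to swap bytes, which is the case the negotiation is meant to optimize.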

 

What I'm proposing is quite similar to the existing DCE protocol, but it would be extensible (via new interfaces).  This similarity to the current protocol would allow much of the existing client code to be reused.

 

I was picturing a command format something like this:

 

Header ID -- (identifies the format of the header, in case we want to change it in the future)

Target -- (a unique identifier that the DCE would use to route the message)

Target Host -- (allows for communication between agents of different systems)

Source -- (will later become the target if the message requires a response)

Source Host -- (again, communication across peer DCEs)

Interface ID -- (tells the object receiving the message what the command ID means)

Command ID -- (identifies the command to be invoked)

Context -- (as currently, used to correlate commands with their responses)

Data Size -- (tells the DCE how big the command-specific data block will be)

Data -- (command-specific data)

 

This is just a rough sketch and is probably missing a lot of things.
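To make the field list concrete, the following Java sketch packs and unpacks such a message. It assumes every header field is a 32-bit integer and that the data block follows the header directly; neither the field widths nor the class name `DceMessage` come from the proposal.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical fixed-width encoding of the command format listed above. */
final class DceMessage {
    /** Nine int fields (Header ID through Data Size) precede the data block. */
    static final int HEADER_BYTES = 9 * Integer.BYTES;

    final int headerId, target, targetHost, source, sourceHost;
    final int interfaceId, commandId, context;
    final byte[] data;

    DceMessage(int headerId, int target, int targetHost, int source, int sourceHost,
               int interfaceId, int commandId, int context, byte[] data) {
        this.headerId = headerId;
        this.target = target;
        this.targetHost = targetHost;
        this.source = source;
        this.sourceHost = sourceHost;
        this.interfaceId = interfaceId;
        this.commandId = commandId;
        this.context = context;
        this.data = data;
    }

    /** Writes the header fields in list order, then Data Size, then the data block. */
    byte[] encode(ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + data.length).order(order);
        buf.putInt(headerId).putInt(target).putInt(targetHost);
        buf.putInt(source).putInt(sourceHost);
        buf.putInt(interfaceId).putInt(commandId).putInt(context);
        buf.putInt(data.length).put(data);
        return buf.array();
    }

    /** Reads the fields back in the same order; Data Size tells us how much to copy. */
    static DceMessage decode(byte[] wire, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(wire).order(order);
        int headerId = buf.getInt();
        int target = buf.getInt();
        int targetHost = buf.getInt();
        int source = buf.getInt();
        int sourceHost = buf.getInt();
        int interfaceId = buf.getInt();
        int commandId = buf.getInt();
        int context = buf.getInt();
        byte[] data = new byte[buf.getInt()];
        buf.get(data);
        return new DceMessage(headerId, target, targetHost, source, sourceHost,
                interfaceId, commandId, context, data);
    }
}
```

Because the routing fields sit at fixed offsets, a router could read Target and Data Size without touching the command-specific data at all.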

 

The interface ID here needs to uniquely identify the interface at run time.  I was thinking we could get unique IDs using something like the Java class naming scheme (e.g. "org.eclipse.hyades.dce.agent"), but because we wouldn't want to have to do a bunch of string compares for every message, I thought maybe the DCE could provide a service to translate a string ID into a run-specific unique integer (kind of like Windows' RegisterWindowMessage function).
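A minimal Java sketch of such a translation service is shown here. The class name `InterfaceRegistry` and the method `register` are hypothetical; the only behaviour taken from the description is that the same string always yields the same integer within one run, and different strings yield different integers.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical DCE service mapping interface names to run-specific integer IDs. */
final class InterfaceRegistry {
    private final ConcurrentHashMap<String, Integer> ids = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger(1);

    /** Returns the ID for this name, allocating a fresh one on first use. */
    int register(String name) {
        return ids.computeIfAbsent(name, n -> next.getAndIncrement());
    }
}
```

A component would call `register("org.eclipse.hyades.dce.agent")` once at start-up and compare integers on every message afterwards. The IDs are only meaningful for the lifetime of one registry, so they should not be persisted.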

 

The command ID would be an interface-specific constant.  The object receiving the message would use the interface ID and the command ID to determine what the contents of the data block would be.  For performance critical interfaces, the data would map to a structure.  In other cases, it could be an XML block.
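The lookup described here could be sketched in Java as a table keyed by the (interface ID, command ID) pair. The names `CommandDispatcher`, `on` and `dispatch` are hypothetical, and handlers return a `String` only to keep the example small.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical receiver-side table: (interface ID, command ID) selects the data decoder. */
final class CommandDispatcher {
    private final Map<Long, Function<byte[], String>> handlers = new HashMap<>();

    /** Packs both IDs into one long so a single map lookup resolves the handler. */
    private static long key(int interfaceId, int commandId) {
        return ((long) interfaceId << 32) | (commandId & 0xFFFFFFFFL);
    }

    void on(int interfaceId, int commandId, Function<byte[], String> handler) {
        handlers.put(key(interfaceId, commandId), handler);
    }

    String dispatch(int interfaceId, int commandId, byte[] data) {
        Function<byte[], String> handler = handlers.get(key(interfaceId, commandId));
        if (handler == null) {
            throw new IllegalArgumentException("unsupported interface/command pair");
        }
        return handler.apply(data);
    }
}
```

A performance-critical interface would register a handler that reads the data block as a fixed structure (for example via `ByteBuffer.wrap(data).getInt()`), while another interface could register one that treats the same bytes as XML text. The same command ID can mean different things under different interface IDs.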

 

I would imagine that the mechanism of the header block could be made completely transparent to Java agents, which could receive an XML data block as if it were the entire content of the message.


