We definitely need to account for large
volumes of data, but my question is whether we can use only the data channel
for these large blocks and assume that the control channel will only be used
for small blocks. I thought I understood Harm to be saying that there was
some need for large amounts of data on the control channel, but maybe I
misunderstood.
I believe the issue with encoding the data
into a standard XML stream is the time it would take to perform the decoding
(which is probably a serious issue for the control channel). On the other
hand, the problem with not encoding the data is that non-encoded streams would
be rejected by IT filters in the firewall scenario.
Anyway, I feel like this is heading down a
rabbit trail.
The real question I want to get answered
is, should we pursue XML as our primary format for the control channel?
I’d appreciate it if everyone with an
interest in this could drop a quick ‘yes’ or ‘no’ (just
so I’m clear on where we are) and also voice any particular concerns, if
applicable. We can address those concerns as necessary on this
list.
-Andy
-----Original Message-----
From: hyades-dev-admin@xxxxxxxxxxx [mailto:hyades-dev-admin@xxxxxxxxxxx] On Behalf Of Victor Havin
Sent: Monday, September 20, 2004 6:30 PM
To: hyades-dev@xxxxxxxxxxx
Cc: hyades-dev@xxxxxxxxxxx; hyades-dev-admin@xxxxxxxxxxx
Subject: Re: [hyades-dev] XML-based command data
I think we all agree that scenarios involving large
volumes of data should be properly addressed.
I am not in a position to make recommendations, as I have only limited experience
with XML and communication protocols in general. I just want to point out that
XML is not entirely foreign to binary data transfer. For example, the SMIL
standard that was introduced some time ago (appendix 1) deals explicitly with
transmission of large volumes of binary data over the Internet. Another example
is a MIME multipart stream, which can carry binary data along with XML parts
referencing it (appendix 2). XML itself is agile enough to carry any kind of
content, including unparsed data (appendix 3). If the concern here is data
inflation caused by binary stream encoding (e.g. base64), then we should address
it explicitly. Given that we are talking about protocol-independent data streams, we
have to deal with encoding anyway. We can probably shift this responsibility to
payload normalizers, envelopes and other internal facilities. If we are ready to
invest heavily in the internal infrastructure because of the scalability
requirements, then we can take this approach.
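To put a rough number on the base64 inflation mentioned above, here is a minimal Python sketch. The helper name `base64_size` is made up for illustration, and the payload is a zero-filled buffer of the same size as the pixel block in appendix 2, not real data.

```python
import base64


def base64_size(raw_len: int) -> int:
    """Size in bytes of the base64 encoding of raw_len bytes (no line breaks)."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((raw_len + 2) // 3)


# Zero-filled stand-in for the 524288-byte block used in appendix 2.
payload = bytes(524288)
encoded = base64.b64encode(payload)

# The closed-form size matches what the standard library produces.
assert len(encoded) == base64_size(len(payload))

inflation = len(encoded) / len(payload)
print(f"{len(payload)} raw bytes -> {len(encoded)} encoded bytes (x{inflation:.3f})")
```

The encoded form is about one third larger than the raw data, before counting any XML markup around it or the time spent encoding and decoding.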
In general I see at least three possible outcomes here:
1. We follow the XML trail and hope it takes care of most data marshalling
issues.
2. We use XML as the main command delivery standard, but leave the door open for
occasionally sending a raw binary blob. We do realize that this blob requires
special parsing on every potential target platform. This blob should be
properly marked and carry an envelope with a detailed description of data origin
and format.
3. We always use a proprietary binary format and just deal with it on all
target platforms.
The first approach promises maximum flexibility and probably the heaviest overhead
in simple 'peer-to-peer over socket' cases. The last one can be highly optimized
at the expense of our long work hours. The second is in between, and I would
personally vote for it.
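As a rough illustration of the second outcome, the Python sketch below frames an XML command header followed by an unencoded blob. Everything in it is an assumption made for the example: the 4-byte length prefix, the `command` and `blob` element names, and the `origin`, `format` and `length` attributes are hypothetical and not part of any HCE proposal.

```python
import struct
import xml.etree.ElementTree as ET


def pack_command(name: str, blob: bytes, origin: str, fmt: str) -> bytes:
    """Frame = 4-byte big-endian header length + XML header + raw blob."""
    cmd = ET.Element("command", {"name": name})
    # The XML "envelope" describes where the blob came from and how to read it.
    ET.SubElement(cmd, "blob", {
        "origin": origin,
        "format": fmt,
        "length": str(len(blob)),
    })
    header = ET.tostring(cmd, encoding="utf-8")
    return struct.pack(">I", len(header)) + header + blob


def unpack_command(frame: bytes):
    """Return (command name, blob description, raw blob bytes)."""
    (header_len,) = struct.unpack_from(">I", frame, 0)
    cmd = ET.fromstring(frame[4:4 + header_len])
    info = cmd.find("blob")
    blob_len = int(info.get("length"))
    start = 4 + header_len
    blob = frame[start:start + blob_len]
    if len(blob) != blob_len:
        raise ValueError("frame is shorter than the declared blob length")
    return cmd.get("name"), dict(info.attrib), blob


frame = pack_command("setPixels", b"\x00\x01\xff", "agent-1", "raw16le")
name, info, blob = unpack_command(frame)
print(name, info, blob)
```

The command itself stays readable as XML, while the blob travels without inflation. The cost is the one already noted in the list: every target platform needs code that understands the framing.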
--Victor
Appendix 1. SMIL
URL: http://www.w3.org/TR/REC-smil/
Appendix 2. Multipart MIME document.
RFC2387: http://www.faqs.org/rfcs/rfc2387.html
Example:
Content-Type: multipart/related; boundary=xxxxxxxxxx
--xxxxxxxxxx
Content-Type: text/xml
Content-ID: Contents
<?xml version="1.0" ?>
<objectDef uid="?">
<property><name>Width</name>
<value><i4>1024</i4></value>
</property>
<property><name>Height</name>
<value><i4>1024</i4></value>
</property>
<property><name>BitCount</name>
<value><i4>16</i4></value>
</property>
<property><name>Pixels</name>
<value><stream href="" /></value>
</property>
</objectDef>
--xxxxxxxxxx
Content-Type: application/binary
Content-Transfer-Encoding: Little-Endian
Content-ID: Pixels
Content-Length: 524288
....binary data here...
--xxxxxxxxxx--
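The multipart layout in appendix 2 can be sketched in Python as below. This is a hand-rolled illustration under stated assumptions, not a complete MIME implementation: the function names are hypothetical, the `cid:Pixels` reference is my guess at how the XML part would point at the binary part, and the reader relies on the non-standard Content-Length header shown in the example. A real sender would also have to pick a boundary that cannot occur inside the binary data.

```python
CRLF = b"\r\n"


def build_related(xml_text: str, blob: bytes, boundary: str = "xxxxxxxxxx") -> bytes:
    """Serialize an XML part plus an unencoded binary part as multipart/related."""
    b = boundary.encode("ascii")
    delim = b"--" + b
    lines = [
        b'Content-Type: multipart/related; type="text/xml"; boundary="' + b + b'"',
        b"",
        delim,
        b"Content-Type: text/xml; charset=utf-8",
        b"Content-ID: <Contents>",
        b"",
        xml_text.encode("utf-8"),
        delim,
        b"Content-Type: application/octet-stream",
        b"Content-Transfer-Encoding: binary",
        b"Content-ID: <Pixels>",
        b"Content-Length: " + str(len(blob)).encode("ascii"),
        b"",
        blob,
        delim + b"--",  # closing delimiter carries a trailing "--"
        b"",
    ]
    return CRLF.join(lines)


def extract_binary(message: bytes, content_id: str) -> bytes:
    """Read a binary part by Content-ID, trusting its Content-Length header.

    Assumes Content-Length follows Content-ID in the same header block,
    which is how build_related writes it.
    """
    marker = b"Content-ID: <" + content_id.encode("ascii") + b">" + CRLF
    head_start = message.index(marker)
    body_start = message.index(CRLF + CRLF, head_start) + 4
    length = None
    for line in message[head_start:body_start].decode("ascii").split("\r\n"):
        if line.lower().startswith("content-length:"):
            length = int(line.split(":", 1)[1])
    if length is None:
        raise ValueError("part has no Content-Length header")
    return message[body_start:body_start + length]


xml_part = ('<objectDef uid="?"><property><name>Pixels</name>'
            '<value><stream href="cid:Pixels" /></value></property></objectDef>')
pixels = bytes(range(256))
message = build_related(xml_part, pixels)
assert extract_binary(message, "Pixels") == pixels
```

Because the binary part is read by length rather than by scanning for the boundary, the pixel data never needs to be escaped or base64-encoded.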
Appendix 3. Example of binary substream in HCE message
<hce:message>
...
<hce:data hce:encoding="binary.base64">Z53815Zb82b</hce:data>
...
</hce:message>
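For the receiving side of appendix 3, a minimal Python sketch of pulling the binary substream back out might look like this. The namespace URI is an assumption (the example above does not declare one for the hce prefix), and `extract_binary` is a hypothetical helper name.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the example above leaves the hce prefix undeclared.
HCE_NS = "urn:example:hce"


def extract_binary(message_xml: str) -> bytes:
    """Decode the base64 content of the first hce:data child of hce:message."""
    root = ET.fromstring(message_xml)
    data = root.find(f"{{{HCE_NS}}}data")
    if data is None:
        raise ValueError("message has no hce:data element")
    encoding = data.get(f"{{{HCE_NS}}}encoding")
    if encoding != "binary.base64":
        raise ValueError(f"unsupported encoding: {encoding!r}")
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode((data.text or "").strip(), validate=True)


payload = b"\x00\x01\x02\xfe\xff"
message = (
    f'<hce:message xmlns:hce="{HCE_NS}">'
    f'<hce:data hce:encoding="binary.base64">'
    f'{base64.b64encode(payload).decode("ascii")}'
    f'</hce:data>'
    f'</hce:message>'
)
assert extract_binary(message) == payload
```

The whole message, including the encoded block, has to be parsed before the payload is available, which is the decoding cost raised earlier in the thread.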
----------------------------------------------------------------------------------------------------------------------------
"Kaylor, Andrew" <andrew.kaylor@xxxxxxxxx>
Sent by: hyades-dev-admin@xxxxxxxxxxx
09/20/2004 11:04 AM
Please respond to hyades-dev
To: <hyades-dev@xxxxxxxxxxx>
cc:
Subject: [hyades-dev] XML-based command data
At last week's face-to-face we discussed the possibility of switching
the command data format for the proposed HCE protocol from a binary,
byte-stream-based format to an XML-based format. Rather than make any
hasty changes to the current proposal, I wanted to air this explicitly on this
mailing list and give everyone (particularly those who weren’t present
last week) a chance to voice whatever concerns they might have.
I believe there was some consensus that this was a good idea so long as we kept
it to a simple description of the data (perhaps leveraging some existing
standard).
However, at some point on Thursday Harm raised the possibility that there are scenarios
in which it is necessary to send large amounts of binary data on the command
channel. I would have serious reservations about using XML some of the
time but occasionally switching to binary, but if someone familiar with this
case can propose something reasonable, I'm willing to listen.
We agreed, I think, on Friday that this would apply only to the command header and
command data, not to the message envelope (which is specific to the transport
layer being used).
One concern I have is that if we use XML to describe command data, I want to be
sure we have very clearly defined rules for how and when additional fields can
be added, including the expected behavior if a command with the new data is
sent to an object that isn't expecting it (i.e. what are the implications of
ignoring the data) and the expected behavior if a command without the data is
sent to an object that is looking for it (i.e. what is the default).
I don't want the interfaces to become amorphous.
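To make the two cases above concrete, here is one possible rule set sketched in Python: unknown fields are ignored but reported, and missing optional fields take a declared default. The command name, field names and defaults are invented for the example, and this is only one way such rules could be written down.

```python
import xml.etree.ElementTree as ET

REQUIRED = object()  # sentinel: the field has no default and must be present

# Hypothetical field table for one command: field name -> default value.
SET_BUFFER_FIELDS = {"size": REQUIRED, "flushInterval": "1000"}


def parse_command(xml_text: str, schema: dict):
    """Return (values, ignored) for a command, applying the schema's rules."""
    root = ET.fromstring(xml_text)
    values, ignored = {}, []
    for child in root:
        if child.tag in schema:
            values[child.tag] = child.text or ""
        else:
            # Newer sender, older receiver: skip the field, but keep a record.
            ignored.append(child.tag)
    for name, default in schema.items():
        if name not in values:
            if default is REQUIRED:
                raise ValueError(f"missing required field: {name}")
            # Older sender, newer receiver: fall back to the declared default.
            values[name] = default
    return values, ignored


# A newer sender adds a field this receiver does not know about.
newer = "<setBuffer><size>4096</size><compression>on</compression></setBuffer>"
print(parse_command(newer, SET_BUFFER_FIELDS))
```

Keeping the defaults in one table per command means the behavior for an absent field is written down in a single place rather than scattered through the handlers.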
Other comments?
Should we switch to XML-based command data?