Re: [p2-dev] URLs, URIs, and IDs (oh my)


Comments below.

Scott Lewis wrote:
Hi Ed,

Ed Merks wrote:

I think we're both in the unenviable position of appearing as if we are arguing in favor of injecting EMF or ECF into the platform. Hopefully folks will see beyond that so as not to lose sight of the technical underpinnings.

Hope so. I only started this discussion because it's sort of a one-time thing at the platform level...i.e. once a URI, always a URI. Also...I'm quite aware that our use cases (ECF's) aren't the same as 'everybody' else's (i.e. most people are naturally thinking primarily about *resource* identifiers for resource API, etc). But in any event...
For sure EMF's focus was to use them to identify resources and objects within resources...
If we were to start Eclipse from scratch based on all the things we've learned, we'd likely do quite a few things a little differently...

<stuff deleted>

Sure. One example of a needed extension is in URI construction. Many IDs have specific syntax requirements beyond the URI syntax spec...as a concrete example: xmppids. So it's desirable/necessary to be able to run custom String parsing code on construction.
I can see the point, but it seems a minor one and now I'm wondering about the context around this. Consider how often a URI appears serialized in a document. All I have is a string and I need to create a URI. How will I know the right way to construct an ID given that there are many choices?

I would think that it would have to be based upon application context (i.e. what application). That is, for p2 repositories the IDs are essentially resources representing p2 meta-data (artifact/metadata repos)...so URI can/could be naturally used. As an alternative, if the context is a buddy list/IM app, and the IDs are user ids associated with a particular protocol...then URI is not at all natural for representing those (at least that's been my experience).
If you squeezed it just right it would fit. :-P

<stuff deleted>

there are cases where unique identifiers (for some protocols) would have to be forced into being a URI.
Some concrete examples would help me a lot... Note that there is a high level dichotomy between hierarchical URIs (<scheme>: followed by /) and opaque URIs (<scheme>: followed by anything other than /); you probably know that already.
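
The hierarchical/opaque split mentioned above can be checked mechanically. The sketch below is only an illustration: it uses the JDK's java.net.URI as a stand-in (not the EMF URI class under discussion), and the class name UriKinds and the sample addresses are made up for the example.

```java
import java.net.URI;

// Illustrative only: java.net.URI stands in for the EMF URI discussed in the thread.
class UriKinds {
    // Classify a URI string as "hierarchical", "opaque", or "relative" (no scheme).
    static String kind(String text) {
        URI uri = URI.create(text);
        if (!uri.isAbsolute()) {
            return "relative";
        }
        // java.net.URI calls an absolute URI opaque when its
        // scheme-specific part does not begin with '/'.
        return uri.isOpaque() ? "opaque" : "hierarchical";
    }

    public static void main(String[] args) {
        System.out.println(kind("http://example.org/repo/content.xml")); // hierarchical
        System.out.println(kind("xmpp:someone@example.org"));            // opaque
    }
}
```

An IM-style identifier such as the xmpp example ends up in the opaque form, which is the form where the URI class has the least structure to offer.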
And for many other cases the URI class is overkill.
True, but if we analyze the memory footprint of the wrapped things and their wrappers and compare that to an overkill uniform representation, is it really bad? (I'm asking, I don't know.)

I haven't done such an analysis either, but in looking at the emf URI it looks like it has 7 or so Strings internally, and two booleans...versus (for some of our ids) a single String. So although the overhead might be small, with many instances the waste could be significant.
It's not that simple. If it were that simple, the platform would be using String instead of IPath/Path. As an example, a resource might have a huge number of references in it to another resource. Each results in a URI being constructed. But the URI implementation is such that uri.appendFragment("foo") reuses all the strings in uri to create a new uri. And uri.appendFragment(<anything>).trimFragment() == uri. And we can intern the scheme, since we expect a small number of those.

Two booleans. Hmm. That's 8 bytes. I suppose we could reuse two bits of the hashCode and save 8 bytes... Hmmm......
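
One way the string reuse described above could work is sketched here. This is a hypothetical SharedUri class written for illustration, not the EMF implementation: a fragment-bearing instance keeps a reference to the fragment-less instance it came from, so the scheme and the rest of the URI are shared rather than copied, and trimming the fragment hands back the original object.

```java
import java.util.Objects;

// Hypothetical sketch, not the EMF URI: fragment variants share the base URI's strings.
final class SharedUri {
    private final String scheme;    // interned, since few distinct schemes are expected
    private final String rest;      // scheme-specific part, shared by reference
    private final String fragment;  // null when there is no fragment
    private final SharedUri base;   // fragment-less URI this one derives from, or null

    private SharedUri(String scheme, String rest, String fragment, SharedUri base) {
        this.scheme = scheme;
        this.rest = rest;
        this.fragment = fragment;
        this.base = base;
    }

    static SharedUri of(String scheme, String rest) {
        Objects.requireNonNull(rest, "rest");
        return new SharedUri(scheme.intern(), rest, null, null);
    }

    // Creates a new URI that reuses the scheme and rest strings of the base.
    SharedUri appendFragment(String newFragment) {
        Objects.requireNonNull(newFragment, "newFragment");
        SharedUri root = (base != null) ? base : this;
        return new SharedUri(root.scheme, root.rest, newFragment, root);
    }

    // Returns the very same base object, so identity comparison holds.
    SharedUri trimFragment() {
        return (base != null) ? base : this;
    }

    // True when both URIs point at the same String objects (no copies were made).
    boolean sharesStringsWith(SharedUri other) {
        return scheme == other.scheme && rest == other.rest;
    }

    @Override
    public String toString() {
        return scheme + ":" + rest + (fragment != null ? "#" + fragment : "");
    }
}
```

With this layout each extra reference into the same resource costs one small object plus its fragment string, which is the kind of sharing the memory-footprint question would need to account for.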

<stuff deleted>

I'm still curious how, if I have an arbitrary ID, I'd know what kind of adapter to create to do something meaningful for it. I get the sense that an all encompassing representation that captures all cases will be the thing to which most folks will want to adapt...

Maybe, but again I think it depends highly on application context. If you are not in the 'resource identifier' box, and being in that box doesn't really make sense, then it becomes much more of a force fit. I know that URI handles many boxes...and especially the boxes having to do with identifying out-of-process resources...but not everything is a file system (yes, the opaque form is not file-system focussed, but still :).
Any shoe fits in that box! :-P

Then IResourceID could be used in appropriate places within p2 (and/or e4) along with URIConverter, etc. This would, I think, be both an easy and useful way to go...as it would still be using emf.URI for implementation, but gain the extensibility benefits of using the namespace extension point. The main cost to the programmer would be calling (e.g.)
For the record, I doubt it's reasonable to move the existing EMF class somewhere else. It's probably reasonable to copy it...

Just for my information...how many classes would need to be copied?
Just one. It's self contained.

resourceID.getURI().getPath() rather than resourceURI.getPath() ...i.e. one level of redirection.
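
The extra level of indirection can be pictured with a small sketch. IResourceID is only the name proposed in this thread, so the interface shape below is an assumption, as are SimpleResourceID and ResourceIdDemo; java.net.URI stands in for the EMF URI.

```java
import java.net.URI;
import java.util.Objects;

// Assumed shape of the proposed IResourceID; not an existing ECF, EMF, or p2 type.
interface IResourceID {
    String getNamespace(); // name of the namespace that created this ID
    URI getURI();          // the wrapped URI (java.net.URI used as a stand-in)
}

// Hypothetical minimal implementation that simply holds the two values.
final class SimpleResourceID implements IResourceID {
    private final String namespace;
    private final URI uri;

    SimpleResourceID(String namespace, URI uri) {
        this.namespace = Objects.requireNonNull(namespace, "namespace");
        this.uri = Objects.requireNonNull(uri, "uri");
    }

    @Override
    public String getNamespace() {
        return namespace;
    }

    @Override
    public URI getURI() {
        return uri;
    }
}

class ResourceIdDemo {
    // The caller pays one extra call: id.getURI().getPath() instead of uri.getPath().
    static String pathOf(IResourceID id) {
        return id.getURI().getPath();
    }
}
```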

Just a thought. Even if URI is used directly in p2 we will certainly do this ourselves if emf URI is added somewhere in Equinox.
Would you imagine that all IDs could be adapted to be represented as a URI?

Possibly...although having such an adapter (to URI) would be unnecessary/cumbersome for some IDs.
I'm digging because if I can be handed an ID that doesn't adapt to a URI, then I can't via the static type system ensure that such IDs don't work their way into the framework...

Given such a URI, should it be possible to construct the "corresponding ID"?

Only if the corresponding ID is constructed by the same Namespace. That is, a single URI could be used to construct two IDs...i.e. one in namespace 'foo' and the other in namespace 'bar'. Then:

URI uri = new URI("myscheme:blah");
ID fooID = IDFactory.createID("foo",uri);
ID barID = IDFactory.createID("bar",uri);
fooID.equals(barID); // -> false
So serializing an ID would require serializing the namespace and the other part of the string...

Or is it the case that a given URI might represent different IDs in different domains and that they just happen to map onto the same URI representation? (I think I'm asking the same ID -> String -> ID round trip question.)

Yes, a given URI might be used to create different IDs...if they were constructed via different namespaces as above.
I see... I could imagine someone having a model that wanted to serialize IDs and would need to do ID -> String -> ID mapping in order to achieve that goal. (Of course someone will want such a model, it goes without saying, but I thought I'd point out the obvious.)
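
A minimal sketch of that ID -> String -> ID round trip, assuming the serialized form carries the namespace name followed by the URI text. NamespacedId and its space-separated external form are hypothetical and not part of ECF or EMF; java.net.URI is used as a stand-in.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical sketch: an ID is the pair (namespace, URI), so both must be serialized.
final class NamespacedId {
    private final String namespace;
    private final URI uri;

    NamespacedId(String namespace, URI uri) {
        if (namespace.isEmpty() || namespace.indexOf(' ') >= 0) {
            throw new IllegalArgumentException("namespace must be non-empty and contain no spaces");
        }
        this.namespace = namespace;
        this.uri = Objects.requireNonNull(uri, "uri");
    }

    // External form "<namespace> <uri>"; a space is a safe separator here because
    // java.net.URI rejects unescaped spaces in URI text.
    String toExternalForm() {
        return namespace + " " + uri;
    }

    static NamespacedId fromExternalForm(String text) {
        int sep = text.indexOf(' ');
        if (sep <= 0) {
            throw new IllegalArgumentException("expected '<namespace> <uri>': " + text);
        }
        return new NamespacedId(text.substring(0, sep), URI.create(text.substring(sep + 1)));
    }

    // Two IDs are equal only when both the namespace and the URI match.
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof NamespacedId)) {
            return false;
        }
        NamespacedId other = (NamespacedId) obj;
        return namespace.equals(other.namespace) && uri.equals(other.uri);
    }

    @Override
    public int hashCode() {
        return Objects.hash(namespace, uri);
    }
}
```

Serializing the URI text alone would lose the namespace, and with it the distinction between the 'foo' and 'bar' IDs built from the same URI.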

