Community
Participate
Working Groups
SaxRepositoriesWriter.parse() does not escape special characters in the attribute key.
IIRC, the tasks framework is able to write XML which it cannot then read.
Kenneth, can you describe your fix? It looks like it substantially changes the data format.
I changed the repository xml from looking like this: <TaskRepository property1=key1 property2=key2 property3=key3 ..../> to something like this: <TaskRepository url=url kind=kind> <Property key=1 value=value/> <Property key=2 value=value/> ... </TaskRepository>
This prevents invalid XML from being written, since the writer escapes the key and value strings.
Thanks, that seems like a good idea. A simpler fix would be to prevent using special characters in repository property names, but it seems like this is worth supporting.
Steffen, do you have any concerns with this? Going forward we'd only write data in the new format while retaining backwards compatibility with the old format by bumping the version number. Kenneth's change is at https://git.eclipse.org/r/#/c/52197/.
Wouldn't we want something like this instead? <TaskRepository url=url kind=kind> <Property key=1> value </Property> Generally speaking it sounds reasonable to me if it fixes an actual problem.
Why is that better? In any case, I'm a bit concerned that old versions of Mylyn won't be forwards compatible with this change, that is, if users upgrade and then downgrade to an earlier Mylyn version, they will no longer be able to read their repositories. We could consider adding a system property to control the output version but that doesn't really solve the problem.
I was thinking that you want values to be XML text rather than attributes which have more restrictions in terms of allowed characters. The bug report appears to be about the keys anyways which, on second thought, doesn't seem worth addressing as that should be controlled by the connector and I don't see a pressing need for supporting any special characters for that. What's the use case?
I think we should strive to be as permissive as possible when it comes to things like this. Connector developers shouldn't have to know all the character restrictions of XML attribute names when choosing their repository property keys, in fact they shouldn't even have to be aware that the repository will be serialized as XML, let alone that the repository property keys might be used as XML attribute names. The use case is to support connectors which have special characters such as $ in their repository property keys, this may arise if a connector developer wishes to have a simple correspondence between the properties which come from the system itself and the repository property keys. I think that allowing this behaviour allows for simpler and cleaner connector development which helps empower connector developers. Steffen, wouldn't XML attribute values have the same allowances as XML text as long as they are properly escaped? Or is XML text actually more permissive?
(In reply to comment #10) > I think we should strive to be as permissive as possible when it comes to things > like this. Yes, I agree in general but unfortunately we are stuck with some of the restrictions imposed by XML for now. > The use case is to support connectors which have special characters such as $ in > their repository property keys, this may arise if a connector developer wishes > to have a simple correspondence between the properties which come from the > system itself and the repository property keys. I think that allowing this > behaviour allows for simpler and cleaner connector development which helps > empower connector developers. Okay, that use case makes sense. > Steffen, wouldn't XML attribute values have the same allowances as XML text as > long as they are properly escaped? Or is XML text actually more permissive? I thought it was but I could be wrong. While we are this we should add a test that tries all possible characters (there is a similar test for task data). If it turns out that attributes cause problems we can consider reverting to text.
Steffen, do you think it matters that this change would break users who downgrade to an earlier version of Mylyn? Is there any implicit or explicit promise of forwards compatibility with future Mylyn versions?
(In reply to comment #12) > Steffen, do you think it matters that this change would break users who > downgrade to an earlier version of Mylyn? I can thinkg of 2 use cases: the new release introduces a regression so users may want to go back to the old release and users sharing the task list between Eclipse instances with different versions of Mylyn. What is the impact of downgrading? Would users lose some of their repository settings? > Is there any implicit or explicit > promise of forwards compatibility with future Mylyn versions? Not that I'm aware of but it's been a while since we last changed the configuration format so users may expect that. A note in the N&N would be good.
I think the impact of downgrading would be that some of the respository properties would be lost. In the case of Bugzilla that might just mean a loss of preferences such as which attributes to suppress. For other connectors it may mean that the connector can no longer connect to the repository.
is the risk of data loss on a downgrade enough to block this task from being done? I think that the added flexibility of allowing connector implementors to not have to worry so much about how the repositories are being serialized is a good benefit.
(In reply to comment #11) > > Steffen, wouldn't XML attribute values have the same allowances as XML text as > > long as they are properly escaped? Or is XML text actually more permissive? > > I thought it was but I could be wrong. While we are this we should add a test > that tries all possible characters (there is a similar test for task data). If > it turns out that attributes cause problems we can consider reverting to text. Jaxsun or Kenneth, have you established whether there's a difference? Another group of people who would be negatively impacted by this change are those who bootstrap/develop Mylyn and switch branches. To minimize the impact of the change, could we stage it so that for this release, we write the properties twice, once in each format, and then for the next release we remove the writing of the old format? That way it would only cause problems for people who go back two releases.
(In reply to Sam Davis from comment #16) > (In reply to comment #11) > > > Steffen, wouldn't XML attribute values have the same allowances as XML text as > > > long as they are properly escaped? Or is XML text actually more permissive? > > > > I thought it was but I could be wrong. While we are this we should add a test > > that tries all possible characters (there is a similar test for task data). If > > it turns out that attributes cause problems we can consider reverting to text. > > Jaxsun or Kenneth, have you established whether there's a difference? The only difference I have found is that the attributes need to escape " or ' depending on the quotes used for the attribute value. > Another group of people who would be negatively impacted by this change are > those who bootstrap/develop Mylyn and switch branches. To minimize the > impact of the change, could we stage it so that for this release, we write > the properties twice, once in each format, and then for the next release we > remove the writing of the old format? That way it would only cause problems > for people who go back two releases. Do you mean write both formats to the same file or write to two separate files?
I suppose angle brackets would be a special character for XML text, so we might as well stick with attributes. I meant to write them both to the same file, so that it can be read by both the current and older Mylyn versions. That obviously won't allow using special characters yet, but it will set us up to remove the old format in a future release.
(In reply to Sam Davis from comment #18) > I suppose angle brackets would be a special character for XML text, so we > might as well stick with attributes. > > I meant to write them both to the same file, so that it can be read by both > the current and older Mylyn versions. That obviously won't allow using > special characters yet, but it will set us up to remove the old format in a > future release. ok. I pushed a new patch to do this.
Thanks. I'll need to give it a closer look but it seems ok to me. I was actually thinking that we'd create one node for each repository and use a blend of the two formats so that the properties are specified both ways on the same node, but creating separate nodes as you've done is probably cleaner. It does mean that we'll read the data twice and use whichever version comes last, but that seems fine.
Due to the risky nature of this change, we're going to defer to the next release but we should merge it as soon after we branch as possible.
Unfortunately, I don't think I will have any time to spend on this in this release.
Gerrit change https://git.eclipse.org/r/52197 was merged to [master]. Commit: http://git.eclipse.org/c/mylyn/org.eclipse.mylyn.tasks.git/commit/?id=fa141daf3c63f922b8997c67cccabe91b2978648