Bug 472975 - enable writing repository properties with special characters in their names
Summary: enable writing repository properties with special characters in their names
Status: RESOLVED FIXED
Alias: None
Product: z_Archived
Classification: Eclipse Foundation
Component: Mylyn (show other bugs)
Version: unspecified   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: 3.20   Edit
Assignee: Jaxsun McCarthy Huggan CLA
QA Contact:
URL:
Whiteboard:
Keywords: contributed
Depends on:
Blocks:
 
Reported: 2015-07-17 16:22 EDT by Kenneth Poon CLA
Modified: 2016-04-21 17:55 EDT (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Kenneth Poon CLA 2015-07-17 16:22:22 EDT
SaxRepositoriesWriter.parse() does not escape special characters in the attribute key.
Comment 1 Sam Davis CLA 2015-07-20 13:54:01 EDT
IIRC, the tasks framework is able to write XML which it cannot then read.
Comment 2 Sam Davis CLA 2015-07-20 16:10:25 EDT
Kenneth, can you describe  your fix? It looks like it substantially changes the data format.
Comment 3 Kenneth Poon CLA 2015-07-20 16:23:41 EDT
I changed the repository xml from looking like this:
<TaskRepository property1=key1 property2=key2 property3=key3 ..../>

to something like this:
<TaskRepository url=url kind=kind>
<Property key=1 value=value/>
<Property key=2 value=value/> ... </TaskRepository>
Comment 4 Kenneth Poon CLA 2015-07-20 16:30:22 EDT
This prevents invalid XML from being written, since the writer escapes the key and value strings.
Comment 5 Sam Davis CLA 2015-07-20 17:14:05 EDT
Thanks, that seems like a good idea. A simpler fix would be to prevent using special characters in repository property names, but it seems like this is worth supporting.
Comment 6 Sam Davis CLA 2015-07-20 17:17:05 EDT
Steffen, do you have any concerns with this? Going forward we'd only write data in the new format while retaining backwards compatibility with the old format by bumping the version number. Kenneth's change is at https://git.eclipse.org/r/#/c/52197/.
Comment 7 Steffen Pingel CLA 2015-07-21 06:14:32 EDT
Wouldn't we want something like this instead?

<TaskRepository url=url kind=kind>
<Property key=1>
value
</Property>

Generally speaking it sounds reasonable to me if it fixes an actual problem.
Comment 8 Sam Davis CLA 2015-07-21 16:53:19 EDT
Why is that better?

In any case, I'm a bit concerned that old versions of Mylyn won't be forwards compatible with this change, that is, if users upgrade and then downgrade to an earlier Mylyn version, they will no longer be able to read their repositories. We could consider adding a system property to control the output version but that doesn't really solve the problem.
Comment 9 Steffen Pingel CLA 2015-07-22 02:34:28 EDT
I was thinking that you want values to be XML text rather than attributes which have more restrictions in terms of allowed characters. The bug report appears to be about the keys anyways which, on second thought, doesn't seem worth addressing as that should be controlled by the connector and I don't see a pressing need for supporting any special characters for that. What's the use case?
Comment 10 Jaxsun McCarthy Huggan CLA 2015-07-22 13:55:19 EDT
I think we should strive to be as permissive as possible when it comes to things like this.

Connector developers shouldn't have to know all the character restrictions of XML attribute names when choosing their repository property keys, in fact they shouldn't even have to be aware that the repository will be serialized as XML, let alone that the repository property keys might be used as XML attribute names.

The use case is to support connectors which have special characters such as $ in their repository property keys, this may arise if a connector developer wishes to have a simple correspondence between the properties which come from the system itself and the repository property keys. I think that allowing this behaviour allows for simpler and cleaner connector development which helps empower connector developers.

Steffen, wouldn't XML attribute values have the same allowances as XML text as long as they are properly escaped? Or is XML text actually more permissive?
Comment 11 Steffen Pingel CLA 2015-07-28 18:00:13 EDT
(In reply to comment #10)
> I think we should strive to be as permissive as possible when it comes to things
> like this.

Yes, I agree in general but unfortunately we are stuck with some of the restrictions imposed by XML for now.

> The use case is to support connectors which have special characters such as $ in
> their repository property keys, this may arise if a connector developer wishes
> to have a simple correspondence between the properties which come from the
> system itself and the repository property keys. I think that allowing this
> behaviour allows for simpler and cleaner connector development which helps
> empower connector developers.

Okay, that use case makes sense.

> Steffen, wouldn't XML attribute values have the same allowances as XML text as
> long as they are properly escaped? Or is XML text actually more permissive?

I thought it was but I could be wrong. While we are this we should add a test that tries all possible characters (there is a similar test for task data). If it turns out that attributes cause problems we can consider reverting to text.
Comment 12 Sam Davis CLA 2015-07-28 18:26:23 EDT
Steffen, do you think it matters that this change would break users who downgrade to an earlier version of Mylyn? Is there any implicit or explicit promise of forwards compatibility with future Mylyn versions?
Comment 13 Steffen Pingel CLA 2015-07-29 11:24:43 EDT
(In reply to comment #12)
> Steffen, do you think it matters that this change would break users who
> downgrade to an earlier version of Mylyn? 

I can thinkg of 2 use cases: the new release introduces a regression so users may want to go back to the old release and users sharing the task list between Eclipse instances with different versions of Mylyn. 

What is the impact of downgrading? Would users lose some of their repository settings? 

> Is there any implicit or explicit
> promise of forwards compatibility with future Mylyn versions?

Not that I'm aware of but it's been a while since we last changed the configuration format so users may expect that. A note in the N&N would be good.
Comment 14 Sam Davis CLA 2015-07-29 13:19:47 EDT
I think the impact of downgrading would be that some of the respository properties would be lost. In the case of Bugzilla that might just mean a loss of preferences such as which attributes to suppress. For other connectors it may mean that the connector can no longer connect to the repository.
Comment 15 Jaxsun McCarthy Huggan CLA 2015-08-12 19:31:18 EDT
is the risk of data loss on a downgrade enough to block this task from being done?

I think that the added flexibility of allowing connector implementors to not have to worry so much about how the repositories are being serialized is a good benefit.
Comment 16 Sam Davis CLA 2015-08-19 10:49:42 EDT
(In reply to comment #11)
> > Steffen, wouldn't XML attribute values have the same allowances as XML text as
> > long as they are properly escaped? Or is XML text actually more permissive?
> 
> I thought it was but I could be wrong. While we are this we should add a test
> that tries all possible characters (there is a similar test for task data). If
> it turns out that attributes cause problems we can consider reverting to text.

Jaxsun or Kenneth, have you established whether there's a difference?

Another group of people who would be negatively impacted by this change are those who bootstrap/develop Mylyn and switch branches. To minimize the impact of the change, could we stage it so that for this release, we write the properties twice, once in each format, and then for the next release we remove the writing of the old format? That way it would only cause problems for people who go back two releases.
Comment 17 Kenneth Poon CLA 2015-08-19 14:59:37 EDT
(In reply to Sam Davis from comment #16)
> (In reply to comment #11)
> > > Steffen, wouldn't XML attribute values have the same allowances as XML text as
> > > long as they are properly escaped? Or is XML text actually more permissive?
> > 
> > I thought it was but I could be wrong. While we are this we should add a test
> > that tries all possible characters (there is a similar test for task data). If
> > it turns out that attributes cause problems we can consider reverting to text.
> 
> Jaxsun or Kenneth, have you established whether there's a difference?

The only difference I have found is that the attributes need to escape " or ' depending on the quotes used for the attribute value.

> Another group of people who would be negatively impacted by this change are
> those who bootstrap/develop Mylyn and switch branches. To minimize the
> impact of the change, could we stage it so that for this release, we write
> the properties twice, once in each format, and then for the next release we
> remove the writing of the old format? That way it would only cause problems
> for people who go back two releases.

Do you mean write both formats to the same file or write to two separate files?
Comment 18 Sam Davis CLA 2015-08-19 16:55:27 EDT
I suppose angle brackets would be a special character for XML text, so we might as well stick with attributes.

I meant to write them both to the same file, so that it can be read by both the current and older Mylyn versions. That obviously won't allow using special characters yet, but it will set us up to remove the old format in a future release.
Comment 19 Kenneth Poon CLA 2015-08-19 17:34:46 EDT
(In reply to Sam Davis from comment #18)
> I suppose angle brackets would be a special character for XML text, so we
> might as well stick with attributes.
> 
> I meant to write them both to the same file, so that it can be read by both
> the current and older Mylyn versions. That obviously won't allow using
> special characters yet, but it will set us up to remove the old format in a
> future release.

ok. I pushed a new patch to do this.
Comment 20 Sam Davis CLA 2015-08-19 18:28:06 EDT
Thanks. I'll need to give it a closer look but it seems ok to me. I was actually thinking that we'd create one node for each repository and use a blend of the two formats so that the properties are specified both ways on the same node, but creating separate nodes as you've done is probably cleaner. It does mean that we'll read the data twice and use whichever version comes last, but that seems fine.
Comment 21 Sam Davis CLA 2015-08-27 18:37:01 EDT
Due to the risky nature of this change, we're going to defer to the next release but we should merge it as soon after we branch as possible.
Comment 22 Sam Davis CLA 2015-10-26 14:52:59 EDT
Unfortunately, I don't think I will have any time to spend on this in this release.