Bug 272066 - [repo] Repository location validation
Summary: [repo] Repository location validation
Status: RESOLVED FIXED
Alias: None
Product: Equinox
Classification: Eclipse Project
Component: p2 (show other bugs)
Version: 3.5   Edit
Hardware: PC Windows XP
: P3 normal (vote)
Target Milestone: 3.6 M6   Edit
Assignee: John Arthorne CLA
QA Contact:
URL:
Whiteboard:
Keywords: api
: 272940 (view as bug list)
Depends on: 272530
Blocks: 268580
  Show dependency tree
 
Reported: 2009-04-13 14:56 EDT by John Arthorne CLA
Modified: 2010-02-09 14:55 EST (History)
3 users (show)

See Also:


Attachments
Remove unused validation API (13.28 KB, patch)
2010-02-09 14:54 EST, John Arthorne CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description John Arthorne CLA 2009-04-13 14:56:33 EDT
I20090407

Currently the Available Software Sites page doesn't do any kind of validation on the location when the user adds a new repository. I think we should be doing some lightweight validation here, such as calling IMetadataRepositoryManager#validateRepositoryLocation.
Comment 1 Susan McCourt CLA 2009-04-13 15:07:39 EDT
This is the 2009 version of bug bug #204184.
We added those repo manager validation methods to support that bug, and then stopped using them in bug #225062.

The intention was that there was simple validation that could be done in the UI, but we ended up with deadlock in certain cases. 

Of course, a lot has changed in the ECF/p2 transport world since this occurred, so it's possible we could revisit this...
Comment 2 Henrik Lindberg CLA 2009-04-13 15:19:53 EDT
It is safe to call IMetadataRepositoryManager#validateRepositoryLocation - it just validates p2 constraints on the URI (can not be relative, etc.) and it does not perform any actual communication.




Comment 3 Susan McCourt CLA 2009-04-13 15:22:28 EDT
The issue here is that we would need to document more specific behavior for what factories can and cannot do in the validate method.  For example, an UpdateSiteMetadataRepositoryFactory wants to see if there is a "site.xml" available.  So it tries to see if the file exists, which can timeout, etc....

I think we determined finally that the factories couldn't do any useful validation without contacting the network.  If we want to redefine the validation methods where they are workable for the UI, I'm all for it.  In the meantime I think there is another bug for allowing the user to automatically remove the URI from the manager if it's not correct.  Allowing them to do this, or even correct, the URI might be a big win.
Comment 4 Susan McCourt CLA 2009-04-13 15:24:58 EDT
(In reply to comment #2)
> It is safe to call IMetadataRepositoryManager#validateRepositoryLocation - it
> just validates p2 constraints on the URI (can not be relative, etc.) and it
> does not perform any actual communication.
> 
Just mid-air collided.  I still don't think it's safe to call this method in the UI.  For example, in UpdateSite (called by the update site metadata factory in the validate method)... 

	public static void validate(URI url, IProgressMonitor monitor) throws UserCancelledException, AuthenticationFailedException, FileNotFoundException, CoreException {
		URI siteURI = getSiteURI(url);
		long lastModified = getTransport().getLastModified(siteURI, monitor);
Comment 5 Henrik Lindberg CLA 2009-04-13 16:08:41 EDT
There should really be just one place where the validity of the URI as such is checked - i.e. a validation of the pointer as opposed to a validation of whatever it is assumed to point to. 

The next level of checking is to test if the ECF configuration has a factory for the scheme. Until dependencies on such configuration is captured as meta (install time) requirements some interesting issues can be created - if a repo is used using some non standard protocol, and then the support for this protocol is uninstalled for instance.

so
a) Is it a valid reference to "something"
b) Is the system configured to communicate with that "something" (i.e. can ECF be used to communicate)
c) What type of thing is at the location (p2, update site, etc.)
d) Is that thing valid (can it be "loaded")

I think the repository bundle should have a test that covers a) and b). Locations must always pass test a). Test b) must be passed if trying to communicate. c) and d) are already handled in how p2 currently figures out what the URI is a pointer to.


Comment 6 Susan McCourt CLA 2009-04-15 19:24:51 EDT
removing milestone and changing title.

This is a repo-manager/repo factory level issue.
We need staged validation as Henrik suggests, with calls that the UI can safely make in the UI thread.  Right now there are factories that are doing real work in validation so the UI can't do any better than simple URL validation and other client-side tricks.  The client-side tricks are documented in their own bugs (ie, comparing trailing slashes, defaulting to a scheme, etc...)
Comment 7 Henrik Lindberg CLA 2009-04-16 15:08:25 EDT
Bug 272530 is about have a method the UI can call to validate the URI entered by a user.
i.e. it validates the cases a) and b) described above.
Comment 8 John Arthorne CLA 2009-04-16 15:43:39 EDT
We will have to revisit this next release. Our current state is that we have a validation API method that contacts repositories, and is therefore not used at all. If this isn't proving useful, then we should switch this method to be a short-running validation that the UI can call as the user is typing (no progress monitor, no blocking). It looks like the "middle-weight" validation that contacts repositories but does not load all the contents is not particularly useful (since merely contacting the repository can take a long time).
Comment 9 Henrik Lindberg CLA 2009-04-16 16:13:22 EDT
(In reply to comment #8)
> We will have to revisit this next release. Our current state is that we have a
> validation API method that contacts repositories, and is therefore not used at
> all. If this isn't proving useful, then we should switch this method to be a
> short-running validation that the UI can call as the user is typing (no
> progress monitor, no blocking). It looks like the "middle-weight" validation
> that contacts repositories but does not load all the contents is not
> particularly useful (since merely contacting the repository can take a long
> time).
> 

Agree.
Comment 10 Susan McCourt CLA 2009-04-16 16:25:02 EDT
FWIW, at least the repo manager validation methods correctly document the possible performance problems...the fact that a load attempt might be made.

I suspect that for 3.6/4.0, we can remove the middleweight methods...
Comment 11 John Arthorne CLA 2009-04-20 13:28:53 EDT
*** Bug 272940 has been marked as a duplicate of this bug. ***
Comment 12 John Arthorne CLA 2009-04-20 13:30:31 EDT
Another aspect of validation is doing URI equivalence testing WRT trailing slashes. The repository manager or factories may be able to determine whether a trailing slash is significant for a given repository location.
Comment 13 John Arthorne CLA 2009-04-21 15:21:45 EDT
It crossed my mind that the CVS UI has a nice validation approach. The "add location" wizard has a checkbox in their case: "Validate connection on finish". This allows the user to choose not to validate if they are disconnected, on a slow network, etc. But it allows validation to happen if the user is not sure that they location they entered is correct.
Comment 14 Henrik Lindberg CLA 2009-04-21 19:20:08 EDT
(In reply to comment #13)
> It crossed my mind that the CVS UI has a nice validation approach. The "add
> location" wizard has a checkbox in their case: "Validate connection on finish".
> This allows the user to choose not to validate if they are disconnected, on a
> slow network, etc. But it allows validation to happen if the user is not sure
> that they location they entered is correct.
> 

That requires that it is possible to remember repositories without actually loading them. Which I think would be an improvement, but I don't think the repository manager currently is capable of doing that. 

The way I think it should work is that a syntactically valid URI can be added. It will at some point (either on demand: "test connection", or "install software", or via some background job, perhaps running periodically) be validated/read/loaded. When looking at a list of repositories, the repository status should reflect the current situation. (Just because a repo was valid some time ago, does not mean it is valid right now, etc.).

Comment 15 Susan McCourt CLA 2009-04-23 20:45:59 EDT
(In reply to comment #14)
> (In reply to comment #13)
> > It crossed my mind that the CVS UI has a nice validation approach. The "add
> > location" wizard has a checkbox in their case: "Validate connection on finish".
> > This allows the user to choose not to validate if they are disconnected, on a
> > slow network, etc. But it allows validation to happen if the user is not sure
> > that they location they entered is correct.
> > 
> 
> That requires that it is possible to remember repositories without actually
> loading them. Which I think would be an improvement, but I don't think the
> repository manager currently is capable of doing that. 

The repo manager *can* remember repos without loading them.  It just can't work with them without adding them.  

> 
> The way I think it should work is that a syntactically valid URI can be added.
> It will at some point (either on demand: "test connection", or "install
> software", or via some background job, perhaps running periodically) be
> validated/read/loaded. When looking at a list of repositories, the repository
> status should reflect the current situation. (Just because a repo was valid
> some time ago, does not mean it is valid right now, etc.).

yes, I think John's comment #13 is directed at giving the user more control over when the load-time validation occurs.  Right now it depends on what they are doing.  Typing into the combo will immediately load the repo because we are trying to filter on the results.  Adding it in the pref page will not load it.  So having a "validate" checkbox on the add dialog may make sense.



Comment 16 John Arthorne CLA 2010-02-09 13:57:45 EST
I think at this point we should just remove the validation method from the API because it is not used, and it's not clear if it is the right answer. Susan, I noticed there are a couple of references to the validation method in the UI, but they are never called because "contactRepositories" is always false. Do you mind if I just remove the RepositoryTracker.validateRepositoryLocationWithManager for 3.6 and if we find a need for this in the future we could add it back?
Comment 17 Susan McCourt CLA 2010-02-09 14:02:59 EST
(In reply to comment #16)
> I think at this point we should just remove the validation method from the API
> because it is not used, and it's not clear if it is the right answer. Susan, I
> noticed there are a couple of references to the validation method in the UI,
> but they are never called because "contactRepositories" is always false. Do you
> mind if I just remove the
> RepositoryTracker.validateRepositoryLocationWithManager for 3.6 and if we find
> a need for this in the future we could add it back?

Go for it, we should get rid of these methods for all the reasons mentioned in this bug.  Anything we add back in the future will likely be a lighterweight validation.
Comment 18 John Arthorne CLA 2010-02-09 14:54:17 EST
Created attachment 158623 [details]
Remove unused validation API
Comment 19 John Arthorne CLA 2010-02-09 14:55:42 EST
Released in HEAD.