Bug 544447 - [provider] Implement filetransfer provider using HttpClient 4.5 API
Summary: [provider] Implement filetransfer provider using HttpClient 4.5 API
Status: RESOLVED FIXED
Alias: None
Product: ECF
Classification: RT
Component: ecf.filetransfer (show other bugs)
Version: unspecified   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: ecf.core-inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: noteworthy
Depends on:
Blocks: 543205 544841
  Show dependency tree
 
Reported: 2019-02-14 10:18 EST by Carsten Reckord CLA
Modified: 2019-10-13 14:27 EDT (History)
9 users (show)

See Also:


Attachments
Filtered logging information (74.82 KB, application/octet-stream)
2019-05-30 04:34 EDT, Ed Merks CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Carsten Reckord CLA 2019-02-14 10:18:03 EST
In order to improve the HttpClient story with NTLM/SPNEGO authentication using the httpclient-win library, I've reimplemented the ECF HttpClient filetransfer provider based on the HttpClient 4.5 API for one of the products at my company, Yatta.

The provider in its current form is already shipped as part of Yatta's Profiles product and is in use at several corporate customers for quite some time now with no more issues regarding proxy support, where the previous provider didn't work.

We would like to contribute both the provider implementation and the fragment for httpclient-win support back to ECF. 

The code is already under EPL-2 and I can put it through the necessary CQ process. 

Current sources are available at http://cdn.yatta.de/update/profiles/plugins/de.yatta.ecf.provider.filetransfer.httpclient45.source_1.3.45.201808280835.jar and http://cdn.yatta.de/update/profiles/plugins/de.yatta.ecf.provider.filetransfer.httpclient45.win32.source_1.3.45.201808280835.jar

The reimplementation was more or less a fork of the current HttpClient filetransfer provider with parts rewritten to use the new 4.5 API. 

The biggest conceptual change in this: I opted to have the possibility to have a dedicated HttpClient instance made available as an OSGi service[*], which can be reused across requests. Only the HttpContext is freshly created per transfer in that case. This way, the client can use all the usual fun HttpClient stuff like connection pooling etc across requests.
 
One small caveat with that: In that setting, it didn't seem to make sense to support ECF's socket-level listener API, which would interfere with HttpClient's own socket management e.g. for connection pooling, so that's not implemented at the moment.


--
[*] There's still a few bits missing in the linked code to allow selecting an existing named client per transfer, but I'd add those along the way
Comment 1 Scott Lewis CLA 2019-02-14 17:32:05 EST
Thanksinadvance for the contribution Carsten and Yatta.

A couple of initial questions:  Does your fragment/provider work on non-windows clients?  

When was the fork done?   Does it incorporate any fixes to the httpclient  provider done since the fork?  Admittedly there have not been many so perhaps it's not an issue.

The use of OSGi services seems like a good idea.  I suppose that might suggest a change in p2, although perhaps not.

I expect we'll be able to get everything in place for 2019-6 release.  I'll take a look at the source in bundle form.  Can you make a git repo (with history since fork) available at some point?

And now the big question:  are you interested/willing to contribute to maintenance?...e.g. with new version of httpclient?

Again, thanks.
Comment 2 Carsten Reckord CLA 2019-02-15 04:43:42 EST
(In reply to Scott Lewis from comment #1)
> A couple of initial questions:  Does your fragment/provider work on
> non-windows clients?  

The provider itself is completely platform-independent. 

The fragment that adds SPNEGO authentication is Windows-only, using https://hc.apache.org/httpcomponents-client-ga/httpclient-win/project-summary.html. It does JNA calls to use the native authentication handler. 

In our experience, the vast majority of proxy-related issues are on Windows with corporate proxies that require NTLM/SPNEGO authentication. So adding that for Windows only still is a huge win.

> When was the fork done?   Does it incorporate any fixes to the httpclient 
> provider done since the fork?  Admittedly there have not been many so
> perhaps it's not an issue.

It was done around March 23, 2018. I'm pretty sure the code it is based on was from commit 49955d4, which is still the latest for the httpclient provider. But I didn't track that explicitly.

> The use of OSGi services seems like a good idea.  I suppose that might
> suggest a change in p2, although perhaps not.

Right now, the provider tracks a global HttpClient instance through its activator. That should work without changes for most clients, but I could imagine cases with special needs where more isolation would be needed. 

It would be easy to change this so we use one-off client instances again by default unless a named service is specifically requested. In that case, it should be completely backward-compatible.

> I expect we'll be able to get everything in place for 2019-6 release.  I'll
> take a look at the source in bundle form.  Can you make a git repo (with
> history since fork) available at some point?

The code is currently in our product repo, but I'll split it off and create a GitHub repo for it. I'll post the link once it's ready.

> And now the big question:  are you interested/willing to contribute to
> maintenance?...e.g. with new version of httpclient?

Sure.
Comment 3 Scott Lewis CLA 2019-02-15 12:13:58 EST
(In reply to Carsten Reckord from comment #2)
<stuff deleted>
> 
> In our experience, the vast majority of proxy-related issues are on Windows
> with corporate proxies that require NTLM/SPNEGO authentication. So adding
> that for Windows only still is a huge win.

I agree.  I was just clarifying.

If you have test cases that we could reuse then these could also be added to the ECF filetransfer tests.   One problem:  how to do platform-specific testing?   The existing filetransfer codebase (and in fact all of ECF) is pure-java/platform generic so there may be some releng work there.

> 
> > When was the fork done?   Does it incorporate any fixes to the httpclient 
> > provider done since the fork?  Admittedly there have not been many so
> > perhaps it's not an issue.
> 
> It was done around March 23, 2018. I'm pretty sure the code it is based on
> was from commit 49955d4, which is still the latest for the httpclient
> provider. But I didn't track that explicitly.

ok, thanks.   As I recall there haven't been many changes since then, so that should be easy.

> 
> > The use of OSGi services seems like a good idea.  I suppose that might
> > suggest a change in p2, although perhaps not.
> 
> Right now, the provider tracks a global HttpClient instance through its
> activator. That should work without changes for most clients, but I could
> imagine cases with special needs where more isolation would be needed. 
> 
> It would be easy to change this so we use one-off client instances again by
> default unless a named service is specifically requested. In that case, it
> should be completely backward-compatible.
> 
> > I expect we'll be able to get everything in place for 2019-6 release.  I'll
> > take a look at the source in bundle form.  Can you make a git repo (with
> > history since fork) available at some point?
> 
> The code is currently in our product repo, but I'll split it off and create
> a GitHub repo for it. I'll post the link once it's ready.

Ok, that would be great. 

> 
> > And now the big question:  are you interested/willing to contribute to
> > maintenance?...e.g. with new version of httpclient?
> 
> Sure.

Ok, terrific.  Are you going to be involved in the Orbit updates of httpclient?  These days, that's where most of the needs for new releases come from...as opposed to ECF filetransfer development.

BTW, do you know if this solves (on windows) the longstanding bug 422665 and/or bug 367960 ?

Adding Mat Booth for awareness and possible releng assistance.
Comment 4 Carsten Reckord CLA 2019-02-15 13:12:25 EST
(In reply to Scott Lewis from comment #3)
> If you have test cases that we could reuse then these could also be added to
> the ECF filetransfer tests.   One problem:  how to do platform-specific
> testing?   The existing filetransfer codebase (and in fact all of ECF) is
> pure-java/platform generic so there may be some releng work there.

I have to admit that the test story for the native stuff is an unsolved problem at this time. We've pretty much tested it manually for each of our product releases.

The problem is that you need 
1) a domain controller, 
2) a proxy server in the domain that does SPNEGO auth, and
3) a Windows workstation in the domain to run the tests on

1) and 2) can be implemented with a Linux Samba server e.g. in a Docker container managed by the build server, but for 3) you'll need a Windows VM or physical machine, which could then be used as a build node by Jenkins.

If that's something that's possible to set up in the CI environment at eclipse.org, then I can help with the setup and work on tests.

> Are you going to be involved in the Orbit updates of
> httpclient?  These days, that's where most of the needs for new releases
> come from...as opposed to ECF filetransfer development.

Yes. I'm already involved there since MPC consumes httpclient as well. I've just recently worked on the fix for the new 4.5.6 version.

> BTW, do you know if this solves (on windows) the longstanding bug 422665
> and/or bug 367960 ?

In my experience, it should. 

It worked for all our corporate clients so far, and I've used the same httpclient-win in MPC as well, with the result that folks suffering from this issue were able to browse the Marketplace fine, but then the P2 install failed.
Comment 5 Carsten Reckord CLA 2019-02-15 15:19:14 EST
(In reply to Scott Lewis from comment #3)
> > The code is currently in our product repo, but I'll split it off and create
> > a GitHub repo for it. I'll post the link once it's ready.
> 
> Ok, that would be great. 

I've just pushed https://github.com/YattaSolutions/ecf-filetransfer-httpclient45

I did a bit of cleanup while splitting the projects from our main repo, and I also added a few commits to the beginning of the history, recreating some of the transition from the original ECF httpclient4 code to ours.

There's some code reformatting going on, which makes a direct comparison a bit harder. I can try to rewrite the history to revert the format if you prefer. 

Otherwise, I'll do that once we move everything back into the org.eclipse.ecf namespace again for the contribution.

Also, right now the Tycho build won't work because the parent projects are missing. I'll add a simplified maven setup over the weekend or on Monday.
Comment 6 Scott Lewis CLA 2019-02-15 16:17:35 EST
(In reply to Carsten Reckord from comment #4)
<stuff deleted>

> > BTW, do you know if this solves (on windows) the longstanding bug 422665
> > and/or bug 367960 ?
> 
> In my experience, it should. 
> 
> It worked for all our corporate clients so far, and I've used the same
> httpclient-win in MPC as well, with the result that folks suffering from
> this issue were able to browse the Marketplace fine, but then the P2 install
> failed.

It's my understanding that MPC uses ECF filetransfer (and the same httpclient) that p2 does.   Does it make sense to you that MPC succeeds and P2 fails?
Comment 7 Carsten Reckord CLA 2019-02-15 16:26:47 EST
(In reply to Scott Lewis from comment #6)
> It's my understanding that MPC uses ECF filetransfer (and the same
> httpclient) that p2 does.   Does it make sense to you that MPC succeeds and
> P2 fails?

Actually, MPC added direct HttpClient use a while back for its REST calls to have direct access to HTTP methods and headers, just falling back to ECF if that fails.
Comment 8 Scott Lewis CLA 2019-02-15 16:57:41 EST
(In reply to Carsten Reckord from comment #7)
> (In reply to Scott Lewis from comment #6)
> > It's my understanding that MPC uses ECF filetransfer (and the same
> > httpclient) that p2 does.   Does it make sense to you that MPC succeeds and
> > P2 fails?
> 
> Actually, MPC added direct HttpClient use a while back for its REST calls to
> have direct access to HTTP methods and headers, just falling back to ECF if
> that fails.

They could have used ECF for headers and methods access, but ok.
Comment 9 Scott Lewis CLA 2019-03-18 21:40:28 EDT
(In reply to Carsten Reckord from comment #5)
<stuff deleted>
> I've just pushed
> https://github.com/YattaSolutions/ecf-filetransfer-httpclient45
> 
> I did a bit of cleanup while splitting the projects from our main repo, and
> I also added a few commits to the beginning of the history, recreating some
> of the transition from the original ECF httpclient4 code to ours.
> 
> There's some code reformatting going on, which makes a direct comparison a
> bit harder. I can try to rewrite the history to revert the format if you
> prefer. 

Hi Carsten.  Starting to work on this a little bit.

If you can make it easy to compare with existing provider it would be helpful...as I see there's a fair amount of existing code particularly in the de.yatta.ecf.provider.filetransfer.httpclient45 bundle


> 
> Otherwise, I'll do that once we move everything back into the
> org.eclipse.ecf namespace again for the contribution.
> 
> Also, right now the Tycho build won't work because the parent projects are
> missing. I'll add a simplified maven setup over the weekend or on Monday.

Is this simplified maven setup already there?

Just to be clear, I want to ask you if you expect that the de.yatta namespace will go away for both projects...and that they will both be under org.eclipse.ecf namespace?

I think as we begin this that we will have to figure out how to test this on win32, as up until now ECF doesn't have any platform-specific bundles, and so our testing of filetransfer has been easy for me to do locally.   I expect the platform, however, will insist that we have some platform-specific testing, and I don't have the personal resources to supply a testing environment for that.   P2 also has test code for filetransfer as I understand it, so it's possible that we might be able to piggy back on the platform's testing infrastructure.
Comment 10 Carsten Reckord CLA 2019-03-25 06:13:30 EDT
(In reply to Scott Lewis from comment #9)
> Just to be clear, I want to ask you if you expect that the de.yatta
> namespace will go away for both projects...and that they will both be under
> org.eclipse.ecf namespace?

Yes, I expect them both to move to org.eclipse.ecf

> If you can make it easy to compare with existing provider it would be
> helpful...as I see there's a fair amount of existing code particularly in
> the de.yatta.ecf.provider.filetransfer.httpclient45 bundle

I can do the rename back to org.eclipse.ecf and revert the change in code formatting. That should make the comparison with the equivalent classes in org.eclipse.ecf.provider.filetransfer.httpclient4 pretty straight forward.

I'll try to include a direct diff between the two as well.

I have a deadline tomorrow, but I'll work on it directly Wednesday morning.
 
> Is this simplified maven setup already there?

Not yet. I'll finish it on Wednesday as well.

> I think as we begin this that we will have to figure out how to test this on
> win32, as up until now ECF doesn't have any platform-specific bundles, and
> so our testing of filetransfer has been easy for me to do locally.   I
> expect the platform, however, will insist that we have some
> platform-specific testing, and I don't have the personal resources to supply
> a testing environment for that.   P2 also has test code for filetransfer as
> I understand it, so it's possible that we might be able to piggy back on the
> platform's testing infrastructure.

There's two parts to these tests: 

First we need domain controllers for NTLM/Kerberos authentication and domain proxies authenticating against them. This should be pretty straight-forward, because it can be done with Samba and Squid running in Docker.

The seconde part is trickier. We need Windows clients as members of these test domains. I don't think the existing test environments for Platform/P2 would work, mostly because they probably wouldn't appreciate the interference of us messing with their systems' network settings... 

Depending on the Windows infrastructure/licenses available to the Foundation, this should be possible with either Docker Enterprise on Windows Server (included in >=2016, also includes Windows licensing for any number of containers) or with dedicated Windows VMs (will need Windows licenses)(we could control those ourselves using e.g. Vagrant if we have a node to run them, or ask for pre-configured VMs).

I would suggest that I open a ticket against Infra to discuss this with Fred and loop Platform/P2 in on the ticket to check if their environments could be used.

But just to be clear: This only affects the optional Windows fragment (de.yatta.ecf.provider.filetransfer.httpclient45.win32). The base provider is still completely platform-independent and can be tested the same as before. So let's maybe start with that, and then get to the Windows fragment once we've figured out the testing story?
Comment 11 Carsten Reckord CLA 2019-03-25 06:53:14 EDT
(In reply to Carsten Reckord from comment #10)
> this should be possible with either Docker Enterprise on Windows
> Server

Looks like I spoke too soon on this one. According to [1], Windows containers cannot freely join  domains. So we would need some sort of VM setup...

[1] https://support.microsoft.com/en-us/help/4489234/support-policy-for-windows-containers-and-docker-on-premises#section-5
Comment 12 Scott Lewis CLA 2019-04-14 19:15:58 EDT
Ping.  I have a lot of other work commitments this spring, so timing might be starting to get a little tight on this...unless others can take on major pieces (e.g. setting up/doing win32-specific build, testing in proxy environments).
Comment 13 Carsten Reckord CLA 2019-04-17 11:38:47 EDT
My apologies. I just pushed a cleaned up version to GitHub that

- changes the naming to org.eclipse.ecf
- reformats the codebase with the original ECF settings for easier comparison
- adds a maven build, update site and target definition.

I also pushed a branch "ecf-compare", which contains the original ECF Httpclient4 code in place of my source files for easier comparison. You should be able to do a git diff ecf-compare from the master branch to get a good comparison of all the changes. (The ecf-compare branch does not compile, since I left all source files exactly as they were, including the different packages)

I'll open an infra ticket for the Windows test infrastructure as outlined in my previous comment. I'll also look into the actual platform-specific test setup. But let's maybe get started with the base provider bundle first?
Comment 14 Scott Lewis CLA 2019-04-19 01:50:46 EDT
(In reply to Carsten Reckord from comment #13)
> My apologies. I just pushed a cleaned up version to GitHub that
> 
> - changes the naming to org.eclipse.ecf
> - reformats the codebase with the original ECF settings for easier comparison
> - adds a maven build, update site and target definition.
> 
> I also pushed a branch "ecf-compare", which contains the original ECF
> Httpclient4 code in place of my source files for easier comparison. You
> should be able to do a git diff ecf-compare from the master branch to get a
> good comparison of all the changes. (The ecf-compare branch does not
> compile, since I left all source files exactly as they were, including the
> different packages)
> 
> I'll open an infra ticket for the Windows test infrastructure as outlined in
> my previous comment. I'll also look into the actual platform-specific test
> setup. But let's maybe get started with the base provider bundle first?

Hi Carsten...good news.  I was able to import your 45 projects and with only a very small change get it to successfully run the ECF filetransfer tests.

WRT the very small change...for reasons I don't understand, I was originally getting this stack trace on every test case:

java.lang.NoClassDefFoundError: org/apache/http/conn/socket/LayeredConnectionSocketFactory
	at org.eclipse.ecf.internal.provider.filetransfer.httpclient45.Activator.getHttpClientFactory(Activator.java:242)
	at org.eclipse.ecf.internal.provider.filetransfer.httpclient45.Activator.registerHttpClient(Activator.java:277)
	at org.eclipse.ecf.internal.provider.filetransfer.httpclient45.Activator.getRetrieveHttpClient(Activator.java:271)
	at org.eclipse.ecf.provider.filetransfer.httpclient45.HttpClientRetrieveFileTransferFactory.newInstance(HttpClientRetrieveFileTransferFactory.java:24)
	at org.eclipse.ecf.internal.provider.filetransfer.Activator.getFileTransfer(Activator.java:608)
	at org.eclipse.ecf.provider.filetransfer.retrieve.MultiProtocolRetrieveAdapter.sendRetrieveRequest(MultiProtocolRetrieveAdapter.java:92)
	at org.eclipse.ecf.tests.filetransfer.URLRetrieveTest.testReceive(URLRetrieveTest.java:132)
	at org.eclipse.ecf.tests.filetransfer.URLRetrieveTest.testReceiveNonCanonicalURLPath(URLRetrieveTest.java:209)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at junit.framework.TestCase.runTest(TestCase.java:176)
	at junit.framework.TestCase.runBare(TestCase.java:141)
	at org.eclipse.ecf.tests.filetransfer.AbstractFileTransferTestCase.runBare(AbstractFileTransferTestCase.java:38)
	at junit.framework.TestResult$1.protect(TestResult.java:122)
	at junit.framework.TestResult.runProtected(TestResult.java:142)
	at junit.framework.TestResult.run(TestResult.java:125)
	at junit.framework.TestCase.run(TestCase.java:129)
	at junit.framework.TestSuite.runTest(TestSuite.java:252)
	at junit.framework.TestSuite.run(TestSuite.java:247)
	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:121)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:538)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:760)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:460)
	at org.eclipse.pde.internal.junit.runtime.RemotePluginTestRunner.main(RemotePluginTestRunner.java:229)
	at org.eclipse.pde.internal.junit.runtime.CoreTestApplication.run(CoreTestApplication.java:24)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.eclipse.equinox.internal.app.EclipseAppContainer.callMethodWithException(EclipseAppContainer.java:587)
	at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:198)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:134)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:388)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:243)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:656)
	at org.eclipse.equinox.launcher.Main.basicRun(Main.java:592)
	at org.eclipse.equinox.launcher.Main.run(Main.java:1498)
	at org.eclipse.equinox.launcher.Main.main(Main.java:1471)
Caused by: java.lang.ClassNotFoundException: org.apache.http.conn.socket.LayeredConnectionSocketFactory cannot be found by org.eclipse.ecf.provider.filetransfer.httpclient45_0.1.0.qualifier
	at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:508)
	at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:419)
	at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:411)
	at org.eclipse.osgi.internal.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:150)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	... 46 more


Do you understand this?

In any event, if I add this Import-Package to org.eclipse.ecf.provider.filetransfer.httpclient45 manifest.mf

 org.apache.http.conn.socket;version="[4.5.0,5.0.0)",

The ECF tests run successfully except for the tests that attempt to interrupt the socket connect.   I imagine that would be expected given I think you said you took out the logic for the client connect customization that was there.

The only test class that fails is:  org.eclipse.ecf.tests.filetransfer.URLRetrieveTestCancelConnectJob

For next steps, I'm thinking that I should commit these projects to the ECF git repo in the appropriate existing directories:

org.eclipse.ecf.provider.filetransfer.httpclient45
org.eclipse.ecf.provider.filetransfer.httpclient45.win32
org.eclipse.ecf.provider.httpclient45.feature

One question:  Is there any .ssl fragment for the 45 provider?  For the 4 provider this fragment implements these interfaces: ISSLSocketFactoryModifier, INonconnectedSocketFactory

I'm not sure if these are needed for the 45 provider but they are needed to allow for creation of an appropriate sslsocket factory and socket.  Anyway, I'm not clear if it's needed or not for the 45 provider.

I think it probably makes sense to modify the existing .target configs to support the build of the 45 provider as a part of the ECF builder.

So does moving these projects to the ecf git repo work for you?  Before I do so, however, it will probably be necessary for you to put a copyright header in all of the classes in 45 and 45.win32.  Some of them should probably use the existing Composent, IBM copyright, but adding Yatta Solutions, the new ones should have just the Yatta Solutions header, and etc.

I expect I will also have to create a Contribution Questionnaire for the 45 provider and have the EF approve.   I think we should get the copyright notices all squared away before I submit the CQ, as I'll have to put the current source in CQ.

LMK if this all works for you/Yatta.

One other question:  What should be done with the 45.win32 bundle and the org.apache.httpcomponents.httpclient.win bundle for non-windows distributions?   Should they be included and just not resolve?  Or not included?   I/we will have to work this out with the platform releng folks I expect.

Thanks again to you and Yatta for the contribution.  Your additions and changes look very good and I'm looking forward to learning more about them.
Comment 15 Scott Lewis CLA 2019-04-22 22:51:02 EDT
Another question for Carsten:  In the httpclient45 feature are two bundles:  com.sun.jna and com.sun.jna.platform.  Are these required?  If so, what is their licensing status?  Thanks.
Comment 16 Carsten Reckord CLA 2019-04-23 06:35:20 EDT
(In reply to Scott Lewis from comment #14)
> Hi Carsten...good news.  I was able to import your 45 projects and with only
> a very small change get it to successfully run the ECF filetransfer tests.

That's great :)

> Do you understand this?
> 
> In any event, if I add this Import-Package to
> org.eclipse.ecf.provider.filetransfer.httpclient45 manifest.mf
> 
>  org.apache.http.conn.socket;version="[4.5.0,5.0.0)",

It looks like the transitive dependency from SSLConnectionSocketFactory to LayeredConnectionSocketFactory breaks somehow, which is strange because
1) I would have expected that to be internal to the httpclient bundle, which contains both of these classes
2) I've never seen this in production.

I also noticed that the code failed on line 242, which means that it didn't find the factory as a registered OSGi service and fell back to creating one on the fly. Maybe the OSGi environment isn't fully up during the test?

> The ECF tests run successfully except for the tests that attempt to
> interrupt the socket connect.   I imagine that would be expected given I
> think you said you took out the logic for the client connect customization
> that was there.

Yes, I took that out because directly manipulating the sockets is problematic in a connection pooling setting.

> The only test class that fails is: 
> org.eclipse.ecf.tests.filetransfer.URLRetrieveTestCancelConnectJob

I can look into that.

> For next steps, I'm thinking that I should commit these projects to the ECF
> git repo in the appropriate existing directories:
> 
> org.eclipse.ecf.provider.filetransfer.httpclient45
> org.eclipse.ecf.provider.filetransfer.httpclient45.win32
> org.eclipse.ecf.provider.httpclient45.feature
> 
> One question:  Is there any .ssl fragment for the 45 provider?  For the 4
> provider this fragment implements these interfaces:
> ISSLSocketFactoryModifier, INonconnectedSocketFactory

Right now, no. I didn't quite understand the use-case of the ssl fragment, so I left that out at first. Right now, the HttpClientDefaultSSLSocketFactoryModifier is used directly without implementing those interfaces. 

I can see providing a version of ISSLSocketFactoryModifier for clients to hook into. But the INonconnectedSocketFactory might be an issue, because that goes down to the socket layer, again potentially breaking the HttpClient's own management of that.

> I'm not sure if these are needed for the 45 provider but they are needed to
> allow for creation of an appropriate sslsocket factory and socket.  Anyway,
> I'm not clear if it's needed or not for the 45 provider.

The 45 provider right now provides a full working SSL stack out of the box. As I said, I didn't get the use case, but regular SSL/HTTPS should just work.

> I think it probably makes sense to modify the existing .target configs to
> support the build of the 45 provider as a part of the ECF builder.

Yes. You'll notice that I'm currently pulling in http://download.eclipse.org/mpc/httpclient/drops/I-builds/201902131947 to get the HttpClient bits. 

This is due to bug 514633 in Tycho, which won't let you consume platform-specific bits in your target platform directly. So I wrapped all the bundles in a feature with appropriate platform filters, which can be used in the TP without problem. The feature is not part of the distribution afterwards.

We can move that site over to ECF if you like.

> So does moving these projects to the ecf git repo work for you?  Before I do
> so, however, it will probably be necessary for you to put a copyright header
> in all of the classes in 45 and 45.win32.  Some of them should probably use
> the existing Composent, IBM copyright, but adding Yatta Solutions, the new
> ones should have just the Yatta Solutions header, and etc.

Sure, that works for me/us. I'll go over the copyright headers this evening.

> LMK if this all works for you/Yatta.

Yes, sounds great.

> One other question:  What should be done with the 45.win32 bundle and the
> org.apache.httpcomponents.httpclient.win bundle for non-windows
> distributions?   Should they be included and just not resolve?  Or not
> included?   I/we will have to work this out with the platform releng folks I
> expect.

I would suggest to move them to ECF as well to have all the code in place, and just leave them out of the feature (and possibly the update site) until the testing story is cleared.

(In reply to Scott Lewis from comment #15)
> Another question for Carsten:  In the httpclient45 feature are two bundles: 
> com.sun.jna and com.sun.jna.platform.  Are these required?  If so, what is
> their licensing status?  Thanks.

These are required by the httpclient.win library to access the Windows Domain Auth token API. 

They are approved and in Orbit, see https://dev.eclipse.org/ipzilla/show_bug.cgi?id=15797 and https://dev.eclipse.org/ipzilla/show_bug.cgi?id=15796. So you should be able to simply file a piggyback CQ.
Comment 17 Carsten Reckord CLA 2019-04-25 03:52:41 EDT
I've updated the copyright headers and plugin/feature copyright data. I hope they are in order this way. Let me know if anything needs to change about them.
Comment 18 Scott Lewis CLA 2019-04-29 23:04:01 EDT
Hi Carsten.  I've submitted a Code CQ for this contribution:

https://dev.eclipse.org/ipzilla/show_bug.cgi?id=19623

I put your name/company/email on the CQ so I would expect the EF folks to contact you.
Comment 19 Scott Lewis CLA 2019-05-02 17:32:22 EDT
The CQ associated with this contribution:

https://dev.eclipse.org/ipzilla/show_bug.cgi?id=19623

has been approved, and I've pushed these three projects:

org.eclipse.ecf.provider.httpclient45
org.eclipse.ecf.provider.httpclient45.win32
org.eclipse.ecf.provider.httpclient45.feature

to the ECF git repo with this commit:

https://git.eclipse.org/c/ecf/org.eclipse.ecf.git/commit/?id=24cd2e550dec5ccea26d403bf8eb2595c8114b19

note the bundles are in providers/bundles and the feature project is in releng/features.

There were only a few small changes from what was provided:

1) version changes from 0.1.0 to 1.0.0 in feature and bundle projects.
2) References changed to parent project
3) A security fix wrt handling https.protocols via bug 546896 recently applied to httpclient4 

Mat Booth and I will incorporate this into the ECF build for 3.14.5 release as soon as possible.   I'll leave this bug open until that's done.  

On behalf of ECF and the consumers, thanks again Carsten and Yatta for the contribution. 

We will work together to figure out how best to test the win32 proxying with the platform team and foundation.  

Carsten we may need your support over the next 1.5 months in case platform consumers find issues in some or all environments.  Thanksinadvance for that.   I think you are already on cross-projects-issues-dev mailing list.  Please jump in as needed when the platform usage and testing questions come up.
Comment 20 Scott Lewis CLA 2019-05-09 12:09:41 EDT
For everyone's info:

The ECF cbi builder is now building the httpclient45 provider with this build job:

https://ci.eclipse.org/ecf/job/ecf-tycho.master/

Currently all bundles are included in one feature:  org.eclipse.ecf.filetransfer.httpclient45.feature as Carsten originally defined.  This feature includes both the all-platform bundle:

org.eclipse.ecf.provider.filetransfer.httpclient45
depends upon:   org.apache.httpcomponents.httpclient 4.5.6 (Orbit)

and the windows-specific bundle:

org.eclipse.ecf.provider.filetransfer.httpclient45.win32

One question for platform releng:  How do you want to consume the httpclient45 provider?   I.e. is one feature created by Carsten sufficient?  Or should the windows-specific bundles be in a separate feature (e.g. org.eclipse.ecf.filetransfer.httpclient45.win32.feature?)

It's my understanding that Mat Booth is looking into this question with the platform releng team.
Comment 21 Scott Lewis CLA 2019-05-14 12:52:02 EDT
The ECF builder is now deploying the latest build (including httpclient45 provider, features, and deps) to snapshot area:

https://download.eclipse.org/rt/ecf/snapshot/

I've downloaded the zip, and installed things from the p2 repo (in site.p2 directory) and the shape of things looks good to me.

Carsten, would you please look over the shape of the p2 repo contents, and the zip contents (they should be the same) and verify that the new httpclient45 provider is complete and functioning as expected?   

Also:   could you and/or others engage in some testing behind firewalls that should be addressed by new code in httpclient45...and verify that eclipse/p2/filetransfer now works in those contexts?  Hopefully bug 422665 and bug 367960

Thanks again...and inadvance.
Comment 22 Mat Booth CLA 2019-05-15 10:39:30 EDT
(In reply to Scott Lewis from comment #20)
> One question for platform releng:  How do you want to consume the
> httpclient45 provider?   I.e. is one feature created by Carsten sufficient? 
> Or should the windows-specific bundles be in a separate feature (e.g.
> org.eclipse.ecf.filetransfer.httpclient45.win32.feature?)
> 
> It's my understanding that Mat Booth is looking into this question with the
> platform releng team.

As I indicated in an email, it is satisfactory to ship all the bundles in a single feature because the product assembly stage of the platform build automatically does the right thing about including platform-specific bundles.

If you want to see a test build of the Eclipse Platform containing the new ECF provider, I made one available here in case anyone would like to inspect the contents of the product and try it out:

https://fedorapeople.org/~mbooth/eclipse_new_ecf/
Comment 23 Scott Lewis CLA 2019-05-15 11:25:21 EDT
(In reply to Mat Booth from comment #22)
<stuff deleted>
> If you want to see a test build of the Eclipse Platform containing the new
> ECF provider, I made one available here in case anyone would like to inspect
> the contents of the product and try it out:
> 
> https://fedorapeople.org/~mbooth/eclipse_new_ecf/

I (Scott) will try this in my environment.

Hopefully Carsten (and Yatta) could do some testing with this test release and/or Eclipse M3.

I request that the platform + pmc folks and their employers consider assisting with this testing in corporate proxy environments. 
 
I would also suggest that some folks on bug 422665 and bug 367960 be recruited for community testing as well.
Comment 24 Mat Booth CLA 2019-05-16 16:26:09 EDT
I think we've got all the necessary platform changes in today and a I-build of the platform just completed that has the httpclient45 provider:

https://download.eclipse.org/eclipse/downloads/drops4/I20190516-1405/

Please give it a try!

The platform unit tests, when they complete, will appear here: https://download.eclipse.org/eclipse/downloads/drops4/I20190516-1405/testResults.php
Comment 25 Carsten Reckord CLA 2019-05-21 14:08:59 EDT
Hey Scott and Mat,

I'll be testing the provider in our test proxy environments asap. I'm currently having some trouble connecting to our Windows test domain from my home office, so I need to look into that first, but I'll update you here with the results in the next 1-2 days.
Comment 26 Mat Booth CLA 2019-05-22 07:02:36 EDT
(In reply to Carsten Reckord from comment #25)
> Hey Scott and Mat,
> 
> I'll be testing the provider in our test proxy environments asap. I'm
> currently having some trouble connecting to our Windows test domain from my
> home office, so I need to look into that first, but I'll update you here
> with the results in the next 1-2 days.

Thanks that's appreciated since I do not have access to any Windows environment.

I think we (platform) are pretty much done for M3, so if you see any problem, let's try to get it addressed for RC1 :-)
Comment 27 Ed Merks CLA 2019-05-27 23:21:23 EDT
As I mentioned on the cross projects mailing list, I've run into a few surprises using this new version of ECF's providers. 

One problem is that my timeouts weren't used so I had to add the ones that are used (without refering to the actual headers):

      Map<Object, Object> requestOptions = new HashMap<Object, Object>();
      requestOptions.put(IRetrieveFileTransferOptions.CONNECT_TIMEOUT, CONNECT_TIMEOUT);
      requestOptions.put("org.eclipse.ecf.provider.filetransfer.httpclient4.retrieve.connectTimeout", CONNECT_TIMEOUT);
      requestOptions.put(IRetrieveFileTransferOptions.READ_TIMEOUT, READ_TIMEOUT);
      requestOptions.put("org.eclipse.ecf.provider.filetransfer.httpclient4.retrieve.readTimeout", READ_TIMEOUT);

Does it really make sense that there are now two different options for setting this depending on which provider is registered?

The other problem is really my fault because of the horrible reflective hack for being able to control cookie processing, which is used to do form-based authentication, but I got that working with a new horrible hack.

The worst problem is the behavior of the connection pool.  I appears to be buggy to me.  This behavior is easily reproduced by the SetupArchiver.  It visits a very large number of resources and it appears to me that it very quickly gets to the point where the connection pool hangs because it is blocked to provide new leases.  I.e., all threads are blocked waiting for a lease (with a 120 second timeout) but no thread is actively using any connections.  Eventually these waiting threads timeout and fail to make a connection.   I see that ECF itself increases the numbers:

        builder.setMaxConnPerRoute(100);
        builder.setMaxConnTotal(300);

I tried setting them to 1000 and 10000 and then everything starts to work again.

So to my thinking there is some fundamental problem with the pool implementation or how the pool is used by the new provider.  I think such problems only become evident when one exceeds the limit numbers hard coded in the builder.

I can imagine p2 could well run into similar problems. In any case, the problem is easily reproducible with the SetupArchiver locally and that's easy to set up to run it's launcher.  One can switch the launcher to use or not use the new feature to see how it works in both cases and you can delete this line to reproduce again the behavior I'm seeing with the pool limits set to ECF's default settings:

    System.setProperty("oomph.setup.ecf.force.large.connection.pool", "true");
Comment 28 Dani Megert CLA 2019-05-28 10:12:16 EDT
On the mailing list I got contradictory information when I asked whether just shipping both httpClient versions would be enough:

Scott: I believe that the older provider (httpclient4) would resolve these issues as this provider has configurable timeouts...for connect and others.

Ed: To set the timeouts I did have to change code.

The latter indicates that clients will have to detect the problem and change their code.


Is there a chance to get a fix in httClient45 and/or ECF for 4.12 very soon? Note that the Platform has RC1 this week and plans to be done next week with RC2.
Comment 29 Scott Lewis CLA 2019-05-28 10:29:23 EDT
(In reply to Dani Megert from comment #28)
> On the mailing list I got contradictory information when I asked whether
> just shipping both httpClient versions would be enough:
> 
> Scott: I believe that the older provider (httpclient4) would resolve these
> issues as this provider has configurable timeouts...for connect and others.


I believe the httpclient4 provider would not have these issues.

If multiple filetransfer providers are present, individual providers may be disabled at start time via a system property 

https://wiki.eclipse.org/Disabling_Apache_Httpclient45

<stuff deleted>

> Is there a chance to get a fix in httClient45 and/or ECF for 4.12 very soon?

Carsten is likely needed to address the httpclient45 issues reported by Ed.
Comment 30 Dani Megert CLA 2019-05-28 10:50:36 EDT
(In reply to Scott Lewis from comment #29)
> I believe the httpclient4 provider would not have these issues.
> 
> If multiple filetransfer providers are present, individual providers may be
> disabled at start time via a system property 
> 
> https://wiki.eclipse.org/Disabling_Apache_Httpclient45
Yes, got that, but that's something the client has to do, right?
Comment 31 Dani Megert CLA 2019-05-28 11:16:53 EDT
We discussed this in today's PMC call. We will wait until next week for a fix. If there is no fix, we will ship ONLY the OLD version. That's the only way to make sure we don't break anyone.
Comment 32 Scott Lewis CLA 2019-05-28 12:05:33 EDT
(In reply to Dani Megert from comment #30)
> (In reply to Scott Lewis from comment #29)
> Yes, got that, but that's something the client has to do, right?

Yes of course.   FWIW, there is also a way to disable/enable filetransfer providers at runtime, but of course that requires addiitional code.

>We discussed this in today's PMC call. We will wait until next week for a fix. If there is no fix, we will ship ONLY the OLD version. That's the only way to make sure we don't break anyone.

For those that want the win32 proxy improvements included in the httpclient45 provider (this bug), I suggest that someone attempt get a hold of Carsten directly.
Comment 33 Scott Lewis CLA 2019-05-28 12:31:05 EDT
Ed and others can test against the (old) httpclient4 feature available here 

https://download.eclipse.org/rt/ecf/snapshot/site.p2

1) Install the httpclient4 feature from repo into eclipse
2) Disable httpclient45 provider as per https://wiki.eclipse.org/Disabling_Apache_Httpclient45
Comment 34 Ed Merks CLA 2019-05-28 14:26:12 EDT
I have two launchers. One that included the 45 version feature and one that didn't so I could easily compare the two.  I only saw the problem of "all thread waiting for a lease" with the 45 version so definitely the older version behaves as it has for years.

Hopefully Carsten reads his emails.  I.e., the only direct contact to him is via email and that's not different than posting to this Bugzilla I imagine.

I'm more than willing to help get to the bottom of this issue so we can continue to include it in the next release.
Comment 35 Carsten Reckord CLA 2019-05-29 02:52:23 EDT
I'll just copy my reply to cross-platform here:

> I think both issues (pool size and using the old options) should be easily 
> fixable. In fact, I would have expected using the old options to have worked, 
> since we did just that with our original code before bringing it into shape for 
> the eclipse.org contribution - I'll check up on that.
>
> I'm on the road today, but I have a free day tomorrow to look into it.

In the meantime, fwiw, Windows-specific tests behind a corporate firewall are looking good. But I'll be setting up a couple more different flavors of NTLM/SPNEGO authentication to check them out as well.
Comment 36 Ed Merks CLA 2019-05-29 03:19:27 EDT
If there's anything I can do to help, please don't hesitate to ask!  Just send an email or contact me on Skype; you're in my contact list.
Comment 37 Mat Booth CLA 2019-05-29 11:00:38 EDT
(In reply to Ed Merks from comment #27)
> In any case, the
> problem is easily reproducible with the SetupArchiver locally and that's
> easy to set up to run it's launcher.  One can switch the launcher to use or
> not use the new feature to see how it works in both cases and you can delete
> this line to reproduce again the behavior I'm seeing with the pool limits
> set to ECF's default settings:
> 
>     System.setProperty("oomph.setup.ecf.force.large.connection.pool",
> "true");

Ed, can you share more details about how can I can reproduce the problem locally?

Here's what I tried so far:

 * Clone latest head of oomph repo
 * Comment out the line from SetupArchiver that sets the "oomph.setup.ecf.force.large.connection.pool" property that you mention above
 * mvn clean verify
 * cd products/org.eclipse.oomph.setup.installer.product/target/products/org.eclipse.oomph.setup.installer.product/linux/gtk/x86_64
 * ./eclipse-inst -console -consoleLog
 * Try installing a product

I tried installing a couple of different products with no problem whatsoever There was a couple of slow mirrors (I see some retries scroll past in the log), but all products I tried so far installed in less than a minute.
Comment 38 Ed Merks CLA 2019-05-29 11:16:24 EDT
Matt,

It's generally easiest to use the instructions here to set up the environment:

https://ci.eclipse.org/oomph/

The target platform will have the latest platform Integration build in it.

There will already exist a Launcher to launch to launch SetupArchiver in the Eclipse Application category.  No need to do a Maven build.

You should already see this launcher in your manually configured environment (unless you did this all via the command line and not in an IDE).

Indeed the installer works okay, so I'm not really sure what exactly causes problems with the SetupArchiver. 

The SetupArchiver is a separate application that's used to crawl through all the gitc links so that no user ever directly uses gitc.

In the launcher you'll see the httpclient45 feature selected.  If you unselect it you can see the behavior with the old provider.  With that selected, you see the behavior of the new provider.  After a short while, all the threads will be blocked.


If you also comment out these lines in ECFURIHandlerImpl the timeout will be very long so you'll see the blocked threads blocked on same URIs for a long time.

      requestOptions.put(IRetrieveFileTransferOptions.CONNECT_TIMEOUT, CONNECT_TIMEOUT);
      requestOptions.put("org.eclipse.ecf.provider.filetransfer.httpclient4.retrieve.connectTimeout", CONNECT_TIMEOUT);
      requestOptions.put(IRetrieveFileTransferOptions.READ_TIMEOUT, READ_TIMEOUT);
      requestOptions.put("org.eclipse.ecf.provider.filetransfer.httpclient4.retrieve.readTimeout", READ_TIMEOUT);
Comment 39 Mat Booth CLA 2019-05-29 11:26:47 EDT
(In reply to Ed Merks from comment #38)
> The SetupArchiver is a separate application

Aha, that's the vital piece of information I was missing... Thanks I'll try again
Comment 40 Ed Merks CLA 2019-05-29 11:36:57 EDT
Thanks for looking into this!

One thought I had was maybe somehow the connections are not closed and hence the lease isn't released.

Certainly p2 downloads lots of artifacts too and somehow the problem seems different.

And I do see leases being released during the setup archiving process so you'd think the pool wouldn't just fill up.
Comment 41 Eclipse Genie CLA 2019-05-29 12:52:03 EDT
New Gerrit change created: https://git.eclipse.org/r/143032
Comment 42 Mat Booth CLA 2019-05-29 13:01:46 EDT
(In reply to Eclipse Genie from comment #41)
> New Gerrit change created: https://git.eclipse.org/r/143032

Hi Ed, this change attempts to fix the problem with the timeout properties, one of the timeout values was being erroneously ignored, please try this build to confirm:

https://ci.eclipse.org/ecf/job/ecf-tycho.gerrit-2019-03/20/artifact/releng/org.eclipse.ecf.releng.repository/target/repository/

(i.e. you should no longer have to specify the provider-specific timeout properties)
Comment 44 Scott Lewis CLA 2019-05-29 15:33:16 EDT
(In reply to Mat Booth from comment #42)
> (In reply to Eclipse Genie from comment #41)
> > New Gerrit change created: https://git.eclipse.org/r/143032
> 
> Hi Ed, this change attempts to fix the problem with the timeout properties,
> one of the timeout values was being erroneously ignored, please try this
> build to confirm:
> 
> https://ci.eclipse.org/ecf/job/ecf-tycho.gerrit-2019-03/20/artifact/releng/
> org.eclipse.ecf.releng.repository/target/repository/
> 
> (i.e. you should no longer have to specify the provider-specific timeout
> properties)

This gerrit change was merged and the builder should start shortly. 

Just to be clear:   Mat is this the only change necessary to address Ed's connection issues?  Or is there more/other work remaining for Carsten and/or others?
Comment 45 Scott Lewis CLA 2019-05-29 16:00:20 EDT
(In reply to Eclipse Genie from comment #43)
> Gerrit change https://git.eclipse.org/r/143032 was merged to [master].
> Commit:
> http://git.eclipse.org/c/ecf/org.eclipse.ecf.git/commit/
> ?id=4eef8c9b66f035b9b520da7fbabccd0ed4530210

ECF build including this change available here

https://download.eclipse.org/rt/ecf/snapshot/site.p2

the only change in this build was the above commit to the httpclient45 provider
Comment 46 Mat Booth CLA 2019-05-29 18:05:15 EDT
(In reply to Scott Lewis from comment #44)
> (In reply to Mat Booth from comment #42)
> > (In reply to Eclipse Genie from comment #41)
> > > New Gerrit change created: https://git.eclipse.org/r/143032
> > 
> > Hi Ed, this change attempts to fix the problem with the timeout properties,
> > one of the timeout values was being erroneously ignored, please try this
> > build to confirm:
> > 
> > https://ci.eclipse.org/ecf/job/ecf-tycho.gerrit-2019-03/20/artifact/releng/
> > org.eclipse.ecf.releng.repository/target/repository/
> > 
> > (i.e. you should no longer have to specify the provider-specific timeout
> > properties)
> 
> This gerrit change was merged and the builder should start shortly. 
> 
> Just to be clear:   Mat is this the only change necessary to address Ed's
> connection issues?  Or is there more/other work remaining for Carsten and/or
> others?

I'm still looking at the other connection starvation problem, Carsten should look too when he's back online since he's more familiar with the code.
Comment 47 Ed Merks CLA 2019-05-30 04:34:38 EDT
Created attachment 278779 [details]
Filtered logging information

I have an environment setup that includes ECF directly from the Git repository, so it sees all the latest commits and includes the EMF source. 

In fact it's the environment produced for 

https://wiki.eclipse.org/Eclipse_Platform_SDK_Provisioning

but with Oomph's setup added as well.

So it's really all the latest of everything. 

I've included these options in the SetupArchive launch configuration's VM arguments:

-Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.SimpleLog
-Dorg.apache.commons.logging.simplelog.showdatetime=true
-Dorg.apache.commons.logging.simplelog.log.org.apache.http.impl.conn=DEBUG
-Dorg.apache.commons.logging.simplelog.log.org.apache.http.impl.client=DEBUG
-Dorg.apache.commons.logging.simplelog.log.org.apache.http.client=DEBUG

So now I get a trace of what the http client is doing.  It's a lot of information.  But I've filtered it to shows the tracing for the the leasing and releasing of connections in the attached file.

It's clear that there are way more leases than releases and it's clear the problem kicks in when this happens:

2019/05/30 10:19:12:611 CEST [DEBUG] PoolingHttpClientConnectionManager - Connection leased: [id: 206][route: {}->http://git.eclipse.org:80][total kept alive: 1; route allocated: 100 of 100; total allocated: 207 of 300]
2019/05/30 10:19:13:779 CEST [DEBUG] PoolingHttpClientConnectionManager - Connection leased: [id: 260][route: {s}->https://raw.githubusercontent.com:443][total kept alive: 1; route allocated: 100 of 100; total allocated: 261 of 300]


I.e., when each of these routes reaches its maximum number of allowed per-route connections.

Now I need to understand how and when the releases happen and why these not happening often enough.
Comment 48 Ed Merks CLA 2019-05-30 05:30:01 EDT
Here's what I've noticed so far.  When a org.apache.http.client.entity.LazyDecompressingInputStream is used to read the http entity, reading that stream to the end, does *not* result in org.apache.http.conn.EofSensorInputStream.checkEOF(int) being called. The only time I see releases of leases is when EofSensorInputStream is used directly without being wrapped by a LazyDecompressingInputStream.  Then the file is read to the end by org.eclipse.ecf.provider.filetransfer.retrieve.AbstractRetrieveFileTransfer.fileTransferRunnable.new IFileTransferRunnable() {...}.performFileTransfer(IProgressMonitor) and the lease is released.  Otherwise the lease is not released and closing the stream has no effect because it's wrapped org.eclipse.ecf.provider.filetransfer.httpclient45.HttpClientRetrieveFileTransfer.NoCloseWrapperInputStream which makes close a no-op.

If I change (just to test), the wrapper to look like this:

	// Added to address bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=389292
	class NoCloseWrapperInputStream extends FilterInputStream {

		protected NoCloseWrapperInputStream(InputStream in) {
			super(in);
		}

		@Override
		public void close() throws IOException {
			super.close();
			// do nothing
		}
	}

Then connections are released properly and all works properly.  So this seems like the fundamental problem.  At some point during in the finally block of org.eclipse.ecf.provider.filetransfer.retrieve.AbstractRetrieveFileTransfer.fileTransferRunnable.new IFileTransferRunnable() {...}.performFileTransfer(IProgressMonitor) it must properly ensure that the stream is closed (and mostly that the deeply wrapped underlying EofSensorInputStream is closed because only then is the lease released.
Comment 49 Eclipse Genie CLA 2019-05-30 05:56:57 EDT
New Gerrit change created: https://git.eclipse.org/r/143057
Comment 50 Mat Booth CLA 2019-05-30 07:24:19 EDT
(In reply to Eclipse Genie from comment #49)
> New Gerrit change created: https://git.eclipse.org/r/143057

Nice catch! I believe I am able to confirm your fix works for me too. I also tested pausing and resuming downloads and that also continues to work for me with your fix applied.
Comment 51 Ed Merks CLA 2019-05-30 07:44:41 EDT
I think the only change smaller than moving one character with in a line is to delete/add one character. :-P

I think this also explains why the problem was not more obviously seen elsewhere in p2's usage. I.e, most of the content that p2 downloads is already compressed so it would be silly for the transport layer to send it via a compression algorithm.  But with the SetupArchiver, most of the resources are XML so it makes good sense for the transport layer to transport them in compressed form.
Comment 52 Carsten Reckord CLA 2019-05-30 07:58:24 EDT
I'm currently looking into the hardClose() method and the changes in Ed's patch. 

I think I got fooled by the pre-existing code for aborting the getMethod. Looking at the cases in which hardClose() is called, the httpResponse should really always be closed.

I still have to run through some more tests though.
Comment 53 Ed Merks CLA 2019-05-30 08:35:45 EDT
It seems to me that the loop can terminate because of isPaused becomes true and when I look at the commit history, there was a problem with that which is why the no-close stream was added (because readInputStream's close is called unconditionally).  So I don't know the logic here very well, but I imagine that if the loop terminates because of a "pause" that at some point one unpauses it and wants to continue reading from the connection. So I think there must be a good reason not to always close the connection, i.e., when isPaused is true.  I imagine the extra done check is because if you're paused by already done, then there's actually nothing left to read...

But you guys must understand all this better than me. :-P
Comment 54 Carsten Reckord CLA 2019-05-30 09:28:15 EDT
> It seems to me that the loop can terminate because of isPaused becomes true and 
> when I look at the commit history, there was a problem with that which is why 
> the no-close stream was added (because readInputStream's close is called 
> unconditionally). 

Yes. But the reason I'm dubious about the !isDone() && isPaused() / isDone() && !isPaused() logic looking at it now is that either we're in the case where we are already done or really want to close the connection for good (i.e. cancel() ), in which case, we should obviously close the connection. 

Or we want to pause and resume. But the resume code will create a new getMethod and open a new connection with a new httpResponse anyway (see openStreamsForResume() ).
Comment 55 Ed Merks CLA 2019-05-30 10:31:59 EDT
Hmmm, then why the none closing stream which was added to deal with pause. I guess Scott knows that history better.

In any case, I don't use pause and I just want the streams to close and the connection to be released. :-P
Comment 57 Scott Lewis CLA 2019-05-30 11:33:57 EDT
(In reply to Eclipse Genie from comment #56)
> Gerrit change https://git.eclipse.org/r/143057 was merged to [master].
> Commit:
> http://git.eclipse.org/c/ecf/org.eclipse.ecf.git/commit/
> ?id=a75749442339274ac12d30b98536462303bb73b4

New snapshot build including this commit from Ed now here:

https://download.eclipse.org/rt/ecf/snapshot/site.p2
Comment 58 Mat Booth CLA 2019-05-30 12:34:32 EDT
(In reply to Scott Lewis from comment #57)
> (In reply to Eclipse Genie from comment #56)
> > Gerrit change https://git.eclipse.org/r/143057 was merged to [master].
> > Commit:
> > http://git.eclipse.org/c/ecf/org.eclipse.ecf.git/commit/
> > ?id=a75749442339274ac12d30b98536462303bb73b4
> 
> New snapshot build including this commit from Ed now here:
> 
> https://download.eclipse.org/rt/ecf/snapshot/site.p2

Thanks for merging, the platform will consume this build in the next I-build.

Hopefully that's done it :-)
Comment 59 Carsten Reckord CLA 2019-05-30 13:14:12 EDT
Actually, I have a couple more changes in a moment...
Comment 60 Eclipse Genie CLA 2019-05-30 14:12:09 EDT
New Gerrit change created: https://git.eclipse.org/r/143079
Comment 61 Eclipse Genie CLA 2019-05-30 14:12:13 EDT
New Gerrit change created: https://git.eclipse.org/r/143078
Comment 62 Eclipse Genie CLA 2019-05-30 14:12:16 EDT
New Gerrit change created: https://git.eclipse.org/r/143080
Comment 63 Carsten Reckord CLA 2019-05-30 14:27:39 EDT
(In reply to Eclipse Genie from comment #61)
> New Gerrit change created: https://git.eclipse.org/r/143078

I noticed during testing that even after completing the transfer, the HttpResponse should actually stay around to get information out of it.

(In reply to Eclipse Genie from comment #60)
> New Gerrit change created: https://git.eclipse.org/r/143079

This is the important one. I believe that even with Ed's fix, we're keeping connections around too often (i.e. in the paused case). I'll just copy the commit message here, which explains the change:

> Make sure connection is closed in hardClose().
>
> There are four cases to consider:
>
> 1. Transfer is complete
> 2. Transfer was cancelled
> 3. There was an exception during transfer
> 4. The transfer was paused
>
> In case 3, hardClose() is called from cancel() directly, in the other
> cases it gets called from the finally block in performFileTransfer().
>
> Cases 1-3 are easy: In case 1, the stream encountered EOF, isDone() ==
> true, and the connection was already closed as a result. In case 2
> and 3, it is obvious that we want the connection closed as a result.
>
> In case 4, we might want to resume later, but when resuming, we are
> actually creating a new HttpGet request and get a new HttpResponse as a
> result. So we also want to close the previous one when pausing.

I've also made some changes to the URLRetrievePauseResumeTest to cover the case for which the previous conditions were introduced. As it was, it never actually paused anything successfully when I ran it locally, because either the download wasn't started yet when checking if it was pausable, skipping the whole pause() scenario, or the download was already completed when calling pause().

As it stands now, the test asserts that it goes through all the expected phases, and the downloaded file is large enough that it should work with today's faster connections.

(In reply to Eclipse Genie from comment #62)
> New Gerrit change created: https://git.eclipse.org/r/143080

In case there are other unexpected usecases out there that run into problems with the shared HttpClient instance, I added a property to disable it and go back to the "one transfer, one client" behavior.
Comment 64 Carsten Reckord CLA 2019-05-30 14:40:13 EDT
(In reply to Eclipse Genie from comment #60)
> New Gerrit change created: https://git.eclipse.org/r/143079

Forgot to mention: Scott, as the author of the original code for bug 389292, could you have a specially close look at this change? 

Between my manual tests and the fixed URLRetrievePauseResumeTest, and my banging my head against the removed conditions for quite some time, I feel pretty good about the change, but four eyes are better than two ;)

(In reply to Ed Merks from comment #55)
> Hmmm, then why the none closing stream which was added to deal with pause. I
> guess Scott knows that history better.

The non-closing stream was introduced because before we get to the hardClose(), AbstractRetrieveFileTransfer.performFileTransfer() actually calls close() on the response's InputStream. But instead of just closing the stream immediately, this actually blocks until all content is read from the connection. With the non-closing stream, we actually reach hardClose() immediately and can do the right thing, which is aborting the request.

You can read more about it in bug 370801 comment 5 and 6. However, unfortunately that doesn't explain how the condition in hardClose() came to be...
Comment 68 Carsten Reckord CLA 2019-05-30 14:51:06 EDT
To recap Ed's issues:

> One problem is that my timeouts weren't used so I had to add the ones that are
> used (without refering to the actual headers)

Mat fixed the missing ConnectionRequestTimeout configuration. Also, all of the options should be configurable with both the general IRetrieveFileTransferOptions and the provider-specific ones (the httpclient45 provider uses the same ones as the httpclient4 one, so there's no need to adapt those in clients).

> The other problem is really my fault because of the horrible reflective hack 
> for being able to control cookie processing, which is used to do form-based 
> authentication, but I got that working with a new horrible hack.

Ed, could you open a separate bug for this? I would like to support this cleanly through options on the Filetransfer, which should be fairly easy to do.

> The worst problem is the behavior of the connection pool.  I appears to be
> buggy to me. 

The connection starvation issue should be fixed now as well. So hopefully that is it now :)

I'll be adding httpclient45-specific unit tests for the connection pool to ensure that this doesn't get broken accidentally in the future, but it might take me until the weekend.
Comment 69 Eclipse Genie CLA 2019-05-30 15:48:14 EDT
New Gerrit change created: https://git.eclipse.org/r/143084
Comment 70 Carsten Reckord CLA 2019-05-30 15:49:33 EDT
(In reply to Eclipse Genie from comment #69)
> New Gerrit change created: https://git.eclipse.org/r/143084

Sorry, one more - I changed the shared client setting to false for some testing and accidentally committed that...
Comment 72 Scott Lewis CLA 2019-05-30 16:07:03 EDT
(In reply to Eclipse Genie from comment #71)
> Gerrit change https://git.eclipse.org/r/143084 was merged to [master].
> Commit:
> http://git.eclipse.org/c/ecf/org.eclipse.ecf.git/commit/
> ?id=64d72dbbd754c38e9402e1f0342f982f258c03c8

Merged...the snapshot build is underway (about 15 minutes).

>Forgot to mention: Scott, as the author of the original code for bug 389292, could you have a specially close look at this change? 

I did look at this.   I can't remember writing this code myself...and it may have been contributed via patch with me committing.  

In any case, I don't see a problem with it as long as the methods called pass all test cases, and never throw exceptions...under any/all possible conditions.  If they did, that could cause a problem.
Comment 73 Scott Lewis CLA 2019-05-30 16:38:04 EDT
Snapshot with all of latest from Ed, Mat, and Carsten available here

https://download.eclipse.org/rt/ecf/snapshot/site.p2
Comment 74 Ed Merks CLA 2019-05-30 23:37:18 EDT
I tested locally in a development environment pulling all the changes from Git and that all works well, but I don't think these changes are in the platform's 4.12-I-Build yet; maybe later today it will be, or tomorrow...
Comment 75 Sravan Kumar Lakkimsetti CLA 2019-05-31 00:28:50 EDT
(In reply to Ed Merks from comment #74)
> I tested locally in a development environment pulling all the changes from
> Git and that all works well, but I don't think these changes are in the
> platform's 4.12-I-Build yet; maybe later today it will be, or tomorrow...

We are completing our RC1 today. We will pick up these in the next I-build (later today)
Comment 76 Dani Megert CLA 2019-05-31 02:46:03 EDT
(In reply to Sravan Kumar Lakkimsetti from comment #75)
> (In reply to Ed Merks from comment #74)
> > I tested locally in a development environment pulling all the changes from
> > Git and that all works well, but I don't think these changes are in the
> > platform's 4.12-I-Build yet; maybe later today it will be, or tomorrow...
> 
> We are completing our RC1 today. We will pick up these in the next I-build
> (later today)
I discussed this with Sravan and Manoj. We will deliver an RC1a today with only that change.
Comment 77 Ed Merks CLA 2019-05-31 08:26:24 EDT
I updated the target platform in my Oomph development environment with the latest I-build, removed all my hacks for this problem, and I can confirm that it works like a charm again.

Thanks for the quick turn around on this to everyone involved!!
Comment 78 Scott Lewis CLA 2019-06-04 13:29:32 EDT
Mat and I have created a final release build for ECF 3.14.5 with repo here:

https://download.eclipse.org/rt/ecf/3.14.5/site.p2

This has the same bits that were contributed to RC1a and will be used for ECF's contribution to 2016-06.

Thanks(inadvance) to Carsten/Yatta for the httpclient45 provider contribution and the support for use by oomph and platform.

Thanks(inadvance) to Ed and others for testing and reporting issues.

I will keep my fingers crossed that this new provider will help people wrt proxy handling for bug 422665 and bug 367960.  

I would suggest that the platform publicize this addition so that people who have been previously stymied will give 2019-06 a try.

Resolving this issue.
Comment 79 Dani Megert CLA 2019-06-05 04:00:01 EDT
(In reply to Scott Lewis from comment #78)
> I would suggest that the platform publicize this addition so that people who
> have been previously stymied will give 2019-06 a try.
I suggest that either you or Carsten add an entry to our New & Noteworthy page:

Git repo: ssh://git.eclipse.org/gitroot/www.eclipse.org/eclipse/news.git
Gerrit: ssh://git.eclipse.org:29418/www.eclipse.org/eclipse/news.git
Live website: https://www.eclipse.org/eclipse/news/4.12/
Comment 80 Mat Booth CLA 2019-06-05 17:11:24 EDT
(In reply to Dani Megert from comment #79)
> (In reply to Scott Lewis from comment #78)
> > I would suggest that the platform publicize this addition so that people who
> > have been previously stymied will give 2019-06 a try.
> I suggest that either you or Carsten add an entry to our New & Noteworthy
> page:
> 
> Git repo: ssh://git.eclipse.org/gitroot/www.eclipse.org/eclipse/news.git
> Gerrit: ssh://git.eclipse.org:29418/www.eclipse.org/eclipse/news.git
> Live website: https://www.eclipse.org/eclipse/news/4.12/

Carsten can you take this? You are probably familiar with which use-cases are most improved :-)
Comment 81 Carsten Reckord CLA 2019-06-07 16:03:53 EDT
(In reply to Mat Booth from comment #80)
> > Git repo: ssh://git.eclipse.org/gitroot/www.eclipse.org/eclipse/news.git
> > Gerrit: ssh://git.eclipse.org:29418/www.eclipse.org/eclipse/news.git
> > Live website: https://www.eclipse.org/eclipse/news/4.12/
> 
> Carsten can you take this? You are probably familiar with which use-cases
> are most improved :-)

Sure, I'll take a stab. But would probably be nice for a native speaker to look it over afterwards ;)
Comment 82 Carsten Reckord CLA 2019-06-07 16:10:27 EDT
Hm, where would be the proper place for "infrastructure" stuff like this? 

Is the  "New features in the Platform and Equinox" section the right place? It's not really a flashy new feature for people to behold - except maybe for those whose proxy connections work now :)

And would I create an ECF section or would this fall under the Equinox/P2 umbrella for the platform?
Comment 83 Dani Megert CLA 2019-06-08 03:26:07 EDT
(In reply to Carsten Reckord from comment #82)
> Hm, where would be the proper place for "infrastructure" stuff like this? 
> 
> Is the  "New features in the Platform and Equinox" section the right place?
> It's not really a flashy new feature for people to behold - except maybe for
> those whose proxy connections work now :)
> 
> And would I create an ECF section or would this fall under the Equinox/P2
> umbrella for the platform?
I would put it in 'Platform and Equinox API' (platform_isv.php). You can mention ECF in the entry but don't add a separate section for it.
Comment 84 Torbjörn Svensson CLA 2019-10-13 14:27:06 EDT
@Carsten Reckord: In bug 551818 I tried to fix broken links in the about.html file in the ECF plugins. While doing that, I noticed that the about.html file for the org.eclipse.ecf.provider.filetransfer.httpclient45.win32 plugin states that it contains third party software but as far as I can tell, it simply uses whatever is available within the target platform.

Can these third party clauses be removed from the about.html file without any legal issues?

I'm no legal expert, but if that section needs to be in the about.html, then how far out the tree of dependencies for a certain plugin do one need to go to collect all license to list? I mean, why would 1 level be enough here? I don't see any difference in walking 1 level or 10 levels of dependencies...

If the sections are needed in the about.html, then the links needs to work and currently, the link to ASL__VERSION_2.html (about_files/ASL__VERSION_2.html) is broken.