Bug 229982 - cvs, extssh, checkout dies with msg: Error: Unknown response received from cvs server
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: CVS
Version: 3.3.2
Hardware: PC Linux
Importance: P3 major
Target Milestone: 3.4 RC1
Assignee: platform-cvs-inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords: greatbug
Depends on:
Blocks:
 
Reported: 2008-05-02 11:37 EDT by John Mudd CLA
Modified: 2011-09-19 04:52 EDT

See Also:
Michael.Valenta: review+


Attachments
screen shot of error (69.15 KB, image/jpeg)
2008-05-02 11:37 EDT, John Mudd CLA
a patch for o.e.team.cvs.ssh2 (1.56 KB, patch)
2008-05-05 11:29 EDT, Atsuhiko Yamanaka CLA
screen shot of patch attempt (78.87 KB, image/jpeg)
2008-05-06 11:23 EDT, John Mudd CLA
a patch for o.e.team.cvs.ssh2 (2.12 KB, patch)
2008-05-07 13:10 EDT, Atsuhiko Yamanaka CLA
a patch for o.e.team.cvs.ssh2 (1009 bytes, patch)
2008-05-08 03:04 EDT, Atsuhiko Yamanaka CLA
a patch for o.e.team.cvs.ssh2 (2.54 KB, patch)
2008-05-08 12:37 EDT, Atsuhiko Yamanaka CLA

Description John Mudd CLA 2008-05-02 11:37:08 EDT
Created attachment 98451 [details]
screen shot of error

Build ID: M20080221-1800

Steps To Reproduce:
1. Try to checkout my source tree.
2. It makes progress downloading files for a few minutes, then fails.



More information:
What's special about my environment is that network access to the cvs server is throttled. I get fast access for a second or two, then about a 60-second pause. These two modes repeat to throttle overall access. It's when access resumes that Eclipse tends to report the "Error: Unknown response received from cvs server" msg and the checkout dies. But not always. Sometimes access resumes and Eclipse does not complain. Note: I have no problem using the cvs client to perform the same checkout.

The throttle is controlled by the customer.  There is no way we can get that lifted.  In their defense, it does not interfere with the standard cvs client program.

The problem happens consistently in that I can reproduce it easily in a few minutes.  It's inconsistent in that there is no pattern as far as which source file will fail to download.

Originally Eclipse timed out during the slow checkout.  I increased the CVS timeout preference in Eclipse.  Now we don't time out but get the "Error: Unknown response received from cvs server" termination.

Our work around is to restart by selecting Team/Update.  We have to do this many times.  Eventually we get all the files.  But a secondary problem is that the Team/Update does not resume correctly for large files that were partially downloaded.  I'm not worried about the partially checked out files.  I'm trying to get a fix for the "Error: Unknown response received from cvs server"  situation that causes the checkout to be interrupted.  

After having experienced this myself I'm reluctant to use Eclipse to check-in my work.

This error seems similar to bug 161292.  I'm also using extssh and the cvs server is a Solaris machine.  

I'm available to work with Eclipse developers to get this resolved.  Let me know how to proceed.
Comment 1 Atsuhiko Yamanaka CLA 2008-05-03 20:41:55 EDT
Is it possible to try 3.4M7[1] to find where the problem is?

According to the message "Unknown response received ...",
it seems to me that ssh connections have been established successfully,
but failed to exec "cvs server" command on the remote.
How about the version number[2] of CVS system on the remote?


> This error seems similar to bug 161292.  I'm also using extssh and the cvs
> server is a Solaris machine.  

May I ask you the version of that Solaris machine?
Through the customer support for jsch to our enterprise users,
we have learned and recognized that 'Solaris9' has included buggy sshd.

[1] http://download.eclipse.org/eclipse/downloads/drops/S-3.4M7-200805020100/index.php
[2] http://wiki.eclipse.org/CVS_FAQ#What_server_versions_of_CVS_are_supported_by_Eclipse.3F
Comment 2 John Mudd CLA 2008-05-03 23:45:30 EDT
CVS Server:

$ uname -a
SunOS hotline 5.10 Generic sun4u sparc SUNW,Netra-440

$ /usr/lib/ssh/sshd -?  
sshd: illegal option -- ?
sshd version Sun_SSH_1.1

$ cvs -v
Concurrent Versions System (CVS) 1.11.20 (client/server)




Here's the client I use at home that seems to work okay:

$ cvs -version
Concurrent Versions System (CVS) 1.12.13 (client/server)
Comment 3 Atsuhiko Yamanaka CLA 2008-05-04 07:38:56 EDT
Thank you for prompt feedback.

(In reply to comment #2)
> CVS Server:
> $ uname -a
> SunOS hotline 5.10 Generic sun4u sparc SUNW,Netra-440
> $ /usr/lib/ssh/sshd -?  
> sshd: illegal option -- ?
> sshd version Sun_SSH_1.1

It seems the bug I mentioned at comment #1 is not there.
This is off-topic, but if there is a bug in sshd,
you will get two lines from 
  $ strings /usr/lib/ssh/sshd | grep channel_still_open

> $ cvs -v
> Concurrent Versions System (CVS) 1.11.20 (client/server)
> Here's the client I use at home that seems to work okay:
> $ cvs -version
> Concurrent Versions System (CVS) 1.12.13 (client/server)

It seems there is no problem about CVS version.

Do you have also the same problem from home?
Comment 4 John Mudd CLA 2008-05-04 11:11:10 EDT
I ran the strings command twice, once with redirect.  I just wanted to let you know that there is a difference depending how it's run.  I take it that the bug is not present.

$ strings /usr/lib/ssh/sshd | grep channel_still_open
channel_still_open: bad channel type %d
$ strings < /usr/lib/ssh/sshd | grep channel_still_open
channel_still_open
channel_still_open: bad channel type %d
$ 


> Do you have also the same problem from home?

Well, I run both Eclipse and standard cvs client (cvs) from home.  The problem is always there via Eclipse.  I've never seen the cvs command have a problem, even when checking out the entire tree which takes hours due to the network throttling.  
 
Comment 5 Atsuhiko Yamanaka CLA 2008-05-05 01:15:38 EDT
(In reply to comment #4)
> > Do you have also the same problem from home?
> Well, I run both Eclipse and standard cvs client (cvs) from home.  The problem
> is always there via Eclipse.  I've never seen the cvs command have a problem,
> even when checking out the entire tree which takes hours due to the network
> throttling.  

Does it take hours?
Can you try pserver connection?  
It seems that the problem has not come from ssh connection, IMHO.
I guess that org.eclipse.team.cvs.core plug-in has just dropped the cvs
communication due to the timeout.

By the way, is it impossible to try Eclipse 3.4M7?
Comment 6 John Mudd CLA 2008-05-05 10:00:20 EDT
> Does it take hours?

Yes, with either Eclipse or the command line cvs client.  Due to network throttling.


> Can you try pserver connection?

I'm not familiar with that but I'll look into it.


> I guess that org.eclipse.team.cvs.core plug-in has just dropped the cvs
> communication due to the timeout.

The original error was a cvs timeout.  I increased the Eclipse preference under Window, Preferences, Team, CVS, Connection, Connection timeout(s) from 60 to 6000.  That stopped the explicit timeout errors.  Now it complains of an unexpected response.  Do you think the unexpected response was due to some other internal timeout?


> is it impossible to try Eclipse 3.4M7?

Sorry, I wasn't sure how strongly that was being suggested.  I installed it just now and the problem seems the same as before.  No difference.  

    Eclipse SDK
    Version: 3.4.0
    Build id: I20080502-0100
Comment 7 John Mudd CLA 2008-05-05 10:30:20 EDT
> Can you try pserver connection?

No, it's not an option.  
Comment 8 Atsuhiko Yamanaka CLA 2008-05-05 11:23:43 EDT
(In reply to comment #6)
> The original error was a cvs timeout.  I increased the Eclipse preference under
> Window, Preferences, Team, CVS, Connection, Connection timeout(s) from 60 to
> 6000.  That stopped the explicit timeout errors.

You can set it to '0' for no timeout.
Comment 9 Atsuhiko Yamanaka CLA 2008-05-05 11:29:24 EDT
Created attachment 98646 [details]
a patch for o.e.team.cvs.ssh2

(In reply to comment #6)
> Sorry, I wasn't sure how strongly that was being suggested.  I installed it
> just now and the problem seems the same as before.  No difference.  
>     Eclipse SDK
>     Version: 3.4.0
>     Build id: I20080502-0100

Ok, now you can use Eclipse 3.4M7.

May I ask you to try the attached patch for org.eclipse.team.cvs.ssh2 plug-in,
which is available at
  :pserver:anonymous@dev.eclipse.org:/cvsroot/eclipse/org.eclipse.team.cvs.ssh2
Comment 10 John Mudd CLA 2008-05-05 15:40:05 EDT
I checked out the patch using this command.

$ cvs -d :pserver:anonymous@dev.eclipse.org:/cvsroot/eclipse co org.eclipse.team.cvs.ssh2

That gives me the source for the patch.  How do I apply it to Eclipse?
Comment 11 Atsuhiko Yamanaka CLA 2008-05-06 08:58:43 EDT
(In reply to comment #10)
> I checked out the patch using this command.
> $ cvs -d :pserver:anonymous@dev.eclipse.org:/cvsroot/eclipse co
> org.eclipse.team.cvs.ssh2
> That gives me the source for the patch.  How do I apply it to Eclipse?

After checking out that plug-in on Eclipse 3.4M7,
right-click on 'org.eclipse.team.cvs.ssh2' in Package Explorer
  Team > Apply Patch...

To try the patched plug-in, 
  Run > Run As > Eclipse Application
Comment 12 John Mudd CLA 2008-05-06 11:23:28 EDT
Created attachment 98857 [details]
screen shot of patch attempt
Comment 13 John Mudd CLA 2008-05-06 11:24:13 EDT
> After checking out that plug-in on Eclipse 3.4M7,

I'm not sure I did this correctly.  It does show up now in my Package Explorer.

> right-click on 'org.eclipse.team.cvs.ssh2' in Package Explorer
>  Team > Apply Patch...

I right-clicked on 'org.eclipse.team.cvs.ssh2' in Package Explorer and selected Team/Apply Patch.  It asks for a Patch Input Specification file.  Am I on the right track?  I've attached a screen shot.
Comment 14 Atsuhiko Yamanaka CLA 2008-05-06 18:05:19 EDT
(In reply to comment #13)
> > After checking out that plug-in on Eclipse 3.4M7,
> I'm not sure I did this correctly.  It does show up now in my Package Explorer.
> > right-click on 'org.eclipse.team.cvs.ssh2' in Package Explorer
> >  Team > Apply Patch...
> I right-clicked on 'org.eclipse.team.cvs.ssh2' in Package Explorer and selected
> Team/Apply Patch.  It asks for a Patch Input Specification file.  Am I on the
> right track?  I've attached a screen shot.

You are on the right track.
Choose "File" and put the path for patch file attached at comment #9.
Comment 15 John Mudd CLA 2008-05-07 11:03:33 EDT
> You are on the right track.
> Choose "File" and put the path for patch file attached at comment #9.

Thanks, that much worked and I was able to run using Run > Run As > Eclipse Application as you suggested.  

Same results though.  In two tests the cvs download failed after the first (60 second) network pause, just as data started to flow again.  In a couple of tests the cvs download made it past the first network resume but failed on the second resume, a little over two minutes into the run.  In one case the download failed after the third resume, a little over three minutes into the run.  In each case it ends with the same "Error: Unknown response received from cvs server: " msg.  This is the same behavior I saw before.

Here's how the errors appear in the .metadata/.log file.

!ENTRY org.eclipse.team.cvs.core 4 -4 2008-05-07 10:44:37.787
!MESSAGE Unknown response received from cvs server:

!ENTRY org.eclipse.team.cvs.core 4 -4 2008-05-07 11:01:07.657
!MESSAGE Unknown response received from cvs server:
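
For anyone sifting a long .metadata/.log for these failures, here is a minimal Java sketch that pulls the severity and code out of an `!ENTRY` header. It assumes the header shape is `!ENTRY <plugin-id> <severity> <code> <date> <time>`, which matches the entries above. The class and method names (`LogEntrySketch`, `severityAndCode`) are made up for illustration and are not part of Eclipse.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogEntrySketch {
    // Assumed header shape: !ENTRY <plugin-id> <severity> <code> <date> <time>
    private static final Pattern ENTRY =
        Pattern.compile("!ENTRY (\\S+) (-?\\d+) (-?\\d+) (\\S+) (\\S+)");

    /** Returns {severity, code} for an !ENTRY header line, or null if the line is not one. */
    static int[] severityAndCode(String line) {
        Matcher m = ENTRY.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3)) };
    }

    public static void main(String[] args) {
        // Header copied from the log excerpt in this comment.
        int[] r = severityAndCode(
            "!ENTRY org.eclipse.team.cvs.core 4 -4 2008-05-07 10:44:37.787");
        System.out.println("severity=" + r[0] + " code=" + r[1]);
    }
}
```

The "Unknown response" entries above carry code -4, so filtering on that value is one possible way to separate them from other errors in the same log.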
Comment 16 Atsuhiko Yamanaka CLA 2008-05-07 11:12:52 EDT
(In reply to comment #15)
> > You are on the right track.
> > Choose "File" and put the path for patch file attached at comment #9.
> Thanks, that much worked and I was able to run using Run > Run As > Eclipse
> Application as you suggested.  
> Same results though.  In two tests the cvs download failed after the first (60
> second) network pause, just as data started to flow again. 

Have you changed the value for "Connection timeout(s)" at
  Window > Preferences... > Team > CVS > Connection
in this testing?  "Run > Run As > Eclipse Application" will use a separate workspace,
so I guess that "60" has been specified.

Comment 17 John Mudd CLA 2008-05-07 11:46:50 EDT
> Have you changed the value for "Connection timeout(s)" at
>   Window > Preferences... > Team > CVS > Connection
> in this testing?

Yes, I had to increase it from the default of 60.  I've used large numbers and 0 as suggested.  Both work to disable the timeout.  Here's a sample timeout I get if I use the default 60 sec limit.  It's very different from the "Unknown response" error.

!ENTRY org.eclipse.team.cvs.core 4 4 2008-05-07 10:32:28.583
!MESSAGE An error occurred checking out imei: Problem writing resource '/imei/docs/CingularIMEIDocs.zip'. Timeout while reading from input stream
!STACK 1
org.eclipse.team.internal.ccvs.core.CVSException: Problem writing resource '/imei/docs/CingularIMEIDocs.zip'. Timeout while reading from input stream
    at org.eclipse.team.internal.ccvs.core.CVSException.wrapException(CVSException.java:57)



I just had an interesting test run.  The network throttle changed a bit, gave me better throughput by using 30 second sleeps instead of the 60 second sleeps.  Eclipse did not fail and ran until the download completed, a little over eight minutes.  There were still network pauses but only 30 seconds long instead of 60 seconds.  It's as if there's another 60 second timeout hardcoded somewhere in Eclipse or the network code.  30 second pauses don't cause trouble but a full 60 second pause causes the "Unknown response" condition.  That's just my observation, I don't know the code. 

The network throttle behavior is not under my control but this explains something I heard from other Eclipse users.  They wait until the middle of the night to download their cvs because that's when the throttle eases up.  Sometimes. I just tried to repeat it and it's back to 60 second pauses.
Comment 18 Atsuhiko Yamanaka CLA 2008-05-07 13:10:30 EDT
Created attachment 99120 [details]
a patch for o.e.team.cvs.ssh2

May I ask you to try the attached patch again?

This patch is for CVS HEAD, so you need to delete
"org.eclipse.team.cvs.ssh2" on Package Explorer and
check out the source from cvs server again.
Comment 19 John Mudd CLA 2008-05-07 13:46:34 EDT
> May I ask you to try the attached patch again?

I deleted org.eclipse.team.cvs.ssh2 from my Package Explorer.  I was given the option to "delete all source" but I didn't use that option.

I'll try to install the new patch but, honestly, I don't know how I managed to get the previous one into the Package Explorer.  I got a clue from something I Googled before.  I'll look for that again but this may take a while.  Let me know if you have instructions.  
Comment 20 John Mudd CLA 2008-05-07 15:11:12 EDT
I have the patch running now.

New / Project / CVS / Project from CVS
Use pserver:anonymous@dev.eclipse.org:/cvsroot/eclipse
Next
Module name: org.eclipse.team.cvs.ssh2
Finish
The resource org.eclipse.team.cvs.ssh2 exists & will be deleted.  Proceed?
Yes

Download new patch file to disk
Right-click on 'org.eclipse.team.cvs.ssh2' in Package Explorer
  Team > Apply Patch
  Select patch file from disk

Run > Run As > Eclipse Application

So far it's successfully performing the cvs download despite what appear to be long (60 sec) network pauses.  I'll post more info later.
Comment 21 John Mudd CLA 2008-05-07 16:40:40 EDT
I've been running cvs downloads for 90 minutes now and still no errors.  I've timed the network pauses and they are still 60 seconds long so this looks like a good fix to me.
Comment 22 Atsuhiko Yamanaka CLA 2008-05-08 03:04:23 EDT
Created attachment 99227 [details]
a patch for o.e.team.cvs.ssh2

May I ask you to try the attached patch, which is simpler than
the previous one?  If there is no problem, I will commit it to CVS HEAD.

The problem was that the keep-alive interval had been hard-coded.
As a fix, the CVS timeout value is now used instead, but a patch for bug 222178
may be a better solution.
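
To make the idea concrete, here is a minimal Java sketch of deriving the keep-alive interval from the CVS timeout preference instead of using a fixed constant. This is not the attached patch: the class name, the method name, and the 60-second constant are all assumptions chosen for illustration.

```java
final class KeepAliveSketch {
    // Hypothetical fixed value; the thread does not state the actual hard-coded interval.
    static final int FIXED_INTERVAL_MILLIS = 60_000;

    /**
     * Converts the "Connection timeout (s)" preference into a keep-alive
     * interval in milliseconds. A value of 0 is passed through as 0, so the
     * caller still has to decide what "no timeout" means for keep-alives.
     */
    static int keepAliveIntervalMillis(int cvsTimeoutSeconds) {
        if (cvsTimeoutSeconds < 0) {
            throw new IllegalArgumentException("timeout must be >= 0");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(cvsTimeoutSeconds, 1000);
    }

    public static void main(String[] args) {
        for (int seconds : new int[] { 60, 6000, 100000 }) {
            System.out.println(seconds + " s -> " + keepAliveIntervalMillis(seconds) + " ms"
                + " (fixed sketch value: " + FIXED_INTERVAL_MILLIS + " ms)");
        }
    }
}
```

With a derived interval, raising the preference also raises the keep-alive interval, which a fixed constant cannot do.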
Comment 23 John Mudd CLA 2008-05-08 12:04:52 EDT
The new (2008-05-08 03:04 -0400) patch works.  I let it run for an hour and it tolerates 60 second pauses in network traffic with no errors and no timeouts.  

But there is a minor change from the previous patch.  It now only works if I set the Window/Preferences/Team/CVS/Connection/Connection Timeout to a large number such as 100000.  If I set it to 0 (zero) then I get the following timeout error as soon as there is a pause in network traffic.

!ENTRY org.eclipse.team.cvs.core 4 4 2008-05-08 11:01:44.638
!MESSAGE An error occurred checking out imei: Problem writing resource '/imei/openwave/mdmEventRE.tar.gz'. Timeout while reading from input stream
!STACK 1
org.eclipse.team.internal.ccvs.core.CVSException: Problem writing resource '/imei/openwave/mdmEventRE.tar.gz'. Timeout while reading from input stream
    at org.eclipse.team.internal.ccvs.core.CVSException.wrapException(CVSException.java:57)

Just to be clear, this is not a timeout after 60 seconds of no network traffic.  This timeout seems to happen the instant there is a pause in traffic.  I get the error message less than a second after I see the light go out on my Network Monitor applet icon.  Again, this only happens if I set the timeout preference to zero.  

Another detail is that the CVS Update screen displays as soon as I select Team/Update.  But the download would sometimes be delayed for up to a minute due to the network throttling.  I do not see timeouts during this startup period even when I set the timeout preference to zero.  I only see timeout after the files start to come across the network and then there's a network pause.  And only when I set the timeout preference to zero.
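
One plausible way a zero preference can produce an instant timeout is a check that treats 0 as a real limit rather than as "no timeout". The Java sketch below illustrates that pitfall only; it is a guess at the kind of logic involved, not code from the plug-in or the patch, and both method names are hypothetical.

```java
final class ZeroTimeoutSketch {
    /** Naive check: with timeoutSeconds == 0, any idle time counts as a timeout. */
    static boolean naiveTimedOut(long idleMillis, int timeoutSeconds) {
        return idleMillis >= timeoutSeconds * 1000L;
    }

    /** Guarded check: 0 means "never time out"; other values are a limit in seconds. */
    static boolean timedOut(long idleMillis, int timeoutSeconds) {
        if (timeoutSeconds == 0) {
            return false;
        }
        return idleMillis >= timeoutSeconds * 1000L;
    }

    public static void main(String[] args) {
        long idle = 500; // half a second without data
        System.out.println("naive, timeout=0:   " + naiveTimedOut(idle, 0));
        System.out.println("guarded, timeout=0: " + timedOut(idle, 0));
        System.out.println("guarded, 60 s limit after 61 s idle: " + timedOut(61_000, 60));
    }
}
```

This would fit the observation that a large value such as 100000 works while 0 fails as soon as traffic pauses, although it does not by itself explain why the startup delay is tolerated.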
Comment 24 Atsuhiko Yamanaka CLA 2008-05-08 12:37:51 EDT
Created attachment 99321 [details]
a patch for o.e.team.cvs.ssh2

Please try this patch with zero timeout.
I hope this is the last patch.

Thank you for your cooperation.
Comment 25 John Mudd CLA 2008-05-08 16:21:29 EDT
Attachment (id=99321) works well with timeout preference set to zero or 100000.  No errors.

Thanks!
Comment 26 Atsuhiko Yamanaka CLA 2008-05-08 22:09:39 EDT
The fix has been committed to CVS HEAD.
Somebody, please change the status of this entry to FIXED.
Comment 27 Michael Valenta CLA 2008-05-09 09:47:13 EDT
I believe the process is to get all patches approved for RC1. The patch looks good to me so marking as fixed and approved. And I must say I enjoyed following the progress of this bug as a good example of what open source is all about. Thanks guys!
Comment 28 Szymon Brandys CLA 2008-05-19 11:41:59 EDT
(In reply to comment #27)
> I believe the process is to get all patches approved for RC1. 

True. Now that we are in RC2, we should be careful because we need 2 extra committers to approve a change.

(In reply to comment #27)
> And I must say I enjoyed following
> the progress of this bug as a good example of what open source is all about.
> Thanks guys!

That's why I added the greatbug keyword.

John, could you pick up RC1 (when ready), double check and change the bug status  to VERIFIED?

Comment 29 John Mudd CLA 2008-05-19 13:34:41 EDT
> John, could you pick up RC1 (when ready), double check and change the bug
> status  to VERIFIED?

Yes, I assume I wait until I see 3.4RC1 listed here:
http://download.eclipse.org/eclipse/downloads/
Comment 30 Szymon Brandys CLA 2008-05-20 04:04:47 EDT
You can use I20080516-1333. This is RC1.
Comment 31 John Mudd CLA 2008-05-20 11:13:44 EDT
I'm running now with RC1.

Eclipse SDK
Version: 3.4.0
Build id: I20080516-1333

I retested with timeout set to zero and 100000.  The fix still works.
Comment 32 Szymon Brandys CLA 2008-05-20 11:34:00 EDT
Thanks John.