Build Identifier: jetty-hightide-7.1.5.v20100705

When using Unicode characters in a virtual host name, the server fails to handle requests. I've taken a working webapp and changed the virtual host entry to one with Unicode characters in it. Browsing to the URL, I get a long connecting period before the browser returns its default error page.

Reproducible: Always

Steps to Reproduce:
1. Add a virtual host name with Unicode characters, such as the Swedish åäö.
2. Let the context deploy.
3. Browse to the URL.
Kjell,

Were there any debug log messages on the server side? Also, can you capture a tcpdump (e.g. with wireshark) of the request so we can see what bytes were transmitted for the URL?

Note that URLs are really supposed to be ASCII only, with non-ASCII characters percent-encoded. The actual character encoding of those bytes probably depends on your browser settings. So your browser may be encoding it as Unicode, but Jetty's default is UTF-8. So, if you can see that the request is going to be arriving in some other encoding, you can either call request.setCharacterEncoding() before you read any content, or you can set the org.eclipse.jetty.util.UrlEncoding.charset system property.

It's probably worthwhile having a read of this wiki page as background: http://docs.codehaus.org/display/JETTY/International+Characters+and+Character+Encodings

Finally, if nothing is working, then capture the wireshark/tcpdump trace and attach it to this bug report so we can take a look at what is actually on the wire.

thanks
Jan
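To illustrate the encoding mismatch described above, here is a minimal sketch using only the JDK's java.net.URLEncoder and java.net.URLDecoder. It is not Jetty code: the class name EncodingMismatchDemo and its helper methods are hypothetical, and the sample string is simply the Swedish characters from the report. It shows that the same characters produce different percent-encoded bytes depending on the charset, and that decoding with the wrong charset does not round-trip.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Jetty.
class EncodingMismatchDemo {

    // Percent-encode text using the named charset.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decode percent-encoded text using the named charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u00e5\u00e4\u00f6"; // the Swedish letters from the report

        String asUtf8 = encode(original, "UTF-8");
        String asLatin1 = encode(original, "ISO-8859-1");
        System.out.println(asUtf8);   // %C3%A5%C3%A4%C3%B6
        System.out.println(asLatin1); // %E5%E4%F6

        // A decoder that assumes UTF-8 only round-trips the UTF-8 form.
        System.out.println(original.equals(decode(asUtf8, "UTF-8")));   // true
        System.out.println(original.equals(decode(asLatin1, "UTF-8"))); // false
    }
}
```

If the wire trace shows the second form, the client is not sending UTF-8, which is the case where setting the charset explicitly would matter.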
Ah yes! UTF-8 names are now allowed! Can you capture a tcpdump or wireshark trace of the actual bytes being sent by your browser? thanks
Created attachment 173628: tcpdump file
Added a tcpdump file. Use case: I browsed to localhost:8080 and got the 404 page showing the list of contexts (1). I note that the context string shown in the browser has the correct format. I then clicked on the context and waited for the browser to time out. I'm using Chrome version 5.0.375.99, but I get a similar response when using Firefox. I'm guessing it's RFC 3490 support that is failing, or me doing something totally wrong. Is this RFC officially supported?
Hi Kjell,

Firstly, here are some good links for information on international characters in domain names:

http://www.chromium.org/developers/design-documents/idn-in-google-chrome
http://en.wikipedia.org/wiki/Punycode
http://tools.ietf.org/html/rfc3492
http://unicode.org/faq/idn.html

In a nutshell, when you enter a URL with non-ASCII characters in the hostname, the browser will "punycode" it to an ASCII representation. This ASCII representation must be configured into your DNS service, and also *as the virtual host* for jetty.

For example, say I have the domain www.åäö.com and I'm running a webapp on port 8080 at context /test. The URL I type into my browser is:

    http://www.åäö.com:8080/test/

The browser translates this to the ASCII equivalent:

    http://www.xn--4cab6c.com:8080/test/

If www.åäö.com is a virtual host, then I would configure its ASCII equivalent in the context xml file for the context:

    <Configure class="org.eclipse.jetty.webapp.WebAppContext">
      <Set name="contextPath">/</Set>
      <Set name="war"><SystemProperty name="jetty.home" default="."/>/webapps/test.war</Set>
      <Set name="virtualHosts">
        <Array type="String">
          <Item>www.xn--4cab6c.com</Item>
        </Array>
      </Set>
    </Configure>

Now, as I have no webapp deployed at /, if I hit http://www.åäö.com:8080/, jetty's default handler will show me the virtual host www.xn--4cab6c.com. Clicking on the link provided will take me to my webapp.

I think it would be nicer if you could configure the original form of the hostname in the jetty config files, rather than the punycoded form - it's friendlier :) So I'm changing this issue to an enhancement.

cheers
Jan
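To find the ASCII form to put in the virtualHosts entry, the JDK's java.net.IDN class can do the conversion. The following is a minimal sketch, not part of Jetty: the class name PunycodeHostDemo and its helper methods are hypothetical, and the hostname is the example from the comment above, written with Unicode escapes so the source file's own encoding does not matter.

```java
import java.net.IDN;

// Hypothetical helper class, not part of Jetty.
class PunycodeHostDemo {

    // Convert a Unicode hostname to the ASCII (punycode) form a browser sends.
    static String toAsciiHost(String unicodeHost) {
        return IDN.toASCII(unicodeHost);
    }

    // Convert an ASCII (punycode) hostname back to its Unicode form.
    static String toUnicodeHost(String asciiHost) {
        return IDN.toUnicode(asciiHost);
    }

    public static void main(String[] args) {
        String unicodeHost = "www.\u00e5\u00e4\u00f6.com"; // www.åäö.com

        // This is the value to configure as the virtual host.
        String asciiHost = toAsciiHost(unicodeHost);
        System.out.println(asciiHost); // www.xn--4cab6c.com

        // Converting back recovers the original name.
        System.out.println(unicodeHost.equals(toUnicodeHost(asciiHost))); // true
    }
}
```

Only labels containing non-ASCII characters are rewritten; "www" and "com" pass through unchanged.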
Well this has been open for years as an enhancement and there seems to be zero demand for it. I'm going to close it. If anyone is desperately keen for it, then please reopen and attach your code contribution to implement it :) Jan