Re: [jetty-dev] URIUtils performance

Hi,

I had the same idea (an array of the indexes of each '/').
The "check for '.' and return" step needs more study: if the method is used to get the canonical path of "files"
like /css/style.css or /images/bg.png, the check is more of a penalty than an optimization.
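
A minimal sketch of one way around this, not code from URIUtils: instead of testing for any '.' character, test for a whole "." or ".." segment, so that paths such as /css/style.css still take the early return. The class and method names are hypothetical.

```java
final class DotSegments {
    private DotSegments() {}

    /** Returns true if some '/'-delimited segment is exactly "." or "..". */
    static boolean hasDotSegment(String path) {
        final int length = path.length();
        int start = 0; // index of the first character of the current segment
        while (start < length) {
            int end = path.indexOf('/', start);
            if (end < 0) {
                end = length; // last segment has no trailing '/'
            }
            final int len = end - start;
            if (len == 1 && path.charAt(start) == '.') {
                return true;
            }
            if (len == 2 && path.charAt(start) == '.' && path.charAt(start + 1) == '.') {
                return true;
            }
            start = end + 1; // skip past the separator
        }
        return false;
    }
}
```

A caller could return the path unchanged whenever hasDotSegment is false. Whether this beats a plain '.' test depends on the real mix of paths, so it would need to be benchmarked.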

I will check the call stack of canonicalPath to be sure.

Do you have access to a (big) list of URIs (or paths) from a large, real website, so that we can benchmark against an unbiased list of paths?
I don't want to optimize for ../../././  :)

Regards,

Guillaume
 


2016-04-06 1:05 GMT+02:00 Greg Wilkins <gregw@xxxxxxxxxxx>:

Guillaume,

Thanks for testing this. Yes, we definitely want a faster implementation, even if it is all protected by a check for '.'.

I'm in two minds. One part of me wants to revert to an approach that builds the StringBuffer as it parses the URI; I'm sure it can be done with less complex code. The other part of me thinks I don't really care about clients that send .. in URIs, but on the other hand I don't want ../../../.. to be used as a DoS attack vector on a server.

So now that we have a clean implementation, I think we should treat it as a specification from which to create a faster impl (the previous version was so unclear that it could not serve as a specification).

I'm thinking that an initial pass to count the number of "/" characters would be a good start.  If the count is 0, we return. Otherwise, allocate a fixed-size array to hold the indexes of the start of each segment.  Then make another pass through the URI, looking for segment start values to set in the array and checking for . and .. segments, which will then update the segment array.
A final sweep through the segment array then generates the final URI.

So this could be done with the allocation of a single array and single StringBuffer, plus 2 iterations through the URI characters and 1 iteration through the segments.
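
The passes described above can be sketched as follows. This is only an illustration under assumptions, not Jetty's implementation: the class name SegmentCanonicalizer is hypothetical, a StringBuilder stands in for the StringBuffer, the single int array stores a start/end pair for each kept segment, each segment keeps its trailing '/', the early return is on "no '.' seen" rather than on a zero slash count, and null is returned when a ".." would climb above the start of the path.

```java
final class SegmentCanonicalizer {
    private SegmentCanonicalizer() {}

    /** Resolves "." and ".." segments; returns null if ".." climbs above the start. */
    static String canonicalPath(String path) {
        if (path == null || path.isEmpty()) {
            return path;
        }
        final int length = path.length();

        // Pass 1: count separators (to size the array) and look for any '.'.
        int slashes = 0;
        boolean sawDot = false;
        for (int i = 0; i < length; i++) {
            final char c = path.charAt(i);
            if (c == '/') {
                slashes++;
            } else if (c == '.') {
                sawDot = true;
            }
        }
        if (!sawDot) {
            return path; // nothing to resolve
        }

        // kept[2*k] and kept[2*k+1] hold the start and end (exclusive) of kept segment k.
        final int[] kept = new int[2 * (slashes + 1)];
        int count = 0;
        int start = 0;

        // Pass 2: find segment boundaries and apply "." and ".." to the array.
        for (int i = 0; i <= length; i++) {
            final boolean atEnd = (i == length);
            if (!atEnd && path.charAt(i) != '/') {
                continue;
            }
            final int nameLen = i - start;
            if (atEnd && nameLen == 0) {
                break; // path ended with '/', no trailing segment
            }
            if (nameLen == 1 && path.charAt(start) == '.') {
                // "." segment: drop it
            } else if (nameLen == 2 && path.charAt(start) == '.' && path.charAt(start + 1) == '.') {
                // ".." segment: drop it and the previous kept segment,
                // but never pop the leading "/" of an absolute path
                if (count == 0 || (kept[2 * (count - 1)] == 0 && path.charAt(0) == '/')) {
                    return null;
                }
                count--;
            } else {
                kept[2 * count] = start;
                kept[2 * count + 1] = atEnd ? length : i + 1; // include the '/'
                count++;
            }
            start = i + 1;
        }

        // Pass 3: sweep the kept segments to build the result.
        final StringBuilder out = new StringBuilder(length);
        for (int k = 0; k < count; k++) {
            out.append(path, kept[2 * k], kept[2 * k + 1]);
        }
        return out.toString();
    }
}
```

With these assumptions, a path such as /a/../b comes out as /b, and /../a is rejected with null. The array is bounded by the slash count, so a long run of ../ costs one int pair per segment at most.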

cheers





On 6 April 2016 at 06:28, Guillaume Maillard <guillaume.maillard@xxxxxxxxx> wrote:
Hi,

The new version of URIUtils.canonicalPath is "cleaner", but my JMH benchmark shows:
- previous implementation : 12 Mops/s
- new implementation : 1 Mops/s

By replacing :
List<String> directories = new ArrayList<>();
Collections.addAll(directories, __PATH_SPLIT.split(path));

by a faster split method like :
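// Requires java.util.ArrayList and java.util.List.
// Splits after each separator; every segment keeps its trailing separator.
public static final List<String> fastSplit(final String string, final char sep) {
    final List<String> l = new ArrayList<String>();
    final int length = string.length();
    final char[] cars = string.toCharArray();
    int rfirst = 0; // start index of the current segment
    for (int i = 0; i < length; i++) {
        if (cars[i] == sep) {
            // +1 so that the separator at index i is included
            l.add(new String(cars, rfirst, i - rfirst + 1));
            rfirst = i + 1;
        }
    }

    // Remaining characters after the last separator, if any.
    // The count is length - rfirst; adding 1 would run past the end of the array.
    if (rfirst < length) {
        l.add(new String(cars, rfirst, length - rfirst));
    }
    return l;
}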

performance is 3 Mops/s.

A faster iterator can boost performance too. Are you interested in such improvements?
If yes, I have some hours to spend on it.

Best regards,

Guillaume


_______________________________________________
jetty-dev mailing list
jetty-dev@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-dev


