This is a clone of jdt Bug 571614 for platform.resources. In LocalFile, getCanonicalPath() is used to test whether two files are physically the same. The same check can be done with Files.isSameFile(), which is much faster.
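For illustration, the two ways of checking for the same physical file can be sketched as follows. This is a minimal sketch, not the actual LocalFile code: the class name SameFileCheck and both method names are hypothetical, and only the java.io.File and java.nio.file.Files calls are real API.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper for illustration; not the code in LocalFile.
class SameFileCheck {

    /** Old approach: resolve both canonical paths and compare the strings. */
    static boolean sameByCanonicalPath(File a, File b) throws IOException {
        return a.getCanonicalPath().equals(b.getCanonicalPath());
    }

    /** Proposed approach: ask the file system whether both paths locate the same file. */
    static boolean sameByIsSameFile(File a, File b) throws IOException {
        // May throw NoSuchFileException if the paths differ and one file does not exist.
        return Files.isSameFile(a.toPath(), b.toPath());
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("same", ".tmp");
        f.deleteOnExit();
        // Same file, reached through a redundant "." path segment.
        File alias = new File(f.getParentFile(), "." + File.separator + f.getName());
        System.out.println(sameByCanonicalPath(f, alias));
        System.out.println(sameByIsSameFile(f, alias));
    }
}
```

Note that Files.isSameFile() can throw when the paths differ and one of the files is missing, so a caller would need to handle a non-existing destination separately.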
New Gerrit change created: https://git.eclipse.org/r/c/platform/eclipse.platform.resources/+/180307
New Gerrit change created: https://git.eclipse.org/r/c/platform/eclipse.platform.resources/+/180308
Performance comparison on Windows:
java.io.File.getCanonicalPath ~ 170 us * 2 (needed for both files!)
java.nio.file.Files.isSameFile ~ 54 us (opens both files in the worst case)
java.io.File.exists ~ 18 us (checks just the target)
This could probably be improved further by skipping the isSameFile check entirely if we could detect that the destination does not exist during the subsequent copy operation.
Gerrit change https://git.eclipse.org/r/c/platform/eclipse.platform.resources/+/180307 was merged to [master]. Commit: http://git.eclipse.org/c/platform/eclipse.platform.resources.git/commit/?id=d0f808286733732eb124bd584c47f4d8e8a07192
New Gerrit change created: https://git.eclipse.org/r/c/platform/eclipse.platform.resources/+/181509
Created attachment 286537 [details]
jmh micro benchmark isSame vs getCanonicalPath File573409.java

Windows, java 11-16:

Benchmark                                 Mode  Cnt    Score     Error  Units
File573409.exists                           ss  100   56,920 ±   6,016  us/op
File573409.getCanonicalPath_Cached          ss  100    7,325 ±   1,128  us/op
File573409.getCanonicalPath_Uncached        ss  100  251,373 ±  18,891  us/op
File573409.isSame                           ss  100    3,818 ±   0,781  us/op
Created attachment 286538 [details]
jmh micro benchmark isSame vs getCanonicalPath two files: File573409_2.java

Windows, two different files compared:

Benchmark                                 Mode  Cnt    Score     Error  Units
File573409_2.getCanonicalPath_Cached        ss  100    8,884 ±   2,731  us/op
File573409_2.exists                         ss  100   43,743 ±   3,083  us/op
File573409_2.isSame                         ss  100   88,839 ±   6,565  us/op
File573409_2.getCanonicalPath_Uncached      ss  100  300,320 ±  63,524  us/op

We basically see:
isSame ~ 2*exists
getCanonicalPath_Uncached ~ n*exists
(In reply to Jörg Kubitz from comment #6)
> Created attachment 286537 [details]
> jmh micro benchmark isSame vs getCanonicalPath File573409.java
>
> Windows, java 11-16:
>
> Benchmark                                 Mode  Cnt    Score     Error  Units
> File573409.exists                           ss  100   56,920 ±   6,016  us/op
> File573409.getCanonicalPath_Cached          ss  100    7,325 ±   1,128  us/op
> File573409.getCanonicalPath_Uncached        ss  100  251,373 ±  18,891  us/op
> File573409.isSame                           ss  100    3,818 ±   0,781  us/op

So we replace 2x cached getCanonicalPath() with exists + isSame? The sum is smaller for the old code.
(In reply to Andrey Loskutov from comment #8)
> So we replace 2x cached getCanonicalPath() with exists + isSame? The sum is
> smaller for the old code.

"isSame" won't be used unless you overwrite a file with another one. But typically we delete all files in the target folder first ("clean"). So we replaced 2*"cached" getCanonicalPath with 1*exists. And the cache was normally empty unless you copied the same file over and over. Only the prefix cache was normally filled, since you usually copy many files from the same directory. We had 2 lookups for the file name and replaced them with 1.
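The check order described in this comment can be sketched roughly as below. This is only an illustration under assumptions, not the merged change: the class CopyGuard and the method isSamePhysicalFile are hypothetical names, and the sketch assumes the source file exists.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical illustration of "exists first, isSameFile only when needed".
class CopyGuard {

    /**
     * Returns true only if the destination exists and is the same physical
     * file as the source. Assumes the source exists; otherwise
     * Files.isSameFile may throw NoSuchFileException.
     */
    static boolean isSamePhysicalFile(File source, File destination) throws IOException {
        if (!destination.exists()) {
            // Common case after a "clean": a single lookup on the target only.
            return false;
        }
        // Only reached when an existing file is about to be overwritten.
        return Files.isSameFile(source.toPath(), destination.toPath());
    }
}
```

With this order, a copy into a cleaned target folder costs one exists() call per file, and isSameFile() is paid only in the overwrite case.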
Gerrit change https://git.eclipse.org/r/c/platform/eclipse.platform.resources/+/181509 was merged to [master]. Commit: http://git.eclipse.org/c/platform/eclipse.platform.resources.git/commit/?id=e97aa804ca87276cf9372f7931269fe2e0308aec
Benchmark on Gentoo Linux, on NVMe SSD using ext4 filesystem:

With JDK 15.0.2, OpenJDK 64-Bit Server VM, 15.0.2+7-27:

Benchmark                                 Mode  Cnt   Score    Error  Units
File573409_2.exists                         ss  100   6.502 ±  1.275  us/op
File573409_2.getCanonicalPath_Cached        ss  100   8.357 ±  2.369  us/op
File573409_2.getCanonicalPath_Uncached      ss  100  15.517 ±  2.297  us/op
File573409_2.isSame                         ss  100  18.571 ±  2.600  us/op

With JDK 11.0.9.1, OpenJDK 64-Bit Server VM, 11.0.9.1+1:

Benchmark                                 Mode  Cnt   Score    Error  Units
File573409_2.exists                         ss  100   8.029 ±  1.387  us/op
File573409_2.getCanonicalPath_Cached        ss  100   7.401 ±  1.507  us/op
File573409_2.getCanonicalPath_Uncached      ss  100  18.860 ±  2.867  us/op
File573409_2.isSame                         ss  100  19.486 ±  3.207  us/op
(In reply to Michael Haubenwallner from comment #11)
> Benchmark on Gentoo Linux, on NVMe SSD using ext4 filesystem:
>
> With JDK 15.0.2, OpenJDK 64-Bit Server VM, 15.0.2+7-27:
>
> Benchmark                                 Mode  Cnt   Score    Error  Units
> File573409_2.exists                         ss  100   6.502 ±  1.275  us/op
> File573409_2.getCanonicalPath_Cached        ss  100   8.357 ±  2.369  us/op
> File573409_2.getCanonicalPath_Uncached      ss  100  15.517 ±  2.297  us/op
> File573409_2.isSame                         ss  100  18.571 ±  2.600  us/op
>
> With JDK 11.0.9.1, OpenJDK 64-Bit Server VM, 11.0.9.1+1:
>
> Benchmark                                 Mode  Cnt   Score    Error  Units
> File573409_2.exists                         ss  100   8.029 ±  1.387  us/op
> File573409_2.getCanonicalPath_Cached        ss  100   7.401 ±  1.507  us/op
> File573409_2.getCanonicalPath_Uncached      ss  100  18.860 ±  2.867  us/op
> File573409_2.isSame                         ss  100  19.486 ±  3.207  us/op

Thanks @Michael. The numbers show that getCanonicalPath() is about as fast as exists(), and isSame is ~2*getCanonicalPath. So in the normal case, where we replaced 2*getCanonicalPath with 1*exists, we also get a speedup on Linux, independent of the Java version. We also see a performance loss with the uncached variant on Linux, even though it is not as drastic as on Windows.
Gerrit change https://git.eclipse.org/r/c/platform/eclipse.platform.resources/+/180308 was merged to [master]. Commit: http://git.eclipse.org/c/platform/eclipse.platform.resources.git/commit/?id=0e8e3734a1cba8c0844e4a6633df43af3756ee30