Re: [jgit-dev] Writing packs with bitmaps sometimes slower as without bitmaps?
  • From: "Halstrick, Christian" <christian.halstrick@xxxxxxx>
  • Date: Thu, 5 Mar 2020 15:08:41 +0000

Hi Ivan,

I reproduced the problem with the latest jgit as well. I think it's still a problem, and therefore I created bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=560821. Let's continue the discussion there.

From: jgit-dev-bounces@xxxxxxxxxxx <jgit-dev-bounces@xxxxxxxxxxx> on behalf of Christian Halstrick <christian.halstrick@xxxxxxxxx>
Sent: Friday, 7 June 2019 18:05
To: Ivan Frade <ifrade@xxxxxxxxxx>
Cc: JGit Developers list <jgit-dev@xxxxxxxxxxx>
Subject: Re: [jgit-dev] Writing packs with bitmaps sometimes slower as without bitmaps?
 
Hi,
I found time to continue on this issue.

I can now reproduce the issue with the public gerrit repo. When pushing one new commit to a local clone of the gerrit repo (a repo with a lot of refs), the push is several times slower if the sending repo has bitmaps (about 9s instead of 2s in the log below). Most of the time is spent in org.eclipse.jgit.revwalk.BitmapWalker.findObjectsWalk(). The script whose execution log I append below can be downloaded from https://gist.github.com/chalstrick/864fecf5cc45c056e90225418d6b9c89


++ jgit --version
jgit version 5.1.8-SNAPSHOT
++ rm -fr gerrit.src.git gerrit.dst.git gerrit.client
++ git clone --bare --mirror https://gerrit.googlesource.com/gerrit gerrit.dst.git
...
++ cp -r gerrit.dst.git gerrit.dst.git.backup
++ git clone --bare --mirror gerrit.dst.git gerrit.src.git
Cloning into bare repository 'gerrit.src.git'...
done.
++ git clone gerrit.src.git gerrit.client
Cloning into 'gerrit.client'...
done.
++ cd gerrit.client
++ date
++ git add README.md
++ git commit -m 'modify README.md'
[master 7d26b6c82f8] modify README.md
 1 file changed, 1 insertion(+)
++ git push origin
...
++ cd gerrit.src.git

##### The fast push (2s) when we don't have bitmaps

++ jgit push origin HEAD:refs/heads/master
Counting objects:       3
Finding sources:        100% (3/3)
Getting sizes:          100% (2/2)
Compressing objects:    100% (4210/4210)
Writing objects:        100% (3/3)
remote: Updating references: 100% (1/1)To /Users/d032780/git/repl_test/gerrit.dst.git
   78e6f24..7d26b6c  HEAD -> master

real    0m2.173s
user    0m5.084s
sys     0m0.525s
++ rm -fr gerrit.dst.git
++ cp -r gerrit.dst.git.backup gerrit.dst.git
++ cd gerrit.src.git
### Force creating bitmaps
++ git repack -a -d -b
Enumerating objects: 1053003, done.
Counting objects: 100% (1053003/1053003), done.
Delta compression using up to 8 threads
Compressing objects: 100% (436944/436944), done.
Writing objects: 100% (1053003/1053003), done.
Selecting bitmap commits: 238041, done.
Building bitmaps: 100% (352/352), done.
Total 1053003 (delta 467117), reused 1052949 (delta 467065)

##### The slow push (9s) when we have bitmaps

++ jgit push origin HEAD:refs/heads/master
Counting objects:       528225
Finding sources:        100% (3/3)
Getting sizes:          100% (2/2)
Compressing objects:    100% (1142/1142)
Writing objects:        100% (3/3)
remote: Updating references: 100% (1/1)To /Users/d032780/git/repl_test/gerrit.dst.git
   78e6f24..7d26b6c  HEAD -> master

real    0m9.070s
user    0m12.130s
sys     0m1.015s

On Mon, May 13, 2019 at 2:43 PM Christian Halstrick <christian.halstrick@xxxxxxxxx> wrote:
Hi,
I was trying to reproduce it on a big public repo but up to now have had
no success with that. That's a very time-consuming task because I need
huge repos with a ton of refs. Tried with linux, no luck. Now I am checking
with chromium. I will let you know when I have managed that.

But by just inspecting the code and looking at the trace output I see
a difference between the bitmap aware code and the non-bitmap-aware
code in PackWriter#findObjectsToPack. In my repeatable case
BitmapWalker.findObjects(have, null, true)
runs for 15 minutes :-(
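
The difference noted above, which the original message quoted further down spells out, can be illustrated with a toy model. The sketch below does not use any JGit API: the class name, the linear integer "history" and the visit counter are all hypothetical, and the walk is a deliberate simplification of what a real revision walk does. It only shows why computing the complete "have" set first can touch far more commits than a single walk that stops at the "have" tips, even though both arrive at the same answer.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Deque;

// Toy model only: hypothetical names, no JGit classes involved.
class HaveWantSketch {
    static int visited; // commits touched by the strategy that ran last

    // Toy history: commit i has the single parent i - 1; commit 0 is the root.
    static int parentOf(int commit) {
        return commit - 1;
    }

    // Collect everything reachable from the given tips, with no early exit.
    static BitSet reachable(int[] tips) {
        BitSet seen = new BitSet();
        Deque<Integer> todo = new ArrayDeque<>();
        for (int t : tips) todo.push(t);
        while (!todo.isEmpty()) {
            int c = todo.pop();
            if (c < 0 || seen.get(c)) continue;
            seen.set(c);
            visited++;
            todo.push(parentOf(c));
        }
        return seen;
    }

    // Strategy 1: full "have" set, full "want" set, then the difference.
    static BitSet needViaSets(int[] haves, int[] wants) {
        visited = 0;
        BitSet have = reachable(haves);
        BitSet need = reachable(wants);
        need.andNot(have);
        return need;
    }

    // Strategy 2: one walk from the wants that stops at any "have" tip.
    // Simplification: only the tips themselves act as the boundary.
    static BitSet needViaWalk(int[] haves, int[] wants) {
        visited = 0;
        BitSet boundary = new BitSet();
        for (int h : haves) boundary.set(h);
        BitSet need = new BitSet();
        Deque<Integer> todo = new ArrayDeque<>();
        for (int w : wants) todo.push(w);
        while (!todo.isEmpty()) {
            int c = todo.pop();
            if (c < 0 || need.get(c)) continue;
            visited++;
            if (boundary.get(c)) continue; // the receiver already has this commit
            need.set(c);
            todo.push(parentOf(c));
        }
        return need;
    }

    public static void main(String[] args) {
        int[] haves = {99_999};  // tip the receiving side already has
        int[] wants = {100_000}; // one new commit on top of it
        BitSet viaSets = needViaSets(haves, wants);
        int visitedSets = visited;
        BitSet viaWalk = needViaWalk(haves, wants);
        int visitedWalk = visited;
        System.out.println("same result: " + viaSets.equals(viaWalk));
        System.out.println("visited (sets): " + visitedSets);
        System.out.println("visited (walk): " + visitedWalk);
    }
}
```

In this toy setup both strategies report the single new commit, but the set-based one visits 200001 commits while the walk visits 2. Precomputed bitmaps exist to make the set-based approach cheap, so the sketch corresponds to the case where the "have" side cannot be answered from bitmaps and has to be walked.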

On Fri, May 10, 2019 at 10:53 PM Ivan Frade <ifrade@xxxxxxxxxx> wrote:
>
> Hi Christian,
>
>  I have been working with bitmaps lately [1]. My changes don't affect PackWriter (they are not even committed yet!), but I am interested in everything bitmap-related.
>
>  Did you gather any more information about this issue? Any chance to reproduce it in a test?
>
>  Regards,
>
> Ivan
>
>  [1] https://git.eclipse.org/r/c/140958/
>
>
>
> From: Christian Halstrick <christian.halstrick@xxxxxxxxx>
> Date: Wed, May 8, 2019 at 4:04 PM
> To: JGit Developers list
>
>> Hi,
>>
>> I am investigating a performance problem on non-public gerrit servers.
>> The problem boils
>> down to slow performance of PackWriter when using bitmaps. When the
>> usage of bitmaps
>> is turned off then performance is at least 5 times better.
>>
>> I uploaded the change https://git.eclipse.org/r/141842 to
>> demonstrate that. That change
>> adds some printfs to emit performance data for the PackWriter class.
>> Additionally a System
>> property PackWriterForceNoBitmap is introduced that when set to true
>> forces PackWriter not
>> to use bitmaps.
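
A minimal sketch of how such a switch and such timing output might look follows. Only the property name PackWriterForceNoBitmap and the "TracePushPerf" prefix come from this thread; the class, the helper methods and the way they are wired together are hypothetical and are not taken from the uploaded change or from PackWriter itself.

```java
import java.util.function.Supplier;

// Hypothetical illustration; not the code of the uploaded change.
class BitmapSwitchSketch {
    // Property name as given in the message.
    static final String FORCE_NO_BITMAP = "PackWriterForceNoBitmap";

    // Use bitmaps only if an index is available and the switch is not set.
    // Boolean.getBoolean is true only when the system property equals "true".
    static boolean useBitmaps(boolean bitmapIndexAvailable) {
        return bitmapIndexAvailable && !Boolean.getBoolean(FORCE_NO_BITMAP);
    }

    // Run some work and print its wall-clock duration in milliseconds.
    static <T> T timed(String label, Supplier<T> work) {
        long start = System.nanoTime();
        T result = work.get();
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println("TracePushPerf: " + label + " runtime: " + ms);
        return result;
    }

    public static void main(String[] args) {
        boolean bitmaps = useBitmaps(true);
        String path = timed("findObjectsToPack()",
                () -> bitmaps ? "bitmap path" : "non-bitmap path");
        System.out.println("took the " + path);
    }
}
```

Starting the JVM with -DPackWriterForceNoBitmap=true would make this sketch take the non-bitmap branch; without the property it takes the bitmap branch.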
>>
>> The problem is the performance of PackWriter#findObjectsToPack. That
>> method delegates to
>> PackWriter#findObjectsToPackUsingBitmaps() which is in my case much
>> slower than using
>> the default code not using bitmaps. It looks like
>> findObjectsToPackUsingBitmaps() is first calculating all have objects,
>> then all want objects and then calculates the difference. In huge
>> repos calculating all have objects is consuming 900sec.
>> The non-bitmap code in findObjectsToPack() creates a walk where have
>> and want objects are both used and has to walk only over very few
>> objects which takes only 200sec.
>> In the end both algorithms (bitmap and non-bitmap aware code) find the
>> same result: only one commit with one new blob has to be sent.
>>
>> Is somebody aware of the fact that when working on packfiles of 2GB
>> size findObjectsToPackUsingBitmaps() is so much slower than
>> non-bitmap-aware code?
>>
>> Stats of the repo:
>> $ du -sh *.pack
>> 1.1M    pack-097c243a771df372d4e1098af0d89a3d25be8de6.pack
>>  36K    pack-0bc2fecc4d3c070b82285b70a6394b15f87a9e50.pack
>>  64K    pack-57c34e10b1cc0305784f48bda175064e2ad8fa1a.pack
>> 240K    pack-7cc06b124492dcab6fda3e797f3c16fe06929e2a.pack
>> 2.0G    pack-e7a9322fa39225c670cab245b9e012c1aa6f3a61.pack
>>
>>
>> Here are the traces:
>>
>> =================================================
>> added performance printfs, forced not to use bitmaps
>> =================================================
>> ...
>> Counting objects:       1
>> Counting objects:       4
>> TracePushPerf: findObjectsToPush(): code not using bitmaps runtime: 251776
>> TracePushPerf: findObjectsToPack() runtime: 251916
>>
>> Finding sources:         25% (1/4)
>> Finding sources:         50% (2/4)
>> Finding sources:         75% (3/4)
>> Finding sources:        100% (4/4)
>> Finding sources:        100% (4/4)
>>
>> Getting sizes:           33% (1/3)
>> Getting sizes:           66% (2/3)
>> Getting sizes:          100% (3/3)
>> Getting sizes:          100% (3/3)
>>
>> Compressing objects:     99% (8001/8060)
>> Compressing objects:    100% (8060/8060)
>> Compressing objects:    100% (8060/8060)
>>
>> Writing objects:         25% (1/4)
>> Writing objects:         50% (2/4)
>> Writing objects:         75% (3/4)
>> Writing objects:        100% (4/4)
>> Writing objects:        100% (4/4)
>>
>> remote: Updating references: 100% (1/1)To
>> /Users/d032780/git/repl_test/hana.dst.git/
>>    766c519..cdcc784  temp_d056507 -> temp_d056507
>>
>> =================================================
>> added performance printfs, using bitmaps
>> =================================================
>> Counting objects:       1
>> Counting objects:       41507
>> Counting objects:       100967
>> Counting objects:       165489
>> ...
>> Counting objects:       3946346
>> Counting objects:       3948655
>> TracePushPerf: findObjectsToPackUsingBitmaps() ms to find find haves: 929147
>>
>> Counting objects:       4065807
>> TracePushPerf: findObjectsToPackUsingBitmaps()  ms to find find want: 638
>> TracePushPerf: findObjectsToPackUsingBitmaps()  ms to find find need: 8
>> TracePushPerf: findObjectsToPackUsingBitmaps()  ms to add needed: 2
>> TracePushPerf: findObjectsToPackUsingBitmaps() runtime: 929795
>>
>> Counting objects:       4103019
>> TracePushPerf: findObjectsToPack() runtime: 938584
>>
>> Finding sources:         25% (1/4)
>> Finding sources:         50% (2/4)
>> Finding sources:         75% (3/4)
>> Finding sources:        100% (4/4)
>> Finding sources:        100% (4/4)
>>
>> Getting sizes:           33% (1/3)
>> Getting sizes:           66% (2/3)
>> Getting sizes:          100% (3/3)
>> Getting sizes:          100% (3/3)
>>
>> Compressing objects:     99% (8001/8060)
>> Compressing objects:    100% (8060/8060)
>> Compressing objects:    100% (8060/8060)
>>
>> Writing objects:         25% (1/4)
>> Writing objects:         50% (2/4)
>> Writing objects:         75% (3/4)
>> Writing objects:        100% (4/4)
>> Writing objects:        100% (4/4)
>>
>> remote: Updating references: 100% (1/1)To
>> /Users/d032780/git/repl_test/hana.dst.git/
>>    766c519..cdcc784  temp_d056507 -> temp_d056507