[cdt-dev] Symbol translation concerns

Hey all,

  I'm currently doing some work in the area of the symbol translator.  I'm
particularly looking at improving a few performance bottlenecks that are
easily attacked, and then looking at ways to get rid of the larger issues
(i.e. can we have a native parser for the symbols instead of
addr2line/cppfilt combinations).

  I have a couple of questions to see if I can tickle the larger community
memory, since I'd rather not break things that are working.  I don't think
these changes will, but I'm probing anyway since there is a collective
memory here.  The first target is to make a few changes to the
Elf/ElfHelper classes:

1- The SymbolSortCompare class converts all of its symbols to lowercase
(via .toLowerCase()) and then performs a comparison.  On large symbol files
(anything over a couple of megabytes) this creates a flurry of temporary
String objects, since each comparison allocates two new lowercase copies.
I'm going to change this code so that it does not convert the strings to
lower case but instead (initially) uses compareToIgnoreCase().  The
performance gains observed are not insignificant:

Original (time to sort the symbol array used by the ElfHelper class):
- Sorting 53580 symbols took 6203ms
Modified (traded toLowerCase for compareToIgnoreCase):
- Sorting 53580 symbols took 828ms
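
For anyone who wants to see the two approaches side by side, here is a
minimal sketch.  It is not the actual SymbolSortCompare code: the class name,
the comparator names and the sample symbols are all made up for illustration,
and it sorts plain strings rather than Elf symbol objects.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for the sorting done in ElfHelper; not CDT code.
public class SymbolSortSketch {
    // Old style: every comparison allocates two temporary lowercase strings.
    static final Comparator<String> LOWERCASE_COPY =
        (a, b) -> a.toLowerCase().compareTo(b.toLowerCase());

    // New style: compares character by character without building new strings.
    static final Comparator<String> IGNORE_CASE =
        (a, b) -> a.compareToIgnoreCase(b);

    // Returns a sorted copy so the caller's array is left untouched.
    static String[] sorted(String[] symbols, Comparator<String> order) {
        String[] copy = symbols.clone();
        Arrays.sort(copy, order);
        return copy;
    }

    public static void main(String[] args) {
        // Made-up sample symbols, ASCII only.
        String[] symbols = { "main", "_start", "Zeta", "alpha", "__start" };
        System.out.println(Arrays.toString(sorted(symbols, LOWERCASE_COPY)));
        System.out.println(Arrays.toString(sorted(symbols, IGNORE_CASE)));
    }
}
```

For plain ASCII names like these, both comparators should give the same
order.  One difference to keep in mind is that toLowerCase() uses the default
locale while compareToIgnoreCase() does not, so the two can disagree on
non-ASCII names.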

2- The second "thing" that the SymbolSortCompare class does is strip out
any leading underscores before comparing.  I'm not sure what the rationale
for this is, but it seems wrong to me to be arbitrarily ripping out leading
underscores, since these are distinct symbols (__start != _start != start).
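
To make the concern concrete, here is a small sketch of what a stripping
comparator does to distinct names.  The helper stripLeadingUnderscores() and
the comparator names are hypothetical; I'm assuming the existing code drops
every leading underscore, which may not match SymbolSortCompare exactly.

```java
import java.util.Comparator;

// Hypothetical illustration; not the actual SymbolSortCompare code.
public class UnderscoreStripSketch {
    // Assumed behaviour: drop every leading '_' before comparing.
    static String stripLeadingUnderscores(String name) {
        int i = 0;
        while (i < name.length() && name.charAt(i) == '_') {
            i++;
        }
        return name.substring(i);
    }

    // Comparator that ignores case and leading underscores.
    static final Comparator<String> STRIPPING =
        (a, b) -> stripLeadingUnderscores(a)
                      .compareToIgnoreCase(stripLeadingUnderscores(b));

    // Comparator that ignores case only.
    static final Comparator<String> PLAIN = String.CASE_INSENSITIVE_ORDER;

    public static void main(String[] args) {
        // The stripping comparator treats these distinct symbols as equal.
        System.out.println(STRIPPING.compare("__start", "_start") == 0);
        System.out.println(STRIPPING.compare("_start", "start") == 0);
        // The plain case-insensitive comparator keeps them apart.
        System.out.println(PLAIN.compare("__start", "_start") != 0);
        System.out.println(PLAIN.compare("_start", "start") != 0);
    }
}
```

With the stripping comparator, the relative order of __start, _start and
start in the sorted array is decided by the sort rather than by the names
themselves.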

Looking at the bigger picture, the general use of the symbols does not seem
to require them being sorted at all, and since the behaviour is not
documented I'm tempted to toss the lot .. but baby steps first.

Proposal:
- I'm going to keep the sorting case-insensitive, but switch it to use
compareToIgnoreCase().
- I'm going to remove the leading underscore "filtering" from the sorter.
This may have some relation to PR 87698; I'm still looking into that.

Thanks,
 Thomas
 

