[emf-dev] EMF Compare Name Similarity


From: Simon <goood.guy@xxxxxx>
Date: Fri, 05 Jul 2013 14:53:57 +0200

Hi,

at the moment I am reverse engineering EMF Compare, and I have already read a lot of material. I think I found some inconsistencies in that material and want to ask whether I understand things correctly.


These are the statements in question:

a) According to [1], EMF Compare uses the Levenshtein distance for string similarity.
b) According to [3], EMF Compare 1.3 is similar to [4]. In [4], the Dice coefficient (although it is not named explicitly) is used for string similarity.
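
To make the difference between the two measures concrete, here is a minimal sketch of both, written from the textbook definitions. It is not the code from [2]; the class and method names (StringSimilaritySketch, levenshtein, dice, bigrams) are made up for illustration, and counting repeated bi-grams as a multiset is an assumption on my side.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are hypothetical and not taken from EMF Compare.
class StringSimilaritySketch {

    // Levenshtein distance: minimum number of single-character
    // insertions, deletions and substitutions turning a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // All adjacent character pairs of s, in order, duplicates kept.
    static List<String> bigrams(String s) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 1 < s.length(); i++) {
            result.add(s.substring(i, i + 2));
        }
        return result;
    }

    // Dice coefficient on bi-grams: 2 * |common| / (|X| + |Y|), in [0, 1].
    static double dice(String a, String b) {
        if (a.equals(b)) {
            return 1.0;
        }
        List<String> x = bigrams(a);
        List<String> y = bigrams(b);
        int total = x.size() + y.size();
        if (total == 0) {
            return 0.0; // both strings too short to form a bi-gram
        }
        int common = 0;
        for (String gram : x) {
            if (y.remove(gram)) { // consume each matching bi-gram at most once
                common++;
            }
        }
        return 2.0 * common / total;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("night", "nacht"));
        System.out.println(dice("night", "nacht"));
    }
}
```

Note that the two are not interchangeable: Levenshtein yields an edit count (a distance, lower is more similar), while the Dice coefficient yields a ratio between 0 and 1 (a similarity, higher is more similar).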



After a code review of [2] and [5], I came to the following conclusions:

I) EMF Compare 1.x and 2.x use the Dice coefficient with bi-grams for string similarity.
II) EMF Compare 2.x uses the Longest Common Subsequence to determine changes in multi-valued references of EObjects.
III) Statement a) is wrong or outdated.
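
For conclusion II), this is what I understand by Longest Common Subsequence on two lists of values: a minimal dynamic-programming sketch, not the code from [5]. The class name LcsSketch and the method lcs are hypothetical, and elements are compared with equals() here, which is an assumption; how EMF Compare decides that two elements are the same is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names are hypothetical and not taken from EMF Compare.
class LcsSketch {

    // Returns one longest common subsequence of a and b.
    static <T> List<T> lcs(List<T> a, List<T> b) {
        // len[i][j] = LCS length of the suffixes a[i..] and b[j..]
        int[][] len = new int[a.size() + 1][b.size() + 1];
        for (int i = a.size() - 1; i >= 0; i--) {
            for (int j = b.size() - 1; j >= 0; j--) {
                if (a.get(i).equals(b.get(j))) {
                    len[i][j] = len[i + 1][j + 1] + 1;
                } else {
                    len[i][j] = Math.max(len[i + 1][j], len[i][j + 1]);
                }
            }
        }
        // Walk the table from the start to rebuild the subsequence.
        List<T> result = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            if (a.get(i).equals(b.get(j))) {
                result.add(a.get(i));
                i++;
                j++;
            } else if (len[i + 1][j] >= len[i][j + 1]) {
                i++;
            } else {
                j++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("a", "b", "c", "d", "e");
        List<String> after = Arrays.asList("a", "c", "e", "f");
        // Elements outside the LCS are the candidates for additions,
        // deletions or moves between the two versions of the list.
        System.out.println(lcs(before, after));
    }
}
```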

I would appreciate it if someone could confirm my conclusions.




References:

[1] http://eclipsesummit.org/summiteurope2006/presentations/ESE2006-EclipseModelingSymposium10_EMFCompareUtility.pdf

[2] http://git.eclipse.org/c/emfcompare/org.eclipse.emf.compare.git/tree/plugins/org.eclipse.emf.compare.match/src/org/eclipse/emf/compare/match/internal/statistic/NameSimilarity.java?h=1.3

[3] http://wiki.eclipse.org/EMF_Compare/FAQ/1.3#What_kind_of_.22strategies.22_use_EMF_compare_.3F

[4] http://ase.cs.uni-due.de/olbib/p54-xing-241.pdf

[5] http://git.eclipse.org/c/emfcompare/org.eclipse.emf.compare.git/tree/plugins/org.eclipse.emf.compare/src/org/eclipse/emf/compare/utils/DiffUtil.java?h=2.1
