Re: [cross-project-issues-dev] Anonymisation of public data

From: Mickael Istria <mistria@xxxxxxxxxx>
Date: Thu, 26 Apr 2018 09:02:09 +0200

Hi Boris,

> The basic idea is to simply replace all identifiers with asymmetrically encrypted strings, so all IDs have the same ciphered result. RSA is used for the encryption, and the private key is thrown away once the encoding is done, making it impossible (according to common encryption standards) to retrieve the original string.

Is it a requirement, at this point, to make it impossible for anyone to retrieve the original string?

I understand that providing an anonymous dataset is interesting, as you explained, but why couldn't you or the Eclipse Foundation keep the private RSA key safely, so that the IDs can be decoded if you find some unexpected patterns? If you make the IDs anonymous and then find a set of IDs with a strange correlation that you'd like to explain, wouldn't it be helpful to decode them and find out who the individuals behind them are, to better understand the cause of the correlation, or even to set up chats with selected contributors to better understand their practices?
I have the impression there could be value in keeping the ability to decode the strings, and I don't think fully discarding the key is much safer than keeping it in a safe place (like an EF server with strong restrictions on who can access the key).
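
To make the trade-off concrete, here is a minimal Python sketch of the scheme as I understand it from the description, assuming textbook (unpadded) RSA so that the same ID always yields the same ciphertext. The toy primes, the function names pseudonymise and recover, and the sample address are illustrative assumptions of mine, not the actual tool or its key size; a real key would be far larger.

```python
# Toy key built from two known Mersenne primes. Illustration only:
# a modulus this small offers no real security.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def pseudonymise(identifier: str) -> str:
    """Encrypt an ID with the public key (N, E); deterministic, no padding."""
    m = int.from_bytes(identifier.encode("utf-8"), "big")
    if m >= N:
        raise ValueError("identifier too long for this toy modulus")
    return format(pow(m, E, N), "x")


def recover(pseudonym: str) -> str:
    """Decode a pseudonym; only possible while the private exponent D exists."""
    m = pow(int(pseudonym, 16), D, N)
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")


if __name__ == "__main__":
    token = pseudonymise("alice@example.org")
    # Same input, same pseudonym, so correlations across the dataset survive.
    print(token == pseudonymise("alice@example.org"))
    # With the private key retained, the original ID can be looked up again.
    print(recover(token))
```

Whoever holds D can run recover on a pseudonym; once D and the primes are discarded, only pseudonymise remains. Unpadded RSA is deterministic by construction, so anyone who also has the public key could encrypt a guessed ID and compare it against the dataset.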

My 2c (or maybe even less ;)

Mickael Istria

Eclipse IDE developer, for Red Hat Developers
