Community
Participate
Working Groups
I'd like to understand if the Submission System is still inputting entries into the Foundation DB. The records the Submission System has created in the past is causing the DB to have duplicate records for those folks that become committers at a later date as all new committers must complete the online committer questionnaire which populates a DB record for them. This further complicates the issue of committer provisioning as the Portal doesn't always understand which record is to be used for the committer account. When this happens, IP Team must override date in the Portal with information to get the correct record updated. Webmaster than has the chore to determine which record is the correct record for the committer account. We also believe this is further causing some bugs to appear with respect to committer records in the DB. As the Foundation DB is really the "bible" for committer records; there shouldn't be duplicate records for folks. If the Submission System is no longer functioning, then perhaps Webmaster can run a script again on the duplicates and delete accordingly. If the Submission System is still functioning or if it will in the future, perhaps we can figure out a way to avoid having duplicate records in the system. Thanks, Sharon
Gabe, can you help out here?
The Submissions System does enter peoples name and email addresses into the Foundation DB. It has been doing this from the very begging of it existence. The idea was that we would have more people in the Foundation DB. The issue with duplicates boils down to people having multiple emails address (and Bugzilla accounts). When the Submission System looks up an email address and it can't find it, it make a new record in the Foundation DB. There was some code started to allow committers to link all their email address together. I am not sure what the state of that code, but I can try and find out what happen to it. I can't think an automagical way we can handle the duplicated in the Foundation DB. It would be possible to create a page where the duplicates could be found and removed by a human however. It should also be possible to extend the Submission System to handle the duplicates better or to not insert new records into the Foundation DB at all.
Thanks for the information Gabe. I've included Webmaster on this bug as they may have some suggestions and/or know about the previous linking code you are referring to as well. Even if this issue wasn't causing problems with committer provisioning, I still think it incorrect to have a system hold duplicate records for one person or one entity. It just sets up for problems. For example; Submission System creates record for droy Questionnaire creates record for committer droy1 Portal becomes confused about which record is which. Manual intervention required by IP Team and Webmaster in order to provision committer... Webmaster: Do you have anything you care to add here. Regards, Sharon
Infrazilla bug 532 also caused by submission system. Thanks Denis for fixing, I was really concerned about that one (https://foundation.eclipse.org/infrazilla/show_bug.cgi?id=532). Sharon
> Submission System creates record for droy > > Questionnaire creates record for committer droy1 I think that a userid collision should be our first red flag. Just adding a '1' at the end and ending it there is not good. For instances like this, could a simple email be sent out to indicate a collision? This would allow us to intervene manually and delete the dupe, update email addresses, etc.
Check out the saved SQL query in the DB entitled "People created by Submissions System". You'll soon notice that even the Submission System is creating duplicates - some even created on the same day....
I looked at the foundation DB web interface and couldn't see any saved queries.
(In reply to comment #6) > Check out the saved SQL query in the DB entitled "People created by > Submissions System". You'll soon notice that even the Submission System is > creating duplicates - some even created on the same day.... Sharon, saved queries are your own ... Can you just paste the SQL here?
my bad...I guess only I can see it. Here is the query.... select P.PersonID, P.FName, P.LName, P.EMail from People AS P Where P.Comments LIKE "%Submissions System%" Thanks, Sharon
Just as a note non of these people seemed to have the same email address. I added 'order by email' to the query to make finding duplicate email addresses easier. select P.PersonID, P.FName, P.LName, P.EMail from People AS P Where P.Comments LIKE "%Submissions System%" order by EMail
To help prune this list down I can cross reference this list with the submission system database. I can look for any of these people who don't have any talks proposed for any of the past conferences (and econ2010). These people could then be removed from the foundation DB (provided that don't have any other reason to be in the DB).
As many parts of the organization depend on the Database, the scope of this bug is being expanded to include membership's requirement as well. A meeting/call will take place later in March (following EclipseCon) to discuss the DB for moving forward with a review to resolving issues being experienced. Thanks, Sharon
Denis and I have discussed - this issue is currently being set on the back burner for now....
Moving to the Submission System.
For reference, here is where work needs to be done: http://dev.eclipse.org/viewcvs/index.cgi/org.eclipse.dash/submissions/classes/user.class.php?root=Technology_Project
If we change the submission system as I propose in bug 303756 comment 1, this problem will go away. I see no real reason to 'pollute' the Foundation DB with entries for People who simply want to create submissions, and we have already been centralizing on the Bugzilla login anyway.
That reference should be bug 324277 comment 1.
+1 -- I never liked the idea of putting EclipseCon Data into Foundation DB, and was told my view of what the FoundationDB should be for was too "narrow"...
In bug 324277 we're saying we should use the Bugzilla ID (an integer) as the primary means of identifying people instead of their email address (which changes in BZ but is hard to propagate) and instead of crafting up a user id (which is error prone). To do that, we must divorce the Conference Submission System from our internal data. This means the conference will refer to Bugzilla directly for user email addresses and names.
One thing to consider is the company affiliation of a submitter, which is currently fetched from live data in the Foundation's internal database. https://www.eclipsecon.org/submissions/2011/view_talk.php?id=2393 We could still TRY and make the connection between the Bugzilla email address and some record in the Foundation DB, but that will certainly break for non-committers. Or we can capture this in the bio pages and record it as a point-in-time value (ie, it would be stored with the user's data for that conference). Does that even remotely make sense?
I am working on this now... (and bug 324277), so if we can keep changes to the Sub System to a minimum, that will make my life easier.
Progress is coming along nicely. I can log in and create a submission without ever touching the Foundation DB. The big work will be fixing all the references to PersonID ("droy") so that the integer BugzillaID is used instead. The patch for this bug will be just massive.
Created attachment 188198 [details] Early Patch Made good progress today. Just about all the "user" functionality is refactored, but it all needs tons of testing. Tomorrow I'll focus on the conf queen/chair/pc functionality. Here is a patch of what is done so far, in case anyone with a local copy wants to try it out.
Created attachment 188271 [details] Early Patch v2 Didn't do as much progress as I would have liked today, but I started on the chair/pc stuff. I'll continue today, but I have some questions for Anne in the next post. In the meanwhile, here is an updated patch of where I am at. So far, these SQL statements also need to be executed to bring the schema up to date: use conferences; drop table people; create table people (BugzillaUserID mediumint unsigned not null primary key, Picture blob null, PictureMime varchar(32) null); alter table speakers add BugzillaUserID mediumint unsigned after PersonID; alter table submission_authors add BugzillaUserID mediumint unsigned after PersonID; alter table submission_comments add BugzillaUserID mediumint unsigned after PersonID; alter table people_roles add BugzillaUserID mediumint unsigned after PersonID; alter table watching_submissions add BugzillaUserID mediumint unsigned after PersonID; alter table watching_everything add BugzillaUserID mediumint unsigned after PersonID; alter table watching_categories add BugzillaUserID mediumint unsigned after PersonID; alter table watching_types add BugzillaUserID mediumint unsigned after PersonID; alter table watching_tags add BugzillaUserID mediumint unsigned after PersonID; alter table watching_new_submissions add BugzillaUserID mediumint unsigned after PersonID; # Getting Bugzilla ID from email address select distinct s.PersonID, P.EMail, B.userid from speakers as s LEFT JOIN myfoundation_demo.People AS P ON P.PersonID = s.PersonID LEFT JOIN myfoundation_demo.profiles AS B on B.login_name = P.EMail; # Updating PersonID tables to BugzillaUserID update speakers as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID; update submission_authors as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID; update submission_comments as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID; update people_roles as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID; update watching_submissions as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID; update watching_everything as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID; update watching_categories as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID; update watching_types as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID; update watching_tags as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID; update watching_new_submissions as s, People_to_Bugzilla as b set s.BugzillaUserID = b.userid where s.PersonID = b.PersonID;
Created attachment 188272 [details] Screenshot Anne: Are there any options in that list that you know are no longer used? The Scheduling Tool contains references to 2009, and it doesn't work on my dev environment. Since the other buttons seem to rely on the schedule, I hesitate to continue. Can you fill me in on how the schedule is done? Point me to a doc? Thanks
I'll be revisiting this next week once the excitement of EclipseCon 2011 has begun to die down.
The final patch has been committed. I've also fixed some SQL injection vulnerabilities, as well as information leakage (user.php). On the surface it all looks good, and I've tested plenty, but there will be issues. Please file separate requests and I'll fix them as quickly as I can.