Re: [recommenders-dev] Snipmatch GSOC was: How to start with Recommender's source code


On 20.02.2012, at 08:56, chen cheng wrote:

EGit is included in Eclipse 4, so using Git for snippet repositories seems to be a good idea. I agree with this solution.

Users can even sync client-side snippet repositories with the server side, automatically or manually. In automatic mode, a background thread syncs for the user; in manual mode, users sync and update/commit snippets themselves. But we should note that there are two kinds of snippet repositories on the server side: common repositories and personal repositories.

exactly. people have their own repos. "group" repositories might be possible on top of this.

1. Common repositories are used by all SnipMatch users. They can update from the remote server whenever they like, but they cannot submit freely. How can users contribute common snippets? Maybe we should formulate a rule or something.

This is where the web service comes into play. Or we allow registered users to create and edit new files. In the end, it's version controlled, so we can roll back these changes easily. But there is some kind of trust involved. As with the Eclipse Wiki pages (if I got Wayne's tweets right), there are spammers too. This will become an issue. But not yet, I think :)


2. Personal repositories are distinguished by user ID; a user can submit to or update from the remote server as they like.

exactly.

If Marcel can build a search engine based on Lucene and the Git file system, that would be perfect.

Nothing simpler than that :)

About the snippet storage format: one snippet per file, named after its description?

Simple formats can be: JSON or any arbitrary text format/markup. If we manage to create a lightweight grammar, we can even provide simple and fast editors with Xtext (although I wouldn't spend too much time on this detail yet).
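
To make the "plain text with some mark-up" option concrete, here is a minimal sketch of one possible layout: a few "key: value" header lines, a blank line, then the snippet body. The layout, the header names (description, tags), and the SnippetFile class are assumptions made up for illustration; nothing here has been agreed on in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical snippet file: "key: value" header lines, one blank line, then the body.
final class SnippetFile {
    final Map<String, String> headers;
    final String body;

    private SnippetFile(Map<String, String> headers, String body) {
        this.headers = headers;
        this.body = body;
    }

    static SnippetFile parse(String text) {
        String[] lines = text.split("\n", -1);
        Map<String, String> headers = new LinkedHashMap<>();
        int i = 0;
        // Header section ends at the first blank line.
        for (; i < lines.length && !lines[i].trim().isEmpty(); i++) {
            int colon = lines[i].indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Malformed header line: " + lines[i]);
            }
            headers.put(lines[i].substring(0, colon).trim(),
                        lines[i].substring(colon + 1).trim());
        }
        // Skip the blank separator and keep everything after it verbatim as the body.
        int bodyStart = Math.min(i + 1, lines.length);
        String body = String.join("\n", Arrays.asList(lines).subList(bodyStart, lines.length));
        return new SnippetFile(headers, body);
    }

    // Splits the assumed comma-separated "tags" header; empty list if the header is absent.
    List<String> tags() {
        List<String> tags = new ArrayList<>();
        String raw = headers.get("tags");
        if (raw == null) {
            return tags;
        }
        for (String tag : raw.split(",")) {
            if (!tag.trim().isEmpty()) {
                tags.add(tag.trim());
            }
        }
        return tags;
    }
}
```

A format like this stays readable in a plain Git diff, which is the main reason to prefer it over something more structured at this stage.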

Organized by category directory? There may be many small files, and it would be very slow to load them all: open, close, open, close ...

I think organization by category only makes sense if only one category per snippet is permitted. Actually, I think a snippet has many tags, comments, etc. Thus, I'd stick with one directory holding all files. Regarding performance: the search index will filter what's needed, so there is no need to load hundreds of files. Also, your OS does a lot of caching. I don't think this will become a bottleneck. But others may prove me wrong here.
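
To illustrate why a flat directory plus an index is enough, here is a rough sketch of a tag lookup. It is not Lucene: a plain in-memory map stands in for the search index, and the TagIndex class and the file names are hypothetical. The point is only that a query touches the index, never the snippet files themselves.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Stand-in for the real search index: maps each tag to the files that carry it.
final class TagIndex {
    private final Map<String, Set<String>> filesByTag = new HashMap<>();

    // Registers one snippet file under all of its tags (case-insensitive).
    void add(String fileName, Collection<String> tags) {
        for (String tag : tags) {
            filesByTag.computeIfAbsent(tag.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
                      .add(fileName);
        }
    }

    // Returns the files that carry every requested tag; no file is opened to answer this.
    Set<String> search(String... tags) {
        Set<String> result = null;
        for (String tag : tags) {
            Set<String> hits = filesByTag.getOrDefault(
                    tag.toLowerCase(Locale.ROOT), Collections.<String>emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? new TreeSet<String>() : result;
    }
}
```

Since a snippet can sit under any number of tags here, the directory layout no longer has to encode a single category.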

But let's hide the store behind some slim interface, make the implementation interchangeable, and go for a local file system approach first?
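
A minimal sketch of what such a slim interface could look like, with a local file system implementation and an in-memory one to show that they are interchangeable. The names SnippetStore, FileSystemSnippetStore, and InMemorySnippetStore and the three methods are assumptions for illustration only, not an agreed API; handling of missing snippets is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical slim store interface; callers never see where snippets live.
interface SnippetStore {
    void save(String name, String content);
    String load(String name);
    List<String> list();
}

// Local file system approach: one file per snippet, all in a single directory.
final class FileSystemSnippetStore implements SnippetStore {
    private final Path dir;

    FileSystemSnippetStore(Path dir) {
        this.dir = dir;
    }

    public void save(String name, String content) {
        try {
            Files.createDirectories(dir);
            Files.write(dir.resolve(name), content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public String load(String name) {
        try {
            return new String(Files.readAllBytes(dir.resolve(name)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public List<String> list() {
        List<String> names = new ArrayList<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.forEach(p -> names.add(p.getFileName().toString()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }
}

// Drop-in replacement, e.g. for tests; shows that the implementation is interchangeable.
final class InMemorySnippetStore implements SnippetStore {
    private final Map<String, String> snippets = new TreeMap<>();

    public void save(String name, String content) {
        snippets.put(name, content);
    }

    public String load(String name) {
        return snippets.get(name);
    }

    public List<String> list() {
        return new ArrayList<>(snippets.keySet());
    }
}
```

A Git-backed or CouchDB-backed store could later implement the same interface without touching the search or UI code.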



2012/2/20 Marcel Bruch <bruch@xxxxxxxxxxxxxxxxxx>
Hi Cheng, Hi Doug,

regarding server-side backend:
We use Apache CouchDB as the database and JAX-RS as the RESTful server interface for client communication. However, I wonder whether using Git repositories as the backend would be sufficient or even better than CouchDB in our case. It can be synced easily between clients and server, support for many potential sources is straightforward, and with JGit and EGit we have quite usable front-ends and APIs to work with. I'd just add the Lucene search index on top of the file system resources. Best of all (for the moment): we can start immediately without waiting too long for the server side.

What do you think? Since it's merely a file-based approach with slim syncing capabilities, we won't have spent too much time on it if it proves unusable. But at least GitHub has proven that using Git for snippet repositories works (at least they say Gists are single Git repositories).


Chen,
yes, please go for the proposal with the points you mentioned. Project mentor should be Doug, I'll be second mentor.

Doug,
you have to sign in as a mentor on the GSoC page and send Wayne an email so that he can confirm you are an Eclipse committer and eligible to be a mentor for Eclipse GSoC projects.

Regarding search-engine:
I'll be glad to write the search interface. We just need to agree on a snippet storage format.
(my favorite for the moment is plain text with some mark-up)

Marcel

On 20.02.2012, at 04:03, chen cheng wrote:

> Hi Doug,
>
> Yeah, in my initial idea, the server-side and client-side search engines should be implemented the same way, or at least return similar search results.
>
> Also, I am happy to work on the server side if I have enough time, but one question: I am not very sure about your solution. Do you mean we develop a brand-new Java-based server? Or do we keep the current PHP server but implement the search algorithm in Java (maybe even using Lucene etc. in the future), with the PHP code invoking the Java search?
>
> I guess you mean solution two, right?
>
> Here is my plan for improving the SnipMatch client side:
>
> 1. Implement all the designs from my last post: create a new Eclipse preference page, local storage, an improved GUI, etc. for SnipMatch. Leave a data interface for the search engine (use a simple string comparison algorithm at the beginning, then improve the search engine in the future).
>
> 2. Wait for Doug's backend work, then implement the search engine for both client side and server side. After I finish the client-side work, I can work together with Doug on the Java-based search engine for both sides. In fact, I am thinking there may be other search engines in other Recommenders modules, such as the code completion feature. Is it possible for us to use some existing search engine? Marcel, need your answer here :-)
>
> Doug & Marcel, is this plan OK? If everything is prepared, I will write a detailed project proposal for this SnipMatch merging and improvement job, and start coding soon. And as a GSoC project, I need a project mentor, so I am just waiting for your favor here :-D
>
>


_______________________________________________
recommenders-dev mailing list
recommenders-dev@xxxxxxxxxxx
http://dev.eclipse.org/mailman/listinfo/recommenders-dev



--
Best Regards From Cheng Chen [chengchendoc@xxxxxxxxx]

Thanks,
Marcel