Re: [recommenders-dev] How to start with Recommender's source code

Hi Doug,

Yes, we decided to use the same search algorithm on both the client side and the server side. At the beginning, I will use a simple matching algorithm. Meanwhile, Marcel will start on a search engine for us based on Lucene and a Git file system, for both the server side and the client. In the future, I can use this engine directly, but on your server side you may need to do some basic groundwork so that PHP can invoke the Java search engine interface.

Then, everything is ready :-)

2012/2/21 Doug Wightman <douglasw@xxxxxxxxx>
Hi Cheng,

This sounds great! I'd like the local storage to use the same search
and ranking algorithm - it's actually quite simple - is this ok?
Please see the description below - it shouldn't be too hard to
implement. You can start with a simple matching algorithm (whatever
makes sense to you) and then refine it to work like this afterwards.
This would save a lot of time - the java code written for the local
storage can then be used also for remote storage.

SnipMatch search algorithm:
1) First, we perform a keyword search, looking for each of the words
in the search query within the snippet _search patterns_ (defined by
the snippet creators - e.g. "lowercase $str"). If there is at least
one overlapping word, we consider a match to have been found.
2) Then, we rank the results and display the top ten. There are two
cases for ranking:

a) The search pattern (defined by the snippet creator - e.g.
"lowercase $str") is _in-order_ with respect to the search query (e.g.
the search query might be "lowercase a" or "low", etc.).
b) The search pattern is _not_ in order with respect to the search query

All in-order results are ranked higher than unordered results.
Finally, results are also ordered within these cases:

In-order ranking:
-results are ranked by the number of words in the search query
matching words in the search pattern. when there is a tie, we rank
results with fewer missing arguments (variables) higher.
Unordered ranking:
-results are only ranked by the number of words in the search query
matching words in the search pattern.
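
A minimal Java sketch of the matching and ranking steps described above. It rests on several assumptions that the description leaves open: a query word matches a pattern word when it is a prefix of it (suggested by the "low" example), tokens starting with `$` are arguments, "in-order" means query word i lines up with pattern token i, and a missing argument is a `$` token that the query is too short to fill. The class, field, and method names are hypothetical and are not taken from the SnipMatch sources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SnipMatchRankingSketch {

    /** A scored search pattern. Field names are illustrative only. */
    static final class Result {
        final String pattern;
        final boolean inOrder;
        final int matchedWords;
        final int missingArgs; // always 0 for unordered results, so it never affects them

        Result(String pattern, boolean inOrder, int matchedWords, int missingArgs) {
            this.pattern = pattern;
            this.inOrder = inOrder;
            this.matchedWords = matchedWords;
            this.missingArgs = missingArgs;
        }
    }

    static boolean isVariable(String token) {
        return token.startsWith("$");
    }

    // Assumption: prefix matching, so "low" matches "lowercase".
    static boolean wordMatches(String queryWord, String patternWord) {
        return !isVariable(patternWord) && patternWord.startsWith(queryWord);
    }

    /** Step 1 plus scoring. Returns null when no query word overlaps the pattern. */
    static Result score(String pattern, String query) {
        if (query.trim().isEmpty()) {
            return null;
        }
        String[] p = pattern.trim().toLowerCase().split("\\s+");
        String[] q = query.trim().toLowerCase().split("\\s+");

        // Count query words that match some literal word of the pattern.
        int matched = 0;
        for (String qw : q) {
            for (String pw : p) {
                if (wordMatches(qw, pw)) {
                    matched++;
                    break;
                }
            }
        }
        if (matched == 0) {
            return null;
        }

        // Assumed in-order test: query word i must line up with pattern token i;
        // a variable token accepts any word as its argument.
        boolean inOrder = q.length <= p.length;
        for (int i = 0; inOrder && i < q.length; i++) {
            inOrder = isVariable(p[i]) || wordMatches(q[i], p[i]);
        }

        // Missing arguments: variables the query did not reach (in-order case only).
        int missing = 0;
        if (inOrder) {
            for (int i = q.length; i < p.length; i++) {
                if (isVariable(p[i])) {
                    missing++;
                }
            }
        }
        return new Result(pattern, inOrder, matched, missing);
    }

    /** Step 2: rank the matches and keep the top ten patterns. */
    static List<String> search(List<String> patterns, String query) {
        List<Result> results = new ArrayList<>();
        for (String pattern : patterns) {
            Result r = score(pattern, query);
            if (r != null) {
                results.add(r);
            }
        }
        results.sort(
            Comparator.comparing((Result r) -> !r.inOrder) // in-order results first
                .thenComparingInt(r -> -r.matchedWords)    // more matched words first
                .thenComparingInt(r -> r.missingArgs));    // fewer missing arguments first
        List<String> top = new ArrayList<>();
        for (Result r : results.subList(0, Math.min(10, results.size()))) {
            top.add(r.pattern);
        }
        return top;
    }

    public static void main(String[] args) {
        List<String> patterns = Arrays.asList(
            "$str to lowercase",
            "lowercase $str with locale $loc",
            "uppercase $str",
            "lowercase $str");
        System.out.println(search(patterns, "lowercase a"));
    }
}
```

With these sample patterns and the query "lowercase a", the sketch ranks "lowercase $str" first (in order, no missing arguments), then "lowercase $str with locale $loc" (in order, one missing argument), then "$str to lowercase" (unordered); "uppercase $str" is dropped because no query word overlaps it. The local-variable type lookup mentioned as a special case is not covered here.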

This description is complete, with one non-essential exception: in the
current implementation, we also search for the types of the local
variables that are mentioned in the search query - for example, if you
search for "a" and "a" is a local string, you'll see results such as
"lowercase a". This special case is easy to add once the rest is done.

Doug


p.s.

I can write the method to determine if the search query is in order -
I've already written it in PHP :)



On Sun, Feb 19, 2012 at 10:03 PM, chen cheng <chengchendoc@xxxxxxxxx> wrote:
> Hi Doug,
>
> Yes, my initial idea was that the server-side and client-side search engines
> should be implemented the same way, or at least return similar results.
>
> I am also happy to work on the server side if I have enough time, but I have
> one question. I am not sure I understand your solution: do you mean that we
> develop a brand-new Java-based server? Or do we keep the current PHP server
> but implement the search algorithm in Java (perhaps even using Lucene in the
> future), with the PHP code invoking the Java search?
>
> I guess you mean the second option, right?
>
> Here is my plan for improving the SnipMatch client side:
>
> 1. Implement all of the design from my last post: new Eclipse preferences,
> local storage, GUI improvements, etc. for SnipMatch, leaving a data interface
> for the search engine (using a simple string comparison algorithm at the
> beginning, then improving the search engine later).
>
> 2. Wait for Doug's backend work, then implement the search engine for both
> the client side and the server side. After I finish the client-side work, I
> can work with Doug on the Java-based search engine for both sides. In fact, I
> suspect there may be other search engines in other Recommenders modules, such
> as the code completion feature. Is it possible for us to reuse an existing
> search engine? Marcel, I need your answer here :-)
>
> Doug & Marcel, is this plan OK? If everything is in place, I will write a
> detailed project proposal for merging and improving SnipMatch, and start
> coding soon. As this is a GSoC project, I need a project mentor, so I am
> hoping for your support here :-D
>
> 2012/2/20 Doug Wightman <douglasw@xxxxxxxxx>
>>
>> Thanks Cheng :)
>>
>> This sounds great. I have a suggestion, though: I think local search
>> should be implemented the same as "remote" (server-side) search, with
>> the local search results simply cached locally. There are several
>> advantages to this approach:
>> -one search algorithm (keeping results and UI behavior consistent for the
>> user)
>> -easier to blend results from the local and remote sources
>> -easier to build up the local cache (can automatically locally store
>> results that you have recently searched for locally, etc.)
>>
>> What do you think?
>>
>> Now, the one caveat is that if we take this approach you'll need to
>> wait for the backend to be ported to java (so that you can use the
>> same algorithm client-side), and I'm behind schedule on the port. If
>> you have time, it would be great to have your help writing the java
>> code for the server - we could do this together - you wouldn't need to
>> learn PHP.
>>
>> Doug
>>
>>
>>
>> On Sat, Feb 18, 2012 at 9:44 PM, chen cheng <chengchendoc@xxxxxxxxx>
>> wrote:
>> > Hi Doug,
>> >
>> > Finally, I get to see your words here :-D You guys did a really great job;
>> > I am just standing on the shoulders of giants, haha.
>> >
>> > As discussed with Marcel earlier, I have downloaded SnipMatch's client code
>> > and read it carefully. As for the server side, I am familiar with Java-based
>> > solutions but do not know PHP well, so I have not looked deeply into the
>> > server-side code. Fortunately, Doug will cover the server side, so I can
>> > focus on merging and improving the SnipMatch client.
>> >
>> > Below are my ideas and suggestions for us to discuss. If anything in my
>> > view is wrong, please point it out :-)
>> >
>> > Currently, SnipMatch works like this:
>> >
>> > The user enters a search query (such as "print"), and the SnipMatch client
>> > posts this query string to the server. The search engine on the server
>> > finds the results (such as "System.out.println()") and returns them to the
>> > client. I do not know the search engine's details, but I am sure we will
>> > keep improving it in the future. The client receives the result list, the
>> > user picks one result and fills in the parameters, and the snippet is
>> > inserted into the current editor.
>> >
>> > This solution works well, and we can call it the "Directly Search"
>> > solution, but I think it still has some problems:
>> > 1. No internet, no SnipMatch: the user cannot use SnipMatch at all without
>> > an internet connection. In areas with poor internet service, such as
>> > China, this will affect many people. Or consider a team working hard on a
>> > confidential project that, for security reasons, is not connected to the
>> > internet but still wants to use SnipMatch. What can they do?
>> > 2. The number of search queries will grow rapidly as the number of
>> > SnipMatch users grows, so we would have to invest more and more money in
>> > server-side hardware in the future.
>> >
>> > So I am thinking of offering another SnipMatch client solution; call it
>> > "Local Search". The user can choose whichever they prefer between
>> > "Directly Search" and "Local Search".
>> >
>> > The Local Search solution works like this:
>> > 1. The user stores snippet code templates in a local cache and storage.
>> > There are three kinds of code template:
>> >
>> > I. Public common templates, which are kept synchronized with the templates
>> > on the remote SnipMatch server. A background thread performs the
>> > synchronization periodically when the internet connection is idle.
>> > II. Personal templates, which are also kept synchronized with the user's
>> > own template base on the remote SnipMatch server (using the SnipMatch
>> > account as the store ID in the server database). The user can decide, in
>> > SnipMatch's Eclipse preferences, whether to share these templates with
>> > local anonymous users or with other logged-in users who have a SnipMatch
>> > account.
>> > III. Anonymous local templates, which every user (logged in or not) can
>> > use.
>> >
>> > 2. Since we store templates in local storage, we have to provide a local
>> > template search engine. At the beginning, it may be just a crude
>> > string-comparison engine, but we will improve it into a powerful
>> > Lucene-based one in the future. We could also consider reusing another
>> > Recommenders search engine.
>> >
>> > 3. Just as with Apache Maven, the user can set up third-party code template
>> > repositories besides SnipMatch's official site (currently
>> > http://languageinterfaces.com/). They would build the server-side service
>> > using SnipMatch's server-side solution, then add or remove remote
>> > addresses in SnipMatch's Eclipse preferences.
>> >
>> > Apart from Jakob's snippet editor work and the advanced local search
>> > engine, I can handle all the other tasks. If nobody suitable is available
>> > to work on the local search engine, I can build a simple one with proper
>> > interfaces, then return to it after all the other basic tasks are done.
>> >
>> > That is my initial plan. What do you think? There is not much time if we
>> > want to catch the Eclipse release, so maybe I should start development
>> > before GSoC starts in April.
>> >
>> >
>> >
>> > 2012/2/19 Doug Wightman <douglasw@xxxxxxxxx>
>> >>
>> >> Hi Cheng!
>> >>
>> >> Yes, please go ahead and merge SnipMatch into recommenders. Sorry for
>> >> the delay - I've been stuck organizing a conference (tei-conf.org)
>> >> that's starting tomorrow. I think this would be an awesome GSOC
>> >> project and would be happy to support any way that I can.
>> >>
>> >> Great to have you on board, happy to answer any questions you have!
>> >>
>> >> Doug
>> >>
>> >>
>> >> On Fri, Feb 17, 2012 at 9:22 PM, chen cheng <chengchendoc@xxxxxxxxx>
>> >> wrote:
>> >> > OK, I will read SnipMatch's source code first.
>> >> >
>> >> > One question: has the work of merging SnipMatch into Recommenders
>> >> > started yet? In my view, merging SnipMatch's source code into
>> >> > Recommenders is not hard. Do we have a full plan for improving
>> >> > SnipMatch's current features, or are we starting this part of the
>> >> > work from scratch?
>> >> >
>> >> > 2012/2/18 Marcel Bruch <bruch@xxxxxxxxxxxxxxxxxx>
>> >> >>
>> >> >> Hi Cheng,
>> >> >>
>> >> >> I was hoping Doug would jump in and provide more details on
>> >> >> SnipMatch. Maybe he's off for a few days.
>> >> >>
>> >> >> Your summary is right to the point. That's how the system works.
>> >> >> Doug has committed the sources to our Eclipselabs repositories. You
>> >> >> might spend some time on the sources and dive into the internals
>> >> >> here:
>> >> >> http://code.google.com/a/eclipselabs.org/p/code-recommenders/source/browse/?repo=kiyomi
>> >> >>
>> >> >> Let's wait a few days and get Doug and/or Zi involved...
>> >> >>
>> >> >> Best,
>> >> >> Marcel
>> >> >>
>> >> >>
>> >> >>
>> >>
>> >> _______________________________________________
>> >> recommenders-dev mailing list
>> >> recommenders-dev@xxxxxxxxxxx
>> >> http://dev.eclipse.org/mailman/listinfo/recommenders-dev
>> >
>> >
>> >
>> >
>> > --
>> > Best Regards From Cheng Chen [chengchendoc@xxxxxxxxx]
>> >
>
>
>
>
> --
> Best Regards From Cheng Chen [chengchendoc@xxxxxxxxx]
>



--
Best Regards From Cheng Chen [chengchendoc@xxxxxxxxx]
