Re: [recommenders-dev] How to start with Recommender's source code

Hi Cheng,

This sounds great! I'd like the local storage to use the same search
and ranking algorithm - it's actually quite simple - is this ok?
Please see the description below - it shouldn't be too hard to
implement. You can start with a simple matching algorithm (whatever
makes sense to you) and then refine it to work like this afterwards.
This would save a lot of time - the Java code written for local
storage could then also be used for remote storage.

SnipMatch search algorithm:
1) First, we perform a keyword search, looking for each of the words
in the search query within the snippet _search patterns_ (defined by
the snippet creators - e.g. "lowercase $str"). If there is at least
one overlapping word, we consider a match to have been found.
2) Then, we rank the results and display the top ten. There are two
cases for ranking:

a) The search pattern (defined by the snippet creator - e.g.
"lowercase $str") is _in-order_ with respect to the search query (e.g.
the search query might be "lowercase a" or "low", etc.).
b) The search pattern is _not_ in order with respect to the search query.

All in-order results are ranked higher than unordered results.
Finally, results are also ordered within these cases:

In-order ranking:
-results are ranked by the number of words in the search query that
match words in the search pattern. When there is a tie, we rank
results with fewer missing arguments (variables) higher.
Unordered ranking:
-results are ranked only by the number of words in the search query
that match words in the search pattern.
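
As a rough illustration, the search and ranking steps above could be
sketched in Java along the following lines. This is only a sketch, not
the existing PHP implementation. All class, method, and field names are
hypothetical, and it makes several assumptions the description does not
spell out: a query word matches a literal pattern word by prefix (so
"low" matches "lowercase"), words starting with "$" are arguments that
any query word can fill, and a query is "in order" when its words can
be placed left to right along the pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Illustrative sketch only: every class, method, and field name here is hypothetical. */
final class SnipMatchSearchSketch {

    /** One search hit, carrying the values the ranking rules need. */
    static final class Hit {
        final String pattern;
        final boolean inOrder;
        final int matchedWords;
        final int missingArgs;

        Hit(String pattern, boolean inOrder, int matchedWords, int missingArgs) {
            this.pattern = pattern;
            this.inOrder = inOrder;
            this.matchedWords = matchedWords;
            this.missingArgs = missingArgs;
        }
    }

    private static String[] words(String text) {
        String t = text.trim().toLowerCase(Locale.ROOT);
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    /** Index of the first pattern word at or after 'from' that can take the query word, or -1. */
    private static int indexFrom(String[] p, int from, String queryWord, boolean wantVariable) {
        for (int i = from; i < p.length; i++) {
            boolean isVariable = p[i].startsWith("$");
            // Assumption: literal words match by prefix, so "low" matches "lowercase".
            if (wantVariable ? isVariable : (!isVariable && p[i].startsWith(queryWord))) {
                return i;
            }
        }
        return -1;
    }

    /** Scores one pattern against the query; returns null when no word overlaps. */
    static Hit score(String pattern, String query) {
        String[] p = words(pattern);
        String[] q = words(query);

        // Step 1: keyword search over the literal (non-variable) pattern words.
        int matched = 0;
        for (String qw : q) {
            if (indexFrom(p, 0, qw, false) >= 0) matched++;
        }
        if (matched == 0) return null;
        int totalArgs = 0;
        for (String pw : p) {
            if (pw.startsWith("$")) totalArgs++;
        }

        // Step 2: in-order check. Walk the pattern left to right; each query word must
        // land on a later literal word or, failing that, fill a later variable.
        int pos = 0;
        int filledArgs = 0;
        boolean inOrder = true;
        for (String qw : q) {
            int next = indexFrom(p, pos, qw, false);
            if (next < 0) next = indexFrom(p, pos, qw, true);
            if (next < 0) { inOrder = false; break; }
            if (p[next].startsWith("$")) filledArgs++;
            pos = next + 1;
        }
        return new Hit(pattern, inOrder, matched, totalArgs - filledArgs);
    }

    /** Returns at most ten hits, ranked by the rules described above. */
    static List<Hit> search(List<String> patterns, String query) {
        List<Hit> hits = new ArrayList<>();
        for (String pattern : patterns) {
            Hit hit = score(pattern, query);
            if (hit != null) hits.add(hit);
        }
        hits.sort(Comparator
                .comparing((Hit h) -> !h.inOrder)                       // in-order hits first
                .thenComparingInt(h -> -h.matchedWords)                 // more matching words first
                .thenComparingInt(h -> h.inOrder ? h.missingArgs : 0)); // tie-break, in-order only
        return new ArrayList<>(hits.subList(0, Math.min(10, hits.size())));
    }

    public static void main(String[] args) {
        List<String> patterns = Arrays.asList(
                "$str to lowercase", "lowercase $str with locale $loc", "lowercase $str");
        for (Hit h : search(patterns, "lowercase a")) {
            System.out.println(h.pattern + " (in order: " + h.inOrder + ")");
        }
    }
}
```

Keeping the scoring in one place (score) and the ordering in one
comparator (search) is a design choice of this sketch; it is meant to
make it easy to start with a simpler matcher and tighten it later
without touching the ranking.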

This description is complete, with one non-essential exception: in the
current implementation, we also search for the types of the local
variables that are mentioned in the search query - for example, if you
search for "a" and "a" is a local string, you'll see results such as
"lowercase a". This special case is easy to add once the rest is done.
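
One possible shape for this special case is sketched below. It rests
on assumptions that go beyond the description above: that each snippet
declares a type for each of its arguments (the Snippet class and its
argTypes map are hypothetical), and that the editor can supply a map
from local variable names to type names. How the current
implementation obtains either piece of information is not covered
here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: the snippet model and all names here are hypothetical. */
final class LocalTypeLookupSketch {

    /** Assumed snippet model: a search pattern plus a declared type for each variable. */
    static final class Snippet {
        final String pattern;
        final Map<String, String> argTypes; // e.g. "$str" -> "String"

        Snippet(String pattern, Map<String, String> argTypes) {
            this.pattern = pattern;
            this.argTypes = argTypes;
        }
    }

    /**
     * For each query word that names a local variable, suggests every snippet that
     * has an argument of that variable's type, with the variable name filled in.
     */
    static List<String> suggest(List<Snippet> snippets, Map<String, String> localTypes, String query) {
        List<String> suggestions = new ArrayList<>();
        for (String word : query.trim().split("\\s+")) {
            String type = localTypes.get(word);
            if (type == null) continue; // not a known local variable
            for (Snippet snippet : snippets) {
                for (Map.Entry<String, String> arg : snippet.argTypes.entrySet()) {
                    if (arg.getValue().equals(type)) {
                        suggestions.add(snippet.pattern.replace(arg.getKey(), word));
                        break; // fill only the first argument of a matching type
                    }
                }
            }
        }
        return suggestions;
    }

    public static void main(String[] args) {
        List<Snippet> snippets = new ArrayList<>();
        snippets.add(new Snippet("lowercase $str", Collections.singletonMap("$str", "String")));
        snippets.add(new Snippet("increment $n", Collections.singletonMap("$n", "int")));
        // "a" is assumed to be a local String in the editor context.
        System.out.println(suggest(snippets, Collections.singletonMap("a", "String"), "a"));
    }
}
```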

Doug


p.s.

I can write the method to determine if the search query is in order -
I've already written it in PHP :)



On Sun, Feb 19, 2012 at 10:03 PM, chen cheng <chengchendoc@xxxxxxxxx> wrote:
> Hi Doug,
>
> Yes, my initial idea was that the server-side and client-side search engines
> should be implemented the same way, or at least return similar results.
>
> I am also happy to work on the server side if I have enough time, but I have
> one question, because I am not sure I understand your solution. Do you mean
> we develop a brand new Java-based server? Or do we keep the current PHP
> server but implement the search algorithm in Java (perhaps using Lucene
> etc. in the future), with the PHP code invoking the Java search?
>
> I guess you mean the second option, right?
>
> Here is my plan for improving the SnipMatch client side:
>
> 1. Implement everything in the design from my last post: create the new
> Eclipse preferences and local storage, improve the GUI, etc. for SnipMatch.
> Leave a data interface for the search engine (use a simple string comparison
> algorithm at the beginning, then improve the search engine in the future).
>
> 2. Wait for Doug's backend work, then implement the search engine for both
> the client side and the server side. After I have finished the client-side
> work, I can work together with Doug on the Java-based search engine for both
> sides. In fact, I think there may be other search engines in other
> Recommenders modules, such as the code completion feature. Is it possible
> for us to use an existing search engine? Marcel, I need your answer here :-)
>
> Doug & Marcel, is this plan OK? If everything is in place, I will write a
> detailed project proposal for merging and improving SnipMatch and start
> coding soon. As this is a GSoC project, I also need a project mentor, so I am
> hoping for your support here :-D
>
> 2012/2/20 Doug Wightman <douglasw@xxxxxxxxx>
>>
>> Thanks Cheng :)
>>
>> This sounds great. I have a suggestion, though: I think local search
>> should be implemented the same as "remote" (server-side) search, with
>> the local search results simply cached locally. There are several
>> advantages to this approach:
>> -one search algorithm (keeping results and UI behavior consistent for the
>> user)
>> -easier to blend results from the local and remote sources
>> -easier to build up the local cache (can automatically store locally the
>> results that you have recently searched for, etc.)
>>
>> What do you think?
>>
>> Now, the one caveat is that if we take this approach you'll need to
>> wait for the backend to be ported to Java (so that you can use the
>> same algorithm client-side), and I'm behind schedule on the port. If
>> you have time, it would be great to have your help writing the Java
>> code for the server - we could do this together - you wouldn't need to
>> learn PHP.
>>
>> Doug
>>
>>
>>
>> On Sat, Feb 18, 2012 at 9:44 PM, chen cheng <chengchendoc@xxxxxxxxx> wrote:
>> > Hi Doug,
>> >
>> > Finally, good to hear from you here :-D You guys did a really great job; I
>> > am just standing on the shoulders of giants, haha.
>> >
>> > As discussed with Marcel earlier, I have downloaded and carefully read
>> > SnipMatch's client code. On the server side, I am familiar with Java-based
>> > solutions but do not know much PHP, so I have not looked deeply into the
>> > server-side code. Fortunately, Doug will cover the server side, so I can
>> > focus on merging and improving the SnipMatch client.
>> >
>> > Below are my ideas and suggestions for us to discuss. If something in my
>> > thinking is wrong, please point it out :-)
>> >
>> > Right now, SnipMatch works like this:
>> >
>> > The user enters a search query (such as "print"), and the SnipMatch client
>> > posts this query string to the server. The search engine on the server
>> > finds the results (such as "System.out.println()") and returns them to the
>> > client. I do not know the search engine's details, but I am sure we will
>> > keep improving it in the future. The client receives the result list, and
>> > the user picks one result, fills in the parameters, and inserts it into
>> > the current editor.
>> >
>> > This solution works well, and we can call it the "Directly Search"
>> > solution, but I think it still has some problems:
>> > 1. No internet, no SnipMatch: the user cannot use SnipMatch at all without
>> > an internet connection. In areas with poor internet service, such as
>> > China, this will affect many people. Or consider a team working hard on a
>> > confidential project that is kept off the internet for security reasons
>> > but still wants to use SnipMatch. What can they do?
>> > 2. The number of search queries will grow rapidly as the number of
>> > SnipMatch users grows, so we will have to spend more and more money on
>> > server-side hardware in the future.
>> >
>> > So I am thinking of offering another SnipMatch client solution, called
>> > "Local Search". The user can choose whichever he prefers between
>> > "Directly Search" and "Local Search".
>> >
>> > The Local Search solution works like this:
>> > 1. The user keeps snippet code templates in a local cache and storage.
>> > There are three kinds of code template:
>> >
>> > I. Public common templates, which are kept in sync with the templates on
>> > the remote SnipMatch server. A thread synchronizes them periodically
>> > when the internet connection is free.
>> > II. Personal templates, which are also kept in sync with the user's own
>> > template base on the remote SnipMatch server (using the SnipMatch account
>> > as the store ID in the server's database). In SnipMatch's Eclipse
>> > preferences, the user can decide whether to share these templates with
>> > local anonymous users or with other logged-in users who have a SnipMatch
>> > account.
>> > III. Anonymous local templates, which every user (logged in or not) can
>> > use.
>> >
>> > 2. Since we store templates in local storage, we have to provide a local
>> > template search engine. At the beginning it may be just a crude
>> > string comparison engine, but we will improve it into a powerful
>> > Lucene-based one in the future. We could also consider reusing one of
>> > Recommenders' other search engines.
>> >
>> > 3. Just as with Apache Maven, the user can set up third-party code
>> > template repositories besides SnipMatch's official site (currently
>> > http://languageinterfaces.com/). They build the server-side service using
>> > SnipMatch's server-side solution, then add or remove the remote address in
>> > SnipMatch's Eclipse preferences.
>> >
>> > Apart from Jakob's snippet editor work and an advanced local search
>> > engine, I can handle all the other tasks. If nobody suitable is available
>> > to work on the local search engine, I can build a simple one with proper
>> > interfaces, then come back to it after all the other basic tasks are
>> > done.
>> >
>> > This is my initial plan. What do you think? There is not much time if we
>> > want to catch the Eclipse release, so maybe I should start development
>> > before GSoC starts in April.
>> >
>> >
>> >
>> > 2012/2/19 Doug Wightman <douglasw@xxxxxxxxx>
>> >>
>> >> Hi Cheng!
>> >>
>> >> Yes, please go ahead and merge SnipMatch into recommenders. Sorry for
>> >> the delay - I've been stuck organizing a conference (tei-conf.org)
>> >> that's starting tomorrow. I think this would be an awesome GSOC
>> >> project and would be happy to support any way that I can.
>> >>
>> >> Great to have you on board, happy to answer any questions you have!
>> >>
>> >> Doug
>> >>
>> >>
>> >> On Fri, Feb 17, 2012 at 9:22 PM, chen cheng <chengchendoc@xxxxxxxxx> wrote:
>> >> > OK, I will read SnipMatch's source code first.
>> >> >
>> >> > One question: has the work of merging SnipMatch into Recommenders
>> >> > started yet? In my view, it is not hard if we just merge SnipMatch's
>> >> > source code into Recommenders. Do we have a full plan for improving
>> >> > SnipMatch's current features, or are we starting this part of the work
>> >> > from scratch?
>> >> >
>> >> > 2012/2/18 Marcel Bruch <bruch@xxxxxxxxxxxxxxxxxx>
>> >> >>
>> >> >> Hi Cheng,
>> >> >>
>> >> >> I was hoping Doug would jump in and provide more details on SnipMatch.
>> >> >> Maybe he's off for a few days.
>> >> >>
>> >> >> Your summary is right on point. That's how the system works. Doug has
>> >> >> committed the sources to our Eclipselabs repositories. You might spend
>> >> >> some time on the sources and dive into the internals here:
>> >> >>
>> >> >> http://code.google.com/a/eclipselabs.org/p/code-recommenders/source/browse/?repo=kiyomi
>> >> >>
>> >> >> Let's wait a few days and get Doug and/or Zi involved...
>> >> >>
>> >> >> Best,
>> >> >> Marcel
>> >> >>
>> >> >>
>> >> >>
>> >>
>> >> _______________________________________________
>> >> recommenders-dev mailing list
>> >> recommenders-dev@xxxxxxxxxxx
>> >> http://dev.eclipse.org/mailman/listinfo/recommenders-dev
>> >
>> >
>> >
>> >
>> > --
>> > Best Regards From Cheng Chen [chengchendoc@xxxxxxxxx]
>> >
>
>
>
>
> --
> Best Regards From Cheng Chen [chengchendoc@xxxxxxxxx]
>