Re: [recommenders-dev] Stacktrace parser/detector

Sorry for answering from a digest; I've already changed that configuration.

@Johannes

The papers are attached. I have only found four papers, and even they don't treat the topic exactly the way we are trying to address it.

The most similar to what we are trying to do is Identifying Software Problems Using Symptoms. It is an old paper and takes a lower-level approach based on memory dumps, but the idea is the same: it takes the memory dump (with the runtime stack trace) and tries to determine whether the failure is a recurrence or not. The paper carefully describes the problems that can arise in this kind of task, which are useful for us to keep in mind even if they are somewhat common sense. For instance, it defines two types of symptoms (a symptom can be seen as a synonym for a stack trace): one caused by wrong data and the other caused by wrong logic. It also recalls that one symptom can stem from different faults (in a system) and one fault can show different symptoms. The paper also defines some heuristics for stack trace matching and evaluates them in a real case.

The paper Do Stack Traces Help Developers Fix Bugs? basically investigates how far down a stack trace a developer goes to diagnose the software fault. Thus, it can be useful as a heuristic for searching for similar stacks.

The paper What Would Other Programmers Do? Suggesting Solutions to Error Messages is, as the title suggests, more focused on suggesting solutions to developers. But to do that it uses stack traces to index similar errors (and also compiler errors, which are not relevant to us). To use the stack as an index it proposes a very simple algorithm, which is among the ones proposed in the first paper.

The last paper's title says it all: Extracting Structural Information from Bug Reports.

Great idea to have a crawler for platforms like Bugzilla; I hadn't thought about that. In my “roadmap” I was only thinking about users inserting the stacks via IDEs or the web interface and then discussing them using the web interface.

Regarding what I have implemented: well, not much. I have a stack trace parser which basically breaks down the stack lines. The implementation can also persist the parsed stacks. The benefit of doing that is that we can be much more flexible when implementing the matchers, for example by ignoring some lines or even parts of them. I was also finishing my first matcher, which is the most “rigid” or basic one: it matches stacks by comparing them completely (all lines should match, in the same order). Based on my reading of the papers, I was planning to have at least 4 more: differ by one line, partial match (which matches on chained stack traces), top down and bottom up (maybe in this order for the relevance of results :-). Each of these matchers can use different approaches to match the stack lines. So far I have thought of: complete match, ignoring-line match (ignores the source code line number of the method) and wildcard match (ignores some defined portion of the lines).
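
To make the parser and matcher ideas above concrete, here is a minimal sketch in Python (not the actual Scala code in buggenome). It assumes Java-style frame lines of the form `at package.Class.method(File.java:42)`; every name in it (`Frame`, `parse_frames`, `exact_match`, `top_down_match`, `bottom_up_match` and the line matchers) is hypothetical and only illustrates how stack-level matchers could be combined with interchangeable line-level strategies. The "differ by one line" and "partial match" matchers are not covered.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# Assumed frame format: "at org.example.Foo.bar(Foo.java:42)"
FRAME_RE = re.compile(r"^\s*at\s+([\w.$<>]+)\.([\w$<>]+)\((.*?)\)\s*$")
FIELDS = ("class_name", "method", "source_file", "line")


@dataclass(frozen=True)
class Frame:
    class_name: str
    method: str
    source_file: Optional[str]  # None for e.g. "Native Method"
    line: Optional[int]         # None when no line number is given


def parse_frames(trace: str) -> List[Frame]:
    """Break a stack trace into frames; lines that are not frames are skipped."""
    frames = []
    for raw in trace.splitlines():
        m = FRAME_RE.match(raw)
        if not m:
            continue
        class_name, method, location = m.groups()
        source_file, _, line_no = location.partition(":")
        frames.append(Frame(
            class_name,
            method,
            source_file if line_no else None,
            int(line_no) if line_no.isdigit() else None,
        ))
    return frames


# Line-level strategies: each decides whether two frames count as equal.
def complete_match(a: Frame, b: Frame) -> bool:
    return a == b


def ignoring_line_match(a: Frame, b: Frame) -> bool:
    # Ignores the source code line number, compares everything else.
    return (a.class_name, a.method, a.source_file) == (b.class_name, b.method, b.source_file)


def wildcard_match(*ignored: str) -> Callable[[Frame, Frame], bool]:
    # Builds a strategy that ignores the named fields of a frame.
    def matcher(a: Frame, b: Frame) -> bool:
        return all(getattr(a, f) == getattr(b, f) for f in FIELDS if f not in ignored)
    return matcher


# Stack-level matchers: each takes the line-level strategy as a parameter.
def exact_match(a: List[Frame], b: List[Frame], same=complete_match) -> bool:
    """All lines must match, in the same order."""
    return len(a) == len(b) and all(same(x, y) for x, y in zip(a, b))


def top_down_match(a: List[Frame], b: List[Frame], depth: int, same=complete_match) -> bool:
    """Compare only the first `depth` frames (depth >= 1)."""
    return exact_match(a[:depth], b[:depth], same)


def bottom_up_match(a: List[Frame], b: List[Frame], depth: int, same=complete_match) -> bool:
    """Compare only the last `depth` frames (depth >= 1)."""
    return exact_match(a[-depth:], b[-depth:], same)
```

Keeping the line-level strategy as a parameter means any stack-level matcher can be paired with any of the three line comparisons, for example `top_down_match(a, b, 3, ignoring_line_match)`.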

I'm using Scala (sbt + specs + IntelliJ) and MongoDB. I was planning to use Scalatra or Bowler for a REST interface to this core. I was using them only to learn some new technology, so I don't have any problem with changing that.

@Marcel

When I was taking one of my master's courses (around 2008) I was already thinking about implementing something similar to this and tried to use Lucene. I wasn't very successful with Lucene, but my approach was very different at that time: I didn't use Lucene to match stacks directly, but to better calibrate (improve the precision of) the results returned by Google. Good to know that you have successfully implemented that.

So you are also exposing a server side through a REST API and using a document database. Interesting :-).

At the stage you are at, I think the web UI is a natural step. In fact, I was just finishing the basic matcher so that I could try to sketch some UI. So, for me it is OK to go with that. Let's try it. I've imagined some screens in my mind, but haven't stopped to draw anything. And, as you said, the UI is a really important component of the system. It should make searching feel natural and should really help people discuss and find useful information about the faults. Some touch of social web (or Web 2.0) could also be considered, such as users indicating that one stack is related to the same fault as another stack. But that is only for future versions.

Finally, regarding the papers: well, as a PhD student this is always welcome :-). Nevertheless, I really want to push this toward a “real” project and not stop at the prototype. I don't know what your expectations are about that. For the evaluations, the Eclipse Bugzilla dataset will be very useful. I work as a software architect at Petrobras and can conduct some evaluations there in the future ;-). My research group's expertise is Experimental Software Engineering, and my master's was about using the Action Research methodology in Software Engineering, so we can think about future studies.

Regards,
Paulo.

On Tue, May 31, 2011 at 1:02 PM, <recommenders-dev-request@xxxxxxxxxxx> wrote:

Today's Topics:

  1. Stacktrace parser/detector (Paulo Sérgio Medeiros)
  2. Re: Stacktrace parser/detector (Johannes Lerch)
  3. Re: Stacktrace parser/detector (Marcel Bruch)


----------------------------------------------------------------------

Message: 1
Date: Tue, 31 May 2011 00:45:21 -0300
From: Paulo Sérgio Medeiros <pasemes@xxxxxxxxx>
To: recommenders-dev@xxxxxxxxxxx
Subject: [recommenders-dev] Stacktrace parser/detector
Message-ID: <BANLkTinnsbD6oMcfi_hZ77w78qDF_CCHDw@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="iso-8859-1"

Hi all,

I'm coming here after my first contact with Marcel on his post at the Code
Recommenders blog (
http://code-recommenders.blogspot.com/2011/05/oh-stacktrace-my-stacktrace.html
).

I have already stated in my comment there that I'm trying to build (
https://github.com/pasemes/buggenome - very initial stage, look at the
devmatcher branch for a more recent code base) something very similar to the
ideas presented in the post. So, the intent of this thread is to offer
to contribute to the initiative.

First, I would like to know if you have already elaborated something like
an initial list of features or an initial sketch of the concepts and/or
architecture.

I don't have much documentation on what I have done. But to organize my
initial ideas I have searched for papers (I'm a PhD student :-) to see if
anyone had implemented something similar. I found some papers that describe
some heuristics to search for similar stacks. However, reading Marcel's post I
realized that I had missed that there are implementations of similar stack
detectors in Jira and Bugzilla :-(. Anyway, I can send the papers if you are
interested. Another thing that I have is a mind map with some of my
understanding of the readings. It's in Portuguese, my mother tongue, but I
can translate it easily and send it to you too.

So, that's it for now. I just want to hear what steps to follow. I took a
look at the project wiki and saw that you have lots of docs for
contributors, but I'll wait for your pointers on how to set up the source
control tools and stuff like that if that's the case.


Best regards,
Paulo Sérgio.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://dev.eclipse.org/mailman/private/recommenders-dev/attachments/20110531/608ae72d/attachment.htm>

------------------------------

Message: 2
Date: Tue, 31 May 2011 09:28:23 +0200
From: Johannes Lerch <lerch@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
To: Recommenders developer discussions <recommenders-dev@xxxxxxxxxxx>
Subject: Re: [recommenders-dev] Stacktrace parser/detector
Message-ID: <BANLkTi=h=sNtBa5B7NfrDoZ_nn6evab+eQ@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset=ISO-8859-1

Hi Paulo Sérgio,

can you send me the papers you found?

We have some initial thoughts about what we want to do in the first
steps, but I think these ideas are not documented yet. The first step is
to crawl multiple platforms like Bugzilla for stacktraces and save
those with a link to their source. Once we have an initial dataset we
will create a search index and provide a platform on which anyone can
search for a stacktrace. We should be able to show links to the
previously crawled resources on the web which match the searched
stacktrace.
A later step could be that we allow discussions and help topics on
that search platform itself.

Can you outline what is implemented in your project already?

Regards,
Johannes




------------------------------

Message: 3
Date: Tue, 31 May 2011 10:13:46 +0200
From: Marcel Bruch <marcel.bruch@xxxxxxxxx>
To: Recommenders developer discussions <recommenders-dev@xxxxxxxxxxx>
Subject: Re: [recommenders-dev] Stacktrace parser/detector
Message-ID: <0DE07707-331F-4502-B290-198303CC8572@xxxxxxxxx>
Content-Type: text/plain; charset=iso-8859-1

Just to extend Johannes' mail.

We conducted several evaluations on how well duplicate detection with various search algorithms works on the Eclipse Bugzilla dataset. We evaluated these search engines using various Apache Lucene configurations (own word splitter, different scoring etc.) as well as several classifiers.

The UI screenshots you see in the blog post are from a student prototype created during a hands-on. We have set up a preliminary server side based on JAX-RS (Oracle Jersey) + Apache CouchDB + Apache Lucene to store and search for similar stacktraces. Metadata to represent stacktraces is also there, but is slightly different from your representation in Scala. BTW: it's all written in Java.

We also have a preliminary version of a crawler for various forums to identify stacktraces in forum posts etc. This is meant to create an initial dataset to see how well this engine actually works.
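
As an illustration only (this is not the crawler's code, which is not shown in this thread), identifying stacktraces in a forum post could be sketched as below. The sketch is in Python and assumes Java-style traces; the function name `find_stack_traces`, both regular expressions and the `min_frames` threshold are assumptions made for the example. It looks for runs of consecutive `at ...(...)` lines and pulls in the preceding line when it looks like an exception header. `Caused by:` chains and `... N more` lines end a run, so a chained trace would come back in pieces.

```python
import re
from typing import List

# Assumed shapes of a Java frame line and of an exception header line.
FRAME_RE = re.compile(r"^\s*at\s+[\w.$<>]+\([^)]*\)\s*$")
HEADER_RE = re.compile(
    r'^\s*(?:Exception in thread "[^"]*"\s+|Caused by:\s+)?[\w.$]+(?:Exception|Error)\b.*$'
)


def find_stack_traces(text: str, min_frames: int = 2) -> List[str]:
    """Return each run of at least `min_frames` consecutive frame lines."""
    lines = text.splitlines()
    traces = []
    i = 0
    while i < len(lines):
        if not FRAME_RE.match(lines[i]):
            i += 1
            continue
        start = i
        while i < len(lines) and FRAME_RE.match(lines[i]):
            i += 1
        if i - start >= min_frames:
            # Include the line before the run when it looks like a header.
            if start > 0 and HEADER_RE.match(lines[start - 1]):
                start -= 1
            traces.append("\n".join(lines[start:i]))
    return traces
```

Requiring at least two frame lines is a crude guard against ordinary sentences that happen to start with "at"; a real crawler would probably need something more robust.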

Just to make a quick shot:

We need a clear vision of what the UI should look like and how users will/should use the tool. We have thought of a web UI + an Eclipse UI. How about starting to design a web UI that gives a clear intuition of how people could use the web interface? I think this UI could be quite simple w/o advanced editing functionality etc.

Paulo, what are your thoughts? Do you want to come up with a draft for a web UI? Afterwards, we should think about how we fill the data into the server side. And writing a paper would always be interesting (that's actually one of our goals too, so we could both benefit from joint work). I already have several evaluation ideas in mind...


Best,
Marcel




------------------------------

_______________________________________________
recommenders-dev mailing list
recommenders-dev@xxxxxxxxxxx
http://dev.eclipse.org/mailman/listinfo/recommenders-dev


End of recommenders-dev Digest, Vol 5, Issue 6
**********************************************

Attachment: Do Stack Traces Help Developers Fix Bugs.pdf
Description: Adobe PDF document

Attachment: Extracting Structural Information from Bug Reports.pdf
Description: Adobe PDF document

Attachment: Identifying Software Problems Using Symptoms.pdf
Description: Adobe PDF document

Attachment: What Would Other Programmers Do.pdf
Description: Adobe PDF document

