Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
Re: [cdt-dev] [DSF] Making Sequence more robust

Hi,

I think it would be really useful if we could make
the use of RequestMonitors in DSF more robust.
Not calling done() is one of the biggest issues of DSF
in my mind.

There is no easy way to enforce calling done() as we
are dealing with asynchronous programming.

I believe the way to make this more robust is through
static code analysis.  I envision a DSF-tools plugin
that can be installed in my development Eclipse; that
plugin will statically analyse classes that use RequestMonitors
and will determine if there is a missing done() call.

This is similar to the compiler making sure that a 'return'
is used on every code path of a method that returns a value.
In fact, if we can get that algorithm, we can enhance it
to fit our needs.

For example:

void dsfMethod(RequestMonitor rm) {
  if (b) {
    rm.done();
  } else {
    // can detect missing rm.done();
  }
}

void dsfMethod2(RequestMonitor rm) {
  if (b) {
    // can detect missing rm.done();
    return;
  }
  // can detect missing rm.done();
}

It's probably not a trivial effort, but it also shouldn't
be impossible.  I think the problem is finding someone
willing to spend the time on it.

Marc


> -----Original Message-----
> From: cdt-dev-bounces@xxxxxxxxxxx 
> [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Pawel Piech
> Sent: Thursday, October 20, 2011 1:40 PM
> To: Vladimir Prus
> Cc: cdt-dev@xxxxxxxxxxx
> Subject: Re: [cdt-dev] [DSF] Making Sequence more robust
> 
> On 10/20/2011 05:49 AM, Vladimir Prus wrote:
> > On Monday, October 17, 2011 20:20:13 Pawel Piech wrote:
> >
> > Pawel,
> >
> > thanks for guidance here. I have a couple of questions:
> >
> > 1. I haven't quite worked our through the complexities of 
> catching runtime
> > exceptions by DsfExecutor, but I wanted to clarify what the 
> goal should be.
> > At least for launch code, the workflow is that GdbLaunch 
> delegates executes
> > startup sequences at waits for them using Query, so I guess 
> the right goal
> > is that all runtime exceptions thrown while we're 'inside' 
> the execute
> > method of a query end up in 'get' method of the query throwing?
> Yes, I think that's perfectly reasonable.  Adding a guard in 
> Query.run() 
> to catch runtime exceptions is a simple fix for a limited set 
> of cases.  
> The only downside I can think of is that it may give the caller an 
> impression that all cases where the rm is not completed will 
> be caught.  
> Another remote possibility would be that a runtime exception could be 
> thrown after the RM is passed down to some other service.  Then when 
> that service eventually completes, the rm.done() will have 
> been called 
> twice.  Overall though I think the benefit is worth these two risks.
> 
> > 2. While a general repository for outstanding RM's might be 
> a good idea,
> > what would you say if we start small, and implement the 
> solution that would
> > remember outstanding RequestMonitor instances for some 
> limited cases (for
> > example, 'target *-remote' commands in DSF-GDB, and then 
> cancel those
> > RequestMonitors after a timeout? This is somewhat 
> orthogonal from the
> > initial goal of catching lost RMs, but seems sufficiently 
> related and
> > immediately useful.
> This is something that could be done at the service level, 
> where these 
> commands are issued.  If we expect that the back-end may 
> never complete 
> some commands then in those cases a timeout guard should even be 
> required.  For example, we queue commands with the command 
> control, but 
> we count on the process monitor to notify us if the back end process 
> dies.  At that point we complete any outstanding command in 
> queue with 
> an error.  So our process monitor is the guard.
> > 3. What do you think about changing the interface of RequestMonitor
> > along the lines I have originally suggested, so that the implementor
> > is forced to declare success or failure using return value?
> While the sequence is designed to perform async operations in 
> each step, 
> in reality most steps return immediately.  So your suggestion could 
> simplify majority of those steps.  I would suggest to introduced an 
> extension: SyncStep, which would then include the syncExecute() or 
> (xexecute) method.
> 
> Cheers,
> Pawel
> 
> > Thanks,
> >
> >> I think there's at least a couple things we could do to 
> improve this:
> >>
> >> *Report runtime exceptions thrown in DSF executor.*
> >> Currently a quirk in the implementation of 
> ScheduledThreadPoolExecutor
> >> keeps us from catching these exceptions unless assertions 
> are enabled.
> >> If a brave souls steps up to do a clean-room reimplement 
> this class we
> >> could get around this bug.  It wouldn't need to be as 
> efficient or as
> >> complicated implementation because we only need a single 
> thread.  Or
> >> maybe someone more clever can think of a way to get around 
> this problem
> >> without reimplementing or degrading performance. (bug 236915)
> >> *
> >> Create a registry for outstanding request monitors.*
> >> Most request monitors are passed around between functions, 
> sometimes
> >> directly as parameters to functions, and sometimes as 
> parents to other
> >> RMs.  There's only a few clients outside of the DSF framework that
> >> collect request monitors and don't complete them 
> immediately: caches and
> >> things like the GDB command control.  We could catch the 
> lost RMs if we
> >> create a registry where RMs are added when created and removed when
> >> collected by a service or another component.  So a few 
> example flows
> >> would be:
> >>
> >> _Simple completion_
> >> - An RM is created and added to the registry.
> >> - A service calls RM.done() and the RM is removed from the 
> registry.
> >>
> >> _Waiting on async completion_
> >> - An RM is created and added to the registry.
> >> - A service sends a command to an asynchronous data 
> provider (command
> >> control)
> >> - The service manually removes the RM from registry.
> >>
> >> _RM cached, waiting on another RM to complete
> >> _- An RM is created and added to the registry.
> >> - A service finds that another RM is already waiting in 
> cache for the
> >> same result
> >> - Cache removes the RM from registry.
> >> _
> >> Stacked RM waiting on async completion_
> >> - An RM is created and added to the registry.
> >> - Another method takes the RM and adds it as a parent to 
> another RM.
> >> - A service sends a command to async. data provider:
> >> - Service removes RM from registry.
> >> - RM removes its parent from registry too.
> >>
> >> At the end of any sequence there should be no RMs in the 
> registry.  So a
> >> post-executor if a DSF executor could check for RMs in the 
> registry and
> >> log exception or even immediately halt execution.
> >>
> >> We'd chance a lot of false positives at first and we may 
> have some hard
> >> to resolve edge cases, but with some work we'd  end up with a more
> >> predictable system.  Also, for backward compatibility, 
> this tracking
> >> would need to be optional.
> >>
> >> Cheers,
> >> Pawel
> >>
> >> On 10/16/2011 03:07 AM, Vladimir Prus wrote:
> >>> Hi,
> >>>
> >>> presently, if any step of a Sequence fails to call 'done' 
> or 'cancel',
> >>> the whole Sequence just hangs, and it's rather painful to debug.
> >>>
> >>> Can we do something to improve this long-standing 
> problem? For example:
> >>>       abstract public static class Step {
> >>>
> >>>           ....
> >>>
> >>>           public final void execute(RequestMonitor rm) { 
> // note: final.
> >>> 			
> >>> 			xexecute(rm);
> >>> 			if (!rm.isDone()) // isDone would have 
> to be added
> >>> 			{
> >>> 			
> >>> 				rm.setStatus(new 
> Status(IStatus.ERROR, ..., "Misbehaving step",
> >>> 				....));
> >>> 				
> >>> 	            rm.done();
> >>> 			
> >>> 			}
> >>> 			
> >>>           }
> >>> 		
> >>> 		abstract protected void xececute(RequestMonitor rm);
> >>> 		
> >>>       }
> >>>
> >>> Or, we can take take this even further:
> >>>       abstract public static class Step {
> >>>
> >>>           ....
> >>>
> >>>           public final void execute(RequestMonitor rm) { 
> // note: final.
> >>> 			
> >>> 			rm.setStatis(xexecute(rm))
> >>> 			rm.done();
> >>> 			
> >>>           }
> >>> 		
> >>> 		abstract protected IStatus xececute(RequestMonitor rm);
> >>> 		
> >>>       }
> >>>
> >>> In this case, the compiler will object if a step fails to 
> return a value,
> >>> although there's a risk that a step might fail to handle 
> failure of any
> >>> operation it does and still return IStatus.OK anyway.
> >>>
> >>> Comments?
> 
> _______________________________________________
> cdt-dev mailing list
> cdt-dev@xxxxxxxxxxx
> https://dev.eclipse.org/mailman/listinfo/cdt-dev
> 

Back to the top