MantisBT - Hall D Offline
View Issue Details
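ID: 0000150
Project: Hall D Offline
Category: General
View Status: public
Date Submitted: 2011-10-18 11:03
Last Update: 2011-11-07 14:43
Reporter: davidl
Assigned To: staylor
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: fixed
Summary: 0000150: Reconstruction hangs on certain events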
Here is the e-mail originally sent by Igor Senderovich:

Hi David, hi Simon,

I've isolated an event that essentially hangs the reconstruction, i.e. halts with 100% CPU
http://zeus.phys.uconn.edu/~senderovich/tmp/hdg_smeared_badevt.hddm

You can reproduce this with the phys_tree plugin. Trying to view the event with hdview2 didn't work either, presumably because of the same problem: it froze on me too.

I suspect the rate of such chokes is about 1 in a hundred events. This was event 136. Running reconstruction with 7 threads crippled all the threads before reaching event 1000. (I can try to isolate the others for you.)

I should note that this is with revision 8351.

Thanks, and let me know how I could help.
-Igor
No tags attached.
Issue History
2011-10-18 11:03  davidl   New Issue
2011-10-24 16:54  staylor  Note Added: 0000186
2011-10-24 16:58  staylor  Status  new => confirmed
2011-10-24 16:58  staylor  Status  confirmed => assigned
2011-10-24 16:58  staylor  Assigned To  => staylor
2011-11-07 14:42  staylor  Note Added: 0000198
2011-11-07 14:43  staylor  Status  assigned => resolved
2011-11-07 14:43  staylor  Resolution  open => fixed

Notes
(0000186)
staylor
2011-10-24 16:54
I have confirmed the behavior on ifarm1102: the code hangs on this event and eventually causes a segmentation fault. I have isolated the problem to what appears to be nonsense in the start counter (T=9.13056e+22) for one of the tracks. When the DChargedTrackHypothesis code tries to match this to the RF bucket it effectively enters an infinite loop...
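The note does not show the matching code, so the following is only a sketch of one plausible way such a hang can arise. It assumes the RF bucket is found by stepping a reference time one RF period at a time until it lies within half a period of the detector time. The function name, the period value, and the step cap are all hypothetical and are not taken from DChargedTrackHypothesis.

```cpp
#include <cmath>

// Hypothetical RF period in ns; the real value is not given in this report.
constexpr double kRFPeriodNs = 2.004;

// Sketch of a stepping-style bucket match (an assumption, not the actual code).
// The RF reference time is moved one period at a time toward the detector time.
// max_steps exists only so that this sketch terminates; a loop without such a
// cap would effectively never finish for a detector time like 9.13056e+22.
bool MatchByStepping(double t_det, double t_rf, long max_steps, double &t_matched) {
    for (long i = 0; i < max_steps; ++i) {
        const double diff = t_det - t_rf;
        if (std::fabs(diff) <= 0.5 * kRFPeriodNs) {
            t_matched = t_rf;  // reference time of the closest bucket
            return true;
        }
        t_rf += (diff > 0.0) ? kRFPeriodNs : -kRFPeriodNs;
    }
    return false;  // gave up: the time is too far from the reference
}
```

With a sensible time the loop converges in a few steps. With T=9.13056e+22 it would need on the order of 1e22 steps, which in practice looks like a hang at 100% CPU.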
(0000198)
staylor
2011-11-07 14:42
I have put a trap in the code to deal with nonsense in the start counter (or any other detector) by looking at times relative to the smallest drift time in the drift chambers.
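A trap of the kind described above might look like the sketch below. It is an illustration under stated assumptions, not the code that was committed: the function name and the acceptance window are hypothetical, and the only idea taken from the note is that a detector time is judged relative to the smallest drift time in the drift chambers.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical acceptance window in ns; the report does not state a value.
constexpr double kMaxTimeOffsetNs = 1000.0;

// Returns true if the detector time is finite and lies within the window
// around the smallest drift time. Times that fail this check would be
// skipped before any RF bucket matching is attempted.
bool IsTimeSane(double t_det, const std::vector<double> &drift_times) {
    if (drift_times.empty() || !std::isfinite(t_det)) {
        return false;  // no reference available, or the time is NaN/inf
    }
    const double t_ref =
        *std::min_element(drift_times.begin(), drift_times.end());
    return std::fabs(t_det - t_ref) <= kMaxTimeOffsetNs;
}
```

A caller would run this check on each detector time and ignore any hit that fails it, so a value such as 9.13056e+22 never reaches the matching step.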