Difference between revisions of "Investigations and Repairs"

From GlueXWiki
Line 1: Line 1:
This page is intended to be a reference for past and ongoing issues regarding base failures, investigations, and repairs.
+
This page is intended to be a reference for past and ongoing issues regarding base failures, investigations, and repairs. Dates are provided on some bullet points for extra clarity.
  
  
Line 13: Line 13:
  
 
===Monitoring Setup Issues===
 
===Monitoring Setup Issues===
* A set of interconnected issues: electrical arcing occurs inside some bases when their back plates are touching. Jon has also received an electrical shock from touching back plates. When the back plates are not touching, no problems occur (arcing happens on the order of seconds when touching; no problems were seen for over 12 hours when not touching).
+
* A set of interconnected issues: electrical arcing occurs inside some bases when their back plates are touching. Jon has also received an electrical shock from touching back plates. When the back plates are not touching, no problems occur (arcing happens on the order of seconds when touching; no problems were seen for over 12 hours when not touching). (4/8/16)
 +
** It was discovered that the source of the problem was one or two bases building up a charge on back plates. See https://logbooks.jlab.org/entry/3396569 for more information (4/8/16)
 +
 
 +
===Bootloader Issues===
 +
* There were a number of issues when attempting to reprogram bases via the bootloader. No issues have been observed while reprogramming via SWIM cables. The bootloader scheme of reprogramming bases was devised by Dan Bennett, with some advice from one of the seller companies.
 +
* Sketch of bootloader checking procedure: the normal operating firmware is restricted to a certain address range. Separate firmware exists for bootloader operation, but it can only be altered by reprogramming via SWIM cables.
 +
** The firmware is sent via CAN messages. Each message contains the data, the address to reprogram, and a checksum of the rest of the message. The base sums the message locally and sends a response byte to indicate whether the local and server checksums match.
 +
* In practice, the checksum scheme is insufficient: a large number of issues occur. In some cases, one base can corrupt ALL other bases on a strand. Studies done last summer determined that every base eventually fails the upload procedure if it is performed enough times, and that such a failure can lead to corruption of other bases.
 +
* Jon estimates two weeks of labor to upload new firmware to the FCAL.

Revision as of 09:42, 11 April 2016

This page is intended to be a reference for past and ongoing issues regarding base failures, investigations, and repairs. Dates are provided on some bullet points for extra clarity.


HV Issues

  • Bases continue to experience HV failure at a rate of ~1 new base per day.
  • High temperature (in the ballpark of 35-40 C) is correlated with base HV failure. The direction of cause and effect is still not entirely clear at this point.
    • ONGOING: A setup in CEBAF can currently heat up to 12 bases externally to ~35 C, but it can only run during daytime hours. Over three eight-hour periods, no HV failures were seen among 12 bases recently removed from the FCAL. Jon intends to investigate for a few days at 40 C before drawing conclusions.


Communications Issues

  • From February through April 8, 2016, nine bases had severe communications issues. These bases were addressed both individually and together with all other bases on their strand, and gave no response for a matter of days. Power cycling via software controls did not help. However, once removed from the FCAL these bases appear to respond just fine (no long-term monitoring has been performed yet).
    • TO DO: Check to see if these bases can still receive messages. This is most easily done by seeing if the base in question can receive an LED ON command.

Monitoring Setup Issues

  • A set of interconnected issues: electrical arcing occurs inside some bases when their back plates are touching. Jon has also received an electrical shock from touching back plates. When the back plates are not touching, no problems occur (arcing happens on the order of seconds when touching; no problems were seen for over 12 hours when not touching). (4/8/16)
    • It was discovered that the source of the problem was one or two bases building up a charge on back plates. See https://logbooks.jlab.org/entry/3396569 for more information (4/8/16)

Bootloader Issues

  • There were a number of issues when attempting to reprogram bases via the bootloader. No issues have been observed while reprogramming via SWIM cables. The bootloader scheme of reprogramming bases was devised by Dan Bennett, with some advice from one of the seller companies.
  • Sketch of bootloader checking procedure: the normal operating firmware is restricted to a certain address range. Separate firmware exists for bootloader operation, but it can only be altered by reprogramming via SWIM cables.
    • The firmware is sent via CAN messages. Each message contains the data, the address to reprogram, and a checksum of the rest of the message. The base sums the message locally and sends a response byte to indicate whether the local and server checksums match.
  • In practice, the checksum scheme is insufficient: a large number of issues occur. In some cases, one base can corrupt ALL other bases on a strand. Studies done last summer determined that every base eventually fails the upload procedure if it is performed enough times, and that such a failure can lead to corruption of other bases.
  • Jon estimates two weeks of labor to upload new firmware to the FCAL.
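
The checksum exchange sketched in the bootloader notes above can be illustrated with a short Python model. This is only a sketch under stated assumptions: the frame layout (2-byte address, 4 data bytes, 1 checksum byte), the 8-bit additive checksum, and the ACK/NACK response values are all hypothetical, since the actual message format and checksum algorithm are not given on this page. It also shows one general weakness of additive checksums (reordered bytes go undetected), which may or may not be related to the failures actually observed.

```python
# Hypothetical frame layout: [addr_hi, addr_lo, data..., checksum].
# The real field order, sizes, and checksum algorithm are not documented here.

ACK = 0x01   # assumed response byte: local and server checksums match
NACK = 0x00  # assumed response byte: checksums differ


def additive_checksum(payload: bytes) -> int:
    """Sum of all payload bytes, truncated to 8 bits."""
    return sum(payload) & 0xFF


def build_message(address: int, data: bytes) -> bytes:
    """Server side: pack address and data, then append the checksum."""
    body = address.to_bytes(2, "big") + data
    return body + bytes([additive_checksum(body)])


def base_response(message: bytes) -> int:
    """Base side: recompute the checksum locally and compare."""
    body, received = message[:-1], message[-1]
    return ACK if additive_checksum(body) == received else NACK


# An intact message is acknowledged.
msg = build_message(0x8100, bytes([0x12, 0x34, 0x56, 0x78]))
print("intact :", base_response(msg))

# A single flipped bit in the data changes the sum and is rejected.
flipped = bytearray(msg)
flipped[2] ^= 0x01
print("flipped:", base_response(bytes(flipped)))

# Two swapped data bytes leave the sum unchanged, so the corrupted
# message is still acknowledged.
swapped = bytearray(msg)
swapped[2], swapped[3] = swapped[3], swapped[2]
print("swapped:", base_response(bytes(swapped)))
```

If the real scheme is a plain sum like this one, a stronger check such as a CRC over each message would catch reordered bytes as well; whether that would address the strand-wide corruption described above is not established here.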