2004 CDF E-Log -- Eve shift. Tue Feb 24, 2004
SciCo          DAQ Ace      Monitoring Ace  CO           (Operations Manager)
Robert Harris  Simon Sabiq  Chris Marino    Josh Tuttle  JJ Schmidt


Start of Shift Notes:  

Inherit store 3254 with initial luminosity of 17E30 and an  
antiproton stack of 92 mA.  Plan is to take data with silicon 
off and COT on.  At end of store we are supposed to call 
Julia Thom (218-8940) and David Clark (722-8740) to swap VRB. 
Beate suggests that if the store ends the BMU people may also 
want to investigate the sector that has tripped and for which 
we have turned off the HV (and masked off the BMU inhibit in 
the trigger).

Tue Feb 24 16:30:57 We are getting a few high deadtime messages. But everything looks fine with the triggers. - Simon :: (run 179314)
-- Tue Feb 24 16:53:59 comment by...Simon --  
This is what it says:

(MLE) b0dap74.fnal.gov:resmgr.ReplyMonitor 6:4:20:55 PM->[Bad code] Arguments:  Dead Time is High 

(MLE) b0dap74.fnal.gov:resmgr.ReplyMonitor 6:4:29:32 PM->[Bad code] Arguments:  Dead Time is High  

-- Tue Feb 24 17:02:51 comment by...Donatella T. --  as I mentioned here (sat. eve. shift), probably restarting the ScalerMonitor program running on b0dap74 will get rid of these stale messages. You may want to check with the current DAQ person on-call.
Tue Feb 24 17:04:02 restarted daqmon (ScalerMonitor) on dap74 to try to fix spurious deadtime popups with no meaningful messages, which have been happening occasionally since the weekend. Please make a note in the elog if this happens again. - Jane & Bill
Tue Feb 24 17:07:52 ICICLE crashed. Restarted it. - Simon
Tue Feb 24 17:14:01
 - Christopher
-- Tue Feb 24 17:16:06 comment by...Christopher --  Abort Gap losses for this store(left) and last store(right)
TevMon Error has occurred since B0PAGC crossed 20 kHz
-- Tue Feb 24 17:54:13 comment by...jj --  
Store 3254 was 22 hours long at time of the left plot.
Store 3252 was 32 hours long (right plot).

Tue Feb 24 17:27:19
 - Christopher
-- Tue Feb 24 17:28:15 comment by...Christopher --  Abort Gap losses for stores 3249 and 3245
-- Tue Feb 24 17:55:32 comment by...jj --  
Store 3249 was 28 hours long (left plot).
Store 3245 was 35 hours long (right plot).

Tue Feb 24 17:34:20
 - Christopher
-- Tue Feb 24 17:34:42 comment by...Christopher --  Hourly Plots
Tue Feb 24 18:08:21
 - Christopher
Tue Feb 24 18:34:56
The plan from Accelerator Run Coordinator Jim Morgan. 

16:35:51-  

The Plan:  
End of store studies (synch light) begin at 0600, terminate the  
store at 0800. B-1 wet engine work can take place while at 150  
GeV in shot setup. If the store falls out early, recover and do  
the B-1 wet engine work, then go into shot setup. Only 1 SY120   
event tonight, no other studies cycles. Shot Strategy: Protons  
265-275E9 per bunch at 150 GeV, pbars as per guidelines. 

 - JPM  
 - jj
Tue Feb 24 18:37:41 Phil Schlabach is going to attempt to fix the problem with the BMU. He will bring up the voltage on the bad sector, and will have removed the trigger inhibit. During this time we are to ignore any trips that show up on the HV global alarms page. He informs us that if the run stops due to his work, it means he messed up with the trigger inhibit, and we should inform him. He will be working on the BMU problem on the first floor.  - Robert Harris :: (run 179314)
Tue Feb 24 18:47:49

silicon revival plan

  • we need 2-3 hours to cool down the silicon, if this has to happen before the next store. Page Si Cooling at 218-8626 and give 1 hour heads up.
  • for si maintenance, please page si main pager and SPL 1 hour before end of store.  - Rainer
    Tue Feb 24 18:54:09 Several BMU alarms later Phil Schlabach completes his BMU work. He is unable to fix the HV trips and concludes that an access is required to fix the BMU problem. - Robert Harris :: (run 179314)
    Tue Feb 24 18:56:25
    The proposed CDF plan (consultation between Rob
    Roser and JJ): 
    
    
    1) Request one hour access at end of this store for Roser and 
       tech to inspect COT plumbing in collision hall to understand 
       modifications needed to reverse COT gas flow. 
    
    Option A: 
    
    * Roser decides materials and people can be assembled to 
      reverse COT gas flow after the next store. 
    * Next store: continue running with Silicon off and COT 
      fully ON with nitrogen gas mix. 
    * End of next store: Request access (4 hours?) to put in 
      plumbing and reverse COT gas flow. 
    * Following store: Run with Silicon ON, COT fully ON, 
      gas flow reversed, and no nitrogen in mix. 
    * Run in this mode until March 15 shutdown. 
    
    Option B: 
    
    * Roser decides COT gas flow can NOT be reversed after 
      the next store (or two at most). 
    * Run for 4 stores with Silicon ON, COT fully ON with 
      nitrogen gas mix. 
    * Run rest of stores until March 15 shutdown with 
      Silicon ON but COT in "compromised state" with 
      SL1-2 off, SL3-4 reduced gain and no nitrogen 
      in gas mix.
     - JJ
    Tue Feb 24 18:58:06 ICICLE heartbeat not updated for 10 mins. Restarted. - Simon
    Tue Feb 24 19:06:48 The anode on BMU 72E trips at currents in the 4000-5000 range. The cathode also tripped once with the anode unplugged. I swapped 72 and 84 again. The problem follows the chamber. I conclude that an access is required (a short one in which the offending stack could be isolated and bypassed, followed by a longer one in which it would be fixed or just a long one). I restored everything except the cable swap, so the channel now off is masquerading as 84E. - ps
    Tue Feb 24 19:07:13
     - Christopher
    -- Tue Feb 24 19:07:44 comment by...Christopher --  Hourly Plots
    Tue Feb 24 19:11:57
    Abort Gaps back below 18 kHz
     - Christopher
    -- Tue Feb 24 19:19:40 comment by...JJ --  
    With Silicon out, I am handling abort gap losses on a 
    "situational" basis. That is I expect people on shift 
    to consider the trends and compare to previous stores
    and decide if "beam quality" issues require a consultation
    with MCR. When in doubt with Silicon out, call OPS.

    Tue Feb 24 19:17:23
    I will call MCR (and run coordinator if necessary) to request 
    an access at end of this store for 90 minutes to: 
    
    * inspect COT plumbing to understand what is necessary 
      to reverse COT gas flow 
    * bypass problem with BMU stack that is causing  
      runs to be marked bad for BMU 
    * work on FCAL00 and FCAL01 (optional)
     - JJ
    Tue Feb 24 19:42:54 At JJ's request I page Koji Terashi and ask whether he is willing to do an access for the miniplug at 8 a.m. Wednesday. Koji is willing, and I indicate we will let Koji know when we have a confirmation from MCR, and Koji will contact Mary Convery to do the access with him. - Robert Harris
    Tue Feb 24 19:51:11
    Salah (MCR crew chief) called to confirm a GO for access at 
    end of store. 
    
    Parameters if store terminates normally: 
    
    * End of store studies (synch light) begin at 0600 
    * Store terminates at 0800 
    * CDF and D0 are granted access up to 90 minutes 
      starting at 0815 to 0830.  
    * CDF sends in: 
      - Rob Roser and Tech (COT plumbing inspection) 
      - Dan Cyr and Phil Schlabach BMU work 
        (need to confirm with Dan) 
      - Koji Terashi and Mary Convery (FCAL00-01 work) 
    
    If store terminates early: 
    
    * call Roser to come in 
    * confirm with MCR how long an access we 
      can have 
    * if access time is long enough to get  
      Dan and Phil here plus 90 minutes, call 
      Dan and Phil 
    * Do not call Koji/Mary 
     - JJ
    Tue Feb 24 20:07:58 I confirm with Koji that we will have an access, Koji contacts Mary Convery, and she can go.  - Robert Harris
    Tue Feb 24 20:08:59
    You may wish to note that the top portion of the single-run  
    RunSummary page has been reformatted in an effort to make  
    the run information more readable.   I told you not to parse  
    these pages to extract run information, so you should be safe. 
    
    Also added a summary link page to Charles Plager's ConsumerPlots  
    pages.  Click on "ConsumerPlots" near the top of the RunSummary  
    page. 
    
    For an example, click on the run number immediately 
    below this entry.  If you have any comments or suggestions,  
    feel free to send me e-mail.  (Please do not page the  
    DAQ pager about this subject) 
    
     - W.Badgett :: (run 179314)
    -- Tue Feb 24 21:23:43 comment by...ps --  a big wow for consumer plots, no need to come to work now.
    Tue Feb 24 20:21:48 Run 179314 Terminated at 2004.02.24 20:11:56 - RunControl
    Tue Feb 24 20:21:49 Run 179314 TERMINATE: to clean up event builder to try to fix busy dead time. - Simon x2080
    -- Tue Feb 24 21:48:29 comment by...Simon --  HRR failed to fix the busy dead time.
    Tue Feb 24 20:21:51 Run 179323 Activated at 2004.02.24 20:20:20 - RunControl
    Tue Feb 24 20:26:32 Run 179323 Terminated at 2004.02.24 20:25:21 - RunControl
    Tue Feb 24 20:26:33 Run 179323 TERMINATE: timeout - Simon x2080
    Tue Feb 24 20:28:03
     - Christopher
    -- Tue Feb 24 20:28:21 comment by...Christopher --  Hourly Plots
    Tue Feb 24 20:34:16 Icicle crashed again; restarted it - Christopher
    Tue Feb 24 20:35:07
    L1 Cal EM hot triggers - source of high deadtime?
     - josh
    Tue Feb 24 20:38:41 CMU trips NE channels 18-23. Recovered. - Christopher
    Tue Feb 24 20:46:13
    DCAS-ADMEM plot - requested by expert
     - josh
    Tue Feb 24 20:58:56 Run 179324 Activated at 2004.02.24 20:58:52 - RunControl
    Tue Feb 24 20:59:40 Run 179324 ACTIVATE: AAA_CURRENT PHYSICS_2_02[2,424,421] - Simon x2080
    -- Tue Feb 24 21:24:37 comment by...Simon --  without silicon crates
    Tue Feb 24 21:20:45 Run 179324 Terminated at 2004.02.24 21:20:38 - RunControl
    Tue Feb 24 21:21:06 Run 179324 TERMINATE: ~100% dead time still - Simon x2080
    Tue Feb 24 21:26:45 Run 179325 Activated at 2004.02.24 21:26:22 - RunControl
    Tue Feb 24 21:27:47 Run 179325 ACTIVATE: AAA_NOSILICON PHYSICS_2_02 [2,424,431] - Simon x2080
    -- Tue Feb 24 21:29:13 comment by...Simon --  without PCAL_07 card ADMEM_156 in slot20
    Tue Feb 24 21:28:27 Removed PCAL_07 ADMEM_156 in slot20. Dead time back to normal. - Simon :: (run 179325)
    -- Tue Feb 24 21:40:14 comment by...Simon --  Prior to removing PCAL_07 ADMEM_156 in slot20, we tried reset, reboot and shepherd in RC, we tried to reboot in VxWorks, and we issued a bus reset in VISIONdemo. None of it kept the L2 rate from being too high for the event builder system. The dead time was above 99%.
    Tue Feb 24 21:29:15
     - Christopher
    -- Tue Feb 24 21:30:51 comment by...Christopher --  Hourly Plots
    Tue Feb 24 21:59:20 Run 179325 Terminated at 2004.02.24 21:58:59 - RunControl
    Tue Feb 24 21:59:59 Run 179325 TERMINATE: Expert requests whole B0pcal07 crate to see what is the problem with it. Removing it. - Simon x2080
    Tue Feb 24 21:59:59 Data taking is halted for 1 1/4 hours due to a 99% deadtime problem. It is eventually traced to a faulty ADMEM card in slot 20 in b0pcal07 for wedge 6 of the east plug EM calorimeter. The card is removed from the run and we resume data taking. Details of how we got there follow.

    Problem with 99% deadtime and a higher than normal L2 accept rate: L2 trigger rate of 425 Hz. Tried to clean up the event builder and restart the run. Still 99% deadtime and L2 trigger rate of 436 Hz. We page the L2 pager, and Charles Plager replies and looks into the problem. Independently we notice that the L2 rate is dominated by L2 EM triggers, and that the L1 EM trigger is occurring at the bunch crossing rate. Charles concludes the same and recommends we page the L1 cal trigger people.

    Carla Grosso-Pilcher replies and says it is one ADMEM in the east plug, with four channels that fire all the time. She recommends we reboot the crate, PCAL07, after paging the calorimeter people to consult. Monitoring Ace looked at the particular ADMEM and says it is bad in 1 out of 2 events with errors. Paged plug calorimeter people. Willis Sakumoto replies and says to reboot the crate and page electronics people if that doesn't work.

    We reboot PCAL07 via Run Control and the problem remains. We reboot PCAL07 in VxWorks and that doesn't work. Tried to reset the bus in VISIONdemo and that doesn't work. Paged the calorimeter electronics long range pager for the first time and got no response. Willis says it is Mark Mattson or Vivek Tiwari who are on the pager. Tried to call Mark and Vivek in their offices and they are not in. JJ suggests we kick it back up to the Cal SPL, so we page plug cal again and get Willis. We suggest to Willis that we just remove b0pcal07 and restart the run, but Willis says we should isolate the ADMEM card that has the problem and restart run control. CO determines that it is slot 20. We remove the card, admem_156, and restart run control. We are taking data again successfully!

    Willis calls back again and says it appears to be a single trigger tower in the EM (east, phi6, eta19), so we should get Carla to mask off that particular trigger tower so we don't lose the whole crate's worth of data. We page Carla, who says that more than one tower is bad. She thinks that the entire card has a problem and calls Willis to discuss it. Willis calls us back and says that Carla had access to more information and he agrees with her assessment that we should just run without the card.  - Robert Harris
    Tue Feb 24 22:01:45 I've returned the cal ADMEM pager call. I've asked for pcal07 to be removed from the run, so I'll have a chance to run diagnostics. If the problem is electronics, then I'll want to make an access tomorrow. If it is software, then it may be possible to fix this now. - m mattson
    Tue Feb 24 22:04:41 Run 179326 Activated at 2004.02.24 22:04:26 - RunControl
    Tue Feb 24 22:05:09 Run 179326 ACTIVATE: AAA_NOSILICON PHYSICS_2_02[2,424,431]  - Simon x2080
    -- Tue Feb 24 22:06:13 comment by...Simon --  without b0pcal07
    Tue Feb 24 22:06:05 Mark Mattson calls and I explain the situation with the bad admem card in slot 20 of b0pcal07 and what we did. I mention to him that we are taking an access tomorrow morning and want to know whether he needs to go in to fix the card. Mark says that it is impossible to tell unless he can have the crate for 15 minutes to test whether there is indeed a problem. I agree, and we stop the run and restart again without b0pcal07, while Mark conducts his test on that crate. - Robert Harris
    Tue Feb 24 22:15:54 Run 179326 Terminated at 2004.02.24 22:15:36 - RunControl
    Tue Feb 24 22:16:05 Run 179326 TERMINATE: To put back b0pcal07. - Simon x2080
    Tue Feb 24 22:16:30
     - Christopher
    -- Tue Feb 24 22:17:36 comment by...Christopher --  Hourly plots
    Tue Feb 24 22:17:53 Mark Mattson calls back and says the problem is a blown fuse in the Admem card in slot 20. He says we can put back the b0pcal07 crate in the running with the card taken out. I ask him if he is willing to go in tomorrow morning on an access to fix the card, and he agrees. - Robert Harris
    -- Tue Feb 24 22:25:30 comment by...m mattson --  Slot 20 has symptoms consistent with a blown -15V fuse. I've talked with JJ, ops manager. He is not sure at this time if there will be enough time for the repair (which requires opening up the NE toroid steel). I've confirmed that I will be in early, in case there is an opportunity.
    -- Tue Feb 24 22:31:03 comment by...JJ --  
    Talked to Roser and we will attempt to do this repair as well
    in the morning.

    Tue Feb 24 22:19:45 Run 179327 Activated at 2004.02.24 22:19:01 - RunControl
    Tue Feb 24 22:24:08 Run 179327 ACTIVATE: AAA_NOSILICON PHYSICS_2_02[2,424,431] - Simon x2080
    -- Tue Feb 24 22:24:35 comment by...Simon --  without b0pcal07
    Tue Feb 24 22:28:50 Run 179314 RUNSTATUS:
    Marked Bad, explanation:
    Silicon is off, one sector of the BMU is off, and the last 10 minutes of this 20 hour
    long run are contaminated for the PCAL by a bad ADMEM that caused the east plug EM to trigger 
    every bunch crossing.
    PCAL 
    IMU 
    SVX 
    ISL 
    L00 
    
     - cdfscico
    Tue Feb 24 22:33:02 Run 179325 RUNSTATUS:
    Marked Bad, explanation:
    Silicon is off, one sector of the BMU is off, and one wedge of the East PEM has been 
    removed from the readout due to a bad Admem card.
    PCAL 
    IMU 
    SVX 
    ISL 
    L00 
    
     - cdfscico
    Tue Feb 24 22:35:44
    I am calling MCR and alerting crew chief (Salah) that we have 
    added another task to access list and that we will definitely 
    need full 90 minutes (if not a little more). 
    
     JJ
     - jj
    Tue Feb 24 22:51:55 Run 179327 Terminated at 2004.02.24 22:51:31 - RunControl
    Tue Feb 24 22:52:12 Run 179327 TERMINATE: to put b0pcal07 back again. - Simon x2080
    Tue Feb 24 22:52:32
    UPDATE ON ACCESS REQUEST FOR Wednesday morning: 
    ----------------------------------------------- 
    
    Parameters if store terminates normally: 
    
    * End of store studies (synch light) begin at 0600 
      (CDF goes to "between store HV state" except for 
       CLC during this time) 
    * Store terminates at 0800 
    * CDF and D0 are granted access up to 90 minutes 
      starting at 0815 to 0830. (Note added later - 
      Crew Chief Salah was informed that 90 minutes 
      would be tight now that we added admem repair.) 
    * CDF sends in: 
      - Rob Roser and Tech (COT plumbing inspection) 
      - Dan Cyr and Phil Schlabach BMU work 
      - Koji Terashi and Mary Convery (FCAL00-01 work) 
      - Mark Mattson and tech to pull East toroid and access PCAL07 
        to work on admem in slot 20 
    
    If store terminates early: 
    
    * call OPS (JJ) LRP 314-4862 
    * call Roser (399-2609) to come in 
    * confirm with MCR how long an access we can have 
    * call Mark Mattson ( LRP 266-8338 ) 
    * if access time is long enough to get 
      Dan and Phil here plus 90 minutes, call 
      Dan (715-3704) and Phil (264-0674) 
    * Do not call Koji/Mary 
    
     - JJ
    Tue Feb 24 22:57:22 Run 179328 Activated at 2004.02.24 22:57:10 - RunControl
    Tue Feb 24 22:58:37 Run 179328 ACTIVATE: PHYSICS_2_02[2,424,431] with b0pcal07, without card ADMEM_156 in slot 20 - Simon x2080
    Tue Feb 24 23:00:05 MCR calls and says that the Tevatron will be doing studies beginning at 6 a.m. tomorrow. By about 6:30 a.m. the study will include beam bumps that should generate losses. We need to have our HV down by 6:30 a.m. to avoid trips from the losses. - Robert Harris
    Tue Feb 24 23:09:10 Run 179327 RUNSTATUS:
    Marked Bad, explanation:
    Silicon off, one sector of BMU off, one crate in the PEM east is off.
    PCAL 
    IMU 
    SVX 
    ISL 
    L00 
    
     - cdfscico
    Tue Feb 24 23:55:40
    Run Number     Data Type  Physics Table             Begin Time  End Time  Live Time  L1 Accepts     L2 Accepts  L3 Accepts  Live Lumi, nb-1  GR SC RC
    179314 x2BC72  BEAM       PHYSICS_2_02 [2,424,431]  23:22:04    20:11:56  19:22:38   1,086,031,971  11,605,614  1,925,799   1566.497         0 1 1
    179325 x2BC7D  BEAM       PHYSICS_2_02 [2,424,431]  21:26:22    21:58:59  00:32:26   21,125,046     178,707     37,459      27.153           0 1 1
    179326 x2BC7E  BEAM       PHYSICS_2_02 [2,424,431]  22:04:26    22:15:36  00:11:03   3,882,220      50,907      12,292      9.135            1 0
    179327 x2BC7F  BEAM       PHYSICS_2_02 [2,424,431]  22:19:01    22:51:31  00:32:18   20,324,983     173,690     37,001      26.306           0 1 1
    179328 x2BC80  BEAM       PHYSICS_2_02 [2,424,431]  22:57:10              00:52:25   35,460,226     292,017     59,765      42.046           1
    Totals                                                          23:55:02  21:30:52   1,166,824,446  12,300,935  2,072,316   1671.136
     - End of Shift Report
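A quick cross-check of the Totals row in the run table above, as a minimal sketch. The per-run numbers are copied from the table; the tuple layout and the helper names (`to_seconds`, `to_hms`) are illustrative assumptions and are not part of any CDF tool.

```python
# Per-run values copied from the end-of-shift run table in this log.
# Tuple layout (an assumption made for this sketch):
# (run number, hex run id, live time, L1 accepts, L2 accepts, L3 accepts, live lumi in nb-1)
RUNS = [
    (179314, "x2BC72", "19:22:38", 1_086_031_971, 11_605_614, 1_925_799, 1566.497),
    (179325, "x2BC7D", "00:32:26", 21_125_046, 178_707, 37_459, 27.153),
    (179326, "x2BC7E", "00:11:03", 3_882_220, 50_907, 12_292, 9.135),
    (179327, "x2BC7F", "00:32:18", 20_324_983, 173_690, 37_001, 26.306),
    (179328, "x2BC80", "00:52:25", 35_460_226, 292_017, 59_765, 42.046),
]


def to_seconds(hms):
    """Convert an HH:MM:SS string to a number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return 3600 * h + 60 * m + s


def to_hms(seconds):
    """Convert a number of seconds back to an HH:MM:SS string."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


# The second field of each row is the run number written in hexadecimal.
hex_ok = all(int(hex_id[1:], 16) == run for run, hex_id, *_ in RUNS)

live_total = to_hms(sum(to_seconds(r[2]) for r in RUNS))
l1_total = sum(r[3] for r in RUNS)
l2_total = sum(r[4] for r in RUNS)
l3_total = sum(r[5] for r in RUNS)
lumi_total = round(sum(r[6] for r in RUNS), 3)

print("hex run ids match:", hex_ok)
print("live time:", live_total)
print("L1/L2/L3 accepts:", l1_total, l2_total, l3_total)
print("live lumi (nb-1):", lumi_total)
```

With the values as printed, the accept counts reproduce the Totals row exactly. The live times sum to 21:30:50 and the luminosity to 1671.137, which differ from the logged 21:30:52 and 1671.136 by amounts that would be consistent with the table rounding the per-run values.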
    Tue Feb 24 23:56:02 Shift Summary:
    Busy shift with lots of problems during low luminosity
    running. 
    
    
    - Phil Schlabach attempts to fix the HV problem with a sector of the BMU. 
    - Run 179314 ends due to 99% deadtime from an ADMEM card with a blown fuse, in slot 20 of b0pcal07 for
    wedge 6 of the PEM east. Isolating the problem takes 1 1/4 hours. Along the way we page L2 trigger,
    L1 trigger, Plug Cal and Cal electronics pagers. Finally we begin running again without the card. 
    
    - Run for 30 minutes without the bad card. (Run 179325) 
    - Run for 10 minutes w/o b0pcal07 so that Mark Mattson can test the crate and verify the problem
    with the admem. (Run 179326) 
    
    - Run for 30 minutes w/o b0pcal07 by mistake; wanted to run w/o card. (179327) 
    - Catch our mistake and begin run with b0pcal07 but w/o the bad card. (179328) 
    - MCR notifies us to lower our HV between 6 and 6:30 a.m. tomorrow to avoid trips due to losses from
    studies. 
    
    

    End of Shift Numbers
    CDF Run II

    Runs                   
    Delivered Luminosity   421.7   
    Acquired Luminosity    332.8   
    Efficiency             78.9
    
    
     - Robert Harris
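The efficiency in the End of Shift Numbers appears to be acquired luminosity divided by delivered luminosity, in percent. A minimal check, assuming that definition (the units of the two luminosity values are not stated in this entry, and the variable names are illustrative):

```python
# Values copied from the End of Shift Numbers in this log.
delivered_luminosity = 421.7
acquired_luminosity = 332.8

# Assumed definition: efficiency = acquired / delivered, expressed in percent.
efficiency_percent = 100.0 * acquired_luminosity / delivered_luminosity

print(f"Efficiency: {efficiency_percent:.1f}%")
```

Under that assumption the result rounds to the 78.9 quoted in the summary.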
    Wed Feb 25 00:04:35 PEM PHA PSH tripped; was able to recover by toggling to standby and on a couple of times - Christopher :: (run 179328)
    Wed Feb 25 00:05:28
     - Christopher
    Wed Feb 25 00:06:33
    Date        Time      BLM          Dose
    2004.02.25  00:04:34  W Inner BLM  901.31 RADS
    2004.02.25  00:04:34  W Outer BLM  0.00 RADS
    2004.02.25  00:04:34  E Inner BLM  20.67 RADS
    2004.02.25  00:04:34  E Outer BLM  621.32 RADS
    Integrated dosage - Christopher