2004 CDF E-Log -- Day shift. Thu Feb 12, 2004
SciCo: Camille Ginsburg
DAQ Ace: Ian Vollrath
Monitoring Ace: Andrew Ivanov
CO: Martin Griffiths
Operations Manager: Mary Convery


Start of Shift Notes:  

Taking data.  B0Lum 35E30, stack 0.

Thu Feb 12 08:51:46
halted run earlier in this shift due to p loss spike. cmx, cmp tripped. see previous shift elog
for plot of spike. 
 - ian :: (run 179055)
Thu Feb 12 09:08:01
 - 8:00-9:00 status plots - Andrew
Thu Feb 12 09:09:33 L1_BMU10_BSUR_TSUO_&_CLC_PS1 is almost 50% above expected - presumably due to spike in BMU west, which now extends over 11 channels in Layer 0 and Layer 3. - Martin
-- Thu Feb 12 09:23:20 comment by...Martin --  The noisy channels are:

Layer 0 and Layer 3: 132-136, 138-143
Layer 1 and Layer 2: 138-143
-- Thu Feb 12 10:10:26 comment by...Camille Ginsburg --  Channels BMU-W-[132:136] are on ASD-16-J1; channels BMU-W-[138:143] are on ASD-16-J3. It seems the ASDs and/or crate slot 16 are picking up the noise.
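-- Editorial sketch, not part of the original entries --  The channel ranges quoted above can be cross-checked against the "11 channels" figure for Layer 0 and Layer 3 with a few lines of Python. Only the channel ranges and ASD connector labels come from the entries above; the dictionary layout and all names are assumptions made for illustration and do not correspond to any CDF tool.

```python
# Noisy BMU west channel ranges per ASD connector, as quoted in the log
# for Layer 0 and Layer 3. The dict layout and names are illustrative
# assumptions, not part of any CDF software.
NOISY_RANGES = {
    "ASD-16-J1": (132, 136),
    "ASD-16-J3": (138, 143),
}


def expand(first, last):
    """Return the inclusive list of channel numbers from first to last."""
    return list(range(first, last + 1))


# Map each ASD connector to its explicit list of noisy channels.
noisy_channels = {asd: expand(*rng) for asd, rng in NOISY_RANGES.items()}

# Total number of noisy channels across both connectors.
total_noisy = sum(len(chs) for chs in noisy_channels.values())
print(total_noisy)
```

The two ranges give 5 and 6 channels respectively, consistent with the 11 channels reported at 09:09:33.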
Thu Feb 12 09:37:29 MCR is apparently doing something, since our abort gap losses (B0PAGC) have reduced from 13.5 kHz to 11 kHz in the last 5 minutes. - Camille Ginsburg
-- Thu Feb 12 11:09:47 comment by...ronmoore --  Nope...it looks like you can thank a D0 pot again....
Thu Feb 12 10:03:25
got error: 

Host b0eb12.fnal.gov, task tRec_0 
SCPU-P1-E-TracerEventId: Event 2573753, crate 0, channel 3 has either bad Tracer ID or bad markers
around Tracer word 


Attention!!!. SCPU_TRACER_EVENT_ID Error !!! 
 Hardware EVB has detected a problem with data quality in 
 SCPU b0eb12 ( FER crate : NOT_AVAILABLE). 


second time this has happened. in both cases hrr worked. 
 - ian :: (run 179055)
-- Thu Feb 12 14:59:07 comment by...ian --  
same error once more

-- Thu Feb 12 15:15:27 comment by...ian --  
again:

b0l3pcom1.fnal.gov:main:3:14:07 PM->Host b0eb18.fnal.gov, task tRec_0
SCPU-P1-E-VrbHeader: Dump of header words for event 6591439 from VRB in slot 20:

Thu Feb 12 10:12:29
 - 9:00 -10:00 - status plots - Andrew
Thu Feb 12 10:22:15
There was a PSM alarm on 1RR12G_2 on Tuesday evening. See original entry. According to VOLTMAN, channel 2 (-6V) fluctuated out of tolerance, hence the alarm. Dervin told me that the ace on shift checked the front panel of the power supply, which read a proper value. My suspicion is that this readout was for channel 1 (+6V), which was fine at the time. However, I'm not ruling out a communication failure between the power supply and VOLTMAN. I'm leaving the power supply masked, but am emailing Dervin and Roberto.
 - Dan Ryan
-- Thu Feb 12 10:29:49 comment by...dan --  Okay, I take back what I said about the ace reading channel 1 on the front panel. However, VOLTMAN is currently reflecting the voltage on the front panel, so it is possible that by the time the ace read the front panel, the voltage had jumped back to a value within tolerance...
-- Thu Feb 12 10:52:13 comment by...dan --  Okay, I take back what I said about VOLTMAN reflecting the front panel. The noise is in the data line and not the actual voltage. Investigating where the noise is coming from...
Thu Feb 12 10:44:47
twice got the error: 

TIMEOUT: Mod done not set for slot(s) 14 (Clist)  Raw: (0x0007f7ff, cnt 4) XTRP: (BC = 130, Buffer no = 1) SVT: (BC = 130, Buffer no = 1) 


hrr worked as expected.
 - ian :: (run 179055)
-- Thu Feb 12 11:38:39 comment by...ian --  
again but now for slot 12

-- Thu Feb 12 13:36:23 comment by...ian --  
slot 12 again

Thu Feb 12 11:05:36
 - 10:00-11:00 status plots - Andrew
Thu Feb 12 11:14:51 Still not stacking pbars. Message on TV says "Stacking on hold for A:QDF and D:IKIK repairs." - Camille Ginsburg
Thu Feb 12 11:31:55
got error: 

Host b0eb14.fnal.gov, task tRec_0 
SCPU-P1-E-VrbHeader: Dump of header words for event 3908032 from VRB in slot 10: 
0x00000000 0x00002a40 0x002f00ef 0x08ec02f0 0x02f00580 0x06e80548 0x037c0000 0x03100310 

hrr worked. 
 - ian :: (run 179055)
Thu Feb 12 11:39:03 We verified that the L1A to CDF Clock timing has not changed. The studies had no adverse effect on data taking.  - SCN, RGF
Thu Feb 12 11:50:22
over the past few hours 3 L3 filter states have died: nodes 097, 123, 225. 

 
 - ian :: (run 179055)
-- Thu Feb 12 13:57:40 comment by...ian --  
2 more: 050, 078

Thu Feb 12 12:05:51
 - 11:00 -12:00 status plots - Andrew
-- Thu Feb 12 12:06:35 comment by...Andrew --  
Abort gap losses are gradually rising

Thu Feb 12 12:14:41
12:13:13 PM->Silicon Timeout:BUSY- Slots:  08:fa00 10:fa20 12:fa40 16:f800 18:f820 20:f840 

hrr worked.
 - ian :: (run 179055)
Thu Feb 12 12:16:06 Back to stacking pbars. Stack 3 mA, stackrate 10 mA/h. - Camille Ginsburg
Thu Feb 12 12:45:00 I informed MCR (Dave, but I don't know which one) that our abort gap losses (B0PAGC) are above the 15 kHz Silicon WARNING limit, and that we would turn the Silicon to standby if the losses got up to 20 kHz. I also asked him, iff they decided to use collimators to address this problem (and I'm not telling him his business), to please let us know beforehand so we can turn the pesky chambers and silicon to standby first. - Camille Ginsburg
-- Thu Feb 12 12:52:11 comment by...Camille Ginsburg --  It was Dave Sutherland.
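-- Editorial sketch, not part of the original entries --  The abort gap policy described above (Silicon WARNING above 15 kHz, Silicon to standby if losses reach 20 kHz) can be written as a small decision function. The two thresholds are taken from the entry; the function name, the state labels, and the exact handling of the boundary values are assumptions made for illustration, not actual shift procedure or software.

```python
# Thresholds quoted in the 12:45:00 entry, in kHz.
SILICON_WARNING_KHZ = 15.0   # "15 kHz Silicon WARNING limit"
SILICON_STANDBY_KHZ = 20.0   # "turn the Silicon to standby if the losses got up to 20 kHz"


def abort_gap_action(b0pagc_khz):
    """Classify a B0PAGC abort gap loss rate (kHz).

    Hypothetical helper: the labels "ok", "warning" and "standby" and the
    treatment of the exact boundary values are illustrative assumptions.
    """
    if b0pagc_khz >= SILICON_STANDBY_KHZ:
        return "standby"
    if b0pagc_khz > SILICON_WARNING_KHZ:
        return "warning"
    return "ok"


# Loss rates mentioned elsewhere in this shift's log, plus the standby limit.
for rate in (10.0, 11.0, 13.5, 16.0, 20.0):
    print(rate, abort_gap_action(rate))
```

With these thresholds, the 16 kHz reading before the collimator move falls in the warning band and the 10 kHz reading afterwards is back in the normal range.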
Thu Feb 12 12:55:31 MCR (Dave S.) says they'll move the E03 vertical collimator to reduce our abort gap losses, so we halt the run and turn chamber+silicon HV's to standby. - Camille Ginsburg
Thu Feb 12 13:04:11 MCR (Dave S.) says they're done with the collimator movement. We turn everything back on. - Camille Ginsburg
-- Thu Feb 12 13:15:13 comment by...Andrew --  
Got CMP trip in South Wall, when turning all voltages back on.
Recovered.

Thu Feb 12 13:14:19
Abort gap losses plot during the collimator movement.
 - Andrew
-- Thu Feb 12 13:37:38 comment by...Camille Ginsburg --  Reduction in B0PAGC from 16 kHz to 10 kHz.
Thu Feb 12 14:05:56
 - 12:00 - 14:00 status plots - Andrew
Thu Feb 12 14:34:26
Got a trigger inhibit and further PHA and PEM trips. 
The PHA and PEM iFix pages showed a red "diagnostic" button, but 
clicking on it did nothing.  
It turned out that the Plug HV Monitoring software running 
on the computer on the 3rd floor had crashed.  
Willis restarted it, which fixed the problem. 
 - Andrew
-- Thu Feb 12 14:58:02 comment by...Willis --  The plug trigger inhibit occurred because a large fraction of channels in CAEN crate 4 had HV readbacks that were about 10% lower than nominal; the reason is not known. During the plug HV recovery procedures, the plug HV monitoring program crashed.
Thu Feb 12 15:03:25
 - 14:00-15:00 status plots - Andrew
Thu Feb 12 15:05:18 From the Run Coordinator e-log:
There are cryostat vacuum problems at B3 and C4. It is holding for now. We will need to replace the roughing pumps/stations at both of these locations before the next store. - DAJ
Dan Johnson, acting Run Coordinator - Camille Ginsburg
Thu Feb 12 15:27:10
(MLE) b0l3pcom1.fnal.gov:main:3:22:25 PM->Host b0eb16.fnal.gov, task tRec_0 
SCPU-P1-E-VrbReadFailed: Error reading VRB in slot 10 for event 6697334. 
The VRB had no event. 
first got si timeout: 

(MLE) b0l3pcom1.fnal.gov:main:3:22:26 PM->Host b0eb16.fnal.gov, task tRec_0 
SCPU-P1-E-VrbReadFailed: Error reading VRB in slot 10 for event 6697335. 
The VRB had no event. 
(MLE) b0tsi02:Messenger:3:22:30 PM->DONE timeout 

then pcal05 done timeout: 

(MLE) b0dap73.fnal.gov:Thread-110125:3:22:31 PM->Requested Halt-Recover-Run issued [errmon] 
(MLE) b0dap73.fnal.gov:Thread-110125:3:22:31 PM->Done Timeout: PCAL_05 

successfully shepherded pcal05. 
 - ian :: (run 179055)
Thu Feb 12 15:37:35
notice that b0clc00, b0fcal00, and now (after shepherding) b0pcal05 become red intermittently
on vxworks with the U (Update) error condition. don't see any other errors corresponding to this. i
believe b0fcal00 was shepherded during the owl shift last night ... what is causing this ...? 


 - ian :: (run 179055)
Thu Feb 12 15:39:13
3:38:05 PM->Busy Timeout: VRB_ISL_06 

hrr worked.
 - ian :: (run 179055)
Thu Feb 12 15:41:22
3:40:15 PM->Silicon Timeout:BUSY- Slots:  08:fa00 10:fa20 12:fa40 16:f800 18:f820 20:f840 

hrr worked.
 - ian :: (run 179055)
Thu Feb 12 15:55:36
Field              Run 179055 x2BB6F           Totals
Data Type          BEAM
Physics Table      PHYSICS_2_01 [4,416,424]
Begin Time         07:11:20
End Time                                       15:55:02
Live Time          07:40:45                    07:40:45
L1 Accepts         410,368,603                 410,368,603
L2 Accepts         7,080,883                   7,080,883
L3 Accepts         1,570,771                   1,570,771
Live Lumi, nb-1    824.389                     824.389
GR                 1
SC
RC
 - End of Shift Report
Thu Feb 12 15:58:32
 - 15:00-15:55 status plots - Andrew
Thu Feb 12 16:09:09 Shift Summary:
Still taking data smoothly with store 3228. 
B0lum 23E30, pbar stack 40 mA.  Plan to continue 
until at least 6am or 7am tomorrow morning 
at which time there will be 4 hours of TeV studies  
or an access to fix two TeV roughing pumps or both, 
then another store. 

Had trouble with the PEM and PHA HV, otherwise no  
significant CDF problems.

End of Shift Numbers
CDF Run II

Runs                   179055
Delivered Luminosity   821.4 nb-1  
Acquired Luminosity    720.6 nb-1  
Efficiency             87.7%

 - Camille Ginsburg
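
-- Editorial sketch, not part of the original entries --  The efficiency in the End of Shift Numbers is consistent with the ratio of acquired to delivered luminosity. The snippet below reproduces that arithmetic; the two luminosity values come from the summary above, while the variable names and the assumption that efficiency is defined as acquired/delivered are illustrative.

```python
# End-of-shift luminosities quoted in the summary, in nb^-1.
delivered_nb = 821.4
acquired_nb = 720.6

# Assumed definition: efficiency = acquired / delivered, as a percentage.
efficiency_pct = 100.0 * acquired_nb / delivered_nb
print(f"{efficiency_pct:.1f}%")
```

Rounded to one decimal place, this matches the 87.7% quoted in the summary.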