2004 CDF E-Log -- Owl shift. Wed Feb 18, 2004
SciCo        DAQ Ace       Monitoring Ace  CO            (Operations Manager)
Weiming Yao  Anadi Capena  Jan Ehlers      Cheng-Ju Lin  Rob Harr


Start of Shift Notes:  

Continue taking data through the shift.
 Lum=19E30 and Stack=126E10 

Wed Feb 18 00:19:21
Busy timeout for crate b0svx06, associated with an error in EVB b0eb22.

From the error log: 
SCPU-P1-E-VrbReadFailed: Error reading VRB in slot 16 for event 1898730.
VRB_BUS_ERROR: Alignment or VME bus error.
Transferred 0 bytes before the error.
SCPU-P1-E-EventIdMismatch: VRB ID for event 1898731 is 0 instead of 11 (slot 16).
 - Anadi :: (run 179156)
Wed Feb 18 00:45:54
As usual, the TOF heartbeat alarm appeared
-> usual procedure 
-> iFix problems 
-> rebooting computer 
-> everything is fine again
 - Anadi & Jan
Wed Feb 18 01:14:41
 - Jan
-- Wed Feb 18 01:15:08 comment by...Jan --  
The Hourlies

Wed Feb 18 01:26:15
A block of errors seen in TrigMon, DCAS vs ADMEM comparison. Paged Carla. She's investigating the problem right now.
 - Cheng-Ju
-- Wed Feb 18 01:45:55 comment by...carla --  
The errors occur in a cluster of events in a short range of time. 
All towers involved are in PCAL5. The same problem happened 
about a week ago, and it went away by itself.
It appears to be ok at the moment.
If it comes back, reboot the crate - check with Calorimeter people first.

Wed Feb 18 02:11:24
 - Jan
-- Wed Feb 18 02:12:50 comment by...Jan --  
abort gap losses basically between 15 - 20 kHz

Wed Feb 18 03:22:07
 - Jan
-- Wed Feb 18 03:23:16 comment by...Jan --  
abort gap losses still fluctuate between 15 and 20 kHz

Wed Feb 18 03:41:10
Automatic HRR issued for Done TO in crate b0cot17.  
From the error log: 
(MLE) b0cot17:Messenger:3:36:14 AM->TDC Invalid Header evt 4123342 slot 18: exp/fnd mod 487/487 rol 7/4 l2b 2/2

(MLE) b0cot17:Messenger:3:36:14 AM->Event 4123342: Bunchcounters in slot 3 (BC=62) and slot 18 (BC=0) disagree (L2B=2)

 - Anadi
Wed Feb 18 04:23:59
 - Jan
-- Wed Feb 18 04:24:27 comment by...Jan --  
same same

Wed Feb 18 04:49:39
Long sequence of L3 REFORMAT ERRORs.
The total reformatter reject rate is 0.01%.
The rejection rate of node 267 chain 1 is 1.
From the error log: 
Error on L3 node b0l3267 (partition 1) 
Found same CrateId twice! 
CrateId     PartId      EvtId       Tracer Word 
0x00000079  0x00000001  0x0044150b  0x001b0179 


Error on L3 node b0l3267 (partition 1) 
VRB word not consistent with the existing one. 
Bank Name   Bank Type   Default     Faulty 
0x44504d54  0x00000003  0x00000309  0x000000a9 
Error during reformatting! 
Exact position not identifiable! 




 - Anadi
Wed Feb 18 05:11:41
 - Jan
-- Wed Feb 18 05:12:16 comment by...Jan --  
4:00 - 5:00 .......

Wed Feb 18 05:41:22
Halt the run since losses are at 20 kHz and MCR is moving the collimator 
 - Anadi
Wed Feb 18 05:41:58 Abort Gap loss exceeds 20K, called MCR to move collimator to reduce the loss. - weiming
Wed Feb 18 05:42:13
abort gap losses touch 20 kHz: 

collimator movement 
-> usual standby procedure
 - Jan
Wed Feb 18 06:33:45 After playing almost an hour, the abort gap loss is reduced to 16k. Resuming the run. - weiming
Wed Feb 18 06:35:13 start the run again after collimator moved - Anadi
Wed Feb 18 06:37:19
This time the collimator movements took really long - almost 1 hour!!!
detectors are in on-state again
 - Jan
Wed Feb 18 06:48:55
 - Jan
-- Wed Feb 18 06:50:28 comment by...Jan --  
5:00 - 6:45 : including collimator movements

Wed Feb 18 07:11:07
L3 reformat error from node 267 partition 1 again.  
From the error log: 
Reject rate: 9.181 percent 
VRB word not consistent with the existing one. 
Bank Name   Bank Type   Default     Faulty 
0x44504d54  0x00000003  0x0000720b  0x000000f6 
Error during reformatting! 
Exact position not identifiable! 


 - Anadi
Wed Feb 18 07:38:17
 - Jan
-- Wed Feb 18 07:40:56 comment by...Jan --  
6:45 - 7:30: stormy abort gap losses till the end

Wed Feb 18 07:46:17 Run 179156 Terminated at 2004.02.18 07:46:06 - RunControl
Wed Feb 18 07:47:39 Run 179156 TERMINATE: end run for separator scan test  - Anadi x2080
Wed Feb 18 07:47:39 MCR called, they are ready to do separator scans. End the run. - weiming
Wed Feb 18 07:55:20
Run Number Data Type Physics Table Begin Time End Time Live Time L1 Accepts L2 Accepts L3 Accepts Live Lumi, nb-1 GR SC RC
179156 x2BBD4 BEAM PHYSICS_2_01 [4,416,424] 20:54:29 07:46:06 09:06:17 447,895,478 5,967,723 1,195,098 576.374 1 1
Totals 07:55:02 09:06:17 447,895,478 5,967,723 1,195,098 576.374
 - End of Shift Report
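As a cross-check of the run summary above, the wall-clock length of run 179156 and its live fraction can be worked out from the Begin Time, End Time and Live Time columns. This is only a minimal sketch: it assumes the run began on the evening before this shift (the End Time is earlier in the day than the Begin Time), and the helper names `hms_to_timedelta` and `run_duration` are made up for illustration.

```python
from datetime import timedelta


def hms_to_timedelta(text):
    """Convert an 'HH:MM:SS' string from the run summary into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def run_duration(begin, end):
    """Wall-clock run length; assumes at most one midnight crossing."""
    begin_td = hms_to_timedelta(begin)
    end_td = hms_to_timedelta(end)
    if end_td < begin_td:
        # Assumption: the run started on the previous calendar day.
        end_td += timedelta(days=1)
    return end_td - begin_td


# Values copied from the run 179156 row of the summary table.
duration = run_duration("20:54:29", "07:46:06")
live_time = hms_to_timedelta("09:06:17")
live_fraction = live_time / duration

print("Run duration :", duration)
print(f"Live fraction: {live_fraction:.3f}")
```

Under these assumptions the run lasted 10:51:37, of which a little under 84% was live time.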
Wed Feb 18 07:55:27 Shift Summary:
1) Continue taking data through the shift @ L=19E30 with default table.


2) 1:20 AM, CO noticed some errors in TrigMon, DCAS vs ADMEM, paged expert. Carla called back: the errors occurred in PCAL5 within a short period of time and went away by themselves.


3) 5:40 AM abort gap loss exceeds 20K, called MCR to move the collimator to reduce the loss. After playing almost an hour, the loss is reduced to 16K.

4) 6:30 AM Resuming the run.  

5) 7:45 AM, MCR called, they are ready to do separator scans. End the run. Leave CLC on, put COT off and the rest to standby.

 

End of Shift Numbers
CDF Run II

Runs                   179156
Delivered Luminosity   468 nb-1  
Acquired Luminosity    403.4 nb-1  
Efficiency             86.2 %

 - weiming
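The efficiency quoted in the End of Shift Numbers is consistent with acquired luminosity divided by delivered luminosity. A minimal sketch of that arithmetic follows; it assumes this is how the number is defined, and the function name `efficiency_percent` is hypothetical.

```python
def efficiency_percent(acquired_nb, delivered_nb):
    """Data-taking efficiency in percent, assumed to be acquired / delivered."""
    if delivered_nb <= 0:
        raise ValueError("delivered luminosity must be positive")
    return 100.0 * acquired_nb / delivered_nb


# Values from the End of Shift Numbers, both in nb-1.
shift_efficiency = efficiency_percent(acquired_nb=403.4, delivered_nb=468.0)
print(f"Efficiency: {shift_efficiency:.1f} %")
```

With 403.4 nb-1 acquired out of 468 nb-1 delivered this gives 86.2 %, matching the reported value.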
Wed Feb 18 07:58:34 Run 179157 ACTIVATE: L2_TORTURE[15,390,406] - Anadi x2080