2004 CDF E-Log -- Eve shift. Sun Mar 7, 2004
SciCo: Rei Tanaka
DAQ Ace: Andrew Ivanov
Monitoring Ace: Ian Vollrath / Tom Schwarz
CO: Guenakh Mitselmakher
Operations Manager: Mary Convery


Start of Shift Notes:  

Inherited store #3277. Lumi = 53E30. Aiming to get good data on a windy Sunday evening.

Sun Mar 7 16:20:05
 - plots (14:48 (shot) - 16:00) --ian
-- Sun Mar 7 16:54:16 comment by...Rei Tanaka --  Here is the MCR e-Log entry that explains what they did at the beginning of this store when LOSTP > 20 kHz. Does a secondary collimator need to be aligned?
-- Sun Mar 7 15:45:35 comment by...rm -- Lostp was bumping up to 25 kHz; CDF even turned off their silicon. I tried raising and lowering both horz and vertical base tunes. The only seemingly satisfactory thing was lowering vert base tune...I left in a 0.0006 decrease on vert base tune. But, that was the last thing I tried, so losses may have started decaying naturally. CDF called to say they were going to turn on, so I stopped twiddling.
-- Sun Mar 7 15:46:37 comment by...rm -- Lostp has also been "spikier" than usual. I'm beginning to wonder if a secondary collimator needs to be aligned.
Sun Mar 7 16:51:18
Extra hot channel 4R east11, as compared to the reference plots
 - G.Mitselmakher
Sun Mar 7 17:08:53
Experiencing lots of event builder errors coming from b0eb16  
from VRB in slot 12: 
(happening every 5-10 minutes now) 

 SCPU_TRACER_EVENT_ID Error !!! 
 Hardware EVB has detected a problem with data quality in 
 SCPU b0eb16 

  
 - Andrew :: (run 179726)
-- Sun Mar 7 17:09:34 comment by...Andrew --  All errors are HRR recovered.
Sun Mar 7 18:10:50
 - plots (16:00-18:00) --ian
Sun Mar 7 18:24:24
lostp spike at ~17:58 up to ~24kHz.
 - ian
Sun Mar 7 18:33:47
CListMon errors are back. Got L2 decision timeout.  

Continue having evb16 errors.
 - Andrew :: (run 179726)
Sun Mar 7 18:47:06
For some reason, old (no longer existing) online monitor processes were still connected to the csl_consend (CSL consumer sending) process and were taking up socket connection slots. There is a maximum number of connections allowed to the CSL (20?), so at some point we could end up unable to start a new online monitor. We had 17 consumers connected, of which 11 were official active ones. Andrew (DAQ ace) and I killed 6 idle csl_consend processes. We managed not to kill the main CSL sender process, so everything is OK at the moment. We still need to find out why the old connections are hanging, but the monitors should be OK for tonight.
 - kaori :: (run 179726)
-- Sun Mar 7 18:49:06 comment by...kaori --  I paged Tony (carrying CSL pager) earlier, and he replied right away, and told me how to find/kill these csl_consend processes.
Sun Mar 7 19:10:16
got cot gas alarms: 

FE-COTBUBBYP  COT Alcohol Bub. Byp Flow  
FE-COTBUBSUP  COT Alcohol Bub. Sup Flow 

Cryo told us the values exceeded their limits only slightly and that we should not worry about it. 
 - ian
-- Sun Mar 7 19:18:21 comment by...Rei Tanaka --  See comments in Cryo&Gas e-Log by Jim H.
Sun Mar 7 19:39:07
In the WHA Occupancy/Leading edge plot, wedge 4 East looks hot compared to the reference plot
 - G.Mitselmakher
Sun Mar 7 19:41:29
 - plots (18:00-19:30) --ian
Sun Mar 7 20:43:13
 - plots (19:30-20:30) --ian
Sun Mar 7 21:34:01
 - plots (20:30-21:30) --ian
Sun Mar 7 21:35:33
There are many "byte shift" errors coming from COT17. Channel data = 0x00000000 0x00000228 0x432b0142 0x004f5444 0x00140012 0x00000000 0x000001ff 0x00000003 ... ... 0x0006e2d0 0x000a0b68 0x000a8200 0x000a8390 0xaa000000 0x55aaaaaa 0x00555555 0x802b0142 --> This type of error is usually due to noise on the backplane. Should verify that the TRACER that was put in the crate was a modified one, with the buffer chip added to filter the backplane noise. The TRACER in this crate will have to be replaced to fix this type of error.
 - Frank Chlebana
-- Sun Mar 7 21:37:00 comment by...Frank Chlebana --  
This time in a nicer format...

There are many "byte shift" errors coming from COT17. 

Channel data = 0x00000000 0x00000228 0x432b0142 0x004f5444 0x00140012 0x00000000 0x000001ff
0x00000003 ... ... 0x0006e2d0 0x000a0b68 0x000a8200 0x000a8390 0xaa000000 0x55aaaaaa 0x00555555
0x802b0142 --> 


This type of error is usually due to noise on the backplane. 

Should verify that the TRACER that was put in the crate was a modified one, with the buffer chip
added to filter the backplane noise. 


The TRACER in this crate will have to be replaced to fix this type of error.
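
A minimal sketch of how a byte shift like the one in the dump above could be spotted offline. The last words of the dump (0xaa000000 0x55aaaaaa 0x00555555) look like an alternating 0xAA/0x55 pattern displaced by a few bytes. Purely as an assumption for illustration, the code takes the intended marker to be the word pair 0xAAAAAAAA 0x55555555 stored little-endian; the real COT/TRACER data format is not given in this log, and `find_byte_shift` is a hypothetical helper, not part of any DAQ software.

```python
def find_byte_shift(words, pattern=(0xAAAAAAAA, 0x55555555)):
    """Return the byte offset (0-3) of `pattern` relative to word boundaries.

    0 means the marker is word-aligned, 1-3 means it is shifted by that
    many bytes, None means the marker was not found at all.
    ASSUMPTION: 32-bit words, little-endian byte order, and the default
    marker pattern are illustrative guesses, not the documented format.
    """
    # Flatten the 32-bit words into a byte stream.
    raw = b"".join(w.to_bytes(4, "little") for w in words)
    target = b"".join(p.to_bytes(4, "little") for p in pattern)
    idx = raw.find(target)
    return None if idx < 0 else idx % 4


# Trailing words copied from the channel data in the entry above.
tail = [0xAA000000, 0x55AAAAAA, 0x00555555]
print(find_byte_shift(tail))
```

Under these assumptions the trailing words contain the marker at a non-zero byte offset, which is what a "byte shift" would look like; a clean event would return 0.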

-- Sun Mar 7 21:41:49 comment by...Andrew --  
There are many tracer ID errors coming from evb16. The messages
say that they are forwarded by mostly COT17 and CMP_00. There 
were also a few errors forwarded by other WCAL,PCAL crates

-- Sun Mar 7 21:43:56 comment by...Rei Tanaka --  Yes, we are suffering from errors every 5-15min. Paged DAQ. Jane Nachtman is in contact with our DAQ ace.
-- Sun Mar 7 21:58:25 comment by...Rei Tanaka --  It seems these errors started after we swapped the TRACER in b0cot17 during an access this morning. Jane is talking to Frank.
-- Sun Mar 7 22:22:20 comment by...Rei Tanaka --  Jane called us. She talked to Frank. We would like to have an access again to check/replace the TRACER in COT17. Ops manager to be informed.
-- Sun Mar 7 22:27:24 comment by...Rei Tanaka --  Informed Ops manager (Mary) that we need an access again.
-- Sun Mar 7 22:31:16 comment by...Rei Tanaka --  Our efficiency has been 92-93% since the beginning of the evening shift. The DAQ dead time is 4-5%. The COT17 TRACER problem is not playing a major role in the DAQ dead time, but of course it is better to fix it ASAP.
Sun Mar 7 21:46:36
PCAL_04 . Shepherded the crate and did HRR.
 - Andrew :: (run 179726)
Sun Mar 7 22:46:11
 - (hourlies) Tom
Sun Mar 7 23:55:25
Run Number Data Type Physics Table Begin Time End Time Live Time L1 Accepts L2 Accepts L3 Accepts Live Lumi, nb-1 GR SC RC
179726 x2BE0E BEAM PHYSICS_2_02 [2,424,431] 14:56:35 07:56:20 430,552,095 8,447,093 1,730,812 1182.479 1
Totals 23:55:02 07:56:20 430,552,095 8,447,093 1,730,812 1182.479
 - End of Shift Report
Sun Mar 7 23:55:25
 - (hourlies) Tom
Sun Mar 7 23:58:56 Shift Summary:
Accelerator 
- Took data for whole shift with store #3277. 
  Luminosity 53E30 (initial) -> 31E30 (final) 
- LOSTP was "spikier" than usual at the beginning of the shot. 

CDF 
- Lots of event builder "byte shift" errors from COT17. 
  Suspect the errors are caused by noise on the backplane of the replaced TRACER. 
  We need another access to check the TRACER. We informed the Ops manager.  
- Instantaneous COT gas flow alarm, but quickly cleared. 
- Six idle csl_consend processes were killed. 
- Unmasked WHA readout channels. Known problem, but data are OK. 

Plan: 
- Continue quiet data taking. 

End of Shift Numbers
CDF Run II

Runs                   179726
Delivered Luminosity   1145.7nb-1  
Acquired Luminosity    1061.7nb-1  
Efficiency             92.7%
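
For reference, the efficiency quoted above is simply acquired over delivered luminosity. The sketch below reproduces the number from the two values in this report; the function name `efficiency_percent` is hypothetical and only for illustration.

```python
def efficiency_percent(acquired_nb, delivered_nb):
    """Data-taking efficiency in percent: acquired / delivered luminosity."""
    if delivered_nb <= 0:
        raise ValueError("delivered luminosity must be positive")
    return 100.0 * acquired_nb / delivered_nb


# Values from the End of Shift Numbers above (nb-1).
print(f"{efficiency_percent(1061.7, 1145.7):.1f}%")
```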

 - Rei Tanaka