2004 CDF E-Log -- Owl shift. Mon Mar 8, 2004
SciCo        DAQ Ace          Monitoring Ace   CO               (Operations Manager)
Rick Field   Susana Cabrera   Tom Schwarz      Andrei Loginov   Mary Convery


Start of Shift Notes:  

Continuing Store 3277.  Taking good data.

Mon Mar 8 00:22:06
at 12:08 SCPU_TRACER_EVENT_ID Error: 
Hardware EVB again detected a problem with data quality in  
SCPU b0eb16 (forwarded by FER crate COT_17).  
 
 - Susana.
-- Mon Mar 8 02:48:36 comment by...Susana. --  
This problem persists every 10 minutes approximately.

Mon Mar 8 00:22:40
TOF PC frozen, two trigger inhibits: IFIX:TOF HV
 - Susana.
Mon Mar 8 00:22:56
SMACS frozen on TOF PC.  Restarted PC.  All should be fine.
 - Tom
Mon Mar 8 00:40:49 WHA has two hot towers Phi-3-eta-8-R and Phi-10-eta-6-L. This causes the WHA Global Alarm and HW Summary to be yellow. This is a known problem since it appears in the day E-log (3-7-04). We are continuing to run with it yellow (as they did earlier). - Rick Field :: (run 179726)
Mon Mar 8 01:00:16
 - (hourlies) Tom
Mon Mar 8 01:26:19
PHA Tower Plots(YMon #16 PHA) cold tower in the PHA: phi=18, eta=21
 - Andrei
Mon Mar 8 01:34:21
WHA Tower Plots (YMon #11 WHA): 2 "new" hot towers compared to reference plots: (eta=-6, phi=23), (eta=-7, phi=13). See previous shift for other plots of WHA (they are the same as those we have now).
 - Andrei
-- Mon Mar 8 07:46:40 comment by...Larry Nodulman --  This actually looks pretty good - average ph is not a good monitor with low statistics.
Mon Mar 8 01:43:45
CPR Wire Plots(YMon #113 CPR) look hotter than the reference plots.
 - Andrei
-- Mon Mar 8 07:48:13 comment by...Larry Nodulman --  Reference plot is probably for lower luminosity, CPR counting grows.
Mon Mar 8 01:55:25
XMon: the hottest trigger: 

 L3 80 EXO_TWO_TRACK_L1_SEVEN_TRK2_v2 7.53  

average x-sec:22.28 
expected x-sec:8.88±1.78
 - Andrei :: (run 179726)
-- Mon Mar 8 07:05:36 comment by...Andrei --  
now it is
9.42 (standard deviations from the expected x-sec)
compared to
7.53 before...
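
For reference, the 7.53 quoted above is consistent with taking the difference between the average and expected x-sec and dividing by the quoted uncertainty. A minimal sketch, assuming that is how XMon defines the number (the function name is hypothetical and not part of XMon):

```python
def deviation_in_sigma(observed, expected, sigma):
    """Return how many standard deviations `observed` lies from `expected`."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (observed - expected) / sigma


# Values from the XMon entry: average x-sec 22.28, expected 8.88 +/- 1.78.
pull = deviation_in_sigma(22.28, 8.88, 1.78)
print(f"{pull:.2f} standard deviations above expected")
```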

Mon Mar 8 02:02:55
 - (hourlies) Tom
-- Mon Mar 8 02:03:43 comment by...Tom --  sub 3 minute hourly plotting. A new record for me.
Mon Mar 8 02:13:06 B0PAGC is at 17 kHz and has increased 3 kHz over the last hour. At this rate we will be at 20 kHz in another hour. - Rick Field
Mon Mar 8 02:46:18 B0PAGC went above 18 kHz (reached a high of about 18.5) and TeVMon went "pink". I called MCR and they said that they would make a few minor adjustments. B0PAGC came down to around 16 kHz so things are okay for now. I did not page Silicon. - Rick Field
Mon Mar 8 02:47:45
CER_SVXMON_HALT_RECOVER_RUN_ERROR: 
Stuck Cellid I/B5/W5/L1/C8-11 .  
AUTO HRR was issued 
 - Susana.
Mon Mar 8 03:09:09
 - (hourlies) Tom
Mon Mar 8 03:30:37
CER_SVXMON_HALT_RECOVER_RUN_ERROR 
Stuck Cellid S/B1/W5/L4/C7-13 .  
AUTO HRR was issued 
 - Susana.
Mon Mar 8 03:42:34
 SCPU_TRACER_EVENT_ID Error 
 Hardware EVB has detected a problem with data quality in  
 SCPU b0eb16  
The errorlog says: 
(forwarded by FER crate COT_17).  
The window that popped up from run control says: 
(forwarded by FER crate WCAL_02). 
Automatic HRR.
 - Susana
-- Mon Mar 8 07:34:43 comment by...Donatella T. --  the window (if orange) also comes from the error logger. I'll look into the discrepancy in the crate name.
Mon Mar 8 03:48:46
SCPU_TRACER_EVENT_ID Error 
Hardware EVB has detected a problem with data quality in  
SCPU b0eb16  
Errorlog says: forwarded by FER crate COT_17 
Run Control says: forwarded by FER crate LEVEL1_CAL_01 
Automatic HRR 

 - Susana.
Mon Mar 8 04:01:21
 - (hourlies) Tom
Mon Mar 8 04:08:06
SCPU_TRACER_EVENT_ID Error 
Hardware EVB has detected a problem with data quality in  
SCPU b0eb16 (forwarded by FER crate COT_17). (Same in the  
errorlog and in the run control.) 
AUTO HRR
 - Susana.
Mon Mar 8 04:16:04
 SCPU_BAD_VRB_BYTE_COUNT Error 
 Hardware EVB has detected a problem with data quality in  
 SCPU b0eb11.  
 AUTO HRR
 - Susana.
Mon Mar 8 04:30:57 Abort gap losses above 20kHz. Very erratic at this point. HV in state prepared for scraping. Unsure if MCR will scrape. Been in this state for roughly 15 minutes. - Tom
Mon Mar 8 04:37:41
Plot of Abort Gap losses last 20 minutes. at 21kHz right now. Silicon put in standby after first dip downwards in abort gap.
 - Susana/Tom
Mon Mar 8 04:42:11 Abort Gap Losses reached 19 kHz and TevMon became "pink". I called MCR and they said they would try and improve things. However, the losses remained around 19 kHz for about 15 min and then we saw a sharp dip downward. I decided to put the Silicon on "stand-by". I called MCR and told them we were going to "stand-by" and they said good. I asked them to call me if they were going to move the collimator. Right after we put Silicon on "stand-by" the abort gap losses jumped to 30 kHz. We halted the run and put everything in "move collimator" mode and we are waiting for things to stabilize and for MCR to call back. I paged Silicon and told her the situation. - Rick Field
-- Mon Mar 8 04:47:27 comment by...Rick Field --  Note that we put the Silicon on "stand-by" before TevMon turned red!
Mon Mar 8 04:44:24
SVXMON has not sent a heartbeat to the consumer error interface for the last 20 minutes.
The CO person is going to restart the SVXMON consumer.
 - Susana.
-- Mon Mar 8 04:56:16 comment by...Andrei --  Well, take a look at the link below. It is the same message/problem as we had last night
Mon Mar 8 05:03:53 MCR called and said they were going to scrape.  - Rick Field
Mon Mar 8 05:13:06
 - Tom and Susana
-- Mon Mar 8 05:14:26 comment by...Tom and Susana. --  
B0PAGC below 20 kHz after scraping.

Mon Mar 8 05:20:13 MCR called and said they finished scraping. B0PAGC is now down to around 9 kHz. I paged Silicon and we have permission to turn the Silicon back on. We are going to start taking data again... after more than an hour of dead time. - Rick Field
Mon Mar 8 05:24:08
Run 179726 was halted from 4:18 to 5:20 
because of the high abort gap losses.  
At 5:20 it was recovered and resumed.
 - Susana.
Mon Mar 8 05:24:57
 - (hourlies) Tom
Mon Mar 8 06:02:31
Trigger inhibit was set from IFIX:VME POWER for a few seconds.
 - Susana.
Mon Mar 8 06:38:07
 - hourlies Tom
Mon Mar 8 07:18:31 IFIX/HVMON Froze in MUON 3 PC. Restarted the PC. All is fine now. - Tom
Mon Mar 8 07:33:06
 - hourlies Tom
Mon Mar 8 07:40:06
Frequency of AUTO HRR due to SCPU_TRACER_EVENT_ID Error 
(time of HRR, then minutes since the previous one) 
1:11 
1:13  +   2 min 
1:19  +   6 min 
1:57  +  38 min 
2:10  +  13 min 
2:14  +   4 min 
2:31  +  17 min 
2:44  +  13 min 
3:15  +  31 min 
3:39  +  24 min 
3:42  +   3 min 
3:47  +   5 min 
4:06  +  19 min 
4:14  +   8 min 
4:15  +   1 min 
6:00  + 105 min 
 - Susana.
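
The gaps between successive AUTO HRRs can be recomputed directly from the listed times. A minimal sketch, assuming all times fall on the same morning and are written as h:mm (the helper names are hypothetical):

```python
# Times of AUTO HRR as listed in the entry above (h:mm, same morning).
hrr_times = ["1:11", "1:13", "1:19", "1:57", "2:10", "2:14", "2:31", "2:44",
             "3:15", "3:39", "3:42", "3:47", "4:06", "4:14", "4:15", "6:00"]


def to_minutes(hhmm):
    """Convert an 'h:mm' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def gaps_in_minutes(times):
    """Minutes between each time and the one before it."""
    mins = [to_minutes(t) for t in times]
    return [later - earlier for earlier, later in zip(mins, mins[1:])]


gaps = gaps_in_minutes(hrr_times)
print(hrr_times[0])
for t, gap in zip(hrr_times[1:], gaps):
    print(f"{t}  + {gap:3d} min")
```

Note that the last gap spans the period when the run was halted for the abort gap losses, so it says little about the underlying error rate.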
Mon Mar 8 07:48:55 Shift Summary:
*** Store 3277 Continuing *** 
Run 179726:  Taking good data. 

*** WHA Hot Towers *** 
WHA has two hot towers Phi-3-eta-8-R and Phi-10-eta-6-L.  
This causes the WHA Global Alarm and HW Summary to be  
yellow. This is a known problem since it appears in the  
day E-log (3-7-04). We are continuing to run with it 
yellow (as they did earlier). 

*** Abort Gap Losses (1st time) *** 
B0PAGC went over 18 kHz (reached a high of about 18.5). 
I called MCR and they said they would make a few 
minor adjustments.  They lowered the vertical base 
by 0.0005 and B0PAGC came down to 16 kHz so we 
did not have to turn Silicon to "stand-by". 

*** Abort Gap Losses (2nd time) *** 
B0PAGC went over 18 kHz. I called MCR and they said  
they would make a few minor adjustments.  However  
this time B0PAGC stayed at around 19 kHz for about  
15 min and then there was a sharp drop.  I decided  
to turn the Silicon to "stand-by".  TevMon was not  
red yet but things seemed erratic and I was afraid  
the sharp drop would be followed by an even sharper  
rise.  I called MCR and told them that I was going to  
"stand-by" mode and they said "good".  Right after 
we went to "stand-by" the abort gap losses went 
over 20 kHz and TeVMon went red.  There was a spike 
to 30 kHz (see the plot).  We went to "scraping" 
mode and waited.  B0PAGC stayed above 20 kHz for about 
45 min and then MCR called and said they were going  
to scrape (note that D0 was just then going to 
"stand-by").  We were, of course, already in 
"stand-by".  After the scraping the abort gap 
losses stabilized at around 9 kHz and we turned 
everything back on (after we got permission from  
Silicon).  Unfortunately we had over an hour of 
down time... but we protected the Silicon! 

*** Muon PC 2 Died Twice *** 
Tom had to go down and restart it. 


End of Shift Numbers
CDF Run II

Runs                   179726
Delivered Luminosity   718.8 nb-1  
Acquired Luminosity    594.2 nb-1  
Efficiency             82.7 %

 - Rick Field
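
The efficiency quoted in the End of Shift Numbers is consistent with acquired luminosity divided by delivered luminosity. A minimal sketch of that check, assuming this is the definition used (the function name is hypothetical):

```python
def data_taking_efficiency(acquired_nb, delivered_nb):
    """Percentage of the delivered luminosity that was acquired."""
    if delivered_nb <= 0:
        raise ValueError("delivered luminosity must be positive")
    return 100.0 * acquired_nb / delivered_nb


# Shift numbers from above: 594.2 nb-1 acquired out of 718.8 nb-1 delivered.
eff = data_taking_efficiency(594.2, 718.8)
print(f"Efficiency: {eff:.1f} %")
```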
Mon Mar 8 07:56:10
Run Number Data Type Physics Table Begin Time End Time Live Time L1 Accepts L2 Accepts L3 Accepts Live Lumi, nb-1 GR SC RC
179726 x2BE0E BEAM PHYSICS_2_02 [2,424,431] 14:56:35 14:32:50 779,371,763 13,941,470 2,931,644 1796.705 1
Totals 07:55:03 14:32:50 779,371,763 13,941,470 2,931,644 1796.705
 - End of Shift Report