2004 CDF E-Log -- Eve shift. Sat Feb 28, 2004
SciCo           DAQ Ace       Monitoring Ace  CO          (Operations Manager)
Stephan Lammel  Anadi Canepa  Jan Ehlers      Diego Cauz  JJ Schmidt


Start of Shift Notes:  

Store 3261 (luminosity 2.0E31) and run 179473 (Silicon in, COT in degraded config) in
progress.

Sat Feb 28 16:31:19
trigger inhibit from TOF system (HV bars turned blinking red) 
cleared itself after 30 seconds 
global alarms for TOF show wrong button description
 - Jan :: (run 179473)
Sat Feb 28 16:44:40 Run 179473 Terminated at 2004.02.28 16:44:14 - RunControl
Sat Feb 28 16:45:03 Run 179473 TERMINATE: Event funnel crashed  - Anadi x2080
Sat Feb 28 16:45:28
Plot requested by MCR. Rainer and I agree that it looks pretty flat.
 - du
Sat Feb 28 16:48:23
I had to end the run and reset since c16 subfarm lost the monitoring (gold).  
Event funnel crashed on the processor node.
 - Anadi
Sat Feb 28 16:52:39 Run 179474 Activated at 2004.02.28 16:51:02 - RunControl
Sat Feb 28 16:52:40 Run 179474 ACTIVATE: start run after L3 problem (subfarm c16, event funnel crashed)  - Anadi x2080
Sat Feb 28 17:06:46
 - Jan
-- Sat Feb 28 17:08:30 comment by...Jan --  
Hourlies (1515-1700)
-in addition to the PAGC spikes a slow rise is also seen

Sat Feb 28 17:14:20
total hits appear to have increased one or two times compared to the reference plots. Is this correct?
 - diego
Sat Feb 28 17:16:31
same comment, especially layer 3
 - diego
Sat Feb 28 17:42:35
Automatic HRR issued due to L2 TO, following almost 10 "dumping CLIST data" errors. 
 - Anadi :: (run 179474)
Sat Feb 28 17:42:41
I also checked the suspect (dead/noisy) channels for the IMU, 
this is the layout I have found: 
("-" means that a channel left the list;"+" that it joined it) 
WEST: L0:-1+2	L1:+2	L2:+2	L3:-3+2 
EAST: L0:-1	L1:-1	L2:-2	L3:-2 
so apparently the west detector has 8 'new' suspect channels, 
compared to the reference plots taken about one year ago.
 - diego :: (run 179473)
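
The "+"/"-" notation in the entry above can be tallied mechanically. The following is a minimal sketch, assuming the layout strings are typed in exactly as written in the entry; the names `layout` and `tally` are illustrative and not part of any CDF monitoring tool.

```python
import re

# Layout strings copied from the entry: "+N" = N channels joined the
# suspect list, "-N" = N channels left it (relative to the reference plots).
layout = {
    "WEST": "L0:-1+2 L1:+2 L2:+2 L3:-3+2",
    "EAST": "L0:-1 L1:-1 L2:-2 L3:-2",
}

def tally(spec):
    """Return (joined, left) channel counts summed over all layers."""
    joined = sum(int(n) for n in re.findall(r"\+(\d+)", spec))
    left = sum(int(n) for n in re.findall(r"-(\d+)", spec))
    return joined, left

counts = {side: tally(spec) for side, spec in layout.items()}
for side, (joined, left) in counts.items():
    print(f"{side}: {joined} joined, {left} left")
```

With these inputs the west side has 8 channels that joined the list (and 4 that left), while the east side has none that joined and 6 that left, matching the count of 8 'new' west channels quoted in the entry.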
Sat Feb 28 18:14:49 MCR called regarding the proton abort gap loss spikes: they confirm that the problem is with the Tevatron, they observe the same structure in an E0 abort gap monitor. The abort gap spikes are correlated with the proton global phase noise monitor, i.e. RF related. They are involving additional experts. - Stephan
-- Sat Feb 28 18:25:42 comment by...jj --  
Note that our eLog from previous shift has multiple plots
showing the E0 loss monitor and the correlation. There is some
commonality in the readout system between the CDF abort gap
losses and the E0 variable however. This is why MCR asked for
plot of temperatures in the readout crate.

The proton global phase noise monitor is the first truly
independent variable that proved to the MCR that this is
not an instrumentation problem.

-- Mon Mar 1 07:53:06 comment by...R.J. Tesarek --  
The E0 and CDF variables only have a common method for readout. The crate containing the logic for the E0 variables is located at E0. The crate containing the B0 variables is located at B0. A third variable, B0MSC3 uses an independent readout (MADC compared with the scalers). This variable also shows the bumps in the abort gap (see plots for day shift).

-- Mon Mar 1 08:45:46 comment by...R. Vidal --  unless something has changed, B0MSC3 is also read out through the scalers...
Sat Feb 28 18:20:02
Tevatron expert has identified one more variable that 
correlates with abort gap pulsations. Look at 
T:PGPN (Proton Global Phase Noise) - it can be  
plotted from data logger .Inst2  .  
Ops are contacting TeV RF experts to look further.
 - JJ
Sat Feb 28 18:29:17
Tevatron has turned off the longitudinal dampers to see 
if abort gap oscillation goes away. If no change, TeV will 
turn dampers back on.
 - jj
-- Sat Feb 28 18:32:14 comment by...JJ --  
I might note that Rick Vidal told me to ask Tevatron to
check state of the longitudinal dampers at 11:30 a.m. this
morning. I did not....

Sat Feb 28 18:53:06
SCPU_TRACER_EVENT_ID_ Error in the Hardware EVB due to FER crate XFT_FINDER_04. Automatic HRR
issued twice.  


 - Anadi
Sat Feb 28 19:01:41 Run 179474 Terminated at 2004.02.28 19:01:31 - RunControl
Sat Feb 28 19:02:32 Run 179474 TERMINATE: end run since MCR has to open the helix  - Anadi x2080
Sat Feb 28 19:03:58
 - Jan
-- Sat Feb 28 19:04:42 comment by...Jan --  
Hourlies from 1700-1900
abort gap losses fall slowly

-- Sat Feb 28 19:09:56 comment by...jj --  
Did abort gap pulsations go away??

Sat Feb 28 19:05:03 MCR (Dan) called, they would like to start opening the helix. There will be anti-proton studies at 8 pm and they would like to do it before. Bringing down HV to standby, COT off, CLC kept on. - Stephan
Sat Feb 28 19:07:07
Tevatron helix studies: 

ramping down voltages to standby state (except CLC [on] and COT [off])
 - Jan
Sat Feb 28 19:37:04 MCR (Dan) called helix studies complete. Abort gap losses 5 kHz. - Stephan
Sat Feb 28 19:47:02 Run 179475 Activated at 2004.02.28 19:46:36 - RunControl
Sat Feb 28 19:47:30 Run 179475 ACTIVATE: start test run TEST_ALPHA_CLUSTERING_NOSPIKES[2,430,403] - Anadi x2080
Sat Feb 28 20:57:04 Run 179475 Terminated at 2004.02.28 20:56:44 - RunControl
Sat Feb 28 20:58:37 from MCR e-log:
===============
-- Sat Feb 28 18:37:33 comment by...CE -- We're also looking at the longitudinal dampers as a potential culprit. Tan instructed us to turn it off for a while as a risk-free test.
-- Sat Feb 28 19:09:54 comment by...DJC -- We have not had a spike in the abort gap losses since the longitudinal damper was turned off. It would appear there is a timing problem with the damper system.
 - Stephan
Sat Feb 28 21:09:27 Run 179476 ACTIVATE: DMODE Si calibration DPS ON  - anadi X 2080
Sat Feb 28 21:15:47 Run 179477 ACTIVATE: start Si DMODE Calib DPS ON  - anadi X 2080
Sat Feb 28 21:16:56
At the end of the ALPHA clustering test run, rc was stuck and I had to close and restart it.
For this reason I couldn't select "potentially good run". We tried to mark it good but we were not
allowed.  

 - Anadi :: (run 179475)
Sat Feb 28 21:19:06 Run 179475 RUNSTATUS:
Marked Bad, explanation:
SVX  \
ISL   |
L00   | Alpha Clustering Test Run without Silicon
SVT  /
 - cdfscico
Sat Feb 28 21:20:21 Run 179477 TERMINATE: end Si DMODE Calib DPS ON  - anadi X 2080
Sat Feb 28 21:23:44 Run 179478 ACTIVATE: start Si DMODE Calib DPS OFF  - anadi X 2080
Sat Feb 28 21:28:43
 - Jan
-- Sat Feb 28 21:29:56 comment by...Jan --  
Hourlies (1900-2100)

huge broad spikes are due to the HELIX STUDIES

Sat Feb 28 21:39:37 Run 179480 Activated at 2004.02.28 21:39:09 - RunControl
Sat Feb 28 21:40:10 Run 179480 ACTIVATE: start new physics run after helix studies and Si Calib new trigger table PHYSICS_2_03[1,431,435] - anadi X 2080
Sat Feb 28 22:46:27
 
SVX over current trip DVDD, barrel 4 wedge 3 
   - ACE tries to reset but gets "Handshake no responds from FIB" 
   - page Silicon, Pete calls back, will log in remotely to check 
   - BIAS trips too, barrel 4 wedge 3 
 - Stephan
Sat Feb 28 22:46:51 we halt the run to recover a silicon trip.  - Anadi
Sat Feb 28 22:56:05
I do not currently know when store 3261 will end.
My guess is that TeV Run Coordinator will let it go until 
early Sunday morning. However, if store is ended before 0800 
Sunday or if it terminates itself during owl shift, here is 
the plan. (Please pass to owl SciCo and Aces.) 

0) Page Ops if store terminates abnormally. 
1) Do Silicon checkout if appropriate. 
2) Do quiet time and non-quiet time calibrations. 
* D0 has asked for 1 hour controlled access so we 
  should have plenty of quiet time. * (CDF currently 
  has no short access requests.) 
3) Test PHYSICS_HIGHLUM_2_03[1,432,436] with no beam 
   and AAA_SHOTSETUP configuration. Run for 5 minutes, 
   there should be at least 1.7 MHz of MB_XING in the 
   trigger rate but probably nothing else. 
4) Take cosmics while waiting for shot setup. 
5) For next shot setup, page ops when final protons being 
   loaded and Silicon Primary and SPL when pbars start to 
   be loaded. 
6) Start next store with PHYSICS_HIGHLUM_2_03[1,432,436] 
   if it was tested successfully in item #3.
 - JJ
Sat Feb 28 23:09:19 we recover the run from the Si trip.  - Anadi
Sat Feb 28 23:14:16 Usual window with the high dead time warning (no voice message). L1 rate 13kHz, L2 rate 165 Hz, L3 rate 37 Hz  - Anadi
Sat Feb 28 23:14:26
DVDD of B4W3L0 and B4W3L1 tripped 
shortly after that also their bias tripped 

the normal procedure for Si trip recovery DIDN'T work 
paged Si expert -> solved problem (the how is unfortunately secret ;-)
 - Jan & Anadi in thoughts
-- Sun Feb 29 12:06:54 comment by...Rainer --  hmmm .... the normal procedure for Si trip recovery DID work - only YOU did not put the run in HALT, as is part of the normal procedure. once you did that, the normal trip recovery worked. also, there were no bias trips, at least not reported in the log files. See silicon elog
-- Sun Feb 29 14:53:41 comment by...Anadi --  We tried to recover with the run in active state, but soon after we recalled that it should be in halt. Once we put it in halt we tried again and the normal procedure didn't work. After that the Si pager called and worked on the ladder. With bias trip we probably mean that the bias cell in the monitor went red.
-- Sun Feb 29 16:45:20 comment by...Rainer --  See comment here
Sat Feb 28 23:20:44
 - Jan
-- Sat Feb 28 23:21:21 comment by...Jan --  
Hourlies 2100-2315

Sat Feb 28 23:55:19
Run Number     Data Type  Physics Table                               Begin Time  End Time  Live Time  L1 Accepts   L2 Accepts  L3 Accepts  Live Lumi, nb-1  GR SC RC
179473 x2BD11  BEAM       PHYSICS_2_03 [1,431,435]                    09:56:16    16:44:14  06:36:30   347,468,354  5,148,616   1,020,649   530.084          1 1 1
179474 x2BD12  BEAM       PHYSICS_2_03 [1,431,435]                    16:51:02    19:01:31  02:06:32   117,417,684  1,454,340   312,876     142.922          1 1 1
179475 x2BD13  BEAM       TEST_ALPHA_CLUSTERING_NOSPIKES [2,430,403]  19:46:36    20:56:44  01:06:27   1,087,171    208,800     208,800     69.290           1 1
179480 x2BD18  BEAM       PHYSICS_2_03 [1,431,435]                    21:39:09              01:09:12   51,959,834   659,962     147,990     67.886           1
Totals                                                                            23:55:01  10:58:42   517,933,043  7,471,718   1,690,315   810.182
 - End of Shift Report
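
As a cross-check of the Totals row, the per-run columns can be re-added. This is a minimal sketch: the numbers are copied from the End of Shift Report table, while the dictionary layout, the field names, and the helper `to_seconds` are illustrative assumptions and not part of any CDF tool.

```python
# Per-run values copied from the End of Shift Report table.
# Field names and layout are illustrative, not a CDF data format.
runs = {
    179473: {"live": "06:36:30", "l1": 347_468_354, "l2": 5_148_616, "l3": 1_020_649, "lumi_nb": 530.084},
    179474: {"live": "02:06:32", "l1": 117_417_684, "l2": 1_454_340, "l3": 312_876, "lumi_nb": 142.922},
    179475: {"live": "01:06:27", "l1": 1_087_171, "l2": 208_800, "l3": 208_800, "lumi_nb": 69.290},
    179480: {"live": "01:09:12", "l1": 51_959_834, "l2": 659_962, "l3": 147_990, "lumi_nb": 67.886},
}

def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds

# Column sums over all four runs.
totals = {key: sum(run[key] for run in runs.values())
          for key in ("l1", "l2", "l3", "lumi_nb")}
live_seconds = sum(to_seconds(run["live"]) for run in runs.values())

print(totals)
print(live_seconds)
```

The accept counts reproduce the Totals row exactly, and the luminosity sum agrees with 810.182 nb-1 up to floating-point rounding. The summed live time is 39521 s (10:58:41), one second short of the quoted 10:58:42, which would be consistent with the per-run live times being rounded for display.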
Sat Feb 28 23:57:10 Shift Summary:
Store 3261 still in progress, luminosity dropped from
2.0E31 to 1.?E31 

   - continued data taking with new low lum trigger table (COT in degraded 
      config, Silicon on) 
   - spikes in proton abort gap loss traced by MCR to timing problem with 
      longitudinal dampers 
   - MCR opened helix at 19:00, done about 30 min later 
   - took Alpha Clustering Test data 
   - took Silicon dmode calibrations 
   - resumed data taking with new low lum trigger table (COT in degraded 
      config, Silicon on) 
   - SVX over current DVDD/BIAS trip 
   - plug HV problem 

Plan is to continue data taking until end of store 
   - about 1 hour access after store, no CDF requests 
   - see JJ's owl shift plan above

End of Shift Numbers
CDF Run II

Runs                   179473, 179474, 179475, 179480
Delivered Luminosity   0.504 fb^-1  
Acquired Luminosity    0.344 fb^-1  
Efficiency             68.2 %

 - Stephan
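
The efficiency figure in the End of Shift Numbers can be checked from the two luminosities. This is a minimal sketch, assuming efficiency is defined as acquired over delivered luminosity (the entry does not state the definition); the variable names are illustrative.

```python
# Values copied from the End of Shift Numbers (CDF Run II), in fb^-1.
delivered_fb = 0.504
acquired_fb = 0.344

# Assumed definition: efficiency = acquired / delivered, in percent.
efficiency_pct = 100.0 * acquired_fb / delivered_fb

print(f"{efficiency_pct:.2f}")
```

From the three-digit inputs this gives about 68.25, which agrees with the quoted 68.2 to within the rounding of the inputs; the quoted value was presumably computed from unrounded luminosities.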
Sat Feb 28 23:58:00
plug (PEM, PHA, PSH) hardware problems (HV went yellow); while trying the power cycle procedure
several times we lost communication (HV went grey) 

-> after doing the three recovery steps on the third floor the plug is back to normal!
 - Jan & Anadi
Sun Feb 29 00:03:33 Shift Summary:
ending again
 - miscetti
Sun Feb 29 16:44:23 ok now I understand from your clarification that you realized that you as daq ace needed to go into HALT - this was not clear to me from the original entry. so I am sorry.

the second time did not work because ladders might trip on DVDD first, but the second time it usually works. alternatively, the monitoring ace could have put the whole wedge to OFF and ON as written in the instructions in front of the PSGui computer, as was then finally done. but it is also true that resetting single ladder trips is a faster way - the penalty being that one has to recover maybe twice.  (entry outside this shift's time range ) - Rainer