2004 CDF E-Log -- Eve shift. Sat Feb 28, 2004
| SciCo | DAQ Ace | Monitoring Ace | CO | (Operations Manager) |
| Stephan Lammel | Anadi Canepa | Jan Ehlers | Diego Cauz | JJ Schmidt |
Start of Shift Notes:  Store 3261 (luminosity 2.0E31) and run 179473 (Silicon in, COT in degraded config) in
progress.
Sat Feb 28 16:31:19
trigger inhibit from TOF system (HV bars turned blinking red)
cleared itself after 30 seconds
global alarms for TOF shows wrong button description
- Jan :: (run 179473)
Sat Feb 28 16:44:40
Run 179473
Terminated at 2004.02.28 16:44:14 - RunControl
Sat Feb 28 16:45:03
Run 179473
TERMINATE: Event funnel crashed - Anadi x2080
Sat Feb 28 16:45:28
 | requested plot by MCR.
Rainer and me agree that it looks pretty flat |
- du
Sat Feb 28 16:48:23
I have to end the run and reset since c16 subfarm lost the monitoring (gold).
Event funnel crashed on the processor node.
- Anadi
Sat Feb 28 16:52:39
Run 179474
Activated at 2004.02.28 16:51:02 - RunControl
Sat Feb 28 16:52:40
Run 179474
ACTIVATE: start run after L3 problem (subfarm c16, event funnel crashed) - Anadi x2080
Sat Feb 28 17:06:46



- Jan
-- Sat Feb 28 17:08:30 comment by...Jan -- Hourlies (1515-1700)
-in addition to the PAGC spikes a slow rise is also seen
Sat Feb 28 17:14:20
 | total hits appear to have increased one or two times
compared to the reference plots.
Is this correct? |
- diego
Sat Feb 28 17:16:31
 | same comment, especially layer 3 |
- diego
Sat Feb 28 17:42:35
Automatic HRR issued due to L2 TO, following almost 10 "dumping CLIST data" errors.
- Anadi :: (run 179474)
Sat Feb 28 17:42:41
I also checked the suspect (dead/noisy) channels for the IMU,
this is the layout I have found:
("-" means that a channel left the list;"+" that it joined it)
WEST: L0:-1+2 L1:+2 L2:+2 L3:-3+2
EAST: L0:-1 L1:-1 L2:-2 L3:-2
so apparently the west detector has 8 'new' suspect channels,
compared to the reference plots taken about one year ago. - diego :: (run 179473)
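The count of 8 'new' suspect channels in the west detector can be cross-checked by summing the "+" entries per layer. The snippet below is only a sketch: the `changes` dictionary and the `tally` helper are illustrative names, not part of any CDF tool, and the strings are copied from the entry above.

```python
import re

# Per-layer changes copied from the entry above:
# "-N" = N channels left the suspect list, "+N" = N channels joined it.
changes = {
    "WEST": "L0:-1+2 L1:+2 L2:+2 L3:-3+2",
    "EAST": "L0:-1 L1:-1 L2:-2 L3:-2",
}

def tally(layout):
    """Return (joined, left) channel counts summed over all layers."""
    joined = left = 0
    for layer in layout.split():
        _, deltas = layer.split(":")
        for sign, count in re.findall(r"([+-])(\d+)", deltas):
            if sign == "+":
                joined += int(count)
            else:
                left += int(count)
    return joined, left

summary = {side: tally(layout) for side, layout in changes.items()}
for side, (joined, left) in summary.items():
    print(f"{side}: {joined} joined, {left} left")
```

With these inputs the west side shows 8 channels joining and 4 leaving, while the east side only loses channels (6 in total).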
Sat Feb 28 18:14:49
MCR called regarding the proton abort gap loss spikes: they
confirm that the problem is with the Tevatron, they observe the same structure in an E0 abort gap monitor. The abort gap spikes are correlated with the proton global phase noise monitor, i.e.
RF related. They are involving additional experts. - Stephan
-- Sat Feb 28 18:25:42 comment by...jj -- Note that our eLog from previous shift has multiple plots
showing the E0 loss monitor and the correlation. There is some
commonality in the readout system between the CDF abort gap
losses and the E0 variable however. This is why MCR asked for
plot of temperatures in the readout crate.
The proton global phase noise monitor is the first truly
independent variable that proved to the MCR that this is
not an instrumentation problem.
-- Mon Mar 1 07:53:06 comment by...R.J. Tesarek -- | | The E0 and CDF variables only have a common method for readout.
The crate containing the logic for the E0 variables is located
at E0. The crate containing the B0 variables is located at B0.
A third variable, B0MSC3 uses an independent readout (MADC compared with the scalers). This variable also shows the bumps
in the abort gap (see plots for day shift). |
-- Mon Mar 1 08:45:46 comment by...R. Vidal -- unless something has changed, B0MSC3 is also read out through the scalers...
Sat Feb 28 18:20:02
Tevatron expert has identified one more variable that
correlates with abort gap pulsations. Look at
T:PGPN (Proton Global Phase Noise) - it can be
plotted from data logger .Inst2 .
Ops are contacting TeV RF experts to look further.
- JJ
Sat Feb 28 18:29:17
Tevatron has turned off the longitudinal dampers to see
if abort gap oscillation goes away. If no change, TeV will
turn dampers back on.
- jj
-- Sat Feb 28 18:32:14 comment by...JJ -- I might note that Rick Vidal told me to ask Tevatron to
check state of the longitudinal dampers at 11:30 a.m. this
morning. I did not....
Sat Feb 28 18:53:06
SCPU_TRACER_EVENT_ID_ Error in the Hardware EVB due to FER crate XFT_FINDER_04. Automatic HRR
issued twice.
- Anadi
Sat Feb 28 19:01:41
Run 179474
Terminated at 2004.02.28 19:01:31 - RunControl
Sat Feb 28 19:02:32
Run 179474
TERMINATE: end run since MCR has to open the helix - Anadi x2080
Sat Feb 28 19:03:58



- Jan
-- Sat Feb 28 19:04:42 comment by...Jan -- Hourlies from 1700-1900
abort gap losses fall slowly
-- Sat Feb 28 19:09:56 comment by...jj -- Did abort gap pulsations go away??
Sat Feb 28 19:05:03
MCR (Dan) called, they would like to start opening the helix. There will be anti-proton studies at 8 pm and they would like to do it before. Bringing down HV to standby, COT off, CLC kept on. - Stephan
Sat Feb 28 19:07:07
Tevatron helix studies:
ramping down voltages to standby state (except CLC [on] and COT [off])
- Jan
Sat Feb 28 19:37:04
MCR (Dan) called helix studies complete. Abort gap losses 5 kHz. - Stephan
Sat Feb 28 19:47:02
Run 179475
Activated at 2004.02.28 19:46:36 - RunControl
Sat Feb 28 19:47:30
Run 179475
ACTIVATE: start test run TEST_ALPHA_CLUSTERING_NOSPIKES[2,430,403] - Anadi x2080
Sat Feb 28 20:57:04
Run 179475
Terminated at 2004.02.28 20:56:44 - RunControl
Sat Feb 28 20:58:37
from MCR e-log:
===============
- Sat Feb 28 18:37:33 comment by...CE -- We're also looking at the longitudinal dampers as a potential culprit. Tan instructed us to turn it off for a while as a risk-free test.
-- Sat Feb 28 19:09:54 comment by...DJC -- We have not had a spike in the abort gap losses since the longitudinal damper was turned off. It would appear there is a timing problem with the damper system. - Stephan
Sat Feb 28 21:09:27
Run 179476
ACTIVATE: DMODE Si calibration DPS ON - anadi X 2080
Sat Feb 28 21:15:47
Run 179477
ACTIVATE: start Si DMODE Calib DPS ON - anadi X 2080
Sat Feb 28 21:16:56
At the end of the ALPHA clustering test run, rc was stuck and I had to close and restart it.
For this reason I couldn't select "potentially good run". We tried to mark it good but we were not
allowed.
- Anadi :: (run 179475)
Sat Feb 28 21:19:06
Run 179475
RUNSTATUS: Marked Bad, explanation:
SVX \
ISL |
L00 | Alpha Clustering Test Run without Silicon
SVT /
- cdfscico
Sat Feb 28 21:20:21
Run 179477
TERMINATE: end Si DMODE Calib DPS ON - anadi X 2080
Sat Feb 28 21:23:44
Run 179478
ACTIVATE: start Si DMODE Calib DPS OFF - anadi X 2080
Sat Feb 28 21:28:43



- Jan
-- Sat Feb 28 21:29:56 comment by...Jan -- Hourlies (1900-2100)
huge broad spikes are due to the HELIX STUDIES
Sat Feb 28 21:39:37
Run 179480
Activated at 2004.02.28 21:39:09 - RunControl
Sat Feb 28 21:40:10
Run 179480
ACTIVATE: start new physics run after helix studies and Si Calib
new trigger table PHYSICS_2_03[1,431,435] - anadi X 2080
Sat Feb 28 22:46:27
SVX over current trip DVDD, barrel 4 wedge 3
- ACE tries to reset but gets "Handshake no responds from FIB"
- page Silicon, Pete calls back, will login remote to check
- BIAS trips too, barrel 4 wedge 3
- Stephan
Sat Feb 28 22:46:51
we halt the run to recover a silicon trip. - Anadi
Sat Feb 28 22:56:05
| I do not currently know when store 3261 will end.
My guess is
that TeV Run Coordinator will let it go until early Sunday
morning. However, if store is ended before 0800 Sunday or
if it terminates itself during owl shift, here is plan.
(Please pass to owl SciCo and Aces..)
0) Page Ops if store terminates abnormally.
1) Do Silicon checkout if appropriate.
2) Do quiet time and non-quiet time calibrations.
* D0 has asked for 1 hour controlled access so we
should have plenty of quiet time. * (CDF currently
has no short access requests.)
3) Test PHYSICS_HIGHLUM_2_03[1,432,436] with no beam
and AAA_SHOTSETUP configuration. Run for 5 minutes,
there should be at least 1.7 MHz of MB_XING in the
trigger rate but probably nothing else.
4) Take cosmics while waiting for shot setup.
5) For next shot setup, page ops when final protons being
loaded and Silicon Primary and SPL when pbars start to
be loaded.
6) Start next store with PHYSICS_HIGHLUM_2_03[1,432,436]
if it was tested successfully in item #3. |
- JJ
Sat Feb 28 23:09:19
we recover the run from the Si trip. - Anadi
Sat Feb 28 23:14:16
Usual window with the high dead time warning (no voice message).
L1 rate 13kHz, L2 rate 165 Hz, L3 rate 37 Hz - Anadi
Sat Feb 28 23:14:26
DVDD of B4W3L0 and B4W3L1 tripped
shortly after that also their bias tripped
the normal procedure for Si trip recovery DIDN'T work
paged Si expert -> solved problem (the how is unfortunately secret ;-)
- Jan & Anadi in thoughts
-- Sun Feb 29 12:06:54 comment by...Rainer -- hmmm .... the normal procedure for Si trip recovery DID work - only YOU did not put the run in HALT, as is part of the normal procedure. once you did that, the normal trip recovery worked. also, there were no bias trips, at least not reported in the log files. See silicon
elog
-- Sun Feb 29 14:53:41 comment by...Anadi -- We tried to recover with the run in active state, but soon after we recalled that it should be in halt. Once we put it in halt we tried again and the normal procedure didn't work. After that the Si pager called and worked on the ladder.
With bias trip we probably mean that the bias cell in the monitor went red.
-- Sun Feb 29 16:45:20 comment by...Rainer -- See comment
here
Sat Feb 28 23:20:44



- Jan
-- Sat Feb 28 23:21:21 comment by...Jan -- Hourlies 2100-2315
Sat Feb 28 23:55:19
| Run Number | Data Type | Physics Table | Begin Time | End Time | Live Time | L1 Accepts | L2 Accepts | L3 Accepts | Live Lumi, nb-1 | GR | SC | RC |
| 179473 x2BD11 | BEAM | PHYSICS_2_03 [1,431,435] | 09:56:16 | 16:44:14 | 06:36:30 | 347,468,354 | 5,148,616 | 1,020,649 | 530.084 | 1 | 1 | 1 |
| 179474 x2BD12 | BEAM | PHYSICS_2_03 [1,431,435] | 16:51:02 | 19:01:31 | 02:06:32 | 117,417,684 | 1,454,340 | 312,876 | 142.922 | 1 | 1 | 1 |
| 179475 x2BD13 | BEAM | TEST_ALPHA_CLUSTERING_NOSPIKES [2,430,403] | 19:46:36 | 20:56:44 | 01:06:27 | 1,087,171 | 208,800 | 208,800 | 69.290 | 1 | | 1 |
| 179480 x2BD18 | BEAM | PHYSICS_2_03 [1,431,435] | 21:39:09 | | 01:09:12 | 51,959,834 | 659,962 | 147,990 | 67.886 | | | 1 |
| Totals | | | | 23:55:01 | 10:58:42 | 517,933,043 | 7,471,718 | 1,690,315 | 810.182 | | | |
- End of Shift Report
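The Totals row of the run table can be re-derived from the four per-run rows. The following is a minimal sketch with the values typed in by hand from the table; the `rows` layout and the `to_seconds` helper are illustrative choices, not anything produced by RunControl.

```python
# Per-run values copied from the run table:
# run -> (live time, L1 accepts, L2 accepts, L3 accepts, live lumi in nb^-1)
rows = {
    179473: ("06:36:30", 347_468_354, 5_148_616, 1_020_649, 530.084),
    179474: ("02:06:32", 117_417_684, 1_454_340, 312_876, 142.922),
    179475: ("01:06:27", 1_087_171, 208_800, 208_800, 69.290),
    179480: ("01:09:12", 51_959_834, 659_962, 147_990, 67.886),
}

def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return 3600 * h + 60 * m + s

live_total = sum(to_seconds(r[0]) for r in rows.values())
l1_total = sum(r[1] for r in rows.values())
l2_total = sum(r[2] for r in rows.values())
l3_total = sum(r[3] for r in rows.values())
lumi_total = round(sum(r[4] for r in rows.values()), 3)

h, rem = divmod(live_total, 3600)
m, s = divmod(rem, 60)
print(f"live time {h:02d}:{m:02d}:{s:02d}")
print(f"L1 {l1_total:,}  L2 {l2_total:,}  L3 {l3_total:,}")
print(f"live lumi {lumi_total} nb^-1")
```

The accept counts and the live luminosity reproduce the Totals row exactly. The summed live time comes out one second below the quoted 10:58:42, which would be consistent with the per-run live times being rounded for display, though that is an assumption.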
Sat Feb 28 23:57:10
Shift Summary: Store 3261 still in progress luminosity dropped from
2.0E31 to 1.?E31
- continued data taking with new low lum trigger table (COT in degraded
config, Silicon on)
- spikes in proton abort gap loss traced by MCR to timing problem with
longitudinal dampers
- MCR opened helix at 19:00, done about 30 min later
- took Alpha Clustering Test data
- took Silicon dmode calibrations
- resumed data taking with new low lum trigger table (COT in degraded
config, Silicon on)
- SVX over current DVDD/BIAS trip
- plug HV problem
Plan is to continue data taking until end of store
- about 1 hour access after store, no CDF requests
- see JJ's owl shift plan above
End of Shift Numbers
|
CDF Run II
Runs 179473, 179474, 179475, 179480
Delivered Luminosity 0.504 fb^-1
Acquired Luminosity 0.344 fb^-1
Efficiency 68.2%
|
- Stephan
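The efficiency in the End of Shift Numbers is the ratio of acquired to delivered luminosity. A short sketch of the arithmetic, using the rounded values quoted above (the variable names are illustrative):

```python
delivered = 0.504  # delivered luminosity, fb^-1 (as quoted above)
acquired = 0.344   # acquired luminosity, fb^-1 (as quoted above)

# Efficiency in percent: acquired / delivered
efficiency = 100.0 * acquired / delivered
print(f"efficiency = {efficiency:.2f} %")
```

From the three-digit inputs this gives about 68.25 %, in line with the quoted 68.2; the small difference presumably comes from the logged value being computed from unrounded luminosities, which is an assumption.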
Sat Feb 28 23:58:00
plug (PEM, PHA, PSH) hardware problems (HV went yellow) and while trying the power cycle procedure
several times we lost communication (HV went grey)
-> after doing the three recovery steps in the third floor the plug is back to normal!
- Jan & Anadi
Sun Feb 29 00:03:33
Shift Summary: ending again
- miscetti
Sun Feb 29 16:44:23
ok now I understand from your clarification that you realized that you as daq ace needed to go into HALT - this was not clear to me from the original entry. so I am sorry.
the second time did not work because ladders might trip on DVDD first, but the second time it usually works. alternatively, the monitoring ace could have put the whole wedge to OFF and ON as written in the instructions in front of the PSGui computer, as was then finally done. but it is also true that resetting single ladder trips is a faster way - the penalty being that one has to recover maybe twice. (entry outside this shift's time range ) - Rainer