2004 CDF E-Log -- Eve shift. Fri Mar 5, 2004
| SciCo | DAQ Ace | Monitoring Ace | CO | (Operations Manager) |
| Rei Tanaka / Stephan Lammel | Andrew Ivanov | Simon Sabik | Guenakh Mitselmakher / Diego Cauz | Mary Convery |
Start of Shift Notes:  Still in access for 5-8 hours (D0).
Fri Mar 5 16:35:58
D0 needs a couple of hours more for an access. - Rei Tanaka
Fri Mar 5 16:38:28
Played around testing TOF loading of HV file settings.
Looks ok! - bauerg
Fri Mar 5 17:01:19
Run 179672
Activated at 2004.03.05 17:00:43 - RunControl
Fri Mar 5 17:01:20
Run 179672
ACTIVATE: COSMICS[12,391,403] - Andrew X2080
Fri Mar 5 17:06:51
About the Level3 page
It looks like the experimental trigger table tests from the previous shift made the Level3 control program l3_node crash.
When l3_node crashes, it dumps a core file named core.xxx.
After a few attempts, almost all the nodes of the Level3 farm had a 100% full /cdf partition.
That is why Level3 did not work after they returned to the default trigger table.
The Linux config file /etc/sysctl.conf controls how a program dumps core.
I checked it and it was fine:
kernel.core_uses_pid = 0
which tells the kernel to write the dump to a file named just "core".
What remains unclear is why l3_node still dumps core.xxx, while other Level3 programs dump just a core file.
One possible reason is that the variable is overwritten somewhere in the Level3 code, but that cannot be the case, because the machine has to be rebooted for a change to take effect.
The other possible reason is that the core dump naming differs depending on whether the program is single-threaded or multi-threaded.
To fix Level3 I cleaned up the core files on the whole farm.
Now Level3 seems to be fine - Arkadiy
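The cleanup described above can be sketched as a small script. This is only an illustration, not the procedure actually used on the farm: all function names are hypothetical, and it assumes the dumps are named `core` or `core.<digits>` (the entry only says core.xxx) and live somewhere under the partition that filled up.

```python
import os
import re
import shutil
from pathlib import Path

# Assumption: core dumps are named "core" or "core.<digits>" (e.g. core.1234).
CORE_NAME = re.compile(r"^core(\.\d+)?$")


def find_core_files(root):
    """Return a list of (path, size_in_bytes) for core dumps found under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if CORE_NAME.match(name):
                path = Path(dirpath) / name
                found.append((path, path.stat().st_size))
    return found


def remove_core_files(root, dry_run=True):
    """Delete core dumps under root and return the number of bytes involved.

    With dry_run=True nothing is deleted; the return value is the space
    that would be freed.
    """
    total = 0
    for path, size in find_core_files(root):
        total += size
        if not dry_run:
            path.unlink()
    return total


def partition_usage_percent(path):
    """Return how full the filesystem containing path is, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

On a node one would first call `remove_core_files("/cdf")` with the default dry run to see how much space the dumps occupy, then repeat with `dry_run=False` and confirm with `partition_usage_percent("/cdf")` that the partition is no longer full.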
Fri Mar 5 17:44:17
from the Run Coordinator elog:
16:35:33- D0 estimates that their access will end around 1900. They can have more time if they think they have identified the problem and can fix it. After the access, we will recover and go into shot setup. Pbar studies can continue until shot setup if the studier has the stamina. - JPM
- convery
Fri Mar 5 18:16:46
TOF channel trip. It was not possible to reset it from IFIX. It said "system is busy". I reset it from the TOF PC. It is back up. - Simon
Fri Mar 5 18:39:11
Run 179672
Terminated at 2004.03.05 18:38:44 - RunControl
Fri Mar 5 18:39:19
Run 179672
TERMINATE: end the run for a clean restart - Andrew X2080
Fri Mar 5 18:40:39
Jim H. from the Cryo shift came up. As noted in the Cryo&Gas e-log, rack 2RR23C has a single fan failure: the bottom fan is off. Jim says this is not an urgent problem; it should be reported at the 8 a.m. meeting on Monday and then repaired. - Rei Tanaka
Fri Mar 5 18:43:44
Run 179673
ACTIVATE: another trial - Andrew X2080
Fri Mar 5 18:46:15
TOF channel tripped again. Channel 308 (same as last time). Ramped it back on from IFIX. - Simon
Fri Mar 5 19:10:31
Run 179674
ACTIVATE: ISL frontend setting checkout - 6880 rsw
Fri Mar 5 19:11:03
Called MCR to ask for the status. D0 has finished their access. They will first check that everything is OK at D0. Then shot setup will follow. - Rei Tanaka
Fri Mar 5 19:11:31
Run 179674
TERMINATE: end ISL FE setting test run - 6880 rsw
Fri Mar 5 19:13:11
Run 179675
ACTIVATE: try one more time - 6880 rsw
Fri Mar 5 19:15:25
Run 179675
TERMINATE: end test run for ISL - 6880 rsw
Fri Mar 5 19:20:56
Run 179676
ACTIVATE: test run ISL with svxmon enabled in partition 0 - 6880 rsw
Fri Mar 5 19:28:41
Run 179673
Terminated at 2004.03.05 19:28:36 - RunControl
Fri Mar 5 19:29:42
Run 179673
TERMINATE: stop for a clean start - Andrew X2080
Fri Mar 5 19:31:37
Run 179677
Activated at 2004.03.05 19:30:59 - RunControl
Fri Mar 5 19:31:52
Run 179677
ACTIVATE: start new cosmic run for pulsar fiber splitting (MUON) - Andrew X2080
Fri Mar 5 19:33:10
Run 179677
Terminated at 2004.03.05 19:33:01 - RunControl
Fri Mar 5 19:33:45
Run 179677
TERMINATE: end cosmics - Andrew X2080
Fri Mar 5 19:37:31
Run 179676
TERMINATE: end test run - 6880 rsw
Fri Mar 5 19:41:36
Run 179678
Activated at 2004.03.05 19:41:15 - RunControl
Fri Mar 5 19:42:21
Run 179678
ACTIVATE: new run for the muon fiber splitting - Andrew X2080
-- Fri Mar 5 20:05:53 comment by...Andrew -- it is a cosmics run
Fri Mar 5 19:46:21
ICICLE heart beat on IFIX. Restarted it. - Simon
Fri Mar 5 20:16:02
D0 said they disengaged a part of their toroid. The magnet is now
at full power and they no longer see the noise in the
calorimeter. So, they seem to have fixed the problem!
- nigel and kaori
Fri Mar 5 20:43:46
MCR called us. Shot setup will start in 5-10 minutes. - Rei Tanaka
-- Fri Mar 5 20:54:05 comment by...Rei Tanaka -- MCR called us again. Shot setup started.
Fri Mar 5 20:55:14
Run 179678
Terminated at 2004.03.05 20:54:27 - RunControl
Fri Mar 5 20:55:15
Run 179678
TERMINATE: end run - Andrew X2080
Fri Mar 5 21:00:50
Run 179679
Activated at 2004.03.05 20:56:53 - RunControl
Fri Mar 5 21:00:51
Run 179679
ACTIVATE: new run for MUON fiber splitting, all fibers split, accumulate statistics - Andrew X2080
Fri Mar 5 21:05:57
CMX trip. SW channels 9 to 12. Ramped back up. - Simon
-- Fri Mar 5 21:10:23 comment by...Simon -- ALL HV on standby for shot setup.
Fri Mar 5 21:21:43
Run 179679
Terminated at 2004.03.05 21:21:33 - RunControl
Fri Mar 5 21:22:23
Run 179679
TERMINATE: switching to SHOTSETUP - Andrew X2080
Fri Mar 5 21:33:18
Run 179680
ACTIVATE: SHOTSETUP run + accumulating statistics for fiber splitting test - Andrew X2080
Fri Mar 5 21:49:32
| Date | Time | BLM | Dose (rads) |
| 2004.03.05 | 21:49:07 | W Inner BLM | 0.00 |
| 2004.03.05 | 21:49:07 | W Outer BLM | 0.00 |
| 2004.03.05 | 21:49:07 | E Inner BLM | 0.00 |
| 2004.03.05 | 21:49:07 | E Outer BLM | 0.00 |
Integrated dosage - Simon
Fri Mar 5 21:53:33
For Guenakh:
if trying to stop the consumers ^C fails, one has to
log into the machine the consumer is running on.
If the login also fails, try with:
> kdestroy
> kticket
> ssh -l cdfdaq
- diego
Fri Mar 5 21:54:43
Tonight, for the first time, we got the fiber splitting for the muon
path working (still waiting for more statistics to test the
robustness of the splitting). We tried in a systematic
way to understand the problems encountered in the past.
To make a long story short, the problems were mostly due
to one bad fiber (from the splitter to the L2 muon Pulsar input in
the L2 decision crate). The details can be found in the
Pulsar e-log.
We plan to leave this splitting setup in for
the upcoming store, to test the robustness. PLEASE watch the
TrigMon L2 Pulsar plots. If there is any problem, please
page the Pulsar pager: 218-9486 (Burkard), 630-544-7530 (Burkard cell), 630-357-1530 (Burkard home), 630-988-9986 (Ted cell),
and 630-357-9986 (Ted home). Please also let the people on the
upcoming shifts know about this new fiber splitter setup for the
L2 muon path.
- Burkard and Ted
-- Sat Mar 6 01:47:58 comment by...Burkard and Ted -- The splitting didn't work with beam (see the later entry in this
shift e-log). Three channels failed. The splitters have been removed.
Fri Mar 5 22:26:11
Run 179680
TERMINATE: switch to test alpha table - Andrew X2080
Fri Mar 5 22:39:14
Run 179681
ACTIVATE: TEST_ALPHA_CLUSTERING_NOSPIKES[4,436,403] - Andrew X2080
Fri Mar 5 22:43:03
Run 179681
TERMINATE: end test - Andrew X2080
Fri Mar 5 23:04:21
Run 179682
Activated at 2004.03.05 23:04:10 - RunControl
Fri Mar 5 23:05:35
Run 179682
ACTIVATE: PHYSICS_2_02[2,424,431] - Andrew X2080
Fri Mar 5 23:09:29
At L=67 we have ~30% deadtime with the default PHYSICS_2_02
L1 14 kHz
L2 390 Hz
L3 70 Hz
- convery :: (run 179682)
-- Fri Mar 5 23:22:57 comment by...Taka -- L3 looks like it still has room. The problem might be the L2 rate.
-- Fri Mar 5 23:36:07 comment by...Taka -- "Problem" means the 30% dead time.
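For reference, the rates quoted in this entry imply the following reduction factor at each trigger level. This is a minimal sketch with a hypothetical function name; it uses only the three rates from the entry and says nothing about where the actual bandwidth limits of L2 or L3 are.

```python
def rejection_factors(l1_hz, l2_hz, l3_hz):
    """Return the rate reduction achieved by L2 and by L3.

    Each factor is the input accept rate divided by the output accept rate
    of that level.
    """
    return {
        "L2": l1_hz / l2_hz,  # L1 accepts in, L2 accepts out
        "L3": l2_hz / l3_hz,  # L2 accepts in, L3 accepts out
    }


# Rates from the entry: L1 14 kHz, L2 390 Hz, L3 70 Hz.
factors = rejection_factors(14_000.0, 390.0, 70.0)
for level, factor in factors.items():
    print(f"{level} reduces the rate by a factor of {factor:.1f}")
```

With these numbers L2 reduces the rate by roughly a factor of 36 and L3 by a factor of about 5.6.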
Fri Mar 5 23:11:21
--------------------------------------------------------
Mar.05, 2004 Store #3275 Time p pbar
--------------------------------------------------------
Previous store dumped. 11:12
Shot setup started 20:54 169E10
proton injection started 21:50 (call Ops Manager)
pbars loading started 22:13 (page SVX Primary)
pbars loading finished 22:38 E10 E10
Scraping end
Lumi = E30 E30
MCR called us at 22:56 (scraping finished)
CLC HV up 23:00 (call MCR)
C:B0ILUM 68.30 E30
C:LOSTP 11 kHz
C:LOSTPB 0.6 kHz
TevPR 8438 10**9
TevPB 1201 10**9
H.V. (except Si) ON 23:02
H.V. Silicon ON 23:05
Physics RUN# start 179682 PHYSICS_2_02[2,424,431]
Lumi = E30 67.03 E30
--------------------------------------------------------
- Rei Tanaka
-- Sat Mar 6 00:11:49 comment by...convery -- FYI from the run coordinator elog:
19:31:06- Shot Strategy: Protons 270-280E9 per bunch at 150 GeV in the Tevatron, pbars as per guidelines. No extra or "aggressive" cooling on this shot, we want to make sure smaller pbars aren't skewing the luminosity lifetime. After the shot, stack to 40E10 for pbar shots to the Recycler. - JPM
Fri Mar 5 23:11:57
The new SVT firmware looks OK, or at least not harmful so far. - Taka Maruyama
Fri Mar 5 23:22:30
CO found a "too many errors" message in the TrigMon L2 Pulsar plots.
We also got one trigger timeout error. Paged the experts; one is coming. - Rei Tanaka
Fri Mar 5 23:25:12
Problem with ACNET. I can do the plots, but I can't save the files. Mary is calling Steve. - Simon
Fri Mar 5 23:26:23
ISL05 and L00 tripped. Trigger inhibit is set. - Ace
-- Fri Mar 5 23:28:55 comment by...Shift crew -- Precisely: TRIP:ISL05, IFIX:ISL HV and IFIX:L00 were set.
-- Fri Mar 5 23:44:48 comment by...Andrew -- CAEN Crate #12 power supplies disappeared from the Silicon PS GUI
window. I went downstairs and reset the crate. After that we powered
Silicon back on and had a few DVDD trips during the transition.
Recovered from the trips and everything is back to normal.
Fri Mar 5 23:26:27
L2 Pulsar Errors, 10 min into the run...
- G. Mitselmakher
-- Fri Mar 5 23:38:55 comment by...Ace -- The Pulsar experts are here and working on this problem.
-- Sat Mar 6 00:02:13 comment by...Burkard and Ted -- This error occurred because we split the muon fibers
for a test run with beam (see the entry earlier in this shift). The
splitting worked with the cosmic and shot setup runs, but
did not work with beam. We have now removed all fiber splitting for
the L2 muon path, put the system back to the default setting,
and started a new run. There are no more errors now. We will
stay here for a while to make sure. The splitting was done
mostly with old-type SVX/SVT fiber splitters (we only have a
few new, better-performing splitters on hand at this point).
Since the L2 muon trigger is still in tagging mode, this run
should not be marked as bad simply because of this problem.
Fri Mar 5 23:39:38
Got a Solenoid Trip alarm at 23:30, and it went away 10 s later. Called Cryo, who says it happened because the current limit was exceeded. - Rei Tanaka
Fri Mar 5 23:43:06
Both CPUs of L3 node #12 are pink. - Ace
Fri Mar 5 23:48:30
Run 179682
Terminated at 2004.03.05 23:48:15 - RunControl
Fri Mar 5 23:49:36
Run 179682
TERMINATE: end the run and start a new one without fiber splitting - Andrew X2080
Fri Mar 5 23:51:42
Run 179683
Activated at 2004.03.05 23:51:25 - RunControl
Fri Mar 5 23:51:52
Run 179683
ACTIVATE: PHYSICS_2_02[2,424,431] - Andrew X2080
Fri Mar 5 23:55:48
| Run Number | Data Type | Physics Table | Begin Time | End Time | Live Time | L1 Accepts | L2 Accepts | L3 Accepts | Live Lumi, nb-1 | GR | SC | RC |
| 179682 x2BDE2 | BEAM | PHYSICS_2_02 [2,424,431] | 23:04:10 | 23:48:15 | 00:17:28 | 19,796,653 | 533,233 | 92,122 | 68.289 | | 1 | 1 |
| Totals | | | | 23:55:02 | 00:17:28 | 19,796,653 | 533,233 | 92,122 | 68.289 | | | |
- End of Shift Report
Fri Mar 5 23:57:22
Shift Summary: - D0 made an 8-hour access to investigate the noise in their
calorimeter and muon systems coming from sparking in their toroid.
They disengaged the toroid jumper, and the problem has been fixed.
- Smooth shot setup followed.
Work done:
- TOF HV worked on by Gerry Bauer.
- L2 people, Burkard and Ted, worked on muon fibre splitting.
Watch the TrigMon for the L2 Pulsar plots.
Known problems:
- Water leak in the roof, which cannot be fixed until Monday.
- Rack 2RR23C has a single fan failure, the bottom fan is off.
To be repaired on Monday.
- One Solenoid Trip alarm due to current limit excess.
- ACNET cannot save files.
- L2 Pulsar errors in TrigMon. Experts are working.
Plan:
- Take good quality data!
- Rei Tanaka
Sat Mar 6 00:00:54
Luminosity summary
Begin shift or beam at 22:57:46
End shift or beam at 23:58:04
Delivered luminosity: 230.79
Acquired luminosity: 83.54
| Totals |
| Date: | 2004.03.05 |
| Shift: | eve |
| Delivered luminosity: | 230.8 nb-1 |
| Acquired luminosity: | 83.5 nb-1 |
| Efficiency: | 36.2% |
- Rei Tanaka
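The efficiency in the luminosity summary above is consistent with the ratio of acquired to delivered luminosity. A minimal sketch of that arithmetic follows; the function name is hypothetical, and it is an assumption that the summary script computes the number this way.

```python
def data_taking_efficiency(delivered_nb, acquired_nb):
    """Return acquired / delivered luminosity as a percentage.

    Both arguments must be in the same units (here nb^-1).
    """
    if delivered_nb <= 0:
        raise ValueError("delivered luminosity must be positive")
    return 100.0 * acquired_nb / delivered_nb


# Values from the luminosity summary for this shift.
efficiency = data_taking_efficiency(delivered_nb=230.79, acquired_nb=83.54)
print(f"Efficiency: {efficiency:.1f}%")  # prints "Efficiency: 36.2%"
```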
Sat Mar 6 00:01:09
Got PCAL_02 timeout. Shepherded the crate and continued running.
- Andrew :: (run 179683)
Sat Mar 6 00:05:32
After we removed the L2 muon fiber splitters, no more
errors in L2 Pulsar TrigMon plots.
- Burkard