2004 CDF E-Log -- Eve shift. Thu Mar 4, 2004
| SciCo | DAQ Ace | Monitoring Ace | CO | (Operations Manager) |
| Stephan Lammel | Ian Vollrath | Christopher Marino | Diego Cauz | Mary Convery |
Start of Shift Notes:  Store 3273 (inst lumi 2.68E31) and run 179643 (PHYSICS_2_02)
in progress
Thu Mar 4 16:38:16
| Date | Time | BLM | Dose | Units |
| 2004.03.04 | 16:34:05 | W Inner BLM | 263.19 | RADS |
| 2004.03.04 | 16:34:05 | W Outer BLM | 0.00 | RADS |
| 2004.03.04 | 16:34:05 | E Inner BLM | 3.03 | RADS |
| 2004.03.04 | 16:34:05 | E Outer BLM | 263.19 | RADS |
Integrated dosage - Christopher
Thu Mar 4 16:47:05



- Christopher
Thu Mar 4 16:59:55
Question for the IMU/BMU people: is there any reason why all the
East layers have channel n. 71 dead and all the
West layers have channels n. 72, 142, 143 dead?
- diego :: (run 179643)
-- Fri Mar 5 09:20:06 comment by...Camille Ginsburg -- Dear Diego,
Thanks for the careful check of the data.
BMU-E-71 and BMU-W-72 are un-instrumented. They
are at the very top of the detector, where there
was no space. BMU-W-142 and BMU-W-143 have
the pre-amplifiers unplugged because these stacks
were oscillating. (probably only one is actually
oscillating, but these two stacks are jumpered together)
We will repair them at the next access opportunity
(presumably March 15).
Thu Mar 4 17:30:19
Trigmon Slide n. 2 L2 TriggerMonitor 2.2.3.2
Ncluster: TL2D vs. TC2D
shows an error rate slightly over the limit: 0.12% vs. 0.10%
I'm keeping an eye on it.
- diego :: (run 179643)
-- Thu Mar 4 17:32:57 comment by...diego -- oops! I goofed. The limit is 1% not 0.1%
Thu Mar 4 17:35:48
got silicon resonance error:
(MLE) b0dap73.fnal.gov:Thread-28:5:31:51 PM->FrontEnd Error: VRB_SVX_02
CT: 2004.03.04 17:32:03
32'3" 1 crate/s: b0svx02(112), in error.[RXPT]b0svx02:Messenger:5:31:59 PM->SRC Fatal Error:Sl 5 Resonance Detected
Additional Information:
Attention !!!. FERML_SRC_FATALITY ERROR !!!
SRC Fatal Error from b0svx02: Sl 5 Resonance Detected
hrr worked.
- ian :: (run 179643)
Thu Mar 4 17:50:15
got error:
(MLE) b0dap73.fnal.gov:Thread-28:5:43:21 PM-> Level 2 Decision Timeout
(MLE) b0dap73.fnal.gov:Thread-28:5:43:21 PM->Requested Halt-Recover-Run issued [errmon]
(MLE) b0l2de00:SpyAlpha:5:43:23 PM->
L1Mon: saw 210 L1 DMA transfers, expect 1 (buffer number 1)
L1Mon: Dumping data for 1 word.
hrr worked.
- ian :: (run 179643)
-- Thu Mar 4 17:51:12 comment by...ian -- again at 17:49
Thu Mar 4 17:52:37



- Christopher
-- Thu Mar 4 17:53:49 comment by...Christopher -- Hourly Plots: Note LOSTP spike at 5:30
Thu Mar 4 17:56:18
 | is this red plot something we should worry about? |
- diego
-- Thu Mar 4 19:25:23 comment by...ps -- no
Thu Mar 4 18:05:50
got error:
Attention!!!. SCPU_TRACER_EVENT_ID Error !!!
Hardware EVB has detected a problem with data quality in
SCPU b0eb15 (forwarded by FER crate WCAL_04).
hrr worked.
- ian :: (run 179643)
Thu Mar 4 18:43:05



- Christopher
Thu Mar 4 18:44:09
got error:
(MLE) b0l2de00:SpyAlpha:6:40:13 PM->
L1Mon: saw 210 L1 DMA transfers, expect 1 (buffer number 0)
L1Mon: Dumping data for 1 word.
followed by:
(MLE) b0l3pcom1.fnal.gov:main:6:40:31 PM->Host b0eb20.fnal.gov, task tRec_0
SCPU-P1-E-VrbHeader: Dump of header words for event 5299331 from VRB in slot 16:
hrr worked.
- ian :: (run 179643)
Thu Mar 4 19:04:30
Process System GAS alarm has tripped twice in last 3 minutes
Ethane Heater Box N2Purge DP under Shed Alarms is to blame - Christopher
-- Thu Mar 4 19:10:33 comment by...Christopher -- Process Systems Report: Alarm is a recurring weather related problem
Thu Mar 4 19:10:55
Reiner spotted yellow (no data) boxes in SVXmon SVX chip status map display; ask to page Silicon; Florencia called back and will
come in to check things out - Stephan :: (run 179643)
-- Thu Mar 4 19:17:37 comment by...rainer -- this looks like a GLINK issue - try an HRR to get it into shape.
-- Thu Mar 4 22:38:50 comment by...rainer -- VRB problem in slot #11 b0svx02
HRR did not help to get VRB back in shape
intended to reset the VRB and re-start the run
due to communication failure with shift crew, reset VRB while ending of run was still in progress
the run was aborted and event builder cleanup attempted, which seems to have failed -
as a result L3 appeared to be somewhat
completely broken
reset of VRB and crate CPU fixed the problem and brought data back from SB2W0/1
SVT was not affected, see
here
strange thing was that SVXMon was frozen and only got re-started at 6:30pm. normally, the absence of SVXMon should have been alerted in run control. last confirmed sign of life was 2:37pm.
sometime between 2pm and 3 pm, the VRB started to flake out.
two wedges out is, I believe, a run-quality criterion for marking SVX BAD, if I remember correctly ...
details see silicon
elog
-- Thu Mar 4 22:56:06 comment by...rainer -- pricetag: 125min (15 min VRB reset)
Thu Mar 4 19:42:42
bunch of clist errors, e.g:
(MLE) b0l2de00:SpyAlpha:7:39:45 PM->
ClistMon: Error on pass 0, cluster 0: 2 != 1
(MLE) b0l2de00:SpyAlpha:7:39:45 PM->
ClistMon: Error on pass 2, cluster 0: 2 != 1
(MLE) b0l2de00:SpyAlpha:7:39:46 PM->
ClistMon: Error on pass 2, cluster 1: 2 != 1
+lots more
hrr worked.
- ian :: (run 179643)
Thu Mar 4 19:46:03



- Christopher
-- Thu Mar 4 19:46:44 comment by...Christopher -- Hourly Plots: Another LOSTP spike around 6:30
Thu Mar 4 20:06:13
got error:
(MLE) b0xft00:Messenger:8:03:16 PM->Runtime Error 500, Event 6052890: Bunch counter mismatch,
mismatch count = 1
(MLE) b0l3pcom1.fnal.gov:main:8:04:07 PM->Host b0eb19.fnal.gov, task tRec_0
SCPU-P1-E-VrbHeader: Dump of header words for event 6062536 from VRB in slot 20:
0x009e0198 0x01900018 0x0086008a 0x0052004e 0xe2a038fc 0x831b00f3 0x5406551f 0x56ef5726
(MLE) b0dap73.fnal.gov:Thread-28:8:04:11 PM->Requested Halt-Recover-Run issued [errmon]
(MLE) b0svx06:Messenger:8:04:07 PM->Silicon Timeout:BUSY- Slots: 08:fa00 10:fa20 12:fa40 16:f800
18:f820 20:f840
hrr worked.
- ian :: (run 179643)
Thu Mar 4 20:12:56
Run 179643
Terminated at 2004.03.04 20:11:58 - RunControl
Thu Mar 4 20:12:57
Run 179643
TERMINATE: run ended due to silicon problems - Ian x2080
Thu Mar 4 20:36:39
problem with level 3 converter node 1: stuck after VRB reset during run termination; page level 3; called back working with Ian - Stephan
Thu Mar 4 20:39:25
 | inserting SVX occupancy, on Reiner's request |
- diego
-- Thu Mar 4 21:08:32 comment by...diego -- I meant SVT occupancy
Thu Mar 4 20:44:40



- Christopher
Thu Mar 4 21:00:46
Run 179643
RUNSTATUS: Marked Bad, explanation:
SVX readout problem (bulkhead 2 wedge 0 and 1) for unknown length of time at end of run
- cdfscico
Thu Mar 4 21:35:14
Run 179644
Activated at 2004.03.04 21:35:04 - RunControl
Thu Mar 4 21:36:18
Run 179644
ACTIVATE: PHYSICS_2_02[2,424,431] after L3 work - Ian x2080
Thu Mar 4 21:36:56



- Christopher
Thu Mar 4 21:38:13
About Level3 page
When I came in, Level3 was completely broken.
To bring it back to life I had to reboot converters 1 and 8
and later Scanner manager b0eb10 and scpu25
Now Level3 seems to be fine
- Arkadiy
-- Thu Mar 4 22:44:18 comment by...rainer -- No doubt there was some relationship with action reported
here
Thu Mar 4 21:49:40
COT alarm, but no trip -- in alarm log reads TEMP_ALR
Everything green now - Christopher
-- Thu Mar 4 21:55:05 comment by...Christopher -- Morris (just happened to enter) confirms COT looks OK
Note any recurrence
Thu Mar 4 22:14:47
 | the CMP part B looks a bit out of shape |
- diego
-- Fri Mar 5 08:40:45 comment by...Lucio Cerrito -- the plot has low statistics; part Bottom is a lot more shielded than the rest so it accumulates hits a lot slower than the others.
Thu Mar 4 22:22:44
 | DIRAC/preFRED not always =1. Is it all right? |
- diego
Thu Mar 4 22:41:40
more CAEN madness
several ladders showing pink in IMON - readback reveals daunting voltages on low voltage power supply. corrupted readback - hockerized crate and now ok. Details see silicon
elog - rainer
-- Thu Mar 4 22:54:04 comment by...rainer -- pricetag: 15min.
Thu Mar 4 23:05:34



- Christopher
Thu Mar 4 23:06:39
| We have put a DQMon monitor on b0dap51 for
a validation period by the CO. Please report
any comments/complaints/observations (mmp@fnal.gov).
In case you want to restart it:
* xhost + b0dap30
* log in on b0dap30 as cdfdaq
* setup -r /data1/mmp/fer fer
* java rc.mon.DQMon &
I explained to the present CO how to use it. Please pass
the instructions on to the new shift. |
- Mario
Thu Mar 4 23:44:29



- Christopher
Thu Mar 4 23:46:36
| Date | Time | BLM | Dose | Units |
| 2004.03.04 | 23:44:03 | W Inner BLM | 521.69 | RADS |
| 2004.03.04 | 23:44:03 | W Outer BLM | 0.00 | RADS |
| 2004.03.04 | 23:44:03 | E Inner BLM | 3.03 | RADS |
| 2004.03.04 | 23:44:03 | E Outer BLM | 447.26 | RADS |
Integrated dosage - Christopher
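For reference, the dose accumulated over the shift can be read off by subtracting the 16:34:05 BLM readings from the 23:44:03 ones. The sketch below does only that arithmetic; the dictionary and function names are illustrative and not part of any CDF tool.

```python
# Integrated BLM doses in rads, copied from the two tables in this log.
start = {"W Inner": 263.19, "W Outer": 0.00, "E Inner": 3.03, "E Outer": 263.19}  # 16:34:05
end = {"W Inner": 521.69, "W Outer": 0.00, "E Inner": 3.03, "E Outer": 447.26}    # 23:44:03


def dose_increase(before, after):
    """Return the per-monitor increase in integrated dose, rounded to 0.01 rad."""
    return {name: round(after[name] - before[name], 2) for name in before}


delta = dose_increase(start, end)
for name, value in delta.items():
    print(f"{name} BLM: +{value:.2f} rads")
```

With these readings only the W Inner and E Outer monitors show an increase between the two readouts.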
Thu Mar 4 23:56:12
| Run Number | Data Type | Physics Table | Begin Time | End Time | Live Time | L1 Accepts | L2 Accepts | L3 Accepts | Live Lumi, nb-1 | GR | SC | RC |
| 179643 x2BDBB | BEAM | PHYSICS_2_02 [2,424,431] | 12:08:03 | 20:11:58 | 07:24:30 | 397,695,812 | 6,122,026 | 1,325,284 | 728.122 | 1 | 1 | 1 |
| 179644 x2BDBC | BEAM | PHYSICS_2_02 [2,424,431] | 21:35:04 | | 02:16:28 | 95,216,684 | 1,330,472 | 324,667 | 151.650 | | | 1 |
| Totals | | | | 23:55:02 | 09:40:59 | 492,912,496 | 7,452,498 | 1,649,951 | 879.772 | | | |
- End of Shift Report
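As a cross-check of the Totals row, the per-run numbers can be summed directly. This is a minimal sketch using only values copied from the report; the field names (`l1`, `lumi_nb`, and so on) are made up for the example. It also confirms that the `x2BDBB` / `x2BDBC` tags are the run numbers in hexadecimal.

```python
from datetime import timedelta

# Per-run values copied from the End of Shift Report table.
runs = [
    {"run": 179643, "live": "07:24:30", "l1": 397_695_812,
     "l2": 6_122_026, "l3": 1_325_284, "lumi_nb": 728.122},
    {"run": 179644, "live": "02:16:28", "l1": 95_216_684,
     "l2": 1_330_472, "l3": 324_667, "lumi_nb": 151.650},
]


def to_timedelta(hms):
    """Convert an HH:MM:SS string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


totals = {key: sum(r[key] for r in runs) for key in ("l1", "l2", "l3")}
total_lumi = round(sum(r["lumi_nb"] for r in runs), 3)
total_live = sum((to_timedelta(r["live"]) for r in runs), timedelta())
hex_tags = [f"x{r['run']:X}" for r in runs]

print(totals, total_lumi, total_live, hex_tags)
```

The accept counts and live luminosity reproduce the Totals row exactly. The summed live time comes out as 9:40:58, one second short of the logged 09:40:59, presumably because the per-run live times are themselves rounded.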
Fri Mar 5 00:00:33
Shift Summary: Store 3273 (inst lumi 1.73E31) and run 179644 in progress
- COT in slightly-degraded mode, only SL 4 and 5 at reduced gain; XFT three miss for SL 4
- running old-default low lumi table PHYSICS_2_02
- SVX readout error for two wedges, fixed by Florencia/Reiner
- level 3 hung up after VRB reset during run termination,
fixed by Arkadiy
Plan is to continue with PHYSICS_2_02[2,424,431]
- calibration and cosmics between stores
- DQMon running on CO screens, please report experience
End of Shift Numbers
|
CDF Run II
Runs 179643, 179644
Delivered Luminosity 0.619 pb^-1
Acquired Luminosity 0.476 pb^-1
Efficiency 77.0 %
|
- Stephan
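The efficiency in the End of Shift Numbers is presumably the ratio of acquired to delivered luminosity. A quick check under that assumption (variable names are illustrative):

```python
# Luminosities in pb^-1, copied from the End of Shift Numbers.
delivered_pb = 0.619
acquired_pb = 0.476

# Assumed definition: efficiency = acquired / delivered, expressed in percent.
efficiency = 100.0 * acquired_pb / delivered_pb
print(f"Efficiency: {efficiency:.1f} %")
```

From the three-digit luminosities this gives about 76.9, consistent with the logged 77.0 once rounding of the inputs is taken into account.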