| SciCo | DAQ Ace | Monitoring Ace | CO | (Operations Manager) |
| Steve Hahn | Jan Ehlers | Susana Cabrera | Serguei Bourov | Mary Convery |
Start of Shift Notes:  Take COT good data, maximize efficiency
Sat Mar 6 08:51:22
Looking into ACnet problems with USERB disk where ACnet GIFs are copied. Opening a DECterm on CNS51PC and directly trying a "set def userb:[000000]" and doing a directory gives a "remote node unreachable" error. I have talked with MCR, noting that both our consoles (CNS46 dedicated and CNS51 not) have the same problem, and suggesting perhaps it is a network problem. They are going to call the experts and get back to us.
- Steve Hahn :: (run 179683)
Hourly plots have been captured with ALT+PRINT SCREEN and saved with Paint Shop Pro
Note to aces: How to copy ACnet screens on the ACnet consoles without using the ACnet command "XV capture" or "Save to GIF" in the Utilities menu? Answer: use screen capture instead! In fact, this method may be faster!
Note that this method does not use the http://adcon.fnal.gov/userb web page at all.
However, recall that one should never use the "advanced edit" feature for the elog using Konqueror. Using it will cause havoc and mess up all of the elog features (also might make some people upset as a result).
| Run Coordinator elog: 09:36:32- The noise problem D0 has been experiencing on their detector has gone away. After a combination of a temperature increase to the cooling water, lower humidity in the hall, and mechanically reconnecting part of the toroid, the noise was gone. It may come back, of course. So, for now, the shutdown start date is back to Monday, March 15. Accelerator Division is developing a contingency plan to start the shutdown on Thursday, March 10 if the noise returns in the next day or two. A final decision on the shutdown date will be made on Monday (3/8) at 1600. - JPM |
Talked with Jim Smedinghoff about ACnet problems. Apparently, this is an ACnet-wide DECnet problem; we are not the only nodes suffering these problems. The network router, cns55, does not see some nodes, including ours, even after multiple reboots. In fact, after the most recent reboot, one of our consoles (CNS51) did reappear, and now the directory structures work correctly. However, Jim was not confident that it would not drop out again at any moment. He is investigating, but has no estimate of how long this is going to take. We should continue to use the screen capture method I outlined above for the indefinite future.
- Steve Hahn :: (run 179683)
SCPU_TRACER_EVENT_ID Error: Hardware EVB has detected a problem with data quality in SCPU b0eb15 (forwarded by FER crate WCAL_03).- Jan :: (run 179683)
L2 Decision Timeout:
L1Mon: saw 210 L1 DMA transfers, expect 1 (buffer number 0)
L1Mon: Dumping data for 1 word.
Word  upper 32 bits  lower 32 bits
0: 0x00000000 0x00008000
1: 0x00000000 0x40600012
2: 0x0083b477 0xa8948879
...
418: 0x0083b477 0xa8948879
419: 0x8083b477 0x88948879
L1Mon: done.- Jan :: (run 179683)
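For anyone comparing words in an L1Mon dump like the one above by hand, here is a minimal Python sketch that joins the upper and lower 32-bit halves into one 64-bit word and XORs two words to see which bits differ. The helper names `combine` and `set_bits` are made up for illustration, and nothing here says what the individual bits mean in the L1 data format.

```python
# Words 418 and 419 copied from the dump above as (upper, lower) 32-bit halves.
# combine() and set_bits() are hypothetical helpers, not part of any CDF tool.

def combine(upper: int, lower: int) -> int:
    """Join two 32-bit halves into a single 64-bit word."""
    return (upper << 32) | lower

def set_bits(value: int) -> list:
    """Return the positions (0 = least significant) of the bits set in value."""
    return [i for i in range(64) if (value >> i) & 1]

word_418 = combine(0x0083b477, 0xa8948879)
word_419 = combine(0x8083b477, 0x88948879)

# XOR leaves a 1 only where the two words disagree.
diff = word_418 ^ word_419
print(f"{diff:#018x}", set_bits(diff))  # 0x8000000020000000 [29, 63]
```

So the last word differs from the repeated pattern in exactly two bit positions, 29 and 63.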
SCPU_TRACER_EVENT_ID Error: Host b0eb11.fnal.gov, task tRec_0 SCPU-P1-E-TracerEventId: Event 10245271, crate 4, channel 8 has either bad Tracer ID or bad markers around Tracer word Hardware EVB has detected a problem with data quality in SCPU b0eb11 (forwarded by FER crate CCAL_03)- Jan :: (run 179683)
busy timeout B0SVX06: SCPU-P1-E-VrbHeader: Dump of header words for event 10450920 from VRB in slot 12: 0x7ff3c1c1 0x7ff3c1c1 0xe04418fe 0x8d2400f3 0x8c248b24 0x8a248924 0x25032619 0x270431f3 1 crate/s: b0svx06(16), busy.[RXPT]- Jan :: (run 179683)
Mention a couple of problems found by our CO:
Many triggers which use tracking are showing up with rates that are too low. Presumably this is because Charles recently changed the reference rates to take into account SL2 being masked on, but now we are running without SL2 masked on. Kaori has sent a message to Charles Plager and Kevin Pitts.
COT SLs 2, 3, 4, 6, and 8 all show up as red in the 1-D occupancy plots. Checked with Bob Wagner; these are caused by known single channel gain problems throughout the chamber.
- Steve Hahn :: (run 179683)
For information: PCAL02 throws a bunch of reformatter errors at once with a low reject rate of ~0.005% (without stopping the DAQ)- Jan :: (run 179683)
SCPU_TRACER_EVENT_ID Error: Hardware EVB has detected a problem with data quality in SCPU b0eb12 (forwarded by FER crate XFT_FINDER_02) Host b0eb12.fnal.gov, task tRec_0 SCPU-P1-E-VrbHeader: Dump of header words for event 11822182 from VRB in slot 10 SCPU-P1-E-TracerEventId: Event 11822184, crate 100, channel 3 has either bad Tracer ID or bad markers around Tracer word- Jan :: (run 179683)
COT HV alarm for a few seconds in CDF GLOBAL ALARM: related to COT temperatures. It cleared so quickly that I could not identify the origin.- Susana.
CER_SVXMON_HALT_RECOVER_RUN_ERROR !!! Stuck Cellid S/B1/W5/L4/C7-13 . AUTO HRR will be issued- Jan :: (run 179683)
| The updated XMon L1 Trigger X-sections |
| XMon #6 |
Talked with Charles about current trigger rates after restarting XMon (since his last message). He was aware of them, and thinks with the current state of the COT we should be OK. He said we should worry if we see any particular triggers causing high dead time, but a check of the trigger display shows no culprits.
CER_SVXMON_HALT_RECOVER_RUN_ERROR: Stuck Cellid S/B5/W6/L4/C7-13 . AUTO HRR will be issued- Jan :: (run 179683)
information: every now and then low rate (~0.005%) REFORMATTER errors from FIB00 RAWREF Error - VRB-DLINCO - with code 18 -- START- Jan :: (run 179683)
Host b0eb12.fnal.gov, task tRec_0 SCPU-P1-E-CantResetVrb: Reset of VRB in slot 14 failed Press the button on the front panel OF THE VRB IN SLOT 14 , *NOT* the crate CPU, and WAIT AT LEAST 10 SECONDS. Note the button is recessed and will require a pen or paperclip to press. When pressed, lights will flash. If no lights flash, it wasn't pressed <- DONE, WORKED- Jan :: (run 179683)
FERML_HIGH_DEADTIME occurred (no consequence)- Jan :: (run 179683)
| Date | Time | BLM | Dose | Units |
|---|---|---|---|---|
| 2004.03.06 | 15:29:13 | W Inner BLM | 594.31 | RADS |
| 2004.03.06 | 15:29:13 | W Outer BLM | 33.71 | RADS |
| 2004.03.06 | 15:29:13 | E Inner BLM | 111.40 | RADS |
| 2004.03.06 | 15:29:13 | E Outer BLM | 594.31 | RADS |
| Date | Time | BLM | Dose | Units |
|---|---|---|---|---|
| 2004.03.06 | 15:34:13 | W Inner BLM | 597.32 | RADS |
| 2004.03.06 | 15:34:13 | W Outer BLM | 33.71 | RADS |
| 2004.03.06 | 15:34:13 | E Inner BLM | 111.40 | RADS |
| 2004.03.06 | 15:34:13 | E Outer BLM | 597.32 | RADS |
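The two BLM snapshots above can be differenced to get the dose accumulated between readings. A minimal Python sketch follows, with the values copied from the two tables; the dictionary layout and variable names are illustrative only.

```python
from datetime import datetime

# Readings copied from the two BLM tables above (dose in rads).
first = {"W Inner BLM": 594.31, "W Outer BLM": 33.71,
         "E Inner BLM": 111.40, "E Outer BLM": 594.31}
second = {"W Inner BLM": 597.32, "W Outer BLM": 33.71,
          "E Inner BLM": 111.40, "E Outer BLM": 597.32}

fmt = "%Y.%m.%d %H:%M:%S"
t_first = datetime.strptime("2004.03.06 15:29:13", fmt)
t_second = datetime.strptime("2004.03.06 15:34:13", fmt)
elapsed = (t_second - t_first).total_seconds()  # seconds between snapshots

# Dose accumulated per monitor between the two snapshots.
deltas = {name: second[name] - first[name] for name in first}

for name, delta in deltas.items():
    print(f"{name}: +{delta:.2f} rads in {elapsed:.0f} s")
```

With these numbers the W Inner and E Outer readings each rise by 3.01 rads over 300 s, while the other two are unchanged.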
| Run Number | Data Type | Physics Table | Begin Time | End Time | Live Time | L1 Accepts | L2 Accepts | L3 Accepts | Live Lumi, nb-1 | GR | SC | RC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 179683 x2BDE3 | BEAM | PHYSICS_2_02 [2,424,431] | 23:51:25 | | 15:03:45 | 791,507,714 | 14,646,722 | 3,039,894 | 1899.218 | 1 | | |
| Totals | 15:55:03 | 15:03:45 | 791,507,714 | 14,646,722 | 3,039,894 | 1899.218 |
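The `x2BDE3` tag next to the run number in the table above matches 179683 written in hexadecimal. A short Python check, assuming the tag is simply an `x` prefix followed by uppercase hex digits (`run_tag` is a made-up helper name):

```python
def run_tag(run_number: int) -> str:
    """Format a run number as 'x' followed by uppercase hex digits (assumed convention)."""
    return f"x{run_number:X}"

def run_from_tag(tag: str) -> int:
    """Recover the decimal run number from a tag such as 'x2BDE3'."""
    return int(tag[1:], 16)

print(run_tag(179683))         # x2BDE3
print(run_from_tag("x2BDE3"))  # 179683
```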
| Totals | |
|---|---|
| Date: | 2004.03.06 |
| Shift: | day |
| Delivered luminosity: | 768.6 nb-1 |
| Acquired luminosity: | 728.7 nb-1 |
| Efficiency: | 94.8% |
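The efficiency figure in the Totals table above is consistent with acquired luminosity divided by delivered luminosity, expressed in percent. A quick Python cross-check using the two values from the table (variable names are illustrative):

```python
# Values copied from the shift Totals table above, in nb^-1.
delivered_nb = 768.6
acquired_nb = 728.7

# Data-taking efficiency as the fraction of delivered luminosity recorded.
efficiency_pct = 100.0 * acquired_nb / delivered_nb
print(f"Efficiency: {efficiency_pct:.1f}%")  # Efficiency: 94.8%

# Luminosity that was delivered but not recorded.
lost_nb = delivered_nb - acquired_nb
print(f"Lost: {lost_nb:.1f} nb^-1")  # Lost: 39.9 nb^-1
```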
Store 3275 Initial Lum = 6.7e31 @ 2304 03/05/04
Run 179683 Start shift Lum = 3.3e31
Very smooth running. Problems with copying ACnet plots to the USERB disk were found to be a DECnet router problem (we were not the only ones affected). Jim Smedinghoff fixed the problem at about 1000. Many trigger rates in XMon which use tracking are marked as too low. Presumably, this is because Charles yesterday changed the reference rates to reflect SL2 being masked on, but we are no longer running that way. Charles and Kevin Pitts have been notified. Charles wrote back that he had updated the L1, L2, and L3 trigger rates for the current configuration. However, after restarting XMon we do not see much improvement, other than that several triggers are now marked invalid (grey). Talked with Charles; under the current compromised COT conditions, he said we should worry if we see any triggers causing high dead time. We see no such problems. Our one downtime: 9 minutes due to a VRB error in b0eb12, slot 14 at 1459. The standard front-panel reset worked.
HVAC alarm in CDF GLOBAL ALARMS. PDT-CH Collision hall differential pressure, current value: 0.158 H2O, hi limit is 0.13- Susana.