2004 CDF E-Log -- Owl shift. Sun Mar 7, 2004
SciCo: Rick Field
DAQ Ace: N. Miladinovic
Monitoring Ace: V. Khotilovich
CO: A. Loginov
Operations Manager: Mary Convery


Start of Shift Notes:  

Trying to get back to quality data taking!

Sun Mar 7 00:09:05 Please review, from the previous shift's eLog, this entry by Rainer and also this entry by JJ concerning the response to abort gap losses. - JJ
Sun Mar 7 00:13:34 See this entry for more information about the b0wcal06 and b0cot17 readout problems.  - W.Badgett :: (run 179697)
Sun Mar 7 00:15:46
The TDC in slot 17 of b0cot01 had its flash memory erased during the HV outage.  I've put this
crate into SpyMode, and marked the board in slot 17 offline.  The DSP on the board will need to be
disabled to reload the flash memory, during an access.
 - Mitch Soderberg
-- Sun Mar 7 00:23:28 comment by...jdl --  If TDC is not operating, XFT probably should be masked on for these wires.
Sun Mar 7 00:28:00
an update on cot crate problems: 

done timeouts from: 

cot01: slot 17 problem - access required 

cot 02, 03, 05, 07, 14: upon activating, got done timeouts - reset and shepherded crates - seem to be
ok now. 


cot 16: "vme controller not found" - power cycled crate, shepherded, seems to be ok. 

cot 17: done timeouts; power cycling and shepherding didn't help. can't turn it off for an extended
period of time as suggested by bill badgett at the end of the last shift elog. need to remove it
from run config and then fix with an access. 


wcal06: vivek is working on it. 

note: cot 16, 17, and wcal06 are all in the same rack. hmmmmm.
 - ian
Sun Mar 7 00:28:43 "Usual" TOF heartbeat loss at this time of night was not so usual: I saw three(!) instances of Smacs running at the same time in the Task Manager + Smacs_IFixClient. I killed them all and restarted, loading the parameters, as we were not taking data. TOF came back to work. - Vadim
Sun Mar 7 00:42:59 Run 179698 ACTIVATE: AAA_SHOTSETUP: PHYSICS_2_02[2,424,431] w/out cot17 - 2080 natasha
Sun Mar 7 00:46:08 Run 179698 TERMINATE: switching to physics w/ crippled configuration - 2080 natasha
Sun Mar 7 00:52:26 Run 179699 Activated at 2004.03.07 00:52:07 - RunControl
Sun Mar 7 00:52:48 Run 179699 ACTIVATE: AAA_CURRENT:PHYSICS_2_02[2,424,431] w/out cot17 - 2080 natasha
Sun Mar 7 00:52:48
showermax calibration failure in pcal02 slot 6 smxr 6
 - convery
Sun Mar 7 01:03:43 wcal06, slot 18 ADMEM erased its configuration FRAM and needed a redownload from qietest. Trigger and cafe FRAMs were okay though so I did not redownload them. Because of doing ADMEM configuration from qietest, run control complains :
 
MLE) b0wcal06:Messenger:12:51:44 AM->ADMEM in slot 18 not properly configured 
since the configuration tags are not the same as what run control thinks they should be. This should be completely ignored as this is only a cosmetic issue and has no impact on data whatsoever. To clear up this warning, I'll do a calibration and FRAM download for ccal and wcal crates at the end of this store to make the run control happy. - V Tiwari :: (run 179699)
Sun Mar 7 01:10:28 Run 179699 Terminated at 2004.03.07 01:09:45 - RunControl
Sun Mar 7 01:10:29 Run 179699 TERMINATE: ending to do silicon D mode calibration - 2080 natasha
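The RunControl "Activated" and "Terminated" entries above carry explicit timestamps, so the wall-clock length of a run can be recovered from the log text. A minimal sketch, assuming the entry format seen in this eLog; the pattern and function name are illustrative and not part of any CDF tool:

```python
import re
from datetime import datetime

# Matches entries such as "Run 179699 Activated at 2004.03.07 00:52:07 - RunControl".
PATTERN = re.compile(
    r"Run (\d+) (Activated|Terminated) at (\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2})"
)

def run_durations(lines):
    """Return {run number: wall-clock duration} for runs with both entries."""
    started = {}
    durations = {}
    for line in lines:
        match = PATTERN.search(line)
        if not match:
            continue
        run, event, stamp = match.groups()
        when = datetime.strptime(stamp, "%Y.%m.%d %H:%M:%S")
        if event == "Activated":
            started[run] = when
        elif run in started:
            durations[run] = when - started[run]
    return durations

# Two entries copied from this eLog.
log_lines = [
    "Sun Mar 7 00:52:26 Run 179699 Activated at 2004.03.07 00:52:07 - RunControl",
    "Sun Mar 7 01:10:28 Run 179699 Terminated at 2004.03.07 01:09:45 - RunControl",
]
durations = run_durations(log_lines)
print(durations["179699"])
```

This gives the time between activation and termination, which is longer than the live time quoted in the End of Shift Report because it includes dead time.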
Sun Mar 7 01:16:19 Run 179700 ACTIVATE: SVXCAL_DPSON - 2080 natasha
Sun Mar 7 01:18:04
 - Vadim
-- Sun Mar 7 01:19:46 comment by...Vadim --  Made these plots with the help of the MWSnap program. B0ABSM had a spike at around 23:50; B0PAGC also has a tendency to go up.
Sun Mar 7 01:19:47 I called MCR to let them know we would like a 2-hour access between stores to fix the casualties of the detector powerdown: a TDC in COT01, a tracer in COT17, and a showermax board in pcal02. As of yesterday evening, the store is scheduled to end late morning. - convery
-- Sun Mar 7 01:28:35 comment by...convery --  Irina reports that there may be blown fuses on Pisaboxes as well. Experts need to look into this.
Sun Mar 7 01:21:57 Run 179700 TERMINATE: ending svxcal_dpson - 2080 natasha
Sun Mar 7 01:24:02 Run 179701 ACTIVATE: SVXCAL_DPSOFF - 2080 natasha
Sun Mar 7 01:29:56 Run 179701 TERMINATE: ending svxcal_dpsoff - 2080 natasha
Sun Mar 7 01:38:36 Run 179702 Activated at 2004.03.07 01:38:11 - RunControl
Sun Mar 7 01:39:42 Run 179702 ACTIVATE: AAA_NOSILICON:TEST_ALPHA_CLUSTERING_NOSPIKES[4,436,403] - 2080 natasha
Sun Mar 7 01:41:50 Run 179702 Terminated at 2004.03.07 01:41:22 - RunControl
Sun Mar 7 01:41:51 Run 179702 TERMINATE: ending to include silicon - 2080 natasha
Sun Mar 7 01:46:28 Run 179703 Activated at 2004.03.07 01:46:05 - RunControl
Sun Mar 7 01:46:58 Run 179703 ACTIVATE: AAA_CURRENT: TEST_ALPHA_CLUSTERING_NOSPIKES[4,436,403] - 2080 natasha
Sun Mar 7 02:12:08
 - (abort gaps go down) Vadim
Sun Mar 7 02:12:58 Run 179703 Terminated at 2004.03.07 02:12:16 - RunControl
Sun Mar 7 02:12:59 Run 179703 TERMINATE: ending the trigger test - 2080 natasha
Sun Mar 7 02:14:10
 I found many CEM/CHA/WHA PMTs out of HV range. All channels
have been reset to nominal values except for WHA (phi: 03-05; eta 06-11 (east)), for which we have
lost the HV read-out. I think the problem is due to a blown fuse on the pisa box, but it needs to be
checked. In any case we need access to check it and fix the problem - Nikolay Luzhetskiy
(luzhet@fnal.gov) and Fotis Ptohos <ptohos@fnal.gov> should be called or emailed if we are going to
have access tomorrow.

 I guess the PMTs on these wedges should have
the correct voltages and there should not be any problem with the data taken, but this should be
checked as well (laser run?).

  The HV channels corresponding to these problems have been masked. However, the HV bars for WHA will
show around "120" on the HV status page.

 It takes some time for all channels to stabilize after a power outage, so keep an eye on the
calorimeter HV for possible HV drifts. The shift crew can ignore any alarms till the morning, but
they should note some readings.



 - Irina
Sun Mar 7 02:19:51 Run 179704 Activated at 2004.03.07 02:19:04 - RunControl
Sun Mar 7 02:19:51
I have problems starting the following displays: 
SVXMon, YMon 

In the monitor windows(Web/E-log PC) in the SVXMon/YMon windows I noticed this message: 
DisplayServer::pollClients(): request = --- (probably socket not valid). 
I killed them(ctrl+c, ssh to the pc where they were running, killed them, restarted displays... did
all these steps again). 


In the "Consumer Lists and Status" Window status for them was unknown. 
 - Andrei
-- Sun Mar 7 02:25:39 comment by...a. --  
with a new run and a few more attempts the situation has changed - now they are
running/updating.

Note, that previous attempts had been done with previous run(s).

Sun Mar 7 02:22:01 Run 179704 ACTIVATE: AAA_CURRENT: PHYSICS_2_02[2,424,431], no cot17 - 2080 natasha
Sun Mar 7 02:22:03 Run 179704 ACTIVE:
Attention !!!. CER_SVXMON_HALT_RECOVER_RUN_ERROR !!! 
 Pipeline out of synch in 129 silicon readout chips . 
Auto HRR worked.
 - 2080 natasha
-- Sun Mar 7 10:09:32 comment by...R. Vidal --  this entry was missing a closing HTML tag < /pre >, which was inserted...
Sun Mar 7 02:28:53
 - Andrei
-- Sun Mar 7 02:30:08 comment by...a. --  
YMon COT Occupancy plots.

Sun Mar 7 02:52:15 losses and abort gaps had spikes: TeVMon Error => put silicon to standby - Vadim
Sun Mar 7 03:02:26 MCR called and they are having a problem with the electron lens. They said they would call back in 10 min to let us know if we should go into "end of store" mode. We put silicon on stand-by. Abort gap losses are okay (around 13 kHz). - Rick Field
Sun Mar 7 03:04:31
 - (hourly plots with the loss of the electron lens) Vadim
Sun Mar 7 03:10:43
RunControl says "SVXMon is missing" 

I have these messages in the SVXMon: 

APPConsumerInputModule::initIOSystem ===> CS_ConsSend p 
ort = 4031 
APPConsumerInputModule::initIOSystem ===> CS_Receiver p 
ort = 4030 

RunControl is in Halted state - is that the reason for the messages we have? 

BTW, svx display says "can not open socket connection".
 - Andrei :: (run 179704)
Sun Mar 7 03:18:05 MCR called. They are not ending the store. They are resetting the electron lens. Silicon is still on stand-by.  - Rick Field
Sun Mar 7 03:28:54 The electron lens is back up and the abort gap losses look good (10.4 kHz). We will wait a few more minutes to make sure everything is stable and then we will turn Silicon back on. - Rick Field
Sun Mar 7 03:37:20 Silicon back on. B0PAGC at 11 kHz. - Rick Field
Sun Mar 7 04:05:34
 - Plots for the last hour. Vadim
Sun Mar 7 04:31:34
PPR YMon plots. Number of dead channels is a criterion to mark PPR bad/good. Is it 12 consecutive channels or 24(number of channels in the block)?
 - Andrei
Sun Mar 7 04:57:33 b0dap78-b0dap84 power OFF: the above CPUs went down, as happened here. The circuit breaker on the 3rd floor was off again; switched it back on and the CPUs came back.  - natasha,rainer
Sun Mar 7 05:07:36 The 3rd floor circuit breaker tripped. Rainer showed us how to reset it. We are now trying to get all the programs running again. - Rick Field
Sun Mar 7 05:19:34
 - (Hr.Pl.) Vadim
Sun Mar 7 05:20:33 We have all the computers and programs running again except B0DAP52, which is still frozen. We will turn it off and then back on (i.e. power cycle it) and see if we can get it going.  - Rick Field
Sun Mar 7 05:47:10 B0DAP52 is dead. We can not restart it. Since it is early Sunday morning I am not going to page anyone. It was used for the ACE E-log (which can be done on another machine) and the "Big Brother" monitor (which the ACE can monitor elsewhere). Thus it does not appear critical and I will wait until Mary arrives at 8am to decide what to do. Everything else is up and running. - Rick Field
Sun Mar 7 06:00:29 Run 179699 RUNSTATUS:
Marked Bad, explanation:
COT Crates 01 and 17 are off
PSMX PCAL02 is off
 - cdfscico
Sun Mar 7 06:24:28
 - (Hr.Pl.) Vadim
Sun Mar 7 07:17:35
 - Vadim
-- Sun Mar 7 07:19:30 comment by...Vadim --  That's unusual: antiproton losses are less than zero
Sun Mar 7 07:26:49 MCR called and said that they are going to terminate the beam at 10am today. Our 2 hour access has been approved and we can go in at around 10am today. - Rick Field
Sun Mar 7 07:39:32 As noted above, a power strip/surge protector located underneath rack 3RR17E has tripped more than once in the last month and brought down some of the consumer monitor PCs.
Although the power setup for this rack will be fixed during the March 15 shutdown, we should be able to come up with an interim solution during the 2 hour access this morning, when we can shut down consumers for a few minutes.
Serguei Bourov is the CO. If Dervin comes in for the access, they should be able to come up with a quick plan.
If no one wants to fix this however, call me at home and I will come in and do it.  - JJ
Sun Mar 7 07:46:00 NOTE TO RICK FIELD: Ask one of the ACEs to explain the difference between "HTML" and "PLAIN" eLog entries. Your eLog entries this morning all scroll off the screen.

Thanks and good morning!
JJ - JJ
-- Sun Mar 7 07:53:43 comment by...Rick Field --  I know the difference... However, I forgot to put in 's. Sorry!
-- Sun Mar 7 07:54:41 comment by...JJ --  Hmmm - or not. There may be some unclosed tag earlier in eLog that is causing problem.
Sun Mar 7 07:52:22 CPR NW Slices tripped. Recovered it quickly. - Vadim :: (run 179699)
Sun Mar 7 07:52:45
This is a test entry to make sure that JJ is not confused about note to Rick Field to learn
difference between HTML and PLAIN entries into eLog. Please ignore this entry.
 - jj
Sun Mar 7 07:55:25
Run Number     Data Type  Physics Table                               Begin Time  End Time  Live Time  L1 Accepts   L2 Accepts  L3 Accepts  Live Lumi, nb-1  GR SC RC
179699 x2BDF3  BEAM       PHYSICS_2_02 [2,424,431]                    00:52:07    01:09:45  00:16:57   5,491,073    95,622      26,400      15.953           0 1 1
179702 x2BDF6  BEAM       TEST_ALPHA_CLUSTERING_NOSPIKES [4,436,403]  01:38:11    01:41:22  00:03:05   44,497       8,616       8,616       2.844            1 0
179703 x2BDF7  BEAM       TEST_ALPHA_CLUSTERING_NOSPIKES [4,436,403]  01:46:05    02:12:16  00:23:14   330,865      64,411      64,411      21.106           1 1
179704 x2BDF8  BEAM       PHYSICS_2_02 [2,424,431]                    02:19:04              04:47:22   132,080,885  1,928,943   492,499     229.361          1
Totals                                                                            07:55:02  05:30:39   137,947,320  2,097,592   591,926     269.264
 - End of Shift Report
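The Totals row of the End of Shift Report above can be cross-checked by summing the per-run columns. A minimal sketch, assuming the 04:47:22 listed for run 179704 is its live time (the run had no end time yet when the report was made); the variable and function names are illustrative:

```python
from datetime import timedelta

def parse_hms(text):
    """Convert an HH:MM:SS string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

# (run, live time, L1 accepts, L2 accepts, L3 accepts, live lumi in nb^-1),
# copied from the End of Shift Report table.
runs = [
    (179699, "00:16:57", 5_491_073, 95_622, 26_400, 15.953),
    (179702, "00:03:05", 44_497, 8_616, 8_616, 2.844),
    (179703, "00:23:14", 330_865, 64_411, 64_411, 21.106),
    (179704, "04:47:22", 132_080_885, 1_928_943, 492_499, 229.361),
]

total_live = sum((parse_hms(row[1]) for row in runs), timedelta())
total_l1 = sum(row[2] for row in runs)
total_l2 = sum(row[3] for row in runs)
total_l3 = sum(row[4] for row in runs)
total_lumi = round(sum(row[5] for row in runs), 3)

print(total_live, total_l1, total_l2, total_l3, total_lumi)
```

The accept counts and the luminosity reproduce the Totals row exactly. The summed live time comes out one second short of the quoted 05:30:39, presumably because the per-run values are rounded.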
Sun Mar 7 08:00:07 Shift Summary:
** Store 3275 Continues ** 

** Trying to get back to quality data taking ** 

** 2 Hour Access Approved for 10am Today ** 
  COT 01 NW Top 
  COT 17 SE Bottom 
  PCAL02 W Toroid 

** WCAL06 Fixed by Tiwari ** 
See his log entry.  He wants to be notified when 
we have an access. 

** Electron Lens ** 
MCR had a problem with the electron lens and the 
L1COLI went red on TevMon.  We put the silicon on 
stand-by.  MCR reset the electron lens and got it 
running in a short time.  B0PAGC stabilized at 
around 11 kHz so we turned the silicon back on. 

** 3rd Floor Circuit Breaker ** 
The 3rd floor circuit breaker tripped.  Rainer  
showed us how to reset it.  We got all the 
computers and programs back running except 
B0DAP52 which is dead.

End of Shift Numbers
CDF Run II

Runs                   179669-179704
Delivered Luminosity   402.9 nb-1  
Acquired Luminosity    282.8 nb-1  
Efficiency             70.2%

 - Rick Field
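The efficiency in the End of Shift Numbers above is consistent with the ratio of acquired to delivered luminosity. A minimal sketch of that arithmetic, assuming this is how the figure is defined; the function name is illustrative:

```python
def data_taking_efficiency(acquired_nb, delivered_nb):
    """Return acquired / delivered luminosity as a percentage."""
    if delivered_nb <= 0:
        raise ValueError("delivered luminosity must be positive")
    return 100.0 * acquired_nb / delivered_nb

# Numbers from the End of Shift summary (nb^-1).
efficiency = data_taking_efficiency(282.8, 402.9)
print(f"Efficiency: {efficiency:.1f}%")
```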
Sun Mar 7 08:00:27
 - (Hourly plots: LOSTPB are still negative) Vadim