2004 CDF E-Log -- Owl shift. Thu Feb 26, 2004
SciCo: Guram Chlachidze
DAQ Ace: Ian Vollrath
Monitoring Ace: Andrew Ivanov
CO: Oleg Poukhov
Operations Manager: J.J.Schmidt


Start of Shift Notes:  

Store 3256 in progress, data taking is paused. 
Bill Badgett is working with physics_test_2_02[10,429,434]

Thu Feb 26 00:22:53
We're having the usual teething pains after the  
TDC DSP programming.   Most pernicious were  
the b0imu00 and b0imu01 crates -- still having  
problems with b0imu01, even after re-reprogramming  
the DSPs.  Argh. 
 - W.Badgett :: (run 179364)
-- Thu Feb 26 00:27:22 comment by...WB --  
Am now de-programming the IMU TDCs back to 
the original V37.   Getting them to work tonight 
seems hopeless.

Thu Feb 26 00:39:24 Run 179364 Terminated at 2004.02.26 00:39:02 - RunControl
Thu Feb 26 00:40:06 Run 179364 TERMINATE: testing run of PHYSICS_TEST_2_02[10,429,434] ended - Ian x2080
Thu Feb 26 00:45:30
Now deprogramming all but the COT crates -- we have  
enough statistics in the last run to help track  
down the bc mismatch errors in the muon crates. 

 - W.Badgett :: (run 179364)
-- Thu Feb 26 00:50:46 comment by...WB --  
Also noticed a large number of crashes at L3.  Not 
clear why.

Thu Feb 26 00:52:32 Hey shift crew, since we are running in this funny state, I'd like to sneak in another test. Please run 30 minutes with TEST_ALPHA_CLUSTERING_NOSPIKES[2,430,403]. - jdl
-- Thu Feb 26 01:04:00 comment by...Guram --  Bill is not done yet, so we can run this test table later.
Thu Feb 26 00:53:56
Starting a new run after deprogramming non-COT TDCs and  
cleaning up the Level 3.    Did not see errors  
from the previous run in COT, so we expect smoother  
running this time.
 - W.Badgett :: (run 179365)
-- Thu Feb 26 00:57:17 comment by...WB --  
We disabled the COT HV inhibit after Activate, but 
before any events were taken.

Run looks much smoother now - no spew of TDC error 
messages.

Thu Feb 26 00:55:17 Run 179365 Activated at 2004.02.26 00:54:53 - RunControl
Thu Feb 26 00:56:04 Run 179365 ACTIVATE: give PHYSICS_TEST_2_02[10,429,434] another shot - Ian x2080
Thu Feb 26 01:02:29
At event 34347 (L2 accept number), I changed the DmaChain  
readout mode to true for crates COT_05 and COT_12 to see  
if that would make any difference in their rate of BC  
mismatch errors.
 - W.Badgett :: (run 179365)
Thu Feb 26 01:09:38
At event 77109, I changed the DmaChain mode  
for crate COT_16 to true, again to see if that would  
reduce the BC mismatch errors in that crate (as it  
did for COT_05 and COT_12).
 - W.Badgett :: (run 179365)
Thu Feb 26 01:18:36 Pulsar Guys are doing some fiber splitter tests, details are given here  - Vadim/Burkard
-- Thu Feb 26 01:40:18 comment by...jdl --  Please don't run TEST_ALPHA_CLUSTERING table while Pulsar guys are playing with iso fibers.
Thu Feb 26 01:31:49 Run 179365 Terminated at 2004.02.26 01:31:37 - RunControl
Thu Feb 26 01:33:12 Run 179365 TERMINATE: end test run of PHYSICS_TEST_2_02[10,429,434] - Ian x2080
Thu Feb 26 01:41:46
The COT crates are now being deprogrammed back  
to their original V37 TDC DSP code version.  All bankVersion  
values have returned to zero. 
 - W.Badgett
Thu Feb 26 01:45:56 Run 179366 TERMINATE: junk - Ian x2080
Thu Feb 26 01:49:44 Run 179367 Activated at 2004.02.26 01:49:38 - RunControl
Thu Feb 26 01:51:34 Run 179367 ACTIVATE: test of PHYSICS_TEST_2_02[10,429,434] with Pulsar with Isolist firmware loaded - Ian x2080
Thu Feb 26 01:51:35 Run 179367 Terminated at 2004.02.26 01:51:26 - RunControl
Thu Feb 26 01:53:05 Run 179367 TERMINATE: need to clean up L3 - Ian x2080
Thu Feb 26 01:53:43
The current FrontEndReadout/RunControl package fer  
is now at version v3_24.  All changes are minor,  
and include: 

  . Pick up bug-fix in merlin package that caused low  
    probability of memory heap corruption when creating  
    a merlin message 

  . Fix bug that was suppressing error reporting when  
    the TDC header word in Static RAM did not agree  
    with the header word in the TDC hit FIFO 

  . Patch makefile for newly declared camac product (RLC) 

  . Allow more error messages to be reported from frontEnd  
    crates before prescaling them. 

Crates had to be rebooted to pick up the merlin bug fix. 
 - W.Badgett
-- Thu Feb 26 01:54:22 comment by...WB --  
2nd item above only applies to non-spy-mode readout 
TDC crates.

Thu Feb 26 02:09:51 Run 179368 Activated at 2004.02.26 02:09:05 - RunControl
Thu Feb 26 02:09:52 Run 179368 ACTIVATE: some more PHYSICS_TEST_2_02[10,429,434] - Ian x2080
Thu Feb 26 02:10:35 Run 179368 Terminated at 2004.02.26 02:10:16 - RunControl
Thu Feb 26 02:12:06 Run 179368 TERMINATE: some L3 problems - Ian x2080
Thu Feb 26 02:23:15 Run 179369 Activated at 2004.02.26 02:21:59 - RunControl
Thu Feb 26 02:24:31 Run 179369 ACTIVATE: junk - Ian x2080
Thu Feb 26 02:24:32 Run 179369 Terminated at 2004.02.26 02:23:30 - RunControl
Thu Feb 26 02:24:33 Run 179369 TERMINATE: junk - Ian x2080
Thu Feb 26 03:04:04
While waiting for the Level 3 problem to be fixed, I  
loaded the kernel flash RAMs in the SVT_00...07 crates. 
The CDF_VERSION of this kernel is 5.10;  the only real  
difference is that the RTI libraries are now loaded  
so that we can use the HeapCheck() function to verify  
the memory is not corrupted.   This was thanks to  
a work-around from Jim Patrick to load the library  
into large-memory 2304 boards.    So now HEAP_CORRUPT  
errors will be reported from these crates if HeapCheck()  
function indicates a problem with the memory structure. 
 - W.Badgett, Jim Patrick
Thu Feb 26 03:07:05
 - 23:30 - 3:00 status plots - Andrew
Thu Feb 26 03:17:30 Run 179374 Activated at 2004.02.26 03:15:29 - RunControl
Thu Feb 26 03:17:31 Run 179374 ACTIVATE: junk - Ian x2080
Thu Feb 26 03:27:01 Run 179374 Terminated at 2004.02.26 03:26:48 - RunControl
Thu Feb 26 03:27:16 Run 179374 TERMINATE: junk - Ian x2080
Thu Feb 26 03:30:11
Oops, I had the bankVersion=1 still for COTD bank in  
the hdwdb.  I am so bad.  That would cause an increase  
in the Level 3 accept rate as events went to the error  
stream.   However, this should *not* have caused the  
filters to crash...
 - W.Badgett :: (run 179374)
Thu Feb 26 03:48:28 Run 179375 Activated at 2004.02.26 03:47:48 - RunControl
Thu Feb 26 03:53:53 Run 179375 ACTIVATE: PHYSICS_2_02[2,424,431] - Ian x2080
-- Thu Feb 26 03:54:25 comment by...ian --  
L3 problems seem to have been resolved

Thu Feb 26 03:55:49
During an interlude, I loaded the new MVME 2401 kernel  
onto the COT crate 01 through 19 flash RAMs.  COT_00 had already  
been running for several days with the new kernel, so we  
were confident that it works fine.  Like the SVT kernel  
load mentioned above, this provides access to the famous 
HeapCheck() function, and allows us to monitor  
the integrity of the memory structure in these crate  
processors. 

Now all but the b0svx## and b0eb## crate processors have  
this checking enabled. 
 - W.Badgett :: (run 179375)
Thu Feb 26 03:57:40 Run 179375 Terminated at 2004.02.26 03:57:12 - RunControl
Thu Feb 26 03:59:01 Run 179375 TERMINATE: run ended to test out a table for jdl - Ian x2080
Thu Feb 26 04:04:29
L3 was paged because many filters were crashing, leaving a large part of the farm in a bad state.
Bill Badgett eventually traced this to a COT-related problem. After his COT corrections, the
Level 3 input rate came down to normal levels and no further crashes were seen.
The farm is working well.
 - Nuno
Thu Feb 26 04:05:25
The conclusion from tonight's beam-on TDC DSP V45 test runs  
was that the COTD bank is ready and working assuming  
the latest L3 tag is used: 

  PHYSICS_TEST_2_02 [10,429,434] 

or any after 434, assuming they have the patch: 

   Level3Filters/src/L3DaqErrorFilter.cc   CVS Revision 1.6 

The problems with the muon crates need further study, but we  
now have more information to help diagnose the problem. 

Run 179365 was a nice long run with COTD in the new format  
and the rest of the TDC banks in the old format. 

Run 179364 was a useful run to help diagnose the muon crate  
readout problems. 
 - W.Badgett :: (run 179365)
Thu Feb 26 04:09:36 Run 179376 Activated at 2004.02.26 04:08:27 - RunControl
Thu Feb 26 04:11:43 Run 179376 Terminated at 2004.02.26 04:09:46 - RunControl
Thu Feb 26 04:11:44 Run 179376 TERMINATE: junk - Ian x2080
Thu Feb 26 04:11:46 Run 179376 ACTIVATE: junk - Ian x2080
Thu Feb 26 04:18:06 Run 179377 TERMINATE: junk - Ian x2080
Thu Feb 26 04:22:18 Run 179378 Activated at 2004.02.26 04:22:02 - RunControl
Thu Feb 26 04:23:57 Run 179378 ACTIVATE: test of TEST_ALPHA_CLUSTERING_NOSPIKES[2,430,403] - Ian x2080
Thu Feb 26 04:59:31 Run 179378 Terminated at 2004.02.26 04:58:59 - RunControl
Thu Feb 26 04:59:52 Run 179378 TERMINATE: end test of TEST_ALPHA_CLUSTERING_NOSPIKES[2,430,403] - Ian x2080
Thu Feb 26 05:07:04 Run 179379 Activated at 2004.02.26 05:06:24 - RunControl
Thu Feb 26 05:07:44 Run 179379 ACTIVATE: PHYSICS_TEST_2_02[10,429,433] - Ian x2080
Thu Feb 26 05:14:26
 - 3:00 - 5:00 status plots - Andrew
Thu Feb 26 06:30:38
got vrb evb error: 

Attention!!!. Event Builder SCPU_CANT_RESET_VRB Error !!! 
 THE BAD VRB CRATE is: b0eb16 
 THE BAD VRB MODULE is in SLOT: 10 

reset vrb. back running.
 - ian :: (run 179379)
Thu Feb 26 07:06:40
 - 5:00-7:00 status plots - Andrew
Thu Feb 26 07:47:43
Talked briefly on phone with Bob Wagner. He asks for no more 
tests with Silicon warm so Andy Hocker should start cooling  
Silicon as soon as he shows up this morning. 

 - JJ
Thu Feb 26 07:50:21
Antiproton losses and D0 Roman Pot motion.
 - R.J. Tesarek
-- Thu Feb 26 07:55:49 comment by...R.J. Tesarek --  
Antiproton losses and halo and D0 Roman pots
The above figure shows the antiproton losses (cyan) and halo (yellow) vs time and the position of various D0 Roman Pots. Note the increases in both losses and halo, but that they do not occur at the same time. For reference, the layout of the D0 pots is as follows:
CDF    ....     D2    D1         A2     A1    D0    P1    P2

<----- antiproton beam                     proton beam ----->
Note that the pots located between D0 and CDF (FPA2DL, FPA2OL, FPD1IL) appear to be responsible for the increase in losses while the pots on the opposite side of D0 (FPP1IL) appear to be responsible for our halo.
Thu Feb 26 07:55:48
Run Number     Data Type  Physics Table                               Begin Time  End Time  Live Time   L1 Accepts  L2 Accepts  L3 Accepts  Live Lumi, nb-1  GR SC RC
179364 x2BCA4  BEAM       PHYSICS_TEST_2_02 [10,429,434]              23:53:45    00:39:02  00:06:47     4,256,779      54,571      13,686           12.390  1 1
179365 x2BCA5  BEAM       PHYSICS_TEST_2_02 [10,429,434]              00:54:53    01:31:37  00:32:29    27,325,079     305,995      56,725           57.441  1 1
179375 x2BCAF  BEAM       PHYSICS_2_02 [2,424,431]                    03:47:48    03:57:12  00:05:29     2,900,276      38,097       8,202            8.583  1 0
179378 x2BCB2  BEAM       TEST_ALPHA_CLUSTERING_NOSPIKES [2,430,403]  04:22:02    04:58:59  00:36:57       865,936     162,551     162,551           56.139  1 1
179379 x2BCB3  BEAM       PHYSICS_TEST_2_02 [10,429,433]              05:06:24              02:42:19   128,831,729   1,353,680     247,565          227.446  1
Totals                                                                            07:55:02  04:04:03   164,179,799   1,914,894     488,729          361.998
 - End of Shift Report
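
As a cross-check of the Totals row, here is a minimal sketch that re-adds the per-run numbers from the table above and confirms that each hex run ID is the run number in base 16. The tuple layout and the helper names (to_seconds, to_hms) are made up for illustration and are not part of any CDF tool.

```python
# Rows transcribed from the end-of-shift table above. The tuple layout is
# an illustrative assumption, not the format of any CDF tool.
RUNS = [
    # (run, hex id, live time, L1 accepts, L2 accepts, L3 accepts, live lumi in nb^-1)
    (179364, "2BCA4", "00:06:47", 4_256_779, 54_571, 13_686, 12.390),
    (179365, "2BCA5", "00:32:29", 27_325_079, 305_995, 56_725, 57.441),
    (179375, "2BCAF", "00:05:29", 2_900_276, 38_097, 8_202, 8.583),
    (179378, "2BCB2", "00:36:57", 865_936, 162_551, 162_551, 56.139),
    (179379, "2BCB3", "02:42:19", 128_831_729, 1_353_680, 247_565, 227.446),
]


def to_seconds(hms):
    """Convert an HH:MM:SS string to a number of seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return 3600 * h + 60 * m + s


def to_hms(seconds):
    """Convert a number of seconds back to an HH:MM:SS string."""
    h, rest = divmod(seconds, 3600)
    m, s = divmod(rest, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


# The "x2BCA4"-style ID should be the run number written in hexadecimal.
hex_ids_ok = all(int(hex_id, 16) == run for run, hex_id, *_ in RUNS)

live_total = to_hms(sum(to_seconds(r[2]) for r in RUNS))
l1_total = sum(r[3] for r in RUNS)
l2_total = sum(r[4] for r in RUNS)
l3_total = sum(r[5] for r in RUNS)
lumi_total = round(sum(r[6] for r in RUNS), 3)

print("hex IDs match run numbers:", hex_ids_ok)
print("live time:", live_total)
print("L1/L2/L3 accepts:", l1_total, l2_total, l3_total)
print("live lumi, nb-1:", lumi_total)
```

The L1, L2 and L3 sums reproduce the Totals row exactly. The summed live time (04:04:01) and live luminosity (361.999) differ from the listed 04:04:03 and 361.998 only in the last digits, presumably from rounding of the per-run values.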
Thu Feb 26 08:00:50 Shift Summary:
We started our shift running several tests: 
- Bill Badgett was still working on TDC DSP microcode, facing 
 the usual problems with IMU TDC DSP programming. Finally he got  
 the COTD bank working, assuming the latest L3 tag is used:  
 PHYSICS_TEST_2_02 [10,429,434]. The problems with the muon   
 crates need further study. 
- 01:00: Vadim & Burkard took some test runs splitting  
 (isolist) fibers 
- 03:00: L3 was paged because many filters were crashing, leaving a  
 large part of the farm in a bad state. This was eventually 
 traced to a COT-related problem. After Bill's COT 
 corrections, the L3 farm is working well. 
- 04:30: test run with TEST_ALPHA_CLUSTERING_NOSPIKES[2,430,403], 
 requested by jdl (only 30 min.) 
- 05:00: only then did we start a run with PHYSICS_TEST_2_02[10,429,433]. 


End of Shift Numbers
CDF Run II

Runs                   179364-179379
Delivered Luminosity   756.3  
Acquired Luminosity    373.1  
Efficiency             49.3

 - Guram
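
The efficiency quoted in the End of Shift Numbers is consistent with acquired luminosity divided by delivered luminosity, expressed in percent. A minimal sketch of that arithmetic, assuming this is how the figure is defined (the log does not state the formula):

```python
# Values copied from the End of Shift Numbers above.
delivered = 756.3
acquired = 373.1

# Assumed definition: efficiency = acquired / delivered, in percent.
efficiency_pct = 100.0 * acquired / delivered
print(f"Efficiency: {efficiency_pct:.1f}")
```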
Thu Feb 26 08:06:39 Run 179379 Terminated at 2004.02.26 08:06:27 - RunControl
Thu Feb 26 08:07:07 Run 179379 TERMINATE: Silicon Expert Beginning To Cool: Detector Config Changed. - Tom x2080
Thu Feb 26 08:25:13 Run 179380 Activated at 2004.02.26 08:23:51 - RunControl
Thu Feb 26 08:25:14 Run 179380 ACTIVATE: Silicon Cooling. Taking Data until silicon is turned on. - Tom x2080
Thu Feb 26 08:25:16 Run 179379 RUNSTATUS:
Marked Bad, explanation:
COT 
SVX 
ISL 
L00 
SVT 
 - cdfscico