tomeichlersmith/README.md

## README.md

      
Decoding and Plotting Raw Data

After downloading data taken using pflib on cmslab1,
we need to decode it before we can plot the various quantities of interest.
Most of this decoding work has already been implemented in ldmx-sw,
so we just need to adapt it to our purposes.
In this GitHub gist, I've included this note as well as an ldmx-sw configuration
script for running with fire that will decode the raw data for you and put it
into an easier-to-analyze form.

Start Up

I needed to make some changes to ldmx-sw to match our needs,
so you will need to re-compile ldmx-sw with those changes:
cd ldmx-sw
git fetch
git checkout umn/hgcroc
git submodule update
source scripts/ldmx-env.sh
ldmx clean src
ldmx cmake -B build -S .
cd build
ldmx make install

After completing this re-compilation, you should be good to move on to decoding and then analysis.

Decoding

The code in ldmx-sw is run through the program fire, which takes a Python
script that configures how it runs. This GitHub gist has the Python configuration script decode.py,
which configures fire for decoding the raw pedestal run data collected at UMN.
ldmx fire decode.py some_file.raw

fire treats decode.py as a regular Python script, so I have added some additional
command-line parameters. You can see them all by running ldmx fire decode.py --help, but
the most important one (which is also required) is the input file (some_file.raw above), which holds the
raw pedestal data. The raw file is a command-line parameter so that
you can keep multiple raw files, each named after the settings you are testing,
and decode them all with the same procedure.
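
decode.py splits the raw file into fixed-size event packets, so it has to be told how many bytes make up one event (num_bytes_per_event = 4*107). The docstring in decode.py tallies where the 107 words come from. The sketch below only redoes that arithmetic. The function and constant names are made up for illustration, the 4 bytes per word is inferred from the 4*107 in decode.py, and the 2 unconnected links for the CERN setup are inferred from the quoted length of 255 rather than stated anywhere.

```python
WORDS_PER_CONNECTED_LINK = 41    # from the decode.py docstring
WORDS_PER_UNCONNECTED_LINK = 2   # from the decode.py docstring
BYTES_PER_WORD = 4               # assumption: inferred from 4*107 in decode.py


def single_sample_words(connected, unconnected):
    """Words in one single-sample packet with no zero suppression."""
    return (WORDS_PER_CONNECTED_LINK * connected
            + WORDS_PER_UNCONNECTED_LINK * unconnected
            + 2    # fpga headers
            + 2    # fpga link count words
            + 1)   # fpga checksum footer


def umn_pedestal_event_words(connected=2, unconnected=7):
    """Words in one UMN pedestal-run event (firmware 1.24, one sample)."""
    return (2      # header signal words
            + 1    # event header
            + 1    # num samples per event in pedestal run
            + single_sample_words(connected, unconnected)
            + 2)   # event footers


if __name__ == '__main__':
    words = umn_pedestal_event_words()
    print(f'{words} words = {BYTES_PER_WORD * words} bytes per event')
```

If the firmware or the number of connected links changes, the event length changes too, and the value passed to num_bytes_per_event in decode.py would need to be updated to match.
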
Outputs

Mainly, this program will output a file named similarly to your input raw file.
Using the example above where you input some_file.raw, the output file will be adc_some_file.root.
This ROOT file has the decoded data in it.
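
For reference, this is roughly how decode.py builds its output names from the input path. The helper name decoded_output_names is hypothetical; the string handling mirrors what decode.py does with base_name and dir_name.

```python
import os


def decoded_output_names(raw_path):
    """Return (event file, ntuple file) names the way decode.py forms them."""
    base_name = os.path.basename(raw_path).replace('.raw', '')
    dir_name = os.path.dirname(raw_path) or '.'
    event_file = f'{dir_name}/unpacked_{base_name}.root'  # p.outputFiles
    ntuple_file = f'adc_{base_name}.root'                 # p.histogramFile
    return event_file, ntuple_file


if __name__ == '__main__':
    print(decoded_output_names('some_file.raw'))
```

Note that the adc_ file name carries no directory prefix, so it presumably lands wherever you run fire, while the unpacked_ file is placed next to the raw file.
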
Plotting

After decoding the raw data, we want to view it.
There are two main ways to view the data: (1) interactively with ROOT's browser
and (2) using a Python script to print out plots.
I suggest starting with (1) since it is easier and lets you discover
what types of plots you want to print out. Once you know which plots
you want, you can move to (2) to speed up the process
of plotting.

ROOT's Browser

We have packaged ROOT's browser into the container, so after decoding the raw
data file you can provide the browser the ROOT file with the decoded data.
ldmx rootbrowse adc_some_file.root

This launches a graphical interface for exploring the data stored in the file.

Note: In order to use a graphical interface from a program within WSL,
you will need to install what's called an "X-server" outside of WSL so that
the program within WSL can connect to your screen. Here is a tutorial
on getting graphical applications running from within WSL.

One incredibly helpful feature of ROOT's browser
is the "tree viewer". Right-clicking on a tree in the browser
and selecting "StartViewer" opens it.
(I will probably need to show you how to do this; it is not very intuitive.)

Python Script

The basic idea of having a plotting script is to allow you to print out
pre-determined plots quickly without having to wait for the extra work
of launching a graphical interface and clicking around inside of it.
Writing plotting scripts is tough, so I've included an example one in
this gist as well. It would be run through the container as well, but
this time using python3 rather than fire.
ldmx python3 plot.py adc_some_file.root

The output of this command is a PDF file containing a plot
of various channels and their pedestal readings.
You'll notice that this plot is ugly; this is because I did not invest
any time in making the plotting script produce nicer-looking plots.
For internal plots (just you, me, and Jeremy), it doesn't really matter
if a plot is "ugly" as long as it is understandable.
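
Once you know which plots you want, one natural extension is to draw the same plot for several links and collect them in a single multi-page PDF. The sketch below is only an illustration of that idea, not part of the gist: the function names and the list of link numbers are assumptions, while the tree name hgcroc/adc and the adc, channel and link branches are taken from plot.py.

```python
import os


def link_selection(link):
    """Selection string for one readout link, in the form plot.py uses."""
    return f'link=={link}'


def pdf_name(root_file, prefix='pedestals_by_channel_'):
    """PDF name in the current directory, derived from the ROOT file name."""
    stem = os.path.splitext(os.path.basename(root_file))[0]
    return f'{prefix}{stem}.pdf'


def plot_links(root_file, links):
    """Draw adc vs channel for each link, one page per link."""
    # imported here so the helpers above can be used without ROOT installed
    import ROOT
    ROOT.gROOT.SetBatch(True)

    rf = ROOT.TFile(root_file)
    tree = rf.Get('hgcroc/adc')
    c = ROOT.TCanvas()

    out = pdf_name(root_file)
    c.Print(out + '[')  # open the multi-page PDF without writing a page
    for link in links:
        tree.Draw('adc:channel', link_selection(link), 'colz')
        hist = ROOT.gPad.GetPrimitive('htemp')
        if hist:  # no histogram is made if the selection matched nothing
            hist.SetTitle(f'{root_file}, link {link}')
        c.Print(out)    # append the current canvas as a new page
    c.Print(out + ']')  # close the PDF
```

Inside the container you would call something like plot_links('adc_some_file.root', [0, 1]), where the link numbers are a guess that you should check against what the tree viewer shows for your file.
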

  
## decode.py
"""Basic HGCROCv2RawDataFile reformatting configuration"""

import argparse, sys

parser = argparse.ArgumentParser(f'ldmx fire {sys.argv[0]}')

parser.add_argument('input_file')
parser.add_argument('--pause',action='store_true')
parser.add_argument('--max_events',default=100,type=int)

arg = parser.parse_args()

"""
# single sample packet with no zero suppression

len = 41*(num actually connected links)
    + 2*(num links not connected)
    + 2 # fpga headers
    + 2 # fpga link count words
    + 1 # fpga checksum footer

For Jeremy's test setup at CERN, there are 6 actually
connected links (or 3 HGC ROCs both halves connected),
so we get an event packet length of

    255

This is the length of event packets for runs 207, 208.

# multi-sample event packet with no zero supp

For the multi-sample no zero supp event packet,
Jeremy added a few extra words.

len = 2 # header signal words
    + 0 # event header <- seems to be missing in run208 file
    + (half of num samples rounded up) # counter words
    + (num samples)*(single-sample packet length)
    + 2 # footer signal words

For run208 Jeremy had 4 samples for each event but it seems
to be missing the whole-event header word.

    1026

The text version of the run208 file that Jeremy produced has 102600
lines for a one-hundred event run, so I'm trusting this.

When Jeremy adds the extra event header word in, this number will
increase to 1027.

# UMN HGC ROC Setup
  Firmware Version 1.24 Pedestal Run
    (quoting firmware version because this might change)

  len = 2 # header signal words
      + 1 # event header
      + 1 # num samples per event in pedestal run
      + 2 # fpga header
      + 2 # link count words
      + 41*2 # actual readout links
      + 2*7 # unconnected links
      + 1 # fpga footer
      + 2 # event footers
    = 107
"""

from LDMX.Framework import ldmxcfg

p = ldmxcfg.Process('unpack')
p.maxEvents = arg.max_events
p.termLogLevel = 0
p.logFrequency = 1

import LDMX.Hcal.hgcrocFormat as hcal_format
import LDMX.Hcal.digi as hcal_digi
import LDMX.Hcal.HcalGeometry
import LDMX.Hcal.hcal_hardcoded_conditions
from LDMX.DQM import umn
from LDMX.Packing import rawio

import os
base_name = os.path.basename(arg.input_file).replace('.raw','')
dir_name  = os.path.dirname(arg.input_file)
if not dir_name :
    dir_name = '.'

p.outputFiles = [f'{dir_name}/unpacked_{base_name}.root']

# where the ntuplizing tree will go
p.histogramFile = f'adc_{base_name}.root'

# sequence
#   1. split file into event packets
#   2. decode event packet into digi collection
#   3. ntuplize digi collection
p.sequence = [
        rawio.SingleSubsystemUnpacker(
            raw_file = arg.input_file,
            output_name = 'UMNChipSettingsTestRaw',
            num_bytes_per_event = 4*107, #1051, #1026, #255,
            detector_name = 'DNE'
            ),
        hcal_format.HcalRawDecoder(
            input_name = 'UMNChipSettingsTestRaw',
            output_name = 'UMNChipSettingsTestDigis',
            ),
        umn.TestHgcRoc(
            input_name = 'UMNChipSettingsTestDigis'
            )
        ]

if arg.pause :
    p.pause()

## plot.py
"""Example plotting script to be run inside the container"""

import ROOT
import sys

# turn on batch mode so ROOT doesn't try to launch a graphical interface
ROOT.gROOT.SetBatch(1)

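# assume first command line argument is the file to plot
fn = sys.argv[1]

# optional second argument overrides the comment used in the plot title
comment = f'run {fn}'
if len(sys.argv) > 2 :
  comment = sys.argv[2]

# open the file with the decoded data
rf = ROOT.TFile(fn)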

# get the tree of data
tree = rf.Get('hgcroc/adc')

# create a canvas to draw plots on
c = ROOT.TCanvas()

# draw what you want
tree.Draw('adc:channel','link==1','colz')
ROOT.gPad.GetPrimitive("htemp").SetTitle(f'{comment}, link 1')

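# print to a PDF in the current directory, named after the input file
# (basename + splitext so a directory in fn or 'root' elsewhere in the name is handled)
import os
pdf_name = os.path.splitext(os.path.basename(fn))[0] + '.pdf'
c.SaveAs('link1_pedestals_by_channel_' + pdf_name)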