# in a fresh conda environment, >= py3.8
conda install xrootd -c conda-forge
# install dask-awkward@main, install dask-histogram from https://github.com/lgray/dask-histogram/tree/map_reduce_agg_hist_adds
pip install coffea xgboost mt2 distributed==2024.2.0 dask==2024.2.0
git clone https://github.com/TopEFT/topcoffea.git -b coffea2023
pushd topcoffea
pip install -e .
popd
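The install line above pins `distributed` and `dask` to 2024.2.0, so it can help to confirm that the environment actually ended up with those versions. The following is a minimal sketch using only the standard library; the helper name `check_pins` and the `PINNED` dict are hypothetical and not part of the original setup notes.

```python
from importlib import metadata

# Pins copied from the pip command above; adjust them if the install line changes.
PINNED = {"distributed": "2024.2.0", "dask": "2024.2.0"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for each package whose installed
    version differs from its pin. `found` is None if the package is missing."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Prints one line per mismatch; prints nothing if every pin is satisfied.
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

The `get_version` parameter exists only so the check can be exercised without touching the real environment.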
# in a fresh conda environment, >= py3.8
conda install xrootd -c conda-forge
pip install coffea xgboost mt2
git clone https://github.com/TopEFT/topcoffea.git -b coffea2023
pushd topcoffea
pip install -e .
popd
git clone https://github.com/cmstas/ewkcoffea.git -b coffea2023
Running necessary_columns...
_ ._ __/__ _ _ _ _ _/_ Recorded: 07:32:07 Samples: 154011
/_//_/// /_\ / //_// / //_'/ // Duration: 164.460 CPU time: 164.503
/ _/ v4.6.1
Program: run_wwz4l.py ../../input_samples/sample_jsons/test_samples/UL17_WWZJetsTo4L2Nu_forCI.json,../../input_samples/sample_jsons/test_samples/UL17_WWZJetsTo4L2Nu_forCI_extra.json -x iterative
164.462 <module> run_wwz4l.py:1
└─ 164.462 report_necessary_columns dask_awkward/lib/inspect.py:118
with Client() as client:  # distributed Client scheduler
    # Run preprocess
    print("\nRunning preprocess...")
    dataset_runnable, dataset_updated = preprocess(
        fileset,
        maybe_step_size=50_000,
        align_clusters=False,
        files_per_batch=1,
        #skip_bad_files=True,
import numpy as np
from scipy.stats import uniform
import vector
import hist
from math import pi

def make_vector(rawvec):
    return vector.arr({"px": rawvec[:, 0], "py": rawvec[:, 1], "pz": rawvec[:, 2], "M": rawvec[:, 3]})

P_beam = 1000  # GeV
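`make_vector` reads each row of `rawvec` as (px, py, pz, M). As a cross-check that does not depend on the `vector` library, the energy of each row can be computed directly from E = sqrt(px^2 + py^2 + pz^2 + M^2) in natural units. This is a minimal NumPy sketch; the helper `energy_from_rows` and the sample rows are hypothetical and do not come from the original script.

```python
import numpy as np


def energy_from_rows(rawvec):
    """Energy per row for an (N, 4) array with columns px, py, pz, M
    (the same column layout make_vector expects)."""
    rawvec = np.asarray(rawvec, dtype=float)
    p2 = np.sum(rawvec[:, :3] ** 2, axis=1)  # |p|^2 for each row
    return np.sqrt(p2 + rawvec[:, 3] ** 2)   # E = sqrt(|p|^2 + M^2)


# Hypothetical input: a massless particle along +z with |p| = 1000 GeV,
# and a particle at rest with M = 5 GeV.
rows = np.array([[0.0, 0.0, 1000.0, 0.0],
                 [0.0, 0.0, 0.0, 5.0]])
print(energy_from_rows(rows))
```

For a non-negative mass this should agree with the energy component reported by the vector array built in `make_vector`.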
class CustomDataGenerator(tf.keras.utils.Sequence):
    def __init__(self,
                 data_directory_path: str = "./",
                 labels_directory_path: str = "./",
                 is_directory_recursive: bool = False,
                 file_type: str = "csv",
                 data_format: str = "2D",
                 batch_size: int = 32,
                 file_count = None,
from coffea.nanoevents import NanoEventsFactory, NanoAODSchema
from coffea.processor import accumulate
from distributed import Client
import dask
import dask_awkward as dak
import dask.array
from dask.diagnostics import ProgressBar
import awkward
# import warnings
# warnings.filterwarnings("error")
import logging
import os
import time
from coffea import processor
from coffea.nanoevents import NanoAODSchema
import time
import awkward as ak
import dask_awkward as dak
import numpy as np
import os
from coffea.lookup_tools import extractor
from coffea.jetmet_tools import FactorizedJetCorrector
(coffea-dev) lgray@dhcp-131-225-97-134 coffea % python -i spark_work.py
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/12/09 16:42:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
40 * {Muon_pt: var * float32, Muon_eta: var * float32, Muon_phi: var * float32, Muon_mass: var * float32, Muon_charge: var * int32, nMuon: int64}
[pyarrow.RecordBatch
Muon_pt: list<item: float not null> not null
  child 0, item: float not null
Muon_eta: list<item: float not null> not null
  child 0, item: float not null