Dataset associated with the "Machine Learning Electroweakino Production" publication (https://doi.org/10.48550/arXiv.2411.00093).
In this study, we explore whether Graph Neural Networks (GNNs) can enhance searches for supersymmetric dark matter particles at the LHC in the monojet channel. We train an ensemble of 10 networks for Wino- and Higgsino-like neutralinos and apply it to Bino, Wino, and Higgsino test samples to derive the sensitivity achievable at the end of the Run-3 and High Luminosity phases of the LHC.
The dataset contains 5 folders:
1) wino_train,
2) wino_val,
3) higgsino_train,
4) higgsino_val,
5) test.
Each "train" folder contains 10 files (archives), one per network in the ensemble of 10 networks, for either the Wino- or the Higgsino-like neutralino. The "val" folders contain the validation data for the ensemble, 10 files per neutralino type. All training and validation data correspond to a single mass point: a neutralino mass of 300 GeV and a squark mass of 2.2 TeV.
The "test" folder contains test data for the SM, Binos, Higgsinos, and Winos. The neutralino test archives contain all 30 mass points, with neutralino masses between 200 GeV and 1100 GeV and squark masses between 2.0 TeV and 3.0 TeV.
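The folder layout described above can be checked after download with a short script. This is a minimal sketch, not part of the official repository: the function name `check_layout` and the `root` argument are hypothetical, and it assumes the archives sit as plain files directly inside each folder. The number of files in "test" is not stated here, so only its presence is checked.

```python
from pathlib import Path
from typing import Dict, List

# Folder names taken from the dataset description above.
ENSEMBLE_FOLDERS = ["wino_train", "wino_val", "higgsino_train", "higgsino_val"]
TEST_FOLDER = "test"
ENSEMBLE_SIZE = 10  # one archive per network in the ensemble


def check_layout(root: Path) -> Dict[str, List[str]]:
    """Return {folder name: list of problems}; an empty list means OK."""
    report: Dict[str, List[str]] = {}
    for name in ENSEMBLE_FOLDERS + [TEST_FOLDER]:
        folder = root / name
        issues: List[str] = []
        if not folder.is_dir():
            issues.append("folder is missing")
        elif name != TEST_FOLDER:
            # Assumption: archives are plain files directly in the folder.
            n_files = sum(1 for p in folder.iterdir() if p.is_file())
            if n_files != ENSEMBLE_SIZE:
                issues.append(f"expected {ENSEMBLE_SIZE} files, found {n_files}")
        report[name] = issues
    return report
```

Calling `check_layout(Path("path/to/dataset"))` returns an empty list for every folder that matches the description.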
The data were produced by Monte Carlo simulation with MadGraph5, Pythia, and Delphes. The published data have passed the preselection described in the associated article.
All files are in the awkd0 format. Example code demonstrating how to read the files and use them for NN training can be found in the official repository of the project:
https://github.com/Rav2/monojet
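The repository above remains the authoritative reference for reading the files. As a rough illustration only, the sketch below loads every file in one folder, assuming each file can be opened directly with `awkward0.load` (the `awkward0` package from the version list). The helper name `load_folder` and its `loader` parameter are hypothetical. The structure of the returned objects is not described here, so the sketch does not inspect them.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Optional


def _awkward0_loader() -> Callable[[str], Any]:
    # Imported lazily so that load_folder can also be used with another
    # loader when awkward0 is not installed.
    import awkward0
    return awkward0.load


def load_folder(
    folder: Path, loader: Optional[Callable[[str], Any]] = None
) -> Dict[str, Any]:
    """Load every file in `folder`; return {file name: loaded object}."""
    if loader is None:
        # Assumption: the files are readable by awkward0.load as they are.
        loader = _awkward0_loader()
    return {
        p.name: loader(str(p))
        for p in sorted(folder.iterdir())
        if p.is_file()
    }
```

For example, `load_folder(Path("wino_train"))` would return one entry per archive, sorted by file name. Passing a custom `loader` makes it easy to swap in the reading routine from the official repository.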
Software and versions:
SuSpect, 3.1.1
SUSY-HIT, 1.5a
MadGraph5, 2.7.3
FastJet, 3.4.0
ExRootAnalysis, 1.1.2
Delphes, 3.4.3.pre12
CERN ROOT, 6.24/02
python, 3.8.10
numpy, 1.24.4
awkward0, 0.15.5
tensorflow, 2.7.0
pandas, 1.4.1
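To compare a local Python environment with the package versions listed above, a check along the following lines may help. It is a sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The names `EXPECTED`, `installed_version`, and `compare_versions` are hypothetical, and the non-Python tools in the list are not covered.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple

# Python package versions copied from the list above.
EXPECTED = {
    "numpy": "1.24.4",
    "awkward0": "0.15.5",
    "tensorflow": "2.7.0",
    "pandas": "1.4.1",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(
    expected: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (listed version, installed version)} for mismatches."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches
```

Calling `compare_versions(EXPECTED)` returns an empty dictionary when the environment matches the list. A mismatch does not necessarily prevent the files from being read; it only flags a difference from the setup listed here.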