Installation & Setup

This page describes how to set up the analysis environment on a local machine or a remote cluster.


System & Environment

  • Python ≥ 3.10
    Required for all analysis scripts and notebooks.

  • Git
    Required for cloning the repository.

  • C++ Compiler
    Required for XRootD and compiled dependencies. Use GCC or Clang on Linux/macOS; use Conda on Windows.

  • Conda (recommended)
    Handles complex binary dependencies reliably. Requires Miniconda or Anaconda. pip with a virtual environment is also supported.
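
The requirements above can be checked before cloning anything. The following is a minimal sketch, not part of the repository: the function name check_prerequisites and the list of tool names are assumptions made for illustration, and a tool that is installed but not on PATH will be reported as missing.

```python
import shutil
import sys

# Minimum interpreter version, taken from the requirement list above.
MIN_PYTHON = (3, 10)


def check_prerequisites(tools=("git", "conda", "gcc", "clang")):
    """Return a dict mapping each prerequisite to True (found) or False (missing).

    The tool names are illustrative; only one of gcc/clang is needed.
    """
    report = {"python": sys.version_info >= MIN_PYTHON}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:<8} {'found' if ok else 'missing'}")
```

A missing conda entry is not an error if you plan to use pip with a virtual environment, and only one of the two compilers needs to be present.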


Installation

All dependencies are installed from environment.yml (Conda) or requirements.txt (pip). See step 2, Set Up the Python Environment, for instructions.

1. Clone the Repository

Terminal
git clone https://github.com/anrghv/H-to-WW-NanoAOD-analysis.git
cd H-to-WW-NanoAOD-analysis

2. Set Up the Python Environment

The repository includes a complete environment.yml specifying all required packages with minimum version constraints:

Create and activate the environment
conda env create -f environment.yml
conda activate HEP_analysis

This creates a Conda environment named HEP_analysis with:

  • All Scikit-HEP packages (uproot, awkward, vector, hist)
  • Dask for distributed computing
  • JupyterLab for interactive notebooks
  • fsspec-xrootd for XRootD file access
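
Alternatively, to use pip with a virtual environment instead of Conda, install from requirements.txt:

Create and activate the virtual environment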
python3 -m venv .venv
source .venv/bin/activate   # Linux / macOS
# .venv\Scripts\activate    # Windows
pip install -r requirements.txt
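
A common source of confusion after setup is running Python from the wrong environment. The sketch below reports which kind of environment the current interpreter belongs to. It relies on two standard mechanisms: conda activate sets the CONDA_DEFAULT_ENV variable, and inside a venv sys.prefix differs from sys.base_prefix. The function name active_environment and its parameters are hypothetical and exist only to make the logic easy to test.

```python
import os
import sys


def active_environment(environ=None, prefix=None, base_prefix=None):
    """Describe the active environment as 'venv', 'conda:<name>', or 'none'.

    All arguments default to the values of the running interpreter; they
    are parameters only so the logic can be exercised in isolation.
    """
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    # Inside a venv the interpreter prefix differs from the base prefix.
    if prefix != base_prefix:
        return "venv"

    # `conda activate` exports the environment name in this variable.
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda:{conda_env}"

    return "none"


if __name__ == "__main__":
    print("Active environment:", active_environment())
```

After following the Conda instructions, this should report conda:HEP_analysis; after the pip instructions, venv.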

Windows

The analysis runs on Windows, macOS, and Linux. On Windows, Conda is strongly recommended because some dependencies have complex build requirements that Conda resolves automatically.

3. Verify the Installation

Verify all packages
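    # An ImportError here means the environment is incomplete or not activated.
    import uproot, awkward as ak, vector, hist, dask
    print("All packages loaded successfully.")
    print(f"  uproot  : {uproot.__version__}")
    print(f"  awkward : {ak.__version__}")
    print(f"  vector  : {vector.__version__}")
    print(f"  hist    : {hist.__version__}")
    print(f"  dask    : {dask.__version__}")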
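
The import check above stops at the first missing package. As a complementary sketch, the snippet below queries installed package metadata instead, so it can report every missing package in one pass without importing anything. The names PACKAGES and installed_versions are assumptions for illustration; the distribution names are the ones listed on this page.

```python
from importlib import metadata

# Distribution names as listed on this page (assumed to match PyPI names).
PACKAGES = ("uproot", "awkward", "vector", "hist", "dask", "fsspec-xrootd")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"  {name:<14}: {version or 'NOT INSTALLED'}")
```

Any line marked NOT INSTALLED suggests that the environment was not activated or that the install step did not complete.
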
Test CERN EOS access
    import fsspec
    with fsspec.open(
        "root://eospublic.cern.ch//eos/opendata/cms/mc/"
        "RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2N_M-125"
        "_TuneCP5_minloHJJ_13TeV-powheg-jhugen727-pythia8/"
        "NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/30000/"
        "00B3B6E3-3D68-C048-A8C4-04EB699CCE5D.root"
    ) as f:
        print("XRootD connection OK:", f.path)
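
The URL in the test above follows the XRootD convention root://<server>//<absolute path>, where the second double slash introduces an absolute path on the server. When a connection fails, it can help to confirm that the URL has this shape before suspecting the network. The helper below is a minimal sketch using only the standard library; split_xrootd_url is a hypothetical name, and the shortened path in the usage lines is a placeholder rather than a real file.

```python
from urllib.parse import urlsplit


def split_xrootd_url(url):
    """Split a root:// URL into (server, absolute_path).

    Raises ValueError if the URL does not use the root scheme or has no server.
    """
    parts = urlsplit(url)
    if parts.scheme != "root" or not parts.netloc:
        raise ValueError(f"not an XRootD URL: {url!r}")
    # urlsplit keeps the double slash in the path; normalise to one leading slash.
    path = "/" + parts.path.lstrip("/")
    return parts.netloc, path


if __name__ == "__main__":
    # Placeholder path for illustration only.
    server, path = split_xrootd_url("root://eospublic.cern.ch//eos/opendata/cms/example.root")
    print("server:", server)
    print("path  :", path)
```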

4. Run the Analysis

Launch the notebook
cd notebooks/
jupyter lab HWW_analysis.ipynb

For a step-by-step walkthrough and instructions for running a full batch job with Dask, see the Analysis notebook.