Snakemake Rules Reference
workflow/rules/aux.smk module
print_stats
Prints a table with summary runtime information for each simid.
Can be run with snakemake print_stats. The listed tiers are taken from the Simflow config field make_steps.
Note
The statistics refer to the total job wall time, as measured by Snakemake.
No wildcards are used.
print_benchmark_stats
Prints a table with summary runtime information of a benchmarking run.
Can be run with snakemake print_benchmark_stats. This functionality is useful to tune the number of remage primaries and jobs in the Simflow configuration. After printing the table, the rule also writes an updated generated/benchmarks/generated-simconfig.yaml with suggested primaries_per_job and number_of_jobs values that can optionally be swapped in place of the source simconfig.yaml.
Note
The runtime and the simulation speed are extracted from the event simulation loop statistics reported by remage. These values do not account for other remage steps like initialization or post-processing.
No wildcards are used.
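
To illustrate how a benchmarked simulation speed could translate into primaries_per_job and number_of_jobs values, here is a minimal sketch. It is not the Simflow implementation: the helper suggest_job_split, its arguments (a target job runtime and a total number of primaries) and the numbers used are assumptions made for this example only.

```python
import math


def suggest_job_split(events_per_second, total_primaries, target_job_seconds):
    """Hypothetical helper (not part of the Simflow).

    Splits `total_primaries` into jobs that each run for roughly
    `target_job_seconds`, given a benchmarked simulation speed.
    """
    if events_per_second <= 0 or total_primaries <= 0 or target_job_seconds <= 0:
        raise ValueError("all inputs must be positive")
    # Primaries that one job can simulate within the target runtime (at least 1).
    primaries_per_job = max(1, int(events_per_second * target_job_seconds))
    # Number of jobs needed to reach at least the requested statistics.
    number_of_jobs = math.ceil(total_primaries / primaries_per_job)
    return {"primaries_per_job": primaries_per_job, "number_of_jobs": number_of_jobs}


# Made-up benchmark result: 250 events/s, 10 million primaries, 1 hour per job.
suggestion = suggest_job_split(
    events_per_second=250.0,
    total_primaries=10_000_000,
    target_job_seconds=3600,
)
print(suggestion)
```

Since the benchmarked speed only covers the event simulation loop, the real job wall time will be somewhat longer than the target used here.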
_init_julia_env
No description provided.
cache_detector_usabilities
Cache detector usabilities.
Querying the metadata for detector usability can be slow and constitute the bottleneck in post-processing (opt and hit tiers). This rule caches the mapping run -> detector -> {usability, psd_usability} on disk.
archive_plots
Archive all validation plots into a single tarball.
Must be triggered manually with snakemake archive_plots; it is not part of the default all target. Collects all plots/ subdirectories produced by the Simflow under the generated/ directory and packs them into tarballs/<cycle>-plots.tar.xz, preserving the directory tree structure.
No wildcards are used.
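
The archiving step can be sketched with the Python standard library. This is only an illustration of packing plots/ directories into an xz-compressed tarball while keeping relative paths; the function name archive_plot_dirs and the demo directory layout are assumptions, not the code used by the rule.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_plot_dirs(root, tarball):
    """Hypothetical helper: pack every `plots/` directory below `root`
    into an xz-compressed tarball, with member paths relative to `root`."""
    root = Path(root)
    tarball = Path(tarball)
    tarball.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "w:xz") as tar:
        for plots_dir in sorted(root.rglob("plots")):
            if plots_dir.is_dir():
                # tar.add() recurses into the directory; arcname preserves the tree.
                tar.add(plots_dir, arcname=str(plots_dir.relative_to(root)))
    return tarball


# Demo on a throw-away directory tree (layout invented for the example).
with tempfile.TemporaryDirectory() as tmp:
    generated = Path(tmp) / "generated"
    plots = generated / "tier" / "hit" / "plots"
    plots.mkdir(parents=True)
    (plots / "energy.pdf").write_bytes(b"fake plot")
    out = archive_plot_dirs(generated, Path(tmp) / "tarballs" / "demo-plots.tar.xz")
    with tarfile.open(out) as tar:
        archived_names = sorted(tar.getnames())

print(archived_names)
```

The sketch assumes that plots/ directories are not nested inside one another; otherwise the inner directory would be added twice.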
workflow/rules/cvt.smk module
gen_all_tier_cvt
Aggregate and produce all the cvt tier files.
build_tier_cvt
Produce a cvt tier file.
cvt stands for “concatenated evt tier”. evt files for each simulation job are concatenated/aggregated into a single file.
Uses wildcard simid.
plot_tier_cvt_observables
Produce validation plots of observable distributions from the cvt tier.
Generates diagnostic plots from all cvt output files for the given simid.
Uses wildcard simid.
workflow/rules/evt.smk module
gen_all_tier_evt
Aggregate and produce all the evt tier files.
build_tier_evt
Produce an evt tier file.
Event files re-organize the hit and opt tier data into a single, event-oriented table where each row corresponds to an event:
- a unified TCM is built from the opt and hit data. It differs from the stp tier TCM in that it also includes the SiPM channels;
- each chunk of the unified TCM is partitioned according to the livetime span of each run (see the make_simstat_partition_file rule);
- fields from lower tiers are restructured into events;
- new event-level fields are computed and stored in the output file;
- optionally, random-coincidence (RC) SiPM data from real evt files is added as spms/rc_energy and spms/rc_time (controlled by add_random_coincidences in tier/evt/{experiment}/settings.yaml).
A top-level detector_uids struct mapping detector names to reboost UIDs (union of hit and opt tiers) is also written, to enable downstream per-group filtering in the pdf tier.
Uses wildcards simid and jobid.
workflow/rules/hit.smk module
gen_all_tier_hit
Aggregate and produce all the hit tier files.
build_tier_hit
Produce a hit tier file starting from a single stp tier file.
This rule implements the post-processing of the stp tier HPGe data in chunks, in the following steps:
- each chunk is partitioned according to the livetime span of each run (see the make_simstat_partition_file rule). For each partition:
  - the detector usability and PSD usability are retrieved from legend-metadata and stored in the output;
  - the active volume model is applied based on information from legend-metadata;
  - A/E is simulated based on current signal templates extracted from LEGEND-200 data;
  - energy is smeared according to the measured energy resolution (extracted from the data production parameters database);
  - a new time-coincidence map (TCM) across the processed detectors is created and stored in the output file.
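
As an illustration of the energy smearing step, the sketch below draws each smeared energy from a Gaussian whose width follows an energy-dependent resolution curve. The functional form FWHM = sqrt(a + b*E), the parameter values and the function names are assumptions chosen for the example; the actual model and parameters come from the data production parameters database.

```python
import math
import random
import statistics

# Conversion between FWHM and Gaussian sigma: FWHM = 2*sqrt(2*ln 2) * sigma.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def fwhm_kev(energy_kev, a, b):
    """Assumed resolution model for this example: FWHM = sqrt(a + b*E)."""
    return math.sqrt(a + b * energy_kev)


def smear_energies(energies_kev, a, b, rng):
    """Return one Gaussian-smeared value per input energy."""
    return [
        rng.gauss(e, fwhm_kev(e, a, b) * FWHM_TO_SIGMA)
        for e in energies_kev
    ]


rng = random.Random(42)  # fixed seed so that the example is reproducible
true_energies = [2039.0] * 10_000  # mono-energetic line, in keV
smeared = smear_energies(true_energies, a=0.5, b=4e-4, rng=rng)

print("mean  =", statistics.fmean(smeared))
print("sigma =", statistics.stdev(smeared))
```

With enough samples, the printed standard deviation approaches fwhm_kev(2039.0, 0.5, 4e-4) * FWHM_TO_SIGMA.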
The stp data format is preserved: detector tables are stored separately in the output file below /hit/{detector_name}.
Uses wildcards simid and jobid.
plot_tier_hit_observables
Produce validation plots of observable distributions from the hit tier.
Generates diagnostic plots from all hit output files for the given simid.
Uses wildcard simid.
workflow/rules/opt.smk module
gen_all_tier_opt
Aggregate and produce all the opt tier files.
build_tier_opt
Produce an opt tier file starting from a single stp tier file.
This rule implements the post-processing of the stp tier liquid argon energy depositions in chunks, in the following steps:
- each chunk is partitioned according to the livetime span of each run (see the make_simstat_partition_file rule). For each partition:
  - the detector usability is retrieved from legend-metadata and stored in the output;
  - scintillation photons are generated corresponding to simulated energy depositions;
  - detected photoelectrons are sampled according to the input optical map;
  - a finite resolution is applied to each photoelectron amplitude (see script);
  - photoelectrons are clustered in time to simulate the effect of finite time resolution of the system;
  - a new time-coincidence map (TCM) across the processed SiPMs is created and stored in the output file.
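
The time clustering of photoelectrons can be pictured with the following minimal sketch. It assumes a simple scheme in which a photoelectron is merged into the current cluster if it arrives less than a fixed window after the cluster start; the function name, the window value and the merging criterion are assumptions, and the actual script may cluster differently.

```python
def cluster_photoelectrons(times_ns, amplitudes_pe, window_ns):
    """Hypothetical clustering: merge photoelectrons arriving less than
    `window_ns` after the first photoelectron of the current cluster.

    Returns a list of (cluster_time, summed_amplitude) tuples, where the
    cluster time is the time of its first photoelectron.
    """
    if len(times_ns) != len(amplitudes_pe):
        raise ValueError("times and amplitudes must have the same length")
    clusters = []
    # Process photoelectrons in time order, whatever the input order was.
    for t, amp in sorted(zip(times_ns, amplitudes_pe)):
        if clusters and t - clusters[-1][0] < window_ns:
            clusters[-1][1] += amp  # same cluster: sum the amplitudes
        else:
            clusters.append([t, amp])  # start a new cluster
    return [(t, amp) for t, amp in clusters]


# Five photoelectrons in one SiPM (times in ns, amplitudes in p.e.).
times = [400.0, 100.0, 130.0, 105.0, 402.0]
amplitudes = [1.0, 1.0, 1.0, 0.5, 1.0]
clusters = cluster_photoelectrons(times, amplitudes, window_ns=20.0)
print(clusters)
```

In this example the photoelectrons at 100 ns and 105 ns merge into one cluster, as do those at 400 ns and 402 ns, while the one at 130 ns stays on its own.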
This rule can sample photoelectrons in each SiPM individually or for all SiPMs at the same time; see the relevant param flag.
The stp data format is preserved: SiPM tables are stored separately in the output file below /hit/{sipm_name}.
Uses wildcards simid and jobid.
plot_tier_opt_observables
Produce validation plots of observable distributions from the opt tier.
Generates diagnostic plots from all opt output files for the given simid.
Uses wildcard simid.
workflow/rules/par.smk module
Rules to compute the simulation parameters (par step).
gen_all_tier_par
Produce all par step outputs.
make_simstat_partition_file
Create the simulation event statistics partitioning file.
This rule maps chunks of event indices to partitions associated to the data taking runs specified in the “runlist” (from e.g. config.runlist) and stores them on disk as YAML files. The format is the following:
    job_000:
      l200-p03-r001-phy: [0, 300]
      l200-p03-r002-phy: [301, 456]
    job_001:
      l200-p03-r002-phy: [0, 200]
      l200-p03-r003-phy: [201, 156]
    job_002:
      l200-p03-r003-phy: [0, 50]
The events simulated in job 0 (456) are split between r001 and r002. The partition corresponding to r002 is however incomplete, and 200 events are taken from the simulation job 1.
The fraction of total simulated events (summed over all simulation jobs) that belong to a partition is determined by weighting with the fraction of livetime that belongs to that run.
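
The livetime weighting can be sketched as follows. This is a simplified illustration and not the rule's actual code: the function partition_events, the use of inclusive [first, last] index pairs and the livetime values are assumptions made for the example.

```python
from itertools import accumulate


def partition_events(livetimes, events_per_job):
    """Hypothetical helper: assign event index ranges of each job to runs.

    `livetimes` maps run names (in chronological order) to livetimes;
    `events_per_job` lists the number of simulated events in each job.
    Returns {job: {run: [first, last]}} with inclusive per-job indices.
    """
    total_events = sum(events_per_job)
    total_livetime = sum(livetimes.values())
    # Global (half-open) boundaries: each run gets a share of all events
    # proportional to its fraction of the total livetime.
    edges = [0] + [
        round(total_events * cumulative / total_livetime)
        for cumulative in accumulate(livetimes.values())
    ]
    edges[-1] = total_events  # guard against floating-point rounding

    partitions = {}
    offset = 0  # global index of the first event of the current job
    for job, n_events in enumerate(events_per_job):
        job_partitions = {}
        for run, lo, hi in zip(livetimes, edges[:-1], edges[1:]):
            # Overlap between this job's events and this run's share.
            first = max(lo, offset)
            stop = min(hi, offset + n_events)
            if first < stop:
                job_partitions[run] = [first - offset, stop - offset - 1]
        partitions[f"job_{job:03d}"] = job_partitions
        offset += n_events
    return partitions


# Made-up livetimes (arbitrary units) and job sizes.
livetimes = {"l200-p03-r001-phy": 1.0, "l200-p03-r002-phy": 2.0, "l200-p03-r003-phy": 1.0}
partitions = partition_events(livetimes, events_per_job=[400, 400, 200])
for job, runs in partitions.items():
    print(job, runs)
```

As in the format shown above, a run whose share of events does not fit in one job continues at index 0 of the next job.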
Uses wildcard simid.
build_hpge_drift_time_map
Produce an HPGe drift time map.
Run a Julia script based on a pulse shape simulation performed with the SolidStateDetectors.jl package, using crystal geometry information from legend-metadata.
Uses wildcards hpge_detector and hpge_voltage.
merge_hpge_drift_time_maps
Merge HPGe drift time maps into a single file.
Copy the top-level LH5 objects from each individual detector drift time map file into a single merged file using h5copy.
Uses wildcard runid.
plot_hpge_drift_time_maps
Produce a validation plot of an HPGe drift time map.
Generates diagnostic plots of the computed drift time map for a single detector at the specified operational voltage.
Uses wildcards hpge_detector and hpge_voltage.
extract_current_pulse_model
Extract the HPGe current signal model.
Perform a fit of current signals recorded in LEGEND-200 and store the best-fit model parameters in a YAML file.
Warning
This rule does not have the relevant LEGEND-200 data files as input, since they are dynamically discovered and this would therefore slow down the DAG generation. Therefore, remember to force-rerun if the input data is updated!
Uses wildcards runid and hpge_detector.
merge_current_pulse_model_pars
Merge the HPGe current signal model parameters in a single file per runid.
Collect the individual best-fit parameter files (one per detector) and write them into a single YAML file keyed by detector name.
Uses wildcard runid.
extract_hpge_observables_models
Extract and store on disk models of the HPGe observables for a run.
Stores YAML files with a mapping between HPGe detectors and respective information to reconstruct:
- the energy resolution as a function of energy;
- the A/E resolution as a function of energy;
as determined during energy calibration. This is done in a separate rule because the data production parameter database is large and we don’t want to use a lot of memory in the build_tier_hit rule.
Design: this rule is a collection step, not a validation step. It gathers what it can from l200data and simprod/config/pars/geds/eresmod/; the output may be incomplete. Completeness is validated downstream in build_tier_hit.
Uses wildcard runid.
workflow/rules/pdf.smk module
gen_all_tier_pdf
Aggregate and produce all the pdf tier files.
No wildcards are used.
archive_pdfs
Archive all pdf tier files into a single tarball.
Must be triggered manually with snakemake archive_pdfs; it is not part of the default all target. Collects all LH5 files produced under tier/pdf/ and packs them into tarballs/<cycle>-pdfs.tar.xz, preserving the directory tree structure.
No wildcards are used.
build_tier_pdf
Produce a pdf tier file.
Reads cvt tier data and bins it into histograms (the PDFs) according to the PDF configuration file.
Uses wildcard simid.
workflow/rules/stp.smk module
Rules to build the stp tier.
gen_all_tier_stp
Build the entire stp tier.
gen_geom_config
Write a geometry configuration file for legend-pygeom-l200.
Start from the template/default geometry configuration file and, if requested in simconfig.yaml through the geom_config_extra field, add extra configuration options.
Uses wildcards tier and simid.
build_geom_gdml
Build a concrete geometry GDML file with pygeoml200.
Run legend-pygeom-l200 to convert the geometry configuration file into a GDML file.
Uses wildcards tier and simid.
gen_remage_macro
Write the remage macro file for a stp tier simulation to disk.
Renders the macro template for the given simid using legendsimflow.commands.make_remage_macro() and writes it to the canonical macro path under generated/macros/.
Uses wildcard simid.
build_tier_stp
Run a single simulation job for the stp tier.
Invoke remage using the macro generated by legendsimflow.commands.make_remage_macro() from simconfig.yaml.
Uses wildcards simid and jobid.
Note
The output remage file is declared as protected to avoid accidental deletions, since it typically takes a lot of resources to produce it.
plot_tier_stp_vertices
Produce plots of the primary event vertices of tier stp.
Only the first file of the simulation (i.e. job ID 0) is used. The rule is given a high priority to make sure that the plot is produced early. The maximum number of plotted events is set in the plotting script.
Uses wildcard simid.
workflow/rules/vtx.smk module
build_tier_vtx
Run a single simulation job for the vtx tier.
Run the user-defined vertex generation command from vtx_simconfig.yaml, templating it with the geometry file path, output file path, and number of events.
Uses wildcards simid and jobid.
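
Command templating of this kind can be sketched with plain Python string formatting. The placeholder names ({geometry}, {output}, {n_events}), the command my-vertex-generator and the file paths are all hypothetical; the placeholders actually supported in vtx_simconfig.yaml may be named differently.

```python
import shlex


def render_vtx_command(template, geometry, output, n_events):
    """Hypothetical helper: fill a user-defined command template.

    Paths are shell-quoted so that spaces or special characters do not
    break the resulting command line.
    """
    return template.format(
        geometry=shlex.quote(str(geometry)),
        output=shlex.quote(str(output)),
        n_events=int(n_events),
    )


# Invented command template, as a user might write it in a config file.
template = "my-vertex-generator --geometry {geometry} --output {output} --events {n_events}"
command = render_vtx_command(
    template,
    geometry="geom/l200.gdml",
    output="tier/vtx/job_0000.lh5",
    n_events=1000,
)
print(command)
```

The rendered string can then be passed to a shell or split into an argument list with shlex.split().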