legendsimflow package¶
Submodules¶
legendsimflow._version module¶
legendsimflow.aggregate module¶
- legendsimflow.aggregate.crystal_meta(config, diode_meta)¶
Get the crystal metadata starting from the diode metadata.
- Return type:
- legendsimflow.aggregate.gen_list_of_all_hpges_valid_for_modeling(config, write_to_file=None)¶
Generate the complete list of HPGe detectors valid for modeling.
Find out which HPGe detectors are valid for each runid and their voltages. Returns the following dictionary:
{ 'l200-p03-r000-phy': {'V00048A': 4200, ...}, 'l200-p03-r001-phy': {'V00050B': 3500, ...}, ... }
i.e. a mapping
runid -> hpge -> voltage.
- legendsimflow.aggregate.gen_list_of_all_par_outputs(config)¶
Generate the list of all (non-plot) par step output files in the Simflow.
- legendsimflow.aggregate.gen_list_of_all_plots(config, **kwargs)¶
Generate a list of all plot files across all active make_steps.
Extra keyword arguments (e.g. cache) are forwarded to gen_list_of_all_plots_outputs().
- legendsimflow.aggregate.gen_list_of_all_plots_outputs(config, tier, **kwargs)¶
Generate a list of all plot files that belong to a tier.
- legendsimflow.aggregate.gen_list_of_all_runids(config)¶
The full list of runids required in the Simflow.
- legendsimflow.aggregate.gen_list_of_all_simid_outputs(config, tier)¶
Generate a list of all files that belong to a tier.
- legendsimflow.aggregate.gen_list_of_all_simids(config)¶
Generate a list of all simids defined in the simflow.
The list is generated by querying the stp tier configuration.
- legendsimflow.aggregate.gen_list_of_all_tier_cvt_outputs(config, **kwargs)¶
Generate the list of all cvt tier files in the Simflow.
- legendsimflow.aggregate.gen_list_of_all_tier_pdf_outputs(config, **kwargs)¶
Generate the list of all pdf tier files in the Simflow.
- legendsimflow.aggregate.gen_list_of_all_usabilities(config)¶
Generate a usability mapping for all detectors and all runs defined in the Simflow.
Use this function to build a cache to avoid repeated metadata lookups. Returns the following dictionary:
{ 'l200-p03-r000-phy': { 'V00048A': {'usability': 'on', 'psd_usability': 0}, ... }, ... }
psd_usability is an integer encoding of the psd.status.low_aoe field in the channel map status for germanium detectors (see legendsimflow.metadata.PSD_USABILITY_CODE). If the field is absent it defaults silently to "valid"; if it has an unexpected value a warning is emitted and it also defaults to "valid".
- legendsimflow.aggregate.gen_list_of_aoeresmods(config, simid)¶
Generate the list of HPGe A/E resolution model parameter files for all requested runids.
- legendsimflow.aggregate.gen_list_of_currmod_plots_outputs(config, simid, cache=None)¶
Generate the list of HPGe current pulse model plot outputs.
- legendsimflow.aggregate.gen_list_of_currmods(config, runid, cache=None)¶
Generate the list of HPGe current model parameter files for a runid.
- legendsimflow.aggregate.gen_list_of_dtmap_plots_outputs(config, simid, cache=None)¶
Generate the list of HPGe drift time map plot outputs.
- legendsimflow.aggregate.gen_list_of_dtmaps(config, runid, cache=None)¶
Generate the list of HPGe drift time map files for a runid.
- legendsimflow.aggregate.gen_list_of_eresmods(config, simid)¶
Generate the list of HPGe energy resolution model parameter files for all requested runids.
- legendsimflow.aggregate.gen_list_of_hpges_valid_for_modeling(config, runid)¶
Make a sorted list of HPGe detectors for which we want to compute a model.
It generates the list of detectors deployed in runid via the LEGEND channelmap, then checks whether the crystal metadata contains all the information required to generate a drift time map etc.
Detectors listed in the validity-based metadata directory
simprod/config/pars/{experiment}/geds/skip/ for the given runid are additionally excluded from the result. The skip list is a mapping {detector_name: reason}; a WARNING is logged for each skipped detector. A missing directory or empty mapping is a no-op.
Warning
This function is expensive in terms of filesystem I/O! Do not call it multiple times or in hot loops.
- legendsimflow.aggregate.gen_list_of_merged_currmods(config, simid)¶
Generate the list of (merged) HPGe current model parameter files for all requested runids.
- legendsimflow.aggregate.gen_list_of_merged_dtmaps(config, simid)¶
Generate the list of (merged) HPGe drift time map files for all requested runids.
- legendsimflow.aggregate.gen_list_of_plots_outputs(config, tier, simid, **kwargs)¶
Generate the list of plot files for a tier.simid.
- legendsimflow.aggregate.gen_list_of_psdcuts(config, simid)¶
Generate the list of HPGe PSD cut value files for all requested runids.
- legendsimflow.aggregate.gen_list_of_simid_inputs(config, tier, simid)¶
Generate the list of input files for a tier.simid.
- legendsimflow.aggregate.gen_list_of_simid_outputs(config, tier, simid, max_files=None)¶
Generate the list of output files for a tier.simid.
- legendsimflow.aggregate.get_hpge_voltage(config, hpge, runid)¶
Get the operational voltage for an HPGe in a given run.
Returns the voltage as an integer.
- Return type:
- legendsimflow.aggregate.get_simid_njobs(config, simid)¶
Number of jobs that will be generated for a tier.simid.
Based on the information contained in the stp-tier simulation configuration. If config.benchmark is true, this function always returns 1.
- Return type:
- legendsimflow.aggregate.process_simlist(config, simlist=None, make_steps=None)¶
Produce a list of all output files that refer to a simlist.
Each simlist item is
<tier>.<simid>. The tier is interpreted as the latest tier requested for that simid; outputs are produced cumulatively for all tiers up to (and including) that tier in make_steps.
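The cumulative expansion of a simlist item can be sketched as follows. This is only an illustration of the rule stated above, not the package's implementation: `expand_simlist_item` is a hypothetical helper, and the tier order in `steps` is an assumption made for the example.

```python
def expand_simlist_item(item: str, make_steps: list[str]) -> list[tuple[str, str]]:
    """Expand '<tier>.<simid>' into (tier, simid) pairs up to and including <tier>."""
    # Split on the first dot only: valid simids never contain dots.
    tier, simid = item.split(".", 1)
    if tier not in make_steps:
        msg = f"tier {tier!r} is not among the active make_steps"
        raise ValueError(msg)
    last = make_steps.index(tier)
    return [(t, simid) for t in make_steps[: last + 1]]


# Hypothetical ordering of active steps, earliest first.
steps = ["stp", "hit", "cvt", "pdf"]
pairs = expand_simlist_item("hit.my-simid", steps)
```

Here `pairs` contains one entry for `stp` and one for `hit`, since `hit` is the latest tier requested for that simid.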
legendsimflow.archive module¶
- legendsimflow.archive.create_pdfs_tarball(pdf_dir, output, prefix)¶
Archive all pdf tier LH5 files into a .tar.xz.
- legendsimflow.archive.create_plots_tarball(generated_dir, output, prefix)¶
Archive all plots/ directories under generated_dir into a .tar.xz.
legendsimflow.awkward module¶
- legendsimflow.awkward.ak_isin(elements, test_elements, *, assume_unique=False)¶
legendsimflow.cli module¶
- legendsimflow.cli._partition(xs, n)¶
- legendsimflow.cli.snakemake_nersc_batch_cli()¶
Implementation of the snakemake-nersc-batch CLI.
- legendsimflow.cli.snakemake_nersc_cli()¶
Implementation of the snakemake-nersc CLI.
legendsimflow.commands module¶
- legendsimflow.commands._confine_by_volume(is_surface, volume, surface_max_intersections=100)¶
Helper function to generate confinement macro lines for a given volume.
- legendsimflow.commands._get_full_name(node)¶
Get the name of the function being called, including the module path if it’s an attribute access.
- Return type:
- legendsimflow.commands.get_confinement_from_function(function_string, reg)¶
Get the confinement commands for a function defined in the GDML.
The function string must correspond to the following format:
module.function(<...>, arg=...)
where <...> will be replaced with the pyg4ometry.geant4.Registry instance for the geometry.
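One way to read such a function string is to parse it with the standard `ast` module. The sketch below is an assumption about how this could be done, not the package's code: `full_name` and `parse_function_string` are hypothetical helpers, `mymodule.confine` is a made-up target, and only literal keyword arguments are handled.

```python
import ast


def full_name(node: ast.expr) -> str:
    """Return the dotted name of a call target such as ``pkg.mod.func``."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return f"{full_name(node.value)}.{node.attr}"
    raise ValueError("unsupported call target")


def parse_function_string(function_string: str) -> tuple[str, dict]:
    # "<...>" is not valid Python, so swap in a placeholder identifier first.
    expr = function_string.replace("<...>", "_registry_")
    call = ast.parse(expr, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a function call")
    # Keep only keyword arguments, evaluated safely as Python literals.
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return full_name(call.func), kwargs


name, kwargs = parse_function_string("mymodule.confine(<...>, inside=False)")
```

The caller would then import the module named in `name` and invoke the function with the geometry registry as first argument.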
- legendsimflow.commands.make_remage_macro(config, simid, tier='stp', geom=None)¶
Render the remage macro for a given simulation and write it to disk.
This function reads the simulation configuration for the provided tier/simid, assembles the macro substitutions (e.g. GENERATOR, CONFINEMENT) using values and references defined under config.metadata, renders the specified macro template, writes the final macro file to the canonical input path, and returns both the macro text and the output file path.
- Parameters:
config (AttrsDict) – Mapping-like Snakemake configuration that supports attribute-style access (e.g. config.experiment, config.metadata, etc.). The following fields are used:
- experiment: name of the experiment to select tier-specific metadata.
- metadata.tier[tier][experiment].generators: generator definitions.
- metadata.tier[tier][experiment].confinement: confinement definitions.
simid (str) – Simulation identifier to select the simconfig.
tier (str) – Simulation tier (e.g. “stp”, “ver”, …). Default is “stp”.
- Return type:
- Returns:
A tuple with
- The rendered macro text.
- The path where the macro was written.
Notes
The macro template path is taken from the simconfig template field.
Supported substitutions currently include: GENERATOR and CONFINEMENT.
The user can provide arbitrary macro substitutions with the optional macro_substitutions field.
The macro is written to the canonical path returned by patterns.input_simjob_filename().
If config.nersc.dvs_ro is set, the vertices file will be read from the read-only filesystem mount /dvs_ro at NERSC.
- legendsimflow.commands.remage_run(config, simid, *, jobid=None, tier='stp', geom='{input.geom}', procs=1, output='{output}', macro_free=False)¶
Build a remage CLI invocation string for a given simulation.
This constructs a shell-escaped command line for remage. When macro_free is True, the macro is rendered inline via make_remage_macro() and its content is passed directly on the CLI. When macro_free is False (default), the pre-existing macro file path is referenced on the CLI and substitutions are passed via --macro-substitutions; in that case the caller is responsible for generating the macro file beforehand (e.g. via the gen_remage_macro Snakemake rule).
Notes
Compatible with remage >= v0.16.
When macro_free is False (default), the command passes the macro file path and supplies macro substitutions via --macro-substitutions.
When macro_free is True, the rendered macro content is inlined on the CLI (comments and empty lines removed) and values are pre-substituted.
Two substitutions are always provided: N_EVENTS (from primaries_per_job or benchmark override) and SEED (a random 32-bit integer).
SEED is meant to be used as remage seed. It is determined by converting output to a 32-bit integer hash. If provided, the user config.simflow_rng_seed integer is added as offset.
The JOBID substitution is also provided if the jobid argument is not None.
If config.runcmd.remage is set, it is used to determine the remage executable (split with shlex.split()), otherwise remage is used.
If config.nersc.dvs_ro is set, remage is set to read all inputs from the read-only filesystem mount /dvs_ro at NERSC.
If config.nersc.scratch is set, the command will write the output file on the scratch disk and move it to the final expected destination at the end.
- Parameters:
config (AttrsDict) – Snakemake-like configuration mapping. Must include metadata required by make_remage_macro() and optional benchmark and runcmd sections.
simid (str) – Simulation identifier for which to construct the command.
jobid (str | None) – Job identifier for the simulation run (string holding a zero-padded integer). Used as remage CLI macro substitution in case the macro contains it (e.g. if a vertices file is used).
tier (str) – Simulation tier (e.g., "stp", "ver"). Default is "stp".
geom (str | Path) – Path (or Snakemake placeholder) to the GDML geometry file.
procs (int) – Number of threads to pass to remage (integer or Snakemake placeholder). Internally uses remage’s --procs.
output (str | Path) – Path (or Snakemake placeholder) to the output remage file.
macro_free (bool) – If True, inline the macro contents on the CLI; if False, reference the macro file and pass substitutions via --macro-substitutions.
- Return type:
- Returns:
A shell-escaped command line suitable for direct execution.
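The SEED derivation described in the notes can be illustrated with a deterministic 32-bit hash of the output path. The hash function actually used by the package is not stated here; `zlib.crc32` is an assumption chosen because it returns an unsigned 32-bit integer that is stable across processes, and the wrap-around after adding the offset is likewise an assumption. `seed_from_output` is a hypothetical name.

```python
import zlib


def seed_from_output(output: str, offset: int = 0) -> int:
    """Map an output path to a reproducible seed in [0, 2**32)."""
    # crc32 yields an unsigned 32-bit integer for any byte string.
    base = zlib.crc32(output.encode("utf-8"))
    # Assumption: keep the seed in 32-bit range after adding the user offset.
    return (base + offset) % 2**32


seed = seed_from_output("generated/tier/stp/sim/job_0000.lh5")
shifted = seed_from_output("generated/tier/stp/sim/job_0000.lh5", offset=7)
```

Tying the seed to the output path means that re-running the same job reproduces the same random stream, while distinct output files get distinct seeds with high probability.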
legendsimflow.confine module¶
- legendsimflow.confine._get_matching_volumes(volume_list, patterns)¶
Return volumes from volume_list whose names match patterns.
Wildcard patterns are supported via fnmatch.fnmatch().
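A minimal sketch of wildcard matching with `fnmatch.fnmatch()`, assuming that a volume is kept when it matches at least one pattern and that a single string is accepted in place of a list. `matching_volumes` is a hypothetical stand-in, and the volume names are invented for the example.

```python
from fnmatch import fnmatch


def matching_volumes(volume_list, patterns):
    """Return the volumes whose names match any of the given patterns."""
    # Accept a single pattern string as well as a sequence of patterns.
    if isinstance(patterns, str):
        patterns = [patterns]
    return [v for v in volume_list if any(fnmatch(v, p) for p in patterns)]


volumes = ["minishroud_tube_01", "minishroud_tube_02", "liquid_argon"]
matched = matching_volumes(volumes, "minishroud_tube*")
```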
- legendsimflow.confine.get_lar_minishroud_confine_commands(reg, pattern='minishroud_tube*', inside=True, lar_name='liquid_argon', outer_radius_in_mm=None, outer_height_in_mm=None)¶
Extract the commands for the LAr confinement inside/outside the NMS from the GDML.
- Parameters:
reg (Registry) – The registry describing the geometry.
pattern (str | Sequence[str]) – The pattern used to search for physical volumes of minishrouds.
inside (bool) – If True, generate points inside the minishroud (NMS) volumes; if False, exclude the minishroud volumes from the generation region.
lar_name (str) – The name of the physical volume of the LAr.
outer_radius_in_mm (float | None) – If provided, gives an outer radius for the confinement. Only supported for outside confinement (inside=False).
outer_height_in_mm (float | None) – If provided, gives an outer height for the confinement. Only supported for outside confinement (inside=False).
- Return type:
- Returns:
A list of confinement commands for remage.
legendsimflow.exceptions module¶
legendsimflow.hpge_pars module¶
- legendsimflow.hpge_pars._iter_noise_waveforms(raw_files, hit_files, lh5_group, dsp_config, dsp_output, *, threshold=5, length=1000, energy_var='cuspEmax_cal')¶
Yield noise waveforms one at a time without accumulating them all in memory.
Parameters are the same as get_noise_maxima_and_sample().
- legendsimflow.hpge_pars._lookup_generated_pars_file(l200data, metadata, runid, *, hit_tier_name='hit', pars_db=None)¶
- legendsimflow.hpge_pars._remove_outliers(data, sigma=5)¶
Remove elements more than sigma standard deviations from the mean.
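The outlier cut can be sketched with the standard library alone. This is an illustration under assumptions, not the package's code: a single pass is performed, the population standard deviation is used, and `remove_outliers` is a hypothetical stand-in that works on plain lists rather than arrays.

```python
from statistics import fmean, pstdev


def remove_outliers(data, sigma=5.0):
    """Keep only the values within ``sigma`` standard deviations of the mean."""
    mean = fmean(data)
    std = pstdev(data)  # population standard deviation (assumption)
    return [x for x in data if abs(x - mean) <= sigma * std]


values = [1.0] * 20 + [100.0]
cleaned = remove_outliers(values, sigma=3)
```

Note that the outlier itself inflates both the mean and the standard deviation, so a single pass can miss outliers in small samples.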
- legendsimflow.hpge_pars.build_aoe_res_func(function)¶
A/E resolution function builder.
- Return type:
- legendsimflow.hpge_pars.build_aoe_res_func_dict(l200data, metadata, runid, *, hit_tier_name='hit', aoe_res_pars=None)¶
Build A/E resolution functions for each HPGe detector in a LEGEND-200 run.
- Return type:
- Returns:
Mapping of HPGe name to A/E resolution as a function of energy, where
energy is expected in units of keV.
- Parameters:
l200data (str | Path) – The path to the L200 data production cycle.
metadata (LegendMetadata) – The metadata instance.
runid (str) – LEGEND-200 run identifier, must be of the form {EXPERIMENT}-{PERIOD}-{RUN}-{TYPE}.
hit_tier_name (str) – name of the hit tier. This is typically “hit” or “pht”.
aoe_res_pars (dict | AttrsDict | None) – from lookup_aoe_res_metadata().
- legendsimflow.hpge_pars.build_aoe_res_func_from_entry(meta)¶
Build a bound A/E resolution callable from a single metadata entry.
- legendsimflow.hpge_pars.build_energy_res_func(function)¶
Energy resolution function builder.
- Return type:
- legendsimflow.hpge_pars.build_energy_res_func_dict(l200data, metadata, runid, *, hit_tier_name='hit', energy_res_pars=None)¶
Build energy resolution functions for each HPGe detector in a LEGEND-200 run.
- Return type:
- Returns:
Mapping of HPGe name to energy resolution function (FWHM), where energy is
expected in units of keV.
- Parameters:
l200data (str | Path) – The path to the L200 data production cycle.
metadata (LegendMetadata) – The metadata instance.
runid (str) – LEGEND-200 run identifier, must be of the form {EXPERIMENT}-{PERIOD}-{RUN}-{TYPE}.
hit_tier_name (str) – name of the hit tier. This is typically “hit” or “pht”.
energy_res_pars (dict | AttrsDict | None) – from lookup_energy_res_metadata().
- legendsimflow.hpge_pars.build_energy_res_func_from_entry(meta)¶
Build a bound energy resolution callable from a single metadata entry.
- Parameters:
meta (dict | AttrsDict) – A single detector’s energy resolution metadata, with keys expression and parameters. Same format as one value from lookup_energy_res_metadata().
- Return type:
- Returns:
Callable that takes energy in keV and returns FWHM in keV.
- legendsimflow.hpge_pars.estimate_mean_aoe(popt, energy=1593)¶
Estimate the maximum aoe from the parameters of the current_pulse_model popt.
- Return type:
- legendsimflow.hpge_pars.fit_currmod(times_list, current_list)¶
Fit the model to multiple raw HPGe current pulses simultaneously.
Normalises each waveform by its peak amplitude and uses iminuit.Minuit to minimise the summed RMS residual across all waveforms simultaneously. Fitting multiple waveforms provides a more robust estimate of the pulse-shape parameters than fitting a single event.
- Parameters:
- Return type:
- Returns:
Tuple of the best-fit parameters (as a NumPy array), and arrays of the
best-fit model (time and current) evaluated around the peak.
- legendsimflow.hpge_pars.fit_noise_gauss(data, bins, *, fit_range=None, sigma_range=None)¶
Fit the data to a Gaussian to extract the resolution.
Performs a binned maximum likelihood fit using minuit.
- Parameters:
data (TypeAliasType) – an array of the data to fit.
bins (int) – The number of bins.
fit_result – The results of the iminuit fit.
fit_range (tuple | None) – The range to use for the fit; if None, it is determined from the data as +/- 5 standard deviations around the mean.
sigma_range (tuple | None) – The range of sigma values for the fit; if None, it is determined from the data.
- Return type:
- Returns:
The minuit object holding the fit results.
- legendsimflow.hpge_pars.get_current_pulse(raw_file, lh5_group, idx, dsp_config, dsp_output='curr_av', align='tp_aoe_max')¶
Extract the current pulse.
- Parameters:
lh5_group (str) – where to find the waveform table.
idx (int) – the index of the waveform to read.
dsp_config (str) – the dspeed configuration file defining the DSP processing chain to estimate the current pulse.
dsp_output (str) – the name of the DSP output corresponding to the current pulse.
align (str) – DSP value around which the pulses are aligned.
- Return type:
- legendsimflow.hpge_pars.get_current_pulses(raw_file_idx_pairs, lh5_group, dsp_config, dsp_output='curr_av', align='tp_aoe_max')¶
Extract current pulses for multiple events.
Calls get_current_pulse() for each (raw_file, idx) pair and returns the results as two parallel lists.
- Parameters:
raw_file_idx_pairs (list[tuple[Path | str, int]]) – list of (raw_file, idx) pairs.
lh5_group (str) – where to find the waveform table.
dsp_config (str | None) – the dspeed configuration file defining the DSP processing chain to estimate the current pulse.
dsp_output (str) – the name of the DSP output corresponding to the current pulse.
align (str | None) – DSP value around which the pulses are aligned.
- Return type:
tuple[list[ndarray[tuple[Any,...],dtype[TypeVar(_ScalarT, bound=generic)]]],list[ndarray[tuple[Any,...],dtype[TypeVar(_ScalarT, bound=generic)]]]]
- Returns:
times_list – list of timestep arrays.
current_list – list of current-value arrays.
- legendsimflow.hpge_pars.get_noise_maxima_and_sample(raw_files, hit_files, lh5_group, dsp_config, dsp_output, template, *, norm=1, sample_size=100, threshold=5, maximum_number=None, energy_var='cuspEmax_cal')¶
Compute waveform maxima on-the-fly, keeping only a small sample in memory.
This avoids storing all noise waveforms at once. Instead, it iterates through waveforms, computes the maximum of waveform + template for each, and only retains the first sample_size waveforms for plotting.
- Parameters:
raw_files (list) – List of paths to raw files.
hit_files (list) – List of paths to hit files.
lh5_group (str) – The name of the lh5_group to find the waveform table in.
dsp_config (str) – the dspeed configuration file defining the DSP processing chain to estimate the current pulse.
dsp_output (str) – the name of the DSP output corresponding to the current pulse.
template (TypeAliasType) – the current-pulse template waveform.
norm (float) – normalisation for the template.
sample_size (int) – number of waveforms to keep for plotting.
threshold (float) – energy threshold to apply to select the noise waveforms.
maximum_number (int | None) – maximum number of waveforms to process.
energy_var (str) – the name of the energy variable to use for thresholding.
- Return type:
tuple[ndarray[tuple[Any,...],dtype[TypeVar(_ScalarT, bound=generic)]],ndarray[tuple[Any,...],dtype[TypeVar(_ScalarT, bound=generic)]]]
- Returns:
sample_wfs – 2D array of the first sample_size waveforms (for plotting).
a_max – 1D array of the maximum of waveform + template for each waveform.
- legendsimflow.hpge_pars.get_waveform_maxima(template, noise_wfs, *, norm=1)¶
Extract the maximum of each waveform based on combining the template with each waveform in noise_wfs.
Note
The length of the template must be the same as the waveforms in noise_wfs
- legendsimflow.hpge_pars.lookup_aoe_res_metadata(l200data, metadata, runid, *, hit_tier_name='hit', pars_db=None)¶
Lookup the measured A/E resolution metadata from LEGEND-200 data.
The metadata refers to the following model:
\[\sigma_\text{A/E}(E) = \sqrt{a + (b/E)^c}\]
where \(E\) is in keV.
- Return type:
- Returns:
Mapping of HPGe name to metadata dictionary.
- Parameters:
l200data (str | Path) – The path to the L200 data production cycle.
metadata (LegendMetadata) – The metadata instance.
runid (str) – LEGEND-200 run identifier, must be of the form {EXPERIMENT}-{PERIOD}-{RUN}-{TYPE}.
hit_tier_name (str) – name of the hit tier. This is typically “hit” or “pht”.
pars_db (TextDB | None) – optional existing non-lazy instance of TextDB(".../path/to/prod/generated/par_{hit_tier_name}").
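The A/E resolution model quoted for this function, sqrt(a + (b/E)^c) with E in keV, can be evaluated as in the sketch below. The function name `aoe_resolution` and the parameter values are purely illustrative assumptions; real values come from the looked-up metadata.

```python
import math


def aoe_resolution(energy_kev: float, a: float, b: float, c: float) -> float:
    """Evaluate sigma_A/E(E) = sqrt(a + (b / E)**c) for E in keV."""
    return math.sqrt(a + (b / energy_kev) ** c)


# Made-up parameters, for illustration only.
sigma_aoe = aoe_resolution(1593.0, a=1e-4, b=10.0, c=2.0)
```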
- legendsimflow.hpge_pars.lookup_currmod_fit_data(hit_files, lh5_group, ewin_center=1593, ewin_width=10, max_waveforms=1, get_drift_time=True)¶
Extract the indices of the events to fit.
Considers events with abs(A/E) < 1.5 and finds up to max_waveforms events closest to the median drift time. Returns a list of (event_index, file_index) pairs, sorted from closest to farthest from the median, with at most max_waveforms entries, together with the full and selected drift-time arrays for diagnostic purposes.
- Parameters:
hit_files (list[str | Path]) – tier-hit files used to determine the best indices.
lh5_group (str) – where the tier-hit data is found in the files.
ewin_center (float) – center of the energy window to use for the event search (same units as in data).
ewin_width (float) – width of the energy window to use for the event search (same units as in data).
max_waveforms (int) – maximum number of waveforms to return.
get_drift_time (bool) – Read also drift time to select waveforms.
- Return type:
tuple[list[tuple[int,int]],ndarray[tuple[Any,...],dtype[TypeVar(_ScalarT, bound=generic)]],ndarray[tuple[Any,...],dtype[TypeVar(_ScalarT, bound=generic)]]]
- Returns:
pairs – list of (event_index, file_index) tuples, sorted by proximity to the median drift time.
all_dts – all drift-time values for events passing the energy and A/E cuts.
selected_dts – drift-time values for the selected subset of events.
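The selection of events closest to the median drift time can be sketched as below. It is a simplified illustration under assumptions: the energy and A/E cuts are taken as already applied, a flat list of drift times stands in for the per-file data, and `closest_to_median` is a hypothetical helper returning positions in that list.

```python
from statistics import median


def closest_to_median(drift_times, max_waveforms):
    """Return indices of up to ``max_waveforms`` values nearest the median."""
    med = median(drift_times)
    # Stable sort: ties keep their original relative order.
    order = sorted(range(len(drift_times)), key=lambda i: abs(drift_times[i] - med))
    return order[:max_waveforms]


dts = [100.0, 480.0, 510.0, 900.0, 495.0]  # invented drift times
selected = closest_to_median(dts, 2)
selected_dts = [dts[i] for i in selected]
```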
- legendsimflow.hpge_pars.lookup_currmod_fit_inputs(l200data, metadata, runid, hpge, hit_tier_name='hit', max_waveforms=100)¶
Find raw files, event indices and the DSP configuration file.
- Parameters:
l200data (str | Path) – The path to the L200 data production cycle.
metadata (LegendMetadata) – The metadata instance.
runid (str) – LEGEND-200 run identifier, must be of the form {EXPERIMENT}-{PERIOD}-{RUN}-{TYPE}.
hpge (str) – name of the HPGe detector.
hit_tier_name (str) – name of the hit tier. This is typically “hit” or “pht”.
max_waveforms (int) – maximum number of waveforms to return.
- Return type:
tuple[list[tuple[Path,int]],Path,ndarray[tuple[Any,...],dtype[TypeVar(_ScalarT, bound=generic)]],ndarray[tuple[Any,...],dtype[TypeVar(_ScalarT, bound=generic)]]]
- Returns:
raw_wf_pairs – list of (raw_file, event_index) pairs, up to max_waveforms.
dsp_cfg_file – path to the DSP configuration file.
all_dts – all drift-time values for events passing the energy and A/E cuts.
selected_dts – drift-time values for the selected subset of events.
- legendsimflow.hpge_pars.lookup_energy_res_metadata(l200data, metadata, runid, *, hit_tier_name='hit', pars_db=None)¶
Lookup the measured HPGe energy resolution metadata from LEGEND-200 data.
The metadata refers to the following model:
\[\text{FWHM}(E) = \sqrt{a + bE}\]
where \(E\) is in keV.
- Return type:
- Returns:
Mapping of HPGe name to metadata dictionary.
- Parameters:
l200data (str | Path) – The path to the L200 data production cycle.
metadata (LegendMetadata) – The metadata instance.
runid (str) – LEGEND-200 run identifier, must be of the form {EXPERIMENT}-{PERIOD}-{RUN}-{TYPE}.
hit_tier_name (str) – name of the hit tier. This is typically “hit” or “pht”.
pars_db (TextDB | None) – optional existing non-lazy instance of TextDB(".../path/to/prod/generated/par_{hit_tier_name}").
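The energy resolution model quoted for this function, FWHM(E) = sqrt(a + bE) with E in keV, can be evaluated as in the sketch below. The names `fwhm_kev` and `sigma_kev` and the parameter values are illustrative assumptions; the conversion to a Gaussian sigma uses the standard factor 2*sqrt(2*ln 2).

```python
import math


def fwhm_kev(energy_kev: float, a: float, b: float) -> float:
    """Evaluate FWHM(E) = sqrt(a + b * E), with E and the result in keV."""
    return math.sqrt(a + b * energy_kev)


def sigma_kev(energy_kev: float, a: float, b: float) -> float:
    """Convert the FWHM to a Gaussian standard deviation."""
    return fwhm_kev(energy_kev, a, b) / (2.0 * math.sqrt(2.0 * math.log(2.0)))


# Made-up parameters, for illustration only.
fwhm = fwhm_kev(2000.0, a=1.0, b=0.0015)
```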
- legendsimflow.hpge_pars.lookup_file_paths(l200data, runid, hit_tier_name)¶
Lookup the paths to the hit and raw files.
- Return type:
- legendsimflow.hpge_pars.lookup_psd_cut_values(l200data, metadata, runid, *, hit_tier_name='hit', pars_db=None)¶
Lookup the measured PSD cut values from LEGEND-200 data.
- Return type:
- Returns:
Mapping of HPGe name to metadata dictionary.
- Parameters:
l200data (str | Path) – The path to the L200 data production cycle.
metadata (LegendMetadata) – The metadata instance.
runid (str) – LEGEND-200 run identifier, must be of the form {EXPERIMENT}-{PERIOD}-{RUN}-{TYPE}.
hit_tier_name (str) – name of the hit tier. This is typically “hit” or “pht”.
pars_db (TextDB | None) – optional existing non-lazy instance of TextDB(".../path/to/prod/generated/par_{hit_tier_name}").
- legendsimflow.hpge_pars.plot_currmod_fit_result(t, A, model_t, model_A)¶
Plot the best fit results.
- Return type:
- legendsimflow.hpge_pars.plot_dt_selection(all_dts, selected_dts)¶
Plot the drift-time distribution and highlight the selected waveforms.
Draws a histogram of all drift-time values (passing the energy and A/E cuts) using the hist package and overlays a shaded band that spans the range of drift times of the events chosen for the current-pulse fit.
- Parameters:
- Return type:
- Returns:
fig – The matplotlib.figure.Figure.
ax – The matplotlib.axes.Axes.
- legendsimflow.hpge_pars.plot_gauss_fit(data, fit_result, fit_range=None, bins=100, nominal_val=None)¶
Plot the result of the Gaussian fit.
- Parameters:
data (TypeAliasType) – an array of the data to fit.
fit_result (Minuit) – the result of the Gaussian fit.
bins (int) – The number of bins.
fit_range (tuple | None) – The range to use for the fit; if None, it is determined from the data as +/- 5 standard deviations around the mean.
nominal_val (float | None) – The nominal mean to add as a line on the plot.
- Return type:
legendsimflow.metadata module¶
- legendsimflow.metadata._get_lh5_table(metadata, fname, hpge, tier, runid)¶
The correct LH5 table path.
Determines the correct path to a hpge detector table in tier tier.
- Return type:
- legendsimflow.metadata.decode_psd_usability(psd_usability_code)¶
Decode the PSD usability (see encode_psd_usability()).
- Return type:
- legendsimflow.metadata.decode_usability(usability_code)¶
Decode the HPGe usability (see encode_usability()).
- Return type:
- legendsimflow.metadata.encode_psd_usability(psd_usability)¶
Encode the PSD usability in an int.
- Return type:
- legendsimflow.metadata.encode_usability(usability)¶
Encode the HPGe usability in an int.
- Return type:
- legendsimflow.metadata.expand_runlist(metadata, runlist)¶
Expands a runlist as passed to the Simflow configuration.
A runlist is a list of:
- runids in the form accepted by is_runid();
- runlist DB queries in the form <tag>.<datatype>.<period> (see query_runlist_db()).
- legendsimflow.metadata.extract_integer(file_path)¶
Read a single integer from a file, stripping surrounding whitespace.
- Return type:
- legendsimflow.metadata.get_runlist(config, simid)¶
Gets the runlist assigned to a simulation.
If not overridden in the hit-tier simconfig, returns the global runlist stored in config.runlist.
- legendsimflow.metadata.get_sanitized_fccd(metadata, det_name)¶
Return the FCCD value for det_name, falling back to 1 mm if the FCCD field is absent.
- Parameters:
metadata (LegendMetadata) – LEGEND metadata database.
det_name (str) – Detector name.
- Return type:
- legendsimflow.metadata.get_simconfig(config, tier, simid=None, field=None)¶
Return the simulation configuration for the given tier and simid.
Raises SimflowConfigError if any key is not found.
- legendsimflow.metadata.get_tier_settings(config, tier)¶
Return the settings block for tier and the current experiment.
- Return type:
- legendsimflow.metadata.get_vtx_simconfig(config, simid)¶
Get the vertex generation configuration for a stp-tier simid.
Returns the vtx-tier generator requested by the stp-tier simulation with identifier simid.
- legendsimflow.metadata.is_runid(runid)¶
Whether a runid (run identifier) is correctly formatted.
It should be in the form <experiment>-<period>-<run>-<datatype> / XXX-pNN-rMMM-AAA, where XXX is any alphanumeric experiment identifier.
- Return type:
- legendsimflow.metadata.is_simid(simid)¶
Whether a simid (simulation identifier) is correctly formatted.
A valid simid must consist entirely of word characters (letters, digits, underscores) and hyphens, matching the pattern [-\w]+. Dots and other special characters are not allowed; in particular, dots are forbidden because they are used as the delimiter in the simlist format <tier>.<simid>.
- Return type:
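A minimal sketch of the simid check, assuming the whole string must match the documented pattern. `looks_like_simid` is a hypothetical stand-in; the package's own function may differ in details.

```python
import re

SIMID_RE = re.compile(r"[-\w]+")


def looks_like_simid(simid: str) -> bool:
    """True if ``simid`` consists only of word characters and hyphens."""
    # fullmatch rejects strings that contain any other character, e.g. a dot.
    return SIMID_RE.fullmatch(simid) is not None


ok = looks_like_simid("my-sim_01")
bad = looks_like_simid("hit.my-sim_01")
```

Rejecting dots keeps a simlist item such as `hit.my-sim_01` unambiguous: everything before the first dot is the tier.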
- legendsimflow.metadata.parse_runid(runid)¶
Extract runid fields.
Returns the experiment, period, run and datatype as a tuple. Period and run are integers.
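Parsing a runid into its fields can be sketched with a regular expression. The digit and letter counts below (two-digit period, three-digit run, three-letter datatype) are assumptions read off the XXX-pNN-rMMM-AAA form, and `parse_runid_sketch` is a hypothetical name.

```python
import re

RUNID_RE = re.compile(
    r"(?P<experiment>[A-Za-z0-9]+)-p(?P<period>\d{2})-r(?P<run>\d{3})-(?P<datatype>[A-Za-z]{3})"
)


def parse_runid_sketch(runid: str) -> tuple[str, int, int, str]:
    """Split a runid into (experiment, period, run, datatype)."""
    match = RUNID_RE.fullmatch(runid)
    if match is None:
        msg = f"malformed runid: {runid!r}"
        raise ValueError(msg)
    # Period and run are returned as integers, without the p/r prefixes.
    return (
        match["experiment"],
        int(match["period"]),
        int(match["run"]),
        match["datatype"],
    )


fields = parse_runid_sketch("l200-p03-r000-phy")
```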
- legendsimflow.metadata.query_runlist_db(metadata, query)¶
Query the runlist DB stored in legend-datasets.
Run expressions of the form r00n..r00m are automatically expanded into full run lists. If for example metadata.datasets.runlists.valid.phy.p02 == "r000..r002":
>>> query_runlist_db(metadata, "valid.phy.p02")
["l200-p02-r000-phy", "l200-p02-r001-phy", "l200-p02-r002-phy"]
- Parameters:
metadata (LegendMetadata) – LEGEND metadata instance.
query (str) – expression in the form <tag>.<datatype>.<period> (see contents of runlists.yaml in legend-datasets).
- Return type:
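The expansion of a run expression can be sketched as follows. This only illustrates the r00n..r00m rule on a single string; `expand_run_range` is a hypothetical helper, the three-digit zero padding is an assumption, and the runid is assembled by hand with the experiment, period and datatype taken from the example above.

```python
def expand_run_range(expr: str) -> list[str]:
    """Expand 'r000..r002' into ['r000', 'r001', 'r002']."""
    if ".." not in expr:
        return [expr]
    first, last = expr.split("..")
    # Drop the leading 'r' and iterate over the inclusive integer range.
    start, stop = int(first[1:]), int(last[1:])
    return [f"r{n:03d}" for n in range(start, stop + 1)]


runs = expand_run_range("r000..r002")
runids = [f"l200-p02-{run}-phy" for run in runs]
```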
- legendsimflow.metadata.reference_cal_run(metadata, runid)¶
The reference calibration run for runid.
Warning
This function does not account for dataflow overrides (e.g. calibration back-applying)!
- Return type:
- legendsimflow.metadata.runinfo(metadata, runid)¶
Get the datasets.runinfo entry for a LEGEND run identifier.
- Parameters:
metadata (LegendMetadata) – LEGEND metadata database.
runid (str) – a run identifier in the format <experiment>-<period>-<run>-<datatype>.
- Return type:
- legendsimflow.metadata.simpars(metadata, par, runid, experiment, default=<object object>)¶
Extract simflow parameters for a certain LEGEND run.
Queries the simflow parameters database stored under simprod.config.pars by experiment name experiment, parameter name par and LEGEND run identifier runid.
- Parameters:
metadata (LegendMetadata) – LEGEND metadata database.
par (str) – name of directory under metadata.simprod.config.pars.{experiment}. Can be a nested property, as in e.g. geds.opv.value. Both . and / are allowed as separators.
runid (str) – a run identifier in the format <experiment>-<period>-<run>-<datatype>.
experiment (str) – experiment identifier (e.g. l200cfg01, l1000dsg01). Selects the experiment-level subdirectory under simprod/config/pars/.
default (object) – value to return when the parameter directory is not found in the database or no validity entry matches runid. If not provided, such cases raise KeyError or LookupError. Other errors (e.g. malformed YAML) are always re-raised regardless of this argument.
- Return type:
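The `default=<object object>` in the signature suggests a sentinel default, which lets callers pass None as a legitimate fallback value. The sketch below shows that general pattern on a plain dictionary; `_MISSING` and `lookup_sketch` are hypothetical names and the lookup itself is not the package's database query.

```python
_MISSING = object()  # unique sentinel, distinct from any user-supplied value


def lookup_sketch(db: dict, key: str, default=_MISSING):
    """Return db[key], or ``default`` if one was given and the key is absent."""
    try:
        return db[key]
    except KeyError:
        # Without an explicit default, the original error propagates.
        if default is _MISSING:
            raise
        return default


value = lookup_sketch({"geds.opv": 4200}, "geds.opv")
fallback = lookup_sketch({}, "geds.opv", default=None)
```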
- legendsimflow.metadata.smk_hash_simconfig(config, wildcards, field=None, ignore=None, **kwargs)¶
Get the dictionary hash for use in Snakemake rules.
- legendsimflow.metadata.usability(metadata, det_name, runid, default=None)¶
Get the usability for analysis of det_name in run runid.
Looks for the analysis.usability metadata field in the channel map. By default, an error is thrown if no information is found. If default is set to a non-None value, it will be returned.
- Return type:
- legendsimflow.metadata.validate_simconfig_keys(simconfig, block=None)¶
Validate that all top-level keys of simconfig are valid simids.
Raises SimflowConfigError listing every invalid key if any are found.
legendsimflow.nersc module¶
- legendsimflow.nersc.dvs_ro(config, path)¶
Turn /global/... file paths to /dvs_ro/... on NERSC.
The input type is preserved.
Note
config must contain a nersc key mapped to a dictionary containing a dvs_ro: True key.
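The path swap can be sketched as a prefix replacement that returns the same type it was given. This is an illustration under assumptions: `to_dvs_ro` is a hypothetical helper, the check on the config is left out, and `PurePosixPath` is used so the example behaves the same on any platform.

```python
from pathlib import PurePosixPath


def to_dvs_ro(path):
    """Rewrite a leading /global/ to /dvs_ro/, preserving the input type."""
    text = str(path)
    if text.startswith("/global/"):
        text = "/dvs_ro/" + text[len("/global/"):]
    # Rebuild the result with the caller's type (str or a path class).
    return type(path)(text)


as_str = to_dvs_ro("/global/cfs/cdirs/data/file.lh5")
as_path = to_dvs_ro(PurePosixPath("/global/cfs/cdirs/data/file.lh5"))
```

Paths that do not start with `/global/` are returned unchanged.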
- legendsimflow.nersc.dvs_ro_snakemake(snakemake)¶
Swap the read-only filesystem path in all Snakemake input files.
This function is meant to be used in Snakemake scripts, where the Snakemake rule attributes (input, output, …) are accessible from the special object snakemake.
Warning
This function mutates the input snakemake object in place.
See also
- Return type:
- legendsimflow.nersc.is_scratch_enabled(config)¶
Whether the scratch folder is enabled in this workflow.
- Return type:
- legendsimflow.nersc.make_on_scratch(config, path)¶
Return tools to produce a file in the scratch folder and move it to the final destination.
Returns the temporary path in the scratch dir and a function that will move it back to the original destination.
- legendsimflow.nersc.on_scratch(config, path)¶
Return the path of the file in the scratch folder.
Also makes sure the parent folder exists.
- Return type:
legendsimflow.partitioning module¶
- legendsimflow.partitioning.partition_simstat(n_events, n_events_part, runlist)¶
Partition the simulation event statistics according to run livetime.
Returns the following dictionary:
job_000:
  l200-p03-r001-phy: [0, 300]  # interval includes its edges
  l200-p03-r002-phy: [301, 456]
job_001:
  l200-p03-r002-phy: [0, 200]
  l200-p03-r003-phy: [201, 156]
...
where the number of events of each job is partitioned in runs, such that the global event partitioning in n_events_part is respected.
- Parameters:
n_events (Mapping[str, int]) – mapping from simulation job to its number of simulation events.
  job_0000: 5000
  job_0001: 7000
  ...
n_events_part (Mapping[str, int]) – mapping from each considered run to its share of the total number of simulation events (summed over all jobs), with weights equal to the run livetime fraction.
  l200-p03-r001-phy: 300
  l200-p03-r002-phy: 456
  ...
  l200-<...>: tot_n_events
runlist (Iterable[str]) – list of runs in the form <experiment>-<period>-<run>-<datatype>.
- Return type:
legendsimflow.patterns module¶
Prepare pattern strings to be used in Snakemake rules.
Extra keyword arguments are typically interpreted as variables to be
substituted in the returned (structure of) strings. They are passed to
snakemake.io.expand().
Definitions:
simid: string identifier for the simulation run
simjob: one job of a simulation run (corresponds to one macro file and one output file)
jobid: zero-padded integer (i.e., a string) used to label a simulation job
- legendsimflow.patterns._expand(pattern, keep_list=False, **kwargs)¶
Expand a path pattern with Snakemake wildcards.
Returns a scalar unless keep_list is set.
- legendsimflow.patterns.benchmark_dtmap_filename(config, **kwargs)¶
The benchmark file path for drift time map generation for a detector and voltage.
- Return type:
- legendsimflow.patterns.benchmark_filename(config, **kwargs)¶
Formats a benchmark file path for a simid and jobid.
- Return type:
- legendsimflow.patterns.benchmark_tier_cvt_filename(config, **kwargs)¶
The benchmark file path for the cvt tier build for a simid.
- Return type:
- legendsimflow.patterns.benchmark_tier_pdf_filename(config, **kwargs)¶
The benchmark file path for the pdf tier build for a simid.
- Return type:
- legendsimflow.patterns.geom_config_filename(config, **kwargs)¶
The path to the geometry configuration YAML file for a tier and simid.
- Return type:
- legendsimflow.patterns.geom_gdml_filename(config, **kwargs)¶
The path to the GDML geometry file for a tier and simid.
- Return type:
- legendsimflow.patterns.geom_log_filename(config, **kwargs)¶
The log file path for geometry generation for a tier and simid.
- Return type:
- legendsimflow.patterns.input_currmod_evt_idx_file(config, **kwargs)¶
The path to the event index file used to extract current pulse waveforms.
- Return type:
- legendsimflow.patterns.input_simid_filenames(config, n_macros, **kwargs)¶
Returns the full path to n_macros input files for a simid.
Needed by script that generates all macros for a simid.
- legendsimflow.patterns.input_simjob_filename(config, **kwargs)¶
Returns the full path to the input file for a simid, tier and job index.
- Return type:
- legendsimflow.patterns.log_currmod_filename(config, **kwargs)¶
The log file path for current pulse model extraction for a detector and runid.
- Return type:
- legendsimflow.patterns.log_dtmap_filename(config, **kwargs)¶
The log file path for drift time map generation for a detector and voltage.
- Return type:
- legendsimflow.patterns.log_eresmod_filename(config, **kwargs)¶
The log file path for HPGe observables model extraction for a runid.
- Return type:
- legendsimflow.patterns.log_filename(config, **kwargs)¶
Formats a log file path for a simid and jobid.
- Return type:
- legendsimflow.patterns.log_simstat_part_filename(config, **kwargs)¶
The log file path for simulation event statistics partitioning for a simid.
- Return type:
- legendsimflow.patterns.log_tier_cvt_filename(config, **kwargs)¶
The log file path for the cvt tier build for a simid.
- Return type:
- legendsimflow.patterns.log_tier_pdf_filename(config, **kwargs)¶
The log file path for the pdf tier build for a simid.
- Return type:
- legendsimflow.patterns.output_aoeresmod_filename(config, **kwargs)¶
The path to the HPGe A/E resolution model parameter file for a runid.
- Return type:
- legendsimflow.patterns.output_currmod_filename(config, **kwargs)¶
The path to the per-detector HPGe current pulse model parameter file.
- Return type:
- legendsimflow.patterns.output_currmod_merged_filename(config, **kwargs)¶
The path to the merged HPGe current pulse model parameter file for a runid.
- Return type:
- legendsimflow.patterns.output_dtmap_filename(config, **kwargs)¶
The path to the HPGe drift time map file for a detector and voltage.
- Return type:
- legendsimflow.patterns.output_dtmap_merged_filename(config, **kwargs)¶
The path to the merged HPGe drift time map file for a runid.
- Return type:
- legendsimflow.patterns.output_eresmod_filename(config, **kwargs)¶
The path to the HPGe energy resolution model parameter file for a runid.
- Return type:
- legendsimflow.patterns.output_psdcuts_filename(config, **kwargs)¶
The path to the HPGe PSD cut values file for a runid.
- Return type:
- legendsimflow.patterns.output_simid_filenames(config, n_macros, **kwargs)¶
Returns the full path to n_macros output files for a simid.
- legendsimflow.patterns.output_simjob_filename(config, **kwargs)¶
Returns the full path to the output file for a simid, tier and job index.
- Return type:
- legendsimflow.patterns.output_simjob_regex(config, **kwargs)¶
A glob-style regex matching all output files for a tier.
- Return type:
- legendsimflow.patterns.output_tier_cvt_filename(config, **kwargs)¶
The path to the merged cvt tier output file for a simid.
- Return type:
- legendsimflow.patterns.output_tier_pdf_filename(config, **kwargs)¶
The path to the merged pdf tier output file for a simid.
- Return type:
- legendsimflow.patterns.pdf_tarball_filename(config)¶
The path to the pdf tier archive tarball for the current production cycle.
The Simflow has no explicit knowledge of the production cycle name, so the name of the directory where the Simflow lives is used as a proxy.
- Return type:
- legendsimflow.patterns.plot_currmod_filename(config, **kwargs)¶
The path to the current pulse model fit validation plot for a detector and runid.
- Return type:
- legendsimflow.patterns.plot_dtmap_filename(config, **kwargs)¶
The path to the drift time map validation plot for a detector and voltage.
- Return type:
- legendsimflow.patterns.plot_tier_cvt_observables_filename(config, **kwargs)¶
The path to the observable validation plot for a cvt simid.
- Return type:
- legendsimflow.patterns.plot_tier_hit_observables_filename(config, **kwargs)¶
The path to the observable validation plot for a hit simid.
- Return type:
- legendsimflow.patterns.plot_tier_opt_observables_filename(config, **kwargs)¶
The path to the observable validation plot for an opt simid.
- Return type:
- legendsimflow.patterns.plot_tier_stp_vertices_filename(config, **kwargs)¶
The path to the primary vertex validation plot for a stp simid.
- Return type:
- legendsimflow.patterns.plots_dirname(config, tier)¶
Returns the plots directory path for a tier.
- Return type:
- legendsimflow.patterns.plots_tarball_filename(config)¶
The path to the plots archive tarball for the current production cycle.
The Simflow has no explicit knowledge of the production cycle name, so the name of the directory where the Simflow lives is used as a proxy.
- Return type:
- legendsimflow.patterns.simjob_base_segment(config, **kwargs)¶
Formats a segment for a path including wildcards simid and jobid.
- Return type:
- legendsimflow.patterns.simstat_part_filename(config, **kwargs)¶
The path to the simulation event statistics partitioning file.
- Return type:
- legendsimflow.patterns.tier_cvt_base_segment(config, **kwargs)¶
The base filename segment for cvt tier files for a simid.
- Return type:
- legendsimflow.patterns.tier_pdf_base_segment(config, **kwargs)¶
The base filename segment for pdf tier files for a simid.
- Return type:
legendsimflow.plot module¶
- legendsimflow.plot.decorate(fig)¶
- legendsimflow.plot.n_nans(array)¶
- legendsimflow.plot.plot_hist(h, ax, n_nans=None, **kwargs)¶
- legendsimflow.plot.save_page(pdf, make_fig)¶
- legendsimflow.plot.set_empty(ax)¶
legendsimflow.profile module¶
legendsimflow.psl module¶
- legendsimflow.psl._check_pulse_shape_lib_keys(pulse_shape_lib)¶
Validate that the waveform map contains the required keys with correct types.
- Return type:
- legendsimflow.psl.align_waveforms_to_peak(wf_input, alignment_idx, nsamples_output_current_wfs)¶
Align an array of waveforms by shifting their maximum to a fixed index.
No normalization is performed; raw amplitudes are preserved.
Note
The output peak_indices is not the original drift time as the current waveform inherits baseline from convolution.
- Parameters:
- Return type:
- Returns:
shifted_wfs – 2D array of shifted waveforms
peak_indices – 1D array containing the original peak index for each current waveform
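A minimal NumPy sketch of peak alignment, under stated assumptions: `align_to_peak` is a hypothetical name, samples shifted outside the output range are dropped, and uncovered output samples are zero-padded. The padding choice is an assumption, not something the description above specifies.

```python
import numpy as np


def align_to_peak(wfs, alignment_idx, n_out):
    """Shift each row so its maximum lands at alignment_idx (zero-padded, assumed)."""
    wfs = np.asarray(wfs, dtype=float)
    peak_indices = np.argmax(wfs, axis=1)  # original peak position of each waveform
    shifted = np.zeros((wfs.shape[0], n_out))
    for i, peak in enumerate(peak_indices):
        shift = alignment_idx - peak  # source sample j goes to output sample j + shift
        src_lo = max(0, -shift)
        src_hi = min(wfs.shape[1], n_out - shift)
        if src_hi > src_lo:
            shifted[i, src_lo + shift:src_hi + shift] = wfs[i, src_lo:src_hi]
    return shifted, peak_indices


wfs = [[0, 1, 5, 1, 0], [5, 1, 0, 0, 0]]
shifted, peaks = align_to_peak(wfs, alignment_idx=3, n_out=6)
print(shifted)
print(peaks)
```

Amplitudes are copied unchanged, matching the statement that no normalization is performed.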
- legendsimflow.psl.apply_electronics_response(wf_array, rf_kernel, batch_size=50000)¶
Vectorized convolution using FFT with batching to save memory.
- Parameters:
- Return type:
Array
- Returns:
convolved_wfs – The convolved waveforms as an Awkward Array
- legendsimflow.psl.build_electronics_response_kernel(dt, mu_bandwidth, sigma_bandwidth, tau_rc, gaussian_only=False, *, kernel_length=600, kernel_start=-100)¶
Create the system response kernel (gaussian + exponential decay).
This is obtained by convolving a Gaussian (representing the digitizer bandwidth) with a causal exponential decay (representing the preamplifier response). The kernel is normalized to have a sum of 1.
Note
The ‘full’ mode of convolution results in a length of 2*kernel_length - 1. If gaussian_only is True, the kernel will have a length of kernel_length and will only contain the Gaussian component, since no convolution is performed.
- Parameters:
dt (float) – The time step between samples in the waveform (in ns)
mu_bandwidth (float) – The mean of the Gaussian representing the digitizer bandwidth (in ns)
sigma_bandwidth (float) – The standard deviation of the Gaussian representing the digitizer bandwidth (in ns)
tau_rc (float) – The time constant of the exponential decay representing the preamplifier response (in ns)
gaussian_only (bool) – If True, only use the Gaussian component (default is False)
kernel_length (int) – The total length of the response kernel in samples (default is 600)
kernel_start (int) – The starting index of the kernel relative to the waveform (default is -100, meaning the kernel will cover from -100 to 500 samples)
- Return type:
- Returns:
rf – The normalized response kernel
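The construction described above (a Gaussian convolved with a causal exponential decay, normalized to unit sum) can be illustrated with plain NumPy. This is a sketch under assumptions: `response_kernel` is a hypothetical name, and the way the sample grid is built from `kernel_start` and `dt` is a guess at the intent, not the package's code.

```python
import numpy as np


def response_kernel(dt, mu, sigma, tau, gaussian_only=False,
                    kernel_length=600, kernel_start=-100):
    """Gaussian (bandwidth) convolved with a causal exponential (preamp), unit sum."""
    # Sample times in ns: kernel_length samples starting at kernel_start (assumed grid).
    t = (np.arange(kernel_length) + kernel_start) * dt
    gauss = np.exp(-0.5 * ((t - mu) / sigma) ** 2)
    if gaussian_only:
        return gauss / gauss.sum()
    # Causal decay: zero before t = 0; clipping avoids large exponents at negative t.
    decay = np.where(t >= 0, np.exp(-np.clip(t, 0, None) / tau), 0.0)
    rf = np.convolve(gauss, decay, mode="full")  # length 2 * kernel_length - 1
    return rf / rf.sum()


rf = response_kernel(dt=1.0, mu=0.0, sigma=10.0, tau=50.0)
rf_gauss = response_kernel(dt=1.0, mu=0.0, sigma=10.0, tau=50.0, gaussian_only=True)
print(len(rf), len(rf_gauss))
```

The two lengths reflect the note above: the 'full' convolution yields 2*kernel_length - 1 samples, while the Gaussian-only kernel keeps kernel_length samples.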
- legendsimflow.psl.make_realistic_pulse_shape_lib(ideal_pulse_shape_lib_obj, rf_kernel, alignment_idx, nsamples_output_current_wfs, mw_pars, dt_data=1.0)¶
Apply the waveform post-processing chain to generate a realistic waveform map.
Starts from an ideal waveform map and performs the following steps:
Converts coordinates (m to mm)
Convolves with system response
Aligns by Peak Time
Calculates compensated Drift Time
- Parameters:
ideal_pulse_shape_lib_obj (Mapping[str, Array | Scalar]) – Mapping containing the ideal waveform map with coordinates and waveforms.
This should have the following format:
r: 1D array of radial coordinates
z: 1D array of axial coordinates
dt: Time step between samples in the waveforms
waveform_X: 3D array of ideal charge waveforms for angle X (shape: [n_z, n_r, n_samples])
rf_kernel (ndarray) – The system response kernel (from build_electronics_response_kernel())
alignment_idx (int) – The index in the output array where waveform peaks will be aligned
nsamples_output_current_wfs (int) – The total length of the resulting aligned current waveforms
mw_pars (dict[str, float | int]) – Dictionary of parameters for the moving window average step, with keys:
length: The length of the moving window (in samples)
num_mw: The number of moving windows to use in the moving window average
mw_type: The type of moving window to apply (see dspeed.processors.moving_window_multi for details)
dt_data (float) – The time step of the original data waveforms (in ns), used to scale the derivative.
- Return type:
- Returns:
realistic_pulse_shape_lib – Struct containing the processed realistic waveform map with the following keys:
r: 1D array of radial coordinates
z: 1D array of axial coordinates
t0: Global time offset applied to align waveforms
waveform_X: 3D array of processed current waveforms for angle X (shape: [n_r, n_z, nsamples_output_current_wfs], spatial axes reversed relative to Julia due to HDF5 column-/row-major conversion)
drift_time_X: 2D array of calculated drift times for angle X (shape: [n_r, n_z])
legendsimflow.reboost module¶
- legendsimflow.reboost._cluster_photoelectrons_flat(offsets, t, a, thr)¶
Numba-accelerated clustering kernel for innermost list level.
- Parameters:
- Return type:
- Returns:
out_t – Clustered times (first time in each cluster).
out_a – Clustered amplitudes (sum of amplitudes in each cluster).
counts – Number of clusters per original list.
- legendsimflow.reboost._listoffset_chain(layout)¶
Extract the chain of offsets from nested ListOffsetArrays.
- legendsimflow.reboost.cluster_photoelectrons(times, amps, thr)¶
Cluster photoelectrons within the instrument time resolution.
Clusters hits at axis=-1 (innermost lists) such that within each cluster the time span (last_time - first_time) does not exceed thr. This is useful for combining photoelectrons that arrive within the time resolution of the detector, treating them as a single detected event.
The output time is the first time of each cluster; the amplitude is the sum of all amplitudes in the cluster.
- Parameters:
times (Array) – Awkward array of hit times. Must be sorted in ascending order within each innermost list. Sorting is the caller’s responsibility; unsorted input produces undefined behavior.
amps (Array) – Awkward array of amplitudes corresponding to times. Must have the same structure (nesting depth and list lengths) as times.
thr (float) – Maximum time span within a cluster (e.g., the detector time resolution).
- Return type:
tuple[Array, Array]
- Returns:
clustered_times – Awkward array with the same nesting structure, containing the first time of each cluster.
clustered_amps – Awkward array with the same nesting structure, containing the summed amplitude of each cluster.
- Raises:
ValueError – If times and amps have different nesting depths or different numbers of elements.
Examples
>>> times = ak.Array([[0.0, 0.6, 1.1, 1.4, 2.3]])
>>> amps = ak.Array([[1.0, 2.0, 3.0, 4.0, 5.0]])
>>> t_out, a_out = cluster_photoelectrons(times, amps, thr=1.0)
>>> ak.to_list(t_out)
[[0.0, 1.1, 2.3]]
>>> ak.to_list(a_out)
[[3.0, 7.0, 5.0]]
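The clustering rule can be sketched in pure Python for a single flat, sorted list of hits. `cluster_flat` is a hypothetical helper written for illustration; it does not reproduce the Numba kernel or the handling of nested Awkward arrays.

```python
def cluster_flat(times, amps, thr):
    """Group sorted hits so that (last_time - first_time) <= thr within a cluster."""
    out_t, out_a = [], []
    for t, a in zip(times, amps):
        # out_t[-1] is the first time of the cluster currently being filled.
        if out_t and t - out_t[-1] <= thr:
            out_a[-1] += a
        else:
            out_t.append(t)
            out_a.append(a)
    return out_t, out_a


t_out, a_out = cluster_flat([0.0, 0.6, 1.1, 1.4, 2.3], [1.0, 2.0, 3.0, 4.0, 5.0], thr=1.0)
print(t_out)  # [0.0, 1.1, 2.3]
print(a_out)  # [3.0, 7.0, 5.0]
```

Because each hit is compared with the first time of the open cluster rather than with the previous hit, a cluster can never span more than thr.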
- legendsimflow.reboost.gauss_smear(arr_true, arr_reso)¶
Smear values with expected resolution.
Samples from a Gaussian distribution and shifts negative values to a fixed, tiny positive value.
- Return type:
Array
- legendsimflow.reboost.get_remage_detector_uids(h5file, *, lh5_table='stp')¶
Get mapping of detector names to UIDs from a remage output file.
The remage LH5 output files contain a link structure that lets the user access detector tables by UID. For example:
├── stp · struct{det1,det2,optdet1,optdet2,scint1,scint2}
└── __by_uid__ · struct{det001,det002,det011,det012,det101,det102}
    ├── det001 -> /stp/scint1
    ├── det002 -> /stp/scint2
    ├── det011 -> /stp/det1
    ├── det012 -> /stp/det2
    ├── det101 -> /stp/optdet1
    └── det102 -> /stp/optdet2
This function analyzes this structure and returns:
{1: 'scint1', 2: 'scint2', 11: 'det1', 12: 'det2', 101: 'optdet1', 102: 'optdet2'}
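The name-to-UID conversion can be illustrated without touching a file. In this sketch the link structure is given as a plain dictionary; `uids_from_links` is a hypothetical helper, and reading the links from the LH5/HDF5 file is deliberately left out.

```python
def uids_from_links(links):
    """Map integer UIDs to detector names from {'detNNN': '/stp/<name>'} links."""
    mapping = {}
    for link_name, target in links.items():
        if not link_name.startswith("det"):
            continue  # ignore entries that do not follow the detNNN pattern
        uid = int(link_name[len("det"):])  # 'det011' -> 11
        mapping[uid] = target.rsplit("/", 1)[-1]  # '/stp/det1' -> 'det1'
    return mapping


links = {
    "det001": "/stp/scint1",
    "det002": "/stp/scint2",
    "det011": "/stp/det1",
    "det012": "/stp/det2",
    "det101": "/stp/optdet1",
    "det102": "/stp/optdet2",
}
print(uids_from_links(links))
```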
- legendsimflow.reboost.get_remage_hit_range(tcm, det_name, uid, evt_idx_range)¶
Extract the range of remage output rows for an event range.
Queries the remage TCM (stored below /tcm in stp_file) with the input evt_idx_range = [i, j] to extract the first and last index of rows (hits) in the det_name detector table that correspond to the input event range. Returns the start index and number of rows to read after it as a tuple.
- Parameters:
tcm (Array) – Time-coincidence map.
det_name (str) – name of the detector table in stp_file.
uid (int) – remage unique identifier for detector det_name.
evt_idx_range (list[int]) – [first, last] (i.e. first included, last included) index of events of interest present in the remage output file. Only positive indices are supported.
- Return type:
- legendsimflow.reboost.hpge_corrected_drift_time(chunk, dt_map, det_loc)¶
HPGe drift time heuristic corrected for crystal axis effects.
Note
This function will be moved to reboost.
- Return type:
Array
- legendsimflow.reboost.hpge_max_current(edep, drift_time, currmod_pars, **kwargs)¶
Calculate the maximum of the current pulse.
- Parameters:
edep (Array) – energy deposited at each step.
drift_time (Array) – drift time of each energy deposit.
currmod_pars (Mapping) – dictionary storing the parameters of the current model (see reboost.hpge.psd.get_current_template())
kwargs – forwarded to reboost.hpge.psd.maximum_current().
- Return type:
Array
- legendsimflow.reboost.load_hpge_dtmaps(config, det_name, runid)¶
Load HPGe drift time maps from disk.
Automatically finds and loads drift time maps for crystal axes <100> and <110>. If no map is found, None is returned.
- Parameters:
- Return type:
dict[str, HPGeRZField] | None
Note
This function will be moved to reboost.
- legendsimflow.reboost.make_output_chunk(chunk)¶
Prepare output detector table chunk for the hit tier.
Note
This function will be moved to reboost.
- Return type:
- legendsimflow.reboost.smear_photoelectrons(array, fwhm_in_pe, rng=None)¶
Smear photoelectron pulse amplitudes.
Returns an array of Gaussian-distributed single-photoelectron amplitudes with the same shape as the input array.
- Return type:
Array
legendsimflow.spms_pars module¶
- legendsimflow.spms_pars._next_rc_evt_file(evt_files, rc_file_state)¶
Return the next evt file, cycling through the list before repeating.
- Parameters:
evt_files (Sequence[str | Path]) – Ordered sequence of evt file paths to cycle through.
rc_file_state (dict[str, Any]) – Mutable state dict shared across calls. On the first call it is populated with keys order (the file list), idx (current position, int), and completed_cycle (bool, set to True once the list has been exhausted once). Subsequent calls increment idx and wrap it when all files have been visited.
- Return type:
- Returns:
str | Path – Path to the next evt file to process.
- legendsimflow.spms_pars._process_spms_windows(time, energy, win_ranges, time_domain_ns, min_sep_ns)¶
Process SiPM data within specified window ranges.
Each (start, end) range in win_ranges is tiled with non-overlapping windows of length time_domain_ns[1] - time_domain_ns[0], separated by min_sep_ns. PE hits falling inside each window are selected and their times are shifted so that the window start maps to time_domain_ns[0].
The function works on arrays of any rank. For N-D input (e.g. shape (n_events, n_channels, n_pe)), each extracted window produces one output block of the same shape along all but the innermost axis, with only the PE dimension filtered. Blocks from all windows are then concatenated along axis=0, so M source events processed through W windows yield W * M output entries.
- Parameters:
time (Array) – PE hit times. Any shape; the innermost axis is the PE axis.
energy (Array) – PE energies, same shape as time.
win_ranges (Sequence[tuple[float, float]]) – List of (start, end) tuples defining the time ranges to tile, in nanoseconds.
time_domain_ns (tuple[float, float]) – Target time range (start, end) for output times in nanoseconds. The window length is end - start. E.g. (-1000, 5000) selects 6000 ns windows and maps their start to -1000 ns.
min_sep_ns (float) – Minimum gap between consecutive windows in nanoseconds.
- Return type:
tuple[Array, Array]
- Returns:
npe – PE energies extracted from all windows, concatenated along axis=0.
t0 – PE times relative to each window’s start (bounded by time_domain_ns), same shape as npe.
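The window tiling and time shift can be sketched for one flat list of hits. The helpers `tile_windows` and `extract_window` are hypothetical. The sketch assumes that min_sep_ns is the gap between the end of one window and the start of the next, and that windows are half-open; the actual edge handling may differ.

```python
def tile_windows(win_ranges, time_domain_ns, min_sep_ns):
    """Start times of the windows that fit entirely in each (start, end) range."""
    length = time_domain_ns[1] - time_domain_ns[0]
    starts = []
    for lo, hi in win_ranges:
        start = lo
        while start + length <= hi:
            starts.append(start)
            start += length + min_sep_ns  # assumed: gap measured end-to-start
    return starts


def extract_window(times, energies, start, time_domain_ns):
    """Keep hits inside one window; shift times so start maps to time_domain_ns[0]."""
    length = time_domain_ns[1] - time_domain_ns[0]
    kept = [(t - start + time_domain_ns[0], e)
            for t, e in zip(times, energies)
            if start <= t < start + length]
    return [t for t, _ in kept], [e for _, e in kept]


starts = tile_windows([(1_000, 44_000)], (-1_000, 5_000), 6_000)
t0, npe = extract_window([1_500, 8_000, 13_010], [1.0, 2.0, 1.0], starts[0], (-1_000, 5_000))
print(starts)
print(t0, npe)
```

Under these assumptions, the range (1000, 44000) holds four 6000 ns windows, and a hit 500 ns after the first window start is mapped to -500 ns.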
- legendsimflow.spms_pars.build_rc_evt_index_lookup(rc_evt_files)¶
Build per-file trigger index lookup for RC extraction.
- legendsimflow.spms_pars.get_chunk_rc_data(rc_evt_files, rc_file_state, chunk_size, rc_index_lookup)¶
Assemble random-coincidence data for one chunk.
- Parameters:
rc_evt_files (Sequence[str | Path]) – Ordered sequence of evt files that can provide random-coincidence data. Must not be empty.
rc_file_state (dict[str, Any]) – Mutable state for file cycling and carryover between chunks. Expected keys are created/updated internally (e.g. order, idx, counts, carryover).
chunk_size (int) – Number of random-coincidence events requested for the current chunk. Must be positive.
rc_index_lookup (dict[str, dict[str, ndarray]]) – Precomputed mapping from evt file to trigger-event indices, built with build_rc_evt_index_lookup().
- Return type:
Array
- Returns:
ak.Array – Random-coincidence data for one chunk with fields rawid (chunk_size, n_channels), npe (chunk_size, n_channels, n_pe) and t0 (same shape as npe).
- legendsimflow.spms_pars.get_rc_evt_mask(evt_file)¶
Compute boolean event masks for random-coincidence extraction.
- Parameters:
- Return type:
tuple[Array, Array]
- Returns:
mask_forced_pulser – Boolean mask selecting forced-trigger and pulser events, excluding muon coincidences.
mask_geds – Boolean mask selecting HPGe-triggered events, excluding muon coincidences.
- legendsimflow.spms_pars.get_rc_library(evt_file, rc_index_lookup, time_domain_ns=(-1000, 5000), min_sep_ns=6000, ext_trig_range_ns=((1000, 44000), (55000, 100000)), ge_trig_range_ns=((1000, 44000),))¶
Extract a library of random-coincidence (RC) events from an evt file.
To be used in correcting the SiPM photoelectrons with random coincidences.
For each qualifying trigger event, the SiPM waveform is divided into multiple non-overlapping time windows (see _process_spms_windows). Each window yields one independent RC event, so the total number of entries in the returned library is n_source_events x n_windows. The per-channel structure is preserved: npe and t0 have shape (n_rc_events, n_channels, n_pe) and rawid has shape (n_rc_events, n_channels), matching the spms/* layout of the evt tier.
Two trigger categories are processed with different window ranges to avoid contaminating RC events with physics signal:
Forced/pulser triggers: full waveform outside the central trigger window, ((1_000, 44_000), (55_000, 100_000)) ns by default.
HPGe/LAr triggers: first half only (before the trigger), ((1_000, 44_000),) ns by default.
Both categories are filtered to exclude muon coincidences.
- Parameters:
rc_index_lookup (dict[str, dict[str, ndarray]]) – Precomputed mapping from evt file to trigger-event indices, built with build_rc_evt_index_lookup.
time_domain_ns (tuple[float, float]) – Target time range (start, end) for output times in nanoseconds. E.g., (-1000, 5000) means output times will be in [-1000, 5000]. Default: (-1_000, 5_000).
min_sep_ns (float) – Minimal separation time between two windows in a trace, in nanoseconds. Default 6000.
ext_trig_range_ns (Sequence[tuple[float, float]]) – Window ranges for forced/pulser trigger events, as a sequence of (start, end) pairs in nanoseconds. Default: ((1_000, 44_000), (55_000, 100_000)).
ge_trig_range_ns (Sequence[tuple[float, float]]) – Window ranges for HPGe/LAr trigger events, as a sequence of (start, end) pairs in nanoseconds. Default: ((1_000, 44_000),).
- Return type:
Array
- Returns:
ak.Array – Record array with fields rawid (channel UIDs, shape (n_rc_events, n_channels)), npe (PE energies, shape (n_rc_events, n_channels, n_pe)), and t0 (times relative to each window start, same shape as npe). Channel ordering within each event matches the source spms/rawid ordering in evt_file.
- legendsimflow.spms_pars.lookup_evt_files(l200data, runid, evt_tier_name)¶
Look up the evt tier file paths for a given run.
legendsimflow.tcm module¶
- legendsimflow.tcm.build_tcm(hit_files, out_file)¶
Re-create the TCM table from remage.
Use remage fields evtid and t0 (the latter is assumed to be in nanoseconds) to build coincidences. The settings are identical to the remage built-in TCM settings.
- Return type:
- legendsimflow.tcm.merge_stp_n_opt_tcms(tcm_stp, tcm_opt, *, scintillator_uid)¶
Merge tcm_opt rows into tcm_stp at the scintillator uid.
For each axis=0 row of tcm_stp, if tcm_stp.table_key contains scintillator_uid, replace that single element by splicing in the next row of tcm_opt.table_key. The same splice is applied to row_in_table using the corresponding tcm_opt.row_in_table, preserving alignment between table_key[i][j] and row_in_table[i][j].
- Parameters:
tcm_stp – Awkward record arrays with fields table_key and row_in_table.
tcm_opt – Awkward record arrays with fields table_key and row_in_table.
scintillator_uid – Scalar value in tcm_stp.table_key marking where to splice in tcm_opt, i.e. the UID of the scintillator table.
- Returns:
ak.Array – Record array with the same length as tcm_stp.
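The splice described above can be sketched with plain nested lists in place of Awkward record arrays. `merge_tcm_rows` is a hypothetical helper; it takes the table_key and row_in_table fields as separate lists and assumes at most one occurrence of scintillator_uid per row.

```python
def merge_tcm_rows(stp_keys, stp_rows, opt_keys, opt_rows, scintillator_uid):
    """Replace the scintillator entry of each stp row by the next opt row."""
    merged_keys, merged_rows = [], []
    opt_iter = iter(zip(opt_keys, opt_rows))
    for keys, rows in zip(stp_keys, stp_rows):
        if scintillator_uid in keys:
            new_keys, new_rows = next(opt_iter)  # consume opt rows in order
            pos = keys.index(scintillator_uid)
            # Apply the same splice to both fields to keep them aligned.
            keys = keys[:pos] + new_keys + keys[pos + 1:]
            rows = rows[:pos] + new_rows + rows[pos + 1:]
        merged_keys.append(keys)
        merged_rows.append(rows)
    return merged_keys, merged_rows


keys, rows = merge_tcm_rows(
    stp_keys=[[11, 1], [12], [1, 12]],
    stp_rows=[[0, 0], [0], [1, 1]],
    opt_keys=[[101, 102], [101]],
    opt_rows=[[0, 0], [1]],
    scintillator_uid=1,
)
print(keys)  # [[11, 101, 102], [12], [101, 12]]
print(rows)  # [[0, 0, 0], [0], [1, 1]]
```

The output has one row per stp row, as stated for the returned record array.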
- legendsimflow.tcm.merge_stp_n_opt_tcms_chunk(tcm_stp, tcm_opt, *, scintillator_uid)¶
Chunk-level implementation of merge_stp_n_opt_tcms().
This function assumes tcm_opt contains exactly as many rows as there are rows in tcm_stp that contain scintillator_uid, in the same order.
- legendsimflow.tcm.merge_stp_n_opt_tcms_to_lh5(stp_file, opt_file, out_file, *, scintillator_uid, buffer_len='50*MB')¶
Stream-merge STP and OPT TCMs and write unified TCM to disk in chunks.
Iterates over stp_file:/tcm using LH5Iterator. For each chunk, reads only the required number of OPT TCM rows (those corresponding to STP rows containing the scintillator_uid placeholder) via lh5.read_as with explicit indices. The merged output is appended to out_file:/tcm.
- Return type:
legendsimflow.utils module¶
- legendsimflow.utils._curve_fit_popt_to_dict(popt)¶
Get the scipy.optimize.curve_fit() parameter results as a dictionary.
- Return type:
- legendsimflow.utils._make_path(d)¶
- legendsimflow.utils._merge_defaults(user, default)¶
Recursively merge default values into user configuration.
Merges values from default into user without overwriting existing user values. For nested dictionaries, performs recursive merge.
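A recursive merge of this kind can be sketched as follows. `merge_defaults` is a hypothetical re-implementation that returns a new dictionary; whether the package's function mutates its argument in place is not stated above, so that choice is an assumption.

```python
def merge_defaults(user, default):
    """Fill missing keys of user from default, recursing into nested dicts."""
    result = dict(user)
    for key, value in default.items():
        if key not in result:
            result[key] = value
        elif isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = merge_defaults(result[key], value)
        # Otherwise the user value wins and is left untouched.
    return result


user = {"paths": {"pars": "/data/pars"}, "benchmark": {"enabled": True}}
default = {"paths": {"pars": "pars", "macros": "macros"}, "benchmark": {"enabled": False, "n": 10}}
merged = merge_defaults(user, default)
print(merged)
```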
- legendsimflow.utils.add_field_string(name, chunk, data)¶
Add a string to the output table.
This is done in an HDF5-friendly way by storing the runid as a fixed-length string.
- Return type:
- legendsimflow.utils.apply_path_defaults(paths)¶
Set default values for optional path keys derived from paths.pars.
The following keys are optional in the Simflow configuration and, if absent, are derived from paths.pars:
geom: defaults to {paths.pars}/geom
dtmaps: defaults to {paths.pars}/hpge/dtmaps
- Parameters:
paths (dict) – The paths section of the Simflow configuration, with all values already converted to pathlib.Path objects.
- Return type:
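The two defaults listed above can be illustrated with pathlib. `with_path_defaults` is a hypothetical helper that returns a copy instead of modifying its input, and the example paths are made up.

```python
from pathlib import Path


def with_path_defaults(paths):
    """Derive the optional geom and dtmaps entries from paths['pars'] when absent."""
    out = dict(paths)
    out.setdefault("geom", out["pars"] / "geom")
    out.setdefault("dtmaps", out["pars"] / "hpge" / "dtmaps")
    return out


paths = with_path_defaults({"pars": Path("/prod/generated/par")})
print(paths["geom"])
print(paths["dtmaps"])

# An explicit user value is kept as-is.
custom = with_path_defaults({"pars": Path("/prod/generated/par"), "geom": Path("/other/geom")})
print(custom["geom"])
```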
- legendsimflow.utils.check_nans_leq(array, name, less_than_frac=0.1, min_entries=100)¶
Raise an exception if the fraction of NaN values in array is above threshold.
- Parameters:
array (TypeAliasType) – the array to analyze.
name (str) – array name for exception message.
less_than_frac (float) – raise exception if fraction of NaNs is above this threshold.
min_entries (int) – minimum number of entries required to apply the fraction check. With fewer entries, a warning is logged instead of raising an exception.
- Return type:
- legendsimflow.utils.get_dict_value(d, field, default=None)¶
Return a value from a nested dictionary using a dot-separated field path.
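A dot-separated lookup can be sketched in a few lines. `dict_value` is a hypothetical re-implementation; returning the default when an intermediate value is not a dictionary is an assumption about the intended behavior.

```python
def dict_value(d, field, default=None):
    """Walk a nested dict along a dot-separated path, e.g. 'geds.opv.value'."""
    current = d
    for key in field.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


config = {"geds": {"opv": {"value": 4200}}}
print(dict_value(config, "geds.opv.value"))
print(dict_value(config, "geds.missing.value", default="n/a"))
```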
- legendsimflow.utils.get_evt_tier_name(l200data)¶
Extract the name of the evt tier for this production cycle.
If the pet tier is present, it is used; otherwise the evt tier is used.
- legendsimflow.utils.get_hit_tier_name(l200data)¶
Extract the name of the hit tier for this production cycle.
If the pht tier is present, it is used; otherwise the hit tier is used.
- legendsimflow.utils.init_generated_pars_db(l200data, tier=None, lazy=True)¶
Initializes the pars database from a LEGEND-200 data production.
- legendsimflow.utils.init_simflow_context(raw_config, workflow=None, logger=None)¶
Pre-process and sanitize the Simflow configuration.
Returns a dictionary with useful objects to be used in the Simflow Snakefiles (i.e. the “context”):
set default configuration fields;
substitute $_ and environment variables;
convert to AttrsDict;
cast filesystem paths to pathlib.Path;
clone and configure legend-metadata;
attach a LegendMetadata instance to the Simflow configuration;
export important environment variables.
- Parameters:
raw_config (dict | AttrsDict | str | Path) – Simflow configuration mapping or path to a configuration file.
workflow – Snakemake workflow instance. If None, occurrences of $_ in the configuration will be replaced with the path to the current working directory.
logger (Logger | None) – Logger to use for status messages (e.g. the Snakemake logger when called from a Snakefile). Defaults to the module logger.
- Return type:
- legendsimflow.utils.link_external_paths(config, workflow_basedir, *, logger=None)¶
Symlink user-overridden paths back into their default locations.
When the user has manually overridden a paths.<key> entry in simflow-config.yaml to point outside the current production cycle (e.g. reusing the hit tier from another production), this function creates a symlink at the canonical default location pointing to the override. Snakemake rules keep reading config.paths.<key> directly; the symlink only exists so the prod cycle’s own generated/ tree shows the external data in the standard layout.
The default locations are computed from the simflow’s own template at <workflow_basedir>/../templates/default.yaml, with $_ substituted to the current working directory (the prod cycle root in the standard Snakemake invocation). Created symlinks are relative to the destination’s parent, keeping the prod cycle portable.
For each supported key:
if config.paths.<key> resolves to the default location, no override is in effect; any stale symlink at that location is removed;
otherwise a symlink is created (or refreshed) at the default location pointing to config.paths.<key>.
Real directories at the default location are never touched. The call is a safe no-op when <workflow_basedir>/../templates/default.yaml does not exist.
Supported keys (relative to paths): every tier.<name>, pars, macros, geom and dtmaps. Default paths for geom and dtmaps fall back to <pars>/geom and <pars>/hpge/dtmaps when absent from the template (mirroring apply_path_defaults()).
- Parameters:
config (AttrsDict) – Simflow configuration as returned by init_simflow_context().
workflow_basedir (str | Path) – Snakemake workflow basedir (workflow.basedir in a Snakefile). Used only to locate the simflow’s default template.
logger (Logger | None) – Logger to use for status messages (e.g. the Snakemake logger when called from a Snakefile). Defaults to the module logger.
- Return type:
- legendsimflow.utils.lookup_dataflow_config(l200data)¶
Finds and loads the dataflow configuration file.
- legendsimflow.utils.sanitize_dict_with_defaults(read_dict, defaults)¶
Swap-in defaults when values are illegal.
- Return type:
- legendsimflow.utils.setup_logdir_link(config)¶
Set up the timestamp-tagged directory for the workflow log files.
- legendsimflow.utils.sorted_by(subset, order)¶
Sort a sequence according to a specified order, dropping duplicates.
- Return type:
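One way to sort by a reference order while dropping duplicates is sketched below. `sorted_by_order` is a hypothetical helper; it assumes that items of subset missing from order are dropped, which the description above does not specify.

```python
def sorted_by_order(subset, order):
    """Return the items of subset in the sequence given by order, without duplicates."""
    wanted = set(subset)
    # dict.fromkeys keeps first occurrences, so duplicates in order are ignored too.
    return [item for item in dict.fromkeys(order) if item in wanted]


runs = ["l200-p03-r002-phy", "l200-p03-r000-phy", "l200-p03-r002-phy"]
order = ["l200-p03-r000-phy", "l200-p03-r001-phy", "l200-p03-r002-phy"]
print(sorted_by_order(runs, order))
```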