Base classes
General and WaveformModels
- class ActivationLSTMCell(input_size, hidden_size, gate_activation=<function hard_sigmoid>, recurrent_dropout=0)[source]
Bases:
Module
LSTM cell using a variable gating activation, by default hard sigmoid.
If gate_activation=torch.sigmoid, this is the standard LSTM cell.
Uses recurrent dropout strategy from https://arxiv.org/abs/1603.05118 to match Keras implementation.
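To illustrate the difference between the default gate activation and the standard one, here is a minimal sketch in plain Python. It assumes the classic Keras definition of the hard sigmoid, clip(0.2 * x + 0.5, 0, 1); the function names below are illustrative and are not taken from the library source.

```python
import math


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation of the sigmoid (classic Keras form, assumed)."""
    return min(1.0, max(0.0, 0.2 * x + 0.5))


def sigmoid(x: float) -> float:
    """Standard logistic sigmoid, shown for comparison."""
    return 1.0 / (1.0 + math.exp(-x))


# Compare both activations on a few gate pre-activations.
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={x:+.1f}  hard={hard_sigmoid(x):.3f}  sigmoid={sigmoid(x):.3f}")
```

The hard variant saturates exactly at 0 and 1 outside the interval [-2.5, 2.5], whereas the logistic sigmoid only approaches these values asymptotically.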
- forward(input, state)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class CustomLSTM(cell, *cell_args, bidirectional=True, **cell_kwargs)[source]
Bases:
Module
LSTM to be used with custom cells.
- forward(input, state=None)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class GroupingHelper(grouping)[source]
Bases:
object
A helper class for grouping streams for the annotate function. In most cases, no direct interaction with this class is required. However, when implementing new models, subclassing this helper allows for more flexibility.
- group_stream(stream, strict, min_length_s, comp_dict)[source]
Perform grouping of the input stream. In addition, enforces the strict mode, i.e., if strict=True, only keeps segments where all components are available, and discards segments that are too short. For grouping=channel no checks are performed.
- Parameters:
stream (Stream) – Input stream
strict (bool) – Whether streams should be treated strictly, as for waveform models. Only applied if grouping is “full”.
min_length_s (float) – Minimum length of a segment in seconds. Only applied if grouping is “full”.
comp_dict (dict[str, int]) – Mapping of component characters to int. Only used if grouping is “full”.
- Return type:
list[list[Trace]]
- Returns:
Grouped list of lists of traces.
- property grouping
- class SeisBenchModel(citation=None)[source]
Bases:
Module
Base SeisBench model interface for processing waveforms.
- Parameters:
citation (str, optional) – Citation reference, defaults to None.
- property citation
- property device
Returns the device of the model parameters. Assumes all parameters are on the same device.
- property dtype
Returns the dtype of the model parameters. Assumes all parameters are of the same dtype.
- classmethod from_pretrained(name, version_str='latest', update=False, force=False, wait_for_file=False)[source]
Load pretrained model with weights.
A set of pretrained model weights consists of two files: a weights file [name].pt and a config file [name].json. The config file can (and should) contain the following entries, even though all entries are optional:
“docstring”: A string documenting the pipeline. Usually also contains information on the author.
“model_args”: Argument dictionary passed to the init function of the pipeline.
“seisbench_requirement”: The minimal version of SeisBench required to use the weights file.
“default_args”: Default args for the annotate()/classify() functions. These arguments will supersede any potential constructor settings.
“version”: The version string of the model. For all but the latest version, version names should furthermore be denoted in the file names, i.e., the files should end with the suffix “.v[VERSION]”. If no version is specified in the json, the assumed version string is “1”.
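As an illustration of the config entries listed above, the following sketch builds a config dictionary with the Python standard library and reads it back. All values (docstring text, model arguments, version numbers) are hypothetical placeholders, not taken from any real weights file.

```python
import json

# Hypothetical config for a weights file; every value here is a placeholder.
config = {
    "docstring": "Example weights trained on a hypothetical dataset. Author: Jane Doe.",
    "model_args": {"in_channels": 3, "sampling_rate": 100},
    "seisbench_requirement": "0.3.0",
    "default_args": {"overlap": 0.5},
    "version": "2",
}

# Serialize as it might appear in [name].json and parse it again.
text = json.dumps(config, indent=2)
loaded = json.loads(text)

# If no version is given in the json, the assumed version string is "1".
version = loaded.get("version", "1")
print(version)
```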
Warning
Even though the version is set to “latest” by default, this will only use the latest version locally available. Only if no weight is available locally, the remote repository will be queried. This behaviour is implemented for privacy reasons, as it avoids contacting the remote repository for every call of the function. To explicitly update to the latest version from the remote repository, set update=True.
- Parameters:
name (str) – Model name prefix.
version_str (str) – Version of the weights to load. Either a version string or “latest”. The “latest” model is the model with the highest version number.
force (bool, optional) – Force execution of download callback, defaults to False
update (bool) – If true, downloads potential new weights file and config from the remote repository. The old files are retained with their version suffix.
wait_for_file (bool, optional) – Whether to wait on partially downloaded files, defaults to False
- Returns:
Model instance
- Return type:
- abstractmethod get_model_args()[source]
Obtain all model parameters for saving.
- Returns:
Dictionary of all parameters for a model to store during saving.
- Return type:
Dict
- classmethod list_pretrained(details=False, remote=True)[source]
Returns list of available pretrained weights and optionally their docstrings.
- Parameters:
details (bool) – If true, instead of returning only a list, also return their docstrings. By default, returns the docstring of the “latest” version for each weight. Note that this requires downloading the json files for each model in the background and is therefore slower. Defaults to false.
remote (bool) – If true, reports both locally available weights and versions in the remote repository. Otherwise only reports local versions.
- Returns:
List of available weights or dict of weights and their docstrings
- Return type:
list or dict
- classmethod list_versions(name, remote=True)[source]
Returns list of available versions for a given weight name.
- Parameters:
name (str) – Name of the queried weight
remote (bool) – If true, reports both locally available versions and versions in the remote repository. Otherwise only reports local versions.
- Returns:
List of available versions
- Return type:
list[str]
- classmethod load(path, version_str=None, **kwargs)[source]
Load a SeisBench model from local path.
For more information on the SeisBench model format, see save().
- Parameters:
path (pathlib.Path or str) – Define the path to the SeisBench model.
version_str (str, None) – Version string of the model. If none, no version string is appended.
- Returns:
Model instance
- Return type:
- property name
- save(path, weights_docstring='', version_str=None)[source]
Save a SeisBench model locally.
SeisBench models are saved in two parts under the output path ‘path’: the model configuration is stored in JSON format at [path].json, and the underlying model weights are stored in PyTorch format at [path].pt. The suffixes are appended to the path parameter automatically.
In addition, the models can have a version string which is appended to the json and the pt path. For example, setting version_str=”1” will append .v1 to the file names.
The model config should contain the following information, which is automatically created from the model instance state:
“weights_docstring”: A string documenting the pipeline. Usually also contains information on the author.
“model_args”: Argument dictionary passed to the init function of the pipeline.
“seisbench_requirement”: The minimal version of SeisBench required to use the weights file.
“default_args”: Default args for the annotate()/classify() functions.
Non-serializable arguments (e.g. functions) cannot be saved to JSON, so are not converted.
- Parameters:
path (pathlib.Path or str) – Define the path to the output model.
weights_docstring (str, defaults to '') – Documentation for the model weights (training details, author etc.)
version_str (str, None) – Version string of the model. If none, no version string is appended.
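The naming scheme described above can be sketched as follows. This is an illustration of the documented convention only, assuming the version suffix is appended after the .json and .pt suffixes; the helper model_file_names is hypothetical and not part of the library.

```python
from pathlib import Path


def model_file_names(path, version_str=None):
    """Return the (config, weights) file paths for a given output path.

    Hypothetical helper: appends ".json"/".pt" and, if a version string
    is given, the suffix ".v<version>".
    """
    suffix = f".v{version_str}" if version_str is not None else ""
    base = str(path)
    return Path(base + ".json" + suffix), Path(base + ".pt" + suffix)


print(model_file_names("models/mymodel"))
print(model_file_names("models/mymodel", version_str="1"))
```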
- to_preferred_device(verbose=False)[source]
Move the model to an accelerator if available. Currently, this function checks for CUDA, MPS and XPU accelerators (in this order).
The function does not automatically move models to TPU. Check out torch_xla to see how to move models to TPU.
- Parameters:
verbose (bool) – If true, prints the new device of the model.
- property weights_docstring
- property weights_version
- class WaveformModel(component_order=None, sampling_rate=None, output_type=None, default_args=None, in_samples=None, pred_sample=0, labels=None, filter_args=None, filter_kwargs=None, grouping='instrument', allow_padding=False, **kwargs)[source]
Bases:
SeisBenchModel, ABC
Abstract interface for models processing waveforms. Based on the properties specified by inheriting models, WaveformModel automatically provides the respective annotate()/classify() functions. Both functions take obspy streams as input. The annotate() function has a rather strictly defined output, i.e., it always outputs obspy streams with the annotations. These can for example be functions of pick probability over time. In contrast, the classify() function can tailor its output to the model type. For example, a picking model might output picks, while a magnitude estimation model might only output a scalar magnitude. Internally, classify() will usually rely on annotate() and simply add steps to its output.
For details see the documentation of these functions.
The following parameters are available for the annotate/classify functions (argument – description, followed by the default value):
batch_size – Batch size for the model. Default: 256
overlap – Overlap between prediction windows. Values between 0 and 1 are treated as fractions of the window length. Values above 1 are treated as sample counts. Only for window prediction models. Default: 0
stacking – Stacking method for overlapping windows (only for window prediction models). Options are ‘max’ and ‘avg’. Default: avg
stride – Stride in samples (only for point prediction models). Default: 1
strict – If true, only annotate if recordings for all components are available, otherwise impute missing data with zeros. Default: False
flexible_horizontal_components – If true, accepts traces with Z12 components as ZNE and vice versa. This is usually acceptable for rotationally invariant models, e.g., most picking models. Default: True
zerophase_resample – If true, the filter applied before resampling for anti-aliasing is zero-phase. Otherwise, uses a causal filter. Note that using a different filter in application than in training might cause small out-of-distribution issues. Default: True
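To make the overlap and stacking arguments more concrete, here is a simplified sketch in plain Python. It is not the library implementation: the helper names are hypothetical, the treatment of an overlap of exactly 1 as a sample count is an assumption, and windows are represented as plain lists of prediction values.

```python
def overlap_in_samples(overlap: float, in_samples: int) -> int:
    """Interpret overlap as a fraction of the window (below 1) or as samples."""
    if 0 <= overlap < 1:
        return int(overlap * in_samples)
    return int(overlap)


def stack_windows(windows, starts, total_length, stacking="avg"):
    """Combine overlapping prediction windows into one curve.

    windows: list of lists of prediction values
    starts: start sample of each window in the output curve
    """
    sums = [0.0] * total_length
    counts = [0] * total_length
    maxima = [float("-inf")] * total_length
    for window, start in zip(windows, starts):
        for offset, value in enumerate(window):
            i = start + offset
            sums[i] += value
            counts[i] += 1
            maxima[i] = max(maxima[i], value)
    if stacking == "avg":
        return [s / c if c else float("nan") for s, c in zip(sums, counts)]
    if stacking == "max":
        return [m if c else float("nan") for m, c in zip(maxima, counts)]
    raise ValueError(f"Unknown stacking method: {stacking}")


# Two windows of 4 samples that overlap by 2 samples.
windows = [[0.25, 0.5, 0.75, 1.0], [0.25, 0.0, 0.5, 0.25]]
print(stack_windows(windows, starts=[0, 2], total_length=6, stacking="avg"))
print(stack_windows(windows, starts=[0, 2], total_length=6, stacking="max"))
```

Averaging smooths disagreements between overlapping windows, while the maximum keeps the most confident prediction at each sample.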
Hint
Please note that the default parameters can be superseded by the pretrained model weights. Check model.default_args to see which parameters are overwritten.
- Parameters:
component_order (list | str | None) – Specify component order (e.g. ‘ZNE’), defaults to None.
sampling_rate (float | None) – Sampling rate of the model, defaults to None. If sampling rate is not None, the annotate and classify functions will automatically resample incoming traces and validate correct sampling rate if the model overwrites annotate_stream_pre().
output_type (str | None) – The type of output from the model. Current options are:
“point” for a point prediction, i.e., the probability of containing a pick in the window or of a pick at a certain location. This will provide an annotate() function. If a classify_aggregate() function is provided by the inheriting model, this will also provide a classify() function.
“array” for prediction curves, i.e., probabilities over time for the arrival of certain wave types. This will provide an annotate() function. If a classify_aggregate() function is provided by the inheriting model, this will also provide a classify() function.
“regression” for a regression value, i.e., the sample of the arrival within a window. This will only provide a classify() function.
default_args (dict[str, Any] | None) – Default arguments to use in annotate and classify functions
in_samples (int | None) – Number of input samples in time
pred_sample (int | tuple[int, int] | None) – For a “point” prediction: sample number of the sample in a window for which the prediction is valid. For an “array” prediction: a tuple of first and last sample defining the prediction range. Note that the number of output samples and input samples within the given range are not required to agree.
labels (str | list[str] | None) – Labels for the different predictions in the output, e.g., Noise, P, S. If a function is passed, it will be called for every label generation and be provided with the stats of the trace that was annotated.
filter_args (tuple | None) – Arguments to be passed to obspy.filter() in annotate_stream_pre()
filter_kwargs (dict[str, Any] | None) – Keyword arguments to be passed to obspy.filter() in annotate_stream_pre()
grouping (str | GroupingHelper) – Level of grouping for annotating streams. Supports “instrument”, “channel” and “full”. Alternatively, a custom GroupingHelper can be passed.
allow_padding (bool) – If True, annotate will pad different windows if they have different sizes. This is useful, for example, for multi-station methods.
kwargs – Kwargs are passed to the superclass
- annotate(stream, copy=True, **kwargs)[source]
Annotates an obspy stream using the model based on the configuration of the WaveformModel superclass. For example, for a picking model, annotate will give a characteristic function/probability function for picks over time. The annotate function contains multiple subfunctions, which can be overwritten individually by inheriting models to accommodate their requirements. These functions are:
annotate_stream_pre()
annotate_stream_validate()
annotate_batch_pre()
annotate_batch_post()
Please see the respective documentation for details on their functionality, inputs and outputs.
Hint
If your machine is equipped with an accelerator, e.g., a GPU, this function will usually run faster when making use of the accelerator. Just call model.to("cuda")/model.to("mps")/model.to("xpu") or use the function to_preferred_device() to automatically select the best device. In addition, you might want to increase the batch size by passing the batch_size argument to the function. Possible values might be 2048 or 4096 (or larger if your GPU permits).
Hint
All calls to annotate and classify will automatically resample the input data to the sampling rate of the model, if defined. When data is downsampled, this might involve an anti-alias filter. To control whether this filter is zero-phase, use the argument zerophase_resample. For more fine-grained control of the resampling process, manually resample the data before passing it to annotate.
Warning
Even though the asyncio implementation itself is not parallel, this does not guarantee that only a single CPU core will be used, as the underlying libraries (pytorch, numpy, scipy, …) might be parallelised. If you need to limit the parallelism of these libraries, check their documentation. Bear in mind that a lower number of threads might occasionally improve runtime performance, as it limits overheads.
- Parameters:
stream (obspy.core.Stream) – Obspy stream to annotate
copy (bool) – If true, copies the input stream. Otherwise, the input stream is modified in place.
kwargs
- Returns:
Obspy stream of annotations
- async annotate_async(stream, copy=True, **kwargs)[source]
annotate implementation based on asyncio
- Parameters:
stream – Obspy stream to annotate
copy – If true, copies the input stream. Otherwise, the input stream is modified in place.
kwargs – Additional arguments for annotation
- Return type:
Stream
- Returns:
Obspy stream of annotations
- annotate_batch_post(batch, piggyback, argdict)[source]
Runs postprocessing on the predictions of a window for the annotate function, e.g., reformatting them. By default, returns the original prediction. Inheriting classes should overwrite this function if necessary.
- Parameters:
batch (Tensor) – Predictions for the batch. The data type depends on the model.
argdict (dict[str, Any]) – Dictionary of arguments
piggyback (Any) – Piggyback information, by default None.
- Return type:
Tensor
- Returns:
Postprocessed predictions
- annotate_batch_pre(batch, argdict)[source]
Runs preprocessing on batch level for the annotate function, e.g., normalization. By default, returns the input batch unmodified. Optionally, this can return a tuple of the preprocessed batch and piggyback information that is passed to annotate_batch_post(). This can for example be used to transfer normalization information. Inheriting classes should overwrite this function if necessary.
- Parameters:
batch (Tensor) – Input batch
argdict (dict[str, Any]) – Dictionary of arguments
- Return type:
Tensor
- Returns:
Preprocessed batch and optionally piggyback information that is passed to annotate_batch_post()
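The piggyback mechanism can be illustrated with a small sketch that is independent of the library. It assumes a batch represented as nested Python lists instead of tensors; the functions batch_pre and batch_post are hypothetical stand-ins for overridden annotate_batch_pre() and annotate_batch_post() methods, without the argdict argument.

```python
def batch_pre(batch):
    """Normalize each window by its peak amplitude.

    Returns the normalized batch and the peaks as piggyback information.
    """
    # Fall back to 1.0 for all-zero windows to avoid division by zero.
    peaks = [max(abs(v) for v in window) or 1.0 for window in batch]
    normalized = [[v / p for v in window] for window, p in zip(batch, peaks)]
    return normalized, peaks


def batch_post(predictions, piggyback):
    """Undo the normalization using the peaks stored by batch_pre."""
    return [[v * p for v in window] for window, p in zip(predictions, piggyback)]


batch = [[1.0, -4.0, 2.0], [0.0, 0.0, 0.0]]
normalized, peaks = batch_pre(batch)
restored = batch_post(normalized, peaks)
print(normalized, peaks, restored)
```

Passing the peaks along as piggyback information is what would allow, for example, an amplitude-dependent model output to be rescaled after the forward pass.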
- annotate_stream_pre(stream, argdict)[source]
Runs preprocessing on stream level for the annotate function, e.g., filtering or resampling. By default, this function will resample all traces if a sampling rate for the model is provided. Furthermore, if a filter is specified in the class, the filter will be executed. As annotate creates a copy of the input stream, this function can safely modify the stream in place. Inheriting classes should overwrite this function if necessary. To keep the default functionality, a call to the overwritten method can be included.
- Parameters:
stream (obspy.Stream) – Input stream
argdict – Dictionary of arguments
- Returns:
Preprocessed stream
- annotate_stream_validate(stream, argdict)[source]
Validates stream for the annotate function. This function should raise an exception if the stream is invalid. By default, this function will check if the sampling rate fits the provided one, unless it is None, and check for mismatching traces, i.e., traces covering the same time range on the same instrument with different values. Inheriting classes should overwrite this function if necessary. To keep the default functionality, a call to the overwritten method can be included.
- Parameters:
stream (obspy.Stream) – Input stream
argdict – Dictionary of arguments
- Returns:
None
- classify(stream, parallelism=None, **kwargs)[source]
Classifies the stream. The classification can contain any information, but should be consistent with existing models.
- Parameters:
stream (obspy.core.Stream) – Obspy stream to classify
kwargs
- Return type:
- Returns:
A classification for the full stream, e.g., a list of picks or the source magnitude.
- classify_aggregate(annotations, argdict)[source]
An aggregation function that converts the annotation streams returned by annotate() into a classification. A classification consists of a ClassifyOutput, essentially a namespace that can hold an arbitrary set of keys. However, when implementing a model which already exists in similar form, we recommend using the same output format. For example, all pick outputs should have the same format.
- Parameters:
annotations – Annotations returned from annotate()
argdict – Dictionary of arguments
- Return type:
- Returns:
Classification object
- async classify_async(stream, **kwargs)[source]
Async interface to the classify() function. See details there.
- Return type:
- classify_stream_pre(stream, argdict)[source]
Runs preprocessing on stream level for the classify function, e.g., subselecting traces. By default, this function will simply return the input stream. In contrast to annotate_stream_pre(), this function operates on the original input stream. The stream should therefore not be modified in place. Note that annotate_stream_pre() will be executed on the output of this function within the classify() function.
- Parameters:
stream (obspy.Stream) – Input stream
argdict – Dictionary of arguments
- Returns:
Preprocessed stream
- property component_order
- static detections_from_annotations(annotations, threshold)[source]
Converts the annotation streams for a single phase to discrete detections using a classical trigger on/off. The lower threshold is set to half the higher threshold. Detections are represented by Detection objects. The detection start_time and end_time are set to the trigger on and off times.
- Parameters:
annotations – Stream of annotations
threshold (float) – Higher threshold for trigger
- Return type:
- Returns:
List of detections
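The classical trigger on/off scheme with a lower threshold at half the higher threshold can be sketched in plain Python as follows. This is a simplified illustration operating on a list of values and returning sample indices; the function name and the exact on/off index conventions are assumptions and may differ from the library's behaviour.

```python
def trigger_on_off(values, threshold_on, threshold_off=None):
    """Return (on, off) sample index pairs for a trigger with hysteresis.

    The trigger switches on when a value exceeds threshold_on and off
    when a value falls below threshold_off (default: half of threshold_on).
    """
    if threshold_off is None:
        threshold_off = threshold_on / 2
    detections = []
    start = None
    for i, value in enumerate(values):
        if start is None and value > threshold_on:
            start = i  # trigger switches on
        elif start is not None and value < threshold_off:
            detections.append((start, i))  # trigger switches off
            start = None
    if start is not None:
        # Trigger still active at the end of the curve.
        detections.append((start, len(values) - 1))
    return detections


curve = [0.0, 0.1, 0.6, 0.9, 0.4, 0.3, 0.2, 0.0, 0.7, 0.8]
print(trigger_on_off(curve, threshold_on=0.5))
```

Using a lower off-threshold prevents a single detection from being split when the curve dips briefly below the on-threshold.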
- get_model_args()[source]
Obtain all model parameters for saving.
- Returns:
Dictionary of all parameters for a model to store during saving.
- Return type:
Dict
- static picks_from_annotations(annotations, threshold, phase)[source]
Converts the annotation streams for a single phase to discrete picks using a classical trigger on/off. The lower threshold is set to half the higher threshold. Picks are represented by Pick objects. The pick start_time and end_time are set to the trigger on and off times.
- Parameters:
annotations – Stream of annotations
threshold – Higher threshold for trigger
phase – Phase to label, only relevant for output phase labelling
- Return type:
- Returns:
List of picks
- static resample(stream, sampling_rate, zerophase=True)[source]
Perform inplace resampling of stream to a given sampling rate.
- Parameters:
stream (Stream) – Input stream
sampling_rate (float) – Sampling rate (sps) to resample to
zerophase (bool) – If true, use a zero-phase filter for antialiasing, otherwise a causal filter.
- static sanitize_mismatching_overlapping_records(stream)[source]
Detects if for any id the stream contains overlapping traces that do not match. If yes, all mismatching parts are removed and a warning is issued.
- Parameters:
stream (obspy.core.Stream) – Input stream
- Returns:
The stream object without mismatching traces
- Return type:
obspy.core.Stream
- stream_to_array(traces, argdict)[source]
Converts streams into a start time and a numpy array. Assumes:
All traces within a group can be put into an array, i.e., the strict parameter is already enforced. Every remaining gap is intended to be filled with zeros. The selection/cutting of intervals has already been done by GroupingHelper.group_stream().
No overlapping traces of the same component exist
All traces have the same sampling rate
- Parameters:
stream (obspy.core.Stream) – Input stream
argdict (
dict) – Dictionary of arguments
- Return type:
GroupedTraceData- Returns:
output_times: Start times for each array
- Returns:
output_data: Arrays with waveforms
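The idea of placing traces into a common array and filling remaining gaps with zeros can be sketched as below. This is a simplified, library-independent illustration: traces are represented as hypothetical (component, start_sample, samples) tuples rather than obspy traces, and all traces are assumed to share one sampling rate and not to overlap within a component.

```python
def traces_to_array(traces, components="ZNE"):
    """Place traces into a zero-initialized array with one row per component.

    traces: list of (component, start_sample, samples) tuples
    Returns the common start sample and the nested-list array.
    """
    start = min(t0 for _, t0, _ in traces)
    end = max(t0 + len(samples) for _, t0, samples in traces)
    data = [[0.0] * (end - start) for _ in components]
    for component, t0, samples in traces:
        row = components.index(component)
        first = t0 - start
        data[row][first:first + len(samples)] = samples
    return start, data


traces = [
    ("Z", 10, [1.0, 2.0, 3.0]),
    ("N", 11, [4.0, 5.0]),
    ("Z", 14, [6.0]),  # gap at sample 13 on Z stays zero
]
start, data = traces_to_array(traces)
print(start, data)
```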
- class WaveformPipeline(components, citation=None)[source]
Bases:
ABC
A waveform pipeline is a collection of models that together expose an annotate() and a classify() function. Examples of waveform pipelines would be multi-step picking models, conducting first a detection with one model and then a pick identification with a second model. This could also easily be extended by adding further models, e.g., estimating magnitude for each detection.
In contrast to WaveformModel, a waveform pipeline is not a pytorch module and has no forward function. This also means that all components of a pipeline will usually be trained separately. As a rule of thumb, if the pipeline can be trained end to end, it should most likely rather be a WaveformModel. For a waveform pipeline, the annotate() and classify() functions are not automatically generated, but need to be implemented manually.
Waveform pipelines offer functionality for downloading pipeline configurations from the SeisBench repository. Similarly to SeisBenchModel, waveform pipelines expose a from_pretrained() function that will download the configuration for a pipeline and its components.
To implement a waveform pipeline, this class needs to be subclassed. This class will throw an exception when trying to instantiate.
Warning
In contrast to SeisBenchModel, this class does not yet feature versioning for weights. By default, all underlying models will use the latest, locally available version. This functionality will eventually be added. Please raise an issue on GitHub if you require this functionality.
- Parameters:
components (dict [str, SeisBenchModel]) – Dictionary of components contained in the model. This should contain all models used in the pipeline.
citation (str, optional) – Citation reference, defaults to None.
- property citation
- abstractmethod classmethod component_classes()[source]
Returns a mapping of component names to their classes. This function needs to be defined in each pipeline, as it is required to load configurations.
- Returns:
Dictionary mapping component names to their classes.
- Return type:
Dict[str, SeisBenchModel classes]
- property docstring
- classmethod from_pretrained(name, force=False, wait_for_file=False)[source]
Load pipeline from configuration. Automatically loads the pretrained weights of all dependent models.
A pipeline configuration is a json file. On the top level, it has three entries:
“components”: A dictionary listing all contained models and the pretrained weight to use for this model. The instances of these classes will be created using the from_pretrained() method. The components need to match the components from the dictionary returned by component_classes().
“docstring”: A string documenting the pipeline. Usually also contains information on the author.
“model_args”: Argument dictionary passed to the init function of the pipeline. (optional)
- Parameters:
name (str) – Configuration name
force (bool, optional) – Force execution of download callback, defaults to False
wait_for_file (bool, optional) – Whether to wait on partially downloaded files, defaults to False
- Returns:
Pipeline instance
- Return type:
- classmethod list_pretrained(details=False)[source]
Returns list of available configurations and optionally their docstrings.
- Parameters:
details (bool) – If true, instead of returning only a list, also return their docstrings. Note that this requires downloading the json files for each model in the background and is therefore slower. Defaults to false.
- Returns:
List of available weights or dict of weights and their docstrings
- Return type:
list or dict
- property name
DASModels
- class DASAnnotateCallback[source]
Bases:
ABC
This abstract class describes the interface for callbacks used in the DAS annotate method. Callbacks will get streaming outputs from the annotate method, containing the different chunks after processing with the deep learning model. Different callbacks are available, e.g., for picking or for writing the full output. To implement a new callback, inherit from this class and implement the methods. Callbacks are stateful, allowing them, for example, to handle overlaps between adjacent chunks.
- finalize()[source]
Finalize step for the callback. This is called after the last chunk is processed and can be used to generate the final results based on the intermediate results processed in each chunk.
The finalize step is optional.
- Return type:
None
- abstractmethod get_results_dict()[source]
This method returns a dictionary with the results of the callback. It is used to generate the ClassifyOutput when using the callback through classify.
- Return type:
dict[str,Any]
- abstractmethod handle_patch(annotations, in_coords, out_coords)[source]
This method is called for each patch of the output after processing it with the deep learning model. Results inferred from this step should be stored in class variables.
- Return type:
None
- setup(data, patching_structure, annotate_keys)[source]
Setup step for the callback. This is called before the first chunk is processed and can be used to initialize state variables, e.g., the shape of the output or arrays for intermediate results.
The setup step is optional, however, it is usually good practice to reset all state variables in the setup step.
- Return type:
None
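The callback lifecycle described above (setup, one handle_patch call per patch, finalize, get_results_dict) can be sketched without the library as follows. The base class AnnotateCallback is a hypothetical stand-in that only mirrors the documented method names, and representing annotations as a dictionary of nested lists per annotation key is an assumption made for illustration.

```python
from abc import ABC, abstractmethod


class AnnotateCallback(ABC):
    """Hypothetical stand-in mirroring the documented callback interface."""

    def setup(self, data, patching_structure, annotate_keys):
        pass  # optional

    @abstractmethod
    def handle_patch(self, annotations, in_coords, out_coords):
        ...

    def finalize(self):
        pass  # optional

    @abstractmethod
    def get_results_dict(self):
        ...


class MaxValueCallback(AnnotateCallback):
    """Tracks the largest annotation value per key across all patches."""

    def setup(self, data, patching_structure, annotate_keys):
        # Reset all state so the callback can be reused.
        self.maxima = {key: float("-inf") for key in annotate_keys}

    def handle_patch(self, annotations, in_coords, out_coords):
        for key, patch in annotations.items():
            patch_max = max(max(row) for row in patch)
            self.maxima[key] = max(self.maxima[key], patch_max)

    def get_results_dict(self):
        return {"maxima": dict(self.maxima)}


callback = MaxValueCallback()
callback.setup(data=None, patching_structure=None, annotate_keys=["P", "S"])
callback.handle_patch({"P": [[0.1, 0.7]], "S": [[0.2, 0.3]]}, None, None)
callback.handle_patch({"P": [[0.4, 0.5]], "S": [[0.9, 0.0]]}, None, None)
callback.finalize()
results = callback.get_results_dict()
print(results)
```

Resetting the state in setup, as recommended above, keeps results from a previous run from leaking into the next one.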
- class DASModel(dt_range=None, dx_range=None, patching_structure=None, buffer_queue_size=8, annotate_forward_kwargs=None, annotate_keys=None, default_args=None, fk_filter_args=None, filter_samples=None, **kwargs)[source]
Bases:
SeisBenchModel, ABC
This is the base class for all models processing DAS data.
Hint
If you are an end-user looking to apply pretrained models, you most likely won’t interact with this class directly. Instead, you will use classes inheriting from this class and their annotate() and classify() functions. If you aim to develop your own model, you should inherit from this class and have a look at the details below.
Hint
When calling annotate or classify, the model can perform automatic resampling along both axes. This ensures that the model can be flexibly applied to data of different sampling rates and channel spacings. However, as models are typically stable with respect to small changes in sampling rate and channel spacing, this class allows for a range of sampling rates and channel spacings to be specified. When called on data that does not fall into this range, the model will search for the smallest set of integers for upsampling and downsampling. The resampling is done using scipy.signal.resample_poly. To get the exact resampling ratio for a particular input array, check the function get_resample_ratio().
- Parameters:
patching_structure (PatchingStructure | None) – The structure of the patches to cut for annotation. If None, the function get_patching_structure() needs to be implemented, allowing to dynamically adjust the patching structure to the input data.
dt_range (tuple[float, float] | None) – Admissible range for the time step of data to be processed. This value is only taken into account for the execution of the annotate/classify functions. See the above hint on the resampling behavior. Values are in seconds.
dx_range (tuple[float, float] | None) – Same as dt_range but along the channel axis. Values are in meters.
buffer_queue_size (int) – Maximum number of chunks to keep in the intermediate buffers.
annotate_forward_kwargs (dict[str, Any] | None) – Additional keyword arguments to pass to the forward method of the model when running annotate/classify.
annotate_keys (list[str] | None) – List of annotation keys to read from the output.
default_args (dict[str, Any] | None) – Default arguments for the optional keyword arguments of annotate/classify.
fk_filter_args (dict[str, Any] | None) – Arguments for the F-k filter. See FKFilter for details.
filter_samples (tuple[str, dict[str, Any]] | None) – Filter to apply along the sample axis. See VirtualTransformedDataArray for details.
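The search for small integer resampling factors mentioned in the hint above can be sketched as follows. This is only one plausible reading, assuming "smallest" means the smallest sum of the upsampling and downsampling factors; the function find_resample_ratio and its max_factor limit are hypothetical, and the criterion used by get_resample_ratio() may differ. With polyphase resampling by (up, down), the new time step is dt * down / up.

```python
def find_resample_ratio(dt, dt_range, max_factor=100):
    """Find small integers (up, down) so that dt * down / up lies in dt_range.

    Candidates are searched in order of increasing up + down, so the first
    match uses the smallest factors (and is automatically in lowest terms).
    """
    low, high = dt_range
    if low <= dt <= high:
        return 1, 1  # no resampling required
    for total in range(3, 2 * max_factor + 1):
        for up in range(1, total):
            down = total - up
            if low <= dt * down / up <= high:
                return up, down
    raise ValueError("No admissible resampling ratio found")


# Data sampled at 250 Hz (dt = 0.004 s), model accepts roughly 100 Hz.
print(find_resample_ratio(0.004, (0.009, 0.011)))
# Data sampled at 50 Hz (dt = 0.02 s) needs upsampling by a factor of 2.
print(find_resample_ratio(0.02, (0.009, 0.011)))
```

The same search would apply along the channel axis with dx and dx_range.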
- static calc_output_shape_and_coordinates(da, patching_structure)[source]
Calculates the shape and coordinate axis of the output array after processing with the given patching structure. In case the output shape would be fractional, an extra sample is added to the output array along the corresponding axis.
- Return type:
tuple[tuple[int,int],dict[str,InterpCoordinate]]
- async classify_async(data, **kwargs)[source]
The classify method is used to process the data and apply the default callback. The kwargs are split into two groups: those that are passed to the callback and those that are passed to the annotate method.
- Return type:
- property classify_callback: Type[DASAnnotateCallback]
Return the default callback for this model. For example, for picking models, this would be a DASPickingCallback. The class will then be instantiated and used to process the output of the annotate method. Constructor arguments will be extracted from the kwargs passed to classify.
- get_model_args()[source]
Obtain all model parameters for saving.
- Returns:
Dictionary of all parameters for a model to store during saving.
- Return type:
Dict
- get_patching_structure(data_shape, argdict)[source]
To enable dynamic window sizes, depending on the shape of the input record, this function can be overwritten. By default, returns the predefined patching structure. In addition, this function allows to overwrite the overlap dynamically.
The data_shape is provided for adaptive models. Note that the data shape can have float coordinates due to in-memory resampling of the data. The actual output shape can only be inferred once the patching structure has been defined, as the number of truncated samples depends on the patching structure. Therefore, models should be flexible towards the case of slightly smaller data shapes than the theoretical one.
- Return type:
- class DASPickingCallback(thresholds=0.2, min_time_separation=1.0)[source]
Bases:
DASAnnotateCallback
Pick arrivals from probability curves using scipy.signal.find_peaks. The picking is performed independently on each channel, i.e., no continuity is assumed between channels.
- Parameters:
thresholds (
float|dict[str,float]) – Confidence thresholds for picking. Can be a single value for all phases, or a dictionary with thresholds per phase.min_time_separation (
float) – Minimum time separation between two picks of the same phase in seconds.
- finalize()[source]
Finalize step for the callback. This is called after the last chunk is processed and can be used to generate the final results based on the intermediate results processed in each chunk.
The finalize step is optional.
- Return type:
None
- get_results_dict()[source]
This method returns a dictionary with the results of the callback. It is used to generate the ClassifyOutput when using the callback through classify.
- Return type:
dict[str,Any]
- handle_patch(annotations, in_coords, out_coords)[source]
This method is called for each patch of the output after processing it with the deep learning model. Results inferred from this step should be stored in class variables.
- Return type:
None
- setup(data, patching_structure, annotate_keys)[source]
Setup step for the callback. This is called before the first chunk is processed and can be used to initialize state variables, e.g., the shape of the output or arrays for intermediate results.
The setup step is optional. However, it is usually good practice to reset all state variables in the setup step.
- Return type:
None
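The channel-wise picking described for DASPickingCallback can be sketched as follows. This is a minimal illustration, not the callback's implementation: the function `pick_channelwise` is hypothetical, and mapping the threshold to the `height` argument and the time separation to the `distance` argument of scipy.signal.find_peaks is an assumption.

```python
import numpy as np
from scipy.signal import find_peaks


def pick_channelwise(prob, dt, threshold=0.2, min_time_separation=1.0):
    """Pick peaks on each channel of a (samples, channels) probability array.

    Returns a list of (channel, sample_index, confidence) tuples.
    """
    # Convert the minimum separation from seconds to samples (at least 1).
    distance = max(1, int(round(min_time_separation / dt)))
    picks = []
    for channel in range(prob.shape[1]):
        # Each channel is processed on its own; neighbours are not consulted.
        peaks, props = find_peaks(
            prob[:, channel], height=threshold, distance=distance
        )
        for idx, height in zip(peaks, props["peak_heights"]):
            picks.append((channel, int(idx), float(height)))
    return picks


# Synthetic probability curves: one clear peak per channel plus a weak one.
prob = np.zeros((200, 2))
prob[50, 0] = 0.9
prob[120, 1] = 0.6
prob[30, 1] = 0.1  # below the threshold, so it is not picked
picks = pick_channelwise(prob, dt=0.01)
```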
- class FKFilter(dt, dx, v_min=None, v_max=None, mode='pass', **kwargs)[source]
Bases:
Module
An F-k filter implemented in PyTorch. The filter processes batched data, i.e., the input format should be (batch, samples, channels).
- Parameters:
dx (float) – Channel spacing in space
dt (float) – Sample spacing in time
v_min (float | None) – Minimum velocity to be considered in the filter. If None, no filtering is applied.
v_max (float | None) – Maximum velocity to be considered in the filter. If None, no filtering is applied.
mode (str) – Either “pass” or “reject”. If “pass”, all velocities between v_min and v_max are retained. If “reject”, all velocities outside this band are retained.
- forward(data)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type:
Tensor
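The idea behind an F-k velocity filter can be sketched in NumPy. This is not the PyTorch implementation of FKFilter: the function `fk_filter` is hypothetical, it uses a hard (untapered) mask on the apparent velocity |f| / |k|, does not distinguish propagation directions, and treats a bound of None as unbounded on that side, all of which are assumptions.

```python
import numpy as np


def fk_filter(data, dt, dx, v_min=None, v_max=None, mode="pass"):
    """Apply a hard velocity mask in the f-k domain to (batch, samples, channels) data."""
    if mode not in ("pass", "reject"):
        raise ValueError("mode must be 'pass' or 'reject'")
    n_samples, n_channels = data.shape[1:]
    f = np.abs(np.fft.fftfreq(n_samples, d=dt))[:, None]   # temporal frequency
    k = np.abs(np.fft.fftfreq(n_channels, d=dx))[None, :]  # wavenumber
    # Apparent velocity |f| / |k|; bins with k == 0 are treated as infinitely fast.
    velocity = np.full((n_samples, n_channels), np.inf)
    np.divide(f, k, out=velocity, where=k > 0)
    in_band = np.ones(velocity.shape, dtype=bool)
    if v_min is not None:
        in_band &= velocity >= v_min
    if v_max is not None:
        in_band &= velocity <= v_max
    mask = in_band if mode == "pass" else ~in_band
    spectrum = np.fft.fft2(data, axes=(1, 2))
    return np.fft.ifft2(spectrum * mask, axes=(1, 2)).real


# Plane wave travelling at 1000 (distance units per second), on the FFT grid.
dt, dx = 0.01, 10.0
t = np.arange(128) * dt
x = np.arange(64) * dx
wave = np.sin(2 * np.pi * 6.25 * (t[:, None] - x[None, :] / 1000.0))[None]

passed = fk_filter(wave, dt, dx, v_min=500.0, v_max=2000.0, mode="pass")
rejected = fk_filter(wave, dt, dx, v_min=500.0, v_max=2000.0, mode="reject")
```

In this example, the wave lies inside the velocity band, so "pass" returns it essentially unchanged and "reject" removes it.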
- class InMemoryCollectionCallback(stacking='avg')[source]
Bases:
DASAnnotateCallback
Collects the raw predictions of the model in memory and splices the DAS array back together from the individual patches. To avoid memory overflows, this callback should only be used for small datasets.
- finalize()[source]
Finalize step for the callback. This is called after the last chunk is processed and can be used to generate the final results based on the intermediate results processed in each chunk.
The finalize step is optional.
- Return type:
None
- get_results_dict()[source]
This method returns a dictionary with the results of the callback. It is used to generate the ClassifyOutput when using the callback through classify.
- Return type:
dict[str,Any]
- handle_patch(annotations, in_coords, out_coords)[source]
This method is called for each patch of the output after processing it with the deep learning model. Results inferred from this step should be stored in class variables.
- Return type:
None
- setup(data, patching_structure, annotate_keys)[source]
Setup step for the callback. This is called before the first chunk is processed and can be used to initialize state variables, e.g., the shape of the output or arrays for intermediate results.
The setup step is optional. However, it is usually good practice to reset all state variables in the setup step.
- Return type:
None
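Splicing overlapping patches back into one array with average stacking, as InMemoryCollectionCallback does with stacking='avg', can be sketched as below. The function `splice_patches` and its argument layout are hypothetical; the sketch only shows the sum-and-count approach to averaging and makes no claim about the callback's internals.

```python
import numpy as np


def splice_patches(patches, corners, shape):
    """Average overlapping patches into one (samples, channels) array.

    patches: list of 2D arrays; corners: list of (sample, channel) upper-left
    positions; shape: shape of the full output array.
    """
    total = np.zeros(shape)
    count = np.zeros(shape)
    for patch, (s0, c0) in zip(patches, corners):
        w_s, w_c = patch.shape
        total[s0:s0 + w_s, c0:c0 + w_c] += patch
        count[s0:s0 + w_s, c0:c0 + w_c] += 1
    # Cells never covered by a patch stay NaN instead of dividing by zero.
    out = np.full(shape, np.nan)
    np.divide(total, count, out=out, where=count > 0)
    return out


# Two patches of 4 samples x 2 channels that overlap by 2 samples.
patches = [np.full((4, 2), 1.0), np.full((4, 2), 3.0)]
corners = [(0, 0), (2, 0)]
spliced = splice_patches(patches, corners, shape=(6, 2))
```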
- class MultiCallback(callbacks)[source]
Bases:
DASAnnotateCallback
- finalize()[source]
Finalize step for the callback. This is called after the last chunk is processed and can be used to generate the final results based on the intermediate results processed in each chunk.
The finalize step is optional.
- Return type:
None
- get_results_dict()[source]
This method returns a dictionary with the results of the callback. It is used to generate the ClassifyOutput when using the callback through classify.
- Return type:
dict[str,Any]
- handle_patch(annotations, in_coords, out_coords)[source]
This method is called for each patch of the output after processing it with the deep learning model. Results inferred from this step should be stored in class variables.
- Return type:
None
- setup(data, patching_structure, annotate_keys)[source]
Setup step for the callback. This is called before the first chunk is processed and can be used to initialize state variables, e.g., the shape of the output or arrays for intermediate results.
The setup step is optional. However, it is usually good practice to reset all state variables in the setup step.
- Return type:
None
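No description is given for MultiCallback. Assuming it forwards every step to a list of wrapped callbacks, the pattern could look like the sketch below. Both classes are hypothetical stand-ins that only follow the setup / handle_patch / finalize / get_results_dict interface documented here, and merging the result dictionaries is an assumption.

```python
class CountingCallback:
    """Hypothetical callback that counts the patches it has seen."""

    def __init__(self, name):
        self.name = name
        self.n_patches = 0

    def setup(self, data, patching_structure, annotate_keys):
        self.n_patches = 0  # reset state before the first chunk

    def handle_patch(self, annotations, in_coords, out_coords):
        self.n_patches += 1

    def finalize(self):
        pass

    def get_results_dict(self):
        return {f"{self.name}_patches": self.n_patches}


class FanOutCallback:
    """Hypothetical wrapper that forwards each step to all wrapped callbacks."""

    def __init__(self, callbacks):
        self.callbacks = list(callbacks)

    def setup(self, data, patching_structure, annotate_keys):
        for cb in self.callbacks:
            cb.setup(data, patching_structure, annotate_keys)

    def handle_patch(self, annotations, in_coords, out_coords):
        for cb in self.callbacks:
            cb.handle_patch(annotations, in_coords, out_coords)

    def finalize(self):
        for cb in self.callbacks:
            cb.finalize()

    def get_results_dict(self):
        results = {}
        for cb in self.callbacks:
            results.update(cb.get_results_dict())  # assumption: keys do not clash
        return results


multi = FanOutCallback([CountingCallback("a"), CountingCallback("b")])
multi.setup(None, None, [])
for _ in range(3):
    multi.handle_patch(None, None, None)
multi.finalize()
results = multi.get_results_dict()
```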
- class PatchCoordinate(sample, channel, w_sample, w_channel)[source]
Bases:
object
Coordinates of a patch in the input or output array. Denotes the upper-left corner of the patch and the dimensions along each axis. Note that coordinates can take non-integer values due to transformations. Callbacks should be able to handle this, e.g., by casting to int.
- channel: float
- property channel_int: int
- sample: float
- property sample_int: int
- w_channel: int
- w_sample: int
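A callback that receives possibly fractional patch coordinates might handle them as in the following sketch. `PatchCoordinateSketch` is a hypothetical re-creation of the fields listed above; that the integer properties truncate via `int()` is an assumption, and the `slices` helper is not part of the documented class.

```python
from dataclasses import dataclass


@dataclass
class PatchCoordinateSketch:
    """Hypothetical stand-in mirroring the documented fields of PatchCoordinate."""

    sample: float
    channel: float
    w_sample: int
    w_channel: int

    @property
    def sample_int(self) -> int:
        return int(self.sample)  # assumption: plain truncation

    @property
    def channel_int(self) -> int:
        return int(self.channel)  # assumption: plain truncation

    def slices(self):
        """Illustrative helper: index slices covering the patch in a 2D array."""
        return (
            slice(self.sample_int, self.sample_int + self.w_sample),
            slice(self.channel_int, self.channel_int + self.w_channel),
        )


coord = PatchCoordinateSketch(sample=12.5, channel=3.0, w_sample=4, w_channel=2)
sample_slice, channel_slice = coord.slices()
```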
- class PatchingStructure(in_samples, in_channels, out_samples, out_channels, range_samples, range_channels, overlap_samples=None, overlap_channels=None)[source]
Bases:
object
- in_channels: int
- in_samples: int
- out_channels: int
- out_samples: int
- overlap_channels: int | None = None
- overlap_samples: int | None = None
- range_channels: tuple[int, int]
- range_samples: tuple[int, int]
- property shift_channels
- property shift_samples
- class VirtualTransformedDataArray(data, patching_structure, resample_samples=(1, 1), resample_channels=(1, 1), filter_samples=None, force_dtype=None, channel_coord_name=None)[source]
Bases:
object
This class wraps an xdas.DataArray and allows a transformation to be applied to it on the fly. It is used to load data from disk in chunks and to apply the transformations only to the current chunk. This way, the total memory consumption is independent of the size of the underlying data.
For resampling, the class uses scipy.signal.resample_poly, which internally performs an upsampling, a zero-phase FIR filter and a downsampling step. To avoid boundary artifacts, extra samples are loaded for the filtering and truncated afterwards (at most 11 extra samples per side).
After resampling, the class can apply an IIR filter along the sample axis. Only causal filters are supported because acausal filters would require loading the whole data at once. Only IIR filters are supported due to their higher computational efficiency and because the number of filter states to cache between consecutive chunks is much lower. No filters along the channel axis are implemented. Instead, consider using an F-k filter.
- Parameters:
data (DataArray) – xdas.DataArray to be transformed
patching_structure (PatchingStructure) – The structure of the patches to cut for annotation.
resample_samples (tuple[int, int]) – Tuple of integers (up, down) defining the resampling factors along the sample axis.
resample_channels (tuple[int, int]) – Tuple of integers (up, down) defining the resampling factors along the channel axis.
filter_samples (tuple[str, dict[str, Any]] | None) – Tuple of filter type and keyword arguments to pass to the filter. The filter type must be the name of a filter design function in scipy.signal, e.g., “butter” or “cheby1”. The filter must support the output keyword argument, as this implementation relies on second-order sections. The filter corners should be specified in Hz. The class will automatically pass the sampling rate to the fs argument of the filter creation. No filter is applied if this argument is None.
channel_coord_name (str | None) – Name of the coordinate in the data array that contains the channel coordinates.
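The reason a causal IIR filter in second-order sections suits chunked processing is that only a small state array has to be carried from one chunk to the next. The sketch below illustrates this with scipy.signal directly; `design_sos` and `filter_in_chunks` are hypothetical helpers, and the way the filter tuple is resolved with `getattr` is an assumption about how a `filter_samples`-style argument could be interpreted.

```python
import numpy as np
import scipy.signal


def design_sos(filter_samples, fs):
    """Resolve a (name, kwargs) tuple into second-order sections."""
    name, kwargs = filter_samples
    design = getattr(scipy.signal, name)  # e.g. scipy.signal.butter
    return design(**kwargs, output="sos", fs=fs)


def filter_in_chunks(x, sos, chunk_size):
    """Filter a 1D signal chunk by chunk, caching the filter state in between."""
    state = np.zeros((sos.shape[0], 2))  # two state values per section
    out = []
    for start in range(0, len(x), chunk_size):
        y, state = scipy.signal.sosfilt(sos, x[start:start + chunk_size], zi=state)
        out.append(y)
    return np.concatenate(out)


fs = 100.0  # sampling rate in Hz
sos = design_sos(("butter", {"N": 4, "Wn": [1.0, 10.0], "btype": "bandpass"}), fs)
x = np.random.default_rng(0).standard_normal(1000)

chunked = filter_in_chunks(x, sos, chunk_size=128)
full = scipy.signal.sosfilt(sos, x)  # reference: whole signal at once
```

Filtering in chunks with the cached state gives the same result as filtering the whole signal at once, which would not hold for an acausal (forward-backward) filter.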
- property coords: dict[str, xdas.Coordinate]
- property dt: float
- property dtype: dtype
- property dx: float
- static estimate_theoretical_output_shape(data, resample_samples, resample_channels)[source]
- Return type:
tuple[float,float]
- property filter_sos: ndarray | None
- property shape: tuple[int, int]
Shape of the transformed data. Always in (samples, channels) dimension order.
Truncates the right end of the output to ensure that (output_samples - in_samples) is divisible by the upsampling factor. The same is done for the channel axis. This is necessary to avoid fractional window offsets.
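Read literally, the truncation rule above amounts to the following arithmetic. The helper `truncate_length` is hypothetical and assumes the output is at least as long as one input window; it is a direct reading of the sentence rather than the property's actual code.

```python
def truncate_length(n_out: int, in_size: int, up: int) -> int:
    """Shorten n_out so that (n_out - in_size) is divisible by the upsampling factor."""
    if n_out < in_size:
        raise ValueError("expected at least one full input window")
    return n_out - (n_out - in_size) % up


# 1003 - 256 = 747 is not divisible by 4, so three samples are dropped.
truncated = truncate_length(1003, in_size=256, up=4)
# Already divisible: the length is left unchanged.
unchanged = truncate_length(1000, in_size=256, up=4)
```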
- class WriterBuffer(data, stacking, output_shape)[source]
Bases:
object
A buffer to handle intersections between overlapping output data. The buffer expects data in patches of equal size. The patch order needs to be left to right (samples), top to bottom (channels), i.e., first all samples for a range of channels need to be processed before the next row can be processed.
The buffer keeps up to two rows in memory and writes slices along the sample axis once they are fully predicted.
- add_data(data, out_coords)[source]
- Return type:
tuple[ndarray,PatchCoordinate] |None
- property stacking: str
- class WriterCallback(output_path, stacking='avg')[source]
Bases:
DASAnnotateCallback
Writes the raw predictions of the model to disk. The callback implements streaming processing to avoid excessive memory usage, while ensuring correct splicing at the overlaps between adjacent patches.
The output writing relies on the xdas DataArrayWriter. This means that the output will be written in multiple files using one output folder per annotation key. To load the files for key x, use xdas.open_mfdataarray("output_path/x/*"). Note that the time coordinate will have minor discontinuities due to the chunked writing. These can be fixed by calling data.coords["time"] = data.coords["time"].simplify(tolerance=np.timedelta64(1, "us")).
- finalize()[source]
Finalize step for the callback. This is called after the last chunk is processed and can be used to generate the final results based on the intermediate results processed in each chunk.
The finalize step is optional.
- Return type:
None
- get_results_dict()[source]
This method returns a dictionary with the results of the callback. It is used to generate the ClassifyOutput when using the callback through classify.
- Return type:
dict[str,Any]
- handle_patch(annotations, in_coords, out_coords)[source]
This method is called for each patch of the output after processing it with the deep learning model. Results inferred from this step should be stored in class variables.
- Return type:
None
- setup(data, patching_structure, annotate_keys)[source]
Setup step for the callback. This is called before the first chunk is processed and can be used to initialize state variables, e.g., the shape of the output or arrays for intermediate results.
The setup step is optional. However, it is usually good practice to reset all state variables in the setup step.
- Return type:
None