Pre-processing

Modules for data pre-processing
pytomo.preproc.*

deconvolve

class pytomo.preproc.deconvolve.Deconvolver(traces, wavelet_dict)

Perform deconvolution.

Parameters
  • traces (obspy.traces) – waveform traces

  • wavelet_dict (dict) – dictionary of wavelets for each event_id

deconvolve(waveform, wavelet, level=0.005)

Deconvolve a single waveform.

Parameters
  • waveform (ndarray) – waveform

  • wavelet (ndarray) – wavelet

  • level (float) – water level
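The water level mentioned above can be illustrated with a generic frequency-domain deconvolution. This is a minimal sketch, not the pytomo implementation: the function name water_level_deconvolve is hypothetical, and treating level as a fraction of the maximum wavelet power is an assumption.

```python
import numpy as np

def water_level_deconvolve(waveform, wavelet, level=0.005):
    """Generic water-level deconvolution (illustrative sketch only)."""
    n = len(waveform)
    nfft = 2 * n  # zero-pad to limit circular wrap-around
    wave_spec = np.fft.rfft(waveform, nfft)
    wavelet_spec = np.fft.rfft(wavelet, nfft)
    power = np.abs(wavelet_spec) ** 2
    # Assumption: the water level is a fraction of the maximum wavelet power.
    # Flooring the denominator keeps spectral holes from blowing up the result.
    denom = np.maximum(power, level * power.max())
    spec = wave_spec * np.conj(wavelet_spec) / denom
    return np.fft.irfft(spec, nfft)[:n]

# Synthetic check: convolve two spikes with a short wavelet, then undo it.
spikes = np.zeros(64)
spikes[10] = 1.0
spikes[30] = -0.5
wavelet = np.array([1.0, 0.5, 0.25])
waveform = np.convolve(spikes, wavelet)[:64]
recovered = water_level_deconvolve(waveform, wavelet)
```

For this particular wavelet the spectrum never drops below the water level, so the spikes are recovered to numerical precision. With a band-limited wavelet, the floor would instead trade resolution for stability.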

iterstack

class pytomo.preproc.iterstack.InputFile(input_file)

Input file for IterStack.

Parameters

input_file (str) – path of IterStack input file

class pytomo.preproc.iterstack.IterStack(traces, modelname='prem', phasenames=['p', 'P', 'Pdiff'], t_before=10, t_after=30.0, min_cc=0.6, sampling=20, shift_polarity=True, freq=0.005, freq2=1.0, verbose=0)

Iterative stacking for source wavelet following Kennett & Rowlingson (2014)

Parameters

traces (obspy.traces) – waveform traces

compute()

Compute source wavelets for all events.

Returns

dictionary of wavelets for each event_id

Return type

wavelet_dict (dict)

compute_windows()

Returns

list of pydsm.window.Window

Return type

windows (list)

save_stf_catalog(out_dir)

Save the empirical source time functions to files. One file per event, named event_id.stf.

Parameters

out_dir (str) – output directory

pytomo.preproc.iterstack.find_best_shift(y, y_template, shift_polarity=False, skip_freq=1)

Compute the index shift that maximizes the correlation between y[shift:shift+n] and y_template, where n=len(y_template).

Parameters
  • y (np.ndarray) – must have len(y) >= len(y_template)

  • y_template (np.ndarray) –

  • shift_polarity – allow the polarity to be flipped (default is False)

  • skip_freq – skip points to reduce computation time (default is 1, i.e. no skip)

Returns

best shift, and polarity (1 or -1)

Return type

int
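The search described above can be sketched as a brute-force scan over candidate shifts. This is an illustration under stated assumptions, not the pytomo source: the name best_shift_sketch is hypothetical, the score is a normalized dot product, and with shift_polarity enabled the absolute correlation is maximized and the sign is reported as the polarity.

```python
import numpy as np

def best_shift_sketch(y, y_template, shift_polarity=False, skip_freq=1):
    """Scan shifts and return (best_shift, polarity). Illustrative only."""
    n = len(y_template)
    template_norm = np.linalg.norm(y_template)
    best_shift, best_score, polarity = 0, -np.inf, 1
    # skip_freq > 1 tests only every skip_freq-th shift to save time
    for shift in range(0, len(y) - n + 1, skip_freq):
        segment = y[shift:shift + n]
        norm = np.linalg.norm(segment) * template_norm
        cc = float(np.dot(segment, y_template) / norm) if norm > 0 else 0.0
        # Assumption: when polarity flips are allowed, compare |cc|
        score = abs(cc) if shift_polarity else cc
        if score > best_score:
            best_score = score
            best_shift = shift
            polarity = -1 if (shift_polarity and cc < 0) else 1
    return best_shift, polarity

template = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
y = np.zeros(20)
y[7:12] = template

shift, pol = best_shift_sketch(y, template)
flipped_shift, flipped_pol = best_shift_sketch(-y, template, shift_polarity=True)
```

Here the template is embedded at index 7, so both calls locate that shift, and the second one reports a flipped polarity because the input was negated.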

stfgridsearch

class pytomo.preproc.stfgridsearch.STFGridSearch(dataset, seismic_model, tlen, nspc, sampling_hz, freq, freq2, windows, durations, amplitudes, n_distinct_comp_phase, buffer=10.0)

Compute triangular source time functions by grid search.

compute_parallel(mode=0, dir=None, verbose=0)

Compute using MPI.

Parameters
  • mode (int) – 0: sh+psv, 1: psv, 2: sh

  • dir (str) –

  • verbose (int) – 0: quiet, 1: debug

Returns

dict with event_id as key and a=np.ndarray((n_durations,n_amplitudes,3)) as values. a[:,:,0] gives durations; a[:,:,1] gives amplitudes; a[:,:,2] gives misfit values

Return type

misfit_dict (dict)
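The array layout of the returned values can be made concrete with a small hand-built example. The event id and the misfit numbers below are made up for illustration, and argmin_params is a hypothetical helper: it simply picks the grid point with the lowest misfit and ignores count_dict and duration_cap, whose roles in get_best_parameters() are not described here.

```python
import numpy as np

durations = np.array([2.0, 4.0, 6.0])
amplitudes = np.array([0.5, 1.0])

# Build an array with the documented shape (n_durations, n_amplitudes, 3)
a = np.empty((len(durations), len(amplitudes), 3))
a[:, :, 0] = durations[:, None]    # durations vary along axis 0
a[:, :, 1] = amplitudes[None, :]   # amplitudes vary along axis 1
a[:, :, 2] = [[0.9, 0.7],          # made-up misfit values
              [0.4, 0.2],
              [0.6, 0.8]]

misfit_dict = {"event_A": a}       # hypothetical event_id

def argmin_params(arr):
    """Return (duration, amplitude) at the minimum-misfit grid point."""
    i, j = np.unravel_index(np.argmin(arr[:, :, 2]), arr.shape[:2])
    return float(arr[i, j, 0]), float(arr[i, j, 1])

best = {event_id: argmin_params(arr) for event_id, arr in misfit_dict.items()}
```

The resulting dict has the same shape as the one documented for get_best_parameters(): event ids as keys and (duration, amplitude) tuples as values.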

get_best_parameters(misfit_dict, count_dict, duration_cap=1.3)

Get best duration and amplitude correction from misfit_dict.

Parameters
  • misfit_dict (dict) – dict returned by compute_parallel()

  • count_dict (dict) – dict returned by compute_parallel()

Returns

best_params_dict: keys are event_id; values are tuples (duration, amplitude).

Return type

dict

stream

pytomo.preproc.stream.read_sac(sacpaths_regex)

Read SAC files.

Parameters

sacpaths_regex – regex specifying the list of SAC paths

Returns

waveform traces

Return type

traces (obspy.traces)

pytomo.preproc.stream.sac_files_iterator(sacpaths_regex)

Yields chunks of sac files in which the number of events is <= the number of CPU cores.

Parameters

sacpaths_regex – regex matching the locations of the SAC files on disk (e.g., /root_dir/event*/*[RZT])

Yields

list of str – list of paths to sac files
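The chunking behaviour described above can be sketched with the standard library alone. This is not the pytomo implementation: chunk_by_event is a hypothetical helper, and it assumes that each event lives in its own directory, as the example pattern suggests.

```python
import os
from itertools import groupby

def chunk_by_event(sac_paths, max_events):
    """Yield lists of paths covering at most max_events events each."""
    # Assumption: the parent directory of a SAC file identifies its event
    event_of = os.path.dirname
    ordered = sorted(sac_paths, key=lambda p: (event_of(p), p))
    groups = [list(g) for _, g in groupby(ordered, key=event_of)]
    for start in range(0, len(groups), max_events):
        chunk = groups[start:start + max_events]
        # Flatten the per-event groups into one list of paths
        yield [path for group in chunk for path in group]

paths = [
    "/data/ev1/a.Z",
    "/data/ev1/a.T",
    "/data/ev2/b.Z",
    "/data/ev3/c.Z",
]
chunks = list(chunk_by_event(paths, max_events=2))
```

In practice max_events would typically be set from os.cpu_count(), and the path list would come from expanding the pattern on disk.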