Welcome!
obspyDMT (obspy Data Management Tool) is a command line tool for retrieving, processing and managing massive seismological datasets in a fully automatic way. It can be run in serial or in parallel.
This tool is developed to mainly address the following tasks automatically:
- Retrieval of waveforms (MSEED or SAC), stationXML/response files and metadata from FDSN and ArcLink archives. This can be done in serial or in parallel for single or large requests.
- Support for both event-based and continuous requests.
- Extraction of event information from IRIS via user-defined options (time span, magnitude, depth and event location).
- Updating of existing archives (waveforms, stationXML/response files and metadata).
- Processing of the data in serial or in parallel (e.g. removing the trend of the time series, tapering, filtering and instrument correction).
- Management of large seismic datasets.
- Plotting tools (events and/or station locations, ray coverage (event-station pair), epicentral-distance plots for all archived waveforms and seismicity maps).
- Exploring stationXML files by plotting the instrument response for all stages and/or for each stage.
This tutorial has been divided into the following sections:
- How to cite obspyDMT
- Let's get started: install obspyDMT and check your local machine for required dependencies.
- Quick tour: run a quick tour.
- Option types: there are two types of options in obspyDMT: option-1 (with value) and option-2 (without value)
- event-info request: if you are looking for some events and you want to get info about them without downloading waveforms.
- event-based request: retrieve the waveforms, stationXML/response files and meta-data of all the requested stations for all the events found in the archive.
- continuous request: retrieve the waveforms, stationXML/response files and meta-data of all the requested stations and for the requested time span.
- Update: if you want to continue an interrupted request or complete your existing archive.
- Geographical restriction: if you are interested in events that occurred in a specific geographical region and/or in retrieving data from stations inside a specific circular or rectangular bounding area.
- Instrument correction: applying instrument correction to raw counts using stationXML/response files.
- Parallel retrieving and processing: send the requests and/or process the data in parallel. This section introduces some options (bulk and parallel retrieving and processing) to speed-up the whole procedure.
- Plot: for an existing archive, you can plot all the events and/or all the stations, ray path for event-station pairs and epicentral-distance/time for the waveforms using GMT-5 or basemap tools.
- Explore stationXML file: how to explore and analyze different stages available in a stationXML file.
- Seismicity: plot the geographical and historical distribution of earthquake activities (seismicity).
- Folder structure: the way that obspyDMT organizes your retrieved and processed data in the file-based mode.
- Available options: all options currently available in obspyDMT.
If you use obspyDMT, please consider citing the code as:
Kasra Hosseini (2014), obspyDMT (Version 0.7.0) [software] [https://github.com/kasra-hosseini/obspyDMT]
We have also published a paper in SRL (Seismological Research Letters) on obspyDMT's predecessor, which we kindly ask you to cite if you find obspyDMT useful for your research:
C. Scheingraber, K. Hosseini, R. Barsch, and K. Sigloch (2013), ObsPyLoad - a tool for fully automated retrieval of seismological waveform data, Seismological Research Letters, 84(3), 525-531, DOI:10.1785/0220120103.
Once a working Python and ObsPy environment is installed, obspyDMT can be installed from the source code.
Clone the obspyDMT git repository (or fork obspyDMT on GitHub and clone your fork):
$ git clone https://github.com/kasra-hosseini/obspyDMT.git /path/to/my/obspyDMT
$ cd /path/to/my/obspyDMT
$ python setup.py install
Alternatively:
$ git clone https://github.com/kasra-hosseini/obspyDMT.git /path/to/my/obspyDMT
$ cd /path/to/my/obspyDMT
$ pip install -v -e .
If neither of these works for you, the source code can be downloaded directly from the GitHub website, and you can either work with the source code or install it:
$ cd /path/to/my/obspyDMT
$ python setup.py install
obspyDMT can be used from a system shell without explicitly calling the Python interpreter. It contains various options for customizing the request. Each option has a reasonable default value and the user can change them to adjust obspyDMT options to a specific request. The following command gives all the available options with their default values:
$ obspyDMT --help
To check the dependencies required for running the code properly:
$ obspyDMT --check
ATTENTION: if obspyDMT is installed on your machine, it can be run from any directory. However, if you want to use the source code instead, change to the location of obspyDMT.py and run it from there:
$ cd /path/to/my/obspyDMT.py
$ ./obspyDMT.py --check
In all the following examples, we assume that obspyDMT is already installed.
To run a quick tour, it is enough to:
$ obspyDMT --tour
A dmt-tour-data directory will be created in the current path and the retrieved/processed data will be organized there (please refer to the Folder structure section for more information).
To get an overview of the retrieved raw counts, the waveforms can be plotted by:
$ obspyDMT --plot_epi 'dmt-tour-data'
To plot the corrected waveforms:
$ obspyDMT --plot_epi 'dmt-tour-data' --plot_type corrected
obspyDMT plots the ray coverage (ray path between each event-station pair) by:
$ obspyDMT --plot_ray 'dmt-tour-data'
There are two types of options in obspyDMT: option-1 (with value) and option-2 (without value). For the first type, the user provides one or more values, which are stored and used by the program as input. Type-2 options do not take a value; adding one activates or deactivates a feature (e.g. if you enter '--check', as in the Let's get started section, the program checks all the dependencies required for running the code properly).
The general form to enter the input (i.e. change the default values) is as follows:
$ obspyDMT --option-1 'value' --option-2
To show all the available options with short descriptions:
$ obspyDMT --help
The options specified by --option=OPTION are type-1 (with value) and --option are type-2 (without value).
ONE GOOD THING: the options can be given in any order!
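The two option types can be pictured with a small Python sketch that assembles an obspyDMT command line. The helper build_command is hypothetical and not part of obspyDMT; it only illustrates that type-1 options carry a value while type-2 options are bare flags. The option names used are the ones that appear in this tutorial.

```python
import shlex


def build_command(valued=None, flags=None, program="obspyDMT"):
    """Assemble an argument list from type-1 (valued) and type-2 (flag) options.

    Hypothetical helper for illustration; it is not part of obspyDMT.
    """
    args = [program]
    # Type-1 options: the option name is followed by its value.
    for name, value in (valued or {}).items():
        args += ["--" + name, str(value)]
    # Type-2 options: the name alone switches a feature on or off.
    for name in (flags or []):
        args.append("--" + name)
    return args


cmd = build_command(
    valued={"min_mag": "8.9", "min_date": "2011-03-01", "identity": "TA.Z*.*.BHZ"},
    flags=["mseed"],
)
# shlex.join quotes arguments where needed, so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list could also be handed to subprocess.run on a machine where obspyDMT is installed.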
In this type of request, obspyDMT will search for all the available events based on the options specified by the user, print the results and create an event catalog without retrieving waveforms or stationXML/response files.
The following lines show how to send an event-info request with obspyDMT followed by some examples.
The general way to define an event-info request is:
$ obspyDMT --event_info --option-1 'value' --option-2
The --event_info flag forces the code to just retrieve the event information and create an event catalog. For details on option-1 and option-2 please refer to Option types section.
Example 1: run with the default values:
$ obspyDMT --event_info
When the job starts, a folder is created at the path given by the --datapath flag (default: obspyDMT-data in the current directory). To access the event information for this example, go to /path/specified/in/datapath/2014-10-16_2014-10-21_5.5_9.9/EVENTS-info and check the *EVENT-CATALOG text file (please refer to the Folder structure section for more information).
WARNING: obspyDMT may not find any event with the above command. This depends on the default values and on which events are available at the time you are testing the code. To customize the request, refer to the next example.
Example 2: by adding flags to the above command, one can change the default values and add/remove functionalities of the code. As an example, the following command shows how to get the info of all the events with magnitude in the range 6.6-8.0 that occurred after 2013-05-01 and before 2014-01-01:
$ obspyDMT --datapath event_info_example --event_info --min_mag 6.6 --max_mag 8.0 --min_date 2013-05-01 --max_date 2014-01-01
In the above command, --datapath specifies the directory in which the data will be stored, --event_info tells obspyDMT to search only for the event information and not to retrieve any seismic data (waveforms, stationXML files and metadata), and the other options --min_mag, --max_mag, --min_date and --max_date specify the minimum and maximum magnitude and the minimum and maximum date.
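To make the meaning of the four search options concrete, here is a minimal Python sketch of the selection they describe. It is not obspyDMT's own code: select_events is a hypothetical helper, the catalog entries are made up, and treating the bounds as inclusive is an assumption.

```python
from datetime import date


def select_events(events, min_mag, max_mag, min_date, max_date):
    """Keep events whose magnitude and date fall inside the requested ranges.

    Bounds are treated as inclusive here, which is an assumption.
    """
    return [
        ev for ev in events
        if min_mag <= ev["mag"] <= max_mag and min_date <= ev["date"] <= max_date
    ]


# Made-up catalog entries, for illustration only (not real events).
toy_catalog = [
    {"name": "A", "mag": 6.4, "date": date(2013, 6, 1)},   # magnitude too small
    {"name": "B", "mag": 7.1, "date": date(2013, 9, 15)},  # inside both ranges
    {"name": "C", "mag": 7.5, "date": date(2014, 2, 1)},   # after max_date
]

selected = select_events(toy_catalog, 6.6, 8.0, date(2013, 5, 1), date(2014, 1, 1))
print([ev["name"] for ev in selected])
```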
In this type of request, the following steps will be done automatically:
- Search for all available events based on the options specified by the user.
- Check the availability of the requested stations for each event.
- Retrieve the waveforms and/or stationXML/response files for each event and for all available stations (default: waveforms, stationXML/response files and metadata are retrieved).
- Apply instrument correction to all saved waveforms based on the specified options.
Retrieving and processing could be done in serial or in parallel.
The following lines show how to send an event-based request with obspyDMT followed by some short examples.
The general way to define an event-based request is:
$ obspyDMT --option-1 'value' --option-2
For details on option-1 and option-2 please refer to Option types section.
Example 1: the following command shows how to get all the waveforms, stationXML/response files and metadata of the BHZ channels available in the TA network with station names starting with Z for the great Tohoku-oki earthquake of magnitude Mw 9.0:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ'
or, instead of using the identity option:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --net 'TA' --sta 'Z*' --cha 'BHZ'
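The two commands above suggest that the identity string is a dot-separated network.station.location.channel pattern with shell-style wildcards. The following Python sketch shows how such a pattern could be split and matched against a channel code. The helper names are hypothetical, and the use of fnmatch is an assumption about the wildcard semantics rather than a description of obspyDMT's internals.

```python
from fnmatch import fnmatchcase


def split_identity(identity):
    """Split 'NET.STA.LOC.CHA' into its four dot-separated fields."""
    net, sta, loc, cha = identity.split(".")
    return {"net": net, "sta": sta, "loc": loc, "cha": cha}


def matches(channel_id, identity):
    """Return True if every field of channel_id matches the wildcard pattern."""
    fields = channel_id.split(".")
    patterns = identity.split(".")
    return len(fields) == len(patterns) == 4 and all(
        fnmatchcase(field, pattern) for field, pattern in zip(fields, patterns)
    )


print(split_identity("TA.Z*.*.BHZ"))
# TA.Z33A..BHZ has an empty location code, which '*' also matches.
print(matches("TA.Z33A..BHZ", "TA.Z*.*.BHZ"))
```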
Example 2: By default, obspyDMT saves the waveforms in SAC format. In this case, it will fill in the station location (stla and stlo), station elevation (stel), station depth (stdp), event location (evla and evlo), event depth (evdp) and event magnitude (mag) in the SAC headers. However, if the desired format is MSEED: (for downloading the same event and station identity as Example 1)
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ' --mseed
Example 3: for downloading just the raw waveforms without stationXML/response file and instrument correction:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ' --mseed --response 'N' --ic_no
Example 4: the preset (how long before the event origin time the waveform starts) and the offset (how long after the origin time the waveform ends) default to 0 and 1800 seconds. You can change them by adding the following flags:
$ obspyDMT --preset time_before --offset time_after --option-1 value --option-2
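As a rough illustration of how preset and offset define the requested window, the sketch below computes the start and end times around an origin time. The function waveform_window is hypothetical and the origin time is an illustrative value, not taken from a catalog.

```python
from datetime import datetime, timedelta


def waveform_window(origin_time, preset=0.0, offset=1800.0):
    """Return (start, end) of the time window around an event origin time.

    preset: seconds to include before the origin time.
    offset: seconds to include after the origin time.
    """
    start = origin_time - timedelta(seconds=preset)
    end = origin_time + timedelta(seconds=offset)
    return start, end


# Illustrative origin time only; it is not read from an event catalog.
origin = datetime(2011, 3, 11, 5, 46, 0)
start, end = waveform_window(origin, preset=60, offset=3600)
print(start.isoformat(), end.isoformat())
```

With the default values the window starts at the origin time and covers the following 30 minutes.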
In this type of request, the following steps will be done automatically:
- Get the time span from input and in case of large time spans, divide it into small intervals.
- Check the availability of the requested stations for each interval.
- Retrieve the waveforms and/or stationXML/response files for each interval and for all the available stations (default: waveforms, stationXML/response files and metadata are retrieved).
- Apply instrument correction to all saved waveforms based on the specified options.
- Merge the retrieved waveforms of all time intervals into a waveform covering the originally requested time span and save the final product.
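The first step of the list above, dividing a long time span into smaller intervals, can be sketched as follows. The one-day interval length is an assumption chosen for the example; the interval actually used by obspyDMT is not stated here, and split_time_span is a hypothetical helper.

```python
from datetime import datetime, timedelta


def split_time_span(start, end, interval=timedelta(days=1)):
    """Split [start, end) into consecutive pieces no longer than interval."""
    pieces = []
    left = start
    while left < end:
        # The last piece is shortened so that it ends exactly at `end`.
        right = min(left + interval, end)
        pieces.append((left, right))
        left = right
    return pieces


pieces = split_time_span(datetime(2011, 1, 1), datetime(2011, 1, 3))
for left, right in pieces:
    print(left.isoformat(), "->", right.isoformat())
```

Because the pieces are contiguous, merging the per-interval waveforms in order recovers the originally requested span.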
The following lines show how to send a continuous request with obspyDMT followed by some short examples.
The general way to define a continuous request is:
$ obspyDMT --continuous --option-1 value --option-2
For details on option-1 and option-2 please refer to Option types section.
Example 1: the following command line shows how to get all the waveforms, stationXML/response files and metadata of the BHZ channels available in the TA network with station names starting with Z for the specified time span:
$ obspyDMT --continuous --identity 'TA.Z*.*.BHZ' --min_date '2011-01-01' --max_date '2011-01-03'
WARNING: this request may take a long time, depending on your internet connection. In this case, you can send parallel requests:
$ obspyDMT --continuous --identity 'TA.Z*.*.BHZ' --min_date '2011-01-01' --max_date '2011-01-03' --req_parallel --req_np 10
or, instead of using the identity option:
$ obspyDMT --continuous --net 'TA' --sta 'Z*' --cha 'BHZ' --min_date '2011-01-01' --max_date '2011-01-03'
Example 2: By default, obspyDMT saves the waveforms in SAC format. In this case, it will fill in the station location (stla and stlo), station elevation (stel), station depth (stdp), event location (evla and evlo), event depth (evdp) and event magnitude (mag) in the SAC headers. However, if the desired format is MSEED: (for downloading the same time span and station identity as Example 1)
$ obspyDMT --continuous --identity 'TA.Z*.*.BHZ' --min_date '2011-01-01' --max_date '2011-01-03' --mseed
Example 3: for downloading just the raw waveforms without response file and instrument correction:
$ obspyDMT --continuous --identity 'TA.Z*.*.BHZ' --min_date '2011-01-01' --max_date '2011-01-03' --mseed --response 'N' --ic_no
If you want to continue an interrupted request or complete your existing archive, you can use the updating option. The general ways to update an existing folder (located at address) for FDSN stations or ArcLink stations are:
$ obspyDMT --fdsn_update 'address' --option-1 value --option-2
$ obspyDMT --arc_update 'address' --option-1 value --option-2
Please note that all the commands presented in this section could be applied to continuous request by just adding --continuous flag to the command line (refer to the continuous request section).
Example 1: first, let's retrieve all the waveforms, stationXML/response files and metadata of the BHZ channels available in the TA network with station names starting with Z for the great Tohoku-oki earthquake of magnitude Mw 9.0:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ'
Now, we want to update the folder for BHE channels:
$ obspyDMT --fdsn_update './obspyDMT-data' --identity 'TA.Z*.*.BHE'
We can also send requests to other data-centers available in FDSN, both for retrieving and for updating.
As an example, we want to update the directory for all available BHZ channels in the RESIF data-center with FR as the network name:
$ obspyDMT --fdsn_update './obspyDMT-data' --identity 'FR.*.*.BHZ' --fdsn_base_url RESIF
WARNING: this request may take a long time, depending on your internet connection. In this case, you can send parallel requests:
$ obspyDMT --fdsn_update './obspyDMT-data' --identity 'FR.*.*.BHZ' --fdsn_base_url RESIF --req_parallel --req_np 10
If you are interested in events that occurred in a specific geographical region and/or in retrieving data from stations inside a specific circular or rectangular bounding area, you are in the right section! Here, we have two examples:
Example 1: to extract the info of all the events that occurred from January 2000 until October 2014 in a rectangular area (lon1=44.38E lon2=63.41E lat1=24.21N lat2=40.01N) with magnitude greater than 3.0:
$ obspyDMT --event_info --min_mag '3.0' --min_date '2000-01-01' --max_date '2014-10-01' --event_rect '44.38/63.41/24.21/40.01'
Example 2: to retrieve all the waveforms, stationXML/response files and metadata of BHZ channels available in a specific rectangular bounding area (lon1=125.0W lon2=70.0W lat1=25N lat2=45N) for the great Tohoku-oki earthquake of magnitude Mw 9.0, the command line will be:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --cha 'BHZ' --station_rect '-125.0/-70.0/25.0/45.0'
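Judging from the two examples, the rectangle string is ordered as lon1/lon2/lat1/lat2, with western longitudes given as negative numbers. The sketch below parses such a string and tests whether a point lies inside it. Both helpers are hypothetical, the test points are arbitrary, and the sketch does not handle rectangles that cross the 180 degree meridian.

```python
def parse_rect(rect):
    """Parse 'lon1/lon2/lat1/lat2' into a tuple of four floats."""
    lon1, lon2, lat1, lat2 = (float(value) for value in rect.split("/"))
    return lon1, lon2, lat1, lat2


def in_rect(lon, lat, rect):
    """Return True if (lon, lat) lies inside the rectangle, borders included.

    Assumes lon1 <= lon2 and lat1 <= lat2 (no wrap-around at 180 degrees).
    """
    lon1, lon2, lat1, lat2 = parse_rect(rect)
    return lon1 <= lon <= lon2 and lat1 <= lat <= lat2


# Arbitrary test points, chosen only to exercise the check.
print(in_rect(-100.0, 35.0, "-125.0/-70.0/25.0/45.0"))
print(in_rect(142.4, 38.3, "-125.0/-70.0/25.0/45.0"))
```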
When obspyDMT retrieves waveforms and their stationXML/response files, by default it removes the trends of the time series, tapers the waveforms, filters them and corrects them to the desired physical unit (displacement, velocity or acceleration). The default correction unit is displacement; to change it to velocity or acceleration:
$ obspyDMT --corr_unit 'VEL' --option-1 'value' --option-2
$ obspyDMT --corr_unit 'ACC' --option-1 'value' --option-2
where option-1 and option-2 are the flags defined by the user (see Option types section).
You can deactivate the instrument correction by:
$ obspyDMT --ic_no --option-1 value --option-2
Please note that all the commands presented in this section could be applied to continuous request by just adding --continuous flag to the command line (refer to continuous request section).
Before applying the instrument correction, a bandpass filter is applied to the data with default values (0.008, 0.012, 3.0, 4.0). If you want to apply another bandpass filter:
$ obspyDMT --pre_filt '(f1,f2,f3,f4)' --option-1 value --option-2
where (f1,f2,f3,f4) are the four corner frequencies of a cosine taper that is one between f2 and f3 and falls to zero over f1 < f < f2 and f3 < f < f4.
If you do not need the pre-filter:
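The shape of such a four-corner taper can be sketched in plain Python. The raised-cosine ramps below are a common choice and are an assumption here; the exact taper applied by obspyDMT is not spelled out in this tutorial. The function name cosine_taper_gain is hypothetical.

```python
import math


def cosine_taper_gain(f, f1, f2, f3, f4):
    """Gain of a four-corner cosine taper at frequency f (same unit as f1..f4).

    Zero outside (f1, f4), one on [f2, f3], raised-cosine ramps in between.
    """
    if f <= f1 or f >= f4:
        return 0.0
    if f < f2:
        # Rising ramp from 0 at f1 to 1 at f2.
        return 0.5 * (1.0 - math.cos(math.pi * (f - f1) / (f2 - f1)))
    if f <= f3:
        return 1.0
    # Falling ramp from 1 at f3 to 0 at f4.
    return 0.5 * (1.0 + math.cos(math.pi * (f - f3) / (f4 - f3)))


default_pre_filt = (0.008, 0.012, 3.0, 4.0)
for freq in (0.001, 0.010, 1.0, 3.5, 5.0):
    print(freq, round(cosine_taper_gain(freq, *default_pre_filt), 3))
```

With the default corner frequencies, frequencies between 0.012 and 3.0 pass unchanged, while everything below 0.008 or above 4.0 is removed.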
$ obspyDMT --pre_filt 'None' --option-1 value --option-2
If you want to apply instrument correction to an existing folder:
$ obspyDMT --ic_all 'address' --corr_unit unit
Here, address is the path where your uncorrected waveforms are stored. As mentioned above, unit is the unit to which you want to correct the waveforms. It can be DIS (default), VEL or ACC.
To make it clearer, let's take a look at an example with following steps:
Step 1: retrieve all the waveforms, stationXML/response files and metadata of the BHZ channels available in the TA network with station names starting with Z for the great Tohoku-oki earthquake of magnitude Mw 9.0: (please note that instrument correction will be applied to the retrieved waveforms by default)
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ'
Step 2: now correct the raw waveforms to velocity:
$ obspyDMT --ic_all '/path/specified/in/datapath' --corr_unit 'VEL'
For each download request, obspyDMT uses ObsPy clients to establish a connection to the data-centers, send the request, download the data and disconnect. Some modifications can be applied to speed up the whole procedure:
bulk request
bulk request is a method provided by FDSN which gives access to multiple channels of MSEED data for specified time ranges, i.e. instead of sending the requests one by one, a list of requests can be sent.
obspyDMT incorporates this option and it can be activated by:
$ obspyDMT --fdsn_bulk --option-1 'value' --option-2
Parallel retrieving and processing
Moreover, obspyDMT can send the requests in parallel which makes the whole procedure much more efficient. In this case, the requests (event-based or continuous) will be divided into the number of requested processes, each process sends the request to the data providers, retrieves and organizes the data. The general syntax for this option is:
$ obspyDMT --req_parallel --req_np 10 --option-1 'value' --option-2
--req_parallel means that the request should be sent in parallel and --req_np 10 specifies the number of requested processes which is 10 here.
obspyDMT can run the processing unit in parallel as well. In this mode, it divides the job into the number of requested processes and each of them performs the instrument correction or any other defined processes and stores the results. Syntax to activate this option is:
$ obspyDMT --ic_parallel --ic_np 10 --option-1 'value' --option-2
--ic_parallel means that the processing should be done in parallel and --ic_np 10 specifies the number of requested processes, which is 10 here.
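Dividing a job into the number of requested processes can be pictured with the short sketch below. How obspyDMT actually partitions the work is not described here, so the round-robin split and the helper name split_into_chunks are assumptions made for illustration.

```python
def split_into_chunks(requests, n_processes):
    """Distribute requests over at most n_processes chunks, round-robin.

    Never creates more chunks than there are requests, and always
    returns at least one (possibly empty) chunk.
    """
    n = max(1, min(n_processes, len(requests)))
    return [requests[i::n] for i in range(n)]


# 25 placeholder request ids shared among 10 processes, as with --req_np 10.
chunks = split_into_chunks(list(range(25)), 10)
for index, chunk in enumerate(chunks):
    print(index, chunk)
```

Each chunk could then be handed to its own worker process, for example through multiprocessing.Pool, so that no process receives more than one request above the others.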
For an existing archive, you can plot all the events and/or all the stations, ray path for event-station pairs and epicentral-distance/time for the waveforms.
The general syntax for plotting tools is:
$ obspyDMT --plot_option 'address'
where --plot_option can be --plot_ev for events, --plot_sta for stations, --plot_se for stations and events, --plot_ray for the ray path between each event-station pair and --plot_epi for epicentral-distance/time.
All the examples shown in this section are based on a database created by the following request:
$ obspyDMT --min_mag '8.9' --min_date '2011-03-01' --identity 'TA.Z*.*.BHZ'
Example 1: let's plot both stations and events available in the folder:
$ obspyDMT --plot_se './obspyDMT-data'
The default format is png. If we want pdf for our figures instead:
$ obspyDMT --plot_se './obspyDMT-data' --plot_format 'pdf'
Example 2: in this example, we want to plot the ray path for event-station pairs but save the result in $HOME/Desktop:
$ obspyDMT --plot_ray './obspyDMT-data' --plot_format 'pdf' --plot_save '$HOME/Desktop'
Example 3: obspyDMT supports GMT plots as well. For this, GMT5 must be installed on your machine. In this example, we want to plot the ray path for event-station pairs (similar to Example 2) using GMT5:
$ obspyDMT --plot_ray_gmt './obspyDMT-data'
obspyDMT is able to plot the content of stationXML files by the following command: (all the figures will be saved at ./stationxml_plots by default)
Example 1: plot the amplitude and phase components of a stationXML file that was retrieved in event-based request:
$ obspyDMT --plotxml_dir path/to/STXML.TA.Z33A..BHZ --plotxml_paz
The --plotxml_dir flag makes obspyDMT generate a plot of the amplitude and phase components of the stationXML file of the TA.Z33A..BHZ station, including all stages. --plotxml_paz extracts only the PAZ, sensitivity and gain of the instrument response and plots the corresponding amplitude and phase components. Additionally, obspyDMT compares the full response with the PAZ-only response using the L1 norm and plots the results.
Moreover, it plots the stages of the stationXML file as well.
Geographical and historical distribution of earthquake activities (seismicity) can be plotted using --seismicity option in obspyDMT. In this mode, the software finds the events according to the input parameters and generates an image in which the events are categorized based on depth and magnitude.
Example: the command line to create a seismicity map of Japan from all the events available in the IRIS archive with magnitude greater than 3.0 between 2000-01-01 and 2013-01-01 is as follows:
$ obspyDMT --datapath 'Seismicity' --seismicity --min_mag 3.0 --min_date 2000-01-01 --max_date 2013-01-01 --event_rect 120.0/155.0/25.0/55.0
--datapath is the address where the event catalog will be created, --seismicity enables the seismicity mode and --min_mag, --min_date, --max_date and --event_rect are event search parameters.
obspyDMT organizes the retrieved and processed data in a homogeneous way. When you want to run the code, you can specify a top-level folder path in which all the data will be organized:
$ obspyDMT --datapath '/path/to/my/desired/address'
obspyDMT will create the folder (/path/to/my/desired/address) then start to create folders and files during retrieving and processing as it is shown in the following figure:
All the options currently available in obspyDMT can be listed with:
$ obspyDMT --help
The options specified by --option=OPTION are type-1 (with value) and --option are type-2 (without value). Please refer to the Option types section for more information about type 1 and type 2.