Sometimes we talk about GMT sessions in issues/PRs. It's important to know that there are two different kinds of sessions: (1) the GMT CLI session; (2) the GMT C API session. Here are some technical notes on the two kinds of GMT sessions to help understand how PyGMT works and where its potential flaws lie.
The GMT CLI session
Here is a simple GMT CLI script:
gmt begin map -V
gmt basemap -R0/10/0/10 -JX10c -Baf
gmt end show
The gmt begin command creates the so-called GMT CLI session directory under the ~/.gmt/sessions directory. The session directory is named like ~/.gmt/sessions/gmt_session.XXXXX, in which XXXXX is the parent process ID (PPID) by default but can be changed via the environment variable GMT_SESSION_NAME. Subsequent GMT commands then read/write information from/to files in this directory. This is how different GMT module calls communicate in modern mode. See https://docs.generic-mapping-tools.org/dev/begin.html#note-on-unix-shells for the official explanation.
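The naming rule described above can be sketched in Python. This is only an illustration of the description in these notes, not GMT's actual implementation: the helper name expected_session_dir and its parameters are hypothetical, and the real logic lives inside the GMT C library.

```python
import os
from pathlib import Path


def expected_session_dir(environ=None, ppid=None):
    """Return the GMT CLI session directory a process would be expected to use.

    Hypothetical helper that mirrors the rule described in these notes:
    the directory suffix is GMT_SESSION_NAME if set, otherwise the PPID.
    """
    environ = os.environ if environ is None else environ
    ppid = os.getppid() if ppid is None else ppid
    # GMT_SESSION_NAME, when set, replaces the PPID suffix.
    suffix = environ.get("GMT_SESSION_NAME") or str(ppid)
    return Path.home() / ".gmt" / "sessions" / f"gmt_session.{suffix}"


# Default case: the suffix is the parent process ID.
print(expected_session_dir(environ={}, ppid=12345))
# Overridden case: the suffix comes from GMT_SESSION_NAME.
print(expected_session_dir(environ={"GMT_SESSION_NAME": "worker-1"}))
```

Two processes that share a parent and do not set GMT_SESSION_NAME resolve to the same directory under this rule, which is the situation that matters for multiprocessing.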
In the current PyGMT implementation, when we import the PyGMT library (i.e., import pygmt), we call gmt begin to create the GMT CLI session. This GMT CLI session is then used by all subsequent GMT calls. That is usually OK, but with multiprocessing, GMT module calls from different processes access this directory at the same time, which can corrupt the session files. This explains why PyGMT has trouble with multiprocessing (#217).
So, to make PyGMT support multiprocessing, the solution seems straightforward:
Set the environment variable GMT_SESSION_NAME to a unique value (we already have the unique_name() function) so that each process has a unique session name
Do not call gmt begin at import time so that each process has a unique GMT CLI session directory
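The first step can be sketched without PyGMT at all. The helpers below are hypothetical (they do not use PyGMT's unique_name() function), and the sketch assumes that setting the variable in each worker before its GMT session is created is sufficient, as the notes suggest.

```python
import os
import uuid


def make_session_name():
    """Hypothetical helper: build a name unique per process and per call."""
    return f"pygmt-{os.getpid()}-{uuid.uuid4().hex[:8]}"


def init_worker():
    """Set GMT_SESSION_NAME for the current process and return the name.

    Assumption: this runs in each worker process before that process
    creates its GMT session, so the session picks up the unique name.
    """
    name = make_session_name()
    os.environ["GMT_SESSION_NAME"] = name
    return name
```

One way to use it would be to pass init_worker as the initializer argument of multiprocessing.Pool, so that every worker sets its own name on startup. Setting the variable only once in the parent would not help, because child processes inherit the parent's environment and would all see the same name.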
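A proof-of-concept PR is opened at #3392.
The GMT C API session
We also need to know a little about the GMT C API session. A C example that calls GMT C API functions is available at https://github.com/GenericMappingTools/gmt/blob/master/src/testapi_modern.c.
The C API function GMT_Create_Session creates the so-called GMT C API session. This function does a lot of things, including allocating memory for internal variables, deciding the session name, loading gmt.conf settings, and more. The API function GMT_Destroy_Session is responsible for destroying the GMT C API session.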
The equivalent PyGMT version should be:
from pygmt.clib import Session
with Session() as lib:
    lib.call_module("begin", "pygmt-session")
    lib.call_module("figure", "apimodern -")
    lib.call_module("basemap", "-BWESN -Bafg -JM16c -R5/41/9/43")
    lib.call_module("psconvert", "-A -Tg")
    lib.call_module("end")
However, in the current implementation, the PyGMT version looks like below:
from pygmt.clib import Session
with Session() as lib:
    lib.call_module("begin", "pygmt-session")
with Session() as lib:
    lib.call_module("figure", "apimodern -")
with Session() as lib:
    lib.call_module("basemap", "-BWESN -Bafg -JM16c -R5/41/9/43")
with Session() as lib:
    lib.call_module("psconvert", "-A -Tg")
with Session() as lib:
    lib.call_module("end")
in which the GMT C API sessions are created/destroyed multiple times. We may gain some performance if we can use a single GMT C API session, but we also need to note that the GMT CLI script also creates/destroys GMT C API sessions repeatedly.
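The difference between the two patterns can be made concrete with a small counting sketch. fake_session below is a hypothetical stand-in that only records create/destroy events; it does not call GMT or PyGMT and says nothing about the real cost of GMT_Create_Session.

```python
from contextlib import contextmanager

events = []


@contextmanager
def fake_session():
    """Stand-in for a C API session; only records create/destroy events."""
    events.append("create")
    try:
        yield events
    finally:
        events.append("destroy")


MODULES = ["begin", "figure", "basemap", "psconvert", "end"]


def one_session_per_call():
    """Current pattern: a new session is created for every module call."""
    events.clear()
    for module in MODULES:
        with fake_session():
            pass  # a real implementation would call `module` here
    return events.count("create")


def one_session_for_all():
    """Alternative pattern: a single session wraps all module calls."""
    events.clear()
    with fake_session():
        for module in MODULES:
            pass  # a real implementation would call `module` here
    return events.count("create")


print(one_session_per_call(), one_session_for_all())  # prints: 5 1
```

Whether the saved create/destroy cycles translate into a measurable speedup depends on how expensive session creation is in practice, which this sketch does not measure.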
Extra notes
Here are some extra notes:
The session name is decided in the C API function GMT_Create_Session. So, GMT_SESSION_NAME should be defined before calling GMT_Create_Session.
Data processing modules can be called in either classic mode or modern mode (i.e., outside or inside gmt begin), but some modules behave differently in classic and modern mode. For example, gmt makecpt writes the output to stdout in classic mode but to a hidden CPT file in modern mode. To make things as simple as possible, it's OK to always call gmt begin at the beginning.
A GMT C API session is required when calling any GMT C API functions. However, a GMT CLI session is only required when calling GMT modules. For example, the following Python script works without a GMT CLI session:
from pygmt.clib import Session
with Session() as lib:
    lib.read_data("@earth_relief_01d_g", kind="grid")