Tutorial 3 - Add first draft of an xarray integration tutorial (#15)

* Add pre-commit * Add xarray tutorial * Add projection explanation * Specify data server * Reorder dependencies in environment.yml --------- Co-authored-by: Wei Ji <[email protected]> Co-authored-by: Federico Esteban <[email protected]> Co-authored-by: Yvonne Fröhlich <[email protected]>
GenericMappingTools · Dec 1, 2024 · a08f3ed · a08f3ed
1 parent c2876fa
commit a08f3ed
Show file tree

Hide file tree

Showing 6 changed files with 345 additions and 1 deletion.
diff --git a/.gitignore b/.gitignore
@@ -3,3 +3,6 @@
 
 # Jupyter Notebook
 .ipynb_checkpoints/
+
+# Data
+book/data.nc
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -0,0 +1,51 @@
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v5.0.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-docstring-first
+      - id: check-json
+      - id: check-yaml
+      - id: debug-statements
+      - id: mixed-line-ending
+
+  - repo: https://github.com/asottile/pyupgrade
+    rev: v3.19.0
+    hooks:
+      - id: pyupgrade
+        args:
+          - "--py38-plus"
+
+  - repo: https://github.com/keewis/blackdoc
+    rev: v0.3.9
+    hooks:
+      - id: blackdoc
+
+  - repo: https://github.com/charliermarsh/ruff-pre-commit
+    rev: "v0.7.2"
+    hooks:
+      - id: ruff
+        args: ["--fix"]
+
+  - repo: https://github.com/pre-commit/mirrors-prettier
+    rev: v4.0.0-alpha.8
+    hooks:
+      - id: prettier
+
+  - repo: https://github.com/nbQA-dev/nbQA
+    rev: 1.8.7
+    hooks:
+      - id: nbqa-ruff
+        args: ["--fix"]
+      - id: nbqa-isort
+        args: ["--profile=black"]
+        additional_dependencies: [isort==5.6.4]
+      - id: nbqa-black
+      - id: nbqa-pyupgrade
+        args: ["--py37-plus"]
+
+  - repo: https://github.com/shellcheck-py/shellcheck-py
+    rev: v0.10.0.1
+    hooks:
+      - id: shellcheck
diff --git a/book/_static/xarray.png b/book/_static/xarray.png
diff --git a/book/_toc.yml b/book/_toc.yml
@@ -15,3 +15,4 @@ parts:
   - file: markdown-notebooks
   - file: tut01_firstfigure
   - file: tut02_spe_pd_gpd
+  - file: tut03_spe_xarray
diff --git a/book/tut03_spe_xarray.ipynb b/book/tut03_spe_xarray.ipynb
@@ -0,0 +1,283 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# **Tutorial 3** - scientific Python ecosystem: `Xarray`\n",
+    "\n",
+    "In this tutorial we will to process and visualize raster data using [`PyGMT`](https://www.pygmt.org)'s integration with [**Xarray**](https://xarray.dev).\n",
+    "\n",
+    "-----\n",
+    "This tutorial is part of the AGU2024 annual meeting GMT/PyGMT pre-conference workshop (PREWS9) **Mastering Geospatial Visualizations with GMT/PyGMT**\n",
+    "- Conference: https://agu.confex.com/agu/agu24/meetingapp.cgi/Session/226736\n",
+    "- GitHub: https://github.com/GenericMappingTools/agu24workshop\n",
+    "- Website: https://www.generic-mapping-tools.org/agu24workshop\n",
+    "- Recommended version: PyGMT v0.13.0 with GMT 6.5.0"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## What is Xarray?\n",
+    "\n",
+    "[Xarray](https://xarray.dev) is an open source project and Python package for working with n-dimensional data using labeled dimensions and coordinates.\n",
+    "\n",
+    "<img src=\"_static/xarray.png\" alt=\"Xarray data structure\" style=\"width:50%;\"/>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Why use Xarray with PyGMT\n",
+    "\n",
+    "Xarray provides a set of [useful features](https://xarray.dev/#features) for working with multi-dimensional data and unlocks even more functionality through its [rich ecosystem](https://xarray.dev/#ecosystem). Some specific benefits of integrating Xarray and PyGMT include:\n",
+    "\n",
+    "- Simplifying the extension of PyGMT functionality to 3-D+ datasets\n",
+    "- Access additional datasets, such as those located on the cloud\n",
+    "- Use out-of-core computing with libraries like [Dask](https://www.dask.org/)\n",
+    "- Interactively visualize by using Xarray, PyGMT, and dashboarding tools like [Panel](https://panel.holoviz.org/) and [Bokeh](https://bokeh.org/)\n",
+    "- Work with both raster and vector data in memory\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Getting started\n",
+    "\n",
+    "First we'll import all the libraries used throughout the tutorial."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import gcsfs\n",
+    "import pygmt\n",
+    "import xarray as xr"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's use a PyGMT function to explore an Xarray DataArray. We can use the `load_earth_relief` function to load one of GMT's remote datasets. This will take longer the first time that it's run because GMT downloads the data from the GMT server. We'll specify the server in order to quicken download speeds to Washington, D.C., USA."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "with pygmt.config(GMT_DATA_SERVER=\"NOAA\"):\n",
+    "    grid = pygmt.datasets.load_earth_relief(resolution=\"01d\")\n",
+    "grid"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Plotting raster data\n",
+    "\n",
+    "We can pass this dataset to `grd_image()` to plot the dataset, specifying a projection and colormap. We use the Winkel Tripel projection (\"R\") with a 12c width to show the whole world."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig = pygmt.Figure()\n",
+    "fig.grdimage(grid=grid, projection=\"R12c\", cmap=\"SCM/oleron\", frame=True)\n",
+    "fig.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Processing raster data\n",
+    "\n",
+    "All of the PyGMT raster processing functions can accept and return Xarray DataArrays. As one example, we'll use the `grdgradient` function to create a hillshade raster. This will calculate the reflection of a light source projecting from west to east at an altitude of 30 degrees."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "region = [-119.825, -119.4, 37.6, 37.825]\n",
+    "grid = pygmt.datasets.load_earth_relief(resolution=\"03s\", region=region)\n",
+    "\n",
+    "hillshade = pygmt.grdgradient(grid=grid, radiance=[270, 30])\n",
+    "hillshade"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's plot the result using the same grdimage function shared earlier, this time using a custom colormap."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig = pygmt.Figure()\n",
+    "pygmt.makecpt(cmap=\"SCM/grayC\", series=[-1.5, 0.3, 0.01])\n",
+    "fig.grdimage(\n",
+    "    grid=hillshade,\n",
+    "    projection=\"M12c\",\n",
+    "    frame=[\"lSEt\", \"xa0.1\", \"ya0.1\"],\n",
+    "    cmap=True,\n",
+    ")\n",
+    "fig"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Download data and load with Xarray\n",
+    "\n",
+    "Most times you'll want to use your own data rather than the sample datasets. Here we show how to use the `which` functionality to download a NetCDF file available via https."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Download the dataset from the IRI Data Library\n",
+    "url = \"https://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NODC/.WOA09/.Grid-1x1/.Annual/.temperature/.t_an/data.nc\"\n",
+    "netcdf_file = pygmt.which(fname=url, download=True)\n",
+    "woa_temp = xr.open_dataset(netcdf_file).isel(time=0)\n",
+    "woa_temp"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Make a static plot of sea surface temperature\n",
+    "fig = pygmt.Figure()\n",
+    "fig.grdimage(grid=woa_temp.t_an.sel(depth=0), cmap=\"vik\", projection=\"R15c\", frame=True)\n",
+    "fig.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Open data directly from the cloud\n",
+    "\n",
+    "Xarray allows you to open data directly from cloud object storage. This can be useful to avoid downloading the entire dataset, since Xarray can subset to a specific section of interest before loading the data into memory.\n",
+    "\n",
+    "Here we show this functionality based on [Xarray's tutorial](https://tutorial.xarray.dev/intermediate/remote_data/cmip6-cloud.html#selecting-time-slices). We'll load a specific dataset from the Coupled Model Intercomparison Project Phase 6 (CMIP6), calculate the sea level change between 2015 and 2100, and plot the results using PyGMT."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "First, let's load CMIP6 data from Google Cloud Storage"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fs = gcsfs.GCSFileSystem(token=\"anon\")\n",
+    "\n",
+    "store = fs.get_mapper(\n",
+    "    \"gs://cmip6/CMIP6/ScenarioMIP/NOAA-GFDL/GFDL-ESM4/ssp585/r1i1p1f1/Omon/zos/gr/v20180701/\"\n",
+    ")\n",
+    "ds = xr.open_zarr(store=store, consolidated=True)\n",
+    "ds"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Calculate the different in sea level between a date in 2100 and 2015"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "zos_2015jan = ds.zos.sel(time=\"2015-01-16\").squeeze()\n",
+    "zos_2100dec = ds.zos.sel(time=\"2100-12-16\").squeeze()\n",
+    "sealevelchange = zos_2100dec - zos_2015jan"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Make a plot of the difference in sea level"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig = pygmt.Figure()\n",
+    "pygmt.makecpt(cmap=\"vik\", series=[-1, 1, 0.5], continuous=True)\n",
+    "fig.grdimage(grid=sealevelchange, cmap=True, projection=\"R15c\", frame=True)\n",
+    "fig.colorbar()\n",
+    "fig.show()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "agu24workshop",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/environment.yml b/environment.yml
@@ -7,7 +7,13 @@ dependencies:
   - jupyterlab=4.2.5
   - pygmt=0.13.0
   - python=3.12
-  - jupyter-book=1.0.2
   # Tutorial 2 - pandas and GeoPandas, Tutorial 4 - Geophysics (Seismology)
   - pandas=2.2.2
   - geopandas=1.0.1
+  # Tutorial 3 - Xarray
+  - gcsfs=2024.10.0
+  - xarray=2024.10.0
+  - zarr=2.18.3
+  # Optional dev dependencies
+  - pre-commit
+  - jupyter-book=1.0.2