From e86166f6e79ba27ae80294c34c62ac652b5bf778 Mon Sep 17 00:00:00 2001 From: pkdash Date: Fri, 30 Aug 2024 15:15:18 -0400 Subject: [PATCH 1/9] [#157] new notebook with examples for retrieving data from NLDI --- .../USGS_dataretrieval_NLDI_Examples.ipynb | 453 ++++++++++++++++++ 1 file changed, 453 insertions(+) create mode 100644 demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb diff --git a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb new file mode 100644 index 00000000..79f62d51 --- /dev/null +++ b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb @@ -0,0 +1,453 @@ +{ + "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "# USGS dataretrieval Python Package NLDI Data Access Examples\n", + "\n", + "This notebook provides examples of using the Python dataretrieval package to retrieve data from the United States Geological Survey (USGS) Hydro Network-Linked Data Index (NLDI). The dataretrieval package provides a collection of functions to get data from the USGS Hydro Network-Linked Data Index (NLDI)." + ], + "id": "94cf2fc11d917e1b" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Install the Package\n", + "\n", + "Use the following code to install the package if it doesn't exist already within your Jupyter Python environment. Note the `nldi` option in the `dataretrieval` package installation. The default `dataretrieval` does not support NLDI data access." 
+ ], + "id": "8695bd7a7b335650" + }, + { + "metadata": {}, + "cell_type": "code", + "source": "!pip install dataretrieval[nldi]", + "id": "2cb985f4a60ce046", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Load the package so that you can use its functions in this notebook.", + "id": "ef27dd9de0b05a9a" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "from dataretrieval import nldi\n", + "from IPython.display import display" + ], + "id": "aa0f8aad72102b29", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "### Basic Usage\n", + "\n", + "The dataretrieval package provides a number of functions to get data from the USGS NLDI. \n", + "\n", + "#### The following examples show how to use the `get_basin()` function from the dataretrieval package to get basin data from the USGS NLDI. The following arguments are supported:\n", + "\n", + "* **feature_source** (string): The name of the NLDI feature source.\n", + "* **feature_id** (string): The identifier of the NLDI feature.\n", + "* **simplified** (boolean): If True, the data will be returned with simplified polygons. If False, the data will be returned as a single polygon (default is False).\n", + "* **split_catchment** (boolean): If True, the data will be returned with split catchment polygons. If False, the data will be returned as a single polygon (default is False) NOTE: Setting this to True may result in error due to a known issue with NLDI API.\n", + "* **as_json** (boolean): If True, the data will be returned as a python dictionary. 
If False, the data will be returned as a geopandas dataframe (default is False).\n" + ], + "id": "213e4c0d0b983a19" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Example 1: Get aggregated basin level data for a single feature source.", + "id": "9900c519345f9d2f" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "# set the parameters needed to retrieve data\n", + "feature_source = \"WQP\"\n", + "feature_id = \"USGS-01031500\"" + ], + "id": "8db53f3d4d004e65", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the basin data as a geopandas dataframe", + "id": "c8595b1e706a8468" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "gdf = nldi.get_basin(feature_source, feature_id)\n", + "display(gdf)" + ], + "id": "d8d0d847d8c171b6", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the basin data as GeoJSON (as_json=True)", + "id": "af599a9f632930d" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "basin_json_data = nldi.get_basin(feature_source, feature_id, as_json=True)\n", + "print(basin_json_data)" + ], + "id": "340793f67b33ff39", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "#### The following examples show how to use the `get_flowlines()` function from the dataretrieval package to get flowlines data from the USGS NLDI. 
The following arguments are supported:\n", + "\n", + "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD').\n", + "* **feature_source** (string): The name of the NLDI feature source.\n", + "* **feature_id** (string): The identifier of the NLDI feature.\n", + "* **comid** (integer): COMID (required if feature_resource is not specified).\n", + "* **distance** (integer): Distance in kilometers (default is 5).\n", + "* **as_json** (boolean): If True, the data will be returned as a python dictionary. If False, the data will be returned as a geopandas dataframe (default is False)." + ], + "id": "23a84052f0711d2" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Example 1: Get the flowlines data using feature_source and feature_id", + "id": "3dc19d7dd78e3173" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the flowlines data as a geopandas dataframe", + "id": "f510302d022eca43" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "gdf = nldi.get_flowlines(\n", + " navigation_mode='UM', feature_source=\"WQP\", feature_id=\"USGS-01031500\"\n", + ")\n", + "display(gdf)" + ], + "id": "404457b0b8ea283c", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the flowlines data as GeoJSON (as_json=True)", + "id": "8e21e235eb64f446" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "flowlines_json_data = nldi.get_flowlines(\n", + " navigation_mode='UM', feature_source=\"WQP\", feature_id=\"USGS-01031500\", as_json=True\n", + ")\n", + "print(flowlines_json_data)" + ], + "id": "c1d916a742e0e986", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Example 2: Get the flowlines data using comid", + "id": "42259375160429ab" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the flowlines data as a geopandas dataframe", + "id": 
"d01c1153d0e782a6" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "gdf = nldi.get_flowlines(navigation_mode='UM', comid=13294314)\n", + "display(gdf)" + ], + "id": "c014b708c08984e2", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the flowlines data as GeoJSON (as_json=True)", + "id": "49856c0e97950d5d" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "flowlines_json_data = nldi.get_flowlines(\n", + " navigation_mode='UM', comid=13294314, as_json=True\n", + ")\n", + "print(flowlines_json_data)" + ], + "id": "b39d360a47ba170f", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "#### The following examples show how to use the `get_features()` function from the dataretrieval package to get features data from the USGS NLDI. The following arguments are supported:\n", + "\n", + "* **data_source** (string): The name of the NLDI data source.\n", + "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD').\n", + "* **feature_source** (string): The name of the NLDI feature source.\n", + "* **feature_id** (string): The identifier of the NLDI feature (required if feature_resource is specified).\n", + "* **comid** (integer): COMID (required if feature_resource is not specified).\n", + "* **distance** (integer): Distance in kilometers (default is 50).\n", + "* **lat** (float): Latitude (required if feature for a specific location is specified).\n", + "* **long** (float): Longitude (required if feature for a specific location is specified).\n", + "* **as_json** (boolean): If True, the data will be returned as a python dictionary. If False, the data will be returned as a geopandas dataframe (default is False)." 
+ ], + "id": "b27151f75e00f649" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Example 1: Get all features along the specified navigation path.", + "id": "b24a6b1f49ed5f7d" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the features data using navigation path (UM) and origin type feature source", + "id": "77f35ddf65093622" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "gdf = nldi.get_features(\n", + " data_source=\"census2020-nhdpv2\",\n", + " navigation_mode=\"UM\",\n", + " feature_source=\"WQP\",\n", + " feature_id=\"USGS-01031500\",\n", + ")\n", + "display(gdf)" + ], + "id": "492b5bedfb71a478", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the features data using navigation path (UM) and origin type COMID ", + "id": "2a61ce386ef17c8a" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "gdf = nldi.get_features(\n", + " data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\", comid=13294314\n", + ")\n", + "display(gdf)" + ], + "id": "fe7bee5ba6e4f419", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the features data using origin type feature source (no navigation path)", + "id": "c7418ac7d155af6c" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "gdf = nldi.get_features(feature_source=\"WQP\", feature_id=\"USGS-01031500\")\n", + "display(gdf)" + ], + "id": "bbde823aba2b82ba", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the features data using navigation path (UM) and origin type COMID", + "id": "cae198983f26adef" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "gdf = nldi.get_features(\n", + " comid=13294314, data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\"\n", + ")\n", + "display(gdf)" + ], + "id": "1957fe0113e682d4", + "outputs": [], 
+ "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Get the features data for a specific location (lat, long)", + "id": "d72767444885dc2d" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "gdf = nldi.get_features(lat=43.073051, long=-89.401230)\n", + "display(gdf)" + ], + "id": "e6769b9885ef0edb", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "#### The following examples show how to use the `search()` function from the dataretrieval package to get data (basins, flowlines, and features) from the USGS NLDI. You can use this `search()` function instead of the `get_basin()`, `get_flowlines()`, and `get_features()` functions described above. The search function returns data as a python dictionary. The following arguments are supported:\n", + "\n", + "* **feature_source** (string): The name of the NLDI feature source.\n", + "* **feature_id** (string): The identifier of the NLDI feature (required if feature_resource is specified).\n", + "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD').\n", + "* **data_source** (string): The name of the NLDI data source.\n", + "* **find** (string): The specific data type to search for. Allowed values are 'basin', 'flowlines', and 'feature' (default is 'features').\n", + "* **comid** (integer): COMID (required if feature_resource is not specified).\n", + "* **lat** (float): Latitude (required if feature for a specific location is specified).\n", + "* **long** (float): Longitude (required if feature for a specific location is specified).\n", + "* **distance** (integer): Distance in kilometers (default is 50)." 
+ ], + "id": "4283a91f6b12446d" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Example 1: Get aggregated basin level data for a single feature source.", + "id": "9fbfebefedf5d5c2" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "# set the parameters needed to retrieve data\n", + "feature_source = \"WQP\"\n", + "feature_id = \"USGS-01031500\"" + ], + "id": "9fe88e0664f629e8", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "basin_data = nldi.search(\n", + " feature_source=feature_source, feature_id=feature_id, find=\"basin\"\n", + ")\n", + "print(basin_data)" + ], + "id": "d7422c075998921c", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "#### Example 2: Get flowlines data for a specified feature source.", + "id": "bc4ef96efc59550a" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "flowlines_data = nldi.search(\n", + " navigation_mode='UM',\n", + " feature_source=feature_source,\n", + " feature_id=feature_id,\n", + " find=\"flowlines\",\n", + ")\n", + "print(flowlines_data)" + ], + "id": "e247afc5b85a226c", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "### Example 3: Get all features along the specified navigation path.", + "id": "7e17c9af5d643323" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "features_data = nldi.search(\n", + " data_source=\"census2020-nhdpv2\",\n", + " navigation_mode='UM',\n", + " feature_source=feature_source,\n", + " feature_id=feature_id,\n", + " find=\"features\",\n", + ")\n", + "print(features_data)" + ], + "id": "a40613fc4fedc416", + "outputs": [], + "execution_count": null + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, 
+ "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From c3d077e1a125406e40f6b9496784170950b8e4dd Mon Sep 17 00:00:00 2001 From: pkdash Date: Fri, 30 Aug 2024 15:20:41 -0400 Subject: [PATCH 2/9] [#157] fixing corrupted notebook --- demos/hydroshare/USGS_dataretrieval_SiteInfo_Examples.ipynb | 2 +- .../hydroshare/USGS_dataretrieval_SiteInventory_Examples.ipynb | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/demos/hydroshare/USGS_dataretrieval_SiteInfo_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_SiteInfo_Examples.ipynb index c9acc633..51007f9b 100644 --- a/demos/hydroshare/USGS_dataretrieval_SiteInfo_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_SiteInfo_Examples.ipynb @@ -98,7 +98,7 @@ "* **siteOutput** (string 'basic' or 'expanded'): Indicates the richness of metadata you want for site attributes. Note that for visually oriented formats like Google Map format, this argument has no meaning. Note: for performance reasons, siteOutput='expanded' cannot be used if seriesCatalogOutput=true or with any values for outputDataTypeCd.\n", "* **seriesCatalogOutput** (boolean): A switch that provides detailed period of record information for certain output formats. 
The period of record indicates date ranges for a certain kind of information about a site, for example the start and end dates for a site's daily mean streamflow.\n", "\n", - "For additional parameter options see https://waterservices.usgs.gov/docs/site-service/site-service-details/ + "For additional parameter options see https://waterservices.usgs.gov/docs/site-service/site-service-details" ] }, { diff --git a/demos/hydroshare/USGS_dataretrieval_SiteInventory_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_SiteInventory_Examples.ipynb index 70193ddd..d8a2ceb1 100644 --- a/demos/hydroshare/USGS_dataretrieval_SiteInventory_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_SiteInventory_Examples.ipynb @@ -96,7 +96,7 @@ "* **siteOutput** (string 'basic' or 'expanded'): Indicates the richness of metadata you want for site attributes. Note that for visually oriented formats like Google Map format, this argument has no meaning. Note: for performance reasons, siteOutput='expanded' cannot be used if seriesCatalogOutput=true or with any values for outputDataTypeCd.\n", "* **seriesCatalogOutput** (boolean): A switch that provides detailed period of record information for certain output formats. 
The period of record indicates date ranges for a certain kind of information about a site, for example the start and end dates for a site's daily mean streamflow.\n", "\n", - "For additional parameter options see https://waterservices.usgs.gov/docs/site-service/site-service-details/ + "For additional parameter options see https://waterservices.usgs.gov/docs/site-service/site-service-details" ] }, { From 26bcad1e84c3a404d6600dc36d4e9afb17c4c87d Mon Sep 17 00:00:00 2001 From: pkdash Date: Fri, 30 Aug 2024 15:22:44 -0400 Subject: [PATCH 3/9] [#157] deleting empty cell --- .../USGS_dataretrieval_Statistics_Examples.ipynb | 7 ------- 1 file changed, 7 deletions(-) diff --git a/demos/hydroshare/USGS_dataretrieval_Statistics_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_Statistics_Examples.ipynb index e386a301..f67f1510 100644 --- a/demos/hydroshare/USGS_dataretrieval_Statistics_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_Statistics_Examples.ipynb @@ -286,13 +286,6 @@ " startDt=\"2000\", endDt=\"2007\")\n", "display(x3[0])" ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { From 9ed87298d1c62e593187ae727eaa1a64bb6fec62 Mon Sep 17 00:00:00 2001 From: pkdash Date: Fri, 30 Aug 2024 15:24:50 -0400 Subject: [PATCH 4/9] [#157] adding examples with no multi-index --- ...S_dataretrieval_DailyValues_Examples.ipynb | 19 ++++++++- ...retrieval_GroundwaterLevels_Examples.ipynb | 17 ++++++++ .../USGS_dataretrieval_Peaks_Examples.ipynb | 23 +++++++---- ...GS_dataretrieval_UnitValues_Examples.ipynb | 17 ++++++++ ..._dataretrieval_WaterSamples_Examples.ipynb | 39 +++++++++++++++++++ 5 files changed, 107 insertions(+), 8 deletions(-) diff --git a/demos/hydroshare/USGS_dataretrieval_DailyValues_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_DailyValues_Examples.ipynb index e12e6eef..1c3cf92e 100644 --- a/demos/hydroshare/USGS_dataretrieval_DailyValues_Examples.ipynb +++ 
b/demos/hydroshare/USGS_dataretrieval_DailyValues_Examples.ipynb @@ -286,6 +286,23 @@ "display(dailyMultiSites[0])" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "dailyMultiSites = nwis.get_dv(sites=[\"01491000\", \"01645000\"], parameterCd=[\"00010\", \"00060\"],\n", + " start=\"2012-01-01\", end=\"2012-06-30\", statCd=[\"00001\",\"00003\"],\n", + " multi_index=False)\n", + "display(dailyMultiSites[0])" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -330,4 +347,4 @@ }, "nbformat": 4, "nbformat_minor": 1 -} \ No newline at end of file +} diff --git a/demos/hydroshare/USGS_dataretrieval_GroundwaterLevels_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_GroundwaterLevels_Examples.ipynb index a0afd9f9..5c31853b 100644 --- a/demos/hydroshare/USGS_dataretrieval_GroundwaterLevels_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_GroundwaterLevels_Examples.ipynb @@ -286,6 +286,23 @@ "display(data2[0])" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "site_ids = [\"434400121275801\", \"375907091432201\"]\n", + "data2 = nwis.get_gwlevels(sites=site_ids, multi_index=False)\n", + "print(\"Retrieved \" + str(len(data2[0])) + \" data values.\")\n", + "display(data2[0])" + ] + }, { "cell_type": "markdown", "metadata": { diff --git a/demos/hydroshare/USGS_dataretrieval_Peaks_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_Peaks_Examples.ipynb index 3dd77e66..236eff05 100644 --- a/demos/hydroshare/USGS_dataretrieval_Peaks_Examples.ipynb +++ 
b/demos/hydroshare/USGS_dataretrieval_Peaks_Examples.ipynb @@ -183,6 +183,22 @@ "print(\"The query URL used to retrieve the data from NWIS was: \" + peak_data[1].url)" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "site_ids = ['01594440', '040851325']\n", + "peak_data = nwis.get_discharge_peaks(site_ids, multi_index=False)\n", + "print(\"Retrieved \" + str(len(peak_data[0])) + \" data values.\")" + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -239,13 +255,6 @@ "data4 = nwis.get_discharge_peaks(stations, start='1953-01-01', end='1960-01-01')\n", "display(data4[0])" ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { diff --git a/demos/hydroshare/USGS_dataretrieval_UnitValues_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_UnitValues_Examples.ipynb index c38402fd..c24fc587 100644 --- a/demos/hydroshare/USGS_dataretrieval_UnitValues_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_UnitValues_Examples.ipynb @@ -366,6 +366,23 @@ "print('Retrieved ' + str(len(discharge_multisite[0])) + ' data values.')\n", "display(discharge_multisite[0])" ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "discharge_multisite = nwis.get_iv(sites=['04024430', '04024000'], parameterCd=parameterCode,\n", + " start='2013-10-01', end='2013-10-01', multi_index=False)\n", + "print('Retrieved ' + str(len(discharge_multisite[0])) + ' data values.')\n", + "display(discharge_multisite[0])" + ] } ], "metadata": { diff --git 
a/demos/hydroshare/USGS_dataretrieval_WaterSamples_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_WaterSamples_Examples.ipynb index 0bb8b1f7..44a9f3b3 100644 --- a/demos/hydroshare/USGS_dataretrieval_WaterSamples_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_WaterSamples_Examples.ipynb @@ -225,6 +225,24 @@ "display(wq_multi_site[0])" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "site_ids = ['04024430', '04024000']\n", + "parameter_code = '00065'\n", + "wq_multi_site = nwis.get_qwdata(sites=site_ids, parameterCd=parameter_code, multi_index=False)\n", + "print('Retrieved data for ' + str(len(wq_multi_site[0])) + ' samples.')\n", + "display(wq_multi_site[0])" + ] + }, { "cell_type": "markdown", "metadata": { @@ -260,6 +278,27 @@ "display(wq_data2[0])\n" ] }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "The following example is the same as the previous example but with multi index turned off (multi_index=False)" + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "site_ids = ['04024430', '04024000']\n", + "parameterCd = ['34247', '30234', '32104', '34220']\n", + "startDate = '2012-01-01'\n", + "endDate = ''\n", + "wq_data2 = nwis.get_qwdata(sites=site_ids, parameterCd=parameterCd,\n", + " start=startDate, end=endDate, multi_index=False)\n", + "print('Retrieved data for ' + str(len(wq_data2[0])) + ' samples.')\n", + "display(wq_data2[0])" + ] + }, { "cell_type": "markdown", "metadata": {}, From a4ac32fea30c7809b8c4f00b8c7dec868b60c972 Mon Sep 17 00:00:00 2001 From: pkdash Date: Fri, 30 Aug 2024 15:35:11 -0400 Subject: [PATCH 5/9] [#157] reformating code cells --- .../USGS_dataretrieval_NLDI_Examples.ipynb | 47 ++++--------------- 1 file 
changed, 10 insertions(+), 37 deletions(-) diff --git a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb index 79f62d51..e884a363 100644 --- a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb @@ -146,9 +146,7 @@ "metadata": {}, "cell_type": "code", "source": [ - "gdf = nldi.get_flowlines(\n", - " navigation_mode='UM', feature_source=\"WQP\", feature_id=\"USGS-01031500\"\n", - ")\n", + "gdf = nldi.get_flowlines(navigation_mode='UM', feature_source=\"WQP\", feature_id=\"USGS-01031500\")\n", "display(gdf)" ], "id": "404457b0b8ea283c", @@ -165,9 +163,7 @@ "metadata": {}, "cell_type": "code", "source": [ - "flowlines_json_data = nldi.get_flowlines(\n", - " navigation_mode='UM', feature_source=\"WQP\", feature_id=\"USGS-01031500\", as_json=True\n", - ")\n", + "flowlines_json_data = nldi.get_flowlines(navigation_mode='UM', feature_source=\"WQP\", feature_id=\"USGS-01031500\", as_json=True)\n", "print(flowlines_json_data)" ], "id": "c1d916a742e0e986", @@ -207,9 +203,7 @@ "metadata": {}, "cell_type": "code", "source": [ - "flowlines_json_data = nldi.get_flowlines(\n", - " navigation_mode='UM', comid=13294314, as_json=True\n", - ")\n", + "flowlines_json_data = nldi.get_flowlines(navigation_mode='UM', comid=13294314, as_json=True)\n", "print(flowlines_json_data)" ], "id": "b39d360a47ba170f", @@ -250,12 +244,7 @@ "metadata": {}, "cell_type": "code", "source": [ - "gdf = nldi.get_features(\n", - " data_source=\"census2020-nhdpv2\",\n", - " navigation_mode=\"UM\",\n", - " feature_source=\"WQP\",\n", - " feature_id=\"USGS-01031500\",\n", - ")\n", + "gdf = nldi.get_features(data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\", feature_source=\"WQP\", feature_id=\"USGS-01031500\")\n", "display(gdf)" ], "id": "492b5bedfb71a478", @@ -272,9 +261,7 @@ "metadata": {}, "cell_type": "code", "source": [ - "gdf = nldi.get_features(\n", - " 
data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\", comid=13294314\n", - ")\n", + "gdf = nldi.get_features(data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\", comid=13294314)\n", "display(gdf)" ], "id": "fe7bee5ba6e4f419", @@ -308,9 +295,7 @@ "metadata": {}, "cell_type": "code", "source": [ - "gdf = nldi.get_features(\n", - " comid=13294314, data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\"\n", - ")\n", + "gdf = nldi.get_features(comid=13294314, data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\")\n", "display(gdf)" ], "id": "1957fe0113e682d4", @@ -374,9 +359,7 @@ "metadata": {}, "cell_type": "code", "source": [ - "basin_data = nldi.search(\n", - " feature_source=feature_source, feature_id=feature_id, find=\"basin\"\n", - ")\n", + "basin_data = nldi.search(feature_source=feature_source, feature_id=feature_id, find=\"basin\")\n", "print(basin_data)" ], "id": "d7422c075998921c", @@ -393,12 +376,7 @@ "metadata": {}, "cell_type": "code", "source": [ - "flowlines_data = nldi.search(\n", - " navigation_mode='UM',\n", - " feature_source=feature_source,\n", - " feature_id=feature_id,\n", - " find=\"flowlines\",\n", - ")\n", + "flowlines_data = nldi.search(navigation_mode='UM', feature_source=feature_source, feature_id=feature_id, find=\"flowlines\")\n", "print(flowlines_data)" ], "id": "e247afc5b85a226c", @@ -415,13 +393,8 @@ "metadata": {}, "cell_type": "code", "source": [ - "features_data = nldi.search(\n", - " data_source=\"census2020-nhdpv2\",\n", - " navigation_mode='UM',\n", - " feature_source=feature_source,\n", - " feature_id=feature_id,\n", - " find=\"features\",\n", - ")\n", + "features_data = nldi.search(data_source=\"census2020-nhdpv2\", navigation_mode='UM', feature_source=feature_source,\n", + " feature_id=feature_id, find=\"features\")\n", "print(features_data)" ], "id": "a40613fc4fedc416", From 1a63ead76d65c79209f3c973ef673fe2d3e796a7 Mon Sep 17 00:00:00 2001 From: pkdash Date: Tue, 10 Sep 2024 09:47:51 -0400 Subject: [PATCH 6/9] 
[#157] updates to nldi examples notebook by Jeff Horsburgh --- .../USGS_dataretrieval_NLDI_Examples.ipynb | 602 ++++++++++++------ 1 file changed, 397 insertions(+), 205 deletions(-) diff --git a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb index e884a363..80402420 100644 --- a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb @@ -1,424 +1,616 @@ { "cells": [ { - "metadata": {}, "cell_type": "markdown", + "id": "94cf2fc11d917e1b", + "metadata": { + "tags": [] + }, "source": [ "# USGS dataretrieval Python Package NLDI Data Access Examples\n", "\n", - "This notebook provides examples of using the Python dataretrieval package to retrieve data from the United States Geological Survey (USGS) Hydro Network-Linked Data Index (NLDI). The dataretrieval package provides a collection of functions to get data from the USGS Hydro Network-Linked Data Index (NLDI)." - ], - "id": "94cf2fc11d917e1b" + "This notebook provides examples of using the Python dataretrieval package to retrieve data from the United States Geological Survey (USGS) Hydro Network-Linked Data Index (NLDI). The dataretrieval package provides a collection of functions to get data from the USGS Hydro Network-Linked Data Index (NLDI). For more information about NLDI visit the [USGS website](https://labs.waterdata.usgs.gov/docs/nldi/about-nldi/index.html) describing NLDI or [this blog post](https://waterdata.usgs.gov/blog/nldi-intro/) that covers basic features of the NLDI.\n", + "\n", + "From the [NLDI API documentation](https://labs.waterdata.usgs.gov/api/nldi/swagger-ui/index.html): The NLDI is a search service that takes a watershed outlet identifier as a starting point, a navigation mode to perform, and the type of data desired in response to the request. It can provide geospatial representations of the navigation or linked data sources found along the navigation. 
It also has the ability to return landscape characteristics for the catchment the watershed outlet is contained in or the total upstream basin." + ] }, { - "metadata": {}, "cell_type": "markdown", + "id": "8695bd7a7b335650", + "metadata": {}, "source": [ "### Install the Package\n", "\n", - "Use the following code to install the package if it doesn't exist already within your Jupyter Python environment. Note the `nldi` option in the `dataretrieval` package installation. The default `dataretrieval` does not support NLDI data access." - ], - "id": "8695bd7a7b335650" + "Use the following code to install the package if it doesn't exist already within your Jupyter Python environment. Note the `nldi` option in the `dataretrieval` package installation. The default `dataretrieval` package installation does not support NLDI data access. You must run the dataretrieval install with the `nldi` option to ensure that you have the necessary capabilities. If you are running this notebook in the CUAHSI JupyterHub server linked to HydroShare, you will want to run the following pip install command. The base `dataretrieval` package is already installed in the CUAHSI JupyterHub Python environment, but it does not include the NLDI option." + ] }, { - "metadata": {}, "cell_type": "code", - "source": "!pip install dataretrieval[nldi]", + "execution_count": null, "id": "2cb985f4a60ce046", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "!pip install dataretrieval[nldi]" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "Load the package so that you can use its functions in this notebook.", - "id": "ef27dd9de0b05a9a" + "id": "ef27dd9de0b05a9a", + "metadata": {}, + "source": [ + "Load the package so that you can use its functions in this notebook." 
+ ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "from dataretrieval import nldi\n", - "from IPython.display import display" - ], + "execution_count": null, "id": "aa0f8aad72102b29", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "from dataretrieval import nldi\n", + "from IPython.display import display\n", + "import folium" + ] }, { - "metadata": {}, "cell_type": "markdown", + "id": "213e4c0d0b983a19", + "metadata": { + "tags": [] + }, "source": [ - "### Basic Usage\n", + "***\n", + "\n", + "## Usage Examples\n", + "\n", + "The dataretrieval package provides a number of functions to get data from the USGS NLDI. The following sections provide examples of how each of the available functions can be used:\n", "\n", - "The dataretrieval package provides a number of functions to get data from the USGS NLDI. \n", + "* `get_basin()`: function used to get the upstream basin boundary for any feature indexed by NLDI.\n", + "* `get_flowlines()`: function used to get upstream or downstream flowline data from NLDI.\n", + "* `get_features()`: function used to retrieve information about features from any source indexed by NLDI.\n", + "* `search()`: general function used to retrieve data from NLDI. Can be used in place of the functions listed above.\n", "\n", - "#### The following examples show how to use the `get_basin()` function from the dataretrieval package to get basin data from the USGS NLDI. The following arguments are supported:\n", + "***\n", + "\n", + "### Examples for the `get_basin()` function:\n", + "\n", + "The following examples show how to use the `get_basin()` function from the dataretrieval package to retrieve the upstream basin boundary (the contributing watershed for a feature) from the USGS NLDI associated with any feature indexed by NLDI. 
\n", + "\n", + "The following arguments are supported:\n", "\n", "* **feature_source** (string): The name of the NLDI feature source.\n", "* **feature_id** (string): The identifier of the NLDI feature.\n", "* **simplified** (boolean): If True, the data will be returned with simplified polygons. If False, the data will be returned as a single polygon (default is False).\n", - "* **split_catchment** (boolean): If True, the data will be returned with split catchment polygons. If False, the data will be returned as a single polygon (default is False) NOTE: Setting this to True may result in error due to a known issue with NLDI API.\n", + "* **split_catchment** (boolean): If True, the data will be returned with split catchment polygons. If False, the data will be returned as a single polygon (default is False). NOTE: Setting this to True may result in an error due to a known issue with the NLDI API.\n", "* **as_json** (boolean): If True, the data will be returned as a python dictionary. If False, the data will be returned as a geopandas dataframe (default is False).\n" - ], - "id": "213e4c0d0b983a19" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "#### Example 1: Get aggregated basin level data for a single feature source.", - "id": "9900c519345f9d2f" + "id": "9900c519345f9d2f", + "metadata": {}, + "source": [ + "#### Example 1: Get the upstream basin for a single feature from a single source.\n", + "In this example, we will retrieve the boundary for the upstream basin for a single monitoring site feature. For the monitoring site feature, we will use a USGS water quality monitoring station. The feature source is the Water Quality Portal (WQP), and the identifier of the feature is the USGS water quality monitoring site identifier prepended with the agency code for USGS (USGS-01031500). You can use the identifiers for any monitoring station indexed by the NLDI to retrieve its upstream watershed boundary." 
+ ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "# set the parameters needed to retrieve data\n", - "feature_source = \"WQP\"\n", - "feature_id = \"USGS-01031500\"" - ], + "execution_count": null, "id": "8db53f3d4d004e65", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "# set the parameters needed to retrieve data\n", + "feat_source = 'WQP'\n", + "feat_id = 'USGS-01031500'" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "Get the basin data as a geopandas dataframe", - "id": "c8595b1e706a8468" + "id": "c8595b1e706a8468", + "metadata": {}, + "source": [ + "Now we can call the `get_basin` function to get the coordinates of the polygon making up the basin boundary associated with this monitoring site - i.e., the watershed area upstream of the given monitoring site. The result will be returned as a geopandas dataframe unless the `as_json` argument is used and set to True." + ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "gdf = nldi.get_basin(feature_source, feature_id)\n", - "display(gdf)" - ], + "execution_count": null, "id": "d8d0d847d8c171b6", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "basin_gdf = nldi.get_basin(feature_source=feat_source, feature_id=feat_id)\n", + "display(basin_gdf)" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "Get the basin data as GeoJSON (as_json=True)", - "id": "af599a9f632930d" + "id": "af599a9f632930d", + "metadata": {}, + "source": [ + "If you want to get the basin boundary coordinate data in GeoJSON format, you can use the `as_json` argument in the function call (as_json=True)" + ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "basin_json_data = nldi.get_basin(feature_source, feature_id, as_json=True)\n", - "print(basin_json_data)" - ], + "execution_count": null, "id": "340793f67b33ff39", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "basin_json_data = 
nldi.get_basin(feature_source=feat_source, feature_id=feat_id, as_json=True)\n", + "print(basin_json_data)" + ] + }, + { + "cell_type": "markdown", + "id": "c077de87-a826-4d93-b96a-bd65d98706f5", + "metadata": {}, + "source": [ + "Make a quick map of the selected monitoring station and the upstream boundary returned by `get_basin()`." + ] }, { + "cell_type": "code", + "execution_count": null, + "id": "9ff150da-b55f-4b19-97a5-75d5aa4d63f4", "metadata": {}, + "outputs": [], + "source": [ + "# Get the feature associated with the monitoring site \n", + "# More examples of how to use the get_features() function are given below\n", + "site_gdf = nldi.get_features(feature_source=feat_source, feature_id=feat_id)\n", + "\n", + "# Set the Coordinate Reference System (CRS) for the GeoDataFrames \n", + "# containing the basin boundary coordinates and the monitoring site\n", + "# epsg='4326' for WGS84\n", + "basin_gdf.set_crs(epsg='4326', inplace=True)\n", + "site_gdf.set_crs(epsg='4326', inplace=True)\n", + "\n", + "# Create a base map using folium (location is [latitude, longitude])\n", + "m = folium.Map(location=[site_gdf.geometry.y[0], site_gdf.geometry.x[0]], zoom_start=10)\n", + "\n", + "# Add the selected monitoring location and basin features to the map\n", + "folium.GeoJson(site_gdf, name='Monitoring Location').add_to(m)\n", + "folium.GeoJson(basin_gdf, name='Basin Boundary', color='red').add_to(m)\n", + "\n", + "# Zoom the map to the bounds of the data\n", + "bounds = m.get_bounds()\n", + "m.fit_bounds(bounds)\n", + "\n", + "# Add layer control to toggle layers\n", + "folium.LayerControl().add_to(m)\n", + "\n", + "# Display the map\n", + "m" + ] + }, + { "cell_type": "markdown", + "id": "23a84052f0711d2", + "metadata": {}, "source": [ - "#### The following examples show how to use the `get_flowlines()` function from the dataretrieval package to get flowlines data from the USGS NLDI. 
The following arguments are supported:\n", + "***\n", + "\n", + "### Examples for the `get_flowlines()` function:\n", + "\n", + "The following examples show how to use the `get_flowlines()` function from the dataretrieval package to get flowline data from the USGS NLDI. Flowlines can be traced upsream or downstream from any indexed feature (e.g., a USGS monitoring location), and the result can include mainstem only or tributaries. Flowlines are returned for the specified navigation in WGS84 latitude/longitude coordinats in GeoJSON format.\n", "\n", - "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD').\n", + "The following arguments are supported:\n", + "\n", + "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD'). 'UM' = Upstream on the mainstem. 'DM' = Downstream on the mainstem. 'UT' = Upstream with tributaries. 'DD' = Downstream with diversions.\n", "* **feature_source** (string): The name of the NLDI feature source.\n", "* **feature_id** (string): The identifier of the NLDI feature.\n", "* **comid** (integer): COMID (required if feature_resource is not specified).\n", - "* **distance** (integer): Distance in kilometers (default is 5).\n", + "* **distance** (integer): Distance in kilometers (default is 5). This distance parameter dictates the distance to navigate.\n", "* **as_json** (boolean): If True, the data will be returned as a python dictionary. If False, the data will be returned as a geopandas dataframe (default is False)." 
- ], - "id": "23a84052f0711d2" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "#### Example 1: Get the flowlines data using feature_source and feature_id", - "id": "3dc19d7dd78e3173" - }, - { + "id": "3dc19d7dd78e3173", "metadata": {}, - "cell_type": "markdown", - "source": "Get the flowlines data as a geopandas dataframe", - "id": "f510302d022eca43" + "source": [ + "#### Example 1: Get the flowline data using feature_source and feature_id\n", + "Get the flowline data tracing upstream from a USGS water quality monitoring site, with the result returned as a geopandas dataframe. In this example, we trace upstream including tributaries (navigation_mode='UT') and set the distance argument to a value large enough to retrieve all of the flowlines upstream of the monitoring station." + ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "gdf = nldi.get_flowlines(navigation_mode='UM', feature_source=\"WQP\", feature_id=\"USGS-01031500\")\n", - "display(gdf)" - ], + "execution_count": null, "id": "404457b0b8ea283c", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "feat_source = 'WQP'\n", + "feat_id = 'USGS-01031500'\n", + "\n", + "flowlines_gdf = nldi.get_flowlines(navigation_mode='UT', feature_source=feat_source, feature_id=feat_id, distance=100)\n", + "display(flowlines_gdf)" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "Get the flowlines data as GeoJSON (as_json=True)", - "id": "8e21e235eb64f446" + "id": "8e21e235eb64f446", + "metadata": {}, + "source": [ + "Get the same flowline data with the result returned as GeoJSON (as_json=True)." 
+ ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "flowlines_json_data = nldi.get_flowlines(navigation_mode='UM', feature_source=\"WQP\", feature_id=\"USGS-01031500\", as_json=True)\n", - "print(flowlines_json_data)" - ], + "execution_count": null, "id": "c1d916a742e0e986", + "metadata": {}, "outputs": [], - "execution_count": null + 
"source": [ + "flowlines_json_data = nldi.get_flowlines(navigation_mode='UT', feature_source=feat_source, feature_id=feat_id, distance=100, as_json=True)\n", + "print(flowlines_json_data)" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "#### Example 2: Get the flowlines data using comid", - "id": "42259375160429ab" + "id": "62c634be-74e1-4476-86f1-b7bd8fdbbd76", + "metadata": {}, + "source": [ + "Add the retrieved flowline data to the map for visualization of what is returned. To change the distance traced upstream on the flowlines, change the value of the distance argument in the `get_flowlines()` function call and set the distance upstream you want to trace. You can also change the navigation_mode argument to include or exclude tributaries." + ] }, { + "cell_type": "code", + "execution_count": null, + "id": "0e9b0a6f-faef-40a4-ae34-a75e210bf5d2", "metadata": {}, - "cell_type": "markdown", - "source": "Get the flowlines data as a geopandas dataframe", - "id": "d01c1153d0e782a6" + "outputs": [], + "source": [ + "# Get the feature associated with the monitoring site \n", + "# More examples of how to use the get_features() function are given below\n", + "site_gdf = nldi.get_features(feature_source=feat_source, feature_id=feat_id)\n", + "\n", + "# Set the Coordinate Reference System (CRS) for the GeoDataFrames \n", + "# containing the flowlines and the monitoring site\n", + "# epsg='4326' for WGS84\n", + "site_gdf.set_crs(epsg='4326', inplace=True)\n", + "flowlines_gdf.set_crs(epsg='4326', inplace=True)\n", + "\n", + "# Create a base map using folium (location is [latitude, longitude])\n", + "m = folium.Map(location=[site_gdf.geometry.y[0], site_gdf.geometry.x[0]], zoom_start=10)\n", + "\n", + "# Add the monitoring location, basin boundary, and flowlines to the map\n", + "folium.GeoJson(site_gdf, name='Monitoring Location').add_to(m)\n", + "folium.GeoJson(basin_gdf, name='Basin Boundary', color='red').add_to(m)\n", + "folium.GeoJson(flowlines_gdf, 
name='Flowlines', 
color='blue').add_to(m)\n", + "\n", + "# Zoom the map to the bounds of the data\n", + "bounds = m.get_bounds()\n", + "m.fit_bounds(bounds)\n", + "\n", + "# Add layer control to toggle layers\n", + "folium.LayerControl().add_to(m)\n", + "\n", + "# Display the map\n", + "m" + ] }, { + "cell_type": "markdown", + "id": "42259375160429ab", "metadata": {}, + "source": [ + "#### Example 2: Get flowline data using a NHDPlus comid\n", + "In some cases, you may wish to get flowline data associated with a NHDPlus COMID rather than flowlines associated with a monitoring station. The following shows how to get flowline data for a COMID as a geopandas dataframe." + ] + }, + { "cell_type": "code", + "execution_count": null, + "id": "c014b708c08984e2", + "metadata": {}, + "outputs": [], "source": [ "gdf = nldi.get_flowlines(navigation_mode='UM', comid=13294314)\n", "display(gdf)" - ], - "id": "c014b708c08984e2", - "outputs": [], - "execution_count": null + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "Get the flowlines data as GeoJSON (as_json=True)", - "id": "49856c0e97950d5d" + "id": "49856c0e97950d5d", + "metadata": {}, + "source": [ + "To get the same flowline data as GeoJSON (as_json=True) instead of as a geopandas dataframe, do the following." + ] }, { - "metadata": {}, "cell_type": "code", + "execution_count": null, + "id": "b39d360a47ba170f", + "metadata": {}, + "outputs": [], "source": [ "flowlines_json_data = nldi.get_flowlines(navigation_mode='UM', comid=13294314, as_json=True)\n", "print(flowlines_json_data)" - ], - "id": "b39d360a47ba170f", - "outputs": [], - "execution_count": null + ] }, { - "metadata": {}, "cell_type": "markdown", + "id": "b27151f75e00f649", + "metadata": {}, "source": [ - "#### The following examples show how to use the `get_features()` function from the dataretrieval package to get features data from the USGS NLDI. 
The following arguments are supported:\n", + "***\n", + "\n", + "### Examples for the `get_features()` function:\n", + "\n", + "The following examples show how to use the `get_features()` function from the dataretrieval package to get features data from the USGS NLDI. The `get_features()` function returns all features found along the specified navigation as points in WGS84 latitude/longitude encoded as GeoJSON.\n", + "\n", + "The following arguments are supported:\n", "\n", "* **data_source** (string): The name of the NLDI data source.\n", - "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD').\n", + "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD'). 'UM' = Upstream on the mainstem. 'DM' = Downstream on the mainstem. 'UT' = Upstream with tributaries. 'DD' = Downstream with diversions.\n", "* **feature_source** (string): The name of the NLDI feature source.\n", - "* **feature_id** (string): The identifier of the NLDI feature (required if feature_resource is specified).\n", - "* **comid** (integer): COMID (required if feature_resource is not specified).\n", - "* **distance** (integer): Distance in kilometers (default is 50).\n", + "* **feature_id** (string): The identifier of the NLDI feature (required if feature_source is specified).\n", + "* **comid** (integer): COMID (required if feature_source is not specified).\n", + "* **distance** (integer): Distance in kilometers (default is 50). This distance parameter dictates the distance to navigate.\n", "* **lat** (float): Latitude (required if feature for a specific location is specified).\n", "* **long** (float): Longitude (required if feature for a specific location is specified).\n", "* **as_json** (boolean): If True, the data will be returned as a python dictionary. If False, the data will be returned as a geopandas dataframe (default is False)." 
- ], - "id": "b27151f75e00f649" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "### Example 1: Get all features along the specified navigation path.", - "id": "b24a6b1f49ed5f7d" - }, - { + "id": "b24a6b1f49ed5f7d", "metadata": {}, - "cell_type": "markdown", - "source": "Get the features data using navigation path (UM) and origin type feature source", - "id": "77f35ddf65093622" + "source": [ + "#### Example 1: Get all indexed features along a specified navigation path\n", + "Get all of the indexed features from a particular data source using a navigation path that traces upstream along the mainstam (UM) and uses as an origin for the trace a feature from a given feature source (e.g., a monitoring station from the Water Quality Portal). You can get any indexed features from any of the included data sources. Example data sources and the codes you need to retrieve features from those sources include:\n", + "* \"census2020-nhdpv2\" - 2020 Census Block - NHDPlusV2 Catchment Intersections\n", + "* \"epa_nrsa\" - EPA National Rivers and Streams Assessment\n", + "* \"gfv11_pois\" - USGS Geospatial Fabric V1.1 Points of Interest\n", + "* \"huc12pp\" - HUC12 Pour Points NHDPlusV2\n", + "* \"npdes\" - NPDES Facilities that Discharge to Water\n", + "* \"nwisgw\" - NWIS Groundwater Sites\n", + "* \"nwissite\" - NWIS Surface Water Sites\n", + "* \"WQP\" - Water Quality Portal\n", + "* \"comid\" - NHDPlus comid\n", + "\n", + "There are a few others - for a complete list, consult the getDataSources endpoint of the NLDI API. For this example, we will trace upstream along the mainstem and find any NWIS surface water sites located along the mainstem of the river." 
+ ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "gdf = nldi.get_features(data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\", feature_source=\"WQP\", feature_id=\"USGS-01031500\")\n", - "display(gdf)" - ], + "execution_count": null, "id": "492b5bedfb71a478", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "feat_source = 'WQP'\n", + "feat_id = 'USGS-01031500'\n", + "features_gdf = nldi.get_features(data_source='nwissite', navigation_mode='UM', feature_source=feat_source, feature_id=feat_id)\n", + "display(features_gdf)" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "Get the features data using navigation path (UM) and origin type COMID ", - "id": "2a61ce386ef17c8a" + "id": "ee85219d-4950-4350-8a2e-68f38fe0c096", + "metadata": {}, + "source": [ + "Add the returned features to the map." + ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "gdf = nldi.get_features(data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\", comid=13294314)\n", - "display(gdf)" - ], - "id": "fe7bee5ba6e4f419", + "execution_count": null, + "id": "935739e5-e438-40fe-b8e1-7dc123dbe878", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "# Get the feature associated with the monitoring site \n", + "# More examples of how to use the get_features() function are given below\n", + "site_gdf = nldi.get_features(feature_source=feat_source, feature_id=feat_id)\n", + "\n", + "# Set the Coordinate Reference System (CRS) for the GeoDataFrames \n", + "# containing the indexed features and the monitoring site\n", + "# epsg='4326' for WGS84\n", + "site_gdf.set_crs(epsg='4326', inplace=True)\n", + "features_gdf.set_crs(epsg='4326', inplace=True)\n", + "\n", + "# Create a base map using folium (location is [latitude, longitude])\n", + "m = folium.Map(location=[site_gdf.geometry.y[0], site_gdf.geometry.x[0]], zoom_start=10)\n", + "\n", + "# Add the selected monitoring location and basin features to 
the map\n", + 
"folium.GeoJson(site_gdf, name='Monitoring Location').add_to(m)\n", + "folium.GeoJson(basin_gdf, name='Basin Boundary', color='red').add_to(m)\n", + "folium.GeoJson(flowlines_gdf, name='Flowlines', color='blue').add_to(m)\n", + "folium.GeoJson(features_gdf, name='Features', color='green').add_to(m)\n", + "\n", + "# Zoom the map to the bounds of the data\n", + "bounds = m.get_bounds()\n", + "m.fit_bounds(bounds)\n", + "\n", + "# Add layer control to toggle layers\n", + "folium.LayerControl().add_to(m)\n", + "\n", + "# Display the map\n", + "m" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "Get the features data using origin type feature source (no navigation path)", - "id": "c7418ac7d155af6c" + "id": "2a61ce386ef17c8a", + "metadata": {}, + "source": [ + "Rather than using a water quality monitoring station as the origin, you can also use a NHDPlus COMID as the origin for the trace. The code below does the same thing as the code above except it uses a COMID as the origin for the navigation. " + ] }, { - "metadata": {}, "cell_type": "code", + "execution_count": null, + "id": "fe7bee5ba6e4f419", + "metadata": {}, + "outputs": [], "source": [ - "gdf = nldi.get_features(feature_source=\"WQP\", feature_id=\"USGS-01031500\")\n", + "gdf = nldi.get_features(data_source='nwissite', navigation_mode='UM', comid=13294314)\n", "display(gdf)" - ], - "id": "bbde823aba2b82ba", - "outputs": [], - "execution_count": null + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "Get the features data using navigation path (UM) and origin type COMID", - "id": "cae198983f26adef" + "id": "c7418ac7d155af6c", + "metadata": {}, + "source": [ + "#### Example 2: Get information about indexed features\n", + "Sometimes you may want to get information about indexed features without using any sort of navigation path. In this case, you can use `get_features()` without specifying a navigation_mode. 
The following code returns the feature information for a water quality monitoring station from the Water Quality Portal." + ] }, { - "metadata": {}, "cell_type": "code", + "execution_count": null, + "id": "bbde823aba2b82ba", + "metadata": {}, + "outputs": [], "source": [ - "gdf = nldi.get_features(comid=13294314, data_source=\"census2020-nhdpv2\", navigation_mode=\"UM\")\n", + "gdf = nldi.get_features(feature_source='WQP', feature_id='USGS-01031500')\n", "display(gdf)" - ], - "id": "1957fe0113e682d4", - "outputs": [], - "execution_count": null + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "Get the features data for a specific location (lat, long)", - "id": "d72767444885dc2d" + "id": "d72767444885dc2d", + "metadata": {}, + "source": [ + "#### Example 3: Get information about indexed features for a specific location\n", + "Get information about indexed features for a specific location by passing latitude and longitude coordinates into the `get_features()` function." + ] }, { - "metadata": {}, "cell_type": "code", + "execution_count": null, + "id": "e6769b9885ef0edb", + "metadata": {}, + "outputs": [], "source": [ "gdf = nldi.get_features(lat=43.073051, long=-89.401230)\n", "display(gdf)" - ], - "id": "e6769b9885ef0edb", - "outputs": [], - "execution_count": null + ] }, { - "metadata": {}, "cell_type": "markdown", + "id": "4283a91f6b12446d", + "metadata": {}, "source": [ - "#### The following examples show how to use the `search()` function from the dataretrieval package to get data (basins, flowlines, and features) from the USGS NLDI. You can use this `search()` function instead of the `get_basin()`, `get_flowlines()`, and `get_features()` functions described above. The search function returns data as a python dictionary. 
The following arguments are supported:\n", + "***\n", + "\n", + "### Examples for the `search()` function:\n", + "\n", + "The following examples show how to use the `search()` function from the dataretrieval package to get data (basins, flowlines, and features) from the USGS NLDI. You can use the `search()` function instead of the `get_basin()`, `get_flowlines()`, and `get_features()` functions described above. The search function returns data as a python dictionary. \n", + "\n", + "The following arguments are supported:\n", "\n", "* **feature_source** (string): The name of the NLDI feature source.\n", "* **feature_id** (string): The identifier of the NLDI feature (required if feature_resource is specified).\n", - "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD').\n", + "* **navigation_mode** (string): Navigation mode (allowed values are 'UM', 'DM', 'UT', 'DD'). 'UM' = Upstream on the mainstem. 'DM' = Downstream on the mainstem. 'UT' = Upstream with tributaries. 'DD' = Downstream with diversions.\n", "* **data_source** (string): The name of the NLDI data source.\n", "* **find** (string): The specific data type to search for. Allowed values are 'basin', 'flowlines', and 'feature' (default is 'features').\n", - "* **comid** (integer): COMID (required if feature_resource is not specified).\n", + "* **comid** (integer): NHDPlus COMID (required if feature_source is not specified).\n", "* **lat** (float): Latitude (required if feature for a specific location is specified).\n", "* **long** (float): Longitude (required if feature for a specific location is specified).\n", - "* **distance** (integer): Distance in kilometers (default is 50)." - ], - "id": "4283a91f6b12446d" + "* **distance** (integer): Distance in kilometers (default is 50). This distance parameter dictates the distance to navigate." 
+ ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "#### Example 1: Get aggregated basin level data for a single feature source.", - "id": "9fbfebefedf5d5c2" + "id": "9fbfebefedf5d5c2", + "metadata": {}, + "source": [ + "#### Example 1: Get the upstream basin for an indexed water quality monitoring station\n", + "Instead of using `get_basin()`, the `search()` function can be used to retrieve the upstream contributing area for a water quality station from the Water Quality Portal." + ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "# set the parameters needed to retrieve data\n", - "feature_source = \"WQP\"\n", - "feature_id = \"USGS-01031500\"" - ], + "execution_count": null, "id": "9fe88e0664f629e8", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "# set the parameters needed to retrieve data\n", + "feat_source = 'WQP'\n", + "feat_id = 'USGS-01031500'" + ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "basin_data = nldi.search(feature_source=feature_source, feature_id=feature_id, find=\"basin\")\n", - "print(basin_data)" - ], + "execution_count": null, "id": "d7422c075998921c", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "basin_data = nldi.search(feature_source=feat_source, feature_id=feat_id, find=\"basin\")\n", + "print(basin_data)" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "#### Example 2: Get flowlines data for a specified feature source.", - "id": "bc4ef96efc59550a" + "id": "bc4ef96efc59550a", + "metadata": {}, + "source": [ + "#### Example 2: Get flowlines data for a specified feature source\n", + "Instead of using `get_flowlines()`, the `search()` function can be used to retrieve the flowlines traced upstream or downstream from a feature. In this case, we can get the flowlines upstream along the mainstem for the water quality station in the last example." 
+ ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "flowlines_data = nldi.search(navigation_mode='UM', feature_source=feature_source, feature_id=feature_id, find=\"flowlines\")\n", - "print(flowlines_data)" - ], + "execution_count": null, "id": "e247afc5b85a226c", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "flowlines_data = nldi.search(navigation_mode='UM', feature_source=feat_source, feature_id=feat_id, find='flowlines')\n", + "print(flowlines_data)" + ] }, { - "metadata": {}, "cell_type": "markdown", - "source": "### Example 3: Get all features along the specified navigation path.", - "id": "7e17c9af5d643323" + "id": "7e17c9af5d643323", + "metadata": {}, + "source": [ + "#### Example 3: Get all features along a specified navigation path\n", + "Likewise, instead of using `get_features()`, the `search()` function can be used to retrieve all of the indexed features from a particular data source along a particular navigation path - in this case tracing upstream along the mainstem from the same water quality station in the previous examples." 
+ ] }, { - "metadata": {}, "cell_type": "code", - "source": [ - "features_data = nldi.search(data_source=\"census2020-nhdpv2\", navigation_mode='UM', feature_source=feature_source,\n", - " feature_id=feature_id, find=\"features\")\n", - "print(features_data)" - ], + "execution_count": null, "id": "a40613fc4fedc416", + "metadata": {}, "outputs": [], - "execution_count": null + "source": [ + "features_data = nldi.search(data_source='nwissite', navigation_mode='UM', feature_source=feat_source,\n", + " feature_id=feat_id, find='features')\n", + "print(features_data)" + ] } ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", - "version": 2 + "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", - "pygments_lexer": "ipython2", - "version": "2.7.6" + "pygments_lexer": "ipython3", + "version": "3.9.7" } }, "nbformat": 4, From d54a9ad9a6f549cbef2f89ea11b920ca635f1888 Mon Sep 17 00:00:00 2001 From: Timothy Hodson <34148978+thodson-usgs@users.noreply.github.com> Date: Thu, 3 Oct 2024 23:19:37 -0500 Subject: [PATCH 7/9] Update demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com> --- demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb index 80402420..4b424638 100644 --- a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb @@ -198,7 +198,7 @@ "\n", "### Examples for the `get_flowlines()` function:\n", "\n", - "The following examples show how to use the `get_flowlines()` function from the dataretrieval package to get flowline data from the 
USGS NLDI. Flowlines can be traced upsream or downstream from any indexed feature (e.g., a USGS monitoring location), and the result can include mainstem only or tributaries. Flowlines are returned for the specified navigation in WGS84 latitude/longitude coordinats in GeoJSON format.\n", + "The following examples show how to use the `get_flowlines()` function from the dataretrieval package to get flowline data from the USGS NLDI. Flowlines can be traced upstream or downstream from any indexed feature (e.g., a USGS monitoring location), and the result can include mainstem only or tributaries. Flowlines are returned for the specified navigation in WGS84 latitude/longitude coordinates in GeoJSON format.\n", "\n", "The following arguments are supported:\n", "\n", From 3303634eec9b4dcab404912f189cc4458c3d1091 Mon Sep 17 00:00:00 2001 From: Timothy Hodson <34148978+thodson-usgs@users.noreply.github.com> Date: Thu, 3 Oct 2024 23:20:15 -0500 Subject: [PATCH 8/9] Update demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com> --- demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb index 4b424638..3f85e0ad 100644 --- a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb @@ -365,7 +365,7 @@ "metadata": {}, "source": [ "#### Example 1: Get all indexed features along a specified navigation path\n", - "Get all of the indexed features from a particular data source using a navigation path that traces upstream along the mainstam (UM) and uses as an origin for the trace a feature from a given feature source (e.g., a monitoring station from the Water Quality Portal). You can get any indexed features from any of the included data sources. 
Example data sources and the codes you need to retrieve features from those sources include:\n", + "Get all of the indexed features from a particular data source using a navigation path that traces upstream along the mainstem (UM) and uses as an origin for the trace a feature from a given feature source (e.g., a monitoring station from the Water Quality Portal). You can get any indexed features from any of the included data sources. Example data sources and the codes you need to retrieve features from those sources include:\n", "* \"census2020-nhdpv2\" - 2020 Census Block - NHDPlusV2 Catchment Intersections\n", "* \"epa_nrsa\" - EPA National Rivers and Streams Assessment\n", "* \"gfv11_pois\" - USGS Geospatial Fabric V1.1 Points of Interest\n", From 0a1c1703cd7fb104aeb8e1a851fc01d46e8391e9 Mon Sep 17 00:00:00 2001 From: Timothy Hodson <34148978+thodson-usgs@users.noreply.github.com> Date: Fri, 4 Oct 2024 09:06:12 -0500 Subject: [PATCH 9/9] Update demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com> --- demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb index 3f85e0ad..c627e452 100644 --- a/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb +++ b/demos/hydroshare/USGS_dataretrieval_NLDI_Examples.ipynb @@ -301,7 +301,7 @@ "id": "42259375160429ab", "metadata": {}, "source": [ - "#### Example 2: Get flowline data using a NHDPlus comid\n", + "#### Example 2: Get flowline data using a NHDPlus COMID\n", "In some cases, you may wish to get flowline data associated with a NHDPlus COMID rather than flowlines associated with a monitoring station. The following shows how to get flowline data for a COMID as a geopandas dataframe." ] },