Skip to content

Commit

Permalink
Merge pull request #49 from arashmodrad/main
Browse files Browse the repository at this point in the history
3d vignette
  • Loading branch information
mikejohnson51 authored May 24, 2024
2 parents 0ba13b5 + a96c60f commit b54c989
Show file tree
Hide file tree
Showing 2 changed files with 211 additions and 9 deletions.
21 changes: 12 additions & 9 deletions vignettes/devcon2024-tutorial.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -33,9 +33,9 @@ hls = arrow::open_dataset(glue::glue("/Users/mjohnson/hydrofabric/v2.2/conus_hl"

# High level

The hydrofabric team is focused on delivering a consistent, interoporable, flexible, cloud native solution to those interested in hydrologic modeling.
The HydroFabric team is dedicated to delivering a consistent, interoperable, flexible, cloud-native solution for hydrologic modeling.

It provides foundational features, topology and attributes needed for mapping, geospatial analysis, machine learning, evaulation, data assimilation, NextGen (and AWI data stream, NGIAB, etc) applications, and other modleing needs (e.g. NHM, US Water Census) etc.
This solution provides foundational features, topology, and attributes necessary for a variety of applications, including mapping, geospatial analysis, machine learning, evaluation, data assimilation, NextGen, AWI data stream, NGIAB, and other modeling needs such as the National Hydrologic Model (NHM) and the U.S. Water Census

```{r, fig.align='center', echo = FALSE, fig.cap="Enterprise Hydrofabric System"}
knitr::include_graphics("../man/figures/roadmap2.png")
Expand All @@ -57,7 +57,7 @@ Heavy use of
- s3 (aws)
- defered evaluation

Today's tutorial makes a concentrated effort to highlight to highlight these using a real world example.
Today's tutorial makes a concentrated effort to highlight these features using a real-world example.

While we provide a national
For this example, we will use NWIS `06752260` that sits on the Cache La Poudre River in Fort Collins, Colorado. You can use any USGS gage you desire, or any of the `r hls` found via [lynker-spatial](https://lynker-spatial.s3-us-west-2.amazonaws.com/hydrofabric/v2.2/hydrolocations.html).
Expand All @@ -67,7 +67,7 @@ For this example, we will use NWIS `06752260` that sits on the Cache La Poudre R
knitr::include_graphics("../man/figures/hydrofabric.png")
```

If you successfully complete this tutorial you will create the data products (and skills!) needed to run the AWI datastream and NGIAB.
Successfully completing this tutorial will equip you with the data products (and skills!) needed to run the AWI datastream and NGIAB.

# Setup

Expand Down Expand Up @@ -366,8 +366,10 @@ oh BTW area was been calculated in Rl_ls

# Extacting Cross Sections

hyperlink to bew 3D-vignette
link to JOSS paper
We introduce the 3D Cross Sections Project, which provides transects along the entire river network in the CONUS (Contiguous United States). This product is developed using and indexed to the National Hydrologic Geospatial Fabric Reference [link](https://www.sciencebase.gov/catalog/item/60be0e53d34e86b93891012b) and is currently available in an optimized cloud-native format at [link](s3://lynker-spatial/hydrofabric/v20.1/3D/cross-sections/). The product is built on a 10m DEM from the 3D Elevation Program ([3DEP](https://www.usgs.gov/3d-elevation-program)) and is supplemented by machine learning models to account for missing bathymetry, covering all 2.7 million reaches in the CONUS.

In this context, we demonstrate how to use a subset of the NextGen file to locate the outlet, retrieve the ID, and pull machine-learned bathymetry information while automatically handling the crosswalk between different geospatial fabrics. We also illustrate the use of the [AHGestimation package](https://github.com/mikejohnson51/AHGestimation) to generate the missing bathymetric information. This package offers numerous functionalities, including the computation of mass-preserving hydraulic geometries, rating curves, generating cross sections from at-a-station hydraulic geometry (AHG) principles, and more.


```{r}
crosswalk <- as_sqlite(nextgen_file, "network") |>
Expand All @@ -380,9 +382,9 @@ cs <- open_dataset("s3://lynker-spatial/hydrofabric/v2.2/reference/routelink_ls/
select(hf_id, TopWdthCC, owp_dingman_r, owp_y_bf) |>
inner_join(mutate(crosswalk, hf_id = as.integer64(hf_id)), by = "hf_id") |>
collect() %>%
summarise(TW = mean(TopWdthCC),
r = mean(owp_dingman_r),
Y = mean(owp_y_bf),
summarise(TW = mean(ml_tw_nchan(m)),
r = mean(ml_r),
Y = mean(ml_y_inchan(m)),
poi_id = poi_id[1])
bathy = AHGestimation::cross_section(r = cs$r, TW = cs$TW, Ymax = cs$Y)
Expand All @@ -393,6 +395,7 @@ plot(bathy$x, bathy$Y, type = "l",
main = glue("Average XS at POI: {cs$poi_id}"))
```

For more details regarding the AHGestimation toolbox, please refer to our [JOSS publication](https://joss.theoj.org/papers/10.21105/joss.06145). For information on the machine learning methods, see our under review [publication](https://lynker-spatial.s3.us-west-2.amazonaws.com/documents/ml_manuscript.pdf). For further demonstrations, visit our [page](https://noaa-owp.github.io/hydrofabric/).

# Populate Flowpath Attributes

Expand Down
199 changes: 199 additions & 0 deletions vignettes/devcon2024_auxiliary.rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,199 @@
---
title: "Auxiliary data"
description: |
"NOAA's Enterprise Hydrofabric System"
author:
- name: "Mike Johnson"
url: https://github.com/mikejohnson51
affiliation: Lynker, NOAA-Affiliate
affiliation_url: https://lynker.com
- name: "Angus Watters"
url: https://github.com/anguswg-ucsb
affiliation: Lynker, NOAA-Affiliate
affiliation_url: https://lynker.com
- name: "Arash Modaresi Rad"
url: https://github.com/arashmodrad
affiliation: Lynker, NOAA-Affiliate
affiliation_url: https://lynker.com
- name: "James Coll"
url: https://github.com/james.coll
affiliation: Lynker, NOAA-Affiliate
affiliation_url: https://lynker.com
output: distill::distill_article
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, out.width = "100%", message = TRUE)
```


# Introduction

Welcome to this tutorial on leveraging the robust subsetting features in R to extract geospatial data effectively. Throughout this vigennet, we'll delve into using identifiers to retrieve specific geospatial data from geosptial fabrics for areas of interest. Our journey will involve not only fetching geospatial attributes but also extracting and visualizing additional data like 3D channel geometry and accessing routlink files. Let's dive in and explore the geospatial information!

## Location of Interest and Subset

First, let's ensure that we have the necessary libraries loaded:

```{r}
library(hydrofabric)
library(ggplot2)
library(plotly)
```

Now, let's extract the geospatial fabric for `comid = 101`:

```{r}
geosptial_output <- get_subset(comid = 101,
source = "s3://lynker-spatial/hydrofabric",
type = 'reference',
hf_version = "2.2")
```

That's easy!

## Accessing routelink files

## 3D Channel Geometry

One of our key projects of NOAA is the 3D River initiative. Our main goal is to boost innovation in river morphology research. We're committed to improving hydrological modeling by integrating additional tools and fostering interdisciplinary studies in river ecosystems.

- **3D River Channel Cross Sections**: Our platform offers a comprehensive 3D representation of channel geometry and cross sections for the CONUS. This dataset, constructed from a 10-meter Digital Elevation Model (DEM), incorporates meticulous details crucial for delineating river floodplains, rectifying DEM inaccuracies, preserving the consistent decrease in elevation along the river's path from upstream to downstream, and more. Detailed information regarding our cross-sectional data model can be accessed [here](https://noaa-owp.github.io/hydrofabric/articles/cs_dm.html). Additionally, we employ machine learning methodologies to address gaps in bathymetric data [see this article](https://noaa-owp.github.io/hydrofabric/articles/07-channel-geometry.html).

- **Features**: Our platform dynamically generates cross sections updated with survey data whenever available. Offering over 10 cross sections per flowline, each with elevation points spaced at 1-meter intervals, it spans approximately 2.7 million reaches, providing indispensable data for comprehensive analysis of river morphology. Leveraging a 10-meter DEM integrated with bathymetric data derived from machine learning algorithms, this feature facilitates the derivation of river topobathy information through the Reference Fabric, enabling exhaustive watershed-level inquiries. This advancement enhances routing accuracy within hydrological models such as the National Water Model T-route, augments bathymetric representation in Flood Inundation Mapping (FIM), and enables a myriad of comprehensive scientific analyses through the 3D River project.

We have several papers underway base on this project including:
Johnson JM, Afshari S, Rad AM. AHGestimation: An R package for computing robust, mass preserving hydraulic geometries and rating curves. Journal of Open Source Software. 2024 Apr 18;9(96):6145.[link](https://joss.theoj.org/papers/10.21105/joss.06145.pdf)

Arash Modaresi Rad, J. Michael Johnson, Zahra Ghahremani, James Coll, Nels Frazier. Enhancing River Channel Dimension Estimation: A Machine Learning Approach Leveraging the National Water Model, Hydrographic Networks, and Landscape Characteristics[link]()

To access the 3D data we can use the subset derived earlier by narrowing down our search, we extract the target VPU and finding the most upstream watershed:

```{r}
# Retrive vpu ID
vpu <- unique(geosptial_output$network$vpuid)
hf_ids <- unique(geosptial_output$network$id)
max_seq = which.max(geosptial_output$network$hydroseq)
watershed_id<- geosptial_output$network$id[max_seq]
```

Since the 3D channel data is developed for the next-generation fabric, we need to cross-walk from the reference fabric to the next generation:

```{r}
conus_net <- arrow::open_dataset('s3://lynker-spatial/hydrofabric/v20.1/conus_net.parquet') %>%
filter(hf_id %in% hf_ids, vpu == vpu) %>%
select(id, hf_id) %>%
collect()
# now extract the identifiers
ids <- unique(conus_net$id)
```

Now we can use the IDs to read the cross-sections from 3D channel geometry:

```{r}
version = "20.1"
source = "s3://lynker-spatial/hydrofabric"
hook = glue("{source}/v{version}/3D/cross-sections/nextgen_{vpu}_cross_sections.parquet")
cs = arrow::open_dataset(hook) %>%
filter(hy_id %in% ids) %>%
collect()
```

We can then select a single cross-section for a next-gen ID and plot it:

```{r, echo = FALSE}
# lets look at a signle cross section
target_id <- unique(arrow::open_dataset('s3://lynker-spatial/hydrofabric/v20.1/conus_net.parquet') %>%
filter(hf_id == watershed_id, vpu == vpu) %>%
select(id) %>%
collect())$id
target_cs_id <- 1
cs_sub <- cs %>% filter(hy_id == target_id & cs_id == target_cs_id)
# now we plot it
unique_categories <- unique(cs_sub$point_type)
category_color_map <- c("#F8766D","#7CAE00","#00BFC4", "#00BFC4")
ggplot(cs_sub, aes(x = pt_id, y = Z, color = point_type)) +
geom_line(aes(group = 1), color = 'black') +
geom_point(size = 3) +
scale_color_manual(values = category_color_map) +
labs(title = glue("Cross section for {target_id} with cs_id {target_cs_id}"),
x = 'pt_id',
y = 'Z',
color = 'Point Type') +
theme_minimal()
```

We can also have a 3D representation of our flow lines and cross-sections. First, we filter our cross-walk file and flowlines to the target area:

```{r}
hf_ids <- conus_net %>%
filter(id == target_id) %>%
select(hf_id) %>%
collect()
flowlines <- geosptial_output$flowpaths %>%
filter(id %in% hf_ids$hf_id) %>%
collect()
```

The 3D plot:

```{r, echo = FALSE}
grouped_data <- cs_sub %>%
group_by(cs_id) %>%
summarize(avg_Z = mean(Z)) %>%
arrange(avg_Z)
# Initialize plotly object
p <- plot_ly()
p <- add_trace(p, x = ~X, y = ~Y, z = ~Z, type = "scatter3d", mode = "markers",
marker = list(size = 5),
data = group_data)
# Add flowlines
for (i in 1:nrow(flowlines)) {
flowline_geom <- st_geometry(flowlines[i,])
linestrings_df <- as.data.frame(st_coordinates(flowline_geom))
colnames(linestrings_df) <- c("X", "Y", "Z")
avg_Z <- min(cs_sub$Z)
linestrings_df$Z <- avg_Z -0.5
p <- p %>%
add_trace(
x = linestrings_df$X,
y = linestrings_df$Y,
z = linestrings_df$Z,
type = 'scatter3d',
mode = 'lines',
line = list(color = 'black', width = 2),
showlegend = FALSE
)
}
# Customize layout
p <- p %>%
layout(
title = '3D plot of cross sections with flowline',
scene = list(
xaxis = list(title = 'X'),
yaxis = list(title = 'Y'),
zaxis = list(title = 'Z (m)')
),
legend = list(title = list(text = 'Average Z'))
)
p <- layout(p, scene = list(
aspectmode = "manual",
aspectratio = list(x = 3, y = 1.5, z = 0.17)
))
# Render the plot
p
```


0 comments on commit b54c989

Please sign in to comment.