extract and readchla not playing along #102

Closed
ryanreisinger opened this issue Apr 22, 2020 · 4 comments

Comments

@ryanreisinger

raadtools::extract() is not working with readchla().

library(raadtools)

# One query point: longitude, latitude, date (one column each)
xyt <- data.frame(lon = 50.5, lat = -50.5, date = as.Date("2017-12-01"))

raadtools::extract(readchla, xyt)

Gives the error:

Error in x(returnfiles = TRUE, ...) : unused argument (returnfiles = TRUE)

@mdsumner
Member

This can't work, sadly: readchla is geared to reading a mean for a set of input date/s and a grid. We can wrap a function that reads monthly chla if that's of use.
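For context on the error itself: the message suggests that extract() calls the reader function it is given with returnfiles = TRUE, and readchla apparently has no such argument. Below is a minimal base-R sketch of that mechanism. The two reader functions are made-up stand-ins, not raadtools functions, and the check is only an illustration of how one might test a reader before passing it on.

```r
# Hypothetical stand-ins for reader functions; neither is part of raadtools.
reader_with_files <- function(date, returnfiles = FALSE, ...) {
  if (returnfiles) return(data.frame(date = as.Date(date), file = "dummy.nc"))
  "raster data would go here"
}
reader_without_files <- function(date, xylim = NULL) {
  "mean chla would go here"
}

# TRUE if `f` can be called with a `returnfiles` argument,
# either by name or by swallowing it in `...`.
accepts_returnfiles <- function(f) {
  any(c("returnfiles", "...") %in% names(formals(f)))
}

accepts_returnfiles(reader_with_files)
accepts_returnfiles(reader_without_files)

# Calling the second reader the way the error message shows fails
# at argument matching, before the function body runs.
msg <- tryCatch(
  reader_without_files(returnfiles = TRUE),
  error = function(e) conditionMessage(e)
)
msg
```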

@ryanreisinger
Author

This can't work, sadly: readchla is geared to reading a mean for a set of input date/s and a grid. We can wrap a function that reads monthly chla if that's of use.

Ah, right! I'd be interested to see that solution, yes. But I can use a brute-force approach, since I will probably calculate some derived parameters for each chla map in any case.

@mdsumner
Member

mdsumner commented Apr 22, 2020

A thing I did recently that soothes my finicky soul is to extract ALL daily chla observations within a given radius in time/space for a set of points. Then it's your job to figure out what to use from those, which is arguably better than producing chunks-in-time maps or aggregating in space. It works for both the NASA algorithm and the Johnson one: #95

So you would run

d <- data.frame(lon = c(147, 140, 130, 110, 100), 
                lat = c(-43, -45, -55, -60, -58), 
                date = seq(as.Date("2015-01-01"), by = "10 days", length.out = 5), 
                ID = c("a", "a", "a", "b", "b"))
bin_extract(d, radius = 50000, days = 5)

Replace d with your own data frame with columns lon, lat, date, label. Only the column order matters; names are ignored. You then get weightings in distance_m (metres) and distance_t (days) alongside your original label, and you can make a determination of "enough data" vs. "too distant".

bin_extract(d, radius = 50000, days = 5)
# A tibble: 398 x 9
   bin_num chla_johnson chla_nasa date                distance_m  blon  blat distance_t label
     <int>        <dbl>     <dbl> <dttm>                   <dbl> <dbl> <dbl>      <dbl> <chr>
 1 3522015        0.218     0.137 2015-01-10 00:00:00     53338.  141. -44.7          1 a    
 2 3522016        0.189     0.126 2015-01-10 00:00:00     57225.  141. -44.7          1 a    
 3 3528147        0.182     0.123 2015-01-10 00:00:00     35002.  140. -44.7          1 a    
 4 3528157        0.212     0.140 2015-01-10 00:00:00     54464.  141. -44.7          1 a    
 5 3528158        0.189     0.126 2015-01-10 00:00:00     58108.  141. -44.7          1 a    
 6 3528159        0.232     0.147 2015-01-10 00:00:00     61885.  141. -44.7          1 a    
 7 3534290        0.182     0.116 2015-01-10 00:00:00     44536.  140. -44.6          1 a    
 8 3534291        0.164     0.114 2015-01-10 00:00:00     42566.  140. -44.6          1 a    
 9 3534292        0.161     0.115 2015-01-10 00:00:00     41026.  140. -44.6          1 a    
10 3534293        0.160     0.115 2015-01-10 00:00:00     39966.  140. -44.6          1 a    
# … with 388 more rows

You can increase the radius in time or space to make sure you get more values and then filter later. The run time depends mostly on the number of unique days and less on the number of points for each day.
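As a rough idea of what the "filter later" step could look like, here is a base-R sketch that works on a small made-up data frame with the same column names as the bin_extract() output above. The cutoffs (max_m, max_t) and the numbers are arbitrary assumptions for illustration, and abs() is applied to distance_t in case it can be negative, which the output above does not confirm.

```r
# Made-up rows mimicking the bin_extract() columns; not real observations.
res <- data.frame(
  chla_johnson = c(0.20, 0.18, 0.30, 0.25),
  chla_nasa    = c(0.13, 0.12, 0.20, 0.16),
  distance_m   = c(10000, 30000, 60000, 20000),
  distance_t   = c(1, 2, 1, 4),
  label        = c("a", "a", "a", "b"),
  stringsAsFactors = FALSE
)

max_m <- 40000  # assumed spatial cutoff in metres
max_t <- 3      # assumed temporal cutoff in days

# Keep only observations close enough in both space and time.
keep <- res[res$distance_m <= max_m & abs(res$distance_t) <= max_t, ]

# Per-label count and mean of the retained observations.
n_obs <- aggregate(chla_johnson ~ label, data = keep, FUN = length)
names(n_obs)[2] <- "n"
means <- aggregate(cbind(chla_johnson, chla_nasa) ~ label, data = keep, FUN = mean)
summary_tbl <- merge(n_obs, means, by = "label")

# Labels with no rows left after filtering are the "too distant" cases.
too_distant <- setdiff(unique(res$label), summary_tbl$label)

summary_tbl
too_distant
```

A plain mean is used here for simplicity; weighting by distance_m or distance_t would be an equally reasonable choice depending on the analysis.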

@ryanreisinger
Author

Super, thanks!!
