Purge WRDS API Dependency from Viz Pipelines #810
Draft
These changes originated from ticket #772, namely improving the process that synchronizes the on-prem WRDS location DB with our cloud version. The syncing process started failing a couple of months back. It turns out that was due to permission errors on the WRDS side when trying to write the Location DB dump to S3. So with the help of @DrixTabligan-NOAA, we went ahead and tackled #590 to get the DB dump uploaded to the NWM Shared bucket instead.
Once #590 was complete, the syncing process continued to fail because the WRDS API tests failed on the new location DB dumps. This is expected behavior, since the most recent changes to the WRDS DB were more significant. However, rather than just fix the failing code, I also completely removed the dependency on the WRDS API (since none of our services depend on it anymore) and replaced it with tests that are DB-centric.
The Terraform code that creates the Test WRDS DB lambda packages every SQL file across our whole repository into a folder that the lambda has access to. When the tests run, the code creates a foreign db connection to the `wrds_location3_ondeck` db via a new `test_external` schema and also creates an `automated_test` schema. Then the code iterates through every SQL file and checks if it has any dependency on the WRDS db (i.e. references to the `external` schema). If the dependency exists, the SQL is tweaked on the fly to access the `wrds_location3_ondeck` db via the `test_external` schema, and also to write any intermediate `SELECT ... INTO` results to the `automated_test` schema instead. If all of the relevant SQL files execute successfully, the tests pass and the `wrds_location3_ondeck` db is swapped for the live (i.e. non-`_ondeck`) version.

That's that! No more dependence upon the WRDS API and even more robust and relevant testing.
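For reviewers who want a feel for the on-the-fly SQL tweak described above, here is a minimal sketch. It is not the code in this PR: it assumes the lambda is written in Python and that a regex-based rewrite is good enough, and the names `depends_on_wrds`, `rewrite_for_test`, and `collect_test_sql` are hypothetical. Only the schema names (`external`, `test_external`, `automated_test`) come from the description.

```python
import re
from pathlib import Path

# Schema names taken from the PR description.
TEST_WRDS_SCHEMA = "test_external"
SCRATCH_SCHEMA = "automated_test"

# "external." used as a schema qualifier. The lookbehind skips longer
# identifiers that merely end in "external", such as "test_external.".
_EXTERNAL_REF = re.compile(r"(?<![\w.])external\.", re.IGNORECASE)

# Target of "SELECT ... INTO [schema.]table". The lookbehind leaves
# "INSERT INTO" alone (assumption: only a single whitespace character
# separates INSERT and INTO).
_INTO_TARGET = re.compile(r"(?<!INSERT\s)\bINTO\s+(?:\w+\.)?(\w+)", re.IGNORECASE)


def depends_on_wrds(sql: str) -> bool:
    """Return True if the SQL references the `external` (WRDS) schema."""
    return _EXTERNAL_REF.search(sql) is not None


def rewrite_for_test(sql: str) -> str:
    """Point WRDS references at the on-deck db and redirect INTO targets."""
    # Send intermediate SELECT ... INTO results to the scratch schema.
    sql = _INTO_TARGET.sub(rf"INTO {SCRATCH_SCHEMA}.\1", sql)
    # Read WRDS data through the foreign schema for the on-deck db.
    return _EXTERNAL_REF.sub(f"{TEST_WRDS_SCHEMA}.", sql)


def collect_test_sql(sql_dir: str) -> dict[str, str]:
    """Map each WRDS-dependent .sql file under sql_dir to its rewritten SQL."""
    rewritten = {}
    for path in sorted(Path(sql_dir).rglob("*.sql")):
        sql = path.read_text()
        if depends_on_wrds(sql):
            rewritten[str(path)] = rewrite_for_test(sql)
    return rewritten
```

Each rewritten statement would then be executed against the database, with any failure failing the test run. A regex rewrite like this is deliberately simple and can be fooled by comments or string literals that contain `external.`, so treat it as an illustration of the idea only.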