Redwood

Benjamin Ran edited this page Jun 9, 2017 · 36 revisions

Components

Each server component is a Spring Boot Java application packaged as a jar, bundled with the Tanuki Java Service Wrapper into a tar.gz archive, and extracted into the final docker image. Look in each dcc-{storage|metadata|auth}-server/src/main/resources/application.yml for configuration properties, which can be overridden using Spring external configuration. The easiest way to do this for the redwood-storage-server is to edit dcc-ops/redwood/conf/application.storage.properties.
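As a rough illustration of editing that properties file from a script, the sketch below sets a key to a value, replacing an existing line or appending a new one. The helper name set_property is hypothetical, and server.port (a standard Spring Boot property) with the value shown is only an assumed example, not a setting this guide recommends changing. It only handles plain key=value lines without spaces around the equals sign.

```shell
# set_property FILE KEY VALUE (hypothetical helper, not part of redwood):
# replace an existing "KEY=..." line in FILE, or append "KEY=VALUE" if absent.
set_property() {
  file=$1; key=$2; value=$3
  touch "$file"                      # make sure the file exists
  tmp=$(mktemp)
  awk -v k="$key" -v v="$value" '
    BEGIN { FS = "=" }
    $1 == k { print k "=" v; found = 1; next }   # rewrite the matching line
    { print }                                    # keep every other line
    END { if (!found) print k "=" v }            # append when the key is new
  ' "$file" > "$tmp"
  cat "$tmp" > "$file"               # overwrite in place, keeping file permissions
  rm -f "$tmp"
}

# Example call (assumed key and value, for illustration only):
# set_property dcc-ops/redwood/conf/application.storage.properties server.port 5431
```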

Overview

The auth-server is essentially an OAuth2.0 server that grants and verifies tokens, each created with a set of scopes of the form "site.PROJECT.action". The tokens are tracked in the auth-db (PostgreSQL). Users need an appropriately scoped token to perform actions on entities (files) associated with a given project. The metadata-server keeps track of entities (files) with object-id, bundle-id (gnosId), filename, etc. in the metadata-db (MongoDB). New bundles (collections of entities with the same bundle-id) must be registered with the metadata-server before they can be uploaded to S3. The storage-server federates access to S3 and exposes the primary storage-system functionality.
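To make the "site.PROJECT.action" scope format concrete, here is a minimal shell sketch that splits a scope string into its three parts. The function name parse_scope is hypothetical and exists only for illustration; redwood itself does not ship it. It treats everything before the first dot as the site and everything after the last dot as the action.

```shell
# parse_scope SCOPE (hypothetical helper): split "site.PROJECT.action".
parse_scope() {
  scope=$1
  site=${scope%%.*}        # text before the first dot
  action=${scope##*.}      # text after the last dot
  project=${scope#*.}      # drop the leading "site."
  project=${project%.*}    # drop the trailing ".action"
  printf 'site=%s project=%s action=%s\n' "$site" "$project" "$action"
}

parse_scope "aws.FOO_BAR.upload"   # prints: site=aws project=FOO_BAR action=upload
```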

The storage-client is a command-line client for accessing the storage-server. The metadata-client is a command-line client for registering new bundles with the metadata-server. The redwood-client is a docker image that packages the storage-client and metadata-client together with a few convenience wrapper scripts.

Administration Guide

The following should work on dev or prod redwood, as long as you use a well-formed dcc-ops/redwood/.env (see how the installer generates it).

See the current projects tracked by redwood:

dcc-ops/redwood/cli/bin/redwood project list

Register a new project (e.g. FOO_BAR) for redwood to track:

dcc-ops/redwood/cli/bin/redwood project create FOO_BAR

List all access tokens:

dcc-ops/redwood/cli/bin/redwood token list | jq .

Create an accessToken for a user (e.g. giving user [email protected] upload/download permission on project FOO_BAR):

dcc-ops/redwood/cli/bin/redwood token create -u "[email protected]" -s "aws.FOO_BAR.upload aws.FOO_BAR.download"
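When granting the same permissions on several projects, the scope string can be generated rather than typed by hand. The sketch below assumes the "aws" site prefix used in the example above; project_scopes is a hypothetical helper name, not part of the redwood CLI.

```shell
# project_scopes PROJECT [ACTION...] (hypothetical helper):
# print space-separated "aws.PROJECT.ACTION" scopes; defaults to upload and download.
project_scopes() {
  project=$1; shift
  [ "$#" -gt 0 ] || set -- upload download
  out=
  for action in "$@"; do
    out="${out:+$out }aws.${project}.${action}"   # join with single spaces
  done
  printf '%s\n' "$out"
}

# Possible use with the command above (USER_EMAIL is an assumed variable):
# dcc-ops/redwood/cli/bin/redwood token create -u "$USER_EMAIL" -s "$(project_scopes FOO_BAR)"
```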

List a user's (e.g. [email protected]'s) access tokens:

dcc-ops/redwood/cli/bin/redwood token list -u [email protected]

Revoke an accessToken (e.g. ae335...):

dcc-ops/redwood/cli/bin/redwood token revoke ae335944-37a6-43bb-b71c-7163eb9c2976

Revoke a user's (e.g. beni's) access to all scopes:

curl -XDELETE http://localhost:8444/admin/scope/beni -u admin:secret

Updating Production Redwood

For changes that don't involve a schema or infrastructure change (and only require pulling a new docker image), you can make a dcc-ops release that pins a more recent release version of the redwood servers in prod.yml. Then on the production server run docker-compose -f base.yml -f prod.yml down; docker-compose -f base.yml -f prod.yml up. You'll see warnings that the redwood_internal network was not removed because it still has active endpoints, and warnings about orphan containers. Both are expected and harmless: they appear because the redwood-{auth|metadata}-{db|backup} containers stay running.
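The restart sequence can be wrapped in a small script so it can be previewed before it is run. This is only a sketch: the run and update_redwood names and the DRY_RUN variable are assumptions introduced here, and the script assumes it is started from the directory that holds base.yml and prod.yml.

```shell
# run CMD...: execute the command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# update_redwood: the two docker-compose steps from the paragraph above.
update_redwood() {
  # take down the containers defined in base.yml and prod.yml
  run docker-compose -f base.yml -f prod.yml down
  # bring them back up using the versions pinned in prod.yml
  run docker-compose -f base.yml -f prod.yml up
}

# Preview without touching any containers:
# DRY_RUN=1 update_redwood
```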

Developer Guide

For developing dcc-storage, dcc-metadata, dcc-auth, or the redwood infrastructure

Build Process

Each constituent redwood server is built into a docker image from one of the above git repositories.

To build one of the redwood servers, check out the corresponding git repository. Then, from the project root:

./mvnw

That builds a tar archive and a docker image of the server. For example, building dcc-storage produces dcc-storage-server/target/dcc-storage-server-*-dist.tar.gz and a quay.io/ucsc_cgl/redwood-storage-server:VERSION docker image.

The docker image and tar archive are tagged with the project version as defined in pom.xml.

Versions

Redwood in dev mode uses the latest SNAPSHOT version of the redwood server containers (this has to be updated each release). Redwood in prod mode uses the latest non-SNAPSHOT version of the redwood server containers (also updated each release, at least once dcc-ops is ready to use the new version).

Dev vs. Prod

Redwood can be run in "dev" or "prod" mode (using base.yml and either dev.yml or prod.yml). Use redwood up -d or redwood up respectively.

Dev mode opens ports on the redwood server containers where the server itself is listening (serving http) as well as remote debugging ports. Dev mode runs a redwood-nginx container that proxies https requests to virtual hosts {storage|metadata|auth}.redwood.io to the corresponding redwood server container. A self-signed certificate is used that accepts requests for {storage|metadata|auth}.redwood.io. It was generated, and can be regenerated (e.g. with new subjectAltNames), from dcc-ops/common/ssldev. If you regenerate it, you'll have to add the new certificate to the redwood-client dev truststore.

Prod mode adds containers that do daily automatic backup and adds environment variables to signal to the core-nginx-letsencrypt-companion container to obtain the appropriate ssl certificates for each redwood subdomain.

Run Client Against Local Instance

If running dev redwood locally, you can start a client container against it:

docker run --rm -it --net redwood_internal --link redwood-nginx:metadata.redwood.io --link redwood-nginx:storage.redwood.io -e ACCESS_TOKEN=$(redwood token create) -e REDWOOD_ENDPOINT=redwood.io -v $(pwd):/data quay.io/ucsc_cgl/redwood-client:1.1.1 bash

Debugging

Server

You can remote debug the redwood servers as they run in the docker container.

To do this, exec into the container, cd $(which dcc-storage-server)/../conf, edit the wrapper.conf file to specify java remote debugging options (-agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=n), and restart the java process with dcc-storage-server restart. Similar steps work for dcc-auth and dcc-metadata.

Then check dcc-ops/redwood/dev.yml to see which host port maps to the container's port 8000 and start a remote debugging session in your IDE pointing to that port.
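One quick way to find that mapping is to search the compose file for entries whose container side is port 8000. The sketch below is an assumption about how the ports are written (the usual "HOST:CONTAINER" compose form); debug_port_lines is a hypothetical helper name.

```shell
# debug_port_lines FILE (hypothetical helper): print, with line numbers,
# the lines of FILE that map some host port to container port 8000.
debug_port_lines() {
  # ":8000" must be followed by a non-digit or end of line, so ":80001" is skipped
  grep -nE ':8000([^0-9]|$)' "$1"
}

# Example:
# debug_port_lines dcc-ops/redwood/dev.yml
```

The number before the colon on each matching line is the host port to point the IDE at.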

Client

You can remote debug the client more easily: start a client container (docker run ... bash) with -p 8000:8000, then run export JAVA_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y before uploading or downloading as usual. Because of suspend=y, the client waits until you start a remote debugging session from your IDE to localhost:8000.

You can also run the client with log level "DEBUG" (instead of the "INFO" default) by editing /dcc/icgc-storage-client/conf/logback.xml in the container:

sed -i 's/level="INFO"/level="DEBUG"/' /dcc/icgc-storage-client/conf/logback.xml

Inspect the databases

All redwood state is stored in the configured S3 bucket, the redwood-auth-db, and the redwood-metadata-db.

redwood-metadata-db

MongoDB instance that tracks bundle id, file id, filename, etc.

Simple connection:

$ docker exec -it redwood-metadata-db mongo dcc-metadata

Find all records for a particular bundle_id (e.g. efa...):

> db.Entity.find({gnosId: 'efac875b-faf5-5e0d-b778-0ef411b81cad'}).limit(10)

redwood-auth-db

Access tokens and scope storage

Simple connection:

docker exec -it dcc-auth-db psql -Upostgres -d dcc

User scopes are stored in the authorities table.

select * from authorities;

OAuth Client information (id, password, scopes, etc) is stored in the oauth_client_details table.

select * from oauth_client_details;

A note on credentials

This guide assumes the default development credentials for the different principals at play in the auth service. These principals, their credentials, and the locations where they are defined are listed here.

  • Postgres (dcc-auth-db)
    • Database Admin User
      • Username: postgres (default for postgres image)
      • Password: password (POSTGRES_PASSWORD environment variable of dcc-auth-db container as defined in docker-compose.yml)
      • Used for connecting to postgres
  • Auth Service
    • Auth Service Admin
      • Username: admin (security.user.name property in dcc-auth-server/src/main/resources/application.yml)
      • Password: secret (security.user.password property in dcc-auth-server/src/main/resources/application.yml)
      • Used for making calls to all admin endpoints of auth-server (/admin/* endpoints)
    • Auth OAuth Management Client
      • Username: mgmt (defined in postgres://dcc/users:username and postgres://dcc/oauth_client_details:client_id, and initialized via dcc-auth-db/auth-schema-postgresql.sql)
      • Password: pass (defined in postgres://dcc/users:password and postgres://dcc/oauth_client_details:client_secret, and initialized via dcc-auth-db/auth-schema-postgresql.sql)
      • Used as credentials of OAuth client for making oauth calls to auth-server (/oauth/* endpoints)

Release Process

You should follow the hubflow release process.

Once all tests pass and the code is ready for release, update the project version by running ./mvnw versions:set -DnewVersion=r1.2.3 (replace 'r1.2.3' as appropriate). This should be committed on the release branch just before finishing the release.

Then finish the release with hubflow and update the project version to the next SNAPSHOT (e.g. ./mvnw versions:set -DnewVersion=r1.2.4-SNAPSHOT with 'r1.2.4' replaced as appropriate).
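If the next development version is simply the released version with its patch number incremented, the SNAPSHOT string can be derived in the shell. That increment rule is an assumption made for this sketch (a release may instead bump the minor or major number), and next_snapshot is a hypothetical helper name.

```shell
# next_snapshot VERSION (hypothetical helper): bump the patch number of
# rMAJOR.MINOR.PATCH and append -SNAPSHOT.
next_snapshot() {
  version=$1
  prefix=${version%.*}    # everything before the last dot, e.g. r1.2
  patch=${version##*.}    # the number after the last dot, e.g. 3
  printf '%s.%s-SNAPSHOT\n' "$prefix" "$((patch + 1))"
}

next_snapshot r1.2.3   # prints r1.2.4-SNAPSHOT

# Possible use with the command above:
# ./mvnw versions:set -DnewVersion="$(next_snapshot r1.2.3)"
```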

Version Numbers

Version numbers are prefixed by an 'r' to distinguish them from ICGC builds. Try to follow semantic versioning.
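A simple check can catch a missing 'r' prefix before a version is set. The sketch below accepts only the shapes shown in this guide (rMAJOR.MINOR.PATCH with an optional -SNAPSHOT suffix); full semantic versioning allows other pre-release tags that it deliberately rejects. The name is_redwood_version is hypothetical.

```shell
# is_redwood_version VERSION (hypothetical helper): succeed only for
# rMAJOR.MINOR.PATCH, optionally followed by -SNAPSHOT.
is_redwood_version() {
  printf '%s\n' "$1" | grep -Eq '^r[0-9]+\.[0-9]+\.[0-9]+(-SNAPSHOT)?$'
}

# Example:
# is_redwood_version r1.2.3 && echo "ok"
# is_redwood_version 1.2.3  || echo "missing r prefix or malformed"
```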