The GraphQL API supports queries for targets, diseases/phenotypes, drugs, target-disease associations and search. You can also query the API's metadata to get the versions of the data and of the API.
At the moment the API uses Clickhouse and Elasticsearch 7.2 as data sources.
Technology | Version |
---|---|
Scala | 2.13.10 |
Play Framework | 2.8.18 |
Sangria | 3.5.3 |
Slick | 3.4.1 |
elastic4s | 8.5.3 |
clickhouse-jdbc | 0.3.2 |
To run locally you will need access to Elasticsearch and Clickhouse. These instances must listen on specific ports: Elasticsearch on port 9200 and Clickhouse on port 8123. You can do this by running local instances or by tunnelling to server-hosted instances.
To tunnel to instances hosted in GCP you can use the following commands:
Elasticsearch
make es_tunnel instance=<some ES instance> zone=<instance zone>
Clickhouse
make ch_tunnel instance=<some Clickhouse instance> zone=<instance zone>
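Before starting the API it can be useful to confirm that both ports accept connections. The following is a minimal sketch, not part of this repository: `port_open` and `report_port` are hypothetical helper names, and the check relies on bash's `/dev/tcp` pseudo-device, so it assumes the script is run with bash.

```shell
#!/usr/bin/env bash
# Sketch: check that the two ports named above accept TCP connections.
# port_open and report_port are hypothetical helpers, not repository code.

port_open() {
  local host="$1" port="$2"
  # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>.
  # The subshell closes the descriptor again when it exits.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

report_port() {
  local name="$1" port="$2"
  if port_open localhost "$port"; then
    echo "${name}: port ${port} is reachable"
  else
    echo "${name}: port ${port} is NOT reachable"
  fi
}

report_port Elasticsearch 9200
report_port Clickhouse 8123
```

A reachable port only shows that something is listening there; it does not prove that the service behind it is the expected Elasticsearch or Clickhouse instance.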
Once you have access to the data you can execute `make run` to run the API. This will start an instance on port 9000. To debug the API you'll need to run `make debug`. After the API has started you can access the GraphQL Playground at http://localhost:9000/playground.
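The running API can also be queried from the command line. The sketch below rests on two assumptions that this README does not state: that the GraphQL endpoint is served at `/api/v4/graphql` (the path used by the public Open Targets API) and that the schema exposes a `meta { name }` field. `gql_payload` and `gql_query` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Assumption: the GraphQL endpoint path; adjust it if your instance differs.
GRAPHQL_URL="${GRAPHQL_URL:-http://localhost:9000/api/v4/graphql}"

# Wrap a GraphQL query string in the JSON body sent over HTTP.
# Only safe for queries without double quotes, backslashes or newlines.
gql_payload() {
  printf '{"query":"%s"}' "$1"
}

# POST the query and print the JSON response.
# --fail makes curl exit non-zero on HTTP error responses.
gql_query() {
  curl --silent --show-error --fail \
    --header 'Content-Type: application/json' \
    --data "$(gql_payload "$1")" \
    "$GRAPHQL_URL"
}

# Example (requires a running API; the meta field is an assumption):
# gql_query '{ meta { name } }'
```

For queries that contain quotes or variables, building the JSON body with a dedicated tool is safer than string interpolation.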
This application uses Sangria as a GraphQL wrapper and uses deferred resolver caches to improve query times. If data is updated in Elasticsearch, queries whose results have already been cached will keep returning the old data, so the update will not be visible on the front end.
To avoid using the cache in your deployments you can set the environment variable `$PLATFORM_API_IGNORE_CACHE` to `true`. The default value is `false`, which means the cache will be used.
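As an illustration of the default described above, the sketch below reports which behaviour the current environment selects. `cache_mode` is a hypothetical helper, and treating only the literal value `true` as "ignore the cache" is an assumption about how the application parses the variable.

```shell
#!/usr/bin/env bash
# cache_mode is a hypothetical helper that echoes which behaviour the
# current environment selects, using the documented default of false.
cache_mode() {
  # Assumption: only the literal value "true" disables the cache.
  if [ "${PLATFORM_API_IGNORE_CACHE:-false}" = "true" ]; then
    echo "cache ignored"
  else
    echo "cache used"
  fi
}

cache_mode

# Set the variable for a single invocation only:
# PLATFORM_API_IGNORE_CACHE=true make run
```

Setting the variable inline, as in the commented line, keeps it out of the rest of the shell session.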
To reset the cache following a data update use the following request:
curl --location --request GET 'http://localhost:9000/api/v4/rest/cache/clear' \
--header 'apikey: <very secret code>'
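When the request is scripted, for example as the last step of a data update, it helps to fail loudly if the call does not succeed. This is a sketch under stated assumptions: `API_BASE`, `API_KEY`, `cache_clear_url` and `clear_cache` are hypothetical names, and success is taken to mean any response that is not an HTTP error, since this README does not document the response format.

```shell
#!/usr/bin/env bash
# API_BASE and API_KEY are hypothetical variable names used by this sketch.
API_BASE="${API_BASE:-http://localhost:9000}"

# Build the cache-clear URL from a base URL (defaults to API_BASE).
cache_clear_url() {
  printf '%s/api/v4/rest/cache/clear' "${1:-$API_BASE}"
}

# Send the request shown above and report the outcome.
clear_cache() {
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set" >&2
    return 2
  fi
  # --fail makes curl exit non-zero on HTTP error responses.
  if curl --silent --show-error --fail \
       --header "apikey: ${API_KEY}" \
       "$(cache_clear_url)" >/dev/null; then
    echo "cache cleared"
  else
    echo "cache clear request failed" >&2
    return 1
  fi
}

# Example (requires a running API and a valid key):
# API_KEY='<very secret code>' clear_cache
```

Reading the key from the environment keeps it out of the script itself.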
Logging for local use / development can be configured by updating the `logback.xml` file in the `conf` directory.
If you want to run the API locally and include the logs you can use `make run_log logfile=<logback path>`, or debug it using `make debug_log logfile=<logback path>`.
Example: `make run_log logfile=./conf/logback.xml`
Production deployments use the `production.xml` file to configure logging. Log levels should be set conservatively because GCP charges based on the quantity of logs, so we only want to produce what we need for monitoring and basic troubleshooting.
Tests annotated with `IntegrationTestTag` require access to a configured Elasticsearch instance against which to run the queries.
The Open Targets Platform front end makes use of pre-written GraphQL queries. Since we want to be aware if changes in the API are likely to break the FE, we have integration tests in place to check if this is going to happen.
Note, make sure you have access to ElasticSearch and Clickhouse!
- Get the files: run `sbt updateGqlFiles` to retrieve all `*.gql` files from the front-end repository and copy them to the `test/resources/gqlQueries` directory. The task prints which files are new or changed.
- Run the tests: `sbt "testOnly controllers.GqlTest"`
Since the FE and BE are developed independently, it's worth checking what has changed since we last tested. Before testing run `sbt updateGqlFiles`. This will print which files are new or updated.
If there are updated files, run `git diff test/resources/gqlQueries` to see if any previously configured tests require updating (mainly if the input parameters change). If there are new files, new tests will need to be added.
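To turn the list of changed files into the query names used in the tests, a small filter can help. This is a sketch only: `changed_query_names` is a hypothetical helper, and it assumes that the name passed to a test case is the file name without the `.gql` extension.

```shell
#!/usr/bin/env bash
# changed_query_names is a hypothetical helper: it reads file paths on stdin
# and prints the base name (without .gql) of each query file it finds.
changed_query_names() {
  while IFS= read -r path; do
    case "$path" in
      *.gql) basename "$path" .gql ;;
    esac
  done
}

# Typical use after `sbt updateGqlFiles` (needs a git checkout):
#   modified files:
#     git diff --name-only -- test/resources/gqlQueries | changed_query_names
#   new, untracked files:
#     git ls-files --others --exclude-standard -- test/resources/gqlQueries \
#       | changed_query_names
```

Names from the first command point at existing tests to review; names from the second point at tests that still need to be written.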
If the above step shows that there are more files to add, create a new test for them using an existing one as a template. For example:
"Cancer gene census queries" must {
  "return a valid response" in {
    testQueryAgainstGqlEndpoint(TargetDiseaseSize("CancerGeneCensus_sectionQuery"))
  }
}
Take note of the following:
- 'CancerGeneCensus_sectionQuery' is the name of the file; this will be used to read in the actual query.
- `TargetDiseaseSize` is a case class which extends `GqlCase`. You choose the relevant case class based on which inputs are required by the file you are adding. Looking at the 'CancerGeneCensus_sectionQuery' query, we see that it takes three parameters: target, disease and size:
query CancerGeneCensusQuery($ensemblId: String!, $efoId: String!, $size: Int!) {
  disease(efoId: $efoId) {
    id
    evidences(
      ensemblIds: [$ensemblId]
- It just so happens that `TargetDiseaseSize` will generate inputs that satisfy this requirement. To see what else is available, consider the other case classes which extend `GqlCase`.
- The GraphQL tests use generators to create inputs for the queries. The generators themselves are defined in `GqlItTestInputs.scala` and read from files in `/test/resources/gqpInputs`.
- The starting point for the input lists was the set used by Checkomatic to identify useful targets and diseases to test against. To add more inputs, add them to the resource files.
Copyright 2014-2024 EMBL - European Bioinformatics Institute, Genentech, GSK, MSD, Pfizer, Sanofi and Wellcome Sanger Institute
This software was developed as part of the Open Targets project. For more information please see: http://www.opentargets.org
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.