Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: Improve /countries/<country> performance #55

Open
mojodna opened this issue Dec 12, 2018 · 0 comments
Open

API: Improve /countries/<country> performance #55

mojodna opened this issue Dec 12, 2018 · 0 comments

Comments

@mojodna
Copy link
Member

mojodna commented Dec 12, 2018

/countries/<country> counts are implemented in

https://github.com/AmericanRedCross/osm-stats-api/blob/52e3e5a8ef0a582488f252473e56eaca0b0dffb0/routes/country-page.js#L162-L188

Querying the changesets table (which is actually a view over raw_changesets) is untenable, as it currently contains 61107400 rows (simply selecting the number of rows took several minutes).

An approach similar to how user + hashtag statistics are periodically aggregated using a materialized view (e.g. https://github.com/AmericanRedCross/osm-stats-workers/blob/232b75424835f5b0693398af8198eb97ed49d2a0/sql/hashtag_stats.sql) would improve performance dramatically (especially with the addition of an index on <country>).

Secondarily, wrapping queries in a caching function similar to

https://github.com/AmericanRedCross/osm-stats-api/blob/52e3e5a8ef0a582488f252473e56eaca0b0dffb0/routes/hashtag.js#L62-L72

would further help things. The cache library in use (locking-cache) limits the number of concurrent matching requests to 1, so multiple requests to the same endpoint will wait on the execution of a single SQL query. The use of stale: true (https://github.com/AmericanRedCross/osm-stats-api/blob/52e3e5a8ef0a582488f252473e56eaca0b0dffb0/routes/hashtag.js#L21) when initializing the cache allows it to return stale data immediately while still executing the query for subsequent requests (helpful in the case of hashtag endpoints, as there will be many repeated requests for the same data).

Unfortunately, the underlying data (via [raw_changesets_countries.sql](https://github.com/AmericanRedCross/osm-stats-workers/blob/232b75424835f5b0693398af8198eb97ed49d2a0/sql/raw_changesets_countries.sql) does only supports "N changesets in this country" (not "N edits...") due to the lack of a count` (or similarly-named) column.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant