Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Research indexing and optimizing queries to improve frontend performance #1450

Closed
lalgee opened this issue May 25, 2021 · 5 comments
Closed

Comments

@lalgee
Copy link
Collaborator

lalgee commented May 25, 2021

User story/persona
As a user, item/asset tiles and transcription page are sometimes slow to load when I'm exploring and finding something to work on. The buttons that allow me to find a new page, go to the next reviewable asset or go to the next transcribeable asset can also be slow to load.

Is your feature request related to a problem? Please describe.
Research Indexing certain columns and optimizing queries to improve load times.

Additional context
image

@jkueloc
Copy link
Collaborator

jkueloc commented Jul 26, 2021

Observations

1st - the Django Debug Toolbar DDT sql explain times are significantly longer the pgadmin for the same query.
However - the extra time in the DDT does better explain the delay between the request and the application page being displayed.

there is overhead for each index on write. Think admin bulk import and saves on Asset Transcriptions

campaign -
The view is slow. A lot of Python work that generates a lot of database reads -
first to questions - can we offload some of this work to the database (see possible solution)?

Observation -> The the first slow query in the campaign
a cursor for concordia_transcription user_id and reviewed_by_id. - I don't understand why
It's building the stats - progress calculation for the campaign and children/project item assets..
lots of db reads vs. memory. views.py #627...672+

Possible solution -> the counts for transcription_status could be a calculated index(es)?

QQ -> ...are the proper indexes in place for postgres to combine them efficiently?

QQ --> would an index on each concordia_* [_asset, _item, _project] of id, FK, and published=T be more efficient?
QQ --> On concordia_campaign - does it make sense to have all three of these?
published
unlisted
published, unlisted
(show up on the unused index query in local dev environment)

QQ --> Probably need more code review?
> Home Page 'Jump into a transcription' takes a inrodinate amount of time to return - but the sql
returned in the Debug tool bar do not have long execution times (longest 632 ms).
> Also loading a campaign listed from the Home page takes long to return a resulting page.
*Please note that setting django debug tool bar PRETTIER_SQL to False did improve response time.
django-commons/django-debug-toolbar#1438
> Admin Asset page list loads are slow. Django Debug tool bar shows long response time - 18794.86 TIME (MS)
However, explain (analyze, verbose, buffers) on the same query "Execution Time: 18.831 ms"

/---
Further...

  • I would like to run the missing and unused index queries against production db.

I propose removing unused indexes due to the overhead when writing to the database - might also help the import process speed.
I propose exploring adding indexes to some of the import process tables to help performance there

/---
"missing index query" in a non-prod environment returned four possibilities:

"relname","too_much_seq","case","rel_size","seq_scan","idx_scan"
"importer_importitemasset",8.0,"Missing Index?",107470848.0,10.0,2.0
"django_admin_log",3.0,"Missing Index?",8962048.0,7.0,4.0
"auth_user_groups",2.0,"Missing Index?",491520.0,5.0,3.0
"concordia_sitereport",1.0,"Missing Index?",1966080.0,4.0,3.0

/---
Navigating in the admin module via the "buttons" at the top of the screen (ex. items, projects, campaign)
is much quicker than navigating to and from the list of assets.

/---
How do you navigate assets in admin - the pages of assets take longer than the links within an individual asset page
Using the filters on the right help narrow the number of pages you need to work with.

@jkueloc
Copy link
Collaborator

jkueloc commented Jul 26, 2021

Missing Index query

SELECT
  relname,
  seq_scan - idx_scan AS too_much_seq,
  CASE
    WHEN
      seq_scan - coalesce(idx_scan, 0) > 0
    THEN
      'Missing Index?'
    ELSE
      'OK'
  END,
  pg_relation_size(relname::regclass) AS rel_size, seq_scan, idx_scan
FROM
  pg_stat_all_tables
WHERE
  schemaname = 'public'
  AND pg_relation_size(relname::regclass) > 80000
ORDER BY
  too_much_seq DESC;

4 returned


"relname","too_much_seq","case","rel_size","seq_scan","idx_scan"
"importer_importitemasset",8.0,"Missing Index?",107470848.0,10.0,2.0
"django_admin_log",3.0,"Missing Index?",8962048.0,7.0,4.0
"auth_user_groups",2.0,"Missing Index?",491520.0,5.0,3.0
"concordia_sitereport",1.0,"Missing Index?",1966080.0,4.0,3.0
"importer_importjob",-2.0,"OK",2293760.0,4.0,6.0
"importer_importitem",-3.0,"OK",5341184.0,4.0,7.0
"django_session",-451.0,"OK",833863680.0,6.0,457.0
"auth_user",-1807.0,"OK",5734400.0,20.0,1827.0
"concordia_project",-4967.0,"OK",196608.0,707.0,5674.0
"concordia_item",-13537.0,"OK",8347648.0,611.0,14148.0
"concordia_userassettagcollection",-58463.0,"OK",3686400.0,169.0,58632.0
"concordia_asset",-740163.0,"OK",145620992.0,25.0,740188.0
"concordia_userassettagcollection_tags",-1851566.0,"OK",8355840.0,10.0,1851576.0
"concordia_tag",-2426767.0,"OK",4521984.0,7.0,2426774.0
"concordia_transcription",-5996668.0,"OK",1119092736.0,20.0,5996688.0

@jkueloc
Copy link
Collaborator

jkueloc commented Jul 26, 2021

Unused Indexes query

SELECT
  indexrelid::regclass as index,
  relid::regclass as table,
  'DROP INDEX ' || indexrelid::regclass || ';' as drop_statement
FROM
  pg_stat_user_indexes
  JOIN
    pg_index USING (indexrelid)
WHERE
  idx_scan = 0
  AND indisunique is false;

  45 returned
  "index","table","drop_statement"
  "auth_group_name_a6ea08ec_like","auth_group","DROP INDEX auth_group_name_a6ea08ec_like;"
  "auth_group_permissions_group_id_b120cbf9","auth_group_permissions","DROP INDEX auth_group_permissions_group_id_b120cbf9;"
  "auth_group_permissions_permission_id_84c5c92e","auth_group_permissions","DROP INDEX auth_group_permissions_permission_id_84c5c92e;"
  "auth_permission_content_type_id_2f476e4b","auth_permission","DROP INDEX auth_permission_content_type_id_2f476e4b;"
  "auth_user_groups_group_id_97559544","auth_user_groups","DROP INDEX auth_user_groups_group_id_97559544;"
  "auth_user_user_permissions_permission_id_1fbb5f2c","auth_user_user_permissions","DROP INDEX auth_user_user_permissions_permission_id_1fbb5f2c;"
  "auth_user_user_permissions_user_id_a95ead1b","auth_user_user_permissions","DROP INDEX auth_user_user_permissions_user_id_a95ead1b;"
  "captcha_captchastore_hashkey_cbe8d15a_like","captcha_captchastore","DROP INDEX captcha_captchastore_hashkey_cbe8d15a_like;"
  "concordia_a_publish_4761f1_idx","concordia_asset","DROP INDEX concordia_a_publish_4761f1_idx;"
  "concordia_asset_media_type_b9722094","concordia_asset","DROP INDEX concordia_asset_media_type_b9722094;"
  "concordia_asset_media_type_b9722094_like","concordia_asset","DROP INDEX concordia_asset_media_type_b9722094_like;"
  "concordia_asset_slug_104354c9","concordia_asset","DROP INDEX concordia_asset_slug_104354c9;"
  "concordia_assettranscriptionreservation_asset_id_4ba5e394","concordia_assettranscriptionreservation","DROP INDEX concordia_assettranscriptionreservation_asset_id_4ba5e394;"
  "concordia_c_publish_2c3b1c_idx","concordia_campaign","DROP INDEX concordia_c_publish_2c3b1c_idx;"
  "concordia_campaign_published_77b53d52","concordia_campaign","DROP INDEX concordia_campaign_published_77b53d52;"
  "concordia_campaign_slug_ff5e749f_like","concordia_campaign","DROP INDEX concordia_campaign_slug_ff5e749f_like;"
  "concordia_campaign_unlisted_87ea1e97","concordia_campaign","DROP INDEX concordia_campaign_unlisted_87ea1e97;"
  "concordia_project_slug_42b45305","concordia_project","DROP INDEX concordia_project_slug_42b45305;"
  "concordia_project_slug_42b45305_like","concordia_project","DROP INDEX concordia_project_slug_42b45305_like;"
  "concordia_project_topics_topic_id_ccbb0fcc","concordia_project_topics","DROP INDEX concordia_project_topics_topic_id_ccbb0fcc;"
  "concordia_resource_campaign_id_02d73737","concordia_resource","DROP INDEX concordia_resource_campaign_id_02d73737;"
  "concordia_resource_topic_id_f1c51824","concordia_resource","DROP INDEX concordia_resource_topic_id_f1c51824;"
  "concordia_simplecontentblock_slug_a5324e96_like","concordia_simplecontentblock","DROP INDEX concordia_simplecontentblock_slug_a5324e96_like;"
  "concordia_t_publish_7f5b9d_idx","concordia_topic","DROP INDEX concordia_t_publish_7f5b9d_idx;"
  "concordia_tag_value_374a9b09_like","concordia_tag","DROP INDEX concordia_tag_value_374a9b09_like;"
  "concordia_topic_published_9f4ef4e8","concordia_topic","DROP INDEX concordia_topic_published_9f4ef4e8;"
  "concordia_topic_slug_dd63f840","concordia_topic","DROP INDEX concordia_topic_slug_dd63f840;"
  "concordia_topic_slug_dd63f840_like","concordia_topic","DROP INDEX concordia_topic_slug_dd63f840_like;"
  "concordia_topic_unlisted_410da90a","concordia_topic","DROP INDEX concordia_topic_unlisted_410da90a;"
  "concordia_userprofile_user_id_d5d418af","concordia_userprofile","DROP INDEX concordia_userprofile_user_id_d5d418af;"
  "django_admin_log_content_type_id_c4bce8eb","django_admin_log","DROP INDEX django_admin_log_content_type_id_c4bce8eb;"
  "django_celery_beat_periodictask_clocked_id_47a69f82","django_celery_beat_periodictask","DROP INDEX django_celery_beat_periodictask_clocked_id_47a69f82;"
  "django_celery_beat_periodictask_crontab_id_d3cba168","django_celery_beat_periodictask","DROP INDEX django_celery_beat_periodictask_crontab_id_d3cba168;"
  "django_celery_beat_periodictask_interval_id_a8ca27da","django_celery_beat_periodictask","DROP INDEX django_celery_beat_periodictask_interval_id_a8ca27da;"
  "django_celery_beat_periodictask_name_265a36b7_like","django_celery_beat_periodictask","DROP INDEX django_celery_beat_periodictask_name_265a36b7_like;"
  "django_celery_beat_periodictask_solar_id_a87ce72c","django_celery_beat_periodictask","DROP INDEX django_celery_beat_periodictask_solar_id_a87ce72c;"
  "django_site_domain_a2e37b91_like","django_site","DROP INDEX django_site_domain_a2e37b91_like;"
  "robots_rule_allowed_rule_id_aa90358c","robots_rule_allowed","DROP INDEX robots_rule_allowed_rule_id_aa90358c;"
  "robots_rule_allowed_url_id_6157f60a","robots_rule_allowed","DROP INDEX robots_rule_allowed_url_id_6157f60a;"
  "robots_rule_disallowed_rule_id_fad7328c","robots_rule_disallowed","DROP INDEX robots_rule_disallowed_rule_id_fad7328c;"
  "robots_rule_disallowed_url_id_8e02db92","robots_rule_disallowed","DROP INDEX robots_rule_disallowed_url_id_8e02db92;"
  "robots_rule_sites_rule_id_7921a799","robots_rule_sites","DROP INDEX robots_rule_sites_rule_id_7921a799;"
  "robots_rule_sites_site_id_efe5be97","robots_rule_sites","DROP INDEX robots_rule_sites_site_id_efe5be97;"
  "auth_user_email_1c89df09_like","auth_user","DROP INDEX auth_user_email_1c89df09_like;"

@jkueloc
Copy link
Collaborator

jkueloc commented Jul 30, 2021

As I expected - results running the missing index and unused indexes against the production database returned different result (use patterns and more resources available in prod....)

--Missing Index query
 relname                |  too_much_seq |  case |   rel_size  |  seq_scan |   idx_scan
---------------------------------------+--------------+------+------------+----------+------------
auth_user_groups                      |         -752 | OK   |     507904 |      102 |        854
django_admin_log                      |        -1164 | OK   |   10067968 |       83 |       1247
concordia_sitereport                  |        -1511 | OK   |    2138112 |      420 |       1931
importer_importjob                    |       -20628 | OK   |    2383872 |      160 |      20788
django_celery_beat_periodictask       |       -55897 | OK   |     122880 |       56 |      55953
django_session                        |      -303478 | OK   |  926547968 |       55 |     303533
importer_importitemasset              |     -1050790 | OK   |  139378688 |      181 |    1050971
importer_importitem                   |     -1347016 | OK   |   17326080 |      129 |    1347145
concordia_userassettagcollection      |     -4638864 | OK   |    4497408 |     3159 |    4642023
concordia_project                     |    -22040722 | OK   |     245760 |   160845 |   22201567
concordia_userassettagcollection_tags |    -34070760 | OK   |    9871360 |      574 |   34071334
concordia_item                        |    -38286525 | OK   |    9748480 |   337234 |   38623759
concordia_tag                         |    -51502318 | OK   |    4784128 |     1422 |   51503740
auth_user                             |   -126873712 | OK   |    6348800 |    19822 |  126893534
concordia_asset                       |   -529713192 | OK   |  188710912 |   213731 |  529926923
concordia_transcription               |  -1583177495 | OK   | 1240645632 |    39617 | 1583217112
(16 rows)

--Unused Indexes query
  index                         |              table              |                          drop_statement

------------------------------------------------------+---------------------------------+-----------------------------------------------------------------
-
auth_group_permissions_group_id_b120cbf9             | auth_group_permissions          | DROP INDEX auth_group_permissions_group_id_b120cbf9;
auth_group_permissions_permission_id_84c5c92e        | auth_group_permissions          | DROP INDEX auth_group_permissions_permission_id_84c5c92e;
auth_user_groups_group_id_97559544                   | auth_user_groups                | DROP INDEX auth_user_groups_group_id_97559544;
concordia_asset_media_type_b9722094                  | concordia_asset                 | DROP INDEX concordia_asset_media_type_b9722094;
concordia_asset_media_type_b9722094_like             | concordia_asset                 | DROP INDEX concordia_asset_media_type_b9722094_like;
concordia_asset_slug_104354c9                        | concordia_asset                 | DROP INDEX concordia_asset_slug_104354c9;
concordia_campaign_unlisted_87ea1e97                 | concordia_campaign              | DROP INDEX concordia_campaign_unlisted_87ea1e97;
concordia_project_slug_42b45305                      | concordia_project               | DROP INDEX concordia_project_slug_42b45305;
concordia_simplecontentblock_slug_a5324e96_like      | concordia_simplecontentblock    | DROP INDEX concordia_simplecontentblock_slug_a5324e96_like;
concordia_t_publish_7f5b9d_idx                       | concordia_topic                 | DROP INDEX concordia_t_publish_7f5b9d_idx;
concordia_topic_published_9f4ef4e8                   | concordia_topic                 | DROP INDEX concordia_topic_published_9f4ef4e8;
concordia_topic_slug_dd63f840                        | concordia_topic                 | DROP INDEX concordia_topic_slug_dd63f840;
concordia_topic_unlisted_410da90a                    | concordia_topic                 | DROP INDEX concordia_topic_unlisted_410da90a;
concordia_userprofile_user_id_d5d418af               | concordia_userprofile           | DROP INDEX concordia_userprofile_user_id_d5d418af;
django_celery_beat_periodictask_clocked_id_47a69f82  | django_celery_beat_periodictask | DROP INDEX django_celery_beat_periodictask_clocked_id_47a69f82;
django_celery_beat_periodictask_crontab_id_d3cba168  | django_celery_beat_periodictask | DROP INDEX django_celery_beat_periodictask_crontab_id_d3cba168;
django_celery_beat_periodictask_interval_id_a8ca27da | django_celery_beat_periodictask | DROP INDEX django_celery_beat_periodictask_interval_id_a8ca27da;
django_celery_beat_periodictask_solar_id_a87ce72c    | django_celery_beat_periodictask | DROP INDEX django_celery_beat_periodictask_solar_id_a87ce72c;
django_site_domain_a2e37b91_like                     | django_site                     | DROP INDEX django_site_domain_a2e37b91_like;
importer_importjob_created_by_id_73b41954            | importer_importjob              | DROP INDEX importer_importjob_created_by_id_73b41954;
robots_rule_allowed_url_id_6157f60a                  | robots_rule_allowed             | DROP INDEX robots_rule_allowed_url_id_6157f60a;
robots_rule_disallowed_url_id_8e02db92               | robots_rule_disallowed          | DROP INDEX robots_rule_disallowed_url_id_8e02db92;
robots_rule_sites_rule_id_7921a799                   | robots_rule_sites               | DROP INDEX robots_rule_sites_rule_id_7921a799;
(23 rows)

@rabiloc
Copy link
Contributor

rabiloc commented Jan 21, 2022

The above queries are applicable for removing unused index keys, we will also need to apply index keys to tables that effect the way campaigns/projects and other related items get called.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants