Description
Introduces a few new functions to improve caching for entries requests.
Problem
The current approach to querying entries uses `WPCOM_Liveblog_Entry_Query::get_all_entries_asc()` to grab the full list of entries, then `WPCOM_Liveblog_Entry_Query::find_between_timestamps()` to slice out the relevant entries from the full list.

This is fast and generally works well, but it breaks down as the size of the Liveblog grows. When the list of entries is too big to fit in a single cache entry (1MB is the default for memcache), Liveblog will fall over completely as there is no caching happening.
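To make the failure mode concrete, here is a minimal sketch of the fetch-everything-then-slice pattern in plain PHP. It is not the plugin's code: the entry shape, the roughly 500 bytes of content per entry, the inclusive bounds, and the helper names `make_entries()`, `slice_between()` and `fits_in_one_item()` are all assumptions made for illustration.

```php
<?php
// 1MB, the memcache default item size mentioned above.
const ITEM_LIMIT_BYTES = 1048576;

/** Build a fake full list of entries (plain arrays, not real comment objects). */
function make_entries( int $count ): array {
    $entries = array();
    for ( $i = 0; $i < $count; $i++ ) {
        $entries[] = array(
            'id'        => $i + 1,
            'timestamp' => 1000 + $i * 60,
            'content'   => str_repeat( 'x', 500 ), // assumed content size per entry
        );
    }
    return $entries;
}

/** Slice the entries whose timestamp lies in [$start, $end] out of the full list. */
function slice_between( array $entries, int $start, int $end ): array {
    return array_values(
        array_filter(
            $entries,
            function ( $entry ) use ( $start, $end ) {
                return $entry['timestamp'] >= $start && $entry['timestamp'] <= $end;
            }
        )
    );
}

/** Whether the serialized full list would still fit in a single cache item. */
function fits_in_one_item( array $entries ): bool {
    return strlen( serialize( $entries ) ) <= ITEM_LIMIT_BYTES;
}
```

Under these assumed sizes a list of 100 entries fits comfortably, while a list of 5,000 entries serializes to well over 1MB, so every request would rebuild the full list just to slice a handful of entries out of it.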
The current approach is also not very efficient. Despite the caching of the entries list, when `WPCOM_Liveblog_Entry::for_json()` is called in a loop on every entry from inside `WPCOM_Liveblog::get_entries_by_time()`, at least one additional, uncached DB query is made per entry (for the comment author, via `get_comment_class()`), and several other functions and filters are called. Those are unnecessary CPU cycles when all of that can be cached and reused by all subsequent requests.

Solution
Instead of caching everything in a giant blob, we should cache the results of `WPCOM_Liveblog::get_entries_by_time()`, so that we are only caching a few entries and are preventing any additional queries and processing. When the cache is hot, we can spit out results instantly without worrying about the total number of entries.

To get there, this PR adds 3 new functions:

- `WPCOM_Liveblog::get_entries_by_time_cached()` - Operates exactly the same as `WPCOM_Liveblog::get_entries_by_time()`, but internally does not query all entries at once, and caches the results.
- `WPCOM_Liveblog_Entry_Query::count_entries()` - Returns a count of all entries in a Liveblog, excluding deletions. This replaces the need to grab the full list, strip out deletions (via `WPCOM_Liveblog::flatten_entries()`), then count it. This function caches internally and the cache is busted when a new entry is added, as the last entry timestamp is part of the key.
- `WPCOM_Liveblog_Entry_Query::get_between_timestamps_with_query()` - Replaces `WPCOM_Liveblog_Entry_Query::get_between_timestamps()` (which queries the full list then slices) with a comments query with a date query. Maintains the same expected return value by running the results through `WPCOM_Liveblog_Entry_Query::remove_replaced_entries()`, just like `WPCOM_Liveblog_Entry_Query::find_between_timestamps()` does.

Considerations
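Before the individual points, it may help to see the cache-busting idea behind `count_entries()` in isolation, since the rest of the design leans on it. The following is a minimal sketch, not the PR's implementation: `Fake_Entry_Store` stands in for the comments table, a plain array stands in for the object cache, and the `entry_count_` key format and the `type` field values are assumptions.

```php
<?php
/** Hypothetical in-memory store standing in for the comments table. */
class Fake_Entry_Store {
    public $rows          = array();
    public $count_queries = 0;

    public function add( int $timestamp, string $type ): void {
        $this->rows[] = array( 'timestamp' => $timestamp, 'type' => $type );
    }

    /** Cheap lookup: timestamp of the newest row, or 0 when the store is empty. */
    public function latest_timestamp(): int {
        return empty( $this->rows ) ? 0 : max( array_column( $this->rows, 'timestamp' ) );
    }

    /** Expensive lookup: counts every row that is not a deletion. */
    public function count_non_deleted(): int {
        $this->count_queries++;
        $total = 0;
        foreach ( $this->rows as $row ) {
            if ( 'delete' !== $row['type'] ) {
                $total++;
            }
        }
        return $total;
    }
}

/** Cached count: the key embeds the latest timestamp, so a new entry changes the key. */
function count_entries_cached( Fake_Entry_Store $store, array &$cache ): int {
    $key = 'entry_count_' . $store->latest_timestamp(); // hypothetical key format
    if ( ! array_key_exists( $key, $cache ) ) {
        $cache[ $key ] = $store->count_non_deleted();
    }
    return $cache[ $key ];
}
```

Repeated calls with no new entries reuse the cached value and leave `$store->count_queries` unchanged. Adding an entry with a newer timestamp produces a different key, so the next call recounts without any explicit invalidation step.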
Instead of the full list, we now do a comments query with a date query to find entries between two timestamps. Internally, `get_comments()` also caches all queries, so this should be pretty fast.

These new functions aren't yet hooked up anywhere, but have been designed to be a drop-in replacement for `WPCOM_Liveblog::get_entries_by_time()` inside the REST route handler for entries requests.

`WPCOM_Liveblog::get_entries_paged()` still uses the full entries list, and should be rewritten. This is less critical, as these requests are comparatively rare. Large enough lists of entries will still stop being cached inside this function, but it may be passable as most traffic is for entries polling.

I believe there is a bug in how pagination is calculated.
`WPCOM_Liveblog::flatten_entries()` currently does not remove `update` entries (it does remove `delete` entries), so the total count is higher than the actual number of entries that would render to a page. The new code in `WPCOM_Liveblog_Entry_Query::count_entries()` matches this behavior by counting all entries that are not `delete`s. To me this is a bug, but at least the new code matches what exists for now.

Currently there aren't tests for the new functions - those should be added.
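As a possible starting point for those tests, the counting discrepancy described above can be pinned down with a small sketch. It uses plain arrays in place of real entries. The `id`, `type` and `replaces` fields, the rule that a rendered entry is a `new` row not targeted by a `delete` row, and both function names are assumptions for illustration only.

```php
<?php
/** Count as described above: every row whose type is not `delete`. */
function count_not_deleted( array $rows ): int {
    $total = 0;
    foreach ( $rows as $row ) {
        if ( 'delete' !== $row['type'] ) {
            $total++;
        }
    }
    return $total;
}

/**
 * Count entries that would render, assuming updates replace an existing
 * entry in place and a delete row removes the entry it points at.
 */
function count_rendered( array $rows ): int {
    $deleted_ids = array();
    foreach ( $rows as $row ) {
        if ( 'delete' === $row['type'] ) {
            $deleted_ids[ $row['replaces'] ] = true;
        }
    }

    $total = 0;
    foreach ( $rows as $row ) {
        if ( 'new' === $row['type'] && ! isset( $deleted_ids[ $row['id'] ] ) ) {
            $total++;
        }
    }
    return $total;
}

// Two original entries, the first of which was edited twice.
$rows = array(
    array( 'id' => 1, 'type' => 'new', 'replaces' => 0 ),
    array( 'id' => 2, 'type' => 'new', 'replaces' => 0 ),
    array( 'id' => 3, 'type' => 'update', 'replaces' => 1 ),
    array( 'id' => 4, 'type' => 'update', 'replaces' => 1 ),
);
```

For this data `count_not_deleted( $rows )` returns 4 while `count_rendered( $rows )` returns 2, which is the kind of gap that would inflate the page count. A real test would build the same scenario from actual Liveblog entries instead of arrays.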