-
Broadly this makes a lot of sense.
A couple of things occur to me... I may be zeroing in on one sentence a bit too much here, but...
Some thoughts on this. If you update the S3 bucket every time something gets updated in YNR, there are a few possible issues...

...although the big upside is that, as long as everything works, YNR and the S3 bucket should be pretty much in sync at any given point in time. The other option would be to have some kind of background/scheduled job to sync the YNR DB to the S3 bucket. As long as you can pick up what's changed, that gets rid of a lot of those problems. The tradeoff is that there's always going to be some lag between the data being updated in YNR and the sync job updating the S3 bucket. You've also got to build and maintain that sync job (and whatever queue or schedule drives it).
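For illustration, a scheduled delta sync along those lines might look something like this. The YNR endpoint, the `last_updated` query parameter, the response shape and the bucket name are all assumptions here, not existing code:

```python
# Hypothetical scheduled sync job: copy ballots changed since the last run to S3.
# The YNR endpoint, query parameter, response shape and bucket name are all
# assumptions for illustration, not existing code.
import json
from datetime import datetime

import boto3
import requests

BUCKET = "dc-ballot-cache"  # hypothetical bucket name
YNR_BALLOTS_URL = "https://candidates.democracyclub.org.uk/api/next/ballots/"  # illustrative

s3 = boto3.client("s3")


def sync_changed_ballots(last_run: datetime) -> None:
    """Fetch everything updated since the last successful run and push it to S3."""
    url = YNR_BALLOTS_URL
    params = {"last_updated": last_run.isoformat()}
    while url:
        resp = requests.get(url, params=params, timeout=30)
        resp.raise_for_status()
        data = resp.json()
        for ballot in data["results"]:
            # Key each object on the ballot paper ID so lookups stay trivial.
            s3.put_object(
                Bucket=BUCKET,
                Key=f"ballots/{ballot['ballot_paper_id']}.json",
                Body=json.dumps(ballot),
                ContentType="application/json",
            )
        url = data.get("next")  # follow DRF-style pagination, if any
        params = None  # the "next" URL already carries the query string
```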
The other thing worth flagging here is that at the moment devs.DC can call WDIV and WCIVF in parallel. Having to have the ballots back from WDIV before you can get the candidates means you have to do the calls in sequence. Getting stuff from an S3 bucket within AWS should be a fast call, but worth flagging that there is a response time impact to consider here.
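To make the ordering constraint concrete, here's a rough sketch of the current parallel fan-out versus the proposed sequential flow. The URLs and response shapes are made up for illustration:

```python
# Illustration only: the URLs and response shapes below are made up.
import asyncio

import httpx


async def current_flow(client: httpx.AsyncClient, postcode: str):
    # Today: WDIV and WCIVF can be queried at the same time.
    wdiv, wcivf = await asyncio.gather(
        client.get(f"https://wdiv.example/api/postcode/{postcode}/"),
        client.get(f"https://wcivf.example/api/postcode/{postcode}/"),
    )
    return wdiv.json(), wcivf.json()


async def proposed_flow(client: httpx.AsyncClient, postcode: str):
    # Proposed: the ballot IDs have to come back from WDIV before the
    # candidate data can be fetched from S3, so the calls are sequential.
    wdiv = await client.get(f"https://wdiv.example/api/postcode/{postcode}/")
    ballot_ids = [b["ballot_paper_id"] for b in wdiv.json()["ballots"]]
    candidates = await asyncio.gather(
        *(client.get(f"https://ballot-cache.example/ballots/{b}.json") for b in ballot_ids)
    )
    return wdiv.json(), [c.json() for c in candidates]
```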
-
Thanks Chris, I did wonder if I could nerd snipe your brain into this thread 😆 In order:
(And some related points.) This is all sensible to think about, but something you might not be aware of is some of the existing changes to YNR/WCIVF. Since ~Sept 21 we moved away from WCIVF doing a large import of the whole database every night to a purely delta based import. This is done by exposing an endpoint that returns everything updated since a given timestamp. This means that we're already in a position where we can be sure that "give me all the things updated since X" is going to be true. That in turn means that we can remove the need to write to S3 on every DB write, and just have a job mop up the writes every so often. We could also supplement this with some nightly job that just writes everything out, or at least writes current ballots. There are currently 33,442 ballots in YNR. Not trivial to write to S3, but also not something that's going to take long inside AWS.

Also, don't forget that WCIVF will need more detailed person data than will be in the ballot JSON. If you think about the static JSON on S3 only containing the person IDs (alongside other ballot info), then WCIVF would need to do a lookup for the full person data anyway.

I'm not sure, but I think all of that covers all your points above? Using a Lambda on the existing API (something I only thought of when writing this) would mean the maintenance of the queue would be almost nothing, too.

As for performance: yeah, I did talk about that above, but I don't think it's going to be a large problem for us. There are also options for optimization, for example by sticking CloudFront between S3 and the devs.DC API we could get the benefit of connection pooling and edge caching, rather than creating a new S3 client for each boot of the Lambda instance (cold starts are where we'd slow things down most, I think). We could just use S3's web hosting too, and even some in-memory caching in Lambda for more speed still. Either way, I think the slight slowdown will be ok for most of our users at worst, or unnoticeable at best.
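As a rough sketch of the Lambda end of that, assuming a module-level S3 client reused across warm invocations plus a small in-memory cache (none of the names here are real devs.DC code):

```python
# Sketch of the Lambda side: a module-level S3 client (created once per container,
# so only paid for on a cold start) plus a small in-memory cache.
# Bucket, key layout and event shape are illustrative, not real devs.DC code.
import json
import os

import boto3

BUCKET = os.environ.get("BALLOT_BUCKET", "dc-ballot-cache")  # hypothetical

s3 = boto3.client("s3")  # reused across warm invocations
_cache: dict[str, dict] = {}


def get_ballot(ballot_paper_id: str) -> dict:
    if ballot_paper_id not in _cache:
        obj = s3.get_object(Bucket=BUCKET, Key=f"ballots/{ballot_paper_id}.json")
        _cache[ballot_paper_id] = json.loads(obj["Body"].read())
    return _cache[ballot_paper_id]


def handler(event, context):
    # e.g. invoked via API Gateway with the ballot paper ID as a path parameter
    ballot_id = event["pathParameters"]["ballot_paper_id"]
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(get_ballot(ballot_id)),
    }
```

The cache here only lives as long as the warm container, which is fine for mostly static ballot data; CloudFront or S3 website hosting in front would do the same job a layer earlier.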
-
This is done now 🎉
-
Right, but hear me out...
This is roughly what we do at the moment (address pickers left out for now):
This requires both WCIVF and WDIV to be up and scaling well. This is working ok for the most part.
However, in recent years the biggest use case for the data from WCIVF is just the simple ballot data that we get from YNR. That is, a ballot containing:
We have added some extra data like hustings and leaflets, but, by volume of requests, we don't use them a lot.
What I'm suggesting is that we can use the WDIV EE lookup to return the list of ballots directly.
YNR, as the canonical source for ballot data, can write the JSON that we need to an S3 bucket whenever a ballot is updated. Then, the devs.DC API can simply get the basic data from S3 directly, removing WCIVF from the lookup. We can have calls to `get_hustings_for_ballot` etc that can either come from WCIVF, or come from other S3 buckets. The point is, at this stage the key is the ballot paper ID, so it's fairly easy to cache the data we want.
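As a sketch of what the YNR end of that could look like, a `post_save` signal keyed on the ballot paper ID would be one way to do it. The model path, serializer and bucket name are assumptions, not YNR's actual code:

```python
# One way the YNR side could be wired up: push a ballot's JSON to S3 whenever it
# changes. The model path, serializer and bucket name are assumptions for
# illustration, not YNR's actual code.
import json

import boto3
from django.db.models.signals import post_save
from django.dispatch import receiver

from candidates.models import Ballot          # assumed model location
from api.serializers import BallotSerializer  # assumed serializer

BUCKET = "dc-ballot-cache"  # hypothetical

s3 = boto3.client("s3")


@receiver(post_save, sender=Ballot)
def write_ballot_to_s3(sender, instance, **kwargs):
    # Key on the ballot paper ID so the devs.DC API can fetch it directly.
    s3.put_object(
        Bucket=BUCKET,
        Key=f"ballots/{instance.ballot_paper_id}.json",
        Body=json.dumps(BallotSerializer(instance).data),
        ContentType="application/json",
    )
```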
Why though?

This model has a few advantages.
Performance and resilience
At the moment if either WCIVF or WDIV is down, the API is down. We can mitigate this by failing the requests gracefully, but there's no escaping the fact that we can't serve candidate data if WCIVF isn't around.
Getting the same, mostly static, data from S3 would mitigate this a lot.
We won't be able to perform both queries in parallel like we do at the moment, but the "candidates for ballots" query would be as fast as getting the content from S3, so it's hardly going to slow things down.
Address picker on WCIVF
The suggested change actually has huge implications for WCIVF: it means that WCIVF itself can become a client of the API, and use the API to perform postcode (and address) lookups.
It can still store person profile data locally, and just get the list of ballots it needs to show from the API response. This is a super quick way to get address pickers on WCIVF without having to worry about AddressBase.
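A minimal sketch of WCIVF acting as an API client, assuming a postcode endpoint on the devs.DC API; the path, auth parameter and response shape here are illustrative:

```python
# Sketch of WCIVF acting as a client of the devs.DC API: resolve a postcode to a
# list of ballot paper IDs, then join person profiles from WCIVF's own data.
# The endpoint path, auth parameter and response shape are assumptions here.
import requests

DEVS_DC_API = "https://developers.democracyclub.org.uk/api/v1"  # illustrative


def ballots_for_postcode(postcode: str, api_key: str) -> list[str]:
    resp = requests.get(
        f"{DEVS_DC_API}/postcode/{postcode}/",
        params={"auth_token": api_key},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    # Assumed shape: a list of dates, each carrying its ballots.
    return [
        ballot["ballot_paper_id"]
        for date in data.get("dates", [])
        for ballot in date.get("ballots", [])
    ]
```

WCIVF would then only need the returned ballot paper IDs to decide which of its locally stored person profiles to render.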