
Need to account for rate limiting #1

Open
anthonyp opened this issue Apr 9, 2017 · 4 comments

Comments


anthonyp commented Apr 9, 2017

While using the gsheet target, I hit a rate limit error. It seems as though this target needs to account for those limits by throttling accordingly; otherwise it won't be usable for more than a trivial amount of data.

@mdelaurentis (Contributor)

I started to work on this, but unfortunately I don't think it's as straightforward as it seems. When we exceed the rate limit, we get a response that looks like this:

```
INFO Exc is {'resp': {'transfer-encoding': 'chunked', 'cache-control': 'private',
  '-content-encoding': 'gzip', 'vary': 'Origin, X-Origin, Referer',
  'x-frame-options': 'SAMEORIGIN', 'content-type': 'application/json; charset=UTF-8',
  'content-length': '506', 'server': 'ESF',
  'alt-svc': 'quic=":443"; ma=2592000; v="37,36,35"', 'status': '429',
  'date': 'Sun, 14 May 2017 18:19:25 GMT', 'x-xss-protection': '1; mode=block'},
 'uri': 'https://sheets.googleapis.com/v4/spreadsheets/1ouYDLCAAuDTAOqD9MojDyCQ7_jURwpF-aqi-SsuoICs/values/sample%21A1%3AZZZ:append?valueInputOption=USER_ENTERED&alt=json',
 'content': b'{
   "error": {
     "code": 429,
     "message": "Insufficient tokens for quota \'WriteGroup\' and limit \'USER-100s\' of service \'sheets.googleapis.com\' for consumer \'project_number:825939786222\'.",
     "errors": [
       {
         "message": "Insufficient tokens for quota \'WriteGroup\' and limit \'USER-100s\' of service \'sheets.googleapis.com\' for consumer \'project_number:825939786222\'.",
         "domain": "global",
         "reason": "rateLimitExceeded"
       }
     ],
     "status": "RESOURCE_EXHAUSTED"
   }
 }'}
```

Note that there's no `Retry-After` header, so we don't really know how long to wait before trying again. I tried using an exponential backoff, but even after sleeping for a minute we were still locked out.
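For reference, the backoff approach described above might look something like the sketch below. This is not the target's actual code; `do_append` and `RateLimitError` are hypothetical stand-ins for one `values.append` request and its 429 failure.

```python
import random
import time


class RateLimitError(Exception):
    """Raised when the API responds with HTTP 429 (hypothetical stand-in)."""


def append_with_backoff(do_append, max_retries=5, base_delay=1.0):
    """Retry an append call with exponential backoff plus jitter.

    Sleeps base_delay * 2**attempt seconds (plus a little jitter) between
    attempts, and re-raises after the final attempt fails.
    """
    for attempt in range(max_retries):
        try:
            return do_append()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Jitter spreads retries out so concurrent writers don't sync up.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

As noted above, though, backoff alone didn't help here: without a `Retry-After` header the delays are guesses, and the lockout outlasted even a one-minute sleep.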

I think the best way to go is actually to insert rows in batches rather than one at a time, since the API lets us append multiple rows in a single call. Note that if we do this, we'll need to ensure that we emit a STATE message only after we have saved all of the records that came before it.
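To make the batching-plus-STATE idea concrete, here's a minimal sketch (not the target's actual code). `append_rows` is a hypothetical callable standing in for one multi-row `values.append` request, and "emitting" STATE is modeled as recording it on the writer.

```python
class BatchingWriter:
    """Buffer RECORD rows and flush them in batches.

    A STATE message is held back until every record received before it
    has been written, which keeps the emitted STATE safe to resume from.
    """

    def __init__(self, append_rows, batch_size=100):
        self.append_rows = append_rows  # one API call for many rows
        self.batch_size = batch_size
        self.buffer = []
        self.pending_state = None
        self.last_emitted_state = None

    def handle_record(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def handle_state(self, state):
        # Don't emit yet: records received before this STATE may still
        # be sitting in the buffer.
        self.pending_state = state

    def flush(self):
        if self.buffer:
            self.append_rows(self.buffer)
            self.buffer = []
        if self.pending_state is not None:
            # All earlier records are saved; now the STATE can go out.
            self.last_emitted_state = self.pending_state
            self.pending_state = None
```

A real target would also flush on end-of-input and emit the final pending STATE then.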

@timvisher

Based on this doc, we could try setting an internal rate limit accordingly and see if that behaves any better. Given Google's general API support I doubt we'll get anything nicer than that.
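An internal limit like that could be a sliding-window throttle along these lines. This is only a sketch; the 100-calls-per-100-seconds figures are an assumption inferred from the `USER-100s` limit named in the error above, not confirmed quota values, so they'd need to match the project's actual quota.

```python
import collections
import time


class SlidingWindowLimiter:
    """Block until a call fits under max_calls per period seconds.

    clock and sleep are injectable for testing; defaults use real time.
    """

    def __init__(self, max_calls=100, period=100.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = collections.deque()  # timestamps of recent calls

    def acquire(self):
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Wait until the oldest call leaves the window, then retry.
            self.sleep(self.period - (now - self.calls[0]))
            return self.acquire()
        self.calls.append(now)
```

The target would call `acquire()` before each write; staying under the published quota avoids the 429 (and the unbounded lockout) entirely rather than reacting to it.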

@rseabrook

I opened PR #7 to batch multiple records for the same stream into a single API call. The batch size is configurable, and it solved my rate limit errors. The target could still benefit from some internal rate limiting and/or retry logic, though.

@timvisher are you the right person to review the PR?

@timvisher

Someone from Stitch will get to this when they have bandwidth. Unfortunately I can't make any promises regarding the timeline.
