Automatic pagination that continues until exhaustion #2908

Open
joshuaclayton opened this issue Jun 11, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@joshuaclayton

Problem to solve

Chained requests must each be declared explicitly, which makes paginating through an unknown number of pages untenable.

POST https://URL
Content-Type: application/json
[BasicAuth]
user: pass
{
  "payload": "body"
}
HTTP/2 200
[Captures]
body: body
next_page: header "Link" regex "<([^>]+)>; rel=\"next\""

POST {{next_page}}
[BasicAuth]
user: pass
{
  "payload": "body"
}
HTTP/2 200
[Captures]
body: body
next_page: header "Link" regex "<([^>]+)>; rel=\"next\""

# keep repeating somehow - programmatically generate hurl files? Continue with copy/paste?
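
The "programmatically generate hurl files" option mentioned in the comment above can be sketched in a few lines of Python. This is only an illustration of that workaround, not something Hurl provides: generate_chain and max_pages are hypothetical names, and the entry text is copied from the snippet above. Since a failing capture is an error in Hurl, a generated chain longer than the real number of pages would presumably fail at the last real page, which is exactly the limitation this issue is about.

```python
# Shared tail of every entry, copied from the snippet above (raw string keeps \" intact).
ENTRY_TAIL = r'''[BasicAuth]
user: pass
{
  "payload": "body"
}
HTTP/2 200
[Captures]
body: body
next_page: header "Link" regex "<([^>]+)>; rel=\"next\""
'''


def generate_chain(first_url: str, max_pages: int) -> str:
    """Return Hurl source with one chained entry per page (hypothetical helper)."""
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    # The first entry targets the real URL; every later entry reuses the captured variable.
    targets = [first_url] + ["{{next_page}}"] * (max_pages - 1)
    entries = [f"POST {target}\n{ENTRY_TAIL}" for target in targets]
    return "\n".join(entries)


if __name__ == "__main__":
    # max_pages is a guess made up front, since the real page count is unknown.
    print(generate_chain("https://URL", 3))
```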

Proposal

The simplest example would be to wholesale swap URLs based on presence of a capture without any additional modification. This would allow for simple asserts / body capture decoupled from raw values and instead based on structure (e.g. presence of a field in a JSON response). Asserting against raw values likely wouldn't make sense for anything dynamic given generic pagination.

In that case, an additional section might work:

[PaginatesVia]
url: header "Link" regex "<([^>]+)>; rel=\"next\""

Other approaches might include more specific data capture (e.g. parsing page=5 from the Link header for the correct page, or querying the JSON response if that's where pagination info sits).
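
To make the two capture styles concrete, here is a small Python sketch of what they would extract from a Link header. It reuses the regex from the proposal; the function names, the example.com URLs, and the "page" query parameter are assumptions made for illustration, and this is plain Python rather than Hurl behaviour.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Same pattern as in the proposal: the URL inside <...> that is tagged rel="next".
NEXT_LINK = re.compile(r'<([^>]+)>; rel="next"')


def next_url(link_header: Optional[str]) -> Optional[str]:
    """Return the rel="next" URL, or None when the header has no next link."""
    if not link_header:
        return None
    match = NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def next_page_number(link_header: Optional[str], param: str = "page") -> Optional[int]:
    """Return the page number carried by the next link's query string, if any."""
    url = next_url(link_header)
    if url is None:
        return None
    values = parse_qs(urlparse(url).query).get(param)
    return int(values[0]) if values else None


if __name__ == "__main__":
    header = (
        '<https://api.example.com/items?page=1>; rel="prev", '
        '<https://api.example.com/items?page=5>; rel="next"'
    )
    print(next_url(header))
    print(next_page_number(header))
```

Returning None on the last page (no rel="next" entry) is the natural stopping signal for whichever looping mechanism ends up driving the requests.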

Additional context and resources

Specific use case: data extraction (rather than response assertion) against paginated resources of unknown size.

I'd looked to see if there was any functionality around looping within the grammar and didn't find anything, and while I understand it may be possible to use JSON output + shell + jq or similar to initiate chaining, in an ideal world there'd be a mechanism for this within the grammar itself.

@joshuaclayton joshuaclayton added the enhancement New feature or request label Jun 11, 2024
@fabricereix
Collaborator

Thanks @joshuaclayton for your issue.
Automatic pagination is a really interesting and challenging use case.
It would be nice if it could fit into a more general looping mechanism that is not specific to pagination.
We already have skip; we might also add a repeat option with a specific repetition count
or a termination condition (similar to retry).

We need plenty of examples to see how it could work.

@jcamiel
Collaborator

jcamiel commented Jun 12, 2024

With --skip and --repeat, one can imagine such a file:

POST {{url}}
[Options]
repeat: -1 # infinite loop
skip: {{url}} isNull
{
  "payload": "body"
}
HTTP/2 200
[Captures]
body: body
url: header "Link" regex "<([^>]+)>; rel=\"next\""

We initialize the variable url with a starting value, run the request if this variable is not null, update url from the response, and repeat.
What is missing: when the capture for url fails, Hurl treats it as an error, whereas here we want the run to continue. We could give the capture a default value to use when it fails, for example url: header "Link" regex "<([^>]+)>; rel=\"next\"" default null:

POST {{url}}
[Options]
repeat: -1 # infinite loop
skip: {{url}} isNull
{
  "payload": "body"
}
HTTP/2 200
[Captures]
body: body
url: header "Link" regex "<([^>]+)>; rel=\"next\"" default null

In summary, we could use repeat and skip without too many syntax changes:

  • accept a predicate in skip
  • find a way to make a capture fallible, for instance with a default value
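
The control flow this proposal implies can be modelled in plain Python, which may help when discussing edge cases. This is a sketch under assumptions, not Hurl's implementation: send is a hypothetical stand-in for running the entry and returning its captures, a missing "url" capture is treated as the null default, a true skip predicate is assumed to end the loop rather than spin forever, and max_iterations is a safety guard added here that the proposal does not mention.

```python
from typing import Callable, Dict, List, Optional

Captures = Dict[str, Optional[str]]


def run_repeated_entry(
    send: Callable[[str], Captures],
    first_url: str,
    max_iterations: int = 1000,
) -> List[Captures]:
    """Model of one entry with repeat: -1, a skip predicate and a fallible capture."""
    variables: Dict[str, Optional[str]] = {"url": first_url}
    results: List[Captures] = []
    for _ in range(max_iterations):  # stands in for repeat: -1 (guarded, assumption)
        url = variables["url"]
        if url is None:  # skip: {{url}} isNull, assumed to terminate the loop
            break
        captures = send(url)
        results.append(captures)
        # Fallible capture: a missing "url" falls back to None (the default null).
        variables["url"] = captures.get("url")
    return results


if __name__ == "__main__":
    # Fake three-page API: each page's captures point at the next page, the last has none.
    pages: Dict[str, Captures] = {
        "https://URL?page=1": {"body": "a", "url": "https://URL?page=2"},
        "https://URL?page=2": {"body": "b", "url": "https://URL?page=3"},
        "https://URL?page=3": {"body": "c"},
    }
    for captures in run_repeated_entry(lambda url: pages[url], "https://URL?page=1"):
        print(captures["body"])
```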

@jcamiel
Collaborator

jcamiel commented Jun 13, 2024

Another, better, syntax for default could be else:

POST {{url}}
[Options]
repeat: -1 # infinite loop
skip: {{url}} isNull
{
  "payload": "body"
}
HTTP/2 200
[Captures]
body: body
url: header "Link" regex "<([^>]+)>; rel=\"next\"" else null

@lepapareil
Collaborator

lepapareil commented Jun 17, 2024

One possible solution is to use the repeat feature, which was developed by @jcamiel and will be available in the next release.

For example, to retrieve the list of tags from a repo with the GitLab API, all we have to do is create pagination.hurl:

  • Make a first request to get the total number of pages:
GET {{gitlab_api_url}}/projects/{{gitlab_project_id}}/repository/tags?private_token={{gitlab_token}}&per_page={{per_page}}&page=1
Content-Type: application/json

HTTP 200

[Captures]
total_pages: header "X-Total-Pages" toInt
  • Then iterate with repeat, capturing the next page from each response:
GET {{gitlab_api_url}}/projects/{{gitlab_project_id}}/repository/tags?private_token={{gitlab_token}}&sort=desc&order_by=version&per_page={{per_page}}&page={{next_page}}
Content-Type: application/json
[Options]
repeat: {{total_pages}}

HTTP 200

[Captures]
next_page: header "X-Next-Page"
  • And simply run hurl, setting the initial variables:
$ hurl \
    --variable gitlab_api_url=https://gitlab.com/api/v4 \
    --variable gitlab_project_id=1 \
    --variable gitlab_token=***** \
    --variable per_page=1 \
    --variable next_page=1 \
    pagination.hurl
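
The request sequence those three steps produce can be traced with a small Python model. It is a sketch under assumptions rather than a description of Hurl or GitLab internals: get_page and make_fake_api are hypothetical stand-ins for the HTTP calls, the fake API is assumed to return an empty X-Next-Page on the last page, and collecting the tags is added here for illustration (the Hurl file above captures only headers).

```python
from typing import Callable, Dict, List, Tuple

Response = Tuple[Dict[str, str], List[str]]


def make_fake_api(items: List[str]) -> Callable[[str, int], Response]:
    """Build an in-memory stand-in for the paginated tags endpoint."""
    def get_page(page: str, per_page: int) -> Response:
        page_no = int(page)
        total = max(1, -(-len(items) // per_page))  # ceiling division
        start = (page_no - 1) * per_page
        headers = {
            "X-Total-Pages": str(total),
            # Assumed behaviour: empty on the last page.
            "X-Next-Page": str(page_no + 1) if page_no < total else "",
        }
        return headers, items[start:start + per_page]
    return get_page


def fetch_all_tags(get_page: Callable[[str, int], Response], per_page: int) -> List[str]:
    # First entry: request page 1 only to capture total_pages (with toInt).
    headers, _ = get_page("1", per_page)
    total_pages = int(headers["X-Total-Pages"])
    next_page = "1"  # --variable next_page=1
    tags: List[str] = []
    for _ in range(total_pages):  # repeat: {{total_pages}}
        headers, body = get_page(next_page, per_page)
        tags.extend(body)
        next_page = headers.get("X-Next-Page", "")  # capture for the next iteration
    return tags


if __name__ == "__main__":
    api = make_fake_api(["v3", "v2", "v1"])
    print(fetch_all_tags(api, per_page=1))
```

One consequence visible in the model: page 1 is requested twice, once for the count and once as the first iteration. This approach also depends on the API reporting a total page count, so it does not cover APIs that expose only a next link.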


4 participants