Redirects overwrite URL #911

Open
crazydef opened this issue May 11, 2023 · 2 comments


@crazydef

Description

If a request causes a redirect, the final Response object's URL is that of the redirect, not the original URL requested by the application.

Example/How to Reproduce

In this simple example:

cpr::Response r = cpr::Get(cpr::Url{"http://osocorporation.com/"});

the URL in the response object holds the string https://osocorporation.com/, not the http:// URL that was requested. That is harmless in this instance, but if the server performs a more complex redirect (serving a custom error page, for example), the calling application has no way of knowing what the original request was.

Possible Fix

It would be beneficial if the response retained the original URL along with any redirects.

Maybe make the response hold a vector of URLs, where the first entry is the original request, and subsequent items are the redirects?
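To make the vector idea concrete, here is a minimal sketch of what such a container could look like. `UrlHistory` and its member names are hypothetical and not part of cpr; the sketch only shows the shape of the data, with the original request always stored first.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type, not part of cpr: records every URL visited for one request.
class UrlHistory {
  public:
    // The original request URL is always the first entry.
    explicit UrlHistory(std::string original) { urls_.push_back(std::move(original)); }

    // Call once per redirect, in the order the redirects happen.
    void AddRedirect(std::string target) { urls_.push_back(std::move(target)); }

    const std::string& Original() const { return urls_.front(); }
    const std::string& Final() const { return urls_.back(); }
    std::size_t RedirectCount() const { return urls_.size() - 1; }
    const std::vector<std::string>& All() const { return urls_; }

  private:
    std::vector<std::string> urls_;  // never empty: urls_[0] is the original request
};
```

With no redirects, `Original()` and `Final()` return the same string, so callers that only care about the final URL would not need to change.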

Where did you get it from?

GitHub (branch e.g. master)

Additional Context/Your Environment

  • OS: Windows 10
  • Version: 1.10.1
@COM8
Member

COM8 commented May 14, 2023

Hi @crazydef, thanks for reporting this. I see your point there.
But based on my experience using libcurl, this is not really possible (as far as I'm aware).

A quick ChatGPT query produced the following example of how to work around it. It's not perfect, just a crude way of solving it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <curl/curl.h>

struct MemoryStruct {
  char *memory;
  size_t size;
};

static size_t WriteMemoryCallback(void *contents, size_t size, size_t nmemb, void *userp) {
  size_t realsize = size * nmemb;
  struct MemoryStruct *mem = (struct MemoryStruct *)userp;

  mem->memory = realloc(mem->memory, mem->size + realsize + 1);
  if(mem->memory == NULL) {
    printf("not enough memory (realloc returned NULL)\n");
    return 0;
  }

  memcpy(&(mem->memory[mem->size]), contents, realsize);
  mem->size += realsize;
  mem->memory[mem->size] = 0;

  return realsize;
}

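static int DebugCallback(CURL *handle, curl_infotype type, char *data, size_t size, void *userp) {
  (void)handle;
  (void)userp;
  /* data is NOT null-terminated, so always bound reads and prints by size. */
  if(type == CURLINFO_TEXT) {
    printf("== Info: %.*s", (int)size, data);
  }
  /* Response headers arrive as CURLINFO_HEADER_IN, not CURLINFO_TEXT. */
  else if(type == CURLINFO_HEADER_IN && size > 9 && curl_strnequal(data, "Location:", 9)) {
    printf("Redirected to:%.*s", (int)(size - 9), data + 9);
    /* You may want to store these URLs in a linked list or other data structure here. */
  }
  return 0;
}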

int main(void) {
  CURL *curl_handle;
  CURLcode res;

  struct MemoryStruct chunk;

  chunk.memory = malloc(1);  /* will be grown as needed by the realloc above */
  chunk.size = 0;    /* no data at this point */

  curl_global_init(CURL_GLOBAL_ALL);

  /* init the curl session */
  curl_handle = curl_easy_init();

  /* specify URL to get */
  curl_easy_setopt(curl_handle, CURLOPT_URL, "http://example.com");

  /* send all data to this function  */
  curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION, WriteMemoryCallback);

  /* we pass our 'chunk' struct to the callback function */
  curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, (void *)&chunk);

  /* enable verbose for easier tracing */
  curl_easy_setopt(curl_handle, CURLOPT_VERBOSE, 1L);
  curl_easy_setopt(curl_handle, CURLOPT_DEBUGFUNCTION, DebugCallback);

  /* follow redirects */
  curl_easy_setopt(curl_handle, CURLOPT_FOLLOWLOCATION, 1L);

  /* get it! */
  res = curl_easy_perform(curl_handle);

  /* check for errors */
  if(res != CURLE_OK) {
    fprintf(stderr, "curl_easy_perform() failed: %s\n", curl_easy_strerror(res));
  }

  /* cleanup curl stuff */
  curl_easy_cleanup(curl_handle);

  free(chunk.memory);

  /* we're done with libcurl, so clean it up */
  curl_global_cleanup();

  return 0;
}

For this feature we need some kind of "handler" that we can register with curl and that triggers as soon as we get redirected. I will put this on the backlog.
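One possible shape for such a handler: libcurl's `CURLOPT_HEADERFUNCTION` callback receives response header lines, so each line could be passed to a small parser that picks out the redirect target. The sketch below covers only that parsing step. `ExtractLocation` is a hypothetical helper, not part of cpr or libcurl, and it returns the header value as sent, without resolving relative URLs.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of cpr or libcurl.
// If `line` is a "Location" header line, stores its value in `target` and returns true.
// Header names are matched case-insensitively (HTTP/2 sends them in lowercase).
bool ExtractLocation(const std::string& line, std::string& target) {
    static const std::string kName = "location:";
    if (line.size() < kName.size()) {
        return false;
    }
    for (std::size_t i = 0; i < kName.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(line[i])) != kName[i]) {
            return false;
        }
    }

    // Trim leading whitespace and the trailing CRLF around the value.
    const auto is_space = [](char c) { return c == ' ' || c == '\t' || c == '\r' || c == '\n'; };
    std::size_t begin = kName.size();
    std::size_t end = line.size();
    while (begin < end && is_space(line[begin])) {
        ++begin;
    }
    while (end > begin && is_space(line[end - 1])) {
        --end;
    }

    target = line.substr(begin, end - begin);
    return true;
}
```

Inside a header callback, the buffer would first be copied with `std::string(buffer, size * nitems)`, since libcurl does not null-terminate it.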

@crazydef
Author

To be honest, I don't know how useful it would be to know every redirect. I just mentioned that as a possible solution. At the very least though, making a copy of the original URL and keeping that alongside the final URL would probably be sufficient for 99.99% of use cases.
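Until something like this exists in the library, the original URL can be kept on the application side. A minimal sketch, assuming nothing about cpr beyond a callable that takes a URL and returns a response; `Tracked` and `FetchTracked` are hypothetical names:

```cpp
#include <string>

// Hypothetical wrapper, not part of cpr: pairs a response with the URL that was requested.
template <typename Response>
struct Tracked {
    std::string requested_url;  // what the application asked for
    Response response;          // whatever the fetch callable returned
};

// Runs `fetch(url)` and returns the result together with the original URL.
template <typename Fetch>
auto FetchTracked(const std::string& url, Fetch fetch) -> Tracked<decltype(fetch(url))> {
    return {url, fetch(url)};
}
```

With cpr, the callable could be `[](const std::string& u) { return cpr::Get(cpr::Url{u}); }`, after which `requested_url` holds the original URL and `response.url` holds the final one.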
