Redirects overwrite URL #911

crazydef · 2023-05-11T10:40:19Z

Description

If a request causes a redirect, the final Response object's URL is that of the redirect, not the original URL requested by the application.

Example/How to Reproduce

In this simple example:

cpr::Response r = cpr::Get(cpr::Url{"http://osocorporation.com/"});

the URL in the response object holds the string https://osocorporation.com/. Nothing particularly special in this instance, but if the server performs a more complex redirect, possibly serving custom error pages, for example, the calling application has no way of knowing what the original request was.

Possible Fix

It would be beneficial if the response retained the original URL along with any redirects.

Maybe make the response hold a vector of URLs, where the first entry is the original request, and subsequent items are the redirects?
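To make the vector idea concrete, here is a minimal sketch of what such a history could look like. `UrlHistory` and all of its member names are hypothetical and are not part of cpr; the sketch only shows the "first entry is the original, later entries are redirects" layout suggested above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type, not part of cpr: records every URL visited for one request.
class UrlHistory {
  public:
    explicit UrlHistory(std::string original) { urls_.push_back(std::move(original)); }

    // Call once per redirect, in the order the redirects happen.
    void AddRedirect(std::string target) {
        assert(!target.empty());  // a redirect target should never be empty
        urls_.push_back(std::move(target));
    }

    const std::string& Original() const { return urls_.front(); }
    const std::string& Final() const { return urls_.back(); }
    std::size_t RedirectCount() const { return urls_.size() - 1; }
    const std::vector<std::string>& All() const { return urls_; }

  private:
    std::vector<std::string> urls_;  // never empty: index 0 is the original request
};
```

With this layout the original request is always available, even when the chain contains several hops.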

Where did you get it from?

GitHub (branch e.g. master)

Additional Context/Your Environment

OS: Windows 10
Version: 1.10.1


COM8 · 2023-05-14T04:54:48Z

Hi @crazydef, thanks for reporting this. I see your point there.
But based on my experience using libcurl, this is not really possible (as far as I'm aware).

A quick ChatGPT question produced the following example of how to solve it in plain C with libcurl. It's not perfect, but rather a crude way of solving it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <curl/curl.h>

struct MemoryStruct {
  char *memory;
  size_t size;
};

static size_t WriteMemoryCallback(void *contents, size_t size, size_t nmemb, void *userp) {
  size_t realsize = size * nmemb;
  struct MemoryStruct *mem = (struct MemoryStruct *)userp;

  /* use a temporary so the old buffer is not leaked if realloc fails */
  char *grown = realloc(mem->memory, mem->size + realsize + 1);
  if(grown == NULL) {
    printf("not enough memory (realloc returned NULL)\n");
    return 0;
  }
  mem->memory = grown;

  memcpy(&(mem->memory[mem->size]), contents, realsize);
  mem->size += realsize;
  mem->memory[mem->size] = 0;

  return realsize;
}

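static int DebugCallback(CURL *handle, curl_infotype type, char *data, size_t size, void *userp) {
  (void)handle;
  (void)userp;
  /* data is NOT null-terminated, so always print it with an explicit length */
  if(type == CURLINFO_TEXT) {
    printf("== Info: %.*s", (int)size, data);
  }
  /* response headers, including Location, arrive as CURLINFO_HEADER_IN */
  else if(type == CURLINFO_HEADER_IN && size > 9 && curl_strnequal(data, "Location:", 9)) {
    printf("Redirected to:%.*s", (int)(size - 9), data + 9);
    // You may want to store these URLs in a linked list or other data structure here.
  }
  return 0;
}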

int main(void) {
  CURL *curl_handle;
  CURLcode res;

  struct MemoryStruct chunk;

  chunk.memory = malloc(1);  /* will be grown as needed by the realloc above */
  chunk.size = 0;    /* no data at this point */

  curl_global_init(CURL_GLOBAL_ALL);

  /* init the curl session */
  curl_handle = curl_easy_init();

  /* specify URL to get */
  curl_easy_setopt(curl_handle, CURLOPT_URL, "http://example.com");

  /* send all data to this function  */
  curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION, WriteMemoryCallback);

  /* we pass our 'chunk' struct to the callback function */
  curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, (void *)&chunk);

  /* enable verbose for easier tracing */
  curl_easy_setopt(curl_handle, CURLOPT_VERBOSE, 1L);
  curl_easy_setopt(curl_handle, CURLOPT_DEBUGFUNCTION, DebugCallback);

  /* follow redirects */
  curl_easy_setopt(curl_handle, CURLOPT_FOLLOWLOCATION, 1L);

  /* get it! */
  res = curl_easy_perform(curl_handle);

  /* check for errors */
  if(res != CURLE_OK) {
    fprintf(stderr, "curl_easy_perform() failed: %s\n", curl_easy_strerror(res));
  }

  /* cleanup curl stuff */
  curl_easy_cleanup(curl_handle);

  free(chunk.memory);

  /* we're done with libcurl, so clean it up */
  curl_global_cleanup();

  return 0;
}

For this feature we need some kind of "handler" that we can register inside curl and that triggers as soon as we get redirected. I will put this on the backlog.
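One possible building block for such a handler is a function that recognizes `Location` header lines. libcurl can pass each received header line to a callback registered with `CURLOPT_HEADERFUNCTION`; the sketch below only covers the parsing step and leaves the curl wiring out. `ExtractLocation` is a hypothetical helper, not an existing cpr or libcurl function, and it does not resolve relative redirect targets.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: returns the value of a "Location" header line, or
// std::nullopt if `line` is some other header. The name is matched
// case-insensitively because HTTP/2 delivers header names in lowercase.
std::optional<std::string> ExtractLocation(std::string_view line) {
    constexpr std::string_view kName = "location:";
    if (line.size() < kName.size()) return std::nullopt;
    for (std::size_t i = 0; i < kName.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(line[i])) != kName[i]) return std::nullopt;
    }
    line.remove_prefix(kName.size());

    // Trim surrounding whitespace, including the trailing "\r\n".
    const auto is_space = [](char c) { return c == ' ' || c == '\t' || c == '\r' || c == '\n'; };
    while (!line.empty() && is_space(line.front())) line.remove_prefix(1);
    while (!line.empty() && is_space(line.back())) line.remove_suffix(1);
    return std::string(line);
}
```

A header callback could call this for every line it receives and append each hit to a list of redirect targets.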

crazydef · 2023-05-14T10:05:00Z

To be honest, I don't know how useful it would be to know every redirect. I just mentioned that as a possible solution. At the very least though, making a copy of the original URL and keeping that alongside the final URL would probably be sufficient for 99.99% of use cases.
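Until something like this exists in the library, the simpler variant can be approximated on the caller's side. The following is a rough sketch, assuming the request is made through some callable that takes the URL; `TrackedResponse` and `FetchTracked` are made-up names and not part of cpr.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper: pairs any response type with the URL originally requested.
template <typename Response>
struct TrackedResponse {
    std::string requested_url;  // what the application asked for
    Response response;          // whatever the HTTP call returned
};

// Calls `fetch(url)` and keeps a copy of `url` alongside the result.
template <typename Fetch>
auto FetchTracked(const std::string& url, Fetch&& fetch)
    -> TrackedResponse<std::decay_t<decltype(fetch(url))>> {
    assert(!url.empty());  // an empty URL is treated as a caller error in this sketch
    return {url, std::forward<Fetch>(fetch)(url)};
}
```

With cpr, the callable could be a lambda such as `[](const std::string& u) { return cpr::Get(cpr::Url{u}); }`, so that `requested_url` holds the original request while `response.url` holds the final one.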

crazydef added Bug 🐛 Needs Investigation 🔍 labels May 11, 2023

COM8 added Enhancement 👌 and removed Bug 🐛 Needs Investigation 🔍 labels May 14, 2023
