Warcio does not support replay of sites hosted on NCSA 1.5 #141
Comments
On further investigation, I'm not sure whether this is an NCSA issue or an issue with the 1997 IA ARCs. I can't find a version of NCSA 1.5 to test my theory.
That's okay, I found a copy of the old 1995 source code of NCSA 1.5 here, and an old copy of the docs sits here.
Here is the patch: patch1.txt
Dockerfile
Makefile
Now if I do something like this:
And it's ready to serve requests. I'm attaching a zip here with the source already patched and all aforementioned files included.
I've added a PR that fixes the issue with replaying web archives that were created from servers running NCSA 1.5.1. I'm not convinced this is the best solution, but it does fix our issue and allows the archive content to replay: #153
Here is an interesting one for you, Ilya.
The original NCSA 1.5 web server responds with "HTTP 200 Document follows" rather than a status line starting with "HTTP/1.0".
In recordloader.py, HTTP_TYPES only looks for 'HTTP/1.0' and 'HTTP/1.1'.
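To illustrate the mismatch, here is a minimal standalone sketch of a prefix check on the first line of a response. It is not warcio's actual code: `STRICT_PREFIXES` and `starts_with_known_protocol` are hypothetical names made up for this example, and the real parser may well do more than a plain prefix comparison.

```python
# Simplified model of a status-line prefix check.
# NOTE: hypothetical helper for illustration only, not part of warcio.

STRICT_PREFIXES = ("HTTP/1.0", "HTTP/1.1")


def starts_with_known_protocol(status_line, prefixes=STRICT_PREFIXES):
    """Return True if the status line begins with one of the accepted prefixes."""
    return status_line.strip().startswith(tuple(prefixes))


standard_line = "HTTP/1.0 200 OK"
ncsa_line = "HTTP 200 Document follows"  # status line reported in this issue

print(starts_with_known_protocol(standard_line))  # True
print(starts_with_known_protocol(ncsa_line))      # False: no "/1.0" or "/1.1"
```

Because the prefix list is a parameter, a broader list can be tried without editing the function. Note that a bare prefix such as `"HTTP"` would also accept any other line that merely starts with those four letters.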
Modifying HTTP_TYPES to look for
'HTTP/1.0', 'HTTP/1.1', 'HTTP'
does allow the requested web page to replay. I'd add this as a PR, but I doubt this is the best idea.
Here is the header from the ARC file in question:
This is the URL in question, but you'll only see a 500 error:
https://webarchive.nationalarchives.gov.uk//ukgwa/19970616061332/http://www.open.gov.uk:80/ofsted/nursery/rp511200.htm
I'll share the ARC file with you if I can get permission to release it.