Warcio does not support replay of sites hosted on NCSA 1.5 #141

Open
omgoo opened this issue Feb 14, 2022 · 3 comments

omgoo commented Feb 14, 2022

Here is an interesting one for you, Ilya.

The original NCSA 1.5 web server responds with the status line "HTTP 200 Document follows", i.e. a bare "HTTP" with no version number, rather than a line starting with "HTTP/1.0".

In recordloader.py, HTTP_TYPES only contains 'HTTP/1.0' and 'HTTP/1.1'.

Modifying HTTP_TYPES to look for 'HTTP/1.0', 'HTTP/1.1', 'HTTP' does allow the requested web page to replay. I'd add this as a PR, but I doubt this is the best approach.
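
A minimal sketch of the check described above, not warcio's actual code. The two-entry `HTTP_TYPES` mirrors the values named in this issue; `LENIENT_HTTP_TYPES` and `looks_like_http_status` are hypothetical names used only for illustration. It shows why the NCSA status line fails a prefix test against the two versioned strings and passes once a bare 'HTTP' is added.

```python
# Values named in this issue; everything else here is a hypothetical stand-in.
HTTP_TYPES = ('HTTP/1.0', 'HTTP/1.1')
LENIENT_HTTP_TYPES = HTTP_TYPES + ('HTTP',)


def looks_like_http_status(status_line, prefixes=HTTP_TYPES):
    """Return True if the status line starts with one of the accepted prefixes."""
    return status_line.startswith(tuple(prefixes))


ncsa_line = 'HTTP 200 Document follows'
modern_line = 'HTTP/1.0 200 OK'

print(looks_like_http_status(modern_line))                    # True
print(looks_like_http_status(ncsa_line))                      # False
print(looks_like_http_status(ncsa_line, LENIENT_HTTP_TYPES))  # True
```

Note that a bare 'HTTP' prefix also matches any other line that begins with those four letters, so it is broader than the two-version list. That may be part of why it does not feel like the best fix.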

Here is the header from the ARC file in question:

http://www.open.gov.uk:80/ofsted/nursery/rp511200.htm 193.32.28.8 19970616061332 text/html 30594
HTTP 200 Document follows
Date: Mon, 16 Jun 1997 07:09:23 GMT
Server: NCSA/1.5.1
Last-modified: Fri, 09 May 1997 20:24:52 GMT
Content-type: text/html
Content-length: 30414
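
For reference, the first line above is an ARC v1 URL record (URL, IP address, archive date, content type, length) and the second is the status line that trips the check. The following is a rough sketch of splitting both; `ArcHeader`, `parse_arc_v1_header` and `split_status_line` are hypothetical helpers, not part of warcio.

```python
from collections import namedtuple

# Hypothetical container for the five space-separated ARC v1 URL-record fields.
ArcHeader = namedtuple(
    'ArcHeader', 'url ip_address archive_date content_type length')


def parse_arc_v1_header(line):
    """Split an ARC v1 URL-record line into its five fields."""
    url, ip_address, archive_date, content_type, length = line.strip().split(' ')
    return ArcHeader(url, ip_address, archive_date, content_type, int(length))


def split_status_line(status_line):
    """Split a status line into (protocol token, status code, reason phrase)."""
    protocol, code, reason = status_line.strip().split(' ', 2)
    return protocol, int(code), reason


record = parse_arc_v1_header(
    'http://www.open.gov.uk:80/ofsted/nursery/rp511200.htm '
    '193.32.28.8 19970616061332 text/html 30594')
print(record.archive_date, record.length)              # 19970616061332 30594
print(split_status_line('HTTP 200 Document follows'))  # ('HTTP', 200, 'Document follows')
```

The protocol token comes back as plain 'HTTP' with no '/1.0' suffix. The 180-byte gap between the record length (30594) and Content-length (30414) is consistent with the six header lines plus the blank line, assuming CRLF line endings.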

This is the URL in question, but you'll only see a 500 error:

https://webarchive.nationalarchives.gov.uk//ukgwa/19970616061332/http://www.open.gov.uk:80/ofsted/nursery/rp511200.htm

I'll share the ARC file with you if I can get permission to release it.


omgoo commented Feb 14, 2022

On further investigation, I'm not sure whether this is an NCSA issue or an issue with the 1997 IA ARCs. I can't find a version of NCSA 1.5 to test my theory.


wsdookadr commented Jul 11, 2022

> I can't find a version of NCSA 1.5 to test my theory.

That's okay, I found a copy of the old 1995 source code of NCSA 1.5 here, and an old copy of the docs sits here.
Then I wrote a patch to get it building again. Not much needed to change:

  • a getline(..) name collision, resolved by renaming the server's function to getline1(..)
  • some Makefile changes to get the linker flags right for gdbm
  • some ncsa config file changes, plus a new starter script
  • DIR is now an opaque data structure, so its file descriptor member can no longer be accessed directly; dirfd must be used instead

Here is the patch:

patch1.txt
diff --git a/BUGS b/BUGS
index c173b2d..4a43b40 100644
--- a/BUGS
+++ b/BUGS
@@ -12,7 +12,7 @@ Known Bugs in 1.5.1
 *) Relative urls in imagemaps are broken
 *) doesn't kill cgi scripts if the user aborts
 *) content_length gets reset after scanning cgi headers, instead of before
-*) can core dump on special case in getline (rfc822 line wrapping)
+*) can core dump on special case in getline1 (rfc822 line wrapping)
 
 Known Bugs in 1.5.1b3
 ---------------------
diff --git a/CHANGES b/CHANGES
index 826a6fe..f544900 100644
--- a/CHANGES
+++ b/CHANGES
@@ -12,7 +12,7 @@ Changes for 1.5.2a
 
 Changes for 1.5.2
 ------------------
-*) Changed getline rfc822 line wrap to check for validity of the next bits
+*) Changed getline1 rfc822 line wrap to check for validity of the next bits
    before attempting to see them
 *) Changed imagemap.c so relative URLs actually work
 *) Don't core dump on a method only request
@@ -58,7 +58,7 @@ Changes for 1.5.1
 *) Why do we require full URLs in Redirect?  A local (root) url should work fine
 *) Redirect from .htaccess should work now (completely)
 *) Added hack to allow SSI of CGI, at a great expense of speed (CGI_SSI_HACK)
-*) Made getline() code re-entrant (now has its own sock_buf struct)
+*) Made getline1() code re-entrant (now has its own sock_buf struct)
 
 
 
@@ -219,7 +219,7 @@ Fixes for 1.5 Beta 3
 *) Now log start command to error_log
 *) Improved usage function (for -v command line)
 *) Made sigjmp_buf default define for JMP_BUF (missed from 1.4.2)
-*) Fixed getline()
+*) Fixed getline1()
 
 
 
diff --git a/cgi-src/change-passwd.c b/cgi-src/change-passwd.c
index fe1dd5a..4976046 100644
--- a/cgi-src/change-passwd.c
+++ b/cgi-src/change-passwd.c
@@ -151,7 +151,7 @@ main(int argc, char *argv[]) {
     }
 
     found = 0;
-    while(!(getline(line,256,f))) {
+    while(!(getline1(line,256,f))) {
         if(found || (line[0] == '#') || (!line[0])) {
             putline(tfp,line);
             continue;
diff --git a/cgi-src/imagemap.c b/cgi-src/imagemap.c
index 9f99c72..5c42429 100644
--- a/cgi-src/imagemap.c
+++ b/cgi-src/imagemap.c
@@ -41,7 +41,7 @@
 ** 03/07/95: Carlos Varela, [email protected]
 **
 ** 1.8 : Fixed bug (strcat->sprintf) when reporting error.
-**       Included getline() function from util.c in NCSA httpd distribution.
+**       Included getline1() function from util.c in NCSA httpd distribution.
 **
 ** 11/08/95: Brandon Long, [email protected]
 **
@@ -124,7 +124,7 @@ int main(int argc, char **argv)
        goto openconf;
     }
 
-    while(!(getline(input,MAXLINE,fp))) {
+    while(!(getline1(input,MAXLINE,fp))) {
         char confname[MAXLINE];
         if((input[0] == '#') || (!input[0]))
             continue;
@@ -163,7 +163,7 @@ int main(int argc, char **argv)
         servererr(errstr);
     }
 
-    while(!(getline(input,MAXLINE,fp))) {
+    while(!(getline1(input,MAXLINE,fp))) {
         char type[MAXLINE];
         char url[MAXLINE];
         char num[10];
@@ -377,7 +377,7 @@ int isname(char c)
         return (!isspace(c));
 }
 
-int getline(char *s, int n, FILE *f) {
+int getline1(char *s, int n, FILE *f) {
     register int i=0;
 
     while(1) {
diff --git a/cgi-src/util.c b/cgi-src/util.c
index c3d5d65..b01a9f6 100644
--- a/cgi-src/util.c
+++ b/cgi-src/util.c
@@ -95,7 +95,7 @@ int rind(char *s, char c) {
     return -1;
 }
 
-int getline(char *s, int n, FILE *f) {
+int getline1(char *s, int n, FILE *f) {
     register int i=0;
 
     while(1) {
diff --git a/cgi-src/util.h b/cgi-src/util.h
index eded336..432bd42 100644
--- a/cgi-src/util.h
+++ b/cgi-src/util.h
@@ -6,7 +6,7 @@ char x2c(char *what);
 void unescape_url(char *url);
 void plustospace(char *str);
 int rind(char *s, char c);
-int getline(char *s, int n, FILE *f);
+int getline1(char *s, int n, FILE *f);
 void send_fd(FILE *f, FILE *fd);
 int ind(char *s, char c);
 void escape_shell_cmd(char *cmd);
diff --git a/conf/httpd.conf-dist b/conf/httpd.conf-dist
index dc96e72..f0d3b8f 100644
--- a/conf/httpd.conf-dist
+++ b/conf/httpd.conf-dist
@@ -27,7 +27,7 @@ ServerType standalone
 # need HTTPd to be run as root initially.
 # Default: 80 (or DEFAULT_PORT)
 
-Port 80
+Port 8412
 
 # StartServers: The number of servers to launch at startup.  Must be
 # compiled without the NO_PASS compile option
@@ -66,8 +66,8 @@ TimeOut 1200
 # User/Group: The name (or #number) of the user/group to run HTTPd as.
 # Default: #-1 (or DEFAULT_USER / DEFAULT_GROUP)
 
-User nobody
-Group #-1
+User ncsa
+Group ncsa
 
 # IdentityCheck: Enables or disables RFC931 compliant logging of the 
 # remote user name for sites which run identd or something similar. 
@@ -97,7 +97,7 @@ Group #-1
 # Default: If you do not specify a ServerName, HTTPd attempts to retrieve
 #	   it through system calls.
 
-#ServerName new.host.name
+ServerName localhost
 
 # ServerAdmin: Your address, where problems with the server should be
 # e-mailed.
@@ -262,8 +262,8 @@ DNSMode Standard
 # VirtualHost as Optional or Required.
 
 <VirtualHost 127.0.0.1 Optional>
-DocumentRoot /local
-ServerName localhost.ncsa.uiuc.edu
+DocumentRoot /usr/local/etc/httpd/htdocs
+ServerName localhost
 ResourceConfig conf/localhost_srm.conf
 </VirtualHost>
 
diff --git a/conf/srm.conf-dist b/conf/srm.conf-dist
index 1d713fb..9cec3d2 100644
--- a/conf/srm.conf-dist
+++ b/conf/srm.conf-dist
@@ -49,12 +49,12 @@ ScriptAlias /cgi-bin/ /usr/local/etc/httpd/cgi-bin/
 # FCGIScritpAlias: Same as ScriptAlias, except for FCGI scripts
 # Format: FCGIScriptAlias fakename realname
 
-FCGIScriptAlias /fcgi-bin/ /usr/local/etc/httpd/fcgi-devel-kit/examples/
+# FCGIScriptAlias /fcgi-bin/ /usr/local/etc/httpd/fcgi-devel-kit/examples/
 
 # Define the AppClasses. These get hit when requests come in for
 # /fcgi-bin/tiny-fcgi.fcgi or /fcgi-bin/tiny-fcgi2.fcgi
-AppClass /usr/local/etc/httpd/fcgi-devel-kit/examples/tiny-fcgi.fcgi -listen-queue-depth 10 -processes 2
-AppClass /usr/local/etc/httpd/fcgi-devel-kit/examples/tiny-fcgi2.fcgi -listen-queue-depth 10 -processes 2
+#AppClass /usr/local/etc/httpd/fcgi-devel-kit/examples/tiny-fcgi.fcgi -listen-queue-depth 10 -processes 2
+#AppClass /usr/local/etc/httpd/fcgi-devel-kit/examples/tiny-fcgi2.fcgi -listen-queue-depth 10 -processes 2
 
 #===========================================================================
 # Directory Indexing
@@ -151,6 +151,7 @@ DefaultType text/plain
 #AddType text/x-imagemap .map
 #AddType application/x-httpd-cgi .cgi
 #AddType application/x-httpd-fcgi .fcgi
+#AddType application/x-httpd-cgi .cgi
 
 #===========================================================================
 # Misc Server Resources
diff --git a/src/CHANGES b/src/CHANGES
index 3432366..ccd3c35 100644
--- a/src/CHANGES
+++ b/src/CHANGES
@@ -13,7 +13,7 @@
 
 Fixes for 1.5.2
 ------------------
-*) Changed getline rfc822 line wrap to check for validity of the next bits
+*) Changed getline1 rfc822 line wrap to check for validity of the next bits
    before attempting to see them
 *) Changed imagemap.c so relative URLs actually work
 *) Don't core dump on a method only request
diff --git a/src/HTTPd_REQ_PATH b/src/HTTPd_REQ_PATH
index d081916..b494d34 100644
--- a/src/HTTPd_REQ_PATH
+++ b/src/HTTPd_REQ_PATH
@@ -13,13 +13,13 @@ child_main  			httpd.c
    free
  RequestMain			http_request.c
   signal
-  getline
+  getline1
   setproctitle
   decode_request		http_request.c
    strtok
    MapMethod			http_request.c
    get_mime_headers
-    getline
+    getline1
     strchr
     isspace
     strcasecmp
@@ -73,7 +73,7 @@ child_main  			httpd.c
       stat
       FOpen			fdwrap.c
       parse_access_dir		http_config.c
-       cfg_getline		util.c
+       cfg_getline1		util.c
        access_syntax_error	http_config.c
        cfg_getword		util.c
        add_type			http_mime.c
@@ -151,11 +151,11 @@ child_main  			httpd.c
        add_cgi_vars
        error_log2stderr
        execle/execve
-       getline			util.c
+       getline1			util.c
        write
        read
        scan_script_header	cgi.c
-	getline			util.c
+	getline1			util.c
 	strdup
 	realloc
        waitpid
@@ -167,7 +167,7 @@ child_main  			httpd.c
 	 dump_default_header	http_mime.c
        send_script		cgi.c
 	alarm
-	getline
+	getline1
 	write
 	read
        kill_children		cgi.c
diff --git a/src/Makefile b/src/Makefile
index 381b9b9..f2aac41 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -60,7 +60,7 @@ KRB5_CFLAGS = -DKRB5 -I$(KRB5_DIR)/include -I$(KRB5_DIR)/include/krb5
 #
 # To enable DBM password/groupfile support, define the DBM_SUPPORT flag
 
-DBM_CFLAGS = -DDBM_SUPPORT
+DBM_CFLAGS = "-lgdbm -lgdbm_compat -lcrypt"
 #DBM_LIBS = -lndbm
 #DBM_LIBS = -ldbm 
 #DBM_LIBS = -lgdbm
@@ -187,11 +187,11 @@ hp-cc:
 	make tar AUX_CFLAGS=-DHPUX CC=cc CFLAGS="-O -Aa" DBM_LIBS=-lndbm
 
 linux:
-	make tar AUX_CFLAGS=-DLINUX CC=gcc CFLAGS=-O2 DBM_LIBS=-lgdbm
+	make tar AUX_CFLAGS=-DLINUX CC=gcc CFLAGS=-O2 DBM_LIBS="-lgdbm -lcrypt"
 
 linux2: linux
 linux1: 
-	make tar AUX_CFLAGS="-DLINUX -DFD_LINUX" CC=gcc CFLAGS=-O2 DBM_LIBS=-lgdbm
+	make tar AUX_CFLAGS="-DLINUX -DFD_LINUX" CC=gcc CFLAGS=-O2 DBM_LIBS="-lgdbm -lcrypt"
 
 netbsd:
 	make tar AUX_CFLAGS=-DNETBSD EXTRA_LIBS=-lcrypt CC=cc CFLAGS=-O2
diff --git a/src/cgi.c b/src/cgi.c
index 4a13a13..369504d 100644
--- a/src/cgi.c
+++ b/src/cgi.c
@@ -267,7 +267,7 @@ int scan_cgi_header(per_request *reqInfo, int pd)
    
   /* ADC put in the G_SINGLE_CHAR option, so that CGI SSI's would work.  
    * it was:
-   * if((ret = getline(reqInfo->cgi_buf,str,HUGE_STRING_LEN-1,0,timeout)) <= 0)
+   * if((ret = getline1(reqInfo->cgi_buf,str,HUGE_STRING_LEN-1,0,timeout)) <= 0)
    *
    * This should be cleaned up perhaps so that it only does this if SSI's are
    * allowed for this script directory.  ZZZZ
@@ -278,7 +278,7 @@ int scan_cgi_header(per_request *reqInfo, int pd)
 #endif /* CGI_SSI_HACK */
 
     while(1) {
-      if((ret = getline(reqInfo->cgi_buf,str,HUGE_STRING_LEN-1,options,timeout)) <= 0)
+      if((ret = getline1(reqInfo->cgi_buf,str,HUGE_STRING_LEN-1,options,timeout)) <= 0)
       {
         char error_msg[MAX_STRING_LEN];
 	Close(pd);
@@ -508,7 +508,7 @@ int cgi_stub(per_request *reqInfo, struct stat *finfo, int allow_options)
       int nDone = 0;
       
       signal(SIGPIPE,SIG_IGN);
-      nBytes=getline(reqInfo->sb, szBuf,HUGE_STRING_LEN,G_FLUSH, timeout);
+      nBytes=getline1(reqInfo->sb, szBuf,HUGE_STRING_LEN,G_FLUSH, timeout);
       nTotalBytes = nBytes;
       if (nBytes >= 0) {
         if (nBytes > 0) write(p2[1], szBuf, nBytes);
@@ -538,10 +538,10 @@ int cgi_stub(per_request *reqInfo, struct stat *finfo, int allow_options)
     }
     
     /* Previously, this was broken because we read the results of the CGI using
-     * getline, but the SSI parser used buffered stdio.
+     * getline1, but the SSI parser used buffered stdio.
      * 
      * ADC changed scan_cgi_header so that it uses G_SINGLE_CHAR when it
-     * calls getline.  Yes, this means pitiful performance for CGI scripts.
+     * calls getline1.  Yes, this means pitiful performance for CGI scripts.
      */
     /* Fine, parse the output of CGI scripts.  Talk about useless
      * overhead. . .
@@ -620,7 +620,7 @@ long send_fd(per_request *reqInfo, int pd, void (*onexit)(void))
 
     alarm(timeout);
     if (reqInfo->cgi_buf != NULL)
-      n=getline(reqInfo->cgi_buf, buf,IOBUFSIZE,G_FLUSH,timeout);
+      n=getline1(reqInfo->cgi_buf, buf,IOBUFSIZE,G_FLUSH,timeout);
      else 
       n = 0;
     while (1) {
diff --git a/src/digest.c b/src/digest.c
index ba7b0e9..76678cd 100644
--- a/src/digest.c
+++ b/src/digest.c
@@ -63,7 +63,7 @@ int get_digest(per_request *reqInfo, char *user, char *realm, char *digest,
 		    reqInfo->auth_digestfile);
 	    die(reqInfo,SC_SERVER_ERROR,errstr);
 	}
-	while(!(cfg_getline(l,MAX_STRING_LEN,f))) {
+	while(!(cfg_getline1(l,MAX_STRING_LEN,f))) {
 	    if((l[0] == '#') || (!l[0])) continue;
 	    getword(w,l,':');
 	    getword(r,l,':');
diff --git a/src/fcgi.c b/src/fcgi.c
index be836ba..40ae76b 100644
--- a/src/fcgi.c
+++ b/src/fcgi.c
@@ -2310,7 +2310,7 @@ static int FastCgiDoWork(WS_Request *reqPtr, FastCgiInfo *infoPtr)
 
     if (nFirst) {
       char szBuf[IOBUFSIZE];
-      nBytes=getline(reqPtr->sb, szBuf,IOBUFSIZE,G_FLUSH,0);
+      nBytes=getline1(reqPtr->sb, szBuf,IOBUFSIZE,G_FLUSH,0);
       BufferAddData(infoPtr->reqInbufPtr, szBuf, nBytes);
       if (nBytes > 0) {
 	BufferAddData(infoPtr->reqInbufPtr, szBuf, nBytes);
diff --git a/src/fdwrap.c b/src/fdwrap.c
index dd33a81..fe2e68e 100644
--- a/src/fdwrap.c
+++ b/src/fdwrap.c
@@ -20,8 +20,8 @@
  *
  */
 
-#include "config.h"
 #include "portability.h"
+#include "config.h"
 
 #include <stdio.h>
 #ifndef NO_STDLIB_H 
diff --git a/src/http_access.c b/src/http_access.c
index f7a4827..5e2668e 100644
--- a/src/http_access.c
+++ b/src/http_access.c
@@ -180,11 +180,8 @@ int find_host_deny(per_request *reqInfo, int x)
     return FA_ALLOW;
 }
 
-/* match_referer()
- * currently matches restriction with sent for only as long as restricted
- */
-int match_referer(char *restrict, char *sent) {
-  return !(strcmp_match(sent,restrict));
+int match_referer(char *restrict_, char *sent) {
+  return !(strcmp_match(sent,restrict_));
 }
 
 /* find_referer_allow()
diff --git a/src/http_auth.c b/src/http_auth.c
index 5139dd5..3f90656 100644
--- a/src/http_auth.c
+++ b/src/http_auth.c
@@ -140,7 +140,7 @@ int get_pw(per_request *reqInfo, char *user, char *pw, security_data* sec)
 
     if (reqInfo->auth_pwfile_type == AUTHFILETYPE_STANDARD) { 	
 	/* From Conrad Damon ([email protected]), 	 
-	   Don't start cfg_getline loop if auth_pwfile is a directory. */
+	   Don't start cfg_getline1 loop if auth_pwfile is a directory. */
 
 	if ((stat (reqInfo->auth_pwfile, &finfo) == -1) ||
 	    (!S_ISREG(finfo.st_mode))) {
@@ -152,7 +152,7 @@ int get_pw(per_request *reqInfo, char *user, char *pw, security_data* sec)
 	    sprintf(errstr,"Could not open user file %s",reqInfo->auth_pwfile);
 	    die(reqInfo,SC_SERVER_ERROR,errstr); 	
 	}
-	while(!(cfg_getline(l,MAX_STRING_LEN,f))) { 	 
+	while(!(cfg_getline1(l,MAX_STRING_LEN,f))) { 	 
 	    if((l[0] == '#') || (!l[0])) 
 		continue; 	 
 	    getword(w,l,':');
diff --git a/src/http_config.c b/src/http_config.c
index 54aee66..b21751a 100644
--- a/src/http_config.c
+++ b/src/http_config.c
@@ -186,7 +186,7 @@ void process_server_config(per_host *host, FILE *cfg, FILE *errors,
   if (!virtual) n=0;
   
   /* Parse server config file. Remind me to learn yacc. */
-  while(!(cfg_getline(l,MAX_STRING_LEN,cfg))) {
+  while(!(cfg_getline1(l,MAX_STRING_LEN,cfg))) {
     ++n;
     if((l[0] != '#') && (l[0] != '\0')) {
       cfg_getword(w,l);
@@ -541,7 +541,7 @@ void process_resource_config(per_host *host, FILE *open, FILE *errors,
 	else return;
     }
   } else cfg = open;
-  while(!(cfg_getline(l,MAX_STRING_LEN,cfg))) {
+  while(!(cfg_getline1(l,MAX_STRING_LEN,cfg))) {
     ++n;
     if((l[0] != '#') && (l[0] != '\0')) {
       cfg_getword(w,l);
@@ -862,7 +862,7 @@ int parse_access_dir(per_request *reqInfo, FILE *f, int line, char or,
 	sec[x].on_deny[i] = NULL;
     }
 
-    while(!(cfg_getline(l,MAX_STRING_LEN,f))) {
+    while(!(cfg_getline1(l,MAX_STRING_LEN,f))) {
         ++n;
         if((l[0] == '#') || (!l[0])) continue;
         cfg_getword(w,l);
@@ -1198,7 +1198,7 @@ int parse_access_dir(per_request *reqInfo, FILE *f, int line, char or,
                 else if(!strcasecmp(w,"DELETE")) m[M_DELETE]=1;
             }
             while(1) {
-                if(cfg_getline(l,MAX_STRING_LEN,f))
+                if(cfg_getline1(l,MAX_STRING_LEN,f))
                     access_syntax_error(reqInfo,n,"Limit missing /Limit",f,file);
                 n++;
                 if((l[0] == '#') || (!l[0])) continue;
@@ -1393,7 +1393,7 @@ void process_access_config(FILE *errors)
         perror("fopen");
         exit(1);
     }
-    while(!(cfg_getline(l,MAX_STRING_LEN,f))) {
+    while(!(cfg_getline1(l,MAX_STRING_LEN,f))) {
         ++n;
         if((l[0] == '#') || (!l[0])) continue;
 	cfg_getword(w,l);
diff --git a/src/http_mime.c b/src/http_mime.c
index 0d89048..f1f0151 100644
--- a/src/http_mime.c
+++ b/src/http_mime.c
@@ -146,7 +146,7 @@ void init_mime(void)
     forced_types = NULL;
     encoding_types = NULL;
 
-    while(!(cfg_getline(l,MAX_STRING_LEN,f))) {
+    while(!(cfg_getline1(l,MAX_STRING_LEN,f))) {
         if(l[0] == '#') continue;
         cfg_getword(w,l);
         if(!(ct = (char *)malloc(sizeof(char) * (strlen(w) + 1))))
diff --git a/src/http_request.c b/src/http_request.c
index 57e6808..9973839 100644
--- a/src/http_request.c
+++ b/src/http_request.c
@@ -484,7 +484,7 @@ void get_http_headers(per_request *reqInfo)
     char *field_val;
     int options = 0;
 
-    while(getline(reqInfo->sb,field_type,HUGE_STRING_LEN-1,options,
+    while(getline1(reqInfo->sb,field_type,HUGE_STRING_LEN-1,options,
 		  timeout) != -1) {
 
         if(!field_type[0]) 
@@ -612,7 +612,7 @@ void RequestMain(per_request *reqInfo)
       sockbuf_count++;
     }
 
-    if (getline(reqInfo->sb, as_requested, HUGE_STRING_LEN,
+    if (getline1(reqInfo->sb, as_requested, HUGE_STRING_LEN,
 		options, timeout) == -1)
         return;
 
diff --git a/src/portability.h b/src/portability.h
index 7f4fc9f..62a9fdb 100644
--- a/src/portability.h
+++ b/src/portability.h
@@ -20,6 +20,7 @@
 #ifndef _PORTABILITY_H_
 #define _PORTABILITY_H_
 
+
 /* Define one of these according to your system. */
 #if defined(SUNOS4)
 #define BSD
@@ -30,6 +31,7 @@
 char *crypt(char *pw, char *salt);
 #define DIR_FILENO(p)  ((p)->dd_fd)
 
+
 #elif defined(SOLARIS2)
 #undef BSD
 #define NO_KILLPG
@@ -210,7 +212,7 @@ typedef int mode_t;
 #endif
 /* Needed for newer versions of libc (5.2.x) to use FD_LINUX hack */
 #define DIRENT_ILLEGAL_ACCESS
-#define DIR_FILENO(p)  ((p)->dd_fd)
+#define DIR_FILENO(p)  (dirfd(p))
 #define CMSG_DATA(cmptr)  ((cmptr)->cmsg_data)
 #define NEED_SYS_UN_H
 #undef BSD
diff --git a/src/rfc822.c b/src/rfc822.c
index ad13ad0..02309a8 100644
--- a/src/rfc822.c
+++ b/src/rfc822.c
@@ -3,8 +3,8 @@
   30-Aug-94 ekr
 */
 
-/*A wrapper around getline to do rfc822 line unfolding*/
-int ht_rfc822_getline(char *s,int n,int f,unsigned int timeout)
+/*A wrapper around getline1 to do rfc822 line unfolding*/
+int ht_rfc822_getline1(char *s,int n,int f,unsigned int timeout)
   {
     static char pb=0;
     int len;
@@ -22,7 +22,7 @@ int ht_rfc822_getline(char *s,int n,int f,unsigned int timeout)
         return(0);
     }
    
-    while(!getline(s,n,f,timeout)){
+    while(!getline1(s,n,f,timeout)){
       len=strlen(s);
       s+=len;      
       n-=len;
diff --git a/src/util.c b/src/util.c
index 5e81c52..750970f 100644
--- a/src/util.c
+++ b/src/util.c
@@ -545,7 +545,7 @@ void http2cgi(char* h, char *w) {
 	w++;
 }
 
-void getline_timed_out(int sig) 
+void getline1_timed_out(int sig) 
 {
     char errstr[MAX_STRING_LEN];
 
@@ -582,7 +582,7 @@ sock_buf *new_sock_buf(per_request *reqInfo, int sd)
  * This routine is currently not thread safe.
  * This routine may be thread safe. (blong 3/13/96)
  */
-int getline(sock_buf *sb, char *s, int n, int options, unsigned int timeout)
+int getline1(sock_buf *sb, char *s, int n, int options, unsigned int timeout)
 {
     char *endp = s + n - 1;
     int have_alarmed = 0;
@@ -614,7 +614,7 @@ int getline(sock_buf *sb, char *s, int n, int options, unsigned int timeout)
     do {
 	if (sb->buf_posn == sb->buf_good) {
 	    have_alarmed = 1;
-	    signal(SIGALRM,getline_timed_out);
+	    signal(SIGALRM,getline1_timed_out);
 	    alarm(timeout);
 
 	    ret=read(sb->sd, sb->buffer, size);
@@ -738,7 +738,7 @@ int eat_ws (FILE* fp)
     return ch;
 }
          
-int cfg_getline (char* s, int n, FILE* fp)
+int cfg_getline1 (char* s, int n, FILE* fp)
 {
     int   len = 0, ch;
 
diff --git a/src/util.h b/src/util.h
index 41f78e1..3c1079d 100644
--- a/src/util.h
+++ b/src/util.h
@@ -24,7 +24,7 @@
 #include <time.h>
 #include <sys/stat.h>
 
-/* getline options */
+/* getline1 options */
 #define G_RESET_BUF	1
 #define G_FLUSH		2
 #define G_SINGLE_CHAR   4
@@ -49,10 +49,10 @@ void getparents(char *name);
 void no2slash(char *name);
 uid_t uname2id(char *name);
 gid_t gname2id(char *name);
-int getline(sock_buf *sb, char *s, int n, int options, unsigned int timeout);
+int getline1(sock_buf *sb, char *s, int n, int options, unsigned int timeout);
 sock_buf *new_sock_buf(per_request *reqInfo, int sd);
 int eat_ws (FILE* fp);
-int cfg_getline(char *s, int n, FILE *f);
+int cfg_getline1(char *s, int n, FILE *f);
 void getword(char *word, char *line, char stop);
 void splitURL(char *line, char *url, char *args);
 void cfg_getword(char *word, char *line);
diff --git a/start.sh b/start.sh
new file mode 100755
index 0000000..1ebbefa
--- /dev/null
+++ b/start.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+./httpd
+while true; do
+    sleep 1;
+done
diff --git a/support/Makefile b/support/Makefile
index 26c65eb..afeef87 100644
--- a/support/Makefile
+++ b/support/Makefile
@@ -49,7 +49,7 @@ hp-gcc:
 	make all CC=gcc CFLAGS="-DHPUX" EXTRA_LIBS=-lndbm
 
 linux:
-	make all CC=gcc CFLAGS="-DLINUX" EXTRA_LIBS=-lgdbm
+	make all CC=gcc CFLAGS="-DLINUX" EXTRA_LIBS="-lcrypt -lgdbm -lgdbm_compat"
 
 netbsd: 
 	make all CC=cc CFLAGS="-DNETBSD" EXTRA_LIBS=-lcrypt
diff --git a/support/dbmdigest.c b/support/dbmdigest.c
index 75f22db..bd3473c 100644
--- a/support/dbmdigest.c
+++ b/support/dbmdigest.c
@@ -42,7 +42,7 @@ void getword(char *word, char *line, char stop) {
     while(line[y++] = line[x++]);
 }
 
-int getline(char *s, int n, FILE *f) {
+int getline1(char *s, int n, FILE *f) {
     register int i=0;
 
     while(1) {
@@ -166,7 +166,7 @@ main(int argc, char *argv[]) {
     strcpy(user,argv[2]);
 
     found = 0;
-    while(!(getline(line,MAX_STRING_LEN,f))) {
+    while(!(getline1(line,MAX_STRING_LEN,f))) {
         if(found || (line[0] == '#') || (!line[0])) {
             putline(tfp,line);
             continue;
diff --git a/support/htpasswd.c b/support/htpasswd.c
index fb3415a..cedf37d 100644
--- a/support/htpasswd.c
+++ b/support/htpasswd.c
@@ -45,7 +45,7 @@ void getword(char *word, char *line, char stop) {
     while(line[y++] = line[x++]);
 }
 
-int getline(char *s, int n, FILE *f) {
+int getline1(char *s, int n, FILE *f) {
     register int i=0;
 
     while(1) {
@@ -163,7 +163,7 @@ main(int argc, char *argv[]) {
     strcpy(user,argv[2]);
 
     found = 0;
-    while(!(getline(line,MAX_STRING_LEN,f))) {
+    while(!(getline1(line,MAX_STRING_LEN,f))) {
         if(found || (line[0] == '#') || (!line[0])) {
             putline(tfp,line);
             continue;
diff --git a/support/webgrab.c b/support/webgrab.c
index b254c49..53cc9fa 100644
--- a/support/webgrab.c
+++ b/support/webgrab.c
@@ -24,6 +24,7 @@
 #include <netdb.h>
 
 #include <string.h>
+#include <stdlib.h>
 
 #define VERSION "1.3"
 

Dockerfile
FROM debian:11
RUN apt-get -y update && apt-get -y install gcc make libgdbm-dev libgdbm-compat-dev procps curl
ADD ncsa-httpd /ncsa-httpd
RUN cd ncsa-httpd && make clean linux
RUN mkdir -p /usr/local/etc/httpd/htdocs
RUN mkdir -p /usr/local/etc/httpd/logs
RUN mkdir -p /usr/local/etc/httpd/conf
ADD ncsa-httpd/conf/httpd.conf-dist /usr/local/etc/httpd/conf/httpd.conf
ADD ncsa-httpd/conf/access.conf-dist /usr/local/etc/httpd/conf/access.conf
ADD ncsa-httpd/conf/localhost_srm.conf-dist /usr/local/etc/httpd/conf/localhost_srm.conf
ADD ncsa-httpd/conf/mime.types /usr/local/etc/httpd/conf/mime.types
ADD ncsa-httpd/conf/srm.conf-dist /usr/local/etc/httpd/conf/srm.conf
RUN useradd -ms /bin/bash ncsa

WORKDIR /ncsa-httpd

CMD ["./start.sh"]

Makefile

build:
	docker build -t ncsa-1.5 .

start:
	docker run -p 8412:8412 --rm --name "oldncsa" ncsa-1.5

stop:
	docker stop oldncsa

shell:
	docker exec -ti oldncsa bash

Now if I do something like this:

user@garage3:~/ncsa$ make start 
docker run -p 8412:8412 --rm --name "oldncsa" ncsa-1.5
NCSA HTTPd NCSA/1.5.2a
Licensed material.  Portions of this work are
Copyright (C) 1995-1996 Board of Trustees of the University of Illinois
Copyright (C) 1995-1996 The Apache Group
Copyright (C) 1989-1993 RSA Data Security, Inc.
Copyright (C) 1993-1994 Carnegie Mellon University
Copyright (C) 1991      Bell Communications Research, Inc. (Bellcore)
Copyright (C) 1994      Spyglass, Inc.

And it's ready to serve requests. I'm attaching a zip here with the source already patched and all the aforementioned files included.
To build the Docker image, run make build first.

ncsa.zip
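
To see which status line a running server actually sends (for example the container started with make start), one option is a raw socket probe, since an HTTP client library may reject a non-standard status line before showing it. This is a minimal sketch: `fetch_status_line` and `has_http_version` are hypothetical helper names, and port 8412 is the one set in the patched httpd.conf-dist. Nothing here asserts what the rebuilt 1.5.2a server replies with.

```python
import socket


def fetch_status_line(host, port, path='/', timeout=5.0):
    """Send a bare HTTP/1.0 GET and return the first line of the reply."""
    request = 'GET {} HTTP/1.0\r\n\r\n'.format(path).encode('ascii')
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        reply = b''
        # Read until the first line is complete or the server closes the socket.
        while b'\n' not in reply:
            chunk = sock.recv(1024)
            if not chunk:
                break
            reply += chunk
    first_line = reply.split(b'\n', 1)[0].rstrip(b'\r')
    return first_line.decode('latin-1')


def has_http_version(status_line):
    """True for 'HTTP/x.y ...' status lines, False for a bare 'HTTP ...'."""
    return status_line.split(' ', 1)[0].startswith('HTTP/')
```

Calling `fetch_status_line('localhost', 8412)` against the container and passing the result to `has_http_version` would show whether this build emits a bare `HTTP` token, which could help separate an NCSA behaviour from an artefact of the 1997 ARC files.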


omgoo commented Apr 6, 2023

I've added a PR that fixes the issue with replaying web archives created from servers running NCSA 1.5.1. I'm not convinced this is the best solution, but it does fix our issue and allows the archive content to replay: #153
