PDFs missing from NoteCards HC files at files.interlisp.org #1830
Comments
These are all files that failed in some way (an error or a call to HELP) when run through TEdit or, for Lisp source files, have package problems. The Notecards files might have missing image object sources, because Notecards isn't in WHEREIS.HASH for TEdit to pick up. There may be other problems that need to be fixed in the HCFILES process, or perhaps some TEdit bugs. I think the way to address these is to take the files one at a time: try to TEdit hardcopy the file, fix the problems encountered, and then try again, since many of the problems affect more than one file. |
@pamoroso @rmkaplan @fghalasz The task here is to figure out why HCFILES fails on these files by seeing what errors arise when you just try to TEDIT the files and "hardcopy" to pdf or postscript. @fghalasz wrote: The hcfiles-fails.txt and hcfiles.dribble files from the weekly builds are stored along with all the hcfiles-generated PDF files on files.interlisp.org. Unfortunately they are not included in the html index pages that HCFILES creates because they are not available until after HCFILES is run. But they are available at https://files.interlisp.org/medley/loadups/hcfiles-fails.txt and https://files.interlisp.org/medley/loadups/hcfiles.dribble either in a browser or via wget. hcfiles-fails.txt is just a selection from hcfiles.dribble of every line that ends in IL:FAIL. |
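The selection described above (every line of hcfiles.dribble that ends in IL:FAIL) can be reproduced locally after downloading the dribble file. This is only a minimal sketch: the function name is hypothetical, and it assumes the marker sits at the very end of the line apart from trailing whitespace.

```python
from typing import Iterable, List

FAIL_MARKER = "IL:FAIL"  # suffix quoted in the thread


def select_failures(dribble_lines: Iterable[str]) -> List[str]:
    """Return the lines that end in IL:FAIL, with trailing whitespace removed.

    Hypothetical helper: the real hcfiles-fails.txt is produced by the
    build scripts, whose exact selection logic is not shown in this thread.
    """
    selected = []
    for line in dribble_lines:
        stripped = line.rstrip()
        if stripped.endswith(FAIL_MARKER):
            selected.append(stripped)
    return selected


if __name__ == "__main__":
    # Assumes hcfiles.dribble was fetched beforehand (for example with wget).
    with open("hcfiles.dribble", encoding="latin-1") as dribble:
        for failure in select_failures(dribble):
            print(failure)
```

Reading with `latin-1` is a deliberate choice here so that stray bytes in the dribble do not abort the scan.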
I did a quick sampling and most of the issues are In one case opening the file with TEdit breaks with the error I'll go through each file and report on the issues in more detail. |
Opening {MEDLEY}/notecards/docs/user-guide-v1.2/03-SOFTWARE-INSTALLATION.TEDIT with
The context of the error:
|
Opening {MEDLEY}/notecards/docs/user-guide-v1.2/04-SYSTEM-USE-ISSUES.TEDIT with
The document has other rendering issues such as black boxes and random ideographic characters like in this screenshot: Using the mouse wheel to scroll down the document yields a break window with the error:
The error context:
Exiting the break with Using the scroll bar to scroll yields a break window with the same error but a different stacktrace. Using the |
Opening {MEDLEY}/notecards/docs/user-guide-v1.2/05-NOTECARDS-BASICS.TEDIT with
where
The context of the
|
That looks like an xccs encoding issue- it thinks the font name is not in charset 0. This is one for @rmkaplan |
This one fails also in the Venue sysout, albeit with a different kind of error. But there the image object getfns don’t load automatically, and maybe were not included in the basic loadup. So I’ll have to poke a little more before I can say that the file itself is smashed
|
if you load the image object GETFN manually, does it still fail in the Venue sysout? |
The HRULE getfn is already there, although it is hard to say that that is what it is really looking for. In the Venue sysout the failure shows up as trying to get the UNKNOWNGETFN property of a NIL image object; in my current code it shows up as a font problem with a message that it failed in HRULE.GETFN (but that message may be the result of an earlier problem).
|
Opening {MEDLEY}/notecards/docs/user-guide-v1.2/11-SYSTEM-CARDS.TEDIT with
The context of the error:
|
This doesn’t give an error in the Venue sysout, but the file shows up as mostly black bars after the first half page.
With current code, this is not recognized as a Tedit format file; something is wrong with the trailer. So Tedit tries to read it as a plaintext file, and runs into long 255 sequences, which the XCCS format doesn’t like.
… On Sep 23, 2024, at 12:27 PM, Paolo Amoroso ***@***.***> wrote:
Opening {MEDLEY}/notecards/docs/user-guide-v1.2/11-SYSTEM-CARDS.TEDIT with (TEDIT '11-SYSTEM-CARDS.TEDIT) opens a blank TEdit window and a break window with the error:
In ERROR:
EXPECTED PLANE 0 XCCS CHARACTER IS ILL-FORMED
The context of the error:
6_: BT
ERROR
OPENTEXTSTREAM
TEDIT
TEDIT
FAULTEVAL
EVAL
EXEC
7_: BTV
MESS1 "EXPECTED PLANE 0 XCCS CHARACTER IS ILL-FORMED"
MESS2 NIL
NOBREAK NIL
ERROR
STRM
#<Input Stream on {DSK}<home>paolo>il>ncdocs>11-SYSTEM-CARDS.TEDIT;1/130,12200>
START 0
END 98304
DEFAULTCHARLOOKS {CL101/13312:Gacha10}
DEFAULTPARALOOKS {FMT101/62694:LE-0-0}
NEXTFILEPOS 1060
CHARSET 255
FIRSTPC {PIECE}#144,127614
CODESIZE 1
SBINABLE T
EOLC 0
PC {PIECE}#122,66362
BYTE 255
CHAR NIL
PREVPC {PIECE}#122,66362
PTYPE 0
RUNLEN 15
FILEPOS 1045
CRBEFORE T
SHIFTNEXT T
\TEDIT.GET.UNFORMATTED.FILE.XCCS
SI::*CLEANUP-FORMS* SI::RESETUNWIND
TEXTOBJ {TEXTOBJ}#144,172000
FORMAT :XCCS
DEFAULTCHARLOOKS {CL101/13312:Gacha10}
DEFAULTPARALOOKS {FMT101/62694:LE-0-0}
PIECES NIL
SI::*UNWIND-PROTECT*
STREAM
#<Input Stream on {DSK}<home>paolo>il>ncdocs>11-SYSTEM-CARDS.TEDIT;1/130,12200>
TSTREAM #<IO Text Stream/174,76000>
START 0
END 98304
PROPS NIL
LISPXHIST ((&) (4 "" . "_ ") "<not yet evaluated>"
NIL)
SI::*RESETFORMS* ((& NIL))
RESETSTATE NIL
\TEDIT.GET.UNFORMATTED.FILE
SI::*CLEANUP-FORMS* SI::RESETUNWIND
TEXTOBJ {TEXTOBJ}#144,172000
PWINDOW {WINDOW}#122,71664
READONLY NIL
SI::*UNWIND-PROTECT*
TEXT
#<Input Stream on {DSK}<home>paolo>il>ncdocs>11-SYSTEM-CARDS.TEDIT;1/130,12200>
TSTREAM #<IO Text Stream/174,76000>
START 0
END 98304
PROPS (BEING-EDITED T)
LISPXHIST ((&) (4 "" . "_ ") "<not yet evaluated>"
NIL)
SI::*RESETFORMS* NIL
RESETSTATE NIL
\TEDIT.OPENTEXTSTREAM.PIECES
SI::*CLEANUP-FORMS* SI::RESETUNWIND
TSTREAM #<IO Text Stream/174,76000>
TEXTOBJ {TEXTOBJ}#144,172000
TEDIT.GET.FINISHEDFORMS NIL
PRIMARYW NIL
SI::*UNWIND-PROTECT*
TEXT
#<Input Stream on {DSK}<home>paolo>il>ncdocs>11-SYSTEM-CARDS.TEDIT;1/130,12200>
WINDOW {WINDOW}#122,77000
START NIL
END NIL
PROPS (BEING-EDITED T)
LISPXHIST ((&) (4 "" . "_ ") "<not yet evaluated>"
NIL)
SI::*RESETFORMS* NIL
RESETSTATE NIL
OPENTEXTSTREAM
TEXT 11-SYSTEM-CARDS.TEDIT
WINDOW NIL
DONTSPAWN NIL
PROPS (BEING-EDITED T)
TSTREAM NIL
PROC NIL
TEDIT
TEXT 11-SYSTEM-CARDS.TEDIT
WINDOW NIL
DONTSPAWN NIL
PROPS NIL
TEDIT
*FORM* (TEDIT (QUOTE 11-SYSTEM-CARDS.TEDIT))
*ARGVAL* NIL
*TAIL* NIL
*FN* TEDIT
\EVALFORM
FAULTEVAL
*FORM* (UNDOABLY (TEDIT &))
\EVALFORM
\INTERNAL NIL
EVAL
EVAL-INPUT
RETRYFLAG NIL
HELPCLOCK 674
DO-EVENT
SI::*DUMMY-FOR-CATCH* T
SI::*CATCH-RETURN-FROM* (&)
LISPXHIST ((&) (4 "" . "_ ") "<not yet evaluated>"
NIL)
HELPCLOCK 0
XCL::EXECA0001A0002
*CURRENT-EVENT* ((&) (4 "" . "_ ")
"<not yet evaluated>" NIL)
SI::NLSETQ-VALUE NIL
*PROCEED-CASES* (&)
SI::*NLSETQFLAG* NIL
XCL::EXECA0001
\PROGV
XCL::TOP-LEVEL-P T
XCL::WINDOW {WINDOW}#161,150664
XCL::TITLE-SUPPLIED NIL
XCL::TITLE NIL
*THIS-EXEC-COMMANDS* (#<Hash-Table @ 166,31666>)
XCL::ENVIRONMENT NIL
XCL::PROMPT NIL
XCL::FN EVAL-INPUT
XCL::PROFILE "XCL"
*EXEC-ID* ""
XCL::PROFILE-CACHE (XCL::*PROFILE-NAME* "IL"
XCL:*EVAL-FUNCTION* EVAL *PACKAGE* #<Package INTERLISP>
*READTABLE* #<ReadTable INTERLISP/174,75714>
XCL:*EXEC-PROMPT* "_ " --)
EXEC
\PROC.REPEATEDLYEVALQT
*FORM* (\PROC.REPEATEDLYEVALQT)
*ARGVAL* NIL
*TAIL* NIL
*FN* \PROC.REPEATEDLYEVALQT
\EVALFORM
%#FORM# (\PROC.REPEATEDLYEVALQT)
*CURRENT-PROCESS* #<Process EXEC/174,26204>
HELPFLAG BREAK!
\CURRENTDISPLAYLINE 0
\#DISPLAYLINES 25
\LINEBUF.OFD #<IO Linebuffer Stream/144,125400>
*READTABLE* #<ReadTable INTERLISP/174,75714>
\PRIMTERMTABLE {TERMTABLEP}#174,70740
\PRIMTERMSA {CHARTABLE}#174,71000
TtyDisplayStream #<Output Display Stream/130,12700>
SI::*RESETFORMS* NIL
\INTERRUPTABLE T
\TTYWINDOW NIL
READBUF NIL
\TERM.OFD #<Output Display Stream/170,117500>
*STANDARD-OUTPUT* #<Output Display Stream/170,117500>
*STANDARD-INPUT* #<IO Linebuffer Stream/144,125400>
\MAKE.PROCESS0
T
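The backtrace shows the plaintext reader stopping on BYTE 255 near FILEPOS 1045. To check a suspect file outside Medley, one could scan it for runs of 255 bytes. The sketch below is not part of TEdit or HCFILES; the function name and the minimum run length are assumptions chosen for illustration.

```python
from typing import List, Tuple


def find_ff_runs(data: bytes, min_len: int = 4) -> List[Tuple[int, int]]:
    """Return (offset, length) for each run of 0xFF bytes of at least min_len.

    Hypothetical diagnostic: it only reports where long 255 sequences sit,
    it does not interpret the TEdit file format.
    """
    runs = []
    i = 0
    n = len(data)
    while i < n:
        if data[i] == 0xFF:
            start = i
            # Advance to the end of this run of 0xFF bytes.
            while i < n and data[i] == 0xFF:
                i += 1
            if i - start >= min_len:
                runs.append((start, i - start))
        else:
            i += 1
    return runs


if __name__ == "__main__":
    with open("11-SYSTEM-CARDS.TEDIT", "rb") as f:
        for offset, length in find_ff_runs(f.read()):
            print(f"run of {length} bytes of 255 at offset {offset}")
```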
|
Opening {MEDLEY}/notecards/docs/user-guide-v1.2/APP-Z-PROG-INTERFACE.TEDIT with |
Note that these all fail in the Venue sysout, but not always in the same way. |
Based on our discussion and experiments today, I looked at some of the HCFILES failures in notecards/library/. For ones that wouldn't open in the display, I swapped the CR and LF bytes, and if I then got a font-not-found for size 13, I advised fontcreate in Tedit to substitute 10 for 13 when it read the font. This helped with some of the files (progress), but others still failed, now in different ways. I also noticed when I scrolled to the bottom of some of the recovered files that there was sometimes a little garbling there. In some cases the last few characters didn't show on the display, in other cases there were a few black boxes at the end. It appears that there are some 0 bytes that are being included in the last piece (maybe 3?). Also playing around, one of the other HCFILES failures (I now forget which) was failing not in Tedit, but in the RENAMEFILE that was copying the temporary pdf file back to its place in the Medley file system. This was way down in COPYCHARS, where it was trying to read a character with the ANY eol convention (which we have talked about separately). But I'm not sure why RENAMEFILE in this context (or any context) doesn't use COPYBYTES. Is that because it is copying from {UNIX} to {DSK}? |
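The CR/LF swap and the check for stray 0 bytes can be sketched outside Medley as plain byte operations. This is an illustration only: the thread does not say whether the swap was applied to the whole file or just to the text portion, so the helper below swaps every CR and LF in whatever bytes it is given, and both function names are hypothetical.

```python
# Translation table mapping CR (13) to LF (10) and LF to CR.
_SWAP_CR_LF = bytes.maketrans(b"\r\n", b"\n\r")


def swap_cr_lf(data: bytes) -> bytes:
    """Exchange every CR byte with LF and every LF byte with CR."""
    return data.translate(_SWAP_CR_LF)


def count_trailing_nuls(data: bytes) -> int:
    """Count the 0 bytes at the very end of the data."""
    return len(data) - len(data.rstrip(b"\x00"))


if __name__ == "__main__":
    text = b"line one\nline two\n\x00\x00\x00"
    print(swap_cr_lf(text))
    print(count_trailing_nuls(text))
```

Applying the swap twice returns the original bytes, so the operation is easy to undo if it turns out not to help for a given file.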
I opened a separate issue on the renaming problem #1900 |
Apart from the fact that a rename shouldn't be necessary, I think I see why the COPYCHARS hit the low-level error. It appears to think that both the newly-created tmp pdf file and the destination in {DSK} are XCCS files, so it is trying to read characters as XCCS encoded. But there appear to be 255's strewn around the pdf in unholy ways, the XCCS INCCODE function doesn't quite know what to do, and it loses track of the byte count. The Unix PS-to-PDF utilities traffic only in utf8/unicode files, but RENAMEFILE doesn't know that--the external format can be specified at OPEN, but currently not through the generic RENAMEFILE function. RENAMEFILE could be extended to take some optional parameters, like OPEN. But a more immediate fix may be to figure out why GETFILEINFO is saying that PDF files are TEXT. |
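The difference between a byte copy and a character copy can be shown with an analogy in Python. This is not XCCS and not the Medley COPYCHARS or COPYBYTES code: it uses UTF-8 as a stand-in text encoding, and the two function names are hypothetical. The point is only that a copy which decodes characters can fail on binary content, while a byte copy cannot.

```python
import io
import shutil


def copy_bytes(src: bytes) -> bytes:
    """Copy the data byte for byte, with no interpretation."""
    out = io.BytesIO()
    shutil.copyfileobj(io.BytesIO(src), out)
    return out.getvalue()


def copy_chars(src: bytes, encoding: str = "utf-8") -> bytes:
    """Copy by decoding to characters and re-encoding.

    Raises UnicodeDecodeError when the bytes are not valid text in the
    assumed encoding, which is the analogue of the reader losing track.
    """
    return src.decode(encoding).encode(encoding)


if __name__ == "__main__":
    # Made-up payload with 255 bytes, loosely imitating binary PDF content.
    payload = b"%PDF-1.4\n\xff\xff\xff binary stream"
    assert copy_bytes(payload) == payload
    try:
        copy_chars(payload)
    except UnicodeDecodeError as err:
        print("character copy failed:", err)
```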
It seems that the TYPE of both PS and PDF files is coming from the entry on DEFAULTFILETYPELIST, which says that they are text. Should those be changed to BINARY? That would turn off the attempted character conversions between file devices. |
I think they should both be BINARY. PDF is for sure, PS may be text, but you can't guarantee that there isn't, say, a binary blob somewhere in there. |
If the FDEVs for the source and destination of the rename operation are different then it is implemented via |
I'll change everything to BINARY. But it shouldn't be necessary even to specify this, since BINARY is the default if you don't say TEXT (#1902 ) |
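The rule that a file is BINARY unless it is explicitly listed as TEXT can be modeled as a table lookup with a default. The table below is hypothetical and is not the contents of DEFAULTFILETYPELIST; it only illustrates how dropping an entry changes the result.

```python
import os

# Hypothetical extension table, for illustration only.
TEXT_EXTENSIONS = {"TXT": "TEXT"}


def file_type(filename: str, table: dict = TEXT_EXTENSIONS) -> str:
    """Return the listed type for the file's extension, else BINARY."""
    extension = os.path.splitext(filename)[1].lstrip(".").upper()
    return table.get(extension, "BINARY")


if __name__ == "__main__":
    print(file_type("report.pdf"))  # not listed, so BINARY
    print(file_type("notes.txt"))   # listed as TEXT
```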
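The PS files generated by POSTSCRIPTSTREAM are TEXT. There are no binary objects in PostScript. (At least at the PostScript level when I first wrote POSTSCRIPTSTREAM almost 4 decades ago!) That being said, I can't think of any reason that marking them as BINARY should cause any problems. |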
But I assume you wouldn’t want a utf8 byte sequence to be converted to XCCS (or the other way around) under any circumstances.
|
According to the PostScript Reference Manual (1985) which I used developing POSTSCRIPTSTREAM:
The standard character set for PostScript programs is the printable subset of the ASCII character set, plus the characters space, tab, and newline (return or line-feed). PostScript does not prohibit the use of characters outside this set; but such use is not recommended since it impairs portability and may make transmission and storage of PostScript programs more difficult.
POSTSCRIPTSTREAM should use that restricted character set. What I developed did. I don’t know about later edits, but I haven’t seen any such code.
Again, I can't think of any reason that marking them as BINARY should cause any problems. So, I have no objection to marking them as BINARY.
Matt
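Whether a given PS file stays within the character set quoted above (printable ASCII plus space, tab, and return or line-feed) can be checked with a small scan. This is a sketch, not part of POSTSCRIPTSTREAM; the function name is hypothetical.

```python
from typing import Optional

# Printable ASCII (0x21 to 0x7E) plus space, tab, line-feed and return.
STANDARD_PS_BYTES = frozenset(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def first_nonstandard_byte(data: bytes) -> Optional[int]:
    """Return the offset of the first byte outside the standard set, or None."""
    for offset, value in enumerate(data):
        if value not in STANDARD_PS_BYTES:
            return offset
    return None


if __name__ == "__main__":
    with open("output.ps", "rb") as f:  # hypothetical file name
        offset = first_nonstandard_byte(f.read())
    if offset is None:
        print("only standard PostScript characters")
    else:
        print(f"first non-standard byte at offset {offset}")
```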
|
Would it matter if LF’s and CR’s got swapped?
|
Swapping CR and LF helps in correcting a few of the files so far in {NOTECARDS}/library (I'll do a Notecards PR). After swapping, converting unknown fonts of size 13 to size 10 makes a difference. But there are 2 "known" fonts of size 13, Timesroman 13 MRR and Helvetica 13 MRR. Does anybody remember: are they for real, or should they be reduced to size 10 too? |
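The substitution rule under discussion can be written down as a small function, which makes the open question concrete: which (family, size) pairs count as known. This is a sketch in Python, not the advice placed on fontcreate; the set of known fonts is taken from the two names mentioned above, and whether they should stay at size 13 is exactly what is being asked.

```python
from typing import Tuple

# The two size-13 fonts named in the thread; treating them as real is an assumption.
KNOWN_SIZE_13 = {("TIMESROMAN", 13), ("HELVETICA", 13)}


def substitute_font(family: str, size: int) -> Tuple[str, int]:
    """Map unknown size-13 fonts to size 10 and leave everything else alone."""
    if size == 13 and (family.upper(), size) not in KNOWN_SIZE_13:
        return (family, 10)
    return (family, size)


if __name__ == "__main__":
    print(substitute_font("Gacha", 13))
    print(substitute_font("Helvetica", 13))
```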
Regarding the PS language character set --
|
I believe the two size 13 fonts are real -- they're not the same as size 10 renamed, for sure -- but I can't tell easily because character looks in the current TEdit, applied through the Char Looks menu, aren't working at all for me. |
Char Looks not working in 5th round?
|
Some TEdit files of the NoteCards documentation are missing the matching PDFs the HC file generation process should make from them at files.interlisp.org. In directory {MEDLEY}/notecards/docs/user-guide-v1.2/ the following TEdit files have no matching PDFs:
Most or all PDFs are missing from {MEDLEY}/notecards/docs/user-guide-v1.2/from_envos/ and {MEDLEY}/notecards/docs/misc/from_envos/.