Ronald Haentjens Dekker edited this page Jul 11, 2014 · 2 revisions

TEI input:

Discussion 2014-07-08 (Lausanne) David J. Birnbaum / Ronald Haentjens Dekker:

  • each witness in a separate TEI document
  • take the <body> element (ignore the rest)
  • get rid of the hierarchy by converting tags into ranges or milestones
  • tokenize on whitespace and punctuation (djb: is this what we should do with punctuation?)
  • create a normalized version of each token
  • collate
  • generate variant graph
  • TEI output issue: the hierarchy cannot be raised again directly, because the collation markup introduces an overlapping hierarchy
  • Solution: raising the hierarchy again is not the responsibility of CollateX; output with the milestones in place (attach each milestone to the nearest token, with "nearest" still to be defined)
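The input steps above (take the <body>, flatten the hierarchy into milestones, tokenize, normalize) could be sketched roughly as follows. This is only an illustration, not CollateX code. It makes several assumptions: the function names and the token dictionary keys ("t", "n", "milestones") are hypothetical, normalization is plain lowercasing, each punctuation mark becomes its own token (the open question raised above), and "nearest" is read as "the next token for a start tag, the previous token for an end tag".

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a token is a run of word characters or one punctuation mark.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.split("}", 1)[-1]

def flatten(elem):
    """Turn the element tree below elem into a flat list of events:
    ('start', name), ('text', string), ('end', name)."""
    events = []
    if elem.text:
        events.append(("text", elem.text))
    for child in elem:
        name = local_name(child.tag)
        events.append(("start", name))
        events.extend(flatten(child))
        events.append(("end", name))
        if child.tail:
            events.append(("text", child.tail))
    return events

def tokenize(events):
    """Split text into tokens and attach milestones to neighbouring tokens."""
    tokens = []
    pending = []  # milestones waiting for the next token
    for kind, value in events:
        if kind == "text":
            for t in TOKEN_RE.findall(value):
                # Assumption: normalization is just lowercasing.
                tokens.append({"t": t, "n": t.lower(), "milestones": pending})
                pending = []
        elif kind == "start":
            pending.append("start:" + value)
        elif pending and pending[-1] == "start:" + value:
            # No token since the start tag: record one empty milestone.
            pending[-1] = "empty:" + value
        elif tokens:
            tokens[-1]["milestones"].append("end:" + value)
        else:
            pending.append("end:" + value)
    if pending and tokens:
        # Milestones after the last token go to that token.
        tokens[-1]["milestones"].extend(pending)
    return tokens

def witness_tokens(tei_xml):
    """Tokenize the <body> of one TEI witness, ignoring everything else."""
    root = ET.fromstring(tei_xml)
    body = next((e for e in root.iter() if local_name(e.tag) == "body"), None)
    if body is None:
        raise ValueError("no <body> element found")
    return tokenize(flatten(body))

sample = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader/>'
    "<text><body><p>The <hi>quick</hi> fox.</p></body></text></TEI>"
)
for token in witness_tokens(sample):
    print(token)
```

For the sample witness this yields the normalized tokens the, quick, fox and "." with the <hi> start and end milestones both attached to "quick". The milestones travel with the tokens through collation, so they can be written out in place afterwards without any attempt to rebuild the original hierarchy.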