TreedownAligner is a React component library. Its components enable users to view and modify alignments between texts with the aid of source-language syntax via Treedown.
TreedownAligner is written in TypeScript and uses Redux for state management.
Included is a <Workbench /> component that wraps TreedownAligner for local development and testing.
- Install:
npm install treedown-aligner
- Import:
import TreedownAligner from 'treedown-aligner';
- Render:
() => { return <TreedownAligner /> }
(see below for proper usage)
The following peer dependencies are required:
react
react-dom
The component currently supports React v18.x.
- Install dependencies:
yarn
- Run local server:
yarn start
(starts local CRA server with component wrapper in a UI workbench)
- Build:
yarn build
After installing, import the React component:
import TreedownEditor from 'treedown-aligner';
In your render function, include the component with some props:
<TreedownEditor
  theme="day"
  corpora={[<...>]}
  alignments={[<...>]}
  alignmentUpdated={(newAlignmentData) => {
    // persist alignment data here
  }}
/>
string: 'day' | 'night'
The theme prop specifies which CSS theme is used.
string
NOT YET IMPLEMENTED
The consuming application can provide a localizationCode to which the component will conform its internationalized UI elements. An ISO 639-3 code is expected. Supported languages are:
- English (eng)
- something else here...
- something else here...
Corpus[]
Corpora is the plural form of corpus. A corpus is one body of text that the user will interact with when using this component. Up to 4 corpus entities can be supplied.
A Corpus looks like:
interface Corpus {
  id: string;
  name: string;
  fullName: string;
  language: string;
  words: Word[];
  fullText?: string;
  viewType?: CorpusViewType;
  syntax?: SyntaxRoot;
}
- id: string
  unique identifier for the corpus
- name: string
  short name, like NIV
- fullName: string
  long name, like New International Version
- language: string
  language code via ISO 639-3
- words: Word[]
  object representing a word in the corpus. See Word below.
- fullText?: string
  optional - full unsegmented text of the corpus
- syntax?: SyntaxRoot
  optional - syntactic parsing of the corpus. See SyntaxRoot below.
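As a rough illustration of the shape described above, the sketch below builds one small Corpus by hand. The word IDs, the 0-based positions, and the checkCorporaLimit helper are assumptions made for this example and are not part of the library. CorpusViewType and SyntaxRoot are not spelled out in this section, so they are stubbed as opaque placeholders.

```typescript
interface Word {
  id: string;
  corpusId: string;
  text: string;
  position: number;
}

// Placeholders: this section does not define these types, so they are
// left opaque here (assumption, for the sake of a self-contained sketch).
type CorpusViewType = unknown;
type SyntaxRoot = unknown;

interface Corpus {
  id: string;
  name: string;
  fullName: string;
  language: string;
  words: Word[];
  fullText?: string;
  viewType?: CorpusViewType;
  syntax?: SyntaxRoot;
}

// Hypothetical sample data: IDs and 0-based positions are illustrative only.
const exampleCorpus: Corpus = {
  id: 'niv',
  name: 'NIV',
  fullName: 'New International Version',
  language: 'eng',
  words: [
    { id: 'niv-0', corpusId: 'niv', text: 'In', position: 0 },
    { id: 'niv-1', corpusId: 'niv', text: 'the', position: 1 },
    { id: 'niv-2', corpusId: 'niv', text: 'beginning', position: 2 },
  ],
  fullText: 'In the beginning',
};

// Hypothetical guard for the documented limit of 4 corpora.
const MAX_CORPORA = 4;

function checkCorporaLimit(corpora: Corpus[]): Corpus[] {
  if (corpora.length > MAX_CORPORA) {
    throw new Error(`Expected at most ${MAX_CORPORA} corpora, got ${corpora.length}`);
  }
  return corpora;
}
```

A consuming application could run a check like this before passing the array to the corpora prop, so that an oversized input fails early instead of inside the component.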
Word is an object representing a word in a corpus. In English, words are surrounded by whitespace, but in other languages this is not necessarily the case.
A Word looks like:
export interface Word {
  id: string;
  corpusId: string;
  text: string;
  position: number;
}
- id: string
  unique identifier, used to correlate with Alignment data
- corpusId: string
  unique identifier of the corpus the word belongs to
- text: string
  content of the word
- position: number
  sequential position of the word in the corpus
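For whitespace-delimited languages such as English, a Word[] can be derived from plain text. The helper below is a minimal sketch and not part of the library: wordsFromWhitespace, the `${corpusId}-${index}` ID scheme, and the 0-based positions are all assumptions. Languages that do not separate words with whitespace would need a different segmenter.

```typescript
interface Word {
  id: string;
  corpusId: string;
  text: string;
  position: number;
}

// Hypothetical helper: splits text on runs of whitespace and assigns
// sequential IDs and positions. Only suitable for languages where
// whitespace marks word boundaries.
function wordsFromWhitespace(corpusId: string, text: string): Word[] {
  return text
    .split(/\s+/)
    .filter((token) => token.length > 0) // drop empty tokens from leading/trailing whitespace
    .map((token, index) => ({
      id: `${corpusId}-${index}`,
      corpusId,
      text: token,
      position: index,
    }));
}

const sampleWords = wordsFromWhitespace('niv', 'In the beginning');
```

Deriving IDs from the corpus ID and position keeps them unique within one corpus, which matters because Link entries refer to words by ID alone.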
In some use cases, one of the supplied corpora can have syntax data. Here, "syntax data" is a tree-like structure denoting words, word groups, and their relationships to each other. The component currently supports a JSON representation of Lowfat Syntax XML.
Note: The component could receive Lowfat XML in string form and then convert it internally to a JSON structure. This may be preferable.
Alignment[]
An Alignment is a set of data that describes the relationship between two corpora. The component enables users to view and modify alignment data. It also uses alignment data to generate syntactic views in Treedown notation.
interface Alignment {
  source: string;
  target: string;
  polarity: AlignmentPolarity;
  links: Link[];
}
- source: string
  id of the source corpus
- target: string
  id of the target corpus
- polarity: AlignmentPolarity
  describes the directionality of the alignment. See AlignmentPolarity below.
- links: Link[]
  relationship entities between the two corpora. See Link[] below.
An AlignmentPolarity describes the "sides" of an alignment and their attributes.
Each alignment dataset must be specified with either a PrimaryAlignmentPolarity or a SecondaryAlignmentPolarity.
interface PrimaryAlignmentPolarity {
  type: 'primary';
  syntaxSide: 'sources' | 'targets';
  nonSyntaxSide: 'sources' | 'targets';
}
type: 'primary'
syntaxSide: 'sources' | 'targets'
nonSyntaxSide: 'sources' | 'targets'
interface SecondaryAlignmentPolarity {
  type: 'secondary';
  mappedSide: 'sources' | 'targets';
  nonMappedSide: 'sources' | 'targets';
}
type: 'secondary'
mappedSide: 'sources' | 'targets'
nonMappedSide: 'sources' | 'targets'
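Because both polarity shapes carry a literal type field, they can be told apart with ordinary TypeScript narrowing. The sketch below assumes that AlignmentPolarity is the union of the two interfaces (this section says "either ... or" but does not show the declaration). describePolarity is a hypothetical helper for illustration only.

```typescript
interface PrimaryAlignmentPolarity {
  type: 'primary';
  syntaxSide: 'sources' | 'targets';
  nonSyntaxSide: 'sources' | 'targets';
}

interface SecondaryAlignmentPolarity {
  type: 'secondary';
  mappedSide: 'sources' | 'targets';
  nonMappedSide: 'sources' | 'targets';
}

// Assumption: the library's AlignmentPolarity is a union of the two shapes.
type AlignmentPolarity = PrimaryAlignmentPolarity | SecondaryAlignmentPolarity;

// Hypothetical helper: narrows on the `type` discriminant so that each
// branch only sees the fields that exist on that variant.
function describePolarity(polarity: AlignmentPolarity): string {
  switch (polarity.type) {
    case 'primary':
      return `primary: syntax on ${polarity.syntaxSide}, non-syntax on ${polarity.nonSyntaxSide}`;
    case 'secondary':
      return `secondary: mapped on ${polarity.mappedSide}, non-mapped on ${polarity.nonMappedSide}`;
  }
}

const primaryPolarity: AlignmentPolarity = {
  type: 'primary',
  syntaxSide: 'sources',
  nonSyntaxSide: 'targets',
};

const secondaryPolarity: AlignmentPolarity = {
  type: 'secondary',
  mappedSide: 'targets',
  nonMappedSide: 'sources',
};
```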
Link[]
A Link is a single instance of alignment data. It describes the relationship between the words of two corpora. The strings on either side of the link are IDs of words that were specified in the provided Corpus[]. There can be one or many words on either side of a Link.
export interface Link {
  _id?: string;
  sources: string[];
  targets: string[];
}
- [wip] _id?: string
  unique identifier for the link. Probably should just be generated internally.
- sources: string[]
  array of Word IDs on the source side of the link
- targets: string[]
  array of Word IDs on the target side of the link
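Since a Link refers to words only by ID, it can be useful to confirm that every referenced ID exists in the supplied corpora before handing the data to the component. The following is a minimal sketch of such a check. findUnknownWordIds and the sample IDs are hypothetical and not part of the library.

```typescript
interface Word {
  id: string;
  corpusId: string;
  text: string;
  position: number;
}

interface Link {
  _id?: string;
  sources: string[];
  targets: string[];
}

// Hypothetical check: returns every word ID referenced by a link that is
// missing from the given words (an empty array means all links resolve).
function findUnknownWordIds(links: Link[], words: Word[]): string[] {
  const unknownIds: string[] = [];
  for (const link of links) {
    for (const wordId of link.sources.concat(link.targets)) {
      const isKnown = words.some((word) => word.id === wordId);
      if (!isKnown) {
        unknownIds.push(wordId);
      }
    }
  }
  return unknownIds;
}

// Illustrative data only: words from two corpora, flattened into one list.
const linkWords: Word[] = [
  { id: 'eng-0', corpusId: 'eng', text: 'the', position: 0 },
  { id: 'eng-1', corpusId: 'eng', text: 'beginning', position: 1 },
  { id: 'spa-0', corpusId: 'spa', text: 'el', position: 0 },
  { id: 'spa-1', corpusId: 'spa', text: 'principio', position: 1 },
];

const sampleLinks: Link[] = [
  { sources: ['eng-0'], targets: ['spa-0'] },
  { sources: ['eng-1'], targets: ['spa-1'] },
];
```

The linear scan is fine for a sketch. For large corpora, building a lookup of known IDs once would avoid rescanning the word list for every link.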
The alignmentUpdated prop is a function provided by the consuming application that is called when a user saves alignment data. At the time of invocation, the current alignment state is passed to the function, which can use it to display, send, or persist the user's alignment data.
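A callback of this kind might look like the sketch below. It assumes the argument is an Alignment[] (the exact type passed by the component is not stated in this section) and that AlignmentPolarity is a union of the two polarity shapes. The in-memory savedSnapshots array and the corpus IDs are stand-ins for a real persistence layer and real data.

```typescript
interface Link {
  _id?: string;
  sources: string[];
  targets: string[];
}

interface PrimaryAlignmentPolarity {
  type: 'primary';
  syntaxSide: 'sources' | 'targets';
  nonSyntaxSide: 'sources' | 'targets';
}

interface SecondaryAlignmentPolarity {
  type: 'secondary';
  mappedSide: 'sources' | 'targets';
  nonMappedSide: 'sources' | 'targets';
}

// Assumption: union of the two documented polarity shapes.
type AlignmentPolarity = PrimaryAlignmentPolarity | SecondaryAlignmentPolarity;

interface Alignment {
  source: string;
  target: string;
  polarity: AlignmentPolarity;
  links: Link[];
}

// In-memory stand-in for a real persistence layer (database, API call, file).
const savedSnapshots: string[] = [];

// Assumption: the component passes the full alignment state as Alignment[].
function handleAlignmentUpdated(newAlignmentData: Alignment[]): void {
  savedSnapshots.push(JSON.stringify(newAlignmentData));
}

// Simulate the component invoking the callback when the user saves.
handleAlignmentUpdated([
  {
    source: 'sbl', // hypothetical corpus ID
    target: 'niv', // hypothetical corpus ID
    polarity: { type: 'primary', syntaxSide: 'sources', nonSyntaxSide: 'targets' },
    links: [{ sources: ['sbl-0'], targets: ['niv-0'] }],
  },
]);
```

Serializing on every save keeps each snapshot independent of later edits made inside the component, since the stored string no longer shares references with the live state.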