Convert to Scripture Burrito Proposed Format #6
@jtauber One thing I noticed when writing this initial conversion is that we had been using an
Thoughts on including |
Way to move quickly!
def create_sb_json_structure():
    sb_alignment = {}
    sb_alignment["type"] = "translation"
I wonder if this is the appropriate type. @jtauber can confirm, but I think the 'translation' type was meant to be for cases where we knew the source was indeed the source, not simply for cases where we assume a source for the sake of alignment. Perhaps type should be 'alignment' as a default?
A 'translation' example would be if we machine-translated a text, and we knew exactly what the source and target were.
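A minimal sketch of that suggestion, assuming the type becomes a parameter that defaults to 'alignment'. The `alignment_type` parameter name and the 'alignment' value are assumptions for discussion; the specification has not confirmed either.

```python
def create_sb_json_structure(alignment_type="alignment"):
    """Build the top-level record. The default type is an assumption."""
    sb_alignment = {}
    # Pass "translation" only when the source is known to be the real source,
    # for example a machine translation with a known source and target.
    sb_alignment["type"] = alignment_type
    sb_alignment["meta"] = {}
    sb_alignment["meta"]["creator"] = "GrapeCity"
    return sb_alignment
```

With this shape, existing callers get 'alignment' unless they explicitly ask for 'translation'.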
Good question. Generally, I don't pretend to have the answer, but at least we know what we need to know now.
So I interpret the question to be: when source-target affinity is dubious (as I expect it would be with nearly every Bible translation we work with), what is the correct type to use?
    sb_alignment = {}
    sb_alignment["type"] = "translation"
    sb_alignment["meta"] = {}
    sb_alignment["meta"]["creator"] = "GrapeCity"
Should we add a 'note' about the GrapeCity ones for posterity, perhaps something Randall uses to describe the provenance and alignment process—in case he's not available to answer questions about it at some point?
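One possible shape for that, as a sketch only. The `note` key under `meta` and the `add_provenance_note` helper are hypothetical names, and the note text below is a placeholder rather than Randall's actual description.

```python
def add_provenance_note(sb_alignment, note):
    """Attach a free-text provenance note under meta (hypothetical 'note' key)."""
    meta = sb_alignment.setdefault("meta", {})
    meta["note"] = note
    return sb_alignment


record = {"type": "translation", "meta": {"creator": "GrapeCity"}}
# Placeholder text; the real wording would come from whoever knows the process.
add_provenance_note(record, "Placeholder: describe provenance and alignment process.")
```

Keeping the note as plain text avoids committing to a structure before the specification settles.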
# These are not in the standard format.
ALIGNMENT_EXCEPTIONS = [
Sorry I let these languish @themikejr: should be updated now in Alignments.
alignment_file_paths.append(os.path.join(root, file))
return alignment_file_paths
@themikejr no problems with this, but at some point you might want to look at bible_alignments/config.py for dealing with the various pieces of alignment files.
def create_new_file_name(existing_path):
    old_path_parts = existing_path.split("/")
@themikejr you should use pathlib for working with paths (and eventually config.py for constructing filenames: see https://github.com/Clear-Bible/Alignments/blob/main/bible_alignments/config.py#L53).
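A rough sketch of what the pathlib version could look like. The rest of `create_new_file_name` is not visible in this thread, so the naming rule here (keep the file stem, add a `.sb.json` suffix, write into an `output_dir`) is an assumption for illustration, as are the `output_dir` parameter and its default.

```python
from pathlib import Path


def create_new_file_name(existing_path, output_dir="sb_alignments"):
    """Derive a new file path from an existing one without splitting on '/'."""
    old_path = Path(existing_path)
    # .stem drops the directory and the final extension in one step.
    new_name = f"{old_path.stem}.sb.json"  # hypothetical naming rule
    return Path(output_dir) / new_name
```

Compared with `existing_path.split("/")`, this does not depend on the path separator and avoids indexing into a list of parts.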
I just got us started -- I'm not a Python pro. I could maybe come back to this later, but I'm happy for anyone to push commits that make the code more idiomatic or make better use of existing utilities.
This PR introduces a script that converts our existing alignment data to the upcoming Scripture Burrito alignment data specification. The specification is still being finalized, so this PR may need to change as it firms up. For now, it's a discussion tool for reviewing the results of a potential conversion.
Link to specification: https://docs.google.com/document/d/1zR5gsrm3gIoNiHVBlWz5_BBw3N-Ew1-4M5rMsFrPzSw/edit