
Improve handling of mixed LTR and RTL references #59

Open
denismaier opened this issue Apr 5, 2019 · 14 comments

Comments

@denismaier

References that mix LTR and RTL text need some reworking. At the moment the situation is as follows: if I choose a JM style, references with locale "he" look messed up in the bibliography. They are laid out right to left, even if only the title appears in Hebrew. The footnotes look fine. Strangely enough, this also happens when the output is entirely in Latin script (a translation and a transliteration, but nothing in Hebrew script); the rendering is actually OK in this case, except that the period appears at the beginning of the entry.
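
A possible explanation for the stray period, offered as a sketch and not as a confirmed diagnosis: under the Unicode Bidirectional Algorithm a full stop has no strong direction of its own, so a trailing period follows the paragraph's base direction and ends up at the left edge when the entry is laid out right to left. The Python snippet below only inspects bidi classes with the standard `unicodedata` module; the helper names and the sample string (adapted from the example later in this thread) are illustrative.

```python
import unicodedata

# Sample entry in logical (typing) order; adapted from the thread's example.
entry = "Book Title [כותרת הספר]."

def bidi_classes(text):
    """Return (character, Unicode bidi class) pairs in logical order."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

def strong_directions(text):
    """Return the set of strong bidi classes (L, R, AL) found in the text."""
    return {cls for _, cls in bidi_classes(text) if cls in ("L", "R", "AL")}

# Latin letters are "L", Hebrew letters are "R"; the final period is not
# strongly directional, so its placement depends on the surrounding context
# and, at the end of a paragraph, on the paragraph's base direction.
for ch, cls in bidi_classes(entry):
    print(repr(ch), cls)
```

If this is what is happening, the entry text itself is fine and the issue is the base direction applied to the bibliography paragraph.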

@fbennett

fbennett commented Apr 6, 2019

I guess it starts with the Firefox UI. Does editing in a Zotero/Jurism field work sensibly with mixed-direction text? I have a Hebrew string written into a test fixture from a long time ago: multilingual_RightToLeft. In the Jurism/Zotero UI, the characters show in the same order that they appear in the browser view, and the cursor does not change direction when moving across it.

In Emacs, the characters in the Hebrew string are reversed, and the cursor direction reverses when it enters the string (moving from left to right, it leaps to the right end of the string, then moves right to left until the end of the string is reached, then leaps to the right of the string).

I assume that one of these is wrong. If Firefox/Jurism/Zotero are showing expected behavior and Emacs is wrong, that would be good news---because I have played with CSS text-direction settings in the Jurism item box, and none of them have any effect. So if it's doing the wrong thing, we are unable to change its behavior.

@denismaier
Author

I can confirm that Emacs and Firefox behave differently here. To add more examples: Word and LibreOffice behave the same way as Emacs. I think the Emacs/Word/LibreOffice behaviour is the expected one, because there the cursor follows the logical order of the text.

But I am actually not sure this is the issue here.

  1. The problem only occurs with a CSL-M style; vanilla CSL styles produce a correct rendering.
  2. Even with CSL-M styles the footnotes look fine; the problem only occurs in the bibliography.
  3. If I change the item language from "he" to an LTR language, the problem disappears completely.

=> The problem occurs with a CSL-M style, in the bibliography, with item language "he".

@fbennett

fbennett commented Apr 6, 2019

Oh! So if we just remove any special treatment that is being applied to CSL-M styles, we're good? We can definitely do that.

@denismaier
Author

denismaier commented Apr 6, 2019 via email

@fbennett

fbennett commented Apr 6, 2019

Since I don't really understand what was causing the breakage, let's start from a clean slate. I've stripped out all RTL special handling from the current Propachi plugin. If you install it in Jurism and then encounter breakage, we can take it from there:

https://github.com/Juris-M/propachi-vanilla/releases/tag/v1.1.142

@denismaier
Author

I will check. It will take a few days though.

@fbennett

fbennett commented Apr 6, 2019

Absolutely no rush!

@denismaier
Author

I've updated to the new beta: Standard styles and CSL-M styles give the same output now.
This seems to work for situations like this:

  • Family, Given. Book Title [כותרת הספר]. Jerusalem, 2019.

I'll have to do some more thorough testing for other combinations.

(By the way, when things get more complicated, LibreOffice seems to give much better bidi results. Word constantly messes up parentheses and brackets.)
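
One reason parentheses and brackets are fragile in mixed-direction text is that Unicode marks them as mirrored characters: the glyph that gets drawn depends on the direction the renderer resolves for them. A minimal Python check of that property, using only the standard `unicodedata` module (the helper name is illustrative, and this says nothing about what Word does internally):

```python
import unicodedata

def is_mirrored(ch):
    """Return True if Unicode marks the character as mirrored in bidi text."""
    return bool(unicodedata.mirrored(ch))

# Brackets and parentheses are mirrored; letters and the period are not.
for ch in "()[].a":
    print(repr(ch), is_mirrored(ch))
```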

@fbennett

fbennett commented Apr 9, 2019

Fingers crossed!

@fbennett

For the next phase of work on RTL, test cases would be really useful. Have you tried the Citeproc Test Runner? It's pretty simple to use, and we could use it to build test cases that can be commented on and verified online.

Since the runner makes it easy to generate lots of tests, we should probably give them their own GitHub repo, with everything eventually ending up under the citation-style-language project. Would you like to do the honors, or shall I?

@denismaier
Author

denismaier commented Apr 11, 2019 via email

@denismaier
Author

OK, there have been questions about bidi handling again. Maybe we should get back to this if/when you've got time for it...

@fbennett

fbennett commented Jul 4, 2020

There are potential issues in three layers: the strings stored by the Jurism client and sent to citeproc-js, the handling of the strings in citeproc-js itself, and the word processor's handling of what it receives from citeproc-js.

The best starting point would be a set of processor tests with the desired input and output. We can then build a little style for testing that sends the desired output directly to the word processor, to confirm that it will display correctly. Once that's confirmed, we can look at how to persuade Jurism to prepare strings with the desired Unicode control codes and pass them to the processor.
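
To make "strings with the desired Unicode control codes" concrete, here is a minimal Python sketch of one way such a string could be built for a test. It assumes the directional isolate characters from the Unicode Bidirectional Algorithm are the controls of interest; the thread does not say which controls are wanted, the helper names are hypothetical, and whether a given word processor honours these characters is exactly what the test would have to confirm.

```python
# Unicode directional formatting characters (illustrative selection).
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

def isolate(text, rtl=False):
    """Wrap text in a directional isolate so it is laid out independently."""
    return (RLI if rtl else LRI) + text + PDI

def strip_controls(text):
    """Remove the isolate characters again, e.g. to compare with plain text."""
    return "".join(ch for ch in text if ch not in (LRI, RLI, PDI))

# The whole entry is forced LTR; the Hebrew title is isolated as RTL.
hebrew_title = isolate("כותרת הספר", rtl=True)
entry = isolate("Family, Given. Book Title [" + hebrew_title + "]. Jerusalem, 2019.")

# repr() makes the otherwise invisible control characters visible.
print(repr(entry))
```

A string like `entry` could serve as the desired output of a processor test, with `strip_controls(entry)` giving the plain text to compare against.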

@denismaier
Author

Thanks for coming back to this. What's the next step? I can look into this on Monday.
