Display XML markup in LTR direction, content RTL



  • I am marking up right-to-left Hebrew text with XML. The XML code (elements and attributes) is in English.

    • If I set View > Text Direction LTR, the XML tags are correctly displayed, but the Hebrew content wraps in the wrong direction and the punctuation is in the wrong place.
    • If I set View > Text Direction RTL, the Hebrew is displayed correctly but the XML tags are displayed backwards like this:
      <elem_name/>…<“attr_value”=attr_name elem_name>

    The resulting XML document is valid and works in XML processors. Only the Notepad++ display is wrong. Is there a way to fix the display?

    I thought of creating a user-defined language and specifying that the keywords of the language have an LTR display orientation. However, I don’t see a direction option in the User Defined Language window. Can I use CSS to format the Notepad++ display, or anything like that?

    David



  • The source code is always written in English and this cannot be changed.
    But when you need to write text lines in Hebrew, you can change the direction of writing from right to left.



  • Hebrew words are written from right to left, and numbers from left to right!
    This is how they get along together?!



  • When I type in a Hebrew font, the text is automatically RTL. That is not the problem.

    The problem is specifically with the wrapping at the end of a long text line. The text is supposed to break on the left (the opposite of English). In the Notepad++ LTR display mode, it breaks on the right, which makes the text unreadable. If I switch Notepad++ to the RTL display mode, the Hebrew line breaks are correct, but the English XML markup is messed up.

    To correct the problem, I need something like the CSS unicode-bidi property, which controls how LTR text is displayed when it is embedded in an RTL paragraph. Is a setting like that available in Notepad++?



  • Hebrew words are written from right to left, and numbers from left to right!
    This is how they get along together?!

    Correct. It’s even more complex than that. Punctuation marks such as . , ! ? " are neutral characters that can run in either direction. Imagine a telephone number that is embedded in RTL text, for example:

    .123-456-7890 txet LTR emos

    An editor or a browser that encounters this string needs to decide how to display it. There are many possibilities. Among them:

    .123-456-7890 txet LTR emos
    //Correct.

    123-456-7890 txet LTR emos.
    //Wrong, the period is at the wrong end. This is what happens in many editors, including Notepad++ LTR mode.

    7890-456-123 txet LTR emos.
    //Oh my gosh, incorrect phone number.

    For an erudite discussion, see:
    https://www.w3.org/International/articles/inline-bidi-markup/
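
    To see why a renderer has to make these decisions, it can help to look at the bidirectional class that Unicode assigns to each character. The following is only a rough Python sketch using the standard unicodedata module; the helper name bidi_classes and the sample string are made up for illustration. It suggests that Hebrew letters are strongly RTL (R), Latin letters are strongly LTR (L), digits are European numbers (EN), and punctuation has no strong direction of its own. Strictly speaking, the tables list the period and comma as separators (CS) and the hyphen as ES rather than as fully neutral (ON), but in practice they also take their direction from the surrounding text.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text.

    bidi_classes is a hypothetical helper name used only in this sketch.
    """
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Sample: Hebrew alef, Latin letter, digit, hyphen, period, exclamation mark.
sample = "\u05D0a1-.!"

for ch, cls in bidi_classes(sample):
    # Print the code point instead of the glyph so the listing itself
    # is not reordered by an RTL-aware terminal or editor.
    print(f"U+{ord(ch):04X} {cls}")
```

    Only the R and L characters carry a fixed direction; everything else is resolved from context, which is where editors can disagree.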



  • For the record, I found a possible solution. Since I am creating the XML schema myself, I can define the element and attribute names in Hebrew characters. When I set Notepad++ to the RTL text orientation, it displays both the Hebrew content and the Hebrew XML tags in a readable way.

    From a development point of view, it’s a quirky and non-portable solution. See:
    https://www.w3.org/International/questions/qa-non-eng-tags
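
    As a quick sanity check that Hebrew names are at least well-formed XML, here is a small sketch using Python's standard xml.etree.ElementTree parser. The element name, attribute name, and content below are invented for illustration and are not from my actual schema. It only shows that a parser accepts such names; it says nothing about how well other tools in a pipeline will cope with them.

```python
import xml.etree.ElementTree as ET

# Hypothetical names, chosen only for this example.
ELEM = "ספר"    # element name ("book")
ATTR = "כותר"   # attribute name ("title")
VALUE = "שלום"  # attribute value
TEXT = "תוכן"   # element content

# Build the document from variables so the markup stays readable
# even in an editor that reorders mixed-direction lines.
doc = f'<{ELEM} {ATTR}="{VALUE}">{TEXT}</{ELEM}>'

root = ET.fromstring(doc)

# The parser returns the names and content exactly as written.
print(root.tag == ELEM)
print(root.get(ATTR) == VALUE)
print(root.text == TEXT)
```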

    If anyone else has experience with this issue, I will be grateful to hear from them.

