    Add comma between two sentences on a two language document

    • Ali Jafari

      Does anybody know how to insert a comma at the point where the language changes within a line of a two-language document? Suppose we have a sentence like “this is a testاین یک آزمایش است”, with English text followed by Persian text. I would like to turn it into “this is a test,این یک آزمایش است”. I need to do this throughout a large document with the “Search and Replace” command.

            • guy038

            Hello, @ali-jafari, @terry-r, @ekopalypse and All,

            As you may know, the different Arabic characters belong to one of these 5 Unicode blocks :

            • Arabic : [\x{0600}-\x{06FF}]

            • Arabic Supplement : [\x{0750}-\x{077F}]

            • Arabic Extended-A : [\x{08A0}-\x{08FF}]

            • Arabic Presentation Forms-A : [\x{FB50}-\x{FDFF}]

            • Arabic Presentation Forms-B : [\x{FE70}-\x{FEFF}]

            Refer to :

            http://www.unicode.org/charts/PDF/U0600.pdf
            http://www.unicode.org/charts/PDF/U0750.pdf
            http://www.unicode.org/charts/PDF/U08A0.pdf
            http://www.unicode.org/charts/PDF/UFB50.pdf
            http://www.unicode.org/charts/PDF/UFE70.pdf
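
For anyone who wants to check which of these blocks a given character falls into, here is a minimal Python sketch. The names `ARABIC_BLOCKS`, `arabic_block` and `arabic_class` are made up for this illustration, and the Arabic Supplement range is assumed to end at U+077F, which is the extent of the U0750 chart.

```python
import re

# The five ranges listed above, as (name, first, last) code points.
# Assumption: Arabic Supplement is taken as U+0750-U+077F (extent of its chart).
ARABIC_BLOCKS = [
    ("Arabic", 0x0600, 0x06FF),
    ("Arabic Supplement", 0x0750, 0x077F),
    ("Arabic Extended-A", 0x08A0, 0x08FF),
    ("Arabic Presentation Forms-A", 0xFB50, 0xFDFF),
    ("Arabic Presentation Forms-B", 0xFE70, 0xFEFF),
]


def arabic_block(ch):
    """Return the name of the listed block that contains ch, or None."""
    cp = ord(ch)
    for name, first, last in ARABIC_BLOCKS:
        if first <= cp <= last:
            return name
    return None


def arabic_class():
    """Build one regex character class that covers all five ranges."""
    body = "".join(f"\\u{first:04X}-\\u{last:04X}" for _, first, last in ARABIC_BLOCKS)
    return f"[{body}]"


if __name__ == "__main__":
    print(arabic_block("ا"))   # a letter from the main Arabic block
    print(arabic_block("a"))   # an ASCII letter is in none of the blocks
    print(arabic_class())
```

The class returned by `arabic_class()` uses Python's `\uXXXX` escapes; in Notepad++ the same ranges are written with `\x{XXXX}`.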


            So, here is the road map :

            • Open the Replace dialog ( Ctrl + H )

            • SEARCH ([\x{0021}-\x{007E}])\x20?(?=[\x{0600}-\x{06FF}\x{0750}-\x{077F}\x{08A0}-\x{08FF}\x{FB50}-\x{FDFF}\x{FE70}-\x{FEFF}])

            • REPLACE \1,

            • Tick the Wrap around option

            • Select the Regular expression search mode

            • Click on the Replace All button


            Notes :

            • The part ([\x{0021}-\x{007E}]) searches for any single ASCII character from \x{0021} = ! to \x{007E} = ~ and stores it as group 1, thanks to the parentheses. It may be followed by one space char ( \x20? ), which is part of the match, and the whole match succeeds ONLY IF the next character is an Arabic char from one of the five ranges described above ( this is the lookahead construction, which does not consume the Arabic char )

            • In the replacement, the ASCII character is simply rewritten ( \1 ), followed by a comma. The optional space, if any, is dropped

            For instance the two lines :

            this is a testاین یک آزمایش است
            this is a test این یک آزمایش است
            

            would be changed into :

            this is a test,این یک آزمایش است
            this is a test,این یک آزمایش است
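
If you would rather script the same replacement outside Notepad++, the following is a rough Python equivalent of the search and replace above. It is only a sketch: `add_comma` is a name invented here, Python's `re` module stands in for the Notepad++ regex engine, and the Arabic Supplement range is assumed to end at U+077F.

```python
import re

# The five Arabic ranges, written with Python's \uXXXX escapes.
# Assumption: Arabic Supplement is taken as U+0750-U+077F.
ARABIC = r"\u0600-\u06FF\u0750-\u077F\u08A0-\u08FF\uFB50-\uFDFF\uFE70-\uFEFF"

# Group 1: one printable ASCII char other than space; then an optional space;
# then a lookahead that requires an Arabic char without consuming it.
PATTERN = re.compile(r"([\x21-\x7E])\x20?(?=[" + ARABIC + r"])")


def add_comma(text: str) -> str:
    """Insert a comma where ASCII text is directly followed by Arabic text."""
    return PATTERN.sub(r"\1,", text)


if __name__ == "__main__":
    for line in (
        "this is a testاین یک آزمایش است",
        "this is a test این یک آزمایش است",
    ):
        print(add_comma(line))
```

One thing to keep in mind: the comma itself lies in the range \x21-\x7E, so running the replacement a second time on already processed text adds a second comma. Run it once, on the original text.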
            

            Best Regards,

            guy038

            P.S. :

            Note that the sub-regex which matches the ASCII character is [\x{0021}-\x{007E}] and not [\x{0020}-\x{007E}] ! Indeed, as the Arabic text itself contains space chars too, a space would count as the ASCII character and we would get false positive matches inside the Arabic text ;-))
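
To see the false positives concretely, here is a small Python sketch that compares the two character classes on the sample line. The helper name `replace_with` is made up for the illustration, and Python's `re` module is assumed to behave like the Notepad++ engine for this simple pattern.

```python
import re

# Assumption: Arabic Supplement is taken as U+0750-U+077F.
ARABIC = r"\u0600-\u06FF\u0750-\u077F\u08A0-\u08FF\uFB50-\uFDFF\uFE70-\uFEFF"


def replace_with(first_ascii: str, text: str) -> str:
    """Run the S/R with a configurable lower bound for the ASCII class."""
    pattern = re.compile(r"([" + first_ascii + r"-\x7E])\x20?(?=[" + ARABIC + r"])")
    return pattern.sub(r"\1,", text)


line = "this is a testاین یک آزمایش است"

good = replace_with(r"\x21", line)  # space excluded from the class
bad = replace_with(r"\x20", line)   # space included in the class

if __name__ == "__main__":
    print(good.count(","))  # commas inserted when space is excluded
    print(bad.count(","))   # commas inserted when space is included
```

With the space included, every space between two Arabic words is matched as group 1 and gets a comma after it, which is exactly the false positive described in the P.S.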

              • Ali Jafari

              @guy038 said in Add comma between two sentences on a two language document:

              \1,

              Dear friend,

              Thanks for your great support. I tried it and it worked.

              All the Best.

                • Ali Jafari

                @guy038 said in Add comma between two sentences on a two language document:

                \1,

                Dear friend,

                Could you please tell me whether this method also works in Microsoft Office Word, or do I need to do something else?

                All the Best.

                  • guy038

                  Hi, @ali-jafari and All,

                  Unfortunately, I cannot give you a reliable answer :-(( On my old XP laptop, I’m still using the Microsoft Office suite … 2002, which is good enough for my work in Word !

                  SEARCH ( for Word ) : ([^0033-^0126])^0032*([^1536-^1791])

                  REPLACE ( for Word ) : \1,\2

                  In Word 2002, when you tick the search option Use wildcards, any non-Unicode char ( below \x{0100} ) must be written as ^####, where #### represents the decimal value of the code point. So, \x{0021} must be changed to ^0033, \x{007E} to ^0126, and so on…
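
The hex-to-decimal conversion is easy to get wrong by hand, so here is a tiny Python sketch of it. The function name `word_code` is invented for this example, and the four-digit zero padding simply follows the ^#### form used in this post.

```python
def word_code(code_point: int) -> str:
    """Format a code point in the ^#### form used above: ^ plus 4 decimal digits."""
    return f"^{code_point:04d}"


if __name__ == "__main__":
    for cp in (0x20, 0x21, 0x7E, 0x0600, 0x06FF):
        print(f"\\x{{{cp:04X}}} -> {word_code(cp)}")
```

This gives ^0032, ^0033 and ^0126 for the space and the two ASCII bounds, and ^1536 and ^1791 for the bounds of the main Arabic block.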

                  Unfortunately, a syntax for code points over \xFF, such as the main Arabic range [^1536-^1791] ( that is to say [\x{0600}-\x{06FF}] in N++ ), is definitely not valid in Word 2002 :-((

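                  Moreover, the quantifier syntax {0,n}, after a possible space char, does not work either. Only {1,n} seems valid ! So I preferred to use the usual * syntax. Beware, though, that with wildcards enabled, Word treats * as “any string of characters” and not as “zero or more of the preceding item”, so this search should be tried on a copy of the document first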

                  Recent versions of Word presumably allow searching for any character of the BMP ( so from \x{0000} to \x{FFFF} ). If so, the proposed regex S/R should work correctly !

                  On the other hand, why not run the S/R with the N++ regex engine first and then paste your updated text into Word ?

                  Cheers,

                  guy038

                  The Community of users of the Notepad++ text editor.