    [New Plugin] MultiReplace

    • Thomas KnoefelT
      Thomas Knoefel
      last edited by

      Thanks, I will also focus on ANSI code pages. In the meantime, I found an error in how the plugin handles multibyte text when copying to the clipboard, which I’m going to fix.

      • wonkawillyW
        wonkawilly
        last edited by wonkawilly

        I am also working on an advanced Find and Replace dialog, but I am not a C++ guy, so I have only designed the GUI.

        If you are curious, I can share some screenshots of what I have done so far.

        • Alan KilbornA
          Alan Kilborn @wonkawilly
          last edited by

          @wonkawilly

          Please don’t post any more of those screenshots here; just refer whomever you’re talking to to here, which seems to cover your ideas: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/9627

          • Vitalii DovganV
            Vitalii Dovgan @rdipardo
            last edited by

            @rdipardo said in [New Plugin] MultiReplace:

            Except for the Double-byte Character Sets, which are (still!) the typical OEM encoding on PCs in East Asian countries. Scintilla has a dedicated API for those.

            OMG, I’ve completely forgotten about those!
            So, looks like the most proper way is to invoke MultiByteToWideChar first and then to deal with Unicode strings (that consist of WCHAR characters) since they are natively supported by modern Windows. Actually, this is exactly what I’ve been doing in my code, mostly because WCHAR is native for Windows NT family.
            Going further, this can be enhanced to properly handle Unicode Surrogate Pairs as well. (And these may not be handled correctly in my code because I did not add any specific processing for Surrogate Pairs. Actually, I am not sure whether the standard functions such as lstrlenW take Surrogate Pairs into account or not).
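To illustrate the surrogate-pair concern: functions that measure a WCHAR string in 16-bit units (which, to my knowledge, is how lstrlenW behaves) report two units for a single character outside the Basic Multilingual Plane. Below is a minimal portable sketch that uses std::u16string as a stand-in for a WCHAR string. countCodePoints is a hypothetical helper written for this illustration, not a library function.

```cpp
#include <cstddef>
#include <string>

// Count Unicode code points in a UTF-16 string.
// A high surrogate (D800..DBFF) immediately followed by a low surrogate
// (DC00..DFFF) is counted as one code point; everything else counts as one each.
std::size_t countCodePoints(const std::u16string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t unit = s[i];
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHigh && i + 1 < s.size() &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            ++i;  // skip the low surrogate that completes the pair
        }
        ++count;
    }
    return count;
}
```

For a string such as u"a\U0001F600b", size() reports 4 code units while countCodePoints reports 3 characters, so any code that indexes or truncates by unit count can split a pair.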

            • wonkawillyW
              wonkawilly @Alan Kilborn
              last edited by

              @Alan-Kilborn That one is an old ver. I’ve updated it…

              • Alan KilbornA
                Alan Kilborn @wonkawilly
                last edited by

                @wonkawilly said in [New Plugin] MultiReplace:

                I’ve updated it…

                I’ll hand it to you; you’re tough. Even getting banned over it doesn’t dissuade you. :-)

                  • Thomas KnoefelT
                    Thomas Knoefel
                    last edited by Thomas Knoefel

                    @rdipardo said in [New Plugin] MultiReplace:

                    because Scintilla maps the ANSI code page identifiers to the same values as the Win32 API.

                    Does it mean that UTF8 would directly match with ANSI in Scintilla? I’m facing the problem that normal characters match in ANSI, but special letters like Ä or Ö don’t. Does anybody have an idea how to convert a widestr into UTF8 for SCI_SEARCHINTARGET to find these characters in ANSI? Unsurprisingly, the letter Ä matches Ä in ANSI if I convert Ä into ANSI. … I think I did it, but it is a pretty challenging topic.

                    • wonkawillyW
                      wonkawilly @Alan Kilborn
                      last edited by

                      Off Topic:
                      @Alan-Kilborn said in [New Plugin] MultiReplace:

                      you’re tough. Even getting banned over it doesn’t dissuade you. :-)

                      Big changes always involve taking big risks. And I understand that traditions are sometimes difficult to overcome. It is perfectly normal, at least in human logic. But I also know that traditions will be overcome when people are more aware and ready to make the jump. And this is also part of life and evolution. After all, evolution is the meaning of life, and without evolution life could be less meaningful.
                      This is a general rule that also applies to the case.

                      • rdipardoR
                        rdipardo @Thomas Knoefel
                        last edited by

                        @Thomas-Knoefel said in [New Plugin] MultiReplace:

                        @rdipardo said in [New Plugin] MultiReplace:

                        because Scintilla maps the ANSI code page identifiers to the same values as the Win32 API.

                        Does it mean that UTF8 would directly match with ANSI in scintilla?

                        Based on how SCI_GETCODEPAGE works in practice, the alternative encoding to Unicode should be thought of as the “system default” rather than “ANSI”.

                        For most of N++'s history, the “ANSI” code page was indeed single-byte (or, in the case of the legacy CJK encodings, double-byte). But the addition of a UTF-8 OEM code page in Windows version 1903 makes “ANSI” a less useful identifier, even a potentially deceptive one. The system default is no longer directly opposed to Unicode as it once was.

                        So, yes, there may be times when “UTF8 would directly match with ANSI,” but only if 65001 is the value of the ACP key in the system’s registry. Check on this first:

                        reg query HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage /s /f "CP"
                        

                        Anybody an idea how to convert a widestr into UTF8 for SCI_SEARCHINTARGET to find these Characters in ANSI?

                        When you see const char * in the prototype of a Scintilla API (as you will for SCI_SEARCHINTARGET), it means the expected input is a byte string (i.e. “ANSI”). The conversion you want is probably from wchar_t* to char*. A debugger can show you what the encoded text looks like after conversion.
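A small portable sketch of why Ä fails to match across encodings: the same character becomes different byte strings depending on the target encoding. Latin-1 is used here only as a stand-in for a single-byte “ANSI” code page (an assumption; the real system code page may differ), and encodeUtf8 / encodeLatin1 are hypothetical helpers for illustration. A plugin on Windows would more likely call WideCharToMultiByte.

```cpp
#include <string>

// Encode one Unicode code point (up to U+10FFFF) as UTF-8 bytes.
std::string encodeUtf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Encode one code point as Latin-1 (one byte per character).
// Code points above U+00FF are not representable and become '?'.
std::string encodeLatin1(char32_t cp) {
    return std::string(1, cp <= 0xFF ? static_cast<char>(cp) : '?');
}
```

Under these assumptions, Ä (U+00C4) is the single byte C4 in Latin-1 but the two bytes C3 84 in UTF-8, while plain ASCII letters are identical in both. That is consistent with “normal characters” matching and Ä or Ö not matching when the search bytes are in a different encoding than the buffer.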

                        • Thomas KnoefelT
                          Thomas Knoefel @rdipardo
                          last edited by Thomas Knoefel

                          Thanks, I was still focused a bit too much on UTF8 while preparing the ANSI handling. But this part is now working in all directions.

                          @rdipardo said in [New Plugin] MultiReplace:

                          Except for the Double-byte Character Sets, which are (still!) the typical OEM encoding on PCs in East Asian countries. Scintilla has a dedicated API for those.

                          I’m trying to test DBCS on my non-Asian Windows system. Is this even possible somehow? When I test all encodings in Notepad++, SCI_GETCODEPAGE returns 0 for ANSI, and all the others give me 65001. Is there no chance of obtaining one of these encodings?
                          codePage == 932 || codePage == 936 || codePage == 949 || codePage == 950 || codePage == 1361
                          I tried the BIG5 and Shift_JIS encodings, both of which are DBCS, but I obtained the same result. Even saving and reopening makes no difference. I have the feeling that I’m looking in the wrong place.
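The code page check above can be wrapped in a small helper. This is a sketch only: BufferEncoding and classifyCodePage are hypothetical names, and the DBCS identifiers are the ones I understand Scintilla to document for SCI_SETCODEPAGE (932, 936, 949, 950, 1361). The input is assumed to be the value returned by SCI_GETCODEPAGE.

```cpp
// Rough classification of a Scintilla code page value.
enum class BufferEncoding { SystemDefault, Utf8, Dbcs, Other };

// Map a value as returned by SCI_GETCODEPAGE to a category.
BufferEncoding classifyCodePage(int codePage) {
    switch (codePage) {
        case 0:      // no explicit code page: the "ANSI" / system default case
            return BufferEncoding::SystemDefault;
        case 65001:  // UTF-8
            return BufferEncoding::Utf8;
        case 932:    // Japanese Shift-JIS
        case 936:    // Simplified Chinese GBK
        case 949:    // Korean Unified Hangul Code
        case 950:    // Traditional Chinese Big5
        case 1361:   // Korean Johab
            return BufferEncoding::Dbcs;
        default:
            return BufferEncoding::Other;
    }
}
```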

                          • Vitalii DovganV
                            Vitalii Dovgan @Thomas Knoefel
                            last edited by

                            @Thomas-Knoefel said in [New Plugin] MultiReplace:

                            I’m trying to test DBCS on my non-Asian Windows system. Is this even possible somehow?

                            Yes: go to the “Language & region” system settings and click “Administrative language settings” to open the “Region” dialog. Its “Administrative” tab has a “Change system locale” button for non-Unicode programs.
                            ( This is for Windows 11, it was much faster to find in Windows 7 :) )

                            • Vitalii DovganV
                              Vitalii Dovgan @Vitalii Dovgan
                              last edited by

                              And regarding your other question about conversion between a custom multi-byte encoding (either ANSI or DBCS) and UTF-8, this actually is achieved by double conversion:

                              1. First, call MultiByteToWideChar to convert the input multi-byte string (e.g. ANSI/DBCS) to WCHAR string
                              2. Second, call WideCharToMultiByte to convert the WCHAR string from the step 1 into a resulting multi-byte string (e.g. UTF-8).

                              To convert from UTF-8 to ANSI/DBCS, just specify CP_UTF8 in the step 1 and then the desired ANSI/DBCS codepage in the step 2.
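The two steps above can be sketched as follows. This is a Windows-only illustration, not the plugin's actual code: bytesToWide, wideToBytes and convertCodePage are hypothetical helper names, error handling is reduced to returning an empty string, and the whole block is guarded by _WIN32 so it compiles to nothing elsewhere.

```cpp
#ifdef _WIN32  // Win32-only sketch
#include <windows.h>
#include <string>

// Step 1: bytes in the given code page -> WCHAR (UTF-16) string.
std::wstring bytesToWide(const std::string& bytes, UINT codePage) {
    if (bytes.empty()) return std::wstring();
    const int srcLen = static_cast<int>(bytes.size());
    // A first call with no output buffer returns the required length.
    const int wideLen = MultiByteToWideChar(codePage, 0, bytes.data(), srcLen, nullptr, 0);
    if (wideLen <= 0) return std::wstring();  // conversion failed
    std::wstring wide(static_cast<std::size_t>(wideLen), L'\0');
    MultiByteToWideChar(codePage, 0, bytes.data(), srcLen, &wide[0], wideLen);
    return wide;
}

// Step 2: WCHAR string -> bytes in the given code page.
std::string wideToBytes(const std::wstring& wide, UINT codePage) {
    if (wide.empty()) return std::string();
    const int srcLen = static_cast<int>(wide.size());
    // The last two arguments must be null when the target is CP_UTF8.
    const int byteLen = WideCharToMultiByte(codePage, 0, wide.data(), srcLen,
                                            nullptr, 0, nullptr, nullptr);
    if (byteLen <= 0) return std::string();  // conversion failed
    std::string bytes(static_cast<std::size_t>(byteLen), '\0');
    WideCharToMultiByte(codePage, 0, wide.data(), srcLen,
                        &bytes[0], byteLen, nullptr, nullptr);
    return bytes;
}

// Double conversion: source code page -> WCHAR -> target code page.
std::string convertCodePage(const std::string& bytes, UINT from, UINT to) {
    return wideToBytes(bytesToWide(bytes, from), to);
}
#endif
```

For example, convertCodePage(text, CP_ACP, CP_UTF8) goes from the system default code page to UTF-8, and swapping the last two arguments goes back. Characters that the target code page cannot represent are replaced by a default character, so the round trip is not always lossless.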

                              • Michael VincentM
                                Michael Vincent @Thomas Knoefel
                                last edited by

                                @Thomas-Knoefel

                                I opened a few issues and added some pull requests to your repo.

                                If you are willing to accept pull requests, I have a few more to add once those are merged.

                                Cheers.

                                • Thomas KnoefelT
                                  Thomas Knoefel @Michael Vincent
                                  last edited by Thomas Knoefel

                                  @Michael-Vincent Thanks, I’ve seen them, and I’m going to merge them. However, the latest updates for codepage handling have not been committed yet. I still need to set up a VMware virtual machine with Chinese language settings in order to test DBCS. Once that is finished, I’ll upload the final updates.

                                  • Thomas KnoefelT
                                    Thomas Knoefel @Thomas Knoefel
                                    last edited by

                                    @Thomas-Knoefel These are the facts I figured out. In Notepad++, SCI_GETCODEPAGE always returns 0 for ANSI and 65001 for UTF8; you won’t encounter any other codepage. Asian codepages, like DBCS, only matter when you’re reading or writing files. So these codepages won’t mess things up unless you’re working with files saved in them. As for the Save and Load File feature of the plugin, which is designed as an internal store, it will always save in UTF8 format when handling CSV files.
                                    I think this fact will simplify the handling of codepages.

                                    • Thomas KnoefelT
                                      Thomas Knoefel @Michael Vincent
                                      last edited by

                                      @Michael-Vincent said in [New Plugin] MultiReplace:

                                      I have a few more to add once those are merged.

                                      Thank you for your input! All requests are welcome. I can only learn from them.

                                      • Thomas KnoefelT
                                        Thomas Knoefel
                                        last edited by Thomas Knoefel

                                        I have finished the RC-2 version with fixed ANSI support and 32-bit compatibility. You can find it on GitHub.

                                        • Vitalii DovganV
                                          Vitalii Dovgan @Thomas Knoefel
                                          last edited by

                                          Thank you! I like it!
                                          Two things that might add more abilities to the plugin: 1) a button that swaps the text between the Find What and Replace With fields; 2) checkboxes in the list to specify which Find-Replace pairs to activate and which to deactivate.
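One way the two suggestions could be modelled, purely as an illustration: ReplacePair, swapFields and applyEnabled are hypothetical names and do not reflect how MultiReplace is actually implemented. The sketch does plain, case-sensitive substring replacement only.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One row of the list: a Find-Replace pair plus its checkbox state.
struct ReplacePair {
    std::string find;
    std::string replace;
    bool enabled;
};

// Suggestion 1: swap the Find What and Replace With texts of a pair.
void swapFields(ReplacePair& pair) {
    std::swap(pair.find, pair.replace);
}

// Suggestion 2: apply only the pairs whose checkbox is ticked, in list order.
std::string applyEnabled(std::string text, const std::vector<ReplacePair>& pairs) {
    for (const ReplacePair& pair : pairs) {
        if (!pair.enabled || pair.find.empty()) continue;
        std::size_t pos = 0;
        while ((pos = text.find(pair.find, pos)) != std::string::npos) {
            text.replace(pos, pair.find.size(), pair.replace);
            pos += pair.replace.size();  // continue after the inserted text
        }
    }
    return text;
}
```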

                                          • Thomas KnoefelT
                                            Thomas Knoefel @Vitalii Dovgan
                                            last edited by

                                            @Vitalii-Dovgan I will probably add both options before final release. Thanks for your input!
