    Search++: A work in progress

    • Coises @guy038

      @guy038 said in Search++: A work in progress:

      I do appreciate being able to temporarily reverse the search direction in native N++ search by pressing or releasing the Shift key! Would it be possible to add this functionality to the Search++ plugin?

      Yes, that makes sense for the step-wise Find and Replace buttons in the plain text search. I don’t think there’s any reason I can’t make it work.

      • guy038

        Hello @coises and All,

        Regarding the bug in Search++_01, I verified that everything is fixed in Search++_02, and I also verified that it works with the Columns++ plugin!


        Regarding changing the Find All name :

        Of course, I suppose you could replace the Find All label with the List label. However, note that the Find All in Current Document option, the Find All in All Opened Documents option and the Find All option of N++ search all begin with the string Find All. Thus, I think the change would confuse some people!


        Regarding the inclusion of the results of the Replace All option in the Search++ results :

        To my mind, this extra work does not seem necessary at first! Besides, you can always redo a Find All search using the previous replace expression. However, if this functionality seems important to you, just go ahead!


        Regarding hiding all lines of a file :

        • Let’s hide all the contents of a file 1, with the option in the Tools panel

        • Switch to another file 2

        • Switch back to file 1

        All the lines of file 1 are visible again : I suppose that this behaviour is the one that you expect, don’t you ?


        Regarding the Mark functionality :

        • Let’s mark some text with the native N++ Mark dialog with the Bookmark line option checked

        • If you use the Unmark All Text option of the Tools panel, it correctly clears all the marks. However, it does not clear the associated bookmarks !

        I think of two possible solutions :

        • Clear all the bookmarks, as well

        • Add a new Clear bookmarks option


        Now, let’s suppose we have this text :

         1  foo
         2  bar
         3  foo
         4  bar
         5  foo
         6  bar
         7  foo
         8  bar
         9  bar
        10  foo
        11  bar
        12  bar
        13  foo
        

        I first did a selection from line 4 bar to line 9 bar

        • With the word foo in the search dialog, I first run a Mark in Selection option

        • Now, if I use the Find in Marked Text option, I do get the first word foo of the selection ( Line 5 foo )

        • Clicking again on the Find in Marked Text option, I do get the second word foo of the selection ( Line 7 foo )

        • Clicking a third time on the Find in Marked Text option, I correctly get the message No matches found (Find again to search all marked text)

        • Clicking a fourth time on the Find in Marked Text option, it does match again the first word foo of the Marked Text region ( Line 5 foo )

        That corresponds to what you say in the Help documentation :

        Marked Text: If there is any marked text, ONLY marked text will be searched;


        Now, let’s use the new text below :

         1          foo
         2          bar
         3          foo
         4          bar
         5          foo
         6          bar
         7          foo
         8          bar
         9          bar
        10          foo
        11          bar
        12          bar
        13          foo
        
        • I first do a stream selection from line 4 bar to line 9 bar. Note that this selection contains more than 80 characters and more than 3 lines (102 chars / 6 lines), so it should be considered a true selection!

        • Now, if I use the Find in Selection option, I do get the first word foo of the selection ( Line 5 ... foo )

        • Clicking again on the Find in Selection option, I do get the second word foo of the selection ( Line 7 ... foo )

        • Clicking a third time on the Find in Selection option, it matches the word foo, outside the selection ( Line 10 ... foo )

        ( I was expecting the message No matches found (Find again to search in selection) )

        • Clicking a fourth time on the Find in Selection option, it matches the word foo, outside the selection ( Line 13 ... foo )

        ( I was expecting, again, a match of the first word foo of the selection ( Line 5 ... foo ) ) !

        Indeed, you say in the Help documentation :

        If there is a non-empty selection — single, rectangular or multiple — the search will be confined to the selected text.

        Maybe, @coises, I just did not exactly understand the Selection concept ?

        Best Regards,

        guy038

        • Coises @guy038

          @guy038 said in Search++: A work in progress:

          • Let’s hide all the contents of a file 1, with the option in the Tools panel

          • Switch to another file 2

          • Switch back to file 1

          All the lines of file 1 are visible again : I suppose that this behaviour is the one that you expect, don’t you ?

          @guy038, thank you so much for all your thoughts and your tests. They are an immense help.

          I see there is a potential problem here, without a simple solution. It looks like Notepad++ saves and restores hidden lines, but it tracks them only when it hides or shows them itself, so it is unaware of any changes in state made by a plugin. Not only does it fail to restore lines hidden in Search++; if you hide lines in Notepad++ and then find them in Search++, they wind up hidden again when you switch tabs away and back.

          Ugh. This could be really messy. :-(

          • Let’s mark some text with the native N++ Mark dialog with the Bookmark line option checked

          • If you use the Unmark All Text option of the Tools panel, it correctly clears all the marks. However, it does not clear the associated bookmarks !

          I think of two possible solutions :

          • Clear all the bookmarks, as well

          • Add a new Clear bookmarks option

          Unless it doesn’t work for some reason (I haven’t tried it yet), I think Notepad++’s own Search | Bookmark | Clear all bookmarks can handle that.

          Maybe, @coises, I just did not exactly understand the Selection concept ?

          It sounds like I need to clarify what happens a little more in the help. Short answer: the first Find selects the text it found, which wipes out the original selection; it’s not possible to search within a selection that doesn’t exist, so subsequent searches revert to searching the whole document.

          This traces back to some discussion about search in early Columns++. The purpose of having a search function there was to make it possible to search and replace in column selections, which Notepad++ search will not do. @Alan-Kilborn observed some unexpected behavior and suggested using what Scintilla calls an indicator. (Notepad++ calls it marked text or a style… both style and mark mean something different from indicator in Scintilla, so keeping the words straight is an ongoing challenge.)

          The fundamental operation of Find (step-wise) is to select what was found. You can’t have a selection within a selection; the original selection is lost when you do the first Find. My original attempt at solving this in Columns++ — by “memorizing” the original selection and using it for subsequent finds — did not work well. After Alan’s suggestion, I switched to “Search in Indicated Region” and set a marked region from the column selection. Eventually I wound up with some features for controlling the indicated region incorporated into the search dialog, and some complicated rules (described here under the heading “Other controls in the Search dialog”) to make it so that users can almost forget about the indicator, even though every search in Columns++ must happen within an indicated region.

          One of my goals in Search++ was/is to get rid of the imposition of an “indicated region” (now just called “marked text”) on users who aren’t thinking in those terms.

          The rough equivalent to Selection -> Region | Auto set in Columns++ (which is checked by default) is Settings | Convert selections to marked text before beginning a stepwise search in Search++ (which is not, at present, checked by default).

          Given my experience with Columns++, I don’t plan to attempt to restrict search to a “memorized” selection that no longer exists. Either you convert the selection to marked text, let Search++ convert it for you, or only the first step-wise Find will be confined to the original selection.
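To make the difference concrete, here is a small Python model of the behaviour described above. It is only a sketch and not Search++ code: `find_next`, `stepwise_find` and the `region_survives` flag are hypothetical names invented for this illustration, and wrap-around ("Find again") is not modelled. It replays the foo/bar test from earlier in the thread, once treating lines 4 to 9 as a plain selection and once as marked text.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # half-open [start, end) character offsets


def find_next(text: str, needle: str, start: int,
              regions: Optional[List[Range]]) -> Optional[Range]:
    """First match at or after `start`; confined to `regions` unless it is None."""
    pos = text.find(needle, start)
    while pos != -1:
        end = pos + len(needle)
        if regions is None or any(lo <= pos and end <= hi for lo, hi in regions):
            return (pos, end)
        pos = text.find(needle, pos + 1)
    return None


def stepwise_find(text: str, needle: str, region: Range,
                  steps: int, region_survives: bool) -> List[Range]:
    """Simulate repeated Find clicks (hypothetical model, no wrap-around).

    region_survives=False models a plain selection: the first match becomes
    the new selection, so later steps have no region left to honor.
    region_survives=True models marked text, which selecting does not erase.
    """
    hits: List[Range] = []
    regions: Optional[List[Range]] = [region]
    caret = region[0]
    for _ in range(steps):
        hit = find_next(text, needle, caret, regions)
        if hit is None:
            break  # the real dialog would report "No matches found" here
        hits.append(hit)
        caret = hit[1]  # the match is selected; the caret sits at its end
        if not region_survives:
            regions = None  # the original selection is gone
    return hits


def line_of(text: str, offset: int) -> int:
    """1-based line number containing `offset`."""
    return text.count("\n", 0, offset) + 1


WORDS = ["foo", "bar", "foo", "bar", "foo", "bar", "foo",
         "bar", "bar", "foo", "bar", "bar", "foo"]
TEXT = "\n".join(WORDS)
REGION = (12, 35)  # from the start of line 4 to the end of line 9

as_selection = [line_of(TEXT, h[0]) for h in stepwise_find(TEXT, "foo", REGION, 4, False)]
as_marked = [line_of(TEXT, h[0]) for h in stepwise_find(TEXT, "foo", REGION, 4, True)]
print(as_selection)  # [5, 7, 10, 13]
print(as_marked)     # [5, 7]
```

The plain-selection run reproduces the sequence reported in the test (lines 5, 7, 10, 13), while the marked-text run stops after lines 5 and 7.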

          • Vitalii Dovgan

            This is very interesting!

            Maybe you’ll also consider an alternate UI (in addition to the main one) in the form of a small one-line search panel similar to the one in Sublime Text or similar to this one:
            https://github.com/d0vgan/AkelPad-Plugs-QSearch

            To be honest, now I prefer the Incremental Search panel in Notepad++ to the Find dialog because the Incremental Search panel occupies much less space, even though it’s not as powerful as the Find dialog.
            The painful points of the Incremental Search panel are:

            • it does not clear the last word or the entire Find What field by Ctrl+Backspace. Instead, a stupid unreadable symbol is inserted. (Yes, I know, Microsoft forces us to write own handler of Ctrl+Backspace in each and every instance of an Edit control, what a shame),
            • it does not forward Ctrl+Tab to the main window, thus not allowing to switch between tabs while in the Incremental Search panel.
            • guy038

              Hello, @coises and All,

              • Regarding the Bookmarks :

              I’m pretty dumb for not thinking of the N++ native command Search > Bookmark > Clear All Bookmarks or even better : a right mouse click within the Bookmark margin with the same option !


              • Regarding the selection concept :

              Many thanks for your explanations. So, if I understand you correctly, we need to transform the selection(s) into Marked Text first, and then use the Find in Marked Text option.


              • In your initial post, near the end of the Features section, you said :

              Regular expression searches in Search++ perform a fully Unicode-based search using a customized combination of Boost.Regex and ICU4C. In particular, this produces fewer “surprising” results with Unicode characters above 0xFFFF (including most emoji) and when searching in documents using a DBCS code page (which in Notepad++ can be Chinese, Japanese or Korean files that are in the system default encoding instead of in Unicode).

              Then, at the end of the Quirks and features ... section :

              The ICU button at the top is there mostly for testing. It uses the regular expression engine built into ICU, which has different syntax than the familiar Boost.Regex engine and does not integrate as well with Scintilla. Replace is not implemented for this search engine, and it only works on Unicode documents. It will probably be removed when Search++ reaches version 1.0, as it really isn’t very useful except as a check on the results from the main Regex engine (since I’ve meddled with the main Regex engine quite a lot, and I haven’t modified the ICU engine in any way).

              And later, at the end of the Missing and Planned Features ... section :

              I hope to add more features to the regular expression search. The current version is almost identical to the search in Columns++, but presented in what is hopefully a more flexible and user-friendly interface. It should be more accurate for Unicode-derived properties since it uses ICU4C directly instead of working from the home-grown parse of Unicode tables used in Columns++. If I can work out a way, I hope to add Unicode word breaks and more Unicode properties.

              So, some questions :

              • When clicking on the Regex button, do we use your Unicode search engine, as in Columns++, or is it a mix of the Columns++ version and ICU ?

              • Oddly, if we choose the ICU button, the Replace and Replace All buttons are not greyed and seem functional, contrary to what you said ?!

              • Can you recommend a few websites that discuss ICU and the specifics of Unicode word boundaries ?

              • Presently, when hitting the ICU button, are searches like \p{alphabetic} or \p{XID_Continue} possible against my Total_Chars file of 325,590 characters ?

              TIA for all your answers !

              Best Regards,

              guy038

              • Coises @Vitalii Dovgan

                @Vitalii-Dovgan said in Search++: A work in progress:

                Maybe you’ll also consider an alternate UI (in addition to the main one) in the form of a small one-line search panel

                Probably not one line, but reasonably compact should be possible. At present you can dock the docking Search++ dialog to the top or bottom instead of the left or right, if you like that better. The layout adapts, but it doesn’t use the full width as well as it could — right now there are only horizontal and vertical layouts, and I need to work out an “ultra-wide” layout that puts all the buttons and check boxes into a single row when the dialog is wide enough. I don’t see any reason that can’t be done, though.

                The painful points of the Incremental Search panel are:

                • it does not clear the last word or the entire Find What field by Ctrl+Backspace. Instead, a stupid unreadable symbol is inserted. (Yes, I know, Microsoft forces us to write own handler of Ctrl+Backspace in each and every instance of an Edit control, what a shame),

                Search++ does that now. (That must be the default command for Ctrl+Backspace in Scintilla, since I did nothing special to make it work. I’ve never used Ctrl+Backspace.)

                • it does not forward Ctrl+Tab to the main window, thus not allowing to switch between tabs while in the Incremental Search panel.

                I see regular Notepad++ search doesn’t do that either. (It uses Ctrl+Tab to switch dialog tabs, though, so that makes sense.) Search++ doesn’t do it now; I don’t know if it’s possible (particularly from a docked dialog) but I will see if it can be done.

                At present you can switch rapidly to the main Notepad++ window with Ctrl+N. If you’ve set a shortcut for Search++ you can then use that to switch back again. I know that’s still extra keystrokes, so I will see if Ctrl+Tab can be forwarded, since it’s not used for anything in Search++.

                Thank you for your observations and suggestions!

                • Vitalii Dovgan @Coises

                  @Coises
                  It should be possible to forward Ctrl+Tab and Ctrl+Shift+Tab by processing WM_KEYDOWN with VK_TAB in your dialog’s DlgProc similarly to this:
                  https://github.com/d0vgan/AkelPad-Plugs-QSearch/blob/master/Source/QSearch/QSearchDlg.c#L4569

                  Interestingly, the Right Ctrl key often emulates Ctrl+Alt, so when you verify only the presence of VK_TAB and VK_CONTROL (like in the code mentioned above), this code also works for RightCtrl+Tab which becomes VK_TAB and VK_CONTROL and VK_MENU. (VK_MENU is the Alt key. Unlike the real Alt key that comes under WM_SYSKEYDOWN, the “Ctrl+Alt” from RightCtrl comes under WM_KEYDOWN).

                  Oh, WM_KEYUP should be handled as well:
                  https://github.com/d0vgan/AkelPad-Plugs-QSearch/blob/master/Source/QSearch/QSearchDlg.c#L4607

                  • Coises @guy038

                    @guy038 said in Search++: A work in progress:

                    So, if I understand you correctly, we need to transform the selection(s) into Marked Text first, and then use the Find in Marked Text option.

                    Yes; or click the Tools button, open Settings and check Convert selections to marked text before beginning a stepwise search to have Search++ do it automatically. Otherwise, multiple searches that don’t affect the selection (like Count or Find All or Replace All) will work within the selection, but only the first stepwise Find (or the preliminary find in a stepwise Replace) will be constrained to the selection, since after that the original selection will be gone.

                    • When clicking on the Regex button, do we use your Unicode search engine, as in Columns++, or is it a mix of the Columns++ version and ICU ?

                    It’s the Columns++ search engine, except for one thing. Previously I could not figure out how to incorporate ICU4C into the plugin, so for Columns++ I devised a Python program that reads several of the Unicode character data files and writes C++ code that compiles into a gigantic table containing the information I needed. I stumbled on the way to use ICU4C shortly before I began working on Search++; instead of building and using those tables, I go straight to ICU4C for information (questions like, “What is the general category of this character?” or “Is this a lower case character?”).

                    It might turn out that this will have an efficiency impact (better or worse? — I don’t know). It should fix some of the errors in Columns++, like [[:lower:]] missing characters that are lower case but not letters.
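As a rough illustration of the kind of query involved, the sketch below uses Python's standard `unicodedata` module as a stand-in for ICU4C; that substitution is an assumption made for this example, and nothing here is taken from Search++ or Columns++. The helper `lowercase_non_letters` is a hypothetical name. It looks for characters that report as lower case even though their general category is not a letter category, which is the sort of character a letters-only `[[:lower:]]` table could miss. The two sample candidates are U+2170 (category Nl) and U+24D0 (category So).

```python
import unicodedata


def lowercase_non_letters(chars):
    """Characters Python reports as lower case whose general category is not L*."""
    return [c for c in chars
            if c.islower() and not unicodedata.category(c).startswith("L")]


# Ordinary letters, a digit, and two candidates outside the letter categories:
# U+2170 SMALL ROMAN NUMERAL ONE and U+24D0 CIRCLED LATIN SMALL LETTER A.
samples = ["a", "A", "1", "\u2170", "\u24d0"]

for c in samples:
    print(f"U+{ord(c):04X} {unicodedata.name(c)}: "
          f"category={unicodedata.category(c)}, islower={c.islower()}")

print([f"U+{ord(c):04X}" for c in lowercase_non_letters(samples)])
```

Results depend on the Unicode data shipped with the Python build, so the printout is best treated as something to inspect rather than a fixed answer.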

                    • Oddly, if we choose the ICU button, the Replace and Replace All buttons are not greyed and seem functional, contrary to what you said ?!

                    They’re not disabled, but all they do is return the message, “Command not implemented.”

                    • Can you recommend a few websites that discuss ICU and the specifics of Unicode word boundaries ?

                    I don’t really have anything except the Unicode documentation. In my brief testing, the practical effect in English is that words like can't are recognized as a single word. Most regular expression engines define a word boundary (\b) in terms of what is a word character (\w). The regular expression engine in ICU lets you do that, but it also provides an option to use Unicode word boundaries to define \b.
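The can't example can be imitated with Python's standard `re` module. This is only a sketch under stated assumptions: Python's `\b` is the classic kind defined by `\w`, and the second function approximates just one idea from the Unicode word-break rules (an apostrophe between word characters does not split the word). It is not the Unicode algorithm and not what ICU does internally; both function names are made up for this example.

```python
import re


def words_by_w(text):
    """Words as delimited by the classic \\b, i.e. maximal runs of \\w characters."""
    return re.findall(r"\b\w+\b", text)


def words_with_apostrophe_rule(text):
    """Simplified imitation: an apostrophe between word characters stays inside the word."""
    return re.findall(r"\w+(?:['\u2019]\w+)*", text)


sentence = "I can't stop"
print(words_by_w(sentence))                  # ['I', 'can', 't', 'stop']
print(words_with_apostrophe_rule(sentence))  # ['I', "can't", 'stop']
```

With the classic definition the apostrophe is a boundary on both sides, so can't splits into two words; the approximated rule keeps it whole.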

                    • Presently, when hitting the ICU button, are searches like \p{alphabetic} or \p{XID_Continue} possible against my Total_Chars file of 325,590 characters ?

                    Yes. You can even use things like \p{script=Greek}. Unfortunately, I haven’t been able to find any place where ICU documents its own regular expression syntax. The regular-expressions.info web site includes ICU among the regex dialects it shows.

                    • Coises @guy038

                      @guy038 said in Search++: A work in progress:

                      I’m a bit annoyed to not be able to clear this panel at any time and that I need to close and re-open a N++ session to that purpose ! Personally, an option in the Tools menu, to clear the Search++ Results panel would be great !

                      Regarding the search direction :
                      

                      I do appreciate being able to temporarily reverse the search direction in native N++ search by pressing or releasing the Shift key! Would it be possible to add this functionality to the Search++ plugin?

                      These features, and some bug fixes, are in version 0.3.

                      • guy038

                        Hi, @coises and All,

                        Thanks for your new Search++_03 release !

                        BTW, with native N++ search, the Shift + Enter shortcut is also available when you choose the Regular expression search mode (on the condition that the regexBackward4PowerUser="yes" option is present within the config.xml file). Maybe you could allow it in Search++ as well?


                        I just discovered ICU’s features, and they’re really impressive ! Over the next few days, I’ll try to list the many Unicode properties accessible through ICU… Another whole new world is opening up to me !! Personally, I think the ICU button should remain available in future versions !


                        I ran into a problem while selecting characters. For example :

                        • Put this small text in a new tab
                        
                        
                        ໜໝໞໟ໠໡໢໣໤໥໦໧໨໩໪໫໬໭໮໯໰໱໲໳໴໵໶໷໸໹໺໻໼໽໾໿ༀ༁༂༃༄༅༆༇༈༉༊་༌།༎༏༐༑༒༓༔༕༖༗༘༙༚༛༜༝༞༟༠༡༢༣༤༥༦༧༨༩༪༫༬༭༮༯༰༱༲༳༴༵༶༷༸༹༺༻༼༽༾༿ཀཁགགྷངཅཆཇ཈ཉཊཋཌཌྷཎཏཐདདྷནཔཕབབྷམཙཚཛཛྷཝཞཟའཡརལཤཥསཧཨཀྵཪཫཬ཭཮཯཰ཱཱཱིིུུྲྀཷླྀཹེཻོཽཾཿ྄ཱྀྀྂྃ྅྆྇ྈྉྊྋྌྍྎྏྐྑྒྒྷྔྕྖྗ྘ྙྚྛྜྜྷྞྟྠྡྡྷྣྤྥྦྦྷྨྩྪྫྫྷྭྮྯྰྱྲླྴྵྶྷྸྐྵྺྻྼ྽྾྿࿀࿁࿂࿃࿄࿅࿆࿇࿈࿉࿊࿋࿌࿍࿎࿏࿐࿑࿒࿓࿔࿕࿖࿗࿘࿙࿚࿛࿜࿝࿞࿟࿠࿡࿢࿣࿤࿥࿦࿧࿨࿩࿪࿫࿬࿭࿮࿯࿰࿱࿲࿳࿴࿵࿶࿷࿸࿹࿺࿻࿼࿽࿾࿿ကခဂဃငစဆဇဈဉညဋဌ
                        
                        
                        
                        • Switch to this new tab

                        • Run Plugins > Search++ > Search...

                        • Select the ICU button

                        • SEARCH \p{script=Tibetan}

                        • Check the Match case option

                        • Right click on the Find All button

                        • Choose the Select > Select in Whole Document option

                        => A selection appears with the bottom message Selected 207 matches

                        • Without doing anything else, I use the Ctrl + C shortcut

                        After opening another new tab, I was quite surprised that the 207 Tibetan chars were not pasted after a Ctrl + V operation ?!

                        Then, I understood that the selection is effective ONLY IF :

                        • The Search++ plugin is closed with the x button or using the ESC key

                        • You click again on the New 1 tab, with Search++ not in focus

                        • You move the New 1 text one line Up or Down with the ▲ or ▼ marks of the vertical scroll bar

                        @coises, is this behaviour correct ?


                        Regarding the Unicode Word boundaries :

                        I had a look at https://www.regular-expressions.info/unicodeboundaries.html#word

                        I understood that :

                        • When ICU is selected and the Unicode word boundaries option is not checked, the \b regex, against our Tibetan text above, counts 46 matches

                        • When ICU is selected and the Unicode word boundaries option is checked, the \b regex, against our Tibetan text above, counts 176 matches

                        Quite different, indeed! Note that if the Unicode word boundaries option is not checked, the (?w)\b regex also returns 176 matches. Thus, a leading (?w) forces the use of the Unicode word boundaries option!

                        Then, reading https://www.regular-expressions.info/unicodeboundaries.html#grapheme, I realized that, presently, the \b regex cannot identify the different grapheme positions !

                        Would it be possible to add an option for this specific case, or am I asking too much ? I suppose the latter is true !!

                        Best Regards,

                        guy038

                        • Coises @guy038

                          @guy038 said in Search++: A work in progress:

                          Thanks for your new Search++_03 release !

                          Thank you for testing it.

                          BTW, with native N++ search, the Shift + Enter shortcut is also available when you choose the Regular expression search mode (on the condition that the regexBackward4PowerUser="yes" option is present within the config.xml file). Maybe you could allow it in Search++ as well?

                          Regex backward… I have my doubts, but I can leave it open as something I might try to make available some day. When I’ve thought about it before, I get caught up trying to define exactly what it means to match regular expressions backward. Regular expressions can match different lengths depending on where they start. Is the previous match the one that ends at the latest possible position? The one that begins at the latest possible position? The last one that would have occurred before the current position if you matched forward repeatedly from the beginning of the text? The one that would result from reversing both the text and the regular expression (but then what do you do with backreferences)?
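The ambiguity is easy to reproduce. The sketch below uses Python's `re` module purely as a convenient regex engine; the three function names are hypothetical and each encodes one of the candidate definitions listed above (the reversed-text variant is left out). Note that `latest_end_match` only looks at the one match the engine prefers at each start position, so it is an approximation of "ends at the latest possible position".

```python
import re


def last_forward_match(pattern, text, pos):
    """Last match from a forward scan of the whole text that ends at or before pos."""
    last = None
    for m in re.finditer(pattern, text):
        if m.end() > pos:
            break
        last = m
    return last.span() if last else None


def latest_start_match(pattern, text, pos):
    """Match that begins at the latest possible position before pos."""
    rx = re.compile(pattern)
    for start in range(pos - 1, -1, -1):
        m = rx.match(text, start, pos)  # endpos=pos keeps the match before the caret
        if m and m.end() > m.start():
            return m.span()
    return None


def latest_end_match(pattern, text, pos):
    """Match that ends at the latest possible position; earliest start wins ties."""
    rx = re.compile(pattern)
    best = None
    for start in range(pos):
        m = rx.match(text, start, pos)
        if m and m.end() > m.start() and (best is None or m.end() > best[1]):
            best = m.span()
    return best


text, pattern, caret = "abbb", r"ab|b+", 4
for rule in (last_forward_match, latest_start_match, latest_end_match):
    span = rule(pattern, text, caret)
    print(rule.__name__, span, repr(text[span[0]:span[1]]))
# last_forward_match (2, 4) 'bb'
# latest_start_match (3, 4) 'b'
# latest_end_match (1, 4) 'bbb'
```

Three reasonable definitions give three different "previous" matches for the same text, pattern and caret position.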

                          Shift+Enter is a different problem. Enter doesn’t work to find: since the Find and Replace boxes take multiple lines, they consume the Enter key. You can use Alt+F and Alt+R (the underlined characters on the Find and Replace buttons), but those combinations are a bit awkward. I’ve been thinking of just making Shift+Enter and Ctrl+Enter do the functions on the Find and Replace buttons — I think those would be more natural than Alt+F and Alt+R for most people (including me). But then it isn’t obvious how access to backward should work. Beyond all that, there is no standard Windows mechanism for keyboard-only access to the drop-down menus on split command buttons. Once you can get to the button without clicking it, down arrow works to open the menu; but you can’t get there with Alt+underlined letter: that does the click action. I haven’t figured out a good way to deal with all of the keyboard navigation obstacles yet.

                          Which is a long way of saying I don’t know which of too many possibilities I will eventually decide must take priority for keyboard actions, so I don’t know what I can/will do in that regard.

                          Personally, I think the ICU button should remain available in future versions !

                          I’ll probably leave the function there… it might be “hidden” (like a Shift-click on Regex) so it doesn’t confuse people who would probably never use it.

                          • Choose the Select > Select in Whole Document option

                          => A selection appears with the bottom message Selected 207 matches

                          • Without doing anything else, I use the Ctrl + C shortcut

                          After opening another new tab, I was quite surprised that the 207 Tibetan chars were not pasted after a Ctrl + V operation ?!

                          Then, I understood that the selection is effective ONLY IF :

                          It’s not that selection isn’t effective, it’s that keyboard focus was still in the Search++ dialog. You have to move focus to the document for the Ctrl+C to work.

                          You can use Ctrl+N (think “Notepad++”) to return focus to the document, or (as you discovered) click on the tab if you’re using the mouse.

                          This does make me think I should probably have an option, perhaps enabled by default, to return focus to the document automatically after a select operation, since wanting to copy is probably the most common reason for using select.

                          (I’ve been bitten by this often enough in Columns++, which works the same way. It’s just so easy to forget that focus is in the dialog, not the document.)

                          Note that if the Unicode word boundaries option is not checked, the (?w)\b regex also returns 176 matches. Thus, a leading (?w) forces the use of the Unicode word boundaries option!

                          Hmmm… I’m not sure what’s happening there.

                          Then, reading https://www.regular-expressions.info/unicodeboundaries.html#grapheme, I realized that, presently, the \b regex cannot identify the different grapheme positions !

                          Would it be possible to add an option for this specific case

                          In both Regex and ICU, \X matches a single grapheme cluster. In Regex, (?=\X) matches a grapheme boundary; that doesn’t work in ICU. (It looks like in ICU, \X actually matches from the current position to the end of a grapheme cluster. In Regex, the match must begin and end on a grapheme cluster boundary. The Boost.Regex logic already worked that way, but I replaced/extended it to use the grapheme break algorithm specified by Unicode.) \X partially works in built-in Notepad++ search, too, but it misses some cases and falls apart entirely outside the BMP.
