Community
    • Login

    Bookmark multi words from multi lines in a text

    Scheduled Pinned Locked Moved Help wanted · · · – – – · · ·
    27 Posts 5 Posters 4.2k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • Alan KilbornA
      Alan Kilborn
      last edited by

      So I came up with the following Pythonscript.
      Basic scripting instructions can be found HERE.
      The script will take a list of words in the active tab of one of the views, and apply marking to active tab in the other view document.
      The script attempts to autodetect which view is which.

      Example, after running:

      06ad2222-227f-4c4a-b9d8-8f8d6c134a71-image.png

      The script:

      # -*- coding: utf-8 -*-
      
      from Npp import editor, editor1, editor2, notepad
      
      class T19249(object):
          def __init__(self):
              SCE_UNIVERSAL_FOUND_STYLE = 31  # redmarking indicator number; name from N++ source code
              files_in_view_count_list = [ 0, 0 ]
              for (_, _, _, view) in notepad.getFiles(): files_in_view_count_list[view] += 1
              if files_in_view_count_list[0] == 0 or files_in_view_count_list[1] == 0:
                  notepad.messageBox('This script requires 2 views to be open', '')
                  return
              editor_for_text_to_mark = editor1
              word_list = editor2.getText().splitlines()  # try secondary view first, looking for 1 word per line
              for word in word_list:
                  if ' ' in word.strip():
                      word_list = editor1.getText().splitlines()  # try primary view, looking for 1 word per line
                      for word in word_list:
                          if ' ' in word.strip():
                              notepad.messageBox("Can't find one-word-per-line list in either view's active tab")
                              return
                      editor_for_text_to_mark = editor2
                      break
              for word in word_list:
                  matches = []
                  editor_for_text_to_mark.search(word, lambda m: matches.append(m.span(0)))
                  editor_for_text_to_mark.setIndicatorCurrent(SCE_UNIVERSAL_FOUND_STYLE)
                  for m in matches: editor_for_text_to_mark.indicatorFillRange(m[0], m[1] - m[0])
      
      if __name__ == '__main__': T19249()
      
      1 Reply Last reply Reply Quote 4
      • guy038G
        guy038
        last edited by guy038

        Hello, @alan-kilborn and All,

        Once more, a nice script ! May I ask you for three improvements ?

        • Firstly, could it be possible to type the string Bob and Carol, which occurs twice in your text ? In other words, just simply consider all contents of each line !

        • Secondly, would it be difficult to force your script to follow the Find dialog settings or take these settings in account, in some way ? For instance we would be able to get whole words only !

        • Thirdly, it’s a regex man who speaks : could you consider regex expressions ? For instance :

          • ^.{29}\K.{10}              # Marks all a table contents from columns 30 to 39

          • \bBob.{2,20}Carol\b   # Marks the two first names Bob and Carol, no more than 20 characters apart

        In your text, it would match the 3 strings Bob and Carol, Bob’s surprise, Carol and Bob’s infidelity and Carol

        As usual, Alan, do as you like to ;-))

        Best Regards

        guy038

        Alan KilbornA 2 Replies Last reply Reply Quote 1
        • Alan KilbornA
          Alan Kilborn @guy038
          last edited by

          @guy038

          Haha, well, I was trying to keep it simple, according to the OP’s desire.
          But I will see what I can do.

          1 Reply Last reply Reply Quote 1
          • Alan KilbornA
            Alan Kilborn @guy038
            last edited by guy038

            @guy038

            Sorry it took me so long, but here’s the revised script (it is actually shorter, now!) with your desired changes, plus as a bonus it also bookmarks lines:

            # -*- coding: utf-8 -*-
            
            from Npp import editor, editor1, editor2, notepad
            
            class T19249a(object):
                def __init__(self):
                    SCE_UNIVERSAL_FOUND_STYLE = 31  # redmarking indicator number; name from N++ source code
                    MARK_BOOKMARK = 24              # bookmarking marker number; name from N++ source code
                    files_in_view_count_list = [ 0, 0 ]
                    for (_, _, _, view) in notepad.getFiles(): files_in_view_count_list[view] += 1
                    if files_in_view_count_list[0] == 0 or files_in_view_count_list[1] == 0:
                        notepad.messageBox('This script requires 2 views to be open', '')
                        return
                    search_term_list = editor2.getText().splitlines()  # search terms are in secondary view
                    for search_term in search_term_list:
                        matches = []
                        editor1.research(search_term, lambda m: matches.append(m.span(0)))
                        editor1.setIndicatorCurrent(SCE_UNIVERSAL_FOUND_STYLE)
                        for m in matches:
                            editor1.indicatorFillRange(m[0], m[1] - m[0])
                            editor1.markerAdd(editor1.lineFromPosition(m[0]), MARK_BOOKMARK)
            
            if __name__ == '__main__': T19249a()
            

            For instance we would be able to get whole words only

            I think you can use the \b assertion in your regexes for this?

            1 Reply Last reply Reply Quote 2
            • guy038G
              guy038
              last edited by

              Hi, @alan-kilobrn and All,

              Many thanks for this new enhanced version !

              Before speaking about my tests of your 2nd try, I re-read all the discussion and tried to play with your first version !

              • So, regarding this first version :

              I decided to search for the word license, program and version in the N++ license.txt file. And, strangely, I did not get all the expected matches ! After a while I understood that you must have the focus on the file containing the list of words ( which are searched in the current file of the other view ). In that case, the results are quite correct !

              But if the focus is on the file being analyzed, the script seems to match only the first two matches ?!


              Now, regarding your second version :

              • Thanks, first, for bookmarking all the lines containing the matches

              • Thanks also the possibility to search for multiple-words expressions such as Bob and Carol or, even, complete sentences !

              • Finally, thanks for supporting the regex expressions. For instance, if you have a one-line file containing Bob.*?Carol, it would mark, in the current file of the other view, all the smallest ranges of characters beginning with Bob till Carol, with this exact case !

              • Note that the Python search is sensible to case, by default and does not care about the whole word notion. So :

                • To run a search insensible to case, just prefix your regex or expression with the (?i) modifier

                • To run a search which matches only whole words, surround your expression with the \b assertion. For instance, the regex search \bted\b would find the first name ted but not the string ted in the word deleted


              Now, Alan, I’m sorry but it’s time to talk about some inconveniences ! I take up your text with Bob, Carol, Ted and Alice :

              Imagine I have this two-lines list :

              \bBob.{2,20}Carol\b
              \bTed.{2,20}Alice\b

              • If the focus is on that list, the script correctly marked all the zones in red but no bookmarking occurs ! Oddly, it bookmarks the virtual final line of the list, instead !

              • If the focus is on the file being analyzed, the script correctly bookmarks all the lines where the zones belong but, strangely, it red-marks, only, the first two matches as with your 1st script !?

              I’m sure you won’t be long to find out a solution to these problems !

              Cheers,

              guy038

              Alan KilbornA 1 Reply Last reply Reply Quote 1
              • Alan KilbornA
                Alan Kilborn @guy038
                last edited by

                This post is deleted!
                Alan KilbornA 1 Reply Last reply Reply Quote 0
                • Alan KilbornA
                  Alan Kilborn @Alan Kilborn
                  last edited by

                  This post is deleted!
                  1 Reply Last reply Reply Quote 0
                  • guy038G
                    guy038
                    last edited by guy038

                    Hello @alan-kilborn,

                    So, magically, the second version of your script seems to work better, this morning ;-))

                    However, there’s still this minor bug, as I said in my previous post :

                    • If the focus is on the file being analyzed, the script correctly bookmarks all the lines where the zones belong but, strangely, it red-marks, only, the first two matches as with your 1st script !?

                    Surely a matter of minutes to get it right !

                    Note that I can live with this : we just have to remember to focus on the view contening the list of words/regexes, before running your Python script !

                    BR

                    guy038

                    Alan KilbornA 1 Reply Last reply Reply Quote 1
                    • Alan KilbornA
                      Alan Kilborn @guy038
                      last edited by

                      @guy038

                      So somewhere between writing the first version of the script, and the changing of it to create the second script, I decided that it made sense to require that the “data” file you want the markings to be shown in should be the active file in the primary view, and the “list” file of tokens/regexes that should be used to do the markings should be the active file in the secondary view, when the script is run. It is an unwritten requirement (except that I just wrote it!).

                      The first version of the script, because it was simpler and only one “word” was expected per line in the “list” file, tried to determine which view contained which.

                      Aside from that, with the second version of the script, I’m not seeing the problems you mention. Can you give an exact example that replicates the behavior?

                      1 Reply Last reply Reply Quote 1
                      • guy038G
                        guy038
                        last edited by guy038

                        Hi, @alan-kilborn,

                        For instance, with :

                        • Your text pasted, in new 1, in the main view

                        • A list of four first names, pasted in new 2, in the secondary view

                        • Cursor at the very beginning of new 1

                        • Focus on new 1 ( the text )

                        After running the script, I got :

                        802b7bb5-52fb-42a3-8a5e-2305207a0592-image.png

                        As you can see, all bookmarks are here, but some red marks are missing ! The script seems to only red-mark the first occurrence of each word of the list !


                        In contrast to :

                        • Your text, pasted in new 1, in the main view

                        • A list of four first names, pasted in new 2, in the secondary view

                        • Cursor at the very beginning of new 2

                        • Focus on new 2 ( the list )

                        After running the script, I got :

                        a232e225-5cb0-43e9-a112-3392840f8378-image.png

                        This time, everything went fine !

                        Cheers,

                        guy038

                        Alan KilbornA 1 Reply Last reply Reply Quote 0
                        • Alan KilbornA
                          Alan Kilborn @guy038
                          last edited by

                          @guy038

                          I think I remember something about possibly Notepad++ code itself interfering with scripts that are trying to use indicators?

                          Perhaps the code in the script that “selects” an indicator needs to be “tighter” to where the indicator is used?

                          Maybe try changing these lines of the script:

                                      editor1.setIndicatorCurrent(SCE_UNIVERSAL_FOUND_STYLE)
                                      for m in matches:
                                          editor1.indicatorFillRange(m[0], m[1] - m[0])
                          

                          to this instead, and see if it runs differently?:

                                      for m in matches:
                                          editor1.setIndicatorCurrent(SCE_UNIVERSAL_FOUND_STYLE)
                                          editor1.indicatorFillRange(m[0], m[1] - m[0])
                          

                          (really what this means is just moving the “setting” of the indicator inside the loop, so that it is definitely set each time before a range is “filled” with it)

                          Let me know how that goes.

                          1 Reply Last reply Reply Quote 1
                          • guy038G
                            guy038
                            last edited by guy038

                            @alan-kilborn,

                            BINGO ! It works like a charm, whatever the focus is ( The main or the secondary view ) ;-))

                            You’re right : placing the indicator’s definition inside the loop is the key point !

                            Many thanks, again, for your cooperation :)

                            Just to be sure : did you reproduce the issue, too, when focus is in the analyzed text view ?

                            BR

                            guy038

                            Alan KilbornA 1 Reply Last reply Reply Quote 2
                            • Alan KilbornA
                              Alan Kilborn @guy038
                              last edited by

                              @guy038 said in Bookmark multi words from multi lines in a text:

                              did you reproduce the issue, too, when focus is in the analyzed text view ?

                              Mostly it worked okay for me on simpler data sets.
                              But I noticed it as an issue when I was looking into a larger-dataset problem, e.g. THIS ONE, and that made me recall an issue where an indicator needs to be set right before a fill, even though the indicator was set earlier and still should remain set.

                              1 Reply Last reply Reply Quote 2
                              • prahladmifourP
                                prahladmifour
                                last edited by

                                Hello,@BAZ-BAZOOO
                                Please follow this information, To Bookmark multi words from multi lines in a text

                                Find What:
                                (.Query . message.\R.ApplicationGatewayID = 5009.\R.\R.\R)|^(?!.Query . message).\R?
                                Replace With:
                                $1

                                The regex is of the form:** (<YOUR_REGEX_MATCHING_LINES>)|^(?!.<STARTING_PART_OF_REGEX>).\R?.**

                                Description:

                                • (.Query . message.\R.ApplicationGatewayID = 5009.\R.\R.*\R) - a line with Query and then message words on it, then the next line that has ApplicationGatewayID = 5009 on it and then 2 more lines, captured into Group 1 ($1 refers to this value)

                                • | - or

                                • ^(?!.Query . message).*\R? - start of a line (^) that has no Query and then message on it, then the whole line and optional linebreak after it are matched and eventually removed.

                                I hope this information will be useful to you.
                                Thank you.

                                1 Reply Last reply Reply Quote -3
                                • Alan KilbornA
                                  Alan Kilborn
                                  last edited by

                                  The marker ID used for bookmarks changed in Notepad++ 8.4.6 (and later). It is now 20, instead of 24. So, all references to 24 in this thread and/or its script(s), should be changed to 20.

                                  1 Reply Last reply Reply Quote 3
                                  • First post
                                    Last post
                                  The Community of users of the Notepad++ text editor.
                                  Powered by NodeBB | Contributors