    Regular expression size limited 2048 chars How to make it longer?

    • hcchenH
      hcchen
      last edited by

      My regular expressions work fine. They are generated by a program and can be very long, but Notepad++ accepts regular expressions no longer than 2048 characters. How can I raise the limit to a bigger number, say 8192?

      • Alan KilbornA
        Alan Kilborn @hcchen
        last edited by

        @hcchen

        You can’t.
        The text of any find or replace expression is limited to 2046 characters.

        • hcchenH
          hcchen @Alan Kilborn
          last edited by

          @Alan-Kilborn thank you very much. We plan to develop a plugin instead, and I hope someone would be willing to be paid to develop it for us. Users use Notepad++ Style Token to color keywords, and the plugin does the search according to the colored keywords.

          e.g.

          1. ‘windows 10’, ‘win10’, and ‘window10’ are colored with blue
          2. ‘blue screen’ and ‘BSOD’ are colored with red
          3. ‘blue tooth’ and ‘bt’ are colored with green

          Then the plugin searches the document with the logic shown below and either sends all lines that match the logic to an output window or jumps to the next line that matches.

           ('windows 10' OR 'win10' OR 'window10') AND ('blue screen' OR 'BSOD') AND ('blue tooth' OR 'bt')
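The AND-of-ORs logic above can be sketched in plain Python, independent of Notepad++. This is only an illustration: the `GROUPS` list and the function names are hypothetical, the keywords are copied from the example, and case-insensitive substring matching is an assumption (a short keyword such as 'bt' will also match inside longer words).

```python
# Hypothetical keyword groups, copied from the example in the post.
# Each inner list is one color; a line must hit every color at least once.
GROUPS = [
    ["windows 10", "win10", "window10"],  # blue
    ["blue screen", "BSOD"],              # red
    ["blue tooth", "bt"],                 # green
]

def line_matches(line, groups=GROUPS):
    """Return True when the line contains at least one keyword from every group."""
    lowered = line.lower()  # assumption: matching is case-insensitive
    return all(any(kw.lower() in lowered for kw in group) for group in groups)

def matching_lines(text, groups=GROUPS):
    """Return the lines of text that satisfy (OR within a group) AND (across groups)."""
    return [ln for ln in text.splitlines() if line_matches(ln, groups)]
```

Calling `matching_lines(document_text)` returns only the lines that contain a keyword of every color.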
          

          Would you like to do this, or could you introduce us to someone who can?

          陳厚成 H.C. Chen
          緯創資通 Wistron Corporation

          • Alan KilbornA
            Alan Kilborn @hcchen
            last edited by

            @hcchen

            This would not be terribly difficult to accomplish; if I were doing it I would do it with PythonScript, not a dedicated plugin.

            Your boolean logic could be accomplished with help from THIS information.

            The hardest part would be “presentation of results” since writing data to the Search results area is not really practical, either with a script or a plugin.

            • Alan KilbornA
              Alan Kilborn
              last edited by

              OP jumping this to new thread for some reason: https://community.notepad-plus-plus.org/topic/22012

              • hcchenH
                hcchen @Alan Kilborn
                last edited by

                @Alan-Kilborn thank you so much! PythonScript looks good; I hope it’s good enough …

                • hcchenH
                  hcchen @Alan Kilborn
                  last edited by

                  @Alan-Kilborn I am so glad to have spent a day today on Notepad++ PythonScript; it’s fun. But there is still too much to learn before being able to make something decent. We need to know:

                  1. how to get ‘Style token’ list,
                  2. use the logic to find matched lines.
                  3. If outputting them to a view is not easy, as you mentioned, then a new buffer is also okay.

                  So users color keywords, run the Python script, and get a ‘new 1’ file with all matched lines.

                  I am afraid you are probably very busy. A draft of sample code would be enough for us to proceed and finish it.

                  Best regards,
                  H.C. Chen
                  Wistron Corp.

                  • Alan KilbornA
                    Alan Kilborn @hcchen
                    last edited by

                    @hcchen said in Regular expression size limited 2048 chars How to make it longer?:

                    1. how to get ‘Style token’ list

                    Here’s a demo script for that:

                    # -*- coding: utf-8 -*-
                    from __future__ import print_function
                    
                    from Npp import *
                    
                    #-----------------------------------------------------------------------------
                    
                    def highlight_indicator_range_tups_generator(indicator_number):
                        '''
                        the following logic depends upon behavior that isn't exactly documented;
                        it was noticed that calling editor.indicatorEnd() will yield the "edge"
                        (either leading or trailing) of the specified indicator greater than the position
                        specified by the caller
                        this is definitely different than the scintilla documentation:
                        "Find the start or end of a range with one value from a position within the range"
                        '''
                        if editor.indicatorEnd(indicator_number, 0) == 0:
                            return
                        indicator_end_pos = 0  # set special value to key a check the first time thru the while loop
                        while True:
                            if indicator_end_pos == 0 and editor.indicatorValueAt(indicator_number, 0) == 1:
                                # we have an indicator starting at position 0!
                                # when an indicator highlight starts at position 0, editor.indicatorEnd()
                                #  gives us the END of the marking rather than the beginning;
                                #  have to compensate for that:
                                indicator_start_pos = 0
                            else:
                                indicator_start_pos = editor.indicatorEnd(indicator_number, indicator_end_pos)
                            indicator_end_pos = editor.indicatorEnd(indicator_number, indicator_start_pos)
                            if indicator_start_pos == indicator_end_pos: break  # no more matches
                            yield (indicator_start_pos, indicator_end_pos)
                    
                    #-----------------------------------------------------------------------------
                    
                    def INDICATORS_TEST__main():
                    
                        if 1:  # show currently ACTIVE indicators
                    
                            at_least_one = False
                    
                            num_to_english_dict = {
                                25 : 'cyan;style1',
                                24 : 'orange;style2',
                                23 : 'yellow;style3',
                                22 : 'purple;style4',
                                21 : 'dark-green;style5',
                                #31 : 'red;mark',
                                }
                    
                            for i in list(num_to_english_dict.keys()):
                    
                                english = 'indic#{0:02}'.format(i)
                                try:
                                    english += '[{}]'.format(num_to_english_dict[i])
                                except KeyError:
                                    pass
                    
                                for tup in highlight_indicator_range_tups_generator(i):
                                    (start, end) = tup
                                    print('pos{s}-{e}      {eng}      {text}'.format(
                                        s=start,
                                        e=end,
                                        eng=english,
                                        text=editor.getTextRange(start, end),
                                        )
                                    )
                                    at_least_one = True
                    
                            if not at_least_one:
                                print("no indicators active")
                    
                    #-----------------------------------------------------------------------------
                    
                    INDICATORS_TEST__main()
                    

                    Style some text, then run the script and observe something like this in the PythonScript console window:

                    pos25-29      indic#24[orange;style2]      from
                    pos127-131      indic#24[orange;style2]      from
                    pos696-700      indic#24[orange;style2]      from
                    pos1855-1874      indic#21[dark-green;style5]      num_to_english_dict
                    pos2127-2146      indic#21[dark-green;style5]      num_to_english_dict
                    pos2266-2285      indic#21[dark-green;style5]      num_to_english_dict
                    
                    • Alan KilbornA
                      Alan Kilborn @hcchen
                      last edited by Alan Kilborn

                      @hcchen said in Regular expression size limited 2048 chars How to make it longer?:

                      2. use the logic to find matched lines.

                      Well, you could follow the advice in the thread I pointed you to before, when I said:

                      Your boolean logic could be accomplished with help from THIS information.

                      But since you said earlier that you want to do ANDs and ORs, you might benefit from thinking about the following search regular expression:

                      (?-s)^(?=.*?and1)(?=.*?and2)(?=.*?and3)(?=.*?(?:or1|or2|or3)).+

                      which will match lines containing all of the AND expressions and one (or more) of the OR expressions.

                      Using that base expression you can alter it as needed to fit the scenario you need. For example, if at the moment you don’t need AND conditions, you just remove those parts from the regular expression.
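As a rough sketch of how such an expression could be generated from keyword groups, the helper below emits one lookahead per group, with the group's keywords joined by `|`. The function name and the sample groups are hypothetical. Python's `re` does not accept a leading `(?-s)`, so the prefix is added only to the string meant for Notepad++; `re.escape` follows Python's escaping rules, which are assumed (not verified here) to be acceptable to Notepad++ as well.

```python
import re

def build_and_of_ors_regex(groups):
    """Build ^(?=.*?(?:a|b))(?=.*?(?:c|d)).+ from a list of keyword lists."""
    lookaheads = "".join(
        "(?=.*?(?:{}))".format("|".join(re.escape(kw) for kw in group))
        for group in groups
    )
    return "^" + lookaheads + ".+"

# Hypothetical groups, reusing the keywords from earlier in the thread.
groups = [
    ["windows 10", "win10", "window10"],
    ["blue screen", "BSOD"],
    ["blue tooth", "bt"],
]

pattern = build_and_of_ors_regex(groups)   # usable with Python's re module
notepad_pattern = "(?-s)" + pattern        # variant intended for the Find dialog
```

A single AND term is just a group with one keyword, so the same helper covers both the AND and the OR parts of the base expression.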

                      • hcchenH
                        hcchen @Alan Kilborn
                        last edited by

                        @Alan-Kilborn The first thing I did this morning was confirm that this sample code works very well! Wow, it’s very useful. I wish that I can send your kid a Disney land ticket to show how much happiness your help has brought us.

                        • hcchenH
                          hcchen @Alan Kilborn
                          last edited by

                          @Alan-Kilborn is it possible to call the ‘find’ dialog box? So we can easily get all matched lines to the ‘Search results’ window.

                          • Alan KilbornA
                            Alan Kilborn @hcchen
                            last edited by

                            @hcchen said in Regular expression size limited 2048 chars How to make it longer?:

                            I wish that I can send your kid a Disney land ticket

                            My kid wishes that too. :-)
                            It wasn’t much effort, though, as I already had the sample code I posted, and the logical-search regular expressions have been discussed before.

                            is it possible to call the ‘find’ dialog box?

                            It is possible but it is not easy.

                            So we can easily get all matched lines to the ‘Search results’ window.

                            Did you end up not liking the idea of writing your results to a “new” tab?
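For reference, writing the results to a new tab might look roughly like the sketch below. It assumes PythonScript's `editor.getText()`, `notepad.new()` and `editor.addText()`; the function names and the CRLF line ending are illustrative choices. The `Npp` import is guarded only so that the filtering helper can be tried outside Notepad++.

```python
import re

try:
    # Npp is importable only inside Notepad++'s PythonScript plugin.
    from Npp import editor, notepad
except ImportError:
    editor = notepad = None

def filter_lines(text, pattern):
    """Return the lines of text in which the regular expression is found."""
    regex = re.compile(pattern)
    return [ln for ln in text.splitlines() if regex.search(ln)]

def write_matches_to_new_tab(pattern):
    """Copy matching lines of the active document into a fresh 'new N' tab."""
    matches = filter_lines(editor.getText(), pattern)
    notepad.new()  # the new tab becomes active, so 'editor' now targets it
    editor.addText("\r\n".join(matches))  # assumption: Windows line endings
```

Inside PythonScript, `write_matches_to_new_tab(r'^(?=.*?win10)(?=.*?BSOD).+')` would leave the source document untouched and put the hits in the new tab.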

                            • hcchenH
                              hcchen @Alan Kilborn
                              last edited by

                              @Alan-Kilborn Thank you very much for the valuable answers. In the end, I pop up a message box that informs the user that the RegEx will be copied to the clipboard, OK or Cancel? That’s all. Users can then easily use the ‘Find’ dialog. I am so glad to have finished this magic so well.
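A minimal sketch of that final step, not the poster's actual script: it assumes PythonScript's `notepad.messageBox()`, the `MESSAGEBOXFLAGS.OKCANCEL` and `MESSAGEBOXFLAGS.RESULTOK` constants, and `editor.copyText()`. The helper names, the dialog title and the length warning are illustrative additions; the 2046 figure is the limit quoted earlier in this thread.

```python
try:
    # Npp is importable only inside Notepad++'s PythonScript plugin.
    from Npp import editor, notepad, MESSAGEBOXFLAGS
except ImportError:
    editor = notepad = MESSAGEBOXFLAGS = None

FIND_LIMIT = 2046  # Find/Replace text limit quoted earlier in this thread

def prompt_text(regex_text):
    """Compose the message shown to the user before copying."""
    msg = "The RegEx ({} chars) will be copied to the clipboard.".format(len(regex_text))
    if len(regex_text) > FIND_LIMIT:
        msg += "\nWarning: longer than the {}-char Find dialog limit.".format(FIND_LIMIT)
    return msg

def offer_regex_on_clipboard(regex_text):
    """Ask OK/Cancel; on OK, put the regex on the clipboard and return True."""
    answer = notepad.messageBox(
        prompt_text(regex_text),
        "Style token search",  # hypothetical dialog title
        MESSAGEBOXFLAGS.OKCANCEL,
    )
    if answer == MESSAGEBOXFLAGS.RESULTOK:
        editor.copyText(regex_text)
        return True
    return False
```

After clicking OK, the user pastes the expression into the Find dialog with Regular expression mode selected and runs Find All in Current Document to fill the Search results window.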

                              • Alan KilbornA
                                Alan Kilborn @hcchen
                                last edited by

                                @hcchen said in Regular expression size limited 2048 chars How to make it longer?:

                                In the end, I pop up a message box that informs the user that the RegEx will be copied to the clipboard, OK or Cancel? That’s all. Users can then easily use the ‘Find’ dialog

                                That’s a nice approach to solving your problem.
                                Sometimes it just takes some discussion and background thinking to get to the right conclusion.
