    Bug when a multi-lines regex is used in the 'Search', 'Replace' or 'Mark' dialog

    • guy038G
      guy038
      last edited by guy038

      Hello, @mkupper and All,

      Really sorry, but I’m rather confused !

      @mkupper, you said :

      I discovered that the Find field is limited to 2046 characters.

      And two lines below, you said :

      There is another upper limit which is that the Find field allows for up to 30,000 characters

      To my mind, the former number is correct, No ??!!


      I also verified, on my old Win-XP laptop, with the last XP version of N++ ( v7.9.2 ) :

      • That it is possible to use a multi-line regex of up to 2,046 characters

      • That the automatic check of the In selection option, although NOT configurable in the Preferences... dialog at that time, takes effect for selection lengths of 1,025 or more

      So, at the time of the v7.9.2 release, the regex character limit and the automatic In selection threshold seemed unrelated! I am not sure that this is still the case nowadays.

      Best Regards,

      guy038

      • Alan KilbornA
        Alan Kilborn @guy038
        last edited by

        @guy038:

        Note, in the status bar, that 1,047 characters have been selected

        Trying to follow along, I don’t see how the above happens.

        • guy038G
          guy038
          last edited by guy038

          Hi, @alan-kilborn, and All,

          Oh, yes, sorry @alan-kilborn, it’s a typo: I generally use the ~~~ string to begin and end a text block. But, this time, I forgot one tilde at a block end :-((

          I edited my first post and corrected it!

          So, just retry: copy the third text section of my initial post to the clipboard with the button in its upper-right corner. It should be OK!

          BR

          guy038

          • Alan KilbornA
            Alan Kilborn @mkupper
            last edited by

            @mkupper said:

            There is another upper limit which is that the Find field allows for up to 30,000 characters. You can’t paste more than 30,000 characters into the field.

            If you’re speaking slightly sloppily, then this makes sense. I’d guess that the limit is actually 32767, the default Windows value for an edit control, see HERE.

            But isn’t it true that in Notepad++, even though you can put more than 2046 characters in the Find what box (e.g. via pasting), it ignores anything over 2046 when executing the search?

            Side Note: Also, what Notepad++ uses as a limit might be 2046 bytes, not characters in a strict sense. I haven’t looked at this lately, but if memory serves if you use multibyte characters in the Find what box data, the limit is going to be less than 2046.

            • mkupperM
              mkupper @Alan Kilborn
              last edited by

              @Alan-Kilborn said in Bug when a multi-lines regex is used in the 'Search', 'Replace' or 'Mark' dialog:

              If you’re speaking slightly sloppily, then this makes sense. I’d guess that the limit is actually 32767, the default Windows value for an edit control, see HERE .

              I hope it was not that sloppy. Here are the repro steps for what I did yesterday, though I suspect that including the repro details makes this a TL;DR-style post.

              1. I am running v8.8.1 though I don’t think the version matters much as all of this repro also worked in v8.7.9.

              2. I used Excel and Notepad++ to construct some “rulers” that start with the length and have markers. The rulers are 1024, 1025, and 70000 characters long.

              I also included a line with the word random which is a word that I use to pre-load the Find what field at times.

              
              random
              
              1024____10________20________30________40________50________60________70________80________90_______100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990______1000______1010______10201024
              1025____10________20________30________40________50________60________70________80________90_______100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990______1000______1010______1020_1025
              (70000 character test string removed as the forum does not allow for more than 16384 character long posts)
              
              

              At times I’ll say to preload the random word into the Find what field. By this I mean:

              1. Select or put the caret on the word random (or anything of your choosing).
              2. Ctrl-F to bring up the Find dialog box.
              3. See that the desired word is in the Find what field
              4. Press Esc to close the Find dialog box.
              5. Move the caret to a blank area of the document. (That’s why I have a blank line above and below the word random in the test data.)
              6. You may do another Ctrl-F to bring up the Find dialog box and Find what field should have the random word in it.

              ** The 30000 character Find what field limit **

              1. Preload the random word into the Find what field.
              2. Load the 70000 character ruler (without the end of line) into the copy/paste buffer.
              3. Move the caret to a blank area so that it’s not sitting on the ruler or some other word.
              4. Ctrl+F to bring up the Find dialog box and then Ctrl+V to paste the ruler into the Find what field.
              5. You should see that the Find what field starts with 70000___10... and ends with ..._____29980_____29990_____30000
              6. Ctrl+A to select all of the Find what field contents, Ctrl+C to load that into the copy/paste buffer, Esc to close the Find dialog box, and Ctrl+V to paste the results into the Notepad++ document.
              7. You should see a 30000 character long line that starts with 70000___10 and ends with _____30000.

              ** The 1024 character automatic select and load into Find what limit **

              This is the item that started this forum thread.

              1. Preload the random word into the Find what field.
              2. Put the caret on the 1025 character ruler and then do Ctrl-F to see what’s in Find what. You will see it’s still the pre-loaded random word.
              3. Try various things such as selecting the ruler and then doing Ctrl+F. You will still get the pre-loaded random word.
              4. Repeat steps 1, 2 and 3 using the 1024 character long ruler. You will discover that this ruler loads into the Find what field.

              ** The 2046 character limit for Notepad++ searches **

              1. First do steps 1 to 5 of the The 30000 character Find what field limit repro that’s above.
              2. Do Enter (or click Find Next) to search for whatever is in the Find what field and then press Esc.
              3. You will see that the first 2046 characters of the 70000 character ruler are selected. I did a Ctrl+C and pasted that to its own line to verify that it’s a 2046 character long line.
              4. Do F3 and Notepad++'s search thing will continue to find/select the first 2046 characters of the 70000 character ruler. (You can make extra copies of this ruler if desired)

              ** Bonus on the 2046 character limit for Notepad++ searches **

              I wondered if I could trick Notepad++ into using more than 2046 characters and so tried this:

              1. I exited Notepad++, edited the config.xml file, and added 2050______2060 to the <Find name="70000___10________20 line so that it ends with 2040______2050______2060" />
              2. I started Notepad++, went to a blank area, and did Ctrl+F. I discovered that Find what is pre-loaded with a 2060 long value that starts with 70000___10________20 and ends with 2040______2050______2060.
              3. Searches though are still limited to 2046 characters.
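The numbers in the repro steps above can be cross-checked with a small plain-Python sketch (run outside Notepad++). It assumes the test ruler is laid out as the quoted fragments suggest: multiples of 10 right-aligned in 10-character fields, with the first characters overwritten by the total length. `make_test_ruler`, `FIELD_LIMIT` and `SEARCH_LIMIT` are names made up for this illustration; the two limits are simply the values observed in this thread, not constants taken from the Notepad++ source.

```python
def make_test_ruler(total_len=70000, step=10):
    """Hypothetical helper: rebuild a ruler like the one used in the repro."""
    # Every multiple of `step`, right-aligned in a `step`-wide field of underscores.
    body = "".join(str(n).rjust(step, "_") for n in range(step, total_len + 1, step))
    label = str(total_len)
    # Overwrite the leading underscores with the length, giving "70000___10...".
    return label + body[len(label):]

FIELD_LIMIT = 30000   # observed: what can be pasted into Find what
SEARCH_LIMIT = 2046   # observed: what a search actually uses

ruler = make_test_ruler()
pasted = ruler[:FIELD_LIMIT]            # what the Find what field shows
searched = pasted[:SEARCH_LIMIT]        # what ends up selected after Find Next
edited = searched + "2050______2060"    # the text appended in config.xml

print(pasted[:10], pasted[-10:])    # 70000___10 _____30000
print(searched[-10:])               # 2040______
print(len(edited), edited[-24:])    # 2060 2040______2050______2060
```

The tail of the 2046-character slice is 2040 followed by six underscores, which is why appending 2050______2060 gives a 2060-character value ending in 2040______2050______2060.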

              ** Bonus on the F3 search **

              I discovered that Notepad++ must have a separate internal buffer that it uses for the F3 search. If you start Notepad++ and then do F3 then nothing happens even though the top of the find history is something that should be in the document.

              Related to this: if you preload something into the find history, it is not available for an F3 search. For example, put the caret on the word Notepad, do Ctrl-F and then Esc. Tapping F3 will not search for Notepad; instead it searches for whatever you had last searched for.

              I ran into this as I was hoping to preload the 2060 character long string by editing config.xml, starting Notepad++, and then doing an F3 to see how long the resulting selection was. Nothing happened as there was nothing in the “text to search for” buffer. Thus I could not use this method to fool Notepad++ into searching for more than 2046 characters.

              Also, when you either pre-load or copy/paste something into the Find what field and then exit Notepad++ without ever searching for that value, it’s not saved to the config.xml file.

              • Thus, while you can copy/paste a 30000 character long string so that it shows up at the top of the search history, this will not get saved to the config.xml file.
              • If you do a search for that 30000 character long string then it seems that it’s first truncated to 2046 characters and it then does the search. I believe it truncates first as the Find what field is truncated on the spot when you click the [Find Next] button.
              • Thus the dialog box does not let you search for more than 2046 characters.
              • The truncated value will also now be at the top of the search history, and when you exit Notepad++ the 2046 character long string gets written to the config.xml file.

              I did not do any testing with Notepad++ macros or PythonScript to see if I could use more than 2046 characters in a search pattern.

              But isn’t it true that in Notepad++, even though you can put more than 2046 characters in the Find what box (e.g. via pasting), it ignores anything over 2046 when executing the search?

              That seems to be true and it also truncates the field to 2046 characters when adding it to the search history.

              Side Note: Also, what Notepad++ uses as a limit might be 2046 bytes, not characters in a strict sense. I haven’t looked at this lately, but if memory serves if you use multibyte characters in the Find what box data, the limit is going to be less than 2046.

              That’s correct which is why I used plain ASCII for these tests. I’ve forgotten if the limits in this area are related to UTF-8 encoding and/or some characters need more bits than the seven needed for plain ASCII.
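To illustrate the bytes-versus-characters point, here is a minimal plain-Python sketch. It assumes the 2046 limit is counted in UTF-8 bytes, which this thread suggests but does not confirm; `SEARCH_LIMIT_BYTES` and `fits` are names invented for the example. Plain ASCII characters take one byte each in UTF-8, while other characters take two to four.

```python
SEARCH_LIMIT_BYTES = 2046  # assumption: the observed limit counts UTF-8 bytes

def fits(pattern, limit=SEARCH_LIMIT_BYTES):
    """Hypothetical check: would the pattern fit in a byte-based limit?"""
    return len(pattern.encode("utf-8")) <= limit

ascii_pattern = "a" * 2046
accented_pattern = "\u00e9" * 2046   # U+00E9 (e with acute) is 2 bytes in UTF-8

print(len(ascii_pattern), len(ascii_pattern.encode("utf-8")))        # 2046 2046
print(len(accented_pattern), len(accented_pattern.encode("utf-8")))  # 2046 4092
print(SEARCH_LIMIT_BYTES // len("\u00e9".encode("utf-8")))           # 1023
```

Under that assumption, a pattern made only of two-byte characters would hit the limit after 1023 characters instead of 2046.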

              • mkupperM
                mkupper @mkupper
                last edited by mkupper

                Follow up on the previous post as the forum software did not allow for a 70000 character long ruler style test string. I had used Excel.

                10	=REPT("_",10-LEN(A1))&TEXT(A1,"0")
                =A1+10	=REPT("_",10-LEN(A2))&TEXT(A2,"0")
                =A2+10	=REPT("_",10-LEN(A3))&TEXT(A3,"0")
                =A3+10	=REPT("_",10-LEN(A4))&TEXT(A4,"0")
                ...
                

                Repeat that for 7000 rows. Row 7000 has:

                =A6999+10	=REPT("_",10-LEN(A7000))&TEXT(A7000,"0")
                

                with the result being:

                10	________10
                20	________20
                30	________30
                ...
                70000	_____70000
                

                I then copy/pasted column B into Notepad++,
                verified that there were 7000 lines, and then used search/replace
                to remove the \R (the line endings) to generate:

                ________10________20________30 ... _____70000
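For anyone without Excel at hand, the same string can be produced with a few lines of plain Python. This is only a sketch of the spreadsheet logic above; `make_ruler` is a made-up helper name, and `str.rjust` plays the role of the REPT/TEXT formula.

```python
def make_ruler(total_len=70000, step=10):
    """Hypothetical helper mirroring the spreadsheet: one field per multiple of 10."""
    # rjust pads on the left with underscores, like
    # REPT("_", 10 - LEN(n)) & TEXT(n, "0") does for each row.
    return "".join(str(n).rjust(step, "_") for n in range(step, total_len + 1, step))

ruler = make_ruler()
print(len(ruler))     # 70000
print(ruler[:30])     # ________10________20________30
print(ruler[-10:])    # _____70000
```

Joining the fields directly also skips the step of removing the line endings afterwards.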
                
                • guy038G
                  guy038
                  last edited by guy038

                  Hello, @mkupper, @alan-kilborn and All,

                  @mkupper, I repeated your whole process and, indeed, your method and explanations were very instructive!


                  I just do NOT understand one point, yet. You said, in a previous post :

                  Hopefully, both the normal and extended search allow for 30,000 character searches.

                  Well, repeating points 1 to 5 of the The 30000 character Find what field limit section, with the Find dialog pre-configured in Normal or Extended mode, it just matches the first 2,046 characters of the 70000___10 string, although the Find field does contain the first 30,000 chars of the 70000___10 string ?!

                  Best regards,

                  guy038

                  • Alan KilbornA
                    Alan Kilborn @mkupper
                    last edited by

                    @mkupper said:

                    ** The 1024 character automatic select and load into Find what limit **
                    This is the item that started this forum thread.

                    As this IS what started the thread, I’d like to concentrate on that specific aspect.

                    While the magic number is 1024 this is apparently unrelated to the magic number 1024 found in Settings / Preferences / Searching (tab) setting for Minimum Size for Auto-Checking “In-selection”.

                    To evaluate the “apparently unrelated” part, I started changing the value for Minimum Size for Auto-Checking “In-selection”. And, unless I missed something, it does actually appear related. For example, creating 1022 and 1023 “ruler” lines and then changing the number in the box to 1023 yields what I’d expect (based on the 1024 behavior): that is, invoking Ctrl+F with the caret in the 1023 ruler checks In selection, while invoking it with the caret in the 1022 ruler unchecks it.

                    Thus I don’t think that 1024 is a “magic number”; it’s a default setting value, with no magic.

                    But I still feel like I’m missing something about the point @mkupper was trying to make about this.

                    • Alan KilbornA
                      Alan Kilborn
                      last edited by

                      30000 vs. 32767

                      I’ve no idea where 30000 originates. A quick search of the Notepad++ source code doesn’t find it as a literal. So… apologies to @mkupper about the “sloppily” thing; I did think you were speaking in “ballpark” terms.

                      • Alan KilbornA
                        Alan Kilborn
                        last edited by Alan Kilborn

                        Instead of Excel, why not use a bit of PythonScript to generate the “ruler” lines?:

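                                # Runs inside the PythonScript plugin, which provides `editor`.
                                # Builds "ruler" lines of length 1020..1030 and copies them to the clipboard.
                                accum = ''
                                for j in range(1020, 1030 + 1):
                                    desired_len = j
                                    des_len_as_str = str(desired_len)
                                    s = des_len_as_str  # each line starts with its own length
                                    tens_count = 0
                                    while True:
                                        # A marker may only start at column 10, 20, 30, ...
                                        if (len(s) + 1) % 10 == 0:
                                            # Add the marker only if the line extends at least 10 columns
                                            # past it, which leaves room for the trailing length.
                                            if (tens_count + 2) * 10 <= desired_len:
                                                s += str((tens_count + 1) * 10)
                                                tens_count += 1
                                        if len(s) >= desired_len: break
                                        s += '_'
                                    # Overwrite the tail so the line also ends with its length.
                                    s = s[:-len(des_len_as_str)] + des_len_as_str
                                    accum += s + '\r\n'
                                editor.copyText(accum)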
                        

                        The example above generates ruler lines of length 1020 through 1030, inclusive. The ruler data ends up in the clipboard after the script runs.

                        Note that mine might be different from the earlier ruler lines discussed – I chose that the intermediate numbers start in their indicated column, e.g. after you paste the output of the script into a new tab, if you put the caret just to the left of the 8 in 890, the status bar will indicate Col: 890.

                        To select 890 characters from that same example line, put the caret between the 8 and the 9 and then press Shift+Home.

                        Here’s some output from the script:

                        1020_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010___1020
                        1021_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010____1021
                        1022_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010_____1022
                        1023_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010______1023
                        1024_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010_______1024
                        1025_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010________1025
                        1026_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010_________1026
                        1027_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010__________1027
                        1028_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010___________1028
                        1029_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010____________1029
                        1030_____10________20________30________40________50________60________70________80________90________100_______110_______120_______130_______140_______150_______160_______170_______180_______190_______200_______210_______220_______230_______240_______250_______260_______270_______280_______290_______300_______310_______320_______330_______340_______350_______360_______370_______380_______390_______400_______410_______420_______430_______440_______450_______460_______470_______480_______490_______500_______510_______520_______530_______540_______550_______560_______570_______580_______590_______600_______610_______620_______630_______640_______650_______660_______670_______680_______690_______700_______710_______720_______730_______740_______750_______760_______770_______780_______790_______800_______810_______820_______830_______840_______850_______860_______870_______880_______890_______900_______910_______920_______930_______940_______950_______960_______970_______980_______990_______1000______1010______1020___1030
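The column claim can also be checked outside Notepad++. Below is a plain-Python sketch of the same loop wrapped in a function, so that it needs no `editor` object; `ruler_line` is a name made up for this illustration and is not part of PythonScript.

```python
def ruler_line(desired_len):
    """Hypothetical stand-alone version of the ruler loop, for one line."""
    label = str(desired_len)
    s = label                      # the line starts with its own length
    tens_count = 0
    while True:
        # Markers start at columns 10, 20, 30, ... (1-based).
        if (len(s) + 1) % 10 == 0:
            if (tens_count + 2) * 10 <= desired_len:
                s += str((tens_count + 1) * 10)
                tens_count += 1
        if len(s) >= desired_len:
            break
        s += '_'
    # The line also ends with its length.
    return s[:-len(label)] + label

line = ruler_line(1024)
print(len(line))                 # 1024
print(line.index("890") + 1)     # 890, the 1-based column where "890" starts
print(line[:21])                 # 1024_____10________20
```

Because each marker starts in its own column, the text up to and including the 8 of 890 is exactly 890 characters long, which matches the Shift+Home selection described earlier.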
                        