Replace-All can only replace up to 2046 characters long
-
Use case:
I have a lot of lines of compilation commands I wanted to analyze. The commands were generated by Arduino IDE and are very long, over 26KB per line (yep). I wanted to remove the parts of the commands that list all the included libraries and leave only the options and the name of the file being compiled. I realized that even if I select all of the stuff to replace with an empty space, only the first 2046 characters get replaced.
Cause: the “Find what” text entry only allows 2046 (or 2047) bytes.
Recommendation: Make the text entry longer, or add a “Long find” check box that replaces the 1-line text entry with a text box that can accommodate longer texts.
I couldn’t post even sample lines for anyone to test because this post limits text to 16383 characters. So find yourself a long text and copy paste it several times to try.
Thanks.
-
You really shouldn’t need to keep the entire command in the Find what, especially if you use regular expressions. The regular expression mode was designed for strings where a simple search/replace isn’t sufficient: wildcards and pattern matching keep the search text shorter and more generic.
I am assuming that Arduino compilation commands use -Ic:\long\path\to\library (or something similar) to include libraries, and that you’ve got a bunch of those. With Search Mode set to Regular Expression, the following will match one of them:
Find what: -I[^ ]+
Replacing with empty will delete it; doing a Replace All with that setup will remove all such instances. No reason for >2046 characters in the Find what.
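For anyone who wants to try the pattern outside the editor first, here is a minimal Python sketch of the same idea. The command line is a made-up, shortened stand-in (not real Arduino IDE output), and Python's re module is not the regex engine Notepad++ uses, although this particular pattern behaves the same way in both as far as I know.

```python
import re

# Hypothetical, shortened stand-in for one very long compile command.
command = r"g++ -c -Os -Ic:\long\path\to\library -Ic:\other\lib -o sketch.o sketch.cpp"

# Same pattern as in Find what: "-I" followed by one or more non-space characters.
include_re = re.compile(r"-I[^ ]+")

# Replace every match with nothing, like Replace All with an empty Replace with field.
stripped = include_re.sub("", command)

# The deleted options leave their separating spaces behind, so normalize them.
tidy = " ".join(stripped.split())
print(tidy)
```

Note that the pattern stops at the first space, so an include path that itself contains spaces would only be partly removed.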
-
@John-Liudr I suspect many users of text editors such as Notepad++ regularly deal with long strings of data. I, and I suspect many of us, deal with this by breaking the problem down into smaller chunks. For example, if I’m faced with 20K byte long command lines I’m likely to first break it down into multiple lines using
Search:(?<= )(?=-[a-z0-9])
Replace:\r\n\t
That turns a space followed by a dash and a letter or digit into a space, CR LF TAB, and then the dash and letter/digit.
As the -parameters are now one per line I can do things such as deleting all of the -I parameters using
Search:(?-i)^\t-I.+\R
Replace:(nothing)
Multiple shorter lines is mentally easier for me to both see and to do search/replaces on.
When I’m done I reassemble the thing into one long line again using
Search:(?<= )\r\n\t(?=-[a-z0-9])
Replace:(nothing)
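The split, delete and rejoin sequence above can be sketched in Python for experimenting on a small sample. This is only an approximation under a few assumptions: the sample command is invented, Python's re has no \R so the sketch spells the line break out as \r\n, and re.IGNORECASE stands in for the editor's case-insensitive matching where the expressions above do not say (?-i).

```python
import re

# Hypothetical short stand-in for a 20K-byte command line.
line = "gcc -c -O2 -Ic:\\libs\\a -Ic:\\libs\\b -o out.o main.cpp"

# Step 1: zero-width match after a space and before "-x"; insert CR LF TAB there.
split_text = re.sub(r"(?<= )(?=-[a-z0-9])", "\r\n\t", line, flags=re.IGNORECASE)

# Step 2: delete every line holding a -I parameter (case-sensitive, like (?-i)).
# "\r\n" replaces \R, which Python's re module does not support.
without_includes = re.sub(r"^\t-I.+\r\n", "", split_text, flags=re.MULTILINE)

# Step 3: reassemble into one long line by removing the inserted CR LF TAB.
rejoined = re.sub(
    r"(?<= )\r\n\t(?=-[a-z0-9])", "", without_includes, flags=re.IGNORECASE
)
print(rejoined)
```

As in the editor, a -I parameter that ends up on the very last line has no line break after it, so step 2 would not remove it.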
If I find myself needing to do a long search and/or replace expression then I tend to make them shorter by first inserting what I call anchors into the data. I pick a character such as ~ for my anchor and first make sure that character does not exist in the file. I can then insert anchors into the data and then use those anchors when doing search/replaces. The CR/LF/TABs I added and then removed in the above example are also anchors, with the side benefit that the editor and its search/replace system have many line-oriented features that I can take advantage of.
-
Hello, @john-liudr, @peterjones, @mkupper and All,
@john-liudr, regarding the problem that the text accepted in both the Find what and Replace with zones CANNOT exceed 2,046 characters, here is a general method to search for a long amount of text and to replace it with another long amount of text, too !
In order to practically test this method, I’ll use the License.txt file of my N++ v8.7.1 version
- First, I copy this file five times into another file that I call Replace_Test.txt
- Then, I select, in another file, a large chunk of text which should be inserted as the replacement, five times, in the Replace_Test.txt file, and I copy it to the clipboard with a Ctrl + C action
REMARK : do NOT select the last line-break ( IMPORTANT )
- Now, regarding the long text to search for, which may occur one or several times in the current file, and which must be replaced the same number of times, I use the following regex S/R to define and delete the range of text :
SEARCH (?s)^\QSTART of text\E.+?(?=^\QEND of text\E)
REPLACE Leave EMPTY
Of course, in most cases, we may avoid the \Q and \E regex syntaxes. So, for our practical case, I’ll simply use the regex S/R :
SEARCH (?s-i)^TERMS AND CONDITIONS.+?(?=^END OF TERMS AND CONDITIONS)
REPLACE Leave EMPTY
- Check the Wrap around option and the Regular expression search mode
=> As expected, after the Replace All action, it returns the message Replace All: 5 occurrences were replaced in entire file
Note that the END boundary, i.e. the string END OF TERMS AND CONDITIONS, is still present after this replacement
- Now, open the Mark dialog ( Ctrl + M )
- Select the three options Bookmark line, Purge for each search and Wrap around
MARK (?-i)^END OF TERMS AND CONDITIONS
=> After a click on the Mark All button, we get the message Mark: 5 matches in entire file
- Then, run the Search > Bookmark > Paste to (Replace) Bookmarked Lines menu option
=> At each of the five bookmarked locations of the Replace_Test.txt file, the line END OF TERMS AND CONDITIONS has been replaced by the clipboard contents !
- Finally, save the new contents of the Replace_Test.txt file
Voila !
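For readers who would like to see the logic of the two passes outside the editor, here is a rough Python sketch. It is an illustration only: the sample text and the clipboard string are invented stand-ins, re.escape plays the role of \Q...\E, and the sketch assumes that the paste step replaces the text of the bookmarked line while keeping that line's own line break.

```python
import re

# Hypothetical stand-ins for the file contents and the clipboard.
START = "TERMS AND CONDITIONS"
END = "END OF TERMS AND CONDITIONS"
clipboard = "NEW LONG TEXT\nspanning several lines"  # no trailing line break
block = f"header\n{START}\nold line 1\nold line 2\n{END}\nfooter\n"
text = block * 5  # five copies, like Replace_Test.txt

# Pass 1: delete from the START line up to, but not including, the END line.
# re.escape() quotes the boundaries literally, like \Q...\E in the editor.
range_re = re.compile(
    "^" + re.escape(START) + ".+?(?=^" + re.escape(END) + ")",
    re.DOTALL | re.MULTILINE,
)
text, deleted = range_re.subn("", text)

# Pass 2: replace each remaining END line with the clipboard contents.
# A function is used as the replacement so backslashes in it stay literal.
end_line_re = re.compile("^" + re.escape(END) + ".*$", re.MULTILINE)
text, pasted = end_line_re.subn(lambda m: clipboard, text)

print(deleted, pasted)
```

Because the long replacement text only ever lives in a variable (the clipboard, in the editor), neither pattern has to contain more than the two short boundary strings.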
Best Regards,
guy038
-
An interesting technique, but I think it suffers from a limitation that may make it less useful than it at first appears.
That limitation is maybe hard to put into words, but it involves that the replace point must be a line, and can’t be an arbitrary place in the data. This is because of the use of the Paste to (Replace) Bookmarked Lines command; it always replaces a line and not in the middle of a line.
Consider OP’s statement of “over 26KB per line (yep)”. It is conceivable that someone with that data would want to replace inside that range, and without introducing any line “breaks”, in which case I don’t think your idea will satisfy.
Regardless, I think @PeterJones's response is probably the right one for the OP’s problem. @mkupper also offers some solid advice. And who knows, applying a technique offered by @mkupper might get the data into good form to actually use your solution.
-
As @mkupper said, anchors and regular expressions are the magic spells that I use to untangle lines - or even files, (like blobs of downloaded bank data) - that are hopelessly mangled together.
Depending on how mangled the data is, it may take several tries - with a number of mistakes - before you get things the way you want them.
It can be a process and don’t forget to make periodic backup saves in case you have to “undo” a search-and-replace.
In my case I use Notepad++ to untangle downloaded bank data for tax purposes and, (depending on the size of the file and how mangled it is), it can take days to complete.
Be patient. Print out a regular expression cheat-sheet if you need one. (I did) And don’t give up even if it seems like you’re going backwards.