• regex replace performance regression

    15
    2 Votes
    15 Posts
    1k Views
    guy038

    Hi, @cmeriaux, @peterjones, @alan-kilborn, @arkadiuszmichalski and All,

    So, I’m going on testing some text examples, dealing with replacements, bookmarks and replacement modifiers ! Note that I did not consider the Admin case, which is nearly identical !

    First, I used @sasumner’s file data8279.txt ( 300,000 lines ) and I performed two types of tests :

    A global replacement of (?-s)^.*NotepadPP.*\R with Nothing, with the Wrap around option ticked ( First table, below )

    A mark operation on the string NotepadPP, with the Bookmark line and Wrap around options ticked, but not the Match case one, followed by a Search > Bookmark > Remove Bookmarked Lines operation ( Second table, below )

    203,236 occurrences were deleted or were marked then deleted !
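For readers who want to reproduce the first kind of deletion outside Notepad++, here is a minimal sketch in Python. It is only an approximation: Python's `re` module has neither `\R` nor a global `(?-s)`, so the pattern below is an assumed equivalent, and the sample text is made up for illustration.

```python
import re

# Assumed equivalent of (?-s)^.*NotepadPP.*\R for Python's `re` module:
# [^\r\n] plays the role of "dot that never matches a line break" and
# (?:\r\n|\n|\r) stands in for \R (most common line endings only).
LINE_WITH_KEYWORD = re.compile(
    r"^[^\r\n]*NotepadPP[^\r\n]*(?:\r\n|\n|\r)", re.MULTILINE
)


def delete_matching_lines(text):
    """Return (new_text, number_of_deleted_lines)."""
    return LINE_WITH_KEYWORD.subn("", text)


# Hypothetical sample, not taken from data8279.txt
sample = "first line\nuses NotepadPP here\nthird line\nNotepadPP\n"
new_text, deleted = delete_matching_lines(sample)
print(deleted)
print(new_text, end="")
```

Like the replacement described above, this removes each whole line containing the string, line ending included. It says nothing about the timings, which depend on the Notepad++ regex engine.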

    •===============•=============•==========•=================•
    |    Archi-     |   Version   |         User Mode          |
    |               |             |----------•-----------------|
    |    tecture    |  Notepad++  |   Time   |  Ratio x32/x64  |
    •===============•=============•==========•=================•
    |  Win XP x32   |    7.9.2    |  65.0 s  |       -/-       |
    •===============•=============•==========•=================•
    |  Win 10 x32   |    7.9.2    |  47.6 s  |                 |
    •---------------•-------------•----------•      1.27       |
    |  Win 10 x64   |    7.9.2    |  37.5 s  |                 |
    •===============•=============•==========•=================•
    |  Win 10 x32   |    7.9.5    |  47.4 s  |                 |
    •---------------•-------------•----------•      1.27       |
    |  Win 10 x64   |    7.9.5    |  37.4 s  |                 |
    •===============•=============•==========•=================•
    |  Win 10 x32   |     8.0     |  86.2 s  |                 |
    •---------------•-------------•----------•      1.22       |
    |  Win 10 x64   |     8.0     |  70.4 s  |                 |
    •===============•=============•==========•=================•
    |  Win 10 x32   |    8.1.5    |  47.6 s  |                 |
    •---------------•-------------•----------•      1.27       |
    |  Win 10 x64   |    8.1.5    |  37.5 s  |                 |
    •===============•=============•==========•=================•
    |  Win 10 x32   |   8.1.9.2   |  47.5 s  |                 |
    •---------------•-------------•----------•      1.28       |
    |  Win 10 x64   |   8.1.9.2   |  37.2 s  |                 |
    •===============•=============•==========•=================•

    •===============•=============•====================================•
    |    Archi-     |   Version   |             User Mode              |
    |               |             |------------------•-----------------|
    |    tecture    |  Notepad++  |       Time       |  Ratio x32/x64  |
    •===============•=============•==================•=================•
    |  Win XP x32   |    7.9.2    | 10.0 s + 64.2 s  |       -/-       |
    •===============•=============•==================•=================•
    |  Win 10 x32   |    7.9.2    |  4.9 s + 49.1 s  |                 |
    •---------------•-------------•------------------•      1.36       |
    |  Win 10 x64   |    7.9.2    |  2.1 s + 37.5 s  |                 |
    •===============•=============•==================•=================•
    |  Win 10 x32   |    7.9.5    |  4.8 s + 49.1 s  |                 |
    •---------------•-------------•------------------•      1.35       |
    |  Win 10 x64   |    7.9.5    |  2.3 s + 37.6 s  |                 |
    •===============•=============•==================•=================•
    |  Win 10 x32   |     8.0     | 20.0 s + 49.1 s  |                 |
    •---------------•-------------•------------------•      1.31       |
    |  Win 10 x64   |     8.0     | 14.8 s + 37.8 s  |                 |
    •===============•=============•==================•=================•
    |  Win 10 x32   |    8.1.5    |  4.8 s + 49.3 s  |                 |
    •---------------•-------------•------------------•      1.35       |
    |  Win 10 x64   |    8.1.5    |  2.3 s + 37.7 s  |                 |
    •===============•=============•==================•=================•
    |  Win 10 x32   |   8.1.9.2   |  4.8 s + 49.2 s  |                 |
    •---------------•-------------•------------------•      1.37       |
    |  Win 10 x64   |   8.1.9.2   |  2.1 s + 37.4 s  |                 |
    •===============•=============•==================•=================•

    In the second table, I decomposed the total time in two parts :

    Time to bookmark the lines

    Time to delete these lines

    I summed the two values before calculating the ratio x32/x64
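The ratio calculation for the second table can be sketched as follows. The timings are copied from that table; the dictionary layout and the function name are only illustrative choices.

```python
# (time to bookmark, time to delete) in seconds, copied from the second table
timings = {
    "7.9.2":   {"x32": (4.9, 49.1),  "x64": (2.1, 37.5)},
    "7.9.5":   {"x32": (4.8, 49.1),  "x64": (2.3, 37.6)},
    "8.0":     {"x32": (20.0, 49.1), "x64": (14.8, 37.8)},
    "8.1.5":   {"x32": (4.8, 49.3),  "x64": (2.3, 37.7)},
    "8.1.9.2": {"x32": (4.8, 49.2),  "x64": (2.1, 37.4)},
}


def ratio_x32_x64(entry):
    """Add the mark and delete times of each build, then divide x32 by x64."""
    return sum(entry["x32"]) / sum(entry["x64"])


ratios = {version: ratio_x32_x64(entry) for version, entry in timings.items()}
for version, value in ratios.items():
    print(f"{version}: {value:.2f}")
```

The first table needs no summing: its ratio is simply the x32 time divided by the x64 time.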

    Interpretation of the results :

    If we except the special case of the v8.0 version, the results are very similar for the two tables :

    In the first case, the more complicated regex (?-s)^.*NotepadPP.*\R decreases the ratio between the x32 and x64 versions a bit

    In the second case, both the mark operation and the deletion of lines have an impact, but the ratio between the x32 and x64 versions is a bit higher ( ~ 1.35 versus ~ 1.27 )

    Note that, regarding the v8.0 version, in the second table, the performance regression comes from the bad results of the mark operation only !

    I performed a last test, using the same Search and Replace regexes as in my initial issue :

    https://github.com/notepad-plus-plus/notepad-plus-plus/issues/9636

    So the regex S/R :

    SEARCH \w

    REPLACE \U$0

    I then created a file containing 1,000 lines ( every odd one ) with the French text :

    C’est là, près de la forêt, dans un gîte, où régnait un grand capharnaüm, que l’aïeul ôta sa flûte et son bâton de son canoë.

    And I added 1,000 English lines ( every even one ) :

    Here is a example of text, containing the complete French set of accentuated characters, traditionally used.

    After replacement, 184,000 occurrences have been modified :

    •===============•=============•==========•=================•
    |    Archi-     |   Version   |         User Mode          |
    |               |             |----------•-----------------|
    |    tecture    |  Notepad++  |   Time   |  Ratio x32/x64  |
    •===============•=============•==========•=================•
    |  Win XP x32   |    7.9.2    |  18.7 s  |       -/-       |
    •===============•=============•==========•=================•
    |  Win 10 x32   |    7.9.2    |  10.5 s  |                 |
    •---------------•-------------•----------•      2.56       |
    |  Win 10 x64   |    7.9.2    |   4.1 s  |                 |
    •===============•=============•==========•=================•
    |  Win 10 x32   |    7.9.5    |  10.3 s  |                 |
    •---------------•-------------•----------•      2.51       |
    |  Win 10 x64   |    7.9.5    |   4.1 s  |                 |
    •===============•=============•==========•=================•
    |  Win 10 x32   |     8.0     |  38.5 s  |                 |
    •---------------•-------------•----------•      1.41       |
    |  Win 10 x64   |     8.0     |  27.4 s  |                 |
    •===============•=============•==========•=================•
    |  Win 10 x32   |    8.1.5    |  10.4 s  |                 |
    •---------------•-------------•----------•      2.54       |
    |  Win 10 x64   |    8.1.5    |   4.1 s  |                 |
    •===============•=============•==========•=================•
    |  Win 10 x32   |   8.1.9.2   |  10.4 s  |                 |
    •---------------•-------------•----------•      2.54       |
    |  Win 10 x64   |   8.1.9.2   |   4.1 s  |                 |
    •===============•=============•==========•=================•
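The figure of 184,000 occurrences can be cross-checked with a short Python sketch. It rebuilds the 2,000-line test text from the two sentences quoted above and uses `str.upper()` as a stand-in for the `\U$0` replacement. Python's `\w` is assumed to behave like the Notepad++ one on this text, that is, it matches accented letters but not the apostrophe or the punctuation.

```python
import re

# The two sentences exactly as quoted in the post (wording kept verbatim,
# since every word character counts)
FRENCH = ("C’est là, près de la forêt, dans un gîte, où régnait un grand "
          "capharnaüm, que l’aïeul ôta sa flûte et son bâton de son canoë.")
ENGLISH = ("Here is a example of text, containing the complete French set of "
           "accentuated characters, traditionally used.")

# French text on the odd lines, English text on the even lines
lines = []
for _ in range(1000):
    lines.append(FRENCH)
    lines.append(ENGLISH)
text = "\n".join(lines) + "\n"

# SEARCH \w   REPLACE \U$0   (upper-case every single word character)
upper_text, replaced = re.subn(r"\w", lambda m: m.group(0).upper(), text)
print(replaced)
print(upper_text.splitlines()[0])
```

The French sentence holds 94 word characters and the English one 90, so the script should report 1,000 × 184 = 184,000 replacements, in line with the count above.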

    Interpretation of the results :

    Again, if we except the special case of the v8.0 version :

    The results, whatever the version, are quite similar, for each case ( x32 and x64 )

    The ratio x32/x64 is similar to the one of my previous post ( ~ 2.52 ) !

    Best Regards,

    guy038

  • Regex Search With Quotations

    7
    0 Votes
    7 Posts
    4k Views
    DevSrc8

    @peterjones thanks for all your help!

    Is there a “mark as Solved” option for this forum?

  • Sorting columns numerically

    3
    0 Votes
    3 Posts
    865 Views
    Cadaver182

    @astrosofista said in Sorting columns numerically:

    Sort Lines as Integers Ascending

    Thanks a lot, works flawlessly!

  • Big borders around edit space

    2
    0 Votes
    2 Posts
    269 Views
  • Indenting multiple empty lines

    2
    0 Votes
    2 Posts
    423 Views
    PeterJones

    @danz4c ,

    That’s the way it’s designed.

    If you want to get empty lines indented as well, use column select (alt+drag) on the zeroth column, then hit TAB.

  • On Dark mode, the compare not show text

    2
    0 Votes
    2 Posts
    503 Views
    PeterJones

    @mas-cas ,

    See my answer to last week’s “Unreadable lines when comparing in dark mode”.

  • black boxes on all files I open

    4
    0 Votes
    4 Posts
    2k Views
    Alan Kilborn

    @patrick-donovan said in black boxes on all files I open:

    Not a very techy guy

    I think that’s going to have to improve dramatically for you to get very far. :-)

  • Regex: Find all html links that have minimum 3 letters after .com/

    3
    0 Votes
    3 Posts
    269 Views
    Hellena Crainicu

    @neil-schipper thanks.

    WORKS:

    Regex Search: https://website\.com/en\w{1,}
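As a quick check of what this regex accepts, here is a small Python sketch. The sample URLs are made up for illustration. Note that `\w{1,}` means the same as `\w+`, so the pattern requires at least one word character after `en`, that is, at least three characters after `.com/`.

```python
import re

# Same pattern as in the post
LINK = re.compile(r"https://website\.com/en\w{1,}")

# Hypothetical sample links
samples = [
    "https://website.com/english",  # "en" + 5 word characters -> match
    "https://website.com/en",       # nothing after "en"       -> no match
    "https://website.com/fr-page",  # does not start with "en" -> no match
]

matching = [url for url in samples if LINK.search(url)]
print(matching)
```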

  • Bookmarks disappear upon save & reopening file

    3
    0 Votes
    3 Posts
    810 Views
    Alan Kilborn

    Bookmarks are saved as part of the “session”.
    If you close a file, it gets removed from the session, and when reopened, it looks like a new file to the session (thus no bookmarks).

    There’s a demo PythonScript HERE that could be “rounded out” to pseudo-provide a load/save extension for bookmarks.

    For me, and I presume most users, the capability in Notepad++ currently is enough.

  • Replace before

    5
    0 Votes
    5 Posts
    2k Views
    PESTICIDer

    @peterjones First of all, I’m sorry for my failure to provide the necessary data - I mean it. I wrote my response in a hurry and wrongly supposed that you didn’t catch the link to my previous post (sorry for that also) and thought that the info in that post was enough.
    None of this is your problem and I don’t want to make any excuses for this behaviour…just for clarification.

    And BIG THANKS for spending your time replying despite everything above! Your guess was correct and the provided solution does exactly what I need.

    So sorry again and thank you also. Next time, if I need some advice in the future, I’ll try to specify my needs better and provide test data in the first place.

  • "Compare" adds an empty line

    3
    0 Votes
    3 Posts
    695 Views
    PeterJones

    @mrde50ae said in "Compare" adds an empty line:

    Compare plugin 1.5.6.2,

    I’m not even sure where you got that version.
    The current repo, https://github.com/pnedev/compare-plugin/releases, shows v2.0.0 and v2.0.1.

    The older repo, under the previous developer, https://github.com/jsleroy/compare-plugin/tags, has v1.5.4, v1.5.5, and v2.0.0

    Please note : Notepad++ v7.7 from 2019 included a major upgrade to the API that plugins use to communicate with the Notepad++ application (not, as I first wrote, to “Scintilla”, the underlying code for the text editor panes); that update meant that some plugins had to update their code to be compatible with those changes. “Compare v2.0.0 for Notepad++ 7.7” was the release of the plugin for Notepad++ v7.7 and newer; no version of Compare plugin released before that special v2.0.0-release is expected to work with Notepad++ v7.7 or newer.

    So if any version of Compare Plugin prior to that special v2.0.0 works at all with your Notepad++ 7.9.1, it’s by chance, not by design. And really, you should be using the newest v2.0.1 for the best compatibility with Notepad++.

  • again another wildcard find/replace

    2
    0 Votes
    2 Posts
    219 Views
    Alan Kilborn

    @sulton-systems

    find: craft_time="\d+"
    repl: craft_time="1"
    mode: regular expression
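The same find/replace pair can be tried outside Notepad++ with a few lines of Python. This is just a sketch: the sample data is invented for illustration, since the original file is not shown in the thread.

```python
import re


def reset_craft_time(text):
    """Set every craft_time="<digits>" attribute to craft_time="1"."""
    return re.sub(r'craft_time="\d+"', 'craft_time="1"', text)


# Hypothetical sample data
sample = (
    '<item name="sword" craft_time="45"/>\n'
    '<item name="bow" craft_time="120"/>\n'
)
result = reset_craft_time(sample)
print(result, end="")
```

Only values made of digits are touched; an attribute such as `craft_time=""` or `craft_time="fast"` would be left unchanged.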

  • 8.1.9.2 release how did I break it?

    5
    0 Votes
    5 Posts
    740 Views
    Laura Straub

    @peterjones
    Ahhh… Gotcha! Thank you for the clarification. Makes sense.

  • not sure if this can be done...

    5
    0 Votes
    5 Posts
    416 Views
    PeterJones

    @qishq-42 said in not sure if this can be done...:

    im looking for a way to make this simpler is that possible?

    Yes, it’s called programming.

    Regex doesn’t have the concept of counting that your desired algorithm requires. You either have to set up the N! conditionals in the regex, or you have to use a tool – like a programming language – that handles counting and other such conditional tasks much more simply than can be done in regular expressions.

  • How to find numbers in multiline in Notepad++

    21
    0 Votes
    21 Posts
    13k Views
    Alan Kilborn

    So this is a good discussion thread, but the choice to use literal 1, 2, 3 in the examples IMO wasn’t the best for the utmost clarity. :-)

  • Converting to entiteties

    4
    0 Votes
    4 Posts
    2k Views
    Leif Johansen

    @leif-johansen
    I think the “HTML Tag” plugin solves my problem

  • Key words In Language

    8
    0 Votes
    8 Posts
    2k Views
    Lycan Thrope

    @dennis-bareis
    Not that I know of, but then again, I’m new to this aspect of UDL’ing too. However, the one thing that I did learn from reading that document about the UDL is that you need to play with it to make sure your UDL is functioning properly.

    For instance, if you have nested parentheses, you need to set up the nesting properly. Otherwise, if, say, a nested parenthesis starts inside another parenthesis and you didn’t have it set to nest, it may or may not continue highlighting until it hits the second one. That seems to be part of the matching brace aspect of NPP, but if the keywords, parens, braces…etc aren’t set up to nest properly, it will continue highlighting text you think shouldn’t be.

    When you get a chance to get back to it, you may see what I’m saying. :-)

    Good Luck.

    Lee

  • Save As Icon

    3
    0 Votes
    3 Posts
    434 Views
    Cyndi Roether 0

    @neil-schipper Thank you, I’ll try it

  • Extra field/menu after update

    4
    0 Votes
    4 Posts
    673 Views
    chk1xn

    Thanks @peterjones! That explains why I’m only seeing this on one of my laptops … the one where I’m also running Slack…

  • How to find and select all words between select words?

    12
    0 Votes
    12 Posts
    31k Views
    Alan Kilborn

    @marcin-kucharski

    Don’t ask the same question in 2 threads; you asked the same question HERE