Community
    • Login

    Filter the data !!!

    Scheduled Pinned Locked Moved Help wanted · · · – – – · · ·
    63 Posts 6 Posters 12.9k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • Alan KilbornA
      Alan Kilborn
      last edited by

      Yea, @PeterJones raises a good point.
      We can’t know for sure that we aren’t aiding nefarious purposes.
      However, the pieces are all there, if they can be put together.
      I think it is best for me to bow out at this point.
      Sorry.

      astrosofistaA 1 Reply Last reply Reply Quote 0
      • astrosofistaA
        astrosofista @Alan Kilborn
        last edited by

        Hi @Alan-Kilborn, @PeterJones, All

        I was about to answer this question when I noticed where it had led. Therefore, as a matter of prudence, I will not post my solution. Instead, I’ll describe it, because while it’s not original, it’s different from Alan’s.

        It is a destructive method based on a logical disjunction, so it is suggested, obviously, to process a copy of the document. The regex reads line by line from the beginning of the document and if the line responds to the desired pattern -let’s say it contains the word ellipsis-, then the regex processes it and loads the desired data into two different groups, and if the line is not compatible then the regex ignores it, meaning that it will not be included into the replacement expression.

        Consequently, after making a Replace All -just one mouse click-, only the 20 lines indicated by OP will remain in the document.

        Have fun!

        1 Reply Last reply Reply Quote 2
        • guy038G
          guy038
          last edited by guy038

          Hi @fake-trum, @alan-kilborn, @peterjones, @astrosofista and All,

          Here is my attempt ! So, starting with the 142 lines of the initial HTML OP’s code, below :

          <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
          <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
          <head>
            <title>MD5 Database - Nitrxgen</title>
            <meta http-equiv="content-type" content="text/html;charset=utf-8"/>
            <meta http-equiv="content-language" content="en-gb"/>
            <meta name="author" content="Nitrxgen"/>
            <meta name="viewport" content="width=device-width, initial-scale=1.0"/>
            <meta name="description" content="A free, instant MD5 lookup service with over 1 Trillion password candidates."/>
            <link href="/css/general.css" rel="stylesheet" type="text/css"/>
            <link href="/css/md5db.css" rel="stylesheet" type="text/css"/>
            <script src="//ajax.googleapis.com/ajax/libs/jquery/1.11.2/jquery.min.js" type="text/javascript"></script>
            <script src="/js/general.js" type="text/javascript"></script>
            <script src="https://www.google.com/recaptcha/api.js?render=6Lcd-pYUAAAAAAEb1ZAHtmdf3gJAmp5AQ8Pk28W9" type="text/javascript"></script>
            <script src="/js/md5db.js" type="text/javascript" defer></script>
          </head>
          <body>
          
          <div id="s_main">
            <div id="s_head">
              <div></div>
              <a href="/">nitrxgen</a>
              <span style="background-color:rgba(255,255,255,.5);color:#FFF;height:35px;line-height:35px;"><b>STAY AT HOME</b></span>
            </div>
            <div id="s_pair">
              <div id="s_menu">
                <a href="/">Home</a>
                <a href="/contact/">Contact</a>
                <a href="/donations/"><img alt="" src="/img/star.png" style="vertical-align:-1px;"/> Donations</a>
                <hr/>
                <a href="/collatz/">Collatz Conjecture</a>
                <a href="/hashgen/">Hash Generator</a>
                <a href="/ntlmcase/">NTLM Case Corrector</a>
                <a href="/youtube_cc/">YouTube CC Downloader</a>
                <hr/>
                <a href="/factorialdb/">Factorial Calculator</a>
                <a class="current" href="/md5db/">MD5 Database</a>
              </div>
              <div id="s_body">
          
                <a class="section" name="md5_database">MD5 Database</a>
                <p>
                  This is a look-up tool for typical unsalted <acronym title="Message Digest v5">MD5</acronym> cryptographic hashes. The
                  database currently contains <acronym title="Or, exactly 1,127,962,538,784 passwords"><b>1.1+ trillion</b></acronym>
                  passwords.
                </p>
                <p>
                  To use this service, please use the <a href="https://www.nitrxgen.net/md5db_info/#api"><b>the dedicated API</b></a>.
                </p>
          
                <hr/>
          
                <p></p>
                <!-- <hr/> -->
          
                <a class="section" name="rfh">Recently Found Passwords</a>
                <p></p>
                <!-- IF YOU REALLY WANT TO CRAWL THIS BIT, GO FIND THE XML LOCATION IN THE JAVASCRIPT -->
                <!-- this bit MAY change in the future to use Server-Sent Events instead, so don't assume the XML will forever be available -->
                <div class="md5db_rfp">
                  <div><div>Hash Value</div><div>Password</div><div>Hits</div><div>Found By</div><div>When</div></div>
                  <div><div>37fdf1254303be28b01538692425c1a0</div><div class="ellipsis">nFpRJC5166</div><div>874</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>a28a5338b9bdf5946f164091b180d4c7</div><div class="ellipsis">e7219089</div><div>5</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>3b6b878850b5858771a83e0a270313f9</div><div class="ellipsis">dfh333</div><div>5</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>a93c6c7f2c3af560ed647a05a83318b8</div><div class="ellipsis">12butterflies</div><div>178</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>627e25817432ff801ccce621f39e4ff2</div><div class="ellipsis">uZ005287</div><div>6</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>43ba96c671cd4e4bec558fc82838dea9</div><div class="ellipsis">thegr81</div><div>177</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>0f5347444c2907992c7aea817b723644</div><div class="ellipsis">cvbnuiop82</div><div>183</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>428ca2a132023d13ec3d73af48ce2b6a</div><div class="ellipsis">540322</div><div>28</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>bddd7f92b46ae022c5b590a22459634d</div><div class="ellipsis">jo08jo02</div><div>101</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>c0a741e5e2fb2e3df81c1b003547825a</div><div class="ellipsis">cyl1008</div><div>9</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>45fd035ccac01f33baa48e91fb014dca</div><div class="ellipsis">6eu5v7sLwI</div><div>251</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>6f3a1642b22f6e816a2979963a3b2dff</div><div class="ellipsis">19860613</div><div>47</div><div>nitrxgen</div><div>0 secs ago</div></div>
                  <div><div>d74864ba33eb47f5b9be5a6e37d9fc20</div><div class="ellipsis">peduna5</div><div>1</div><div>nitrxgen</div><div>1 sec ago</div></div>
                  <div><div>2461b606819363e71e4f97b2b5ded126</div><div class="ellipsis">19770531</div><div>45</div><div>nitrxgen</div><div>1 sec ago</div></div>
                  <div><div>4cd0efe4070757d2f6baeeac21cdb320</div><div class="ellipsis">Fripouille76</div><div>365</div><div>nitrxgen</div><div>1 sec ago</div></div>
                  <div><div>112f30e72454a80f8a9e6168437cee4c</div><div class="ellipsis">021440</div><div>11</div><div>nitrxgen</div><div>1 sec ago</div></div>
                  <div><div>c1775d5ee5751af2492bba1cc680fbd7</div><div class="ellipsis">Strife1!</div><div>234</div><div>nitrxgen</div><div>1 sec ago</div></div>
                  <div><div>0912d4922fa5b8ec600b8ecaf3558293</div><div class="ellipsis">7yjv5lzO7Y</div><div>243</div><div>nitrxgen</div><div>1 sec ago</div></div>
                  <div><div>d2f64eeb1a1ae8eeff8288e6ccc500d3</div><div class="ellipsis">EDGAR14$</div><div>105</div><div>nitrxgen</div><div>1 sec ago</div></div>
                  <div><div>5adc95dfd45421d8e0522c90c54a2d4d</div><div class="ellipsis">hZnCinANX</div><div>118</div><div>nitrxgen</div><div>1 sec ago</div></div>
                </div>
                <p style="line-height: 20px;">&nbsp;</p>
                <hr/>
          
          
                <p>
                  <b>GOOD NEWS</b> &mdash; A tool to allow users to paste hashes and have them checked against this database will be
                  available very soon. It's 100% in the works. It will allow full speed lookups depending how many concurrent sessions
                  there are. The time consuming part of this is making sure it won't be abused. Please check the
                  <a href="/changelog/">Changelog</a> for further updates. &ndash; 25th November, 2019.
                </p>
                <hr/>
                <a class="section" name="statistics">Live Statistics</a>
                <p>
          There is a grandtotal of <span id="stats_s1" style="font-weight:bold;">27,002,118,120</span> user hash requests made to this database, <span id="stats_s2" style="font-weight:bold;">178,851,726</span> are of unique hashes (about <span id="stats_s3" style="font-weight:bold;">0%</span> of grandtotal). Out of the grandtotal number of requests, <span id="stats_s4" style="font-weight:bold;">26,389,883,116</span> were successful or cracked (about <span id="stats_s5" style="font-weight:bold;">97%</span>). Regardingly only unique hashes, <span id="stats_s6" style="font-weight:bold;">143,451,392</span> were successful or cracked (about <span id="stats_s7" style="font-weight:bold;">80%</span>).      </p>
          
                <a class="section" name="gpu">GPU Processing</a>
                <p>
                  Regular visitors may notice results showing in the table above as being found by "nitrx-gpu", these are cracked locally
                  by GPU power in real time. When a hash you submit is not found, it will be queued for GPU cracking at some point in the
                  future. Only when it is cracked by GPU will your unfound hash become found for the next time it's requested. The moment
                  it gets cracked, it will appear in the table above. Similarly with passwords as "# NOT MD5 #" means the hash was cracked
                  but not using the MD5 algorithm and will not be displayed.
                </p>
          
                <a class="section" name="information">Information</a>
                <p class="paper">
                  <b>Main article</b>: <a href="/md5db_info/">MD5 Database - Information</a><br/>
                  <b>Main article</b>: <a href="/md5db_info/#api">MD5 Database - API</a>
                </p>
                <p>
                  The only data stored as a result of using this tool is the MD5 hash you willingly submit. Invalid form/API inputs are
                  stored for the sake of monitoring unknown/malicious behaviour. Such things like IP addresses, cookies, HTTP headers,
                  anything about you, your client or your connection, etc. are NOT stored.
                </p>
                <p>
                  Do not contact me about hacking or accessing online accounts for any reason. Do not ask to access the list of passwords
                  or hashes users submit. I do not condone any illegal or malicious activity; do not use this tool if that is your
                  intention. Read more in the main article links above.
                </p>
                <p>
                  This page loads an external script from Google called <i>reCAPTCHA v3</i> which is used to collect behavioural
                  information of requests to determine if they're real users or bots. This information will eventually be used in new and
                  upcoming features to combat automated requests from bots as it may place unwanted load on the server. For more
                  information about Google's reCAPTCHA, please view Google's <a href="https://policies.google.com/privacy">Privacy Policy</a>
                  and <a href="https://policies.google.com/terms">Terms of Service</a>.
                </p>
          
              </div>
            </div>
          
            <div id="s_tail">
              &copy; Copyright 2008-2020: Nitrxgen, all rights reserved.<br/>
              XHTML 1.0 valid and CSS3 valid.<br/>
              Source last modified 141 days ago.
            </div>
          </div>
          
          </body>
          </html>
          

          Then, @fake-trum, the following regex S/R :

          SEARCH (?-is)^(?!.*[[:xdigit:]]{32}).*\R|^\h+<div><div>|(</div><div class="ellipsis">)|</div><div>.+

          REPLACE ?1\:

          with the Wrap around option ticked and the Regular expression search mode selected and a click on the Replace All button would give your expected text :

          37fdf1254303be28b01538692425c1a0:nFpRJC5166
          a28a5338b9bdf5946f164091b180d4c7:e7219089
          3b6b878850b5858771a83e0a270313f9:dfh333
          a93c6c7f2c3af560ed647a05a83318b8:12butterflies
          627e25817432ff801ccce621f39e4ff2:uZ005287
          43ba96c671cd4e4bec558fc82838dea9:thegr81
          0f5347444c2907992c7aea817b723644:cvbnuiop82
          428ca2a132023d13ec3d73af48ce2b6a:540322
          bddd7f92b46ae022c5b590a22459634d:jo08jo02
          c0a741e5e2fb2e3df81c1b003547825a:cyl1008
          45fd035ccac01f33baa48e91fb014dca:6eu5v7sLwI
          6f3a1642b22f6e816a2979963a3b2dff:19860613
          d74864ba33eb47f5b9be5a6e37d9fc20:peduna5
          2461b606819363e71e4f97b2b5ded126:19770531
          4cd0efe4070757d2f6baeeac21cdb320:Fripouille76
          112f30e72454a80f8a9e6168437cee4c:021440
          c1775d5ee5751af2492bba1cc680fbd7:Strife1!
          0912d4922fa5b8ec600b8ecaf3558293:7yjv5lzO7Y
          d2f64eeb1a1ae8eeff8288e6ccc500d3:EDGAR14$
          5adc95dfd45421d8e0522c90c54a2d4d:hZnCinANX
          

          I suppose, @astrosofista, that is something similar to your regex S/R ;-))

          Best Regards,

          guy038

          astrosofistaA 1 Reply Last reply Reply Quote 1
          • astrosofistaA
            astrosofista @guy038
            last edited by

            @guy038 said in Filter the data !!!:

            I suppose, @astrosofista, that is something similar to your regex S/R ;-))

            Hi @guy038, All:

            I think so, as both approaches are destructive. Yours looks nicer, mine seems simpler in the sense that the techniques used are more basic -no look-arounds or POSIX character classes, for example- and also in terms of the logical structure, since the alternation has only two members, A|B. A describes the line to match, taking care to capture via negative classes both the hash and the password -so if it is only wanted the last one it is easy to deliver it-, and B deals with the unwanted lines, it’s a basic ^.*\R.

            The replacement expression is, as you surely guessed, ?1$1\:$2\n.

            Best Regards.

            Alan KilbornA 1 Reply Last reply Reply Quote 1
            • Alan KilbornA
              Alan Kilborn @astrosofista
              last edited by

              @astrosofista @guy038

              I think something is being missed here. First, should we truly be helping out the OP when we suspect we might only be aiding evil purposes? This is directed more to @guy038 because @astrosofista already acknowledged this.

              Second, before it dawned on me (by @PeterJones hitting me over the head with it) that we might have a bad situation brewing, I already gave the answer for anyone that cared to follow it:

              • reference the other thread I linked early on, where @guy038 provided the general solution
              • use the regex I linked earlier in this thread which even included the capturing groups needed for the eventual (specific) solution!
              astrosofistaA 1 Reply Last reply Reply Quote 1
              • guy038G
                guy038
                last edited by guy038

                Hello, @fake-trum, @alan-kilborn, @peterjones, @astrosofista and All,

                Of course, I gave a solution, but you must admit that my post was quite succinct. As we say in France: the minimum trade union discourse ;-))

                I mean that I wanted to express my disapproval and say that @fake-trum should have been more patient to fully examine our solutions, before giving up !

                Perhaps it would have been better not to provide a solution at all, given that the PO did not want to get involved any further !

                But the power and compactness of the regular expression code prevented me from doing so ;-))) So beautiful !

                Cheers,

                guy038

                1 Reply Last reply Reply Quote 1
                • PeterJonesP
                  PeterJones
                  last edited by

                  @Alan-Kilborn said in Filter the data !!!:

                  hitting me over the head with it

                  Well, I wasn’t trying to be violent to the regulars. I just saw the signs of hash/password pairs, and I couldn’t tell from the downloaded source code whether it was one of the “has my password been hacked” white-hat sites, or “here’s a list of password hashes for infiltrating poorly-written logins” black-hat-sites. The OP’s response wasn’t overly clarifiying.

                  Unfortunately, I realized last night while trying to fall asleep what regex I should have responded with, rather than my openly-antagonistic lingual response. It wouldn’t have been a solution to the OP’s question, but it might have helped the OP. See if you can figure out what it does before running it on the example data.

                  • FIND: (?s)(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(.)?(\Z)?
                  • REPLACE: (?1\x{49})(?2\x{20})(?3\x{57})(?4\x{49})(?5\x{4C})(?6\x{4C})(?7\x{20})(?8\x{4E})(?9\x{4F})(?10\x{54})(?11\x{20})(?12\x{42})(?13\x{52})(?14\x{55})(?15\x{54})(?16\x{45})(?17\x{20})(?18\x{46})(?19\x{4F})(?20\x{52})(?21\x{43})(?22\x{45})(?23\x{20})(?24\x{50})(?25\x{41})(?26\x{53})(?27\x{53})(?28\x{57})(?29\x{4F})(?30\x{52})(?31\x{44})(?32\x{53}\x{0D}\x{0A})(?33\x{0D}\x{0A}\x{0D}\x{0A}\x{2D}\x{2D}\x{20}\x{73}\x{69}\x{67}\x{6E}\x{65}\x{64}\x{2C}\x{20}\x{74}\x{68}\x{65}\x{20}\x{65}\x{78}\x{2D}\x{73}\x{63}\x{72}\x{69}\x{70}\x{74}\x{2D}\x{6B}\x{69}\x{64}\x{64}\x{69}\x{65})
                  Alan KilbornA 1 Reply Last reply Reply Quote 0
                  • Alan KilbornA
                    Alan Kilborn @PeterJones
                    last edited by

                    @PeterJones said in Filter the data !!!:

                    hitting me over the head with it

                    Slight language misinterpretation: I meant it more as me “getting hit by a lightening bolt of realization”…after you made it plain what could be going on.

                    1 Reply Last reply Reply Quote 1
                    • astrosofistaA
                      astrosofista @Alan Kilborn
                      last edited by

                      Hi @Alan-Kilborn, All

                      What can I say! @guy038’s solution is very interesting and I don’t blame him for posting it. It’s worthy of analysis - at least I learned something - and I think it’s more geared towards the regulars on the list than to OP. As for my answer, it only describes a solution and is understandable only for those who know regular expressions, so I don’t see anything wrong with it.

                      To be honest, my first reaction was to suggest the direct selection of the passwords by means of Eko’s PS, something that doesn’t take more than a couple of seconds. However, as OP didn’t seem willing to learn anything and it was necessary to guide her/him to install the plugin and the script, I gave up this approach. But in the meantime I had noticed a different solution than yours, I tried it, it worked correctly, but didn’t publish it for the reasons seen.

                      Well, enough of this for me.

                      Now I would like to change the subject of the conversation a little, taking advantage of the fact that there are almost no new posts.

                      I have noticed that often the length of the regular expressions we are using exceeds by far the extension of the search and replacement fields, making it impossible to display the full expression. This limitation makes it difficult to analyze and understand other people’s expressions and to correct one’s own.

                      Taking into account that there is still some blank space in the Find window, wouldn’t it be a good idea to implement a line wrapping in the search and replacement fields, so that an expression exceeding 36 characters - the maximum displayable in my configuration - continues on the next line and so on until the expression is complete? Even if a limit is set for each field, say 3 lines, these would still give a better picture of the expression than one limited to a single line.

                      A bonus to facilitate the analysis and construction of regular expressions would be the implementation of a colored syntax to highligth groups and alternations at a glance - by the way, maybe I am not aware and this is currently feasible, you tell me.

                      Of course, I am aware that it is uncommon for the average user to run searches that go beyond the current length of the find field, let alone use regular expressions, so these features would not directly benefit most users. However I still find them valuables and I think they would be useful additions to Notepad++. Having made this caveat, I would like to hear your opinions. If these topics have been discussed before, I would appreciate links to those discussions.

                      Sorry for the long post :)

                      EkopalypseE Alan KilbornA 2 Replies Last reply Reply Quote 1
                      • guy038G
                        guy038
                        last edited by guy038

                        Hi, @astrosofista and All,

                        Personally, after dragging on the right, with the mouse, the Find dialog to its maximum, I’m able to type in up to 100 characters, with the monospaced search font ;-))

                        c2bb0b1b-6a01-473a-b384-009c54480ff4-image.png

                        Best Regards,

                        guy038

                        Alan KilbornA astrosofistaA 2 Replies Last reply Reply Quote 3
                        • Alan KilbornA
                          Alan Kilborn @guy038
                          last edited by

                          All,

                          Personally, I don’t widen the Find window for that purpose, but I often do it so that I can see my entire path in the Directory box on the Find in Files tab!

                          @astrosofista The Toolbucket plugin provides multiline Find and Replace boxes, maybe that is to your liking. It’s probably been debated before many times that Notepad++ itself should have bigger boxes for these things, but I can’t cite any references.

                          astrosofistaA 1 Reply Last reply Reply Quote 3
                          • guy038G
                            guy038
                            last edited by guy038

                            Hello @peterjones, @alan-kilborn, @Astrosofista,

                            Oh my God ! This morning, I only realize how stupid and naive I’ve been. So, I was, as they say, completely out of it.

                            I just concentrated on developing a correct syntax of regular expressions, adapted to the OP problem, when I should have read @Alan-kilborn and @Peterjones previous posts more carefully.

                            Like them, I do not want my help to be used to cover up questionable work about passwords. Very sorry for my blatant error of judgment :-(( This will serve as a lesson to me !

                            BR

                            guy038

                            1 Reply Last reply Reply Quote 1
                            • EkopalypseE
                              Ekopalypse @astrosofista
                              last edited by

                              @astrosofista

                              I understand the desire to see everything and would also welcome
                              if someone finds a solution, but right now I see two challenges.
                              If it were multiline search/replace textboxes, then inserting EOLs is possible.
                              How does Npp know that the inserted EOL should not be part of the search expression or replacement pattern?
                              If it is a kind of word wrapping, how can we make sure that it is wrapped at a reasonable position to avoid confusion?

                              Personally, I’d prefer that the incremental search

                              1. would be upgraded by regular expressions
                              2. automatically adjusts to the window width
                              3. provides a shortcut to easily switch to the editor and back again
                              4. and, pure optional but really nice to have, a regex-lexer which colors and check my regexes.
                              Alan KilbornA astrosofistaA 2 Replies Last reply Reply Quote 3
                              • Alan KilbornA
                                Alan Kilborn @Ekopalypse
                                last edited by

                                @Ekopalypse said in Filter the data !!!:

                                How does Npp know that the inserted EOL should not be part of the search expression or replacement pattern?
                                If it is a kind of word wrapping, how can we make sure that it is wrapped at a reasonable position to avoid confusion?

                                Very good points.

                                I’d prefer that the incremental search…

                                Very good feature requests
                                However, I don’t think Incremental Search has mass appeal or is used very much in Notepad++.
                                I have no evidence for this, aside from I don’t recall any questions here about it before.

                                provides a shortcut to easily switch to the editor

                                I just press Esc
                                Not ideal because it closes the window, but it works.

                                EkopalypseE 1 Reply Last reply Reply Quote 1
                                • EkopalypseE
                                  Ekopalypse @Alan Kilborn
                                  last edited by

                                  @Alan-Kilborn

                                  I just press Esc

                                  That was my workaround too :-)

                                  However, I don’t think Incremental Search has mass appeal or is used very much in Notepad++.

                                  Maybe because of the lacking RE feature - but if it would get it then it would be really cool
                                  as, beside from the normal find dialog, it updates its find location while typing.

                                  Alan KilbornA 1 Reply Last reply Reply Quote 0
                                  • Alan KilbornA
                                    Alan Kilborn @Ekopalypse
                                    last edited by

                                    @Ekopalypse said in Filter the data !!!:

                                    Maybe because of the lacking RE feature - but if it would get it then it would be really cool

                                    Since it runs a search at every keystroke, performance problem on huge files?

                                    it updates its find location while typing.

                                    Hmm, thinking of typing .* into this window and having my caret immediately jump from where I was concentrating on my editing to now be at very end of file. :-)

                                    EkopalypseE 1 Reply Last reply Reply Quote 0
                                    • EkopalypseE
                                      Ekopalypse @Alan Kilborn
                                      last edited by Ekopalypse

                                      @Alan-Kilborn

                                      Since it runs a search at every keystroke, performance problem on huge files?

                                      … yes but I would argue … don’t do it, use the find dialog instead :-)

                                      my caret immediately jump from where I was concentrating

                                      as this feature doesn’t exist yet it might be that it doesn’t do what you think it will do :-)
                                      But I get your point, that would be, at least, confusing. :-D

                                      Alan KilbornA 1 Reply Last reply Reply Quote 0
                                      • Alan KilbornA
                                        Alan Kilborn @Ekopalypse
                                        last edited by Alan Kilborn

                                        @Ekopalypse said in Filter the data !!!:

                                        as this feature doesn’t exist yet it might be that it doesn’t do what you think it will do

                                        So current implementation sets active selection to text matching incremental search data.
                                        If you return to the editor, your caret is left at the end of the selected text (the end closer to end-of-file).
                                        Default expectation is it would work same way if there was a regex mode.
                                        Thus, my guess is that .* would leave one’s caret at end-of-file with everything above selected.
                                        I supposed it would have to be (?s).* to be entirely correct.
                                        But, yes, I guess Notepad++ devs could change how it logically works (i.e., leave caret at start of selection, closer to original caret pos)?

                                        1 Reply Last reply Reply Quote 0
                                        • Alan KilbornA
                                          Alan Kilborn @astrosofista
                                          last edited by

                                          @astrosofista said in Filter the data !!!:

                                          implementation of a colored syntax to highligth groups and alternations at a glance - by the way, maybe I am not aware and this is currently feasible

                                          I’m sure not quite what is being asked for, but here’s a curious little Pythonscript.

                                          It takes a regex as its input and then highlights the current file according to the sections of the file that don’t match (yellow), and the overall match (left “uncolored”) and the capturing groups in the regex (group #1 = cyan, group #2 = orange, group #3 = purple, group #4 = dark-green, group #5 = red). Above group #5 I didn’t bother doing.

                                          The reason I left the overall match (group #0) uncolored is that we’d have had overlapping colors that way, and I thought that would have made things less clear.

                                          So if we take the text of the script itself:

                                          # -*- coding: utf-8 -*-
                                          
                                          # see https://community.notepad-plus-plus.org/topic/19240/filter-the-data
                                          
                                          from Npp import editor, notepad
                                          
                                          class T19240(object):
                                              def __init__(self):
                                                  indic_list = [ 23, 25, 24, 22, 21, 31 ]
                                                  for i in indic_list: editor.setIndicatorCurrent(i); editor.indicatorClearRange(0, editor.getTextLength())
                                                  regex = r'(?-s)(notepad|editor)\.(.*?)\(.*?\)'
                                                  regex = notepad.prompt('Enter regex (just Cancel to clear colors from previous run):', '', regex)
                                                  if regex == None or len(regex) == 0: return
                                                  def fill(indic, start_pos, end_pos): editor.setIndicatorCurrent(indic); editor.indicatorFillRange(start_pos, end_pos - start_pos)
                                                  self.remember = 0
                                                  def match_fn(m):
                                                      fill(indic_list[0], self.remember, m.span(0)[0])
                                                      self.remember = m.span(0)[1]
                                                      for grp in range(len(m.groups()) + 1):
                                                          #print(grp, '->', m.span(grp), m.group(grp))
                                                          if 0 < grp <= 5: fill(indic_list[grp], m.span(grp)[0], m.span(grp)[1])
                                                  editor.research(regex, match_fn)
                                                  fill(indic_list[0], self.remember, editor.getTextLength())
                                          
                                          if __name__ == '__main__': T19240()
                                          

                                          and we run the script on that, and accept the suggested regex, we get:

                                          a16b2690-8a5d-46c6-9b44-22efdbdbec96-image.png

                                          astrosofistaA 1 Reply Last reply Reply Quote 1
                                          • guy038G
                                            guy038
                                            last edited by guy038

                                            Hello, @alan-kilborn and All,

                                            I tested your Python script : Works nice :-)

                                            I noticed that the id of styles 1 to 5 are in reverse order, giving their names !

                                            So :

                                            Mark   Style 1  = 25
                                            Mark   Style 2  = 24
                                            Mark   Style 3  = 23
                                            Mark   Style 4  = 22
                                            Mark   Style 5  = 21
                                            
                                            Find Mark Style = 31
                                            

                                            I also noted that the first indicator, of the indic_list, is the color with highlights parts of text which do not match the user regex

                                            Personally, I preferred that this specific color was the Find Mark style, which allows me to wipe out the color of all non-matched parts, using the Clear all marks button of the Mark dialog !

                                            And to clear the different highlighting groups, I just use the Remove style > Clear all Styles option, of the Context menu !

                                            Now, Alan, would it be possible to show the $0 group, with the kind of highlighting, in the picture below :

                                            42f9f2fb-63a6-4a65-af0a-34bff5ca34ab-image.png

                                            Just a suggestion, of course ! Only if interested and if you get some spare time !

                                            Best Regards,

                                            guy038

                                            P.S. :

                                            I know, I abuse, but would it also be possible to easily modify the border color of that $0 group ?

                                            Alan KilbornA 1 Reply Last reply Reply Quote 1
                                            • First post
                                              Last post
                                            The Community of users of the Notepad++ text editor.
                                            Powered by NodeBB | Contributors