    I need a function/plugin to extract only unnecessary text from lines

    15 Posts 4 Posters 347 Views
    • PeterJonesP
      PeterJones @Ragnar Lodbrok
      last edited by PeterJones

      @Ragnar-Lodbrok

      Don’t forget to hit the </> on the toolbar, and paste your data where the forum post editor says code_text.

      Second, these better not be real user passwords, or any private data, you are sharing with the whole internet. This is a public forum, and anyone can read them.

      I really don’t like helping people with password-file search/replace, because it’s too high of a risk that I’m helping someone who is harvesting email/password pairs.

      Since I already started, I will answer this one last question… though I will use moderator powers to change every password field that you’ve shown above, just to make sure.

      –

      But if I had text like this, it wouldn’t be possible.

      The regex I gave assumed that the text to the left and right of the : would be “word characters”, which means letters, numbers, and underscore. Since your original example only included those, that’s all I thought I needed to match. I’ll be less restrictive this time, and assume the rules are as follows:

      • junk at the beginning of a line, ending with a ] followed by one or more whitespace characters, should be removed, even if there’s a colon
        • there isn’t an example of it, but I’ll also no longer assume that the colon-pair is the first “word” on a given line, and will throw away anything before it as junk, as well
      • there will be a start-of-line or space before the colon-separated pair.
      • I will assume the colon-separated pairs cannot contain whitespace (space, tab, newline), but any other character is fair game

      FIND = ^(?:.*\x5D\s|[^:]+\s)?(\S+:\S+).*$(\R)?
      REPLACE = ?1$1$2
      SEARCH MODE = Regular Expression

      That will turn

      zzzzzzzzzzzz0:Azzzzzzzz00 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
      zzzzz000:Azzzz000 | Azzzęzzz zzzzzzzz = 000,0 AA | Azzzzz = 0000
      zzzzzz:zzzzzzz0 | Azzzęzzz zzzzzzzz = 000,00 AA | Azzzzz = 0000
      zzzzzzzz:zz0z0z0 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
      zzzzzzz0:zzzzzz000 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
      zzzzzz.zzzzzz@zz.zz:Az000000 | zzzzzzzzzzz.zz = Azzzzzzóz | 00 = Azzzz Azz/Azz Az
      zzzzzzzzzzzz0:Azzzzzzzz00 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
      zzzzz000:Azzzz000 | Azzzęzzz zzzzzzzz = 000,0 AA | Azzzzz = 0000
      zzzzzz:zzzzzzz0 | Azzzęzzz zzzzzzzz = 000,00 AA | Azzzzz = 0000
      zzzzzzzz:zz0z0z0 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
      zzzzzzz0:zzzzzz000 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
      zzzzzzzzz.0000:Azzzzzzzzz00 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
      00:00][AAAAAAA] zzzzzzzz0:zzzz000 - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
      00:00][AAAAAAA] zzzzzzz_zzzzzz:zzzzzz00zzzzzz - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
      00:00][AAAAAAA] zzzzzzz:zzzzzz0 - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
      00:00][AAAAAAA] zzzzzzzzz.00:z0zzzzA00!zzzzzzzz - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
      zzzz0000:zzzz00 | zzzzzzzzzz = 00.00.0000 00:00:00
      zzzzzzzzzzzz00:zzzzz0000 | zzzzzzzzzz = 00.00.0000 00:00:00
      zzzzz00:zzzzz000 | zzzzzzzzzz = 00.00.0000 00:00:00
      zzzzz00:zzzzzzz | zzzzzzzzzz = 00.00.0000 00:00:00
      zzzzzzzzz:zzzzzzzzzz0 | zzzzzzzzzz = 00.00.0000 00:00:00
      Azzzz:Azzzzzzzzz00 | zzzzzzzzzz = 00.00.0000 00:00:00
      zzzzzzzzzzzzzz000:zzzzzz00 | zzzzzzzzzz = 00.00.0000 00:00:00
      zzzz:zzzz0000 | zzzzzzzzzz = 00.00.0000 00:00:00
      Azzzzz:000000 | zzzzzzzzzz = 00.00.0000 00:00:00
      00zzzzzzzz:%AzzAAAz?AAz0zz | 
      000000000@zzzz:A0000A | 
      000000Az:0000000Azz!z
      zzzzzzz:zzzzzzz0 | zzz = [zzzzzz zzz zzzzzz]
      Azzz00000000:Azzz000 | 
      zzzz0000:0zzz0AAzzz0A_00z | 
      AzzzAzzzz00:Azzzz0000 | 
      Azzz_00:Azzzz000 | 
      zzzzzz:Azzzzzz0 | [zzzzzz zzz zzzzzz]
      zzzzzz:AzAzAz_000 | 
      zzzzzzzz:Azzzzzz0000 | [zzzzzz zzz zzzzzz]
      zzzzzzz00@zz.zz:Azzzz0000 | Azzzzzzzz zzzzzzzz = 0 | Azzzzz = 0 | Azzzzzzzzz = 0
      zz zzzzz zzzz zzzzz:zzz0z
      

      into

      zzzzzzzzzzzz0:Azzzzzzzz00
      zzzzz000:Azzzz000
      zzzzzz:zzzzzzz0
      zzzzzzzz:zz0z0z0
      zzzzzzz0:zzzzzz000
      zzzzzz.zzzzzz@zz.zz:Az000000
      zzzzzzzzzzzz0:Azzzzzzzz00
      zzzzz000:Azzzz000
      zzzzzz:zzzzzzz0
      zzzzzzzz:zz0z0z0
      zzzzzzz0:zzzzzz000
      zzzzzzzzz.0000:Azzzzzzzzz00
      zzzzzzzz0:zzzz000
      zzzzzzz_zzzzzz:zzzzzz00zzzzzz
      zzzzzzz:zzzzzz0
      zzzzzzzzz.00:z0zzzzA00!zzzzzzzz
      zzzz0000:zzzz00
      zzzzzzzzzzzz00:zzzzz0000
      zzzzz00:zzzzz000
      zzzzz00:zzzzzzz
      zzzzzzzzz:zzzzzzzzzz0
      Azzzz:Azzzzzzzzz00
      zzzzzzzzzzzzzz000:zzzzzz00
      zzzz:zzzz0000
      Azzzzz:000000
      00zzzzzzzz:%AzzAAAz?AAz0zz
      000000000@zzzz:A0000A
      000000Az:0000000Azz!z
      zzzzzzz:zzzzzzz0
      Azzz00000000:Azzz000
      zzzz0000:0zzz0AAzzz0A_00z
      AzzzAzzzz00:Azzzz0000
      Azzz_00:Azzzz000
      zzzzzz:Azzzzzz0
      zzzzzz:AzAzAz_000
      zzzzzzzz:Azzzzzz0000
      zzzzzzz00@zz.zz:Azzzz0000
      zzzzz:zzz0z
      

      Which again is what I think you want. But this is the last help I will give in this quest. Each of the pieces used in the regular expressions I showed is described in the user manual in the Regular Expressions syntax section. If you need more changes, you will have to start trying to figure it out on your own.

      • mpheathM
        mpheath @Ragnar Lodbrok
        last edited by mpheath

        @Ragnar-Lodbrok

        Instead of matching from the start of the line, it might be easier to match from the end of the line and remove the whole match.

        FIND = \h*\|.*$
        REPLACE = EMPTY
        SEARCH MODE = Regular Expression

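        The $ anchors the match to the end of the line. Because the search scans left to right, the match starts at the first | on the line (with \h* also taking any horizontal whitespace just before it), and the greedy .* then consumes everything up to the end of the line, leaving just the first segment of wanted characters.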

        Ensure the “. matches newline” checkbox is unchecked.
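
A minimal sketch of the same end-of-line idea outside Notepad++, using Python's `re` module on made-up `key:value` lines. The sample data and the name `strip_after_first_pipe` are illustrative assumptions, not part of the original post. Python's `re` has no `\h`, so `[ \t]` stands in for horizontal whitespace, and `re.MULTILINE` makes `$` match at each line end.

```python
import re

# Python's re module has no \h, so [ \t] stands in for "horizontal whitespace".
# re.MULTILINE lets $ match at the end of every line; "." never crosses a
# newline here, which mirrors leaving ". matches newline" unchecked.
TRAILING_FIELDS = re.compile(r"[ \t]*\|.*$", re.MULTILINE)


def strip_after_first_pipe(text: str) -> str:
    """Remove everything from the first pipe to the end of each line."""
    return TRAILING_FIELDS.sub("", text)


# Made-up sample lines (hypothetical data, for illustration only).
sample = (
    "key1:value1 | status = ok | code = 200\n"
    "key2:value2 | \n"
    "key3:value3\n"
)

cleaned = strip_after_first_pipe(sample)
print(cleaned)
```

Lines without a pipe are left untouched, and a line keeps whatever precedes its first pipe, wherever that pipe happens to be.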

        • PeterJonesP
          PeterJones @mpheath
          last edited by

          @mpheath ,

          Instead of matching from the start of the line, it might be easier to match from the end of the line and remove the whole match.

          Probably a good idea (and simpler than mine) for most of the lines, but it wouldn’t work for some of the new data:

          00:00][AAAAAAA] zzzzzzzz0:zzzz000 - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
          00:00][AAAAAAA] zzzzzzz_zzzzzz:zzzzzz00zzzzzz - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
          00:00][AAAAAAA] zzzzzzz:zzzzzz0 - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
          00:00][AAAAAAA] zzzzzzzzz.00:z0zzzzA00!zzzzzzzz - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
          
          • mpheathM
            mpheath @PeterJones
            last edited by

            @PeterJones

            That will make it quite a bit more complex.

            FIND = (?(?=^\d\d:\d\d).*\R|\h*\|.*$)
            REPLACE = EMPTY
            SEARCH MODE = Regular Expression

            In comparison to \h*\|.*$, this pattern removes the 4 lines mentioned by using a conditional, (?(condition)yes|no): the “yes” branch matches the whole line if it starts with 00:00-like digits; otherwise the “no” branch uses \h*\|.*$ .

            • PeterJonesP
              PeterJones @mpheath
              last edited by

              @mpheath said in I need a function/plugin to extract only unnecessary text from lines:

              removes the 4 lines mentioned

              Why remove? The OP said (translated): “But if I had text like this, it wouldn’t be possible.” – I interpreted that to mean that all the data in the example should be stripped down to the xzy:xyz.

              So instead of deleting the lines like that, my solution edits them down to

              zzzzzzzz0:zzzz000
              zzzzzzz_zzzzzz:zzzzzz00zzzzzz
              zzzzzzz:zzzzzz0
              zzzzzzzzz.00:z0zzzzA00!zzzzzzzz
              

              that is, it strips the stuff before and after the pairs, but keeps the pairs.

              • mpheathM
                mpheath @PeterJones
                last edited by

                @PeterJones You’re correct. It seems the colon is what matters, not the pipe. With the possible variations, I am not sure what may pass or fail to achieve the desired result. The 1st post shows a result with 1 fewer line, so I have doubts about what is needed.

                • guy038G
                  guy038
                  last edited by guy038

                  Hello, @ragnar-lodbrok, @peterjones, @mpheath and All

                  Using the INPUT text of @peterjones, I also searched for a single regex, without success. However, I’ve found two successive search/replacements which produce the same OUTPUT as @peterjones’s! These two regexes simply delete everything which is not wanted.

                  So, starting with :

                  zzzzzzzzzzzz0:Azzzzzzzz00 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                  zzzzz000:Azzzz000 | Azzzęzzz zzzzzzzz = 000,0 AA | Azzzzz = 0000
                  zzzzzz:zzzzzzz0 | Azzzęzzz zzzzzzzz = 000,00 AA | Azzzzz = 0000
                  zzzzzzzz:zz0z0z0 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                  zzzzzzz0:zzzzzz000 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                  zzzzzz.zzzzzz@zz.zz:Az000000 | zzzzzzzzzzz.zz = Azzzzzzóz | 00 = Azzzz Azz/Azz Az
                  zzzzzzzzzzzz0:Azzzzzzzz00 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                  zzzzz000:Azzzz000 | Azzzęzzz zzzzzzzz = 000,0 AA | Azzzzz = 0000
                  zzzzzz:zzzzzzz0 | Azzzęzzz zzzzzzzz = 000,00 AA | Azzzzz = 0000
                  zzzzzzzz:zz0z0z0 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                  zzzzzzz0:zzzzzz000 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                  zzzzzzzzz.0000:Azzzzzzzzz00 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                  00:00][AAAAAAA] zzzzzzzz0:zzzz000 - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
                  00:00][AAAAAAA] zzzzzzz_zzzzzz:zzzzzz00zzzzzz - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
                  00:00][AAAAAAA] zzzzzzz:zzzzzz0 - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
                  00:00][AAAAAAA] zzzzzzzzz.00:z0zzzzA00!zzzzzzzz - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
                  zzzz0000:zzzz00 | zzzzzzzzzz = 00.00.0000 00:00:00
                  zzzzzzzzzzzz00:zzzzz0000 | zzzzzzzzzz = 00.00.0000 00:00:00
                  zzzzz00:zzzzz000 | zzzzzzzzzz = 00.00.0000 00:00:00
                  zzzzz00:zzzzzzz | zzzzzzzzzz = 00.00.0000 00:00:00
                  zzzzzzzzz:zzzzzzzzzz0 | zzzzzzzzzz = 00.00.0000 00:00:00
                  Azzzz:Azzzzzzzzz00 | zzzzzzzzzz = 00.00.0000 00:00:00
                  zzzzzzzzzzzzzz000:zzzzzz00 | zzzzzzzzzz = 00.00.0000 00:00:00
                  zzzz:zzzz0000 | zzzzzzzzzz = 00.00.0000 00:00:00
                  Azzzzz:000000 | zzzzzzzzzz = 00.00.0000 00:00:00
                  00zzzzzzzz:%AzzAAAz?AAz0zz | 
                  000000000@zzzz:A0000A | 
                  000000Az:0000000Azz!z
                  zzzzzzz:zzzzzzz0 | zzz = [zzzzzz zzz zzzzzz]
                  Azzz00000000:Azzz000 | 
                  zzzz0000:0zzz0AAzzz0A_00z | 
                  AzzzAzzzz00:Azzzz0000 | 
                  
                  Azzz_00:Azzzz000 | 
                  zzzzzz:Azzzzzz0 | [zzzzzz zzz zzzzzz]
                  zzzzzz:AzAzAz_000 | 
                  zzzzzzzz:Azzzzzz0000 | [zzzzzz zzz zzzzzz]
                  zzzzzzz00@zz.zz:Azzzz0000 | Azzzzzzzz zzzzzzzz = 0 | Azzzzz = 0 | Azzzzzzzzz = 0
                  zz zzzzz zzzz zzzzz:zzz0z
                  

                  First search/replacement :

                  • FIND (?-s)^(\S+:\S+\x20|[^:\r\n]+\x20)?\S+:\S+(*SKIP)(*F)|.+

                  • REPLACE Leave EMPTY

                  Second search/replacement :

                  • FIND (?-s)^.+?\x20(?=\S+:)

                  • REPLACE Leave EMPTY

                  Which gives the following OUTPUT result :

                  zzzzzzzzzzzz0:Azzzzzzzz00
                  zzzzz000:Azzzz000
                  zzzzzz:zzzzzzz0
                  zzzzzzzz:zz0z0z0
                  zzzzzzz0:zzzzzz000
                  zzzzzz.zzzzzz@zz.zz:Az000000
                  zzzzzzzzzzzz0:Azzzzzzzz00
                  zzzzz000:Azzzz000
                  zzzzzz:zzzzzzz0
                  zzzzzzzz:zz0z0z0
                  zzzzzzz0:zzzzzz000
                  zzzzzzzzz.0000:Azzzzzzzzz00
                  zzzzzzzz0:zzzz000
                  zzzzzzz_zzzzzz:zzzzzz00zzzzzz
                  zzzzzzz:zzzzzz0
                  zzzzzzzzz.00:z0zzzzA00!zzzzzzzz
                  zzzz0000:zzzz00
                  zzzzzzzzzzzz00:zzzzz0000
                  zzzzz00:zzzzz000
                  zzzzz00:zzzzzzz
                  zzzzzzzzz:zzzzzzzzzz0
                  Azzzz:Azzzzzzzzz00
                  zzzzzzzzzzzzzz000:zzzzzz00
                  zzzz:zzzz0000
                  Azzzzz:000000
                  00zzzzzzzz:%AzzAAAz?AAz0zz
                  000000000@zzzz:A0000A
                  000000Az:0000000Azz!z
                  zzzzzzz:zzzzzzz0
                  Azzz00000000:Azzz000
                  zzzz0000:0zzz0AAzzz0A_00z
                  AzzzAzzzz00:Azzzz0000
                  
                  Azzz_00:Azzzz000
                  zzzzzz:Azzzzzz0
                  zzzzzz:AzAzAz_000
                  zzzzzzzz:Azzzzzz0000
                  zzzzzzz00@zz.zz:Azzzz0000
                  zzzzz:zzz0z
                  

                  Best Regards,

                  guy038

                  • mpheathM
                    mpheath @Ragnar Lodbrok
                    last edited by

                    @Ragnar-Lodbrok

                    To use this function, download the LuaScript plugin from Plugin Admin.
                    Add this function to startup.lua which the LuaScript plugin can open from the Plugins menu item.

                    npp.AddShortcut('CleanRagnarLog', '', function()
                        -- Transform lines into ...:... value per line.
                    
                        editor:BeginUndoAction()
                    
                        for line = 0, editor.LineCount - 1 do
                            local text = editor:GetLine(line)
                            local tokens = {}
                    
                            for token in string.gmatch(text or '', '%S+') do
                    
                                -- Only tokens before the 1st pipe.
                                if token == '|' then
                                    break
                                end
                    
                                -- Match tokens without these characters.
                                if not string.find(token, '[%[%]]') then
                    
                                    -- Valid token should contain a colon.
                                    if string.find(token, '%S:%S') then
                                        table.insert(tokens, token)
                                    end
                                end
                            end
                    
                            if #tokens < 2 then
                    
                                -- Target line text excluding the newlines.
                                editor.TargetStart = editor:PositionFromLine(line)
                                editor.TargetEnd = editor.LineEndPosition[line]
                    
                                if editor.TargetEnd > editor.TargetStart then
                    
                                    -- 0 tokens to set line as empty.
                                    if #tokens == 0 then
                                        editor:ReplaceTarget('')
                    
                                    -- 1 token to set line as token.
                                    elseif #tokens == 1 then
                                        local text_trimmed = string.gsub(text, '[\r\n]', '')
                    
                                        if tokens[1] ~= text_trimmed then
                                            editor:ReplaceTarget(tokens[1])
                                        end
                                    end
                                end
                            else
                    
                                -- Code may need improvement if more than 1 token matched.
                                print(string.format('Line %i has %i possible tokens:', line, #tokens))
                    
                                for i, v in ipairs(tokens) do
                                    print(string.format(' %4s %s', i, v))
                                end
                            end
                        end
                    
                        editor:EndUndoAction()
                    end)
                    

                    Restart Notepad++. From the Plugins menu item, click on CleanRagnarLog item within the LuaScript menu item to run the function.

                    If more than 1 token is matched in a line, then check the console for line number and the tokens matched which may help with updating the function code to get just 1 token.

                    • Ragnar LodbrokR
                      Ragnar Lodbrok @mpheath
                      last edited by PeterJones

                      @mpheath I have this list and I want to extract just the login:password and email:password from it.

                      aaaaaa.aaaaaa@aa.aa:Za000000 | aaaaaaaaaaa.aa = Zaaaaaaaa | 00 = Zaaaa Zaa/Zaa Za
                      aaaaaaaaaaaa0:Zaaaaaaaa00 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                      aaaaa000:Zaaaa000 | Zaaaaaaa aaaaaaaa = 000,0 ZZ | Zaaaaa = 0000
                      aaaaaa:aaaaaaa0 | Zaaaaaaa aaaaaaaa = 000,00 ZZ | Zaaaaa = 0000
                      aaaaaaaa:aa0a0a0 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                      aaaaaaa0:aaaaaa000 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                      aaaaaaaaa.0000:Zaaaaaaaaa00 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                      00:00][ZZZZZZZ] aaaaaaaa0:aaaa000 - [Zaaa Zaaaaa = 0,00 ZZ | ZZZZZZ = 0000] - @ZZZZZZ
                      00:00][ZZZZZZZ] aaaaaaa_aaaaaa:aaaaaa00aaaaaa - [Zaaa Zaaaaa = 0,00 ZZ | ZZZZZZ = 0000] - @ZZZZZZ
                      00:00][ZZZZZZZ] aaaaaaa:aaaaaa0 - [Zaaa Zaaaaa = 00,00 ZZ | ZZZZZZ = 0000] - @ZZZZZZ
                      00:00][ZZZZZZZ] aaaaaaaaa.00:a0aaaaZ00!aaaaaaaa - [Zaaa Zaaaaa = 00,00 ZZ | ZZZZZZ = 0000] - @ZZZZZZ
                      aaaa0000:aaaa00 | aaaaaaaaaa = 00.00.0000 00:00:00
                      aaaaaaaaaaaa00:aaaaa0000 | aaaaaaaaaa = 00.00.0000 00:00:00
                      aaaaa00:aaaaa000 | aaaaaaaaaa = 00.00.0000 00:00:00
                      aaaaa00:aaaaaaa | aaaaaaaaaa = 00.00.0000 00:00:00
                      aaaaaaaaa:aaaaaaaaaa0 | aaaaaaaaaa = 00.00.0000 00:00:00
                      Zaaaa:Zaaaaaaaaa00 | aaaaaaaaaa = 00.00.0000 00:00:00
                      aaaaaaaaaaaaaa000:aaaaaa00 | aaaaaaaaaa = 00.00.0000 00:00:00
                      aaaa:aaaa0000 | aaaaaaaaaa = 00.00.0000 00:00:00
                      Zaaaaa:000000 | aaaaaaaaaa = 00.00.0000 00:00:00
                      00aaaaaaaa:%ZaaZZZa?ZZa0aa | 
                      000000000@aaaa:Z0000Z | 
                      000000Za:0000000Zaa!a | 
                      aaaaaaa:aaaaaaa0 | aaa = [aaaaaa aaa aaaaaa]
                      Zaaa00000000:Zaaa000 | 
                      aaaa0000:0aaa0ZZaaa0Z_00a | 
                      ZaaaZaaaa00:Zaaaa0000 | 
                      Zaaa_00:Zaaaa000 | 
                      aaaaaa:Zaaaaaa0 | [aaaaaa aaa aaaaaa]
                      aaaaaa:ZaZaZa_000 | 
                      aaaaaaaa:Zaaaaaa0000 | [aaaaaa aaa aaaaaa]
                      aaaaaaa00@aa.aa:Zaaaa0000 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                      aaaaaaaa@aa.aa:Zaaaaa00 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                      aaaaaa0aaa@aa.aa:Zaaaaaa0 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                      aaaaaaaaa.aaaaaa@aa.aa:aaa0000 | aaaaaa/aaaaa Zaaaaa ZZZ aaaaaaa aa = 00-00-0000
                      aaaaaaaaaaa@a0.aa:aaaaaaa0000  |  aaaaaaaaaaa.aa =  Zaaaaaaaa | 00 = ZZZZZZ ZZZZZZ | 00 = ZZZZZ ZZ | 00 = ZZZZ ZZ | 00 = ZZZZZZZZZ ZZZ ZZ | 00 = ZZZ ZZ | 00 = ZZZ | 00 = Zaaaaaa ZZ
                      a.aaaaaaaaaaa00@aaaaa.aaa:Zaaaaa0000! |  aaaaaaaaaaa.aa =  Zaaaaaaaa | 00 = Zaaaaa Zaa Za Zaaaa
                      aaaaaaaaa@aaaaaaa.aa:Zaaaaaaa0000 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                      aaaaa0000@a0.aa:aaaaa00 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                      aaaa.aaaa:Zaaaaaaaa0000 | Zaaaaaaa aaaaaaaa = 0,00 ZZ | Zaaaaa = 0000
                      aaaaaa000@aa.aa:aaaaaaaa000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaa0aaaaa@aaaa.aa:Zaaaaaaaaa0 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      ZZZZ@ZZZZ.ZZ:aaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaa.a@aaaa.aaa.aa:aaaaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaa.a@aaaa.aa:aaaaaa000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaa0000a@aa.aa:Zaaaaaa00. | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaa00@aa.aa:aa000000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaa000@aa.aa:aaaaaa0 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaaaaaaaaaa@aa.aa:ZaZaZa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaa0000@aaaa.aa:Zaaaaaaa000000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaaaaaaaa@aaaa.aa:Zaaaaaa00$ | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaaa.aaaaaaa@aaaa.aa:Zaaaaaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaa0000@aaaa.aa:aaaaaaa0000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaa@aa.aa:aaaaaaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaa0@aa.aa:Zaaaa0000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaa.aaaaaaa.aaaaaa@aa.aa:aaaaaa000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaa0@aa.aa:aaaaa0000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaa.aaaaaaa@aa.aa:Zaaaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaa000@aa.aa:Zaaaaaa0000! | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaaaaaaaaaa00@aaaaaa.aa:Zaaaaaaaa0000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                      aaaaaaaa000:000000000a - Zaaaa = 00 | ZZ = 00 | ZZ = 0 | Zaaaaaa Zaaaaaaaa = 0 | Zaaaa Zaaaaaaa = aaaa | Zaaa Zaaaaa aa = 00/00/0000 | Zaaaa Zaaaaa = 0 | Zaaaaa = ZZZ | Zaaaa Zaaaa = 0 | Zaaaa = [] | Zaaaaa = []
                      Zaaaaaa Zaaaaaaaaaa :aaaaaaaaaaaaaa@aaa.aaa:aaaaaaaa
                      aaaaaa00@aaa.aa:Zaaaaa000 | Zaaaa = 000 | Zaaaa = 00 | Zaaa = Zaaaaa | Zaaa Zaaaaa = Zaaaa | Zaaaa = [aaaaaaaaa-aaaaa-ZZZ, aaaaaaaaa-aaaaa-aaaaaaaaaaa, aaa-aaa-0-aaaaa-aaaaaa, aaa-aaaaa, aaaaaaa-aaa-aaaaa, aaaaaaa-aaa-aaaaa, aaaaa-aaaa-aaaaaaaaa-aaa-aaaaa, aaa-aaaa, aaa-aaaaaa-aaa-aaaaaaaa, aaa-aaaaaa-aaa-aaaaaaaa-0] | Zaaaaaaaa = [ZZ, ZZ, ZZ, ZZ0, ZZ0, ZZ, ZZ0, ZZ, ZZ, ZZ0] | 0ZZ = Zaaaa
                      ZZZZZ: aaaa.aaaaa@aaaaa.aaa:Zaaa00000
                      ZaaZaa0000:aaaaaaaaa00											
                      aaaaaa.aaaaaaaaa.00@aaaaa.aaa:Zaaaaaaaaa000000!											
                      aaaaaaa00@aaaaa.aaa:aaaaaa0000										
                      aaaaaaaaaaaa@aaaaaa.aaa    Zaaaaa0000 
                      aaaaaaaaaaaaa@aaaaa.aaa   Zaaaa0000 
                      Zaaaaaaaaaaa@aaaaa.aaa    Zaaaa0000- 
                      aaaaaaaaaaaaaa@aaaaa.aaa   >   aaaaaaa0a
                      aaaaaaa_aaaaaa@aaaaa.aaa   >  aaaaaaaa00aaa
                      aaaaaaaaaaaaaaaaaaaa@aaaaa.aaa   :   Zaaaa0000
                      aaaaaa.aaaaaaa.a@aaaaa.aaa   :    aaa000000
                      aaaaaaaa:aa0a0a0 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                      aaaaaaa0:aaaaaa000 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                      

                      Now, how do I do this in Notepad so that it works on all lines? I know there are scripts for this, but is it more convenient to use Notepad, or should I use a regex command or a plugin?

                      —

                      moderator added code markdown around text; please don’t forget to use the </> button to mark example text as “code” so that characters don’t get changed by the forum.

                      • PeterJonesP
                        PeterJones @Ragnar Lodbrok
                        last edited by

                        @Ragnar-Lodbrok ,

                        Seriously, try a bit harder. The preview window on the right while you are writing your post shows you exactly what your post will look like; it should have been plainly obvious that the text wasn’t ending up in the text boxes.

                        I have this list and I want to extract just the login:password and email:password from it.

                        Now, how do I do this in Notepad so that it works on all lines? I know there are scripts for this, but is it more convenient to use Notepad, or should I use a regex command or a plugin?

                        All the examples given so far, except the lua script, have literally been using a regex command. That’s what we’ve been showing you. If you don’t like it, tough.

                        <new data>

                        Seriously. Given all the data you had shown before, mine works. You just keep changing the rules every time you give new data. Now your data has lines that have no colon, but just have spaces between the email and password, or ones that have spaces and > signs. It’s completely arbitrary. Regular expressions aren’t magic or mind readers. If you lie about your data, you won’t be able to get everything.

                        I think the answer is “there is no way to do it in one meaningful regular expression, because your data is not consistent enough”

                        Besides that answer, I am tired of helping someone who’s probably just trying to make a username/password database that they can use for trying to break into other accounts. So I’m done helping (other than begrudgingly reformatting your posts and getting rid of all the passwords that you’re publishing to this site.)

                        • Ragnar LodbrokR
                          Ragnar Lodbrok
                          last edited by PeterJones

                          pair_re = re.compile(
                              r'([A-Za-z0-9._@+\-]{3,64})\s*(?:[:>]|(?:\s{2,}|\t))\s*(\S{3,200})',
                              re.UNICODE
                          )
                          
                          def extract_line_pair(line):
                              cutoff = re.split(r'\s\|\s| - \[|\s\|\|', line, maxsplit=1)[0].strip()
                              matches = pair_re.findall(cutoff)
                              if not matches:
                                  return None
                          
                              login, passwd = matches[-1]
                              login = login.strip()
                              passwd = passwd.strip().rstrip(r' .,;|>\])')
                          
                              if len(login) < 3 or len(passwd) < 3:
                                  return None
                              if re.fullmatch(r'\d{1,3}', login):
                                  return None
                              if login.isupper() and "@" not in login:
                                  for lg, pw in reversed(matches):
                                      if not (lg.isupper() and "@" not in lg) and len(lg) >= 3 and len(pw) >= 3:
                                          login, passwd = lg.strip(), pw.strip().rstrip(r' .,;|>\])')
                                          break
                                  else:
                                      return None
                          
                              return f"{login}:{passwd}"
                          
                          

                          It can be done; this handles all of the data I gave above. But I don’t want to use scripts, only Notepad++. Can it be done on the whole list at once, rather than changing the regex every time?

                          • mpheathM
                            mpheath @PeterJones
                            last edited by

                            @PeterJones said in I need a function/plugin to extract only unnecessary text from lines:

                            @Ragnar-Lodbrok ,
                            I think the answer is “there is no way to do it in one meaningful regular expression, because your data is not consistent enough”

                            The data is not consistent as it appears to be harvested data:

                            https://github.com/RagnarLodbrok1981/proxy-scraper-checker
                            forked from
                            https://github.com/monosans/proxy-scraper-checker

                            That may explain the pipe characters in the log, like those shown in the repository’s README image. This appears to be dishonestly obtained data.
