    I need a function/plugin to extract only unnecessary text from lines

      mpheath @Ragnar Lodbrok
      last edited by mpheath

      @Ragnar-Lodbrok

      Instead of matching from the start of the line, it might be easier to match from the end of the line and remove the whole match.

      FIND = \h*\|.*$
      REPLACE = EMPTY
      SEARCH MODE = Regular Expression

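      The $ anchors to the end of the line. The search takes the leftmost match, so \h*\| matches at the first |, together with any horizontal whitespace in front of it, and the greedy .* then consumes everything up to the end of the line. Removing the match leaves just the first segment of characters wanted.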

      Ensure the ". matches newline" checkbox is unchecked.
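
The behaviour of that pattern can be checked outside Notepad++ with a small Python sketch. This is only an illustration under assumptions: Python's re module does not support \h, so [ \t]* stands in for it, and the sample text and the name TRAILING_FIELDS are made up for the demonstration.

```python
import re

# [ \t]* is a stand-in for \h*, which Python's re module does not support.
# re.MULTILINE makes $ match at the end of every line, and "." does not
# match a newline by default (the "checkbox unchecked" case).
TRAILING_FIELDS = re.compile(r"[ \t]*\|.*$", re.MULTILINE)

# Made-up sample: first segment, then pipe-separated fields.
sample = "alpha | color = red | size = 4\nbeta\t| note = none\ngamma"

cleaned = TRAILING_FIELDS.sub("", sample)
print(cleaned)
```

Each line is cut at its first pipe, and a line without a pipe is left alone, so the sample becomes the three lines alpha, beta and gamma.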

        PeterJones @mpheath

        @mpheath ,

        Instead of matching from the start of the line, it might be easier to match from the end of the line and remove the whole match.

        Probably a good idea (and simpler than mine) for most of the lines, but it wouldn’t work for some of the new data:

        00:00][AAAAAAA] zzzzzzzz0:zzzz000 - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
        00:00][AAAAAAA] zzzzzzz_zzzzzz:zzzzzz00zzzzzz - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
        00:00][AAAAAAA] zzzzzzz:zzzzzz0 - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
        00:00][AAAAAAA] zzzzzzzzz.00:z0zzzzA00!zzzzzzzz - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
        
          mpheath @PeterJones

          @PeterJones

          That makes it considerably more complex.

          FIND = (?(?=^\d\d:\d\d).*\R|\h*\|.*$)
          REPLACE = EMPTY
          SEARCH MODE = Regular Expression

          Compared with \h*\|.*$, this pattern uses a conditional, (?(condition)yes|no): if the line starts with 00:00-like digits, the yes branch matches and removes the whole line (the 4 lines mentioned); otherwise the no branch falls back to \h*\|.*$ .
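
For readers who want to try the idea outside Notepad++, here is a rough Python sketch. It is not the same regex: Python's re module has no lookahead conditionals and no \R or \h, so this swaps in a plain alternation, with \r?\n and [ \t]* as stand-ins. The sample text and the name LINE_OR_TAIL are invented for the illustration.

```python
import re

# Alternation standing in for the conditional (?(?=^\d\d:\d\d)yes|no):
#   first branch  - a line starting with two digits, a colon, two digits
#                   is matched whole, including its line ending;
#   second branch - otherwise, match from the first pipe to end of line.
LINE_OR_TAIL = re.compile(r"^\d\d:\d\d.*\r?\n|[ \t]*\|.*$", re.MULTILINE)

# Made-up sample: one timestamped line, one piped line, one plain line.
sample = "09:15 startup banner | x\nalpha | color = red\nbeta"

result = LINE_OR_TAIL.sub("", sample)
print(result)
```

The timestamped line disappears entirely, the piped line is trimmed to alpha, and beta is untouched.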

            PeterJones @mpheath

            @mpheath said in I need a function/plugin to extract only unnecessary text from lines:

            removes the 4 lines mentioned

            Why remove? The OP said (translated): “But if I had text like this, it wouldn’t be possible.” – I interpreted that to mean that all the data in the example should be stripped down to the xzy:xyz.

            So instead of deleting the lines like that, my solution edits them down to

            zzzzzzzz0:zzzz000
            zzzzzzz_zzzzzz:zzzzzz00zzzzzz
            zzzzzzz:zzzzzz0
            zzzzzzzzz.00:z0zzzzA00!zzzzzzzz
            

            that is, it strips the stuff before and after the pairs, but keeps the pairs.

              mpheath @PeterJones

              @PeterJones You’re correct. It seems the colon is what matters, not the pipe. With the possible variations, I am not sure what may pass or fail to achieve the desired result. The 1st post shows a result with 1 fewer line, so I have doubts about what is needed.

                guy038

                Hello, @ragnar-lodbrok, @peterjones, @mpheath and All

                Using the INPUT text of @peterjones, I also searched for a single regex, without success. However, I’ve just found two successive searches/replacements which produce the same OUTPUT as @peterjones’s ! These two regexes simply delete everything which is not wanted.

                So, starting with :

                zzzzzzzzzzzz0:Azzzzzzzz00 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                zzzzz000:Azzzz000 | Azzzęzzz zzzzzzzz = 000,0 AA | Azzzzz = 0000
                zzzzzz:zzzzzzz0 | Azzzęzzz zzzzzzzz = 000,00 AA | Azzzzz = 0000
                zzzzzzzz:zz0z0z0 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                zzzzzzz0:zzzzzz000 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                zzzzzz.zzzzzz@zz.zz:Az000000 | zzzzzzzzzzz.zz = Azzzzzzóz | 00 = Azzzz Azz/Azz Az
                zzzzzzzzzzzz0:Azzzzzzzz00 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                zzzzz000:Azzzz000 | Azzzęzzz zzzzzzzz = 000,0 AA | Azzzzz = 0000
                zzzzzz:zzzzzzz0 | Azzzęzzz zzzzzzzz = 000,00 AA | Azzzzz = 0000
                zzzzzzzz:zz0z0z0 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                zzzzzzz0:zzzzzz000 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                zzzzzzzzz.0000:Azzzzzzzzz00 | Azzzęzzz zzzzzzzz = 00 AA | Azzzzz = 0000
                00:00][AAAAAAA] zzzzzzzz0:zzzz000 - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
                00:00][AAAAAAA] zzzzzzz_zzzzzz:zzzzzz00zzzzzz - [Azzz Azzzzz = 0,00 AA | AAAAAA = 0000] - @AAAAAA
                00:00][AAAAAAA] zzzzzzz:zzzzzz0 - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
                00:00][AAAAAAA] zzzzzzzzz.00:z0zzzzA00!zzzzzzzz - [Azzz Azzzzz = 00,00 AA | AAAAAA = 0000] - @AAAAAA
                zzzz0000:zzzz00 | zzzzzzzzzz = 00.00.0000 00:00:00
                zzzzzzzzzzzz00:zzzzz0000 | zzzzzzzzzz = 00.00.0000 00:00:00
                zzzzz00:zzzzz000 | zzzzzzzzzz = 00.00.0000 00:00:00
                zzzzz00:zzzzzzz | zzzzzzzzzz = 00.00.0000 00:00:00
                zzzzzzzzz:zzzzzzzzzz0 | zzzzzzzzzz = 00.00.0000 00:00:00
                Azzzz:Azzzzzzzzz00 | zzzzzzzzzz = 00.00.0000 00:00:00
                zzzzzzzzzzzzzz000:zzzzzz00 | zzzzzzzzzz = 00.00.0000 00:00:00
                zzzz:zzzz0000 | zzzzzzzzzz = 00.00.0000 00:00:00
                Azzzzz:000000 | zzzzzzzzzz = 00.00.0000 00:00:00
                00zzzzzzzz:%AzzAAAz?AAz0zz | 
                000000000@zzzz:A0000A | 
                000000Az:0000000Azz!z
                zzzzzzz:zzzzzzz0 | zzz = [zzzzzz zzz zzzzzz]
                Azzz00000000:Azzz000 | 
                zzzz0000:0zzz0AAzzz0A_00z | 
                AzzzAzzzz00:Azzzz0000 | 
                
                Azzz_00:Azzzz000 | 
                zzzzzz:Azzzzzz0 | [zzzzzz zzz zzzzzz]
                zzzzzz:AzAzAz_000 | 
                zzzzzzzz:Azzzzzz0000 | [zzzzzz zzz zzzzzz]
                zzzzzzz00@zz.zz:Azzzz0000 | Azzzzzzzz zzzzzzzz = 0 | Azzzzz = 0 | Azzzzzzzzz = 0
                zz zzzzz zzzz zzzzz:zzz0z
                

                First search/replacement :

                • FIND (?-s)^(\S+:\S+\x20|[^:\r\n]+\x20)?\S+:\S+(*SKIP)(*F)|.+

                • REPLACE Leave EMPTY

                Second search/replacement :

                • FIND (?-s)^.+?\x20(?=\S+:)

                • REPLACE Leave EMPTY

                Which gives the following OUTPUT result :

                zzzzzzzzzzzz0:Azzzzzzzz00
                zzzzz000:Azzzz000
                zzzzzz:zzzzzzz0
                zzzzzzzz:zz0z0z0
                zzzzzzz0:zzzzzz000
                zzzzzz.zzzzzz@zz.zz:Az000000
                zzzzzzzzzzzz0:Azzzzzzzz00
                zzzzz000:Azzzz000
                zzzzzz:zzzzzzz0
                zzzzzzzz:zz0z0z0
                zzzzzzz0:zzzzzz000
                zzzzzzzzz.0000:Azzzzzzzzz00
                zzzzzzzz0:zzzz000
                zzzzzzz_zzzzzz:zzzzzz00zzzzzz
                zzzzzzz:zzzzzz0
                zzzzzzzzz.00:z0zzzzA00!zzzzzzzz
                zzzz0000:zzzz00
                zzzzzzzzzzzz00:zzzzz0000
                zzzzz00:zzzzz000
                zzzzz00:zzzzzzz
                zzzzzzzzz:zzzzzzzzzz0
                Azzzz:Azzzzzzzzz00
                zzzzzzzzzzzzzz000:zzzzzz00
                zzzz:zzzz0000
                Azzzzz:000000
                00zzzzzzzz:%AzzAAAz?AAz0zz
                000000000@zzzz:A0000A
                000000Az:0000000Azz!z
                zzzzzzz:zzzzzzz0
                Azzz00000000:Azzz000
                zzzz0000:0zzz0AAzzz0A_00z
                AzzzAzzzz00:Azzzz0000
                
                Azzz_00:Azzzz000
                zzzzzz:Azzzzzz0
                zzzzzz:AzAzAz_000
                zzzzzzzz:Azzzzzz0000
                zzzzzzz00@zz.zz:Azzzz0000
                zzzzz:zzz0z
                

                Best Regards,

                guy038

                  mpheath @Ragnar Lodbrok

                  @Ragnar-Lodbrok

                  To use this function, download the LuaScript plugin from Plugin Admin.
                  Add this function to startup.lua which the LuaScript plugin can open from the Plugins menu item.

                  npp.AddShortcut('CleanRagnarLog', '', function()
                      -- Transform lines into ...:... value per line.
                  
                      editor:BeginUndoAction()
                  
                      for line = 0, editor.LineCount - 1 do
                          local text = editor:GetLine(line)
                          local tokens = {}
                  
                          for token in string.gmatch(text or '', '%S+') do
                  
                              -- Only tokens before the 1st pipe.
                              if token == '|' then
                                  break
                              end
                  
                              -- Match tokens without these characters.
                              if not string.find(token, '[%[%]]') then
                  
                                  -- Valid token should contain a colon.
                                  if string.find(token, '%S:%S') then
                                      table.insert(tokens, token)
                                  end
                              end
                          end
                  
                          if #tokens < 2 then
                  
                              -- Target line text excluding the newlines.
                              editor.TargetStart = editor:PositionFromLine(line)
                              editor.TargetEnd = editor.LineEndPosition[line]
                  
                              if editor.TargetEnd > editor.TargetStart then
                  
                                  -- 0 tokens to set line as empty.
                                  if #tokens == 0 then
                                      editor:ReplaceTarget('')
                  
                                  -- 1 token to set line as token.
                                  elseif #tokens == 1 then
                                      local text_trimmed = string.gsub(text, '[\r\n]', '')
                  
                                      if tokens[1] ~= text_trimmed then
                                          editor:ReplaceTarget(tokens[1])
                                      end
                                  end
                              end
                          else
                  
                              -- Code may need improvement if more than 1 token matched.
                              print(string.format('Line %i has %i possible tokens:', line, #tokens))
                  
                              for i, v in ipairs(tokens) do
                                  print(string.format(' %4s %s', i, v))
                              end
                          end
                      end
                  
                      editor:EndUndoAction()
                  end)
                  

                  Restart Notepad++. From the Plugins menu item, click on CleanRagnarLog item within the LuaScript menu item to run the function.

                  If more than 1 token is matched in a line, then check the console for line number and the tokens matched which may help with updating the function code to get just 1 token.

                    Ragnar Lodbrok @mpheath
                    last edited by PeterJones

                    @mpheath I have a list like this and I want to extract only the login:password and email:password pairs from it

                    aaaaaa.aaaaaa@aa.aa:Za000000 | aaaaaaaaaaa.aa = Zaaaaaaaa | 00 = Zaaaa Zaa/Zaa Za
                    aaaaaaaaaaaa0:Zaaaaaaaa00 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                    aaaaa000:Zaaaa000 | Zaaaaaaa aaaaaaaa = 000,0 ZZ | Zaaaaa = 0000
                    aaaaaa:aaaaaaa0 | Zaaaaaaa aaaaaaaa = 000,00 ZZ | Zaaaaa = 0000
                    aaaaaaaa:aa0a0a0 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                    aaaaaaa0:aaaaaa000 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                    aaaaaaaaa.0000:Zaaaaaaaaa00 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                    00:00][ZZZZZZZ] aaaaaaaa0:aaaa000 - [Zaaa Zaaaaa = 0,00 ZZ | ZZZZZZ = 0000] - @ZZZZZZ
                    00:00][ZZZZZZZ] aaaaaaa_aaaaaa:aaaaaa00aaaaaa - [Zaaa Zaaaaa = 0,00 ZZ | ZZZZZZ = 0000] - @ZZZZZZ
                    00:00][ZZZZZZZ] aaaaaaa:aaaaaa0 - [Zaaa Zaaaaa = 00,00 ZZ | ZZZZZZ = 0000] - @ZZZZZZ
                    00:00][ZZZZZZZ] aaaaaaaaa.00:a0aaaaZ00!aaaaaaaa - [Zaaa Zaaaaa = 00,00 ZZ | ZZZZZZ = 0000] - @ZZZZZZ
                    aaaa0000:aaaa00 | aaaaaaaaaa = 00.00.0000 00:00:00
                    aaaaaaaaaaaa00:aaaaa0000 | aaaaaaaaaa = 00.00.0000 00:00:00
                    aaaaa00:aaaaa000 | aaaaaaaaaa = 00.00.0000 00:00:00
                    aaaaa00:aaaaaaa | aaaaaaaaaa = 00.00.0000 00:00:00
                    aaaaaaaaa:aaaaaaaaaa0 | aaaaaaaaaa = 00.00.0000 00:00:00
                    Zaaaa:Zaaaaaaaaa00 | aaaaaaaaaa = 00.00.0000 00:00:00
                    aaaaaaaaaaaaaa000:aaaaaa00 | aaaaaaaaaa = 00.00.0000 00:00:00
                    aaaa:aaaa0000 | aaaaaaaaaa = 00.00.0000 00:00:00
                    Zaaaaa:000000 | aaaaaaaaaa = 00.00.0000 00:00:00
                    00aaaaaaaa:%ZaaZZZa?ZZa0aa | 
                    000000000@aaaa:Z0000Z | 
                    000000Za:0000000Zaa!a | 
                    aaaaaaa:aaaaaaa0 | aaa = [aaaaaa aaa aaaaaa]
                    Zaaa00000000:Zaaa000 | 
                    aaaa0000:0aaa0ZZaaa0Z_00a | 
                    ZaaaZaaaa00:Zaaaa0000 | 
                    Zaaa_00:Zaaaa000 | 
                    aaaaaa:Zaaaaaa0 | [aaaaaa aaa aaaaaa]
                    aaaaaa:ZaZaZa_000 | 
                    aaaaaaaa:Zaaaaaa0000 | [aaaaaa aaa aaaaaa]
                    aaaaaaa00@aa.aa:Zaaaa0000 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                    aaaaaaaa@aa.aa:Zaaaaa00 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                    aaaaaa0aaa@aa.aa:Zaaaaaa0 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                    aaaaaaaaa.aaaaaa@aa.aa:aaa0000 | aaaaaa/aaaaa Zaaaaa ZZZ aaaaaaa aa = 00-00-0000
                    aaaaaaaaaaa@a0.aa:aaaaaaa0000  |  aaaaaaaaaaa.aa =  Zaaaaaaaa | 00 = ZZZZZZ ZZZZZZ | 00 = ZZZZZ ZZ | 00 = ZZZZ ZZ | 00 = ZZZZZZZZZ ZZZ ZZ | 00 = ZZZ ZZ | 00 = ZZZ | 00 = Zaaaaaa ZZ
                    a.aaaaaaaaaaa00@aaaaa.aaa:Zaaaaa0000! |  aaaaaaaaaaa.aa =  Zaaaaaaaa | 00 = Zaaaaa Zaa Za Zaaaa
                    aaaaaaaaa@aaaaaaa.aa:Zaaaaaaa0000 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                    aaaaa0000@a0.aa:aaaaa00 | Zaaaaaaaa aaaaaaaa = 0 | Zaaaaa = 0 | Zaaaaaaaaa = 0
                    aaaa.aaaa:Zaaaaaaaa0000 | Zaaaaaaa aaaaaaaa = 0,00 ZZ | Zaaaaa = 0000
                    aaaaaa000@aa.aa:aaaaaaaa000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaa0aaaaa@aaaa.aa:Zaaaaaaaaa0 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    ZZZZ@ZZZZ.ZZ:aaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaa.a@aaaa.aaa.aa:aaaaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaa.a@aaaa.aa:aaaaaa000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaa0000a@aa.aa:Zaaaaaa00. | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaa00@aa.aa:aa000000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaa000@aa.aa:aaaaaa0 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaaaaaaaaaa@aa.aa:ZaZaZa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaa0000@aaaa.aa:Zaaaaaaa000000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaaaaaaaa@aaaa.aa:Zaaaaaa00$ | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaaa.aaaaaaa@aaaa.aa:Zaaaaaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaa0000@aaaa.aa:aaaaaaa0000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaa@aa.aa:aaaaaaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaa0@aa.aa:Zaaaa0000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaa.aaaaaaa.aaaaaa@aa.aa:aaaaaa000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaa0@aa.aa:aaaaa0000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaa.aaaaaaa@aa.aa:Zaaaaaaa00 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaa000@aa.aa:Zaaaaaa0000! | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaaaaaaaaaa00@aaaaaa.aa:Zaaaaaaaa0000 | ZZZ Zaaaaaa = aaa | Zaaaaa = aaaaa | Zaaaaa = 0000-00-00 00:00:00
                    aaaaaaaa000:000000000a - Zaaaa = 00 | ZZ = 00 | ZZ = 0 | Zaaaaaa Zaaaaaaaa = 0 | Zaaaa Zaaaaaaa = aaaa | Zaaa Zaaaaa aa = 00/00/0000 | Zaaaa Zaaaaa = 0 | Zaaaaa = ZZZ | Zaaaa Zaaaa = 0 | Zaaaa = [] | Zaaaaa = []
                    Zaaaaaa Zaaaaaaaaaa :aaaaaaaaaaaaaa@aaa.aaa:aaaaaaaa
                    aaaaaa00@aaa.aa:Zaaaaa000 | Zaaaa = 000 | Zaaaa = 00 | Zaaa = Zaaaaa | Zaaa Zaaaaa = Zaaaa | Zaaaa = [aaaaaaaaa-aaaaa-ZZZ, aaaaaaaaa-aaaaa-aaaaaaaaaaa, aaa-aaa-0-aaaaa-aaaaaa, aaa-aaaaa, aaaaaaa-aaa-aaaaa, aaaaaaa-aaa-aaaaa, aaaaa-aaaa-aaaaaaaaa-aaa-aaaaa, aaa-aaaa, aaa-aaaaaa-aaa-aaaaaaaa, aaa-aaaaaa-aaa-aaaaaaaa-0] | Zaaaaaaaa = [ZZ, ZZ, ZZ, ZZ0, ZZ0, ZZ, ZZ0, ZZ, ZZ, ZZ0] | 0ZZ = Zaaaa
                    ZZZZZ: aaaa.aaaaa@aaaaa.aaa:Zaaa00000
                    ZaaZaa0000:aaaaaaaaa00											
                    aaaaaa.aaaaaaaaa.00@aaaaa.aaa:Zaaaaaaaaa000000!											
                    aaaaaaa00@aaaaa.aaa:aaaaaa0000										
                    aaaaaaaaaaaa@aaaaaa.aaa    Zaaaaa0000 
                    aaaaaaaaaaaaa@aaaaa.aaa   Zaaaa0000 
                    Zaaaaaaaaaaa@aaaaa.aaa    Zaaaa0000- 
                    aaaaaaaaaaaaaa@aaaaa.aaa   >   aaaaaaa0a
                    aaaaaaa_aaaaaa@aaaaa.aaa   >  aaaaaaaa00aaa
                    aaaaaaaaaaaaaaaaaaaa@aaaaa.aaa   :   Zaaaa0000
                    aaaaaa.aaaaaaa.a@aaaaa.aaa   :    aaa000000
                    aaaaaaaa:aa0a0a0 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                    aaaaaaa0:aaaaaa000 | Zaaaaaaa aaaaaaaa = 00 ZZ | Zaaaaa = 0000
                    

                    Now, how do I do this in Notepad so that it works on all lines? I know there are scripts for this, but it is more convenient in Notepad. Should I use some regex command or a plugin?

                    —

                    moderator added code markdown around text; please don’t forget to use the </> button to mark example text as “code” so that characters don’t get changed by the forum.

                      PeterJones @Ragnar Lodbrok

                      @Ragnar-Lodbrok ,

                      Seriously, try a bit harder. The preview window on the right while you are writing your post shows you exactly what your post will look like; it should have been plainly obvious that the text wasn’t ending up in the text boxes.

                      I have this list and I want to extract just the login:password and email:password from it.

                      Now, how do I do this in Notepad so that it works on all lines? I know there are scripts for this, but is it more convenient to use Notepad, or should I use a regex command or a plugin?

                      All the examples given so far, except the lua script, have literally been using a regex command. That’s what we’ve been showing you. If you don’t like it, tough.

                      <new data>

                      Seriously. Given all the data you had shown before, mine works. You just keep changing the rules every time you give new data. Now your data has lines that have no colon, but just have spaces between the email and password, or ones that have spaces and > signs. It’s completely arbitrary. Regular expressions aren’t magic or mind readers. If you lie about your data, you won’t be able to get everything.

                      I think the answer is “there is no way to do it in one meaningful regular expression, because your data is not consistent enough”

                      Besides that answer, I am tired of helping someone who’s probably just trying to make a username/password database that they can use for trying to break into other accounts. So I’m done helping (other than begrudgingly reformatting your posts and getting rid of all the passwords that you’re publishing to this site.)

                        Ragnar Lodbrok
                        last edited by PeterJones

                        pair_re = re.compile(
                            r'([A-Za-z0-9._@+\-]{3,64})\s*(?:[:>]|(?:\s{2,}|\t))\s*(\S{3,200})',
                            re.UNICODE
                        )
                        
                        def extract_line_pair(line):
                            cutoff = re.split(r'\s\|\s| - \[|\s\|\|', line, maxsplit=1)[0].strip()
                            matches = pair_re.findall(cutoff)
                            if not matches:
                                return None
                        
                            login, passwd = matches[-1]
                            login = login.strip()
                            passwd = passwd.strip().rstrip(r' .,;|>\])')
                        
                            if len(login) < 3 or len(passwd) < 3:
                                return None
                            if re.fullmatch(r'\d{1,3}', login):
                                return None
                            if login.isupper() and "@" not in login:
                                for lg, pw in reversed(matches):
                                    if not (lg.isupper() and "@" not in lg) and len(lg) >= 3 and len(pw) >= 3:
                                        login, passwd = lg.strip(), pw.strip().rstrip(r' .,;|>\])')
                                        break
                                else:
                                    return None
                        
                            return f"{login}:{passwd}"
                        
                        

                        It can be done; this handles everything in the data I gave above. But I don’t want to use scripts, only Notepad++. Can it be done on the whole list at once, rather than changing the regex every time?

                          mpheath @PeterJones

                          @PeterJones said in I need a function/plugin to extract only unnecessary text from lines:

                          @Ragnar-Lodbrok ,
                          I think the answer is “there is no way to do it in one meaningful regular expression, because your data is not consistent enough”

                          The data is not consistent as it appears to be harvested data:

                          https://github.com/RagnarLodbrok1981/proxy-scraper-checker
                          forked from
                          https://github.com/monosans/proxy-scraper-checker

                          That may explain the pipe characters in the log, like those shown in the repository’s readme image. This appears to be dishonestly obtained data.
