Extract specific data from log files?

  • Z
    Zorba Greek
    last edited by Jan 12, 2023, 5:21 PM

    I have a log file that I need to extract specific data elements from.
    Example text:

    [22-12-20 21:16:04.521]   FROM LIVE   <$011B4B50:FleetCard_10:1,51.75,1,1,200001,Fleet No,5411,34319,TARJETA,,51.75,,3,0,1>
    [22-12-20 21:16:04.553]   auth accepted tag=9812120450668474 device=V2
    [22-12-20 21:16:40.185]   FROM LIVE   <02:PAYDONE=0000022851>
    [22-12-20 21:17:20.677]     TO LIVE   <$011B4910:FleetCard_1:9812120450669349>
    [22-12-20 21:17:21.270]   FROM LIVE   <$011B4910:FleetCard_10:1,49.48,1,1,200001,Fleet No,5237,34320,TARJETA,,49.48,,2,0,1>
    [22-12-20 21:17:21.333]   auth accepted tag=9812120450669349 device=V1
    [22-12-20 21:18:44.345]   FROM LIVE   <02:PAYDONE=0000022852>
    [22-12-20 21:19:16.399]   FROM LIVE   <03:PAYDONE=0000022853>
    [22-12-20 21:20:18.292]     TO LIVE   <$011B5150:FleetCard_1:9812120450669482>
    [22-12-20 21:20:19.073]   FROM LIVE   <$011B5150:FleetCard_10:1,51.75,1,1,200001,Fleet No,2001,34321,TARJETA,,51.75,,3,0,1>
    [22-12-20 21:20:19.167]   auth accepted tag=9812120450669482 device=V1
    [22-12-20 21:21:53.536]     TO LIVE   <$011B4B50:FleetCard_1:9812120450668854>
    [22-12-20 21:21:54.286]   FROM LIVE   <$011B4B50:FleetCard_10:1,51.75,1,1,200001,Fleet No,5418,34322,TARJETA,,51.75,,3,0,1>
    [22-12-20 21:21:54.301]   auth accepted tag=9812120450668854 device=V2
    [22-12-20 21:25:11.284]   FROM LIVE   <02:PAYDONE=0000022854>
    [22-12-20 21:25:20.141]   FROM LIVE   <04:PAYDONE=0000022855>
    

    I want to extract data like below:

    Fleet No : 5411  tag :9812120450668474 
    Fleet No:  5237  tag:9812120450669349
    Fleet No : 2001  tag:9812120450669482
    

    This is my first question here, and I would appreciate any help with the above.

    • M
      Michael Vincent @Zorba Greek
      last edited by Jan 12, 2023, 5:39 PM

      @Zorba-Greek

      Please don’t let those 16-digit numbers above be bank card numbers … and if they are, please tell me they are obscured and that you didn’t copy / paste real, non-anonymized numbers into this public forum.

      Cheers.

      • Z
        Zorba Greek @Michael Vincent
        last edited by Jan 12, 2023, 5:41 PM

        @Michael-Vincent

        It’s just an internal reference.

        • M
          Michael Vincent @Zorba Greek
          last edited by Jan 12, 2023, 5:48 PM

          @Zorba-Greek

          Phew.

          The regex:

          Fleet No\,(\d{4})\,.*?auth accepted tag=(\d{16})
          

          with the “Regular expression” radio button and “. matches newline” checked will find your data.

          Using the replace dialogue on a copy of your file (this operation would be destructive), you could use:

          \n\nFleet No : $1 tag : $2\n\n
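For readers who would rather script the search than run it in the editor, it can be approximated in Python. This is only a sketch under stated assumptions: Python's `re` module is a different engine from the Boost engine inside Notepad++, `re.DOTALL` stands in for the “. matches newline” checkbox, and the `sample` text and variable names are illustrative. The commas are left unescaped here, since a comma has no special meaning in a regex.

```python
import re

# First six lines of the log excerpt from the question (illustrative sample).
sample = """\
[22-12-20 21:16:04.521]   FROM LIVE   <$011B4B50:FleetCard_10:1,51.75,1,1,200001,Fleet No,5411,34319,TARJETA,,51.75,,3,0,1>
[22-12-20 21:16:04.553]   auth accepted tag=9812120450668474 device=V2
[22-12-20 21:16:40.185]   FROM LIVE   <02:PAYDONE=0000022851>
[22-12-20 21:17:20.677]     TO LIVE   <$011B4910:FleetCard_1:9812120450669349>
[22-12-20 21:17:21.270]   FROM LIVE   <$011B4910:FleetCard_10:1,49.48,1,1,200001,Fleet No,5237,34320,TARJETA,,49.48,,2,0,1>
[22-12-20 21:17:21.333]   auth accepted tag=9812120450669349 device=V1
"""

# re.DOTALL lets ".*?" cross the line break between the two log lines.
pattern = re.compile(r"Fleet No,(\d{4}),.*?auth accepted tag=(\d{16})", re.DOTALL)

# Each match yields a (fleet number, tag) tuple of captured strings.
pairs = pattern.findall(sample)
for fleet_no, tag in pairs:
    print(f"Fleet No : {fleet_no} tag : {tag}")
```

Unlike the replace dialogue, this leaves the input untouched and only reports the captured pairs.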
          


          Cheers.

          • Z
            Zorba Greek @Michael Vincent
            last edited by Jan 12, 2023, 6:05 PM

            @Michael-Vincent said in Extract specific data from log files?:

            \n\nFleet No : $1 tag : $2\n\n

            Almost there; I want to delete everything except the selected text.

            [Attached screenshot: Captured.JPG]

            • M
              Michael Vincent @Zorba Greek
              last edited by Jan 12, 2023, 6:27 PM

              @Zorba-Greek said in Extract specific data from log files?:

              I want to delete everything except the selected text

              An expert more versed than I may be able to help with a single stroke - for me, without a scripting solution, this is a multi-step process.

              Now that you have the lines you want, use the Search => Mark menu item to bookmark the lines:

              Find what:

              Fleet No : \d{4} tag : \d{16}
              

              Check the “Bookmark line” checkbox.


              Then right-click the bookmark margin and select “Copy Bookmarked Lines”, open a new tab and paste.
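The multi-step process (replace, bookmark the rewritten lines, copy them out) can also be approximated in a script. A rough Python sketch, assuming Python's `re` module rather than the Notepad++ engine; the `sample` text and all names are illustrative. The mark pattern is written to match the replacement text `Fleet No : $1 tag : $2`, and `\1` and `\2` are Python's spelling of `$1` and `$2`.

```python
import re

# A few lines of the log excerpt from the question (illustrative sample).
sample = """\
[22-12-20 21:16:04.521]   FROM LIVE   <$011B4B50:FleetCard_10:1,51.75,1,1,200001,Fleet No,5411,34319,TARJETA,,51.75,,3,0,1>
[22-12-20 21:16:04.553]   auth accepted tag=9812120450668474 device=V2
[22-12-20 21:16:40.185]   FROM LIVE   <02:PAYDONE=0000022851>
[22-12-20 21:17:21.270]   FROM LIVE   <$011B4910:FleetCard_10:1,49.48,1,1,200001,Fleet No,5237,34320,TARJETA,,49.48,,2,0,1>
[22-12-20 21:17:21.333]   auth accepted tag=9812120450669349 device=V1
"""

FIND = re.compile(r"Fleet No,(\d{4}),.*?auth accepted tag=(\d{16})", re.DOTALL)
MARK = re.compile(r"Fleet No : \d{4} tag : \d{16}")

# Step 1: the replace, which puts each result on a line of its own.
replaced = FIND.sub(r"\n\nFleet No : \1 tag : \2\n\n", sample)

# Step 2: keep only the lines the mark pattern hits (the "bookmarked" lines).
kept = [line for line in replaced.splitlines() if MARK.search(line)]
print("\n".join(kept))
```

Working on the in-memory string plays the role of working on a copy of the file, so the original log is never modified.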

              Cheers.

              • T
                Terry R @Michael Vincent
                last edited by Jan 12, 2023, 6:38 PM

                @Michael-Vincent said in Extract specific data from log files?:

                An expert more versed than I may be able to help with a single stroke - for me, without a scripting solution, this is a multi-step process.

                Don’t underestimate your abilities @Michael-Vincent. Your answer is perfectly acceptable, and I would have also likely given the same. Often when providing solutions to those who are yet to understand the complexities of regex, doing it in several “easy” steps is easier to understand.

                The alternative would be to use alternation which is not so easy to understand for the uninitiated.

                Terry

                • Z
                  Zorba Greek @Michael Vincent
                  last edited by Jan 12, 2023, 7:39 PM

                  @Michael-Vincent

                  Thanks @Michael-Vincent , helpful …

                  • L
                    Lycan Thrope @Terry R
                    last edited by Jan 13, 2023, 3:50 AM

                    @Terry-R ,

                    I agree with your analysis. One can’t understand the more complex solutions, nor how to arrive at them, without first taking the baby steps. I’m reminded of this myself as I take on another UDL, this time for one of our older versions of dBASE, for those still using it and wanting to use it in NPP. I’ve started using Mark in NPP as a test bed for my fledgling regexes instead of going straight to Regex101.com, since it’s almost as interactive as their web page. And since Regex101 doesn’t offer Boost as a test engine, it behooves me to do the work in NPP with its native regex engine. I’ve found that Mark is a great way to test my regex in stages. Plus, sometimes the problem has to be divided into parts to solve it … like trimming the ellipses and page numbers from a list of commands taken from a document, then replacing the spaces with their hex equivalent to make it work in NPP, etc.

                    Until I came to do this UDL for the language, I hadn’t even scratched the surface of NPP, and now that I have, I find myself in awe of the things it’s going to allow me to do…but only if I take it in baby steps. :)

                    • G
                      guy038
                      last edited by guy038 Jan 13, 2023, 2:15 PM Jan 13, 2023, 11:04 AM

                      Hello, @zorba-greek, @michael-vincent, @terry-r and All,

                      A one-step solution would be to use the following regex S/R :

                      SEARCH (?xs-i) ^ .+? Fleet \x20 No , ( \d+ ) , .+? tag= ( \d+) .+? $ | ^ .+

                      REPLACE (?1Fleet No \: $1 tag \: $2:)

                      This regex :

                      • Searches, from the beginning of a line, for the shortest range of text that ends with two consecutive lines, without the ending line-break, containing the string Fleet No, with this exact case, in the first line and the string tag=, with this exact case, in the second one. It replaces the whole range, including any preceding lines that do not contain Fleet No, with the string Fleet No : $1 tag : $2, where $1 and $2 are the numbers located after Fleet No, and tag=

                      • When no remaining line contains the string Fleet No, it grabs all the remaining text till the very end of the file and deletes it
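To see which alternative fires and how many lines each match covers, the search can be replayed outside the editor. A small Python sketch, with assumptions: Python's `re` is not the Boost engine used by Notepad++, the flags are passed as arguments instead of the inline `(?xs-i)`, `re.MULTILINE` is added so that `^` and `$` match at line boundaries, and the `dummy` text uses plain `\n` line breaks. All names are illustrative.

```python
import re

# Six log lines: two plain lines, a "Fleet No" line, its "tag=" line,
# then two trailing lines with no "Fleet No" at all.
dummy = """\
[22-12-20 21:18:44.345]   FROM LIVE   <02:PAYDONE=0000022852>
[22-12-20 21:19:16.399]   FROM LIVE   <03:PAYDONE=0000022853>
[22-12-20 21:21:54.286]   FROM LIVE   <$011B4B50:FleetCard_10:1,51.75,1,1,200001,Fleet No,5418,34322,TARJETA,,51.75,,3,0,1>
[22-12-20 21:21:54.301]   auth accepted tag=9812120450668854 device=V2
[22-12-20 21:25:11.284]   FROM LIVE   <02:PAYDONE=0000022854>
[22-12-20 21:25:20.141]   FROM LIVE   <04:PAYDONE=0000022855>
"""

PATTERN = re.compile(
    r"""
      ^ .+? Fleet \x20 No , ( \d+ ) , .+? tag= ( \d+ ) .+? $   # first alternative
    | ^ .+                                                     # second alternative
    """,
    re.VERBOSE | re.DOTALL | re.MULTILINE,
)

# Record, for every match, which alternative matched and how many lines it spans.
summary = []
for m in PATTERN.finditer(dummy):
    which = "first" if m.group(1) is not None else "second"
    summary.append((which, len(m.group(0).splitlines())))
print(summary)
```

With this input, the first alternative covers four lines (the two plain lines plus the Fleet No and tag= lines) and the second alternative covers the two trailing lines.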


                      So for instance, I will try to describe the process with this dummy example below :

                      
                      FIRST alternative : (?xs-i) ^ .+? Fleet \x20 No , ( \d+ ) , .+? tag= ( \d+) .+? $
                      
                      
                      [22-12-20 21:18:44.345]   FROM LIVE   <02:PAYDONE=0000022852>CRLF
                      <----------------------------------------------------------------
                      ^                           .+?
                      
                      [22-12-20 21:19:16.399]   FROM LIVE   <03:PAYDONE=0000022853>CRLF
                      -----------------------------------------------------------------
                                                  .+?
                      
                      [22-12-20 21:21:54.286]   FROM LIVE   <$011B4B50:FleetCard_10:1,51.75,1,1,200001,Fleet No,5418,34322,TARJETA,,51.75,,3,0,1>CRLF
                      -------------------------------------------------------------------------------->Fleet No,<--><--------------------------------
                                                  .+?                                                  Fleet No, \d+               .+?
                                                                                                       Fleet No, $1
                      [22-12-20 21:21:54.301]   auth accepted tag=9812120450668854 device=V2CRLF
                      --------------------------------------->tag=<--------------><--------> ( line-break NOT included in the regex )
                                      .+?                     tag=   ( \d+ )          .+?   $
                                                              tag=      $2
                      
                      
                      SECOND alternative : ^ .+
                      
                      
                      [22-12-20 21:25:11.284]   FROM LIVE   <02:PAYDONE=0000022854>CRLF
                      -----------------------------------------------------------------
                      ^                               .+
                      
                      [22-12-20 21:25:20.141]   FROM LIVE   <04:PAYDONE=0000022855>CRLF
                      ----------------------------------------------------------------- ( End of file )
                                                      .+
                      

                      Now, given the exact INPUT text, provided by @zorba-greek :

                      [22-12-20 21:16:04.521]   FROM LIVE   <$011B4B50:FleetCard_10:1,51.75,1,1,200001,Fleet No,5411,34319,TARJETA,,51.75,,3,0,1>
                      [22-12-20 21:16:04.553]   auth accepted tag=9812120450668474 device=V2
                      [22-12-20 21:16:40.185]   FROM LIVE   <02:PAYDONE=0000022851>
                      [22-12-20 21:17:20.677]     TO LIVE   <$011B4910:FleetCard_1:9812120450669349>
                      [22-12-20 21:17:21.270]   FROM LIVE   <$011B4910:FleetCard_10:1,49.48,1,1,200001,Fleet No,5237,34320,TARJETA,,49.48,,2,0,1>
                      [22-12-20 21:17:21.333]   auth accepted tag=9812120450669349 device=V1
                      [22-12-20 21:18:44.345]   FROM LIVE   <02:PAYDONE=0000022852>
                      [22-12-20 21:19:16.399]   FROM LIVE   <03:PAYDONE=0000022853>
                      [22-12-20 21:20:18.292]     TO LIVE   <$011B5150:FleetCard_1:9812120450669482>
                      [22-12-20 21:20:19.073]   FROM LIVE   <$011B5150:FleetCard_10:1,51.75,1,1,200001,Fleet No,2001,34321,TARJETA,,51.75,,3,0,1>
                      [22-12-20 21:20:19.167]   auth accepted tag=9812120450669482 device=V1
                      [22-12-20 21:21:53.536]     TO LIVE   <$011B4B50:FleetCard_1:9812120450668854>
                      [22-12-20 21:21:54.286]   FROM LIVE   <$011B4B50:FleetCard_10:1,51.75,1,1,200001,Fleet No,5418,34322,TARJETA,,51.75,,3,0,1>
                      [22-12-20 21:21:54.301]   auth accepted tag=9812120450668854 device=V2
                      [22-12-20 21:25:11.284]   FROM LIVE   <02:PAYDONE=0000022854>
                      [22-12-20 21:25:20.141]   FROM LIVE   <04:PAYDONE=0000022855>
                      

                      We get the expected OUTPUT text :

                      Fleet No : 5411 tag : 9812120450668474
                      Fleet No : 5237 tag : 9812120450669349
                      Fleet No : 2001 tag : 9812120450669482
                      Fleet No : 5418 tag : 9812120450668854
                      

                      Notes :

                      • In the search regex, I use the free-spacing mode, (?x), for an easy reading of the different parts of this regex

                      • You probably noticed that all ranges of text, in my regex, need to be non-greedy ranges ( Syntax = .+? ), as each range must not contain, itself, the strings Fleet No and tag= nor the final line break !

                      • As this regex does not include the final line-breaks, they are kept, as is, in the OUTPUT text

                      • The replacement regex (?1Fleet No \: $1 tag \: $2:) means :

                        • IF a Fleet number is found ( (?1... ), it rewrites the string Fleet No : followed by the Fleet number ( $1 ), the string tag : and the tag number ( $2 ), all separated with space chars

                        • ELSE it replaces with everything between the last : character, after $2, and the closing parenthesis of the conditional replacement, i.e. nothing, so it deletes all the remaining text till the very end of the current file !

                      • Note that the first two : chars, in the replacement, are literal characters and must be escaped, in order to be rewritten as is when group 1 is present
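Python's `re.sub` has no conditional replacement like `(?1...:...)`, but a replacement function can play the same role. The sketch below is an approximation under assumptions, not the Notepad++ behaviour itself: it uses Python's `re` engine instead of Boost, passes the flags as arguments (with `re.MULTILINE` added for `^` and `$`), and works on a short sample with `\n` line breaks. The names `PATTERN`, `conditional` and `log` are illustrative.

```python
import re

# A short sample: two Fleet No / tag= pairs followed by a trailing line.
log = """\
[22-12-20 21:16:04.521]   FROM LIVE   <$011B4B50:FleetCard_10:1,51.75,1,1,200001,Fleet No,5411,34319,TARJETA,,51.75,,3,0,1>
[22-12-20 21:16:04.553]   auth accepted tag=9812120450668474 device=V2
[22-12-20 21:16:40.185]   FROM LIVE   <02:PAYDONE=0000022851>
[22-12-20 21:17:21.270]   FROM LIVE   <$011B4910:FleetCard_10:1,49.48,1,1,200001,Fleet No,5237,34320,TARJETA,,49.48,,2,0,1>
[22-12-20 21:17:21.333]   auth accepted tag=9812120450669349 device=V1
[22-12-20 21:18:44.345]   FROM LIVE   <02:PAYDONE=0000022852>
"""

PATTERN = re.compile(
    r"""
      ^ .+? Fleet \x20 No , ( \d+ ) , .+? tag= ( \d+ ) .+? $   # range ending at a tag= line
    | ^ .+                                                     # otherwise: the rest of the text
    """,
    re.VERBOSE | re.DOTALL | re.MULTILINE,
)

def conditional(match):
    """Emulate (?1Fleet No \\: $1 tag \\: $2:) with a callback."""
    if match.group(1) is not None:  # IF group 1 took part in the match
        return f"Fleet No : {match.group(1)} tag : {match.group(2)}"
    return ""                       # ELSE replace with nothing

result = PATTERN.sub(conditional, log)
print(result, end="")
```

Because the matches stop before the line break, each rewritten pair stays on its own line, and the trailing text caught by the second alternative is dropped.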

                      Best Regards,

                      guy038

                      The Community of users of the Notepad++ text editor.