    Help me please, how can I extract the mail and the next column?

    • oscar remiccc

      Help me please, how can I extract the mail and the next column?

      678:ina_caeter@yahoo.com:Lina:S:{0}:2.0791812460:C:{5,0,2}:codepostal_23456:student:Kenty:level57:Elite:Knight
      889:Dogietreats:Terry:1.5000000000000000:0.3010299957:C:{2}:1:codepostal_13567:doctor:Tygger:level34:Elder:Druid
      990:Charz4you:367589:1.7500000000000000:0.6020599913:S:{0,2}:34:codepostal_45217:architect:Pog:level122:Elite:Knight
      .
      .
      .
      .
      .
      This is the output I need:
      ina_caeter@yahoo.com:Lina
      Dogietreats:Terry
      Charz4you:367589

      My file has 7286246 lines.

      • Alan Kilborn

        @oscar-remiccc

        What have you tried already?

        • guy038

          Hi, @oscar-remiccc,

          Let’s try to be logical!

          • The fields of your text are delimited by a colon character.

          • This can be treated as a single-line search, because the fields of a record are never split across several lines.

          • As you want to keep only the 2nd and 3rd fields, the search has to refer to an anchor (the beginning-of-line assertion ^ is the obvious choice).

          • To match one complete field between two : delimiters, we search for a non-empty run of consecutive characters, each different from a colon and from any EOL character: hence the negated character class [^:\n\r], followed by +.

          From the above, one solution could be:

          SEARCH ^[^:\n\r]+:([^:\n\r]+:[^:\n\r]+):.+

          REPLACE \1

          Notes:

          • From the beginning of the line ^, this regex matches the whole line contents (the first three fields, followed by the remainder of the line, :.+).

          • The block [^:\n\r]+:[^:\n\r]+ (the 2nd and 3rd fields), surrounded by parentheses, defines group 1.

          • So, in the replacement, the whole line is replaced with these 2nd and 3rd fields, separated by a : character.
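For a file of this size, the same search and replace could also be scripted outside the editor. Below is a minimal sketch in Python, assuming its standard `re` module treats this particular pattern the same way as the editor does (Notepad++ uses a different regex engine, so this illustrates the idea rather than the editor's own behavior). The function name `keep_fields_2_and_3` is hypothetical.

```python
import re

# Same pattern as the SEARCH field above: skip field 1, capture fields 2 and 3,
# then consume the rest of the line.
PATTERN = re.compile(r"^[^:\n\r]+:([^:\n\r]+:[^:\n\r]+):.+")

def keep_fields_2_and_3(line: str) -> str:
    """Return 'field2:field3'; a line the pattern does not match is returned unchanged."""
    return PATTERN.sub(r"\1", line)

# The three sample lines from the question.
samples = [
    "678:ina_caeter@yahoo.com:Lina:S:{0}:2.0791812460:C:{5,0,2}:codepostal_23456:student:Kenty:level57:Elite:Knight",
    "889:Dogietreats:Terry:1.5000000000000000:0.3010299957:C:{2}:1:codepostal_13567:doctor:Tygger:level34:Elder:Druid",
    "990:Charz4you:367589:1.7500000000000000:0.6020599913:S:{0,2}:34:codepostal_45217:architect:Pog:level122:Elite:Knight",
]

for sample in samples:
    print(keep_fields_2_and_3(sample))
```

Because `.+` stops before a line break, a trailing newline survives the substitution, so the function can be applied to each line while iterating over an open file instead of loading all lines into memory.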


          Using the lazy quantifier +?, this regex S/R becomes a bit shorter:

          SEARCH (?-s)^.+?:(.+?:.+?):.+

          REPLACE \1

          Note that the first part, (?-s)^.+?:, searches from the beginning of the line ^ for the shortest non-empty range of standard characters that is followed by a colon. So this range does not contain any : character ;-))
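To compare the two variants, here is a small Python sketch. It rests on some assumptions: Python's `re` does not accept a global `(?-s)`, so that prefix is dropped (the dot already excludes line breaks by default there), and the line `a::b:c:d` is a hypothetical input with an empty 2nd field, not taken from the original data. On the sample data both patterns agree; on the hypothetical line they differ, because a lazy `.+?` may still absorb a colon when that is the only way to complete the match.

```python
import re

# Negated-class version from the first S/R.
STRICT = re.compile(r"^[^:\n\r]+:([^:\n\r]+:[^:\n\r]+):.+")
# Lazy version; (?-s) omitted because Python's dot excludes newlines by default.
LAZY = re.compile(r"^.+?:(.+?:.+?):.+")

line = "889:Dogietreats:Terry:1.5000000000000000:0.3010299957:C:{2}:1:codepostal_13567:doctor:Tygger:level34:Elder:Druid"
strict_result = STRICT.sub(r"\1", line)
lazy_result = LAZY.sub(r"\1", line)
print(strict_result, lazy_result)

# Hypothetical line with an empty 2nd field.
odd = "a::b:c:d"
strict_odd = STRICT.sub(r"\1", odd)  # no match, so the line is left unchanged
lazy_odd = LAZY.sub(r"\1", odd)      # the lazy group starts with the second colon
print(repr(strict_odd), repr(lazy_odd))
```

If empty fields can occur in the real file, the negated-class version is the more predictable of the two, since it simply leaves such lines untouched.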

          Best Regards,

          guy038

          • oscar remiccc

            @guy038
            Thank you very much, guy038. One question: I was trying the code below on this simple example, but it did not work, because it removes the last character.

            990:Charz4you:367589:1.7500000000000000:0.6020599913

            SEARCH: ^([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?).$

            REPLACE $2:$5

            Charz4you:0.602059991

            It removes the final 3. What am I doing wrong, please?

            • guy038

              Hi, @oscar-remiccc,

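              So, to get the 2nd and 5th fields only, just delete the last . before the $ in your regex, as below. That . matches the final character of the line outside group 5, which is why the last digit was dropped. That should do the trick!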

              SEARCH ^([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?)$

              REPLACE $2:$5

              You’ll get the expected text:

              Charz4you:0.6020599913
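The effect of the stray dot can be reproduced with a short Python sketch. It assumes Python's `re` module as a stand-in for the editor's engine, and note that Python writes the back-references as `\2` and `\5` where the posts use `$2` and `$5`.

```python
import re

line = "990:Charz4you:367589:1.7500000000000000:0.6020599913"

# Original attempt: the "." before "$" must match one character,
# so the last character of the line falls outside group 5.
with_dot = re.compile(r"^([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?).$")
# Corrected regex: group 5 now runs up to the end of the line.
without_dot = re.compile(r"^([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?)$")

truncated = with_dot.sub(r"\2:\5", line)
fixed = without_dot.sub(r"\2:\5", line)

# Show exactly which text the stray dot consumed.
m = with_dot.match(line)
consumed_by_dot = line[m.end(5):]

print(truncated)
print(fixed)
print(consumed_by_dot)
```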
              

              I think that, using the syntax of my previous post, we can simplify the search regex, as below :

              SEARCH ^([^:\n\r]+):([^:\n\r]+):([^:\n\r]+):([^:\n\r]+):([^:\n\r]+)

              REPLACE $2:$5

              But you do not need to store all the fields in parentheses! Just store fields 2 and 5, and if you include the trailing : with field 2 (which is now group 1), we get the regex S/R:

              SEARCH ^[^:\n\r]+:([^:\n\r]+:)[^:\n\r]+:[^:\n\r]+:([^:\n\r]+)

              REPLACE $1$2

              Finally, you do not need to spell out fields 3 and 4 either! So, for lines of exactly five fields such as your example, the part [^:\n\r]+:[^:\n\r]+ (fields 3 and 4) can simply be changed into .+, giving the final S/R:

              SEARCH ^[^:\n\r]+:([^:\n\r]+:).+:([^:\n\r]+)

              REPLACE $1$2
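The three variants above can be checked side by side with a small Python sketch. It makes some assumptions: Python's `re` stands in for the editor's engine, the helper name `apply_all` is hypothetical, and so is the six-field line, which is built by appending an invented `:extra` field to the example. On the five-field example all three agree. On a longer line the greedy `.+` runs up to the last colon, so the shortest regex keeps the last field rather than the 5th.

```python
import re

# The three SEARCH patterns from this post, in the order they were given.
ALL_GROUPS = re.compile(r"^([^:\n\r]+):([^:\n\r]+):([^:\n\r]+):([^:\n\r]+):([^:\n\r]+)")
TWO_GROUPS = re.compile(r"^[^:\n\r]+:([^:\n\r]+:)[^:\n\r]+:[^:\n\r]+:([^:\n\r]+)")
SHORTEST = re.compile(r"^[^:\n\r]+:([^:\n\r]+:).+:([^:\n\r]+)")

def apply_all(line: str) -> tuple:
    """Apply each S/R variant to the same line (Python uses \\N where the posts use $N)."""
    return (
        ALL_GROUPS.sub(r"\2:\5", line),
        TWO_GROUPS.sub(r"\1\2", line),
        SHORTEST.sub(r"\1\2", line),
    )

five = "990:Charz4you:367589:1.7500000000000000:0.6020599913"
six = five + ":extra"  # hypothetical sixth field, not from the original data

print(apply_all(five))
print(apply_all(six))
```

For input like the 14-field lines in the first post, the explicit variants are therefore the safer choice if the 5th field is what is wanted; they would leave the fields after the 5th in place unless a trailing `.*` is added to the search.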

              Best Regards,

              guy038

              • oscar remiccc

                @guy038
                You are a great teacher, thank you very much.

                The Community of users of the Notepad++ text editor.