Regex usage in notepad



  • How can I get the following result?
    Thank you

    4:testsubject:sv1234@gmail.com:1255798283:77.70.61.38:0:0:255798283:125579828300&0:61b68ddbc699f5576b4351a6292b771a:12564030830:testsubject:testsubject:testsubject:testsubject:0:294  
    6:dokyto:1qw234@abv.bg:1255801481:188.126.8.3:518:Newbie:1:20:1324664270:0:0:32:1327264141:13275637840&1:f44114cdfbcd9b2ddc8ad7f7a5b7ac99:1327868941:1dokyto:dokyto:0:a:{s:7:\"friends\";a:8:{i:15;s:1:\"1\";i:20;s:1:\"1\";i:35;s:1:\"1\";i:46;s:1:\"1\";i:62;s:1:\"1\";i:97;s:1:\"1\";i:1162;s:1:\"1\";i:16151;s:1:\"1\";}s:7:\"my_blog\";s:1:\"8\";s:14:\"shoutbox_prefs\";s:185:\"a:6:{s:15:\"shoutbox_height\";i:275;s:16
    7:oldboy:7:aaaa@gmail.com:1255801646:188.126.13.29:9:Newbie:1:20:1338048312:14:1971:0:333977242:1338052672:1:0:1&1:b20a119707e80f4a6f592a901cf15b20:13386525371:oldboy:oldboy:0:a:2:{s:7:\"friends\";:s:14:\"shoutbox_prefs\";s:185:\"a:6:{s:15:\"shoutbox_height\";i:275;s:16:\"shoutbox_gheight\";i:263;s:14:\"global_display\";i:1;s:15:\"enter_key_shout\";i:1;s:21:\"enable_quick_commands
    

    Result:

    sv1234@gmail.com:61b68ddbc699f5576b4351a6292b771a
    1qw234@abv.bg:f44114cdfbcd9b2ddc8ad7f7a5b7ac99
    aaaa@gmail.com:b20a119707e80f4a6f592a901cf15b20
    


  • You really want column 3, column 3, and column 4 for the email, and column 11, column 15, and column 20 for the hex string?

    First, as I pointed out here for another similarly formatted data set, if you are assuming that colons are not allowed in email addresses, you are wrong. Good luck when someone’s email does contain a colon.

    In the case of that wrong assumption, and assuming it’s always 32 hexadecimal digits, I’d search for

    • FIND = ^.*?:([^\:]+\@[^\:]+)(?=:).*?:([[:xdigit:]]{32}):.*
    • REPLACE = $1:$2
    • MODE = regular expression
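For anyone who wants to sanity-check that search outside Notepad++, here is a minimal Python sketch. It is only an illustration under a few assumptions: Python's `re` module has no `[[:xdigit:]]` class, so `[0-9A-Fa-f]` stands in for it; the replacement is written `\1:\2` instead of `$1:$2`; and the second and third sample lines are cut off shortly after the hash to keep the block readable. The names `PATTERN`, `SAMPLE`, and `result` are made up for this example.

```python
import re

# Python's re has no POSIX [[:xdigit:]] class, so [0-9A-Fa-f] is used instead.
# re.MULTILINE lets ^ anchor at the start of every line, as in Notepad++.
PATTERN = re.compile(
    r"^.*?:([^:]+@[^:]+)(?=:).*?:([0-9A-Fa-f]{32}):.*",
    re.MULTILINE,
)

# First line is the full sample; the other two are truncated after the hash.
SAMPLE = "\n".join([
    "4:testsubject:sv1234@gmail.com:1255798283:77.70.61.38:0:0:255798283:"
    "125579828300&0:61b68ddbc699f5576b4351a6292b771a:12564030830:"
    "testsubject:testsubject:testsubject:testsubject:0:294",
    "6:dokyto:1qw234@abv.bg:1255801481:188.126.8.3:518:Newbie:1:20:"
    "1324664270:0:0:32:1327264141:13275637840&1:"
    "f44114cdfbcd9b2ddc8ad7f7a5b7ac99:1327868941:1dokyto",
    "7:oldboy:7:aaaa@gmail.com:1255801646:188.126.13.29:9:Newbie:1:20:"
    "1338048312:14:1971:0:333977242:1338052672:1:0:1&1:"
    "b20a119707e80f4a6f592a901cf15b20:13386525371:oldboy",
])

# Python spells the group references \1 and \2 rather than $1 and $2.
result = PATTERN.sub(r"\1:\2", SAMPLE)
print(result)
```

On these three lines the output should match the requested result, one `email:hash` pair per line. Lines that do not fit the pattern are left unchanged by `sub`, which is a quick way to spot data that breaks the assumptions.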

    If your assumptions were different than mine, then actually be explicit about what you want, rather than making us guess.

    Don’t forget: we aren’t here to just do your work for you; we’re here to help you learn how to do it yourself. You’ve been asking questions with similarly-formatted files for a year now under this username (1, 2). At some point, you need to prove that you’re learning, or we’ll stop giving you the solution. The resources I link below will help you. Please show a willingness to learn.

    I do appreciate that you actually markup your data, so it comes across as literal text; that is helpful. Thank you for that.

    -----

    Please Read And Understand This

    FYI: I often add this to my response in regex threads, unless I am sure the original poster has seen it before. Here is some helpful information for finding out more about regular expressions, and for formatting posts in this forum (especially quoting data) so that we can fully understand what you’re trying to ask:

    This forum is formatted using Markdown. Fortunately, it has a formatting toolbar above the edit window, and a preview window to the right; make use of those. The </> button formats text as “code”, so that the text you format with that button will come through literally; use that formatting for example text that you want to make sure comes through literally, no matter what characters you use in the text (otherwise, the forum might interpret your example text as Markdown, with unexpected-for-you results, giving us a bad indication of what your data really is).

    Images can be pasted directly into your post, or you can hit the image button. (For more about how to manually use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end.) Please use the preview window on the right to confirm that your text looks right before hitting SUBMIT. If you want to clearly communicate your text data to us, you need to properly format it.

    If you have further search-and-replace (“matching”, “marking”, “bookmarking”, regular expression, “regex”) needs, study the official Notepad++ searching using regular-expressions docs, as well as this forum’s FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting, see the paragraph above.

    Please note that for all regex and related queries, it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.

    Here is the way I usually break down trying to figure out a regex (whether it’s for myself or for helping someone in the forum):

    1. Compare which portions of each line I want to match are identical in every line (“constants”), and which parts I want to allow to differ from line to line (“variables”) while still being part of the match.
    2. Look at both the variables and constants, and see what portions of each I’ll want to keep or move around, vs which parts get thrown away completely. Each sub-component that I want to keep will be put in a regex group. Anything that gets completely thrown away doesn’t need to be in a group, though sometimes I put it in a numbered (___) or unnumbered (?:___) group anyway, if I have a good reason for it. Anything that needs to be split apart, I break into multiple groups, instead of having it as one group.
    3. For each group, I do a mental “how would I describe to my son how to correctly match these characters?” – which should hopefully give me a simple, foolproof algorithm of characters that must match or must not match; then I ask, “how would I translate those instructions into regex sequences?” If I don’t know the answer to the second, I read documentation, or ask a specific question.
    4. Try it, debug, iterate.
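As a rough illustration of that breakdown applied to the data in this thread, the sketch below spells the pattern out piece by piece using Python's verbose regex mode. This is an assumption-laden example, not something from the original post: the names `LINE_RE`, `email`, and `digest` are hypothetical, and `[0-9A-Fa-f]` replaces `[[:xdigit:]]` because Python's `re` does not support POSIX classes.

```python
import re

# Each piece is labeled as kept (captured) or thrown away (not captured).
LINE_RE = re.compile(
    r"""
    ^.*?:                         # thrown away: leading fields
    (?P<email>[^:]+@[^:]+)        # kept, variable: colon-free text around an @
    (?=:)                         # constant: a colon must follow the email
    .*?:                          # thrown away: fields between email and hash
    (?P<digest>[0-9A-Fa-f]{32})   # kept, variable: exactly 32 hex digits
    :.*                           # thrown away: the rest of the line
    """,
    re.VERBOSE,
)

# The first sample line from the question.
line = (
    "4:testsubject:sv1234@gmail.com:1255798283:77.70.61.38:0:0:255798283:"
    "125579828300&0:61b68ddbc699f5576b4351a6292b771a:12564030830:"
    "testsubject:testsubject:testsubject:testsubject:0:294"
)

match = LINE_RE.match(line)
if match:
    print(f"{match.group('email')}:{match.group('digest')}")
else:
    print("no match: time to debug and iterate")
```

Writing the pattern this way makes the last step easier: when a line fails to match, you can comment out or loosen one labeled piece at a time to find which description was wrong.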


  • @PeterJones

    It feels like we are doing someone’s school assignment again, what with these very similar postings…



  • Hi, @rovitie, @peterjones, @alan-kilborn and All,

    Just a variant, slightly shorter ;-))

    SEARCH ^.+?:([^:]+@.+?:).+:([[:xdigit:]]{32}):.*

    REPLACE $1$2
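A quick way to check that the two searches in this thread agree on the sample data is to run both in Python. This is a sketch under assumptions: `[0-9A-Fa-f]` stands in for `[[:xdigit:]]`, the sample lines after the first are truncated shortly after the hash, and the names `first`, `variant`, `lines`, and `results` are made up for the example. Note that this variant captures the trailing colon inside group 1, which is why its replacement has no literal colon.

```python
import re

HEX32 = "[0-9A-Fa-f]{32}"  # stand-in for [[:xdigit:]]{32}

# The earlier search, replaced with group1 + ":" + group2.
first = re.compile(r"^.*?:([^:]+@[^:]+)(?=:).*?:(" + HEX32 + r"):.*")
# The shorter variant, replaced with group1 + group2 (group1 ends in ":").
variant = re.compile(r"^.+?:([^:]+@.+?:).+:(" + HEX32 + r"):.*")

lines = [
    "4:testsubject:sv1234@gmail.com:1255798283:77.70.61.38:0:0:255798283:"
    "125579828300&0:61b68ddbc699f5576b4351a6292b771a:12564030830:"
    "testsubject:testsubject:testsubject:testsubject:0:294",
    "6:dokyto:1qw234@abv.bg:1255801481:188.126.8.3:518:Newbie:1:20:"
    "1324664270:0:0:32:1327264141:13275637840&1:"
    "f44114cdfbcd9b2ddc8ad7f7a5b7ac99:1327868941:1dokyto",
    "7:oldboy:7:aaaa@gmail.com:1255801646:188.126.13.29:9:Newbie:1:20:"
    "1338048312:14:1971:0:333977242:1338052672:1:0:1&1:"
    "b20a119707e80f4a6f592a901cf15b20:13386525371:oldboy",
]

results = []
for line in lines:
    a = first.sub(r"\1:\2", line)
    b = variant.sub(r"\1\2", line)
    results.append((a, b))
    print("same" if a == b else "DIFFERENT", b)
```

The two patterns can still diverge on other data: the variant's greedy `.+` picks the last colon-delimited run of 32 hex digits on a line, while the earlier search picks the first one after the email.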

    Best Regards

    guy038

