Join/Merge two lines on condition



  • Hi all. A data file has “name” and “numb” lines sequentially.
    Is there a way to combine the name/numb into one line?
    And as you see in the example below, “sam” does not have
    a “numb” line, so that one would have to be skipped over.
    Any thoughts? I’ve been at this for a while (:

    name=john
    numb=053a82bf31d0f63e669e9035bea4f311
    name=pete
    numb=a0c49c5e34724c1247b314a3647c3acb
    name=sam
    name=eric
    numb=e368627d9bd60c6e38349532d966df4e
    name=steve
    numb=b03438713f2894d97f139721f98b2794



  • You can do it using regular expressions in the search/replace.

    name=john
    numb=053a82bf31d0f63e669e9035bea4f311
    name=pete
    numb=a0c49c5e34724c1247b314a3647c3acb
    name=sam
    name=eric
    numb=e368627d9bd60c6e38349532d966df4e
    name=steve
    numb=b03438713f2894d97f139721f98b2794
    
    • Find What = (?-s)(name=.*)\R(numb=.*)$

    • Replace With = $1 : $2

    • Mode = Regular Expression

    • the first term, (?-s), tells it to have . not match EOL characters

    • the second, (name=.*), puts the whole name=... line into $1

    • the \R eats one newline (EOL) sequence

    • the final parenthetical, (numb=.*), stores the numb=... line into $2

    • the $ says the match must extend to the end of the line

    In the replacement, it’s just the two stored values, separated by a literal space, colon, space. (You wanted ideas, and didn’t specify a separator.)

    result =

    name=john : numb=053a82bf31d0f63e669e9035bea4f311
    name=pete : numb=a0c49c5e34724c1247b314a3647c3acb
    name=sam
    name=eric : numb=e368627d9bd60c6e38349532d966df4e
    name=steve : numb=b03438713f2894d97f139721f98b2794
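
If you ever need the same join outside Notepad++, here is a minimal Python sketch of the idea. It is an approximation, not a copy of the regex above: Python's built-in re module has no \R, so the EOL alternation is spelled out, and [^\r\n]* stands in for .* with (?-s). The names EOL, PAIR, and join_pairs are made up for this illustration.

```python
import re

# Python's re module has no \R, so spell out the three EOL styles.
# CRLF comes first so a Windows line ending is consumed as one unit.
EOL = r"(?:\r\n|\n|\r)"

# [^\r\n]* plays the role of .* under (?-s): it stops at any EOL character.
PAIR = re.compile(r"(name=[^\r\n]*)" + EOL + r"(numb=[^\r\n]*)")


def join_pairs(text: str) -> str:
    """Join each name= line with the numb= line that immediately follows it.

    A name= line that is not directly followed by a numb= line is left alone.
    """
    return PAIR.sub(r"\1 : \2", text)


if __name__ == "__main__":
    sample = (
        "name=john\n"
        "numb=053a82bf31d0f63e669e9035bea4f311\n"
        "name=sam\n"
        "name=eric\n"
        "numb=e368627d9bd60c6e38349532d966df4e\n"
    )
    print(join_pairs(sample), end="")
```

As in Notepad++, the “sam” line stays on its own because the text right after its EOL is another name= line, so the second group cannot match there.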
    

    -----

    FYI: if you have further regex needs, study this FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting the data so that the forum doesn’t mangle it (so that it shows “exactly”, as I said earlier), see this help-with-markdown post, where @Scott-Sumner gives a great summary of how to use Markdown for this forum’s needs.
    Please note that for all “regex” queries – or queries where you want help “matching” or “marking” or “bookmarking” a certain pattern, which amounts to the same thing – it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.



  • Thank you for posting that code, and for the summary of how it works.
    Sadly tho, I get the “Find: Can’t find the text” error.
    I don’t think the \r or \n works for me in any capacity here; always see that error.

    To rule out the possibility that the data file (.xml) is corrupt, I copied a few
    lines (one by one) to the plain (windows) text file, and then had Notepad++
    load that one, but saw the same error…

    I’m on Windows 10 with Notepad++ v7.6.2 in case this matters.



  • As it turns out, my file was somewhat incompatible.
    For another test, I created a new file from within Notepad++,
    added a few lines, and your code worked flawlessly.
    Perhaps some bad CR or LF in my original file?



  • If you used the capital \R in my regex above, it should have worked, whether your file had CRLF (windows, aka \r\n) or LF (linux and modern mac, \n), or CR (old macos \r). It should have worked even if some were CRLF and some were LF. It wouldn’t work, however, if there was a blank line between, or other such oddity.

    If you open the original file in Notepad++, and enable View > Show Symbol > Show All Characters, you should see little black boxes for the end of line, either two with CR and LF, or one or the other… Plus, your status bar should list what EOL Notepad++ detects (Windows (CR LF), Unix (LF), or Macintosh (CR)). If you give us a screenshot (see the help-with-markdown post for screenshot help; use the ![](url.to/image.png)-notation so the image is visible in the forum), we might be able to see why the regex didn’t work in your original file.
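
If a screenshot is awkward, a rough count of the line endings can tell a similar story. The sketch below is only an assumption-laden helper, not a Notepad++ feature: count_eols is a hypothetical name, and the byte counting only makes sense for ANSI or UTF-8 files (a UTF-16 file would need decoding first).

```python
def count_eols(data: bytes) -> dict:
    """Count CRLF, lone LF, and lone CR sequences in raw file bytes.

    Assumes a single-byte-compatible encoding such as ANSI or UTF-8.
    """
    crlf = data.count(b"\r\n")
    # Every CRLF contains one \n and one \r, so subtract them out
    # to get the lone LF and lone CR counts.
    lone_lf = data.count(b"\n") - crlf
    lone_cr = data.count(b"\r") - crlf
    return {"CRLF": crlf, "LF": lone_lf, "CR": lone_cr}


if __name__ == "__main__":
    import sys

    # Usage: python count_eols.py path/to/file
    with open(sys.argv[1], "rb") as handle:
        print(count_eols(handle.read()))
```

A file with more than one non-zero count has mixed line endings. That alone should not defeat \R, but it is a hint that the file was assembled from several sources and is worth a closer look.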

