Notepad++ regex: finding the second appearance of a character and moving it to a new line



  • Good day, guys.

    I am not familiar with regex. I'm using Notepad++ to clean some data I received. Is there a way to find the second appearance of a character and place it on a new line?

    E.g.

    1|name|surname|address|1|name|surname|address

    I need to find the second appearance of the number ‘1’ and put it on a new line:

    1|name|surname|address
    1|name|surname|address



  • You may try the following:

    Search for: (\d+)(\|.*?)\|\1(\|.*)
    Replace with: \1\2\r\n\1\3

    Breakdown:
    (\d+): find a sequence of one or more digits. Remember it for further use.
    (\|.*?): find a pipe character followed by an arbitrary sequence of characters, but make it as short as possible. Remember it for further use.
    \|\1: find a pipe character followed by the first captured sequence (the same digits again).
    (\|.*): grab a pipe character and the remainder of the current line. Remember it for further use.

    Now we have stored the entire line and can start rebuilding it:
    \1: first captured sequence (first set of digits)
    \2: second captured sequence
    \r\n: new line
    \1: first captured sequence again
    \3: third captured sequence (remainder of line)

    See http://www.boost.org/doc/libs/1_57_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html and http://www.boost.org/doc/libs/1_57_0/libs/regex/doc/html/boost_regex/format/perl_format.html for more information about RegExes in NPP.
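
For anyone who wants to try the pattern outside Notepad++, here is a minimal Python sketch of the same search and replace. It assumes Python's `re` module treats this particular pattern the same way as the Boost engine in Notepad++. The helper name `split_repeated_id` is made up for illustration. A single pass splits each line only once, so the sketch repeats the substitution until nothing changes.

```python
import re

# Same pattern as in the post above: digits, the shortest run of fields,
# a pipe followed by the same digits again, then the rest of the line.
PATTERN = re.compile(r'(\d+)(\|.*?)\|\1(\|.*)')


def split_repeated_id(text: str) -> str:
    """Break a line before each repeated occurrence of its leading number.

    Hypothetical helper, not part of Notepad++. One pass splits each line
    only once, so the substitution is repeated until the text stops changing.
    """
    while True:
        # "\n" is used here; the post uses "\r\n" for Windows files.
        new_text = PATTERN.sub(
            lambda m: f"{m.group(1)}{m.group(2)}\n{m.group(1)}{m.group(3)}",
            text,
        )
        if new_text == text:
            return new_text
        text = new_text


line = "1|name|surname|address|1|name|surname|address"
print(split_repeated_id(line))
```

In Notepad++ itself the equivalent of the loop is pressing "Replace All" again until no more matches are reported.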



  • Hello, @luthando hanana, @gerb42 and All,

    If we suppose that:

    • All your lines normally begin with a number-id, followed by a | character

    • Some of them are stuck to the previous one, with the syntax |number-id|

    you could use the regex search/replace below:

    SEARCH \|(?=\d+\|)

    REPLACE \r\n (or \n if you’re working with a Unix file)

    Notes :

    • The Vertical Line character, |, must be escaped to be seen as literal, because it’s a special regex character

    • The search regex looks for a single Vertical Line character, \|, ONLY IF it’s followed by some digits \d+ and a second Vertical Line character, \|

    • In the replacement, this Vertical Line character is simply replaced by a line break

    Best Regards,

    guy038
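
The lookahead approach above can also be sketched in Python, assuming the data follows the two suppositions in the post (every record starts with a numeric ID, and only IDs are purely numeric fields). The function name `split_on_id_boundary` and its `newline` parameter are made up for illustration and are not part of Notepad++.

```python
import re

# A pipe that is immediately followed by digits and another pipe.
# The lookahead (?=...) checks the following text without consuming it,
# so the digits and the second pipe stay in the output.
BOUNDARY = re.compile(r'\|(?=\d+\|)')


def split_on_id_boundary(text: str, newline: str = "\n") -> str:
    """Replace every pipe that precedes a numeric field with a line break.

    Hypothetical helper. Pass newline="\r\n" for Windows-style files.
    """
    # A function is used as the replacement so the newline string is
    # inserted literally, with no escape processing.
    return BOUNDARY.sub(lambda _match: newline, text)


line = "1|name|surname|address|1|name|surname|address"
print(split_on_id_boundary(line))
```

Unlike the back-reference version, this handles any number of joined records in a single pass and does not require the IDs to be equal. The trade-off is that any other purely numeric field, such as a house number stored in its own column, would also trigger a split, so it is worth checking the data first.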

