In a single line with Regular expression



  • Hi all,
    I have a txt file with, for example, 11050 lines, and I would like each paragraph to be on a single line.
    Is this possible with a regular expression? Any ideas on how to do it?

    =
    email:karl_78@yahoo.com
    Country: us || Suscriptions: [X]
    Credits: YES

    email:tortyu@hotmail.com
    Country: eu || Suscriptions: [X]
    Credits: NO

    email:isogf.takeda@live.com
    Country: es || Suscriptions: [X]
    Credits: NO

    TO

    =
    email:karl_78@yahoo.com

    email:tortyu@hotmail.com

    email:isogf.takeda@live.com



  • Edit > Blank Operations > trim trailing space
    Edit > Blank Operations > EOL to Space



  • Hello, @ALISSAbry, @gurikbal-singh and All

    I’m afraid, @gurikbal-singh, that the option Edit > Blank Operations > EOL to Space removes too many line breaks !

    The regex S/R below does the job correctly and, in addition, deletes any unnecessary extra line breaks, too ;-))

    So assuming the text below :

    
    
    
    =
    
    
    
    email:karl_78@yahoo.com
    Country: us || Suscriptions: [X]
    Credits: YES
    
    
    =
    
    
    
    email:tortyu@hotmail.com
    Country: eu || Suscriptions: [X]
    Credits: NO
    
    
    
    
    
    =
    
    
    email:isogf.takeda@live.com
    Country: es || Suscriptions: [X]
    Credits: NO
    
    
    

    Now :

    • Open the Replace dialog ( Ctrl + H )

    • FIND WHAT : (\R){2,}|\R

    • REPLACE WITH ?1\1:\x20

    • Tick the Wrap around option if necessary

    • Select the Regular expression search mode

    • Click on the Replace All button

    You should get the expected result, below :

    
    =
    email:karl_78@yahoo.com Country: us || Suscriptions: [X] Credits: YES
    =
    email:tortyu@hotmail.com Country: eu || Suscriptions: [X] Credits: NO
    =
    email:isogf.takeda@live.com Country: es || Suscriptions: [X] Credits: NO
    

    Note that only the first blank line is kept !

    Notes on the S/R :

    • The search regex contains 2 alternatives, separated with the alternation symbol |

      • (\R){2,}, which matches a run of at least 2 consecutive line breaks and stores the current line-break form in group 1

      • \R, a single line break at the end of a line

    • In the replacement regex, we’re using the conditional replacement syntax (?# ....YES.... : ....NO.... ), where # is a group number

      • If group 1 exists, all the consecutive line breaks are replaced with a single line break \1

      • If the second alternative occurs, the unique line-break must be replaced with a simple space \x20
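For readers who want to try the same logic outside Notepad++, here is a minimal Python sketch of this search/replace. It rests on some assumptions: Python's `re` module has neither `\R` nor the conditional replacement syntax, so the line break is spelled out as a hypothetical `EOL` sub-pattern and the condition becomes a small replacement function. The name `join_paragraphs` is made up for illustration.

```python
import re

# Stand-in for \R: CRLF, a lone CR (not followed by LF), or a lone LF.
EOL = r"(?:\r\n|\r(?!\n)|\n)"

# Alternative 1: a run of 2+ line breaks, the last one captured in group 1.
# Alternative 2: a single line break.
PATTERN = re.compile(rf"({EOL}){{2,}}|{EOL}")


def join_paragraphs(text: str) -> str:
    """Collapse runs of line breaks to one and turn single breaks into spaces."""
    def replace(match: re.Match) -> str:
        # Plays the role of the conditional replacement ?1\1:\x20
        return match.group(1) if match.group(1) is not None else " "
    return PATTERN.sub(replace, text)


sample = (
    "\n\n\n=\n\n\n\n"
    "email:karl_78@yahoo.com\nCountry: us || Suscriptions: [X]\nCredits: YES\n\n\n"
    "=\n\n\n\n"
    "email:tortyu@hotmail.com\nCountry: eu || Suscriptions: [X]\nCredits: NO\n"
)
print(join_paragraphs(sample))
```

The `\r(?!\n)` branch keeps a single CRLF from being counted as two separate line breaks, which matters only for Windows-style files.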

    Best Regards

    guy038



  • I am very grateful for your help, friends guy038 and gurikbal singh.



  • @guy038, Perhaps I’m not interpreting the word “paragraph” literally enough from @ALISSAbry's comment, but this regex doesn’t work unless there are at least 2 line breaks around each chunk of information. The example data doesn’t have two consecutive line breaks anywhere. What would you use for strictly the example data?



  • @cipher-1024 said:

    @guy038, Perhaps I’m not interpreting the word “paragraph” literally enough from @ALISSAbry's comment, but this regex doesn’t work unless there are at least 2 line breaks around each chunk of information. The example data doesn’t have two consecutive line breaks anywhere. What would you use for strictly the example data?

    Add line breaks around each = with this search/replace:

    Search: =
    Replace with: \n\n=\n\n
    Search mode: Regular expression
    Click Replace All

    Then apply:
    FIND WHAT : (\R){2,}|\R

    REPLACE WITH ?1\1:\x20
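The two steps above can be sketched in Python as follows. This is only an approximation under stated assumptions: the input is assumed to have a lone `=` line before each record and no blank lines, the `EOL` sub-pattern stands in for `\R` (which Python's `re` lacks), and the function name is hypothetical.

```python
import re

# Stand-in for \R: CRLF, a lone CR, or a lone LF.
EOL = r"(?:\r\n|\r(?!\n)|\n)"
PATTERN = re.compile(rf"({EOL}){{2,}}|{EOL}")


def add_blank_lines_then_join(text: str) -> str:
    # Step 1: surround every '=' with blank lines (Search: =, Replace: \n\n=\n\n).
    padded = text.replace("=", "\n\n=\n\n")
    # Step 2: runs of line breaks become one; single line breaks become a space.
    return PATTERN.sub(
        lambda m: m.group(1) if m.group(1) is not None else " ", padded
    )


# Assumed input shape: no blank lines at all.
data = "=\nemail:a@example.com\nCountry: us\nCredits: YES\n=\nemail:b@example.com\nCountry: eu\nCredits: NO\n"
print(repr(add_blank_lines_then_join(data)))
```

Step 1 touches every `=` in the text, so this approach relies on the data lines themselves never containing an equal sign.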



  • Hello, @ALISSAbry, @gurikbal-singh, @cipher-1024 and All,

    Ahrrrrrhh ! You’re perfectly right, @cipher-1024 : my regex does not work, if no extra blank lines exist :-((

    I had first found a correct regex. However, as I also wanted to delete possible blank lines, I tried to improve it against a text containing some blank lines

    But I did not test it again with a simple text, without extra blank lines. My bad !

    So a correct regex could be :

    SEARCH (\R){2,}|(?<!=)\r\n(?!=)

    REPLACE ?1\1:\x20

    Notes :

    • Only the second alternative has changed : (?<!=)\r\n(?!=). It matches any line break that is neither preceded nor followed by an equal sign =. In other words, only the line breaks located between two data lines

    • Note that we must replace the \R syntax with \r\n ( or with \n if you’re working with Unix files ! ) Why ? Because \R normally stands for \r\n but, in order for the overall regex to match, \R may match only \r or only \n

    • For instance, if you try to match the regex (?<!=)\R(?!=) at the end of the line …YEScrlf, it would grab only the \r part of the line break, before \n. Indeed, in that case, the \r character is both not preceded by an = sign ( as it follows the letter S ) and not followed by an = sign ( as it is followed by the \n symbol )
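As a rough illustration of the corrected search, here is a Python sketch run on Windows-style (CRLF) data with no blank lines. Some caveats: Python's `re` has no `\R`, so the first alternative uses an explicit stand-in, and the second part emulates `\R` with a plain alternation purely to show the mechanism described above. Whether Notepad++'s engine treats `\R` exactly this way is an assumption here. All names and sample addresses are hypothetical.

```python
import re

# Corrected search: a run of 2+ line breaks (last one in group 1),
# or a CRLF that is neither preceded nor followed by '='.
CORRECTED = re.compile(r"((?:\r\n|\r(?!\n)|\n)){2,}|(?<!=)\r\n(?!=)")


def join_records(text: str) -> str:
    # Group 1 set -> keep one line break; otherwise -> a single space.
    return CORRECTED.sub(
        lambda m: m.group(1) if m.group(1) is not None else " ", text
    )


crlf_data = (
    "=\r\nemail:a@example.com\r\nCountry: us\r\nCredits: YES\r\n"
    "=\r\nemail:b@example.com\r\nCountry: eu\r\nCredits: NO\r\n"
)
print(repr(join_records(crlf_data)))

# The pitfall, emulated: a line break that may backtrack from CRLF to CR alone.
NAIVE = re.compile(r"(?<!=)(?:\r\n|\r|\n)(?!=)")
partial = NAIVE.search("YES\r\n=")
print(repr(partial.group()))  # only part of the CRLF pair is matched
```

Spelling the line break as a literal `\r\n` leaves the regex engine no way to match half of the pair, which is why the `=` lines stay on their own.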

    Cheers,

    guy038

