Remove unwanted CRLF in paragraphs



  • I want to change:

    Just some text<CRLF>
    I wrote to<CRLF>
    serve as example.<CRLF>
    <CRLF>
    

    into:

    Just some text I wrote to serve as example.<CRLF>
    <CRLF>
    

    In some 200 files, each file containing lots of these paragraphs.
    How can this be done?



  • @Jeroen-Borgman

    How can this be done?

    By finding a pattern that all the paragraphs have in common.
    Without such a pattern, … by manually joining the lines.



  • Hello, @jeroen-borgman, and All,

    Not difficult with regexes. Indeed !

    However, I advise you to get rid of any trailing space characters first ! Two possibilities :

    • Use the N++ option Edit > Blank Operations > Trim Trailing Space

    • Execute the regex S/R :

      • SEARCH \h+$

      • REPLACE Leave EMPTY
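
    For anyone who would rather script this trimming step outside Notepad++, here is a rough Python sketch of the same idea. Python's re module does not support \h, so [ \t] stands in for it (an assumption that only spaces and tabs need trimming), and the name trim_trailing_blanks is made up for this example :

```python
import re

# Spaces/tabs sitting right before a line break or the very end of the text.
# Assumption: [ \t] stands in for \h, which Python's re does not support.
# A lookahead is used instead of $, because $ in Python does not stop before \r.
TRAILING_BLANKS = re.compile(r"[ \t]+(?=\r?\n|\r|\Z)")


def trim_trailing_blanks(text: str) -> str:
    """Remove trailing spaces and tabs from every line, keeping the EOLs."""
    return TRAILING_BLANKS.sub("", text)


sample = "Just some text  \r\nI wrote to\t\r\nserve as example. "
print(repr(trim_trailing_blanks(sample)))
```

    The line breaks themselves are left untouched, so CRLF files stay CRLF files.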


    Now, given, for instance, the sample text below, without any trailing space :

    Just some text
    I wrote to<
    serve as example
    
    Just some text
    I wrote to<
    serve as example
    
    
    A single line !
    
    
    
    A last
    paragraph to
    see if the
    result is
    OK
    
    

    Then, the regex S/R :

    • SEARCH (?-s).\K\R(?!\R)

    • REPLACE \x20

    with a click on the Replace All button, would return the text :

    Just some text I wrote to< serve as example
    
    Just some text I wrote to< serve as example
    
    
    A single line !
    
    
    
    A last paragraph to see if the result is OK
    
    

    Notes :

    • First, the part (?-s) means that any dot ( . ) will represent a single standard character only, and not an EOL one

    • Then the part .\K\R looks for any standard char, right before an end of line and, due to the \K syntax, the regex engine keeps only the part at the right of \K as the final match, i.e. the syntax \R, which represents any form of EOL ( \r\n if Windows, \n if Unix or \r if old Mac )

    • Finally, the (?!\R) part is a negative look-ahead, i.e. a condition which must be verified. This condition forces the replacement of the EOL character(s) of a line ONLY IF they are not followed, themselves, by other EOL character(s)

    • In replacement, the \x20 syntax is a synonym of the space character
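
    As a cross-check of the notes above, here is a minimal Python sketch of the same join, in case the 200 files have to be processed by a script rather than inside Notepad++. Python's re module supports neither \K nor \R, so a look-behind and an explicit EOL alternation are swapped in; only \r\n, \n and \r are covered, which is narrower than \R. The names join_paragraph_lines and join_in_folder, the *.txt pattern and the UTF-8 encoding are all assumptions made for this example :

```python
import re
from pathlib import Path

# Rough stand-in for (?-s).\K\R(?!\R) using only what Python's re offers:
#   (?<=[^\r\n])    the char before the EOL is not an EOL char (the ".\K" part)
#   (?:\r\n|\n|\r)  one line break: Windows, Unix or old Mac (stands in for \R)
#   (?![\r\n])      no further line break follows (the "(?!\R)" part)
JOIN_EOL = re.compile(r"(?<=[^\r\n])(?:\r\n|\n|\r)(?![\r\n])")


def join_paragraph_lines(text: str) -> str:
    """Replace each single line break inside a paragraph with one space."""
    return JOIN_EOL.sub(" ", text)


def join_in_folder(folder: str, pattern: str = "*.txt") -> None:
    """Hypothetical helper: rewrite every matching file in place.

    Assumes UTF-8 files; newline="" keeps the original EOLs untranslated.
    """
    for path in Path(folder).glob(pattern):
        with open(path, encoding="utf-8", newline="") as f:
            text = f.read()
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(join_paragraph_lines(text))


sample = (
    "Just some text\r\n"
    "I wrote to\r\n"
    "serve as example.\r\n"
    "\r\n"
    "A single line !\r\n"
    "\r\n"
)
print(repr(join_paragraph_lines(sample)))
```

    Since join_in_folder overwrites files, it is probably wise to try it on a copy of the folder first.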

    Best Regards,

    guy038



  • @guy038 Thanks for this, works like a charm!
    I will study the search syntax with the REGEX doc on the side to do this myself next time.

