one sentence per line



  • I have a .txt file in the following format:

    [Line 1]Health professionals are expected to undertake audit and
    [Line 2]service evaluation as part of quality assurance. These usually
    [Line 3]involve minimal additional risk, burden or intrusion for
    [Line 4]participants. It is important to determine at an early stage
    [Line 5]whether a project is audit or research, and sometimes that
    [Line 6]is not as easy as it seems. The decision will determine the
    [Line 7]framework in which the study is undertaken.

    What I want is one complete sentence (up to the ‘.’ character) on each line, starting with [Line 1].
    Is there any way to automate this? The text file has 119043 characters, so it would take a long time to do manually.

    Thanks



  • @Dragoon-35 ,

    1. Ctrl+A (Edit > Select All)
    2. Ctrl+J (Edit > Line Operations > Join Lines)
      • at this point, you should have one giant paragraph
    3. Ctrl+H (Search > Replace)
      • Find What = (?-s)\.\h+
      • Replace With = .\r\n
      • Search Mode = Regular Expression
    4. Click Replace All

    Given the exact data:

    Health professionals are expected to undertake audit and
    service evaluation as part of quality assurance. These usually
    involve minimal additional risk, burden or intrusion for
    participants. It is important to determine at an early stage
    whether a project is audit or research, and sometimes that
    is not as easy as it seems. The decision will determine the
    framework in which the study is undertaken.
    

    That will result in

    Health professionals are expected to undertake audit and service evaluation as part of quality assurance.
    These usually involve minimal additional risk, burden or intrusion for participants.
    It is important to determine at an early stage whether a project is audit or research, and sometimes that is not as easy as it seems.
    The decision will determine the framework in which the study is undertaken.
    

    Assumptions:

    • [Line #] isn’t actually part of the text
    • All your sentences end with ., and none end in ? or ! or ." or other such endings
    • You don’t have any other period-space instances in your file.
      • Having text like Dr. Bob is a surgeon. will mess up the algorithm, because the . between Dr and Bob will be interpreted as a sentence-ender
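
    For anyone who wants to try the same idea outside Notepad++, the join-then-split steps above can be sketched in Python. This is only an approximation under stated assumptions: the function name is made up for illustration, joined lines are separated by a single space, `[ \t]` stands in for `\h`, and `\n` is used as the output line ending instead of `\r\n`.

```python
import re

def one_sentence_per_line(text: str) -> str:
    """Hypothetical helper mirroring the Join Lines + Replace All steps."""
    # Join step: collapse every non-empty line into one paragraph,
    # assuming a single space between the joined lines.
    paragraph = " ".join(line.strip() for line in text.splitlines() if line.strip())
    # Replace step: a period followed by spaces/tabs is treated as a sentence end.
    # [ \t] is a rough stand-in for \h, which Python's re module does not support.
    return re.sub(r"\.[ \t]+", ".\n", paragraph)

sample = (
    "Health professionals are expected to undertake audit and\n"
    "service evaluation as part of quality assurance. These usually\n"
    "involve minimal additional risk, burden or intrusion for\n"
    "participants."
)
result = one_sentence_per_line(sample)

# The third assumption in action: the abbreviation is split like a sentence end.
abbreviation = one_sentence_per_line("Dr. Bob is a surgeon.")
```

    With the sample above, `result` holds two lines, one per sentence, while `abbreviation` is broken after `Dr.`, which is exactly the limitation listed in the assumptions.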

    If this isn’t sufficient for your needs, you will have to clarify what you really want. Please read the advice below.


    Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All data / example text should be marked up as plaintext using the </> toolbar button or manual Markdown syntax; screenshots can be pasted in natively using Ctrl+V when you have the image in your clipboard. Show the data you have; show the regex you tried, and why you thought it should work; show what you get, and compare it to what you wanted to get; make sure to include examples of things that should match and be transformed, and things that don’t match and should be left alone. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ.
    We sometimes vent our frustration when all a particular user does is demand the answer be given to them after many changes of requirements, or comes back time and again for new “gimme” requests, without showing any effort. But if you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.



  • @PeterJones
    Thanks. This works just fine



  • Hello, @dragoon-35, @peterjones and All,

    Here is an alternative regex solution, which does not need any line operations before the S/R :

    SEARCH \.\h*(?!\R)|(?<!\.)(\r\n)

    REPLACE ?1\x20:.\r\n

    Notes :

    • This regex searches for :

      • Any literal . , followed with possible blank chars,  ONLY IF NOT followed with a line-break

      • A line-break  ONLY IF NOT preceded with a literal dot .

    • In replacement :

      • In case of the first alternative, as group 1 is NOT defined, the ELSE part is used and a dot . is rewritten, followed with a line-break

      • In case of the second alternative, as group 1 is defined, the THEN part is used and the line-break is replaced with a space character
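
    The two alternatives and the conditional replacement described in these notes can be imitated in Python with a replacement callback. This is a rough sketch, not the Notepad++ engine itself: Python's `re` has neither `\h`, `\R` nor the `?1…:…` replacement syntax, so `[ \t]` and `\r\n` are used as stand-ins and the function names are invented for illustration.

```python
import re

# First alternative: a dot plus optional blanks, NOT followed by a line-break.
# Second alternative: a CR-LF line-break NOT preceded by a dot (captured as group 1).
PATTERN = re.compile(r"\.[ \t]*(?!\r\n)|(?<!\.)(\r\n)")

def conditional_replacement(match: re.Match) -> str:
    # Emulates "?1\x20:.\r\n": THEN part if group 1 took part in the match,
    # ELSE part otherwise.
    if match.group(1) is not None:
        return " "        # line-break inside a sentence becomes a space
    return ".\r\n"        # sentence end becomes a dot plus a line-break

def reflow(text: str) -> str:
    """Hypothetical wrapper applying the single-pass search/replace."""
    return PATTERN.sub(conditional_replacement, text)

wrapped = "audit and\r\nservice. These usually\r\ninvolve risk.\r\nDone."
reflowed = reflow(wrapped)
```

    In this sketch, a dot that already sits right before a line-break is left alone, while the final dot of the text, which has no line-break after it, gains one.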


    Remark :

    You may have been intrigued by the syntax (\r\n), which, obviously, could have been simplified to (\R) !

    Well, let’s use this two-line sample, in a new tab, with no line-break after the word End

    This is a test.
    End
    

    We’ll just use the second alternative of the search regex, with the \R syntax and the corresponding replacement part

    SEARCH (?<!\.)\R

    REPLACE \x20

    As line 1 does end with a dot, this regex should not match, against this text. However, one replacement does occur and we get :

    This is a test.
     End
    

    Note that line 1 now ends with the CR character only, and line 2 begins with a space char. WHY ? Well, the two-char CR-LF sequence is, indeed, preceded by a dot and so does not satisfy the regex. However, when the regex engine moves one position to the right, the LF char, which matches the regex \R too, is preceded by the CR, which is not a dot symbol, and so satisfies the regex. So, the LF is replaced with a space character !

    Now, if this regex is changed, as below :

    SEARCH (?<!\.)\r\n

    REPLACE \x20

    This time, there is no more ambiguity and no match at all occurs against our small piece of text !
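
    The same pitfall can be reproduced outside Notepad++. The sketch below uses Python, whose `re` module has no `\R`, so the alternation `(?:\r\n|\n|\r)` is assumed as a simplified stand-in for it; the variable names are only illustrative.

```python
import re

text = "This is a test.\r\nEnd"

# Simplified stand-in for \R: CR-LF as a pair, or a lone LF, or a lone CR.
ANY_BREAK = r"(?:\r\n|\n|\r)"

# Loose version: at the CR the look-behind sees the dot and fails, but one
# position further the lone LF is preceded by CR, not a dot, so it matches.
loose = re.sub(r"(?<!\.)" + ANY_BREAK, " ", text)

# Strict version: only the full CR-LF pair can match, and it follows a dot.
strict = re.sub(r"(?<!\.)\r\n", " ", text)
```

    Here `loose` ends up as `This is a test.` + CR + space + `End`, while `strict` leaves the text unchanged.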

    Best Regards,

    guy038



  • Thank you @guy038

