one sentence per line
-
I have a .txt file in the following format:
[Line 1]Health professionals are expected to undertake audit and
[Line 2]service evaluation as part of quality assurance. These usually
[Line 3]involve minimal additional risk, burden or intrusion for
[Line 4]participants. It is important to determine at an early stage
[Line 5]whether a project is audit or research, and sometimes that
[Line 6]is not as easy as it seems. The decision will determine the
[Line 7]framework in which the study is undertaken.
What I want is one complete sentence (up to the ‘.’ character) in [Line 1], and likewise on each following line.
Is there any way to automate this? The text file has 119043 characters, so it would take a long time to do manually.
Thanks
-
- Ctrl+A (Edit > Select All)
- Ctrl+J (Edit > Line Operations > Join Lines)
  - at this point, you should have one giant paragraph
- Ctrl+H (Search > Replace)
  - Find What = `(?-s)\.\h+`
  - Replace With = `.\r\n`
  - Search Mode = Regular Expression
  - Click Replace All
Given the exact data:

Health professionals are expected to undertake audit and
service evaluation as part of quality assurance. These usually
involve minimal additional risk, burden or intrusion for
participants. It is important to determine at an early stage
whether a project is audit or research, and sometimes that
is not as easy as it seems. The decision will determine the
framework in which the study is undertaken.

That will result in

Health professionals are expected to undertake audit and service evaluation as part of quality assurance.
These usually involve minimal additional risk, burden or intrusion for participants.
It is important to determine at an early stage whether a project is audit or research, and sometimes that is not as easy as it seems.
The decision will determine the framework in which the study is undertaken.
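For anyone who would rather script this than click through the dialogs, here is a minimal Python sketch of the same two steps (join the lines, then break after each period). It is only an approximation: Python's `re` has no `\h`, so `[ \t]` stands in for it, and the function name `one_sentence_per_line` is made up for illustration.

```python
import re

def one_sentence_per_line(text: str) -> str:
    # Step 1: emulate Join Lines by replacing every line break with a single space.
    joined = " ".join(text.splitlines())
    # Step 2: a period followed by one or more spaces/tabs becomes
    # a period plus a Windows line break (CRLF).
    return re.sub(r"\.[ \t]+", ".\r\n", joined)

sample = (
    "Health professionals are expected to undertake audit and\n"
    "service evaluation as part of quality assurance. These usually\n"
    "involve minimal additional risk.\n"
)
result = one_sentence_per_line(sample)
print(result)
```

The same assumptions apply as for the Notepad++ steps: every sentence must end with a plain period.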
Assumptions:
- `[Line #]` isn’t actually part of the text
- All your sentences end with `.`, and none end in `?` or `!` or `."` or other such endings
- You don’t have any other period-space instances in your file.
  - Having text like `Dr. Bob is a surgeon.` will mess up the algorithm, because the `.` between `Dr` and `Bob` will be interpreted as a sentence-ender
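To see the last assumption fail, here is a small Python sketch that applies the same period-plus-blank rule to a sentence containing an abbreviation. As before, `[ \t]` is a stand-in for `\h`, and the second sentence of the sample text is invented for the demonstration.

```python
import re

text = "Dr. Bob is a surgeon. He works nights."

# Same rule as the search/replace: break after every period that is followed by blanks.
lines = re.sub(r"\.[ \t]+", ".\r\n", text).split("\r\n")

for line in lines:
    print(line)
```

The abbreviation ends up alone on the first line, because the rule cannot tell `Dr.` from the end of a sentence.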
If this isn’t sufficient for your needs, you will have to clarify what you really want. Please read the advice below.
–
Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All data / example text should be marked up as plaintext using the `</>` toolbar button or manual Markdown syntax; screenshots can be pasted in natively using Ctrl+V when you have the image in your clipboard. Show the data you have; show the regex you tried, and why you thought it should work; show what you get, and compare it to what you wanted to get; make sure to include examples of things that should match and be transformed, and things that don’t match and should be left alone. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ.
We sometimes vent our frustration when all a particular user does is demand that the answer be given to them after many changes of requirements, or comes back time and again for new “gimme” requests, without showing any effort. But if you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.
-
@PeterJones
Thanks. This works just fine.
-
Hello, @dragoon-35, @peterjones and All,
Here is an alternative regex solution, which does not need any line operations before the search/replace :
SEARCH
\.\h*(?!\R)|(?<!\.)(\r\n)
REPLACE
?1\x20:.\r\n
Notes :
- This regex searches for :
  - Any literal `.`, followed with possible blank chars, ONLY IF NOT followed with a line-break
  - A line-break ONLY IF NOT preceded with a literal dot `.`
- In replacement :
  - In case of the first alternative, as group `1` is NOT defined, the ELSE part is used and a dot `.` is rewritten, followed with a line-break
  - In case of the second alternative, as group `1` is defined, the THEN part is used and the line-break is replaced with a space character
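The THEN/ELSE logic of the conditional replacement can be sketched outside Notepad++ too. The Python version below is only an approximation under stated assumptions: Python's `re` has neither `\h` nor the `?1...:...` replacement syntax, so `[ \t]` replaces `\h`, a literal `\r\n` replaces `\R`, and a callback plays the role of the conditional. The names `PATTERN`, `pick_replacement` and `reflow` are made up for illustration.

```python
import re

# First alternative: a dot plus optional blanks, not followed by a CRLF line break.
# Second alternative: a CRLF line break not preceded by a dot, captured as group 1.
PATTERN = re.compile(r"\.[ \t]*(?!\r\n)|(?<!\.)(\r\n)")

def pick_replacement(match: re.Match) -> str:
    # Group 1 defined   -> THEN part: the line break becomes a single space.
    # Group 1 undefined -> ELSE part: rewrite the dot, followed by a line break.
    return " " if match.group(1) is not None else ".\r\n"

def reflow(text: str) -> str:
    return PATTERN.sub(pick_replacement, text)

sample = (
    "Health professionals are expected to undertake audit and\r\n"
    "service evaluation as part of quality assurance. These usually\r\n"
    "involve minimal risk.\r\n"
)
print(reflow(sample))
```

A dot that already sits right before a line break is left alone, which is why no joining step is needed first.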
Remark :
You may have been intrigued by the syntax `(\r\n)`, which, it seems, could have been simplified to `(\R)` !
Well, let’s use this 2-line data, in a new tab, with no line-break after the word End
This is a test.
End
We’ll just use the second alternative of the search regex, with the `\R` syntax, and the corresponding replacement part
SEARCH
(?<!\.)\R
REPLACE
\x20
As line 1 does end with a dot, this regex should not match against this text. However, one replacement does occur and we get :
This is a test.
 End
Note that line 1 now ends with the CR character only, and line 2 begins with a space char. WHY ? Well, the two chars CR-LF are, indeed, preceded with a dot and so do not satisfy the regex. However, when the regex engine moves one position to the right, the LF char, which matches the regex `\R` too, is preceded by the CR, which is not a dot symbol, and so satisfies the regex. So, the LF is replaced with a space character !
Now, if this regex is changed, as below :
SEARCH
(?<!\.)\r\n
REPLACE
\x20
This time, there is no more ambiguity and no match at all occurs against our small piece of text !
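The same pitfall can be reproduced with a short Python sketch. Python's `re` has no `\R`, so the alternation `(?:\r\n|\n|\r)` is used here as an assumed stand-in for it; the variable names are made up for illustration.

```python
import re

text = "This is a test.\r\nEnd"

# Assumed stand-in for \R: a CRLF pair, or a lone LF, or a lone CR.
ANY_BREAK = r"(?:\r\n|\n|\r)"

# With the \R-like pattern, the lone LF still matches because it is preceded by CR, not by a dot.
with_any_break = re.sub(r"(?<!\.)" + ANY_BREAK, " ", text)

# With the explicit CRLF pair, nothing matches and the text is left alone.
with_crlf_only = re.sub(r"(?<!\.)\r\n", " ", text)

print(repr(with_any_break))  # 'This is a test.\r End'
print(repr(with_crlf_only))  # 'This is a test.\r\nEnd'
```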
Best Regards,
guy038
-
-
Thank you @guy038