In a single line with Regular expression
-
Hi all,
I have a txt file with, for example, 11050 lines, and I would like to have each paragraph on a single line.
Is it possible with a regular expression? Any idea how to do it?
=
email:karl_78@yahoo.com
Country: us || Suscriptions: [X]
Credits: YES
=
email:tortyu@hotmail.com
Country: eu || Suscriptions: [X]
Credits: NO
=
email:isogf.takeda@live.com
Country: es || Suscriptions: [X]
Credits: NO
TO
=
email:karl_78@yahoo.comCredits: YESemail:tortyu@hotmail.com
Credits: NOCredits: NO -
Edit > Blank Operations > trim trailing space
Edit > Blank Operations > EOL to Space -
Hello, @ALISSAbry, @gurikbal-singh and All
I’m afraid, @gurikbal-singh, that the option Edit > Blank Operations > EOL to Space removes too many line-breaks !
But the regex S/R below does the job correctly and, in addition, it deletes all the possible unnecessary line-breaks, too ;-))
So assuming the text below :
= email:karl_78@yahoo.com Country: us || Suscriptions: [X] Credits: YES = email:tortyu@hotmail.com Country: eu || Suscriptions: [X] Credits: NO = email:isogf.takeda@live.com Country: es || Suscriptions: [X] Credits: NO
Now :
- Open the Replace dialog ( Ctrl + H )
- FIND WHAT : (\R){2,}|\R
- REPLACE WITH : ?1\1:\x20
- Tick the Wrap around option if necessary
- Select the Regular expression search mode
- Click on the Replace All button
You should get the expected result, below :
= email:karl_78@yahoo.com Country: us || Suscriptions: [X] Credits: YES = email:tortyu@hotmail.com Country: eu || Suscriptions: [X] Credits: NO = email:isogf.takeda@live.com Country: es || Suscriptions: [X] Credits: NO
Note that only the first blank line is kept !
Notes on the S/R :
- The search regex contains 2 alternatives, separated with the alternation symbol |
  - (\R){2,} which tries to match a range of, at least, 2 consecutive line-breaks, and stores the current line-break form in group 1
  - \R the unique line-break, at the end of the various lines
- In the replacement regex, we’re using the conditional replacement syntax (?N....YES....:....NO....) , where N is the group number
  - If group 1 exists, all the consecutive line-breaks are replaced with a unique line-break \1
  - If the second alternative occurs, the unique line-break must be replaced with a simple space \x20
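For anyone who wants to check the logic outside Notepad++, here is a minimal Python sketch of the same idea. It is only an emulation under stated assumptions: Python's `re` module has neither `\R` nor the `?1...:...` conditional replacement, so an explicit alternation (`EOL`) and a replacement function stand in for them. The name `join_lines` and the sample text are made up for illustration.

```python
import re

# Stand-in for \R (assumption: only \r\n, \n and \r line-breaks matter).
# The (?!\n) stops a lone \r from matching the first half of a \r\n pair.
EOL = r"(?:\r\n|\n|\r(?!\n))"

# Same shape as (\R){2,}|\R : group 1 is set only by the first alternative.
PATTERN = re.compile(rf"({EOL}){{2,}}|{EOL}")

def join_lines(text):
    # Emulates the conditional replacement: if group 1 took part in the
    # match, keep one line-break, otherwise substitute a single space.
    def repl(match):
        if match.group(1) is not None:
            return match.group(1)
        return " "
    return PATTERN.sub(repl, text)

# Hypothetical sample: two "paragraphs" separated by two blank lines.
sample = "a\r\nb\r\n\r\n\r\nc\r\nd"
print(repr(join_lines(sample)))  # 'a b\r\nc d'
```

The run of three line-breaks collapses to one, while each single line-break inside a paragraph becomes a space.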
Best Regards
guy038
-
-
I am very grateful for your help, friends guy038 and gurikbal singh.
-
@guy038, perhaps I’m not interpreting the word “paragraph” literally enough from @ALISSAbry's comment, but this regex doesn’t work unless there are at least 2 line breaks around each chunk of information. The example data doesn’t have two consecutive line breaks anywhere. What would you use for strictly the example data?
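The effect can be reproduced with a small Python sketch. It is an emulation only: Python's `re` has no `\R`, so `EOL` is a hand-written stand-in, and the data below is a shortened, hypothetical version of the example with no blank lines anywhere.

```python
import re

# Stand-in for \R (assumption: \r\n, \n or \r line-breaks only).
EOL = r"(?:\r\n|\n|\r(?!\n))"
first_try = re.compile(rf"({EOL}){{2,}}|{EOL}")

# Hypothetical data laid out like the example: single line-breaks only.
data = (
    "=\r\nemail:a@example.com\r\nCredits: YES\r\n"
    "=\r\nemail:b@example.com\r\nCredits: NO"
)

# Keep one line-break for a run of 2+, otherwise substitute a space.
flattened = first_try.sub(lambda m: m.group(1) or " ", data)
print(flattened)
```

Because the first alternative never matches, every line-break becomes a space and all the records end up on one single line.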
-
First, add line breaks around each = sign with this S/R :
Search: =
Replace By: \n\n=\n\n
Search mode: Regular expression
Click Replace All
Then apply :
FIND WHAT : (\R){2,}|\R
REPLACE WITH : ?1\1:\x20
-
Hello, @ALISSAbry, @gurikbal-singh, @cipher-1024 and All,
Ahrrrrrhh ! You’re perfectly right, @cipher-1024 : my regex does not work if no extra blank lines exist :-((
I had first found a correct regex. However, as I also wanted to delete possible pure blank lines, I tried to improve my regex against a text containing some blank lines
But I did not test it again with a simple text, without extra blank lines. My bad !
So a correct regex could be :
SEARCH
(\R){2,}|(?<!=)\r\n(?!=)
REPLACE
?1\1:\x20
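As a rough check, the corrected S/R can be imitated in Python. This is a sketch under assumptions, not the Notepad++ behaviour itself: the conditional replacement is emulated with a lambda, the file is assumed to use Windows (`\r\n`) line endings, and the function name `one_line_per_record` and the sample addresses are invented for illustration.

```python
import re

# Same search expression as above, with explicit \r\n line-breaks.
fixed = re.compile(r"(\r\n){2,}|(?<!=)\r\n(?!=)")

def one_line_per_record(text):
    # Emulates ?1\1:\x20 : keep one line-break for a run, else a space.
    return fixed.sub(lambda m: m.group(1) or " ", text)

# Hypothetical data with the same layout as the example (no blank lines).
data = (
    "=\r\n"
    "email:a@example.com\r\n"
    "Country: us\r\n"
    "Credits: YES\r\n"
    "=\r\n"
    "email:b@example.com\r\n"
    "Country: eu\r\n"
    "Credits: NO"
)

result = one_line_per_record(data)
for line in result.splitlines():
    print(line)
```

Each = separator stays on its own line, and the three data lines of every record are joined with spaces.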
Notes :
- Only the second alternative has changed : (?<!=)\r\n(?!=) . It will replace any line-break that is not preceded and not followed with an equal sign = . In other words, only the line-breaks located between two data lines
- Note that we must replace the \R syntax with \r\n ( or with \n if you’re working with Unix files ! ) Why ? Just because \R normally represents \r\n but, in order to match the overall regex, \R may match only \r or \n
- For instance, if you try to match the regex (?<!=)\R(?!=) at the end of the line …YES + CRLF , it would grab only the \r part of the line-break, before \n . Indeed, in that case, the \r character is both not preceded with an = sign ( as it is the letter S ) and not followed with an = sign ( as it is the \n symbol )
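The partial match described in these notes can be imitated in Python. This is a sketch, assuming that a backtracking alternation behaves like `\R` does in the explanation above; Python itself has no `\R`, so the `naive` pattern is only a stand-in, and the sample string is hypothetical.

```python
import re

# Naive stand-in for \R : the alternation may fall back to a lone \r.
naive = re.compile(r"(?<!=)(?:\r\n|\n|\r)(?!=)")
# Explicit \r\n, as in the corrected S/R.
strict = re.compile(r"(?<!=)\r\n(?!=)")

# End of a record, followed by the = separator of the next record.
tail = "Credits: YES\r\n="

naive_match = naive.search(tail)
strict_match = strict.search(tail)

print(repr(naive_match.group()))  # '\r'
print(strict_match)               # None
```

With the naive pattern, `\r\n` is rejected by the look-ahead, so the engine retries with `\r` alone, which is followed by `\n` rather than by = , and the line-break gets split. The explicit `\r\n` form simply does not match there.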
Cheers,
guy038
-