In a single line with Regular expression
-
Hi all,
I have a txt file with, for example, 11050 lines, and I would like to have each paragraph on a single line.
Is it possible with a regular expression? Any idea how to do it?
=
email:karl_78@yahoo.com
Country: us || Suscriptions: [X]
Credits: YES
=
email:tortyu@hotmail.com
Country: eu || Suscriptions: [X]
Credits: NO
=
email:isogf.takeda@live.com
Country: es || Suscriptions: [X]
Credits: NO
TO
=
email:karl_78@yahoo.com Credits: YES
email:tortyu@hotmail.com Credits: NO
Credits: NO
-
Edit > Blank Operations > Trim Trailing Space
Edit > Blank Operations > EOL to Space
-
Hello, @ALISSAbry, @gurikbal-singh and All
I’m afraid, @gurikbal-singh, that the option Edit > Blank Operations > EOL to Space removes too many line-breaks !
But the regex S/R below does the job correctly and, in addition, it deletes any unnecessary extra line-breaks, too ;-))
So assuming the text below :
= email:karl_78@yahoo.com Country: us || Suscriptions: [X] Credits: YES = email:tortyu@hotmail.com Country: eu || Suscriptions: [X] Credits: NO = email:isogf.takeda@live.com Country: es || Suscriptions: [X] Credits: NO
Now :
- Open the Replace dialog ( Ctrl + H )
- FIND WHAT : (\R){2,}|\R
- REPLACE WITH : ?1\1:\x20
- Tick the Wrap around option if necessary
- Select the Regular expression search mode
- Click on the Replace All button
You should get the expected result, below :
= email:karl_78@yahoo.com Country: us || Suscriptions: [X] Credits: YES = email:tortyu@hotmail.com Country: eu || Suscriptions: [X] Credits: NO = email:isogf.takeda@live.com Country: es || Suscriptions: [X] Credits: NO
Note that only the first blank line is kept !
Notes on the S/R :
- The search regex contains 2 alternatives, separated with the alternation symbol |
  - (\R){2,} which tries to match a range of, at least, 2 consecutive line-breaks, and stores the current line-break form in group 1
  - \R the unique line-break, at the end of the various lines
- In the replacement regex, we’re using a conditional replacement syntax (?# ....YES.... : ....NO.... ), where # stands for the group number
  - If group 1 exists, all the consecutive line-breaks are replaced with a unique line-break \1
  - If the second alternative occurs, the unique line-break must be replaced with a simple space \x20
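For readers who want to try this logic outside Notepad++, here is a minimal Python sketch of the same S/R. It is only an illustration under assumptions: Python's `re` module has no `\R`, so the text is assumed to use `\n` line endings, and the conditional replacement `?1\1:\x20` is emulated with a small function. The name `join_paragraphs` and the sample text (with one blank line between the two records) are hypothetical.

```python
import re

# Stand-in for (\R){2,}|\R, assuming "\n" line endings (Python's re has no \R).
PATTERN = re.compile(r"(\n){2,}|\n")

def join_paragraphs(text):
    """Collapse runs of 2+ line breaks into one; turn single breaks into spaces."""
    def repl(match):
        # Same idea as ?1\1:\x20 : group 1 set -> keep one line break, else a space.
        return match.group(1) if match.group(1) is not None else " "
    return PATTERN.sub(repl, text)

# Hypothetical sample: two records separated by one blank line.
sample = (
    "=\nemail:a@example.com\nCountry: us\nCredits: YES\n\n"
    "=\nemail:b@example.com\nCountry: eu\nCredits: NO\n"
)

print(repr(join_paragraphs(sample)))
# '= email:a@example.com Country: us Credits: YES\n= email:b@example.com Country: eu Credits: NO '
```

Each record ends up on one line because the blank line between records is the only place where two consecutive line breaks occur.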
Best Regards
guy038
-
-
I am very grateful for your help, friends guy038 and gurikbal singh.
-
@guy038, perhaps I’m not interpreting the word “paragraph” literally enough from @ALISSAbry's comment, but this regex doesn’t work unless there are at least 2 line breaks around each chunk of information. The example data doesn’t have two consecutive line breaks anywhere. What would you use for strictly the example data?
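The objection can be reproduced with a short Python sketch. It assumes `\n` line endings as a stand-in for `\R` (which Python's `re` does not support); the helper name `first_attempt` is hypothetical. Run against the example data without blank lines, every line break is a single one, so everything collapses onto one line.

```python
import re

def first_attempt(text):
    # Emulates (\R){2,}|\R with replacement ?1\1:\x20, assuming "\n" endings.
    return re.sub(r"(\n){2,}|\n", lambda m: m.group(1) or " ", text)

# The example data from the question, with no blank lines between records.
data = (
    "=\nemail:karl_78@yahoo.com\nCountry: us || Suscriptions: [X]\nCredits: YES\n"
    "=\nemail:tortyu@hotmail.com\nCountry: eu || Suscriptions: [X]\nCredits: NO\n"
)

result = first_attempt(data)
print(result.count("\n"))  # 0 : no line break survives
```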
-
@cipher-1024 said:
@guy038, perhaps I’m not interpreting the word “paragraph” literally enough from @ALISSAbry's comment, but this regex doesn’t work unless there are at least 2 line breaks around each chunk of information. The example data doesn’t have two consecutive line breaks anywhere. What would you use for strictly the example data?
Add line breaks around each = with this S/R :
Search: =
Replace By: \n\n=\n\n
Search mode: Regular expression
Click Replace All, then apply :
FIND WHAT : (\R){2,}|\R
REPLACE WITH : ?1\1:\x20
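The two-step idea above can be sketched in Python as follows. This is an illustration only, assuming `\n` line endings in place of `\R`; the function name `two_step` and the shortened sample are hypothetical.

```python
import re

def two_step(text):
    # Step 1: surround every "=" with blank lines (Search: =  Replace: \n\n=\n\n).
    padded = re.sub(r"=", "\n\n=\n\n", text)
    # Step 2: emulate (\R){2,}|\R -> ?1\1:\x20, assuming "\n" line endings.
    return re.sub(r"(\n){2,}|\n", lambda m: m.group(1) or " ", padded)

# Hypothetical, shortened records without any blank lines.
sample = "=\na\nb\n=\nc\nd\n"

print(repr(two_step(sample)))
# '\n=\na b\n=\nc d '
```

With this approach each = stays on its own line and the data lines of each record are joined with spaces. Note that step 1 would also pad any = that appears inside the data itself.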
-
Hello, @ALISSAbry, @gurikbal-singh, @cipher-1024 and All,
Ahrrrrrhh ! You’re perfectly right, @cipher-1024 : my regex does not work if no extra blank lines exist :-((
I had first found a correct regex. However, as I also wanted to delete possible pure blank lines, I tried to improve my regex against a text containing some blank lines
But I did not test it again with a simple text, without extra blank lines. My bad !
So a correct regex could be :
SEARCH (\R){2,}|(?<!=)\r\n(?!=)
REPLACE ?1\1:\x20
Notes :
- Only the second alternative has changed : (?<!=)\r\n(?!=). It replaces any line-break that is neither preceded nor followed by an equal sign =. In other words, only the line-breaks located between two data lines
- Note that we must replace the \R syntax with \r\n ( or with \n if you’re working with Unix files ! ) Why ? Because \R normally represents \r\n as a whole but, in order to match the overall regex, \R may match only \r or only \n
- For instance, if you try to match the regex (?<!=)\R(?!=) at the end of a line …YES CR LF that precedes a = line, it would grab only the \r part of the line-break, before \n. Indeed, in that case, the \r character is both not preceded with an = sign ( as it is preceded by the letter S ) and not followed with an = sign ( as it is followed by the \n symbol )
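The notes above can be checked with a small Python sketch. It is written under assumptions: the file uses Windows (CRLF) line endings, `(\r\n){2,}` stands in for `(\R){2,}`, and the alternation `\r\n|\r|\n` is only a backtracking stand-in for the permissive `\R` behaviour described in the post (it says nothing about how Notepad++ implements `\R` internally). The names `join_records`, `LOOSE` and `STRICT` are hypothetical.

```python
import re

# Corrected S/R, assuming CRLF line endings.
FIXED = re.compile(r"(\r\n){2,}|(?<!=)\r\n(?!=)")

def join_records(text):
    # Emulates ?1\1:\x20 : keep one line break for a run, else insert a space.
    return FIXED.sub(lambda m: m.group(1) or " ", text)

data = (
    "=\r\nemail:karl_78@yahoo.com\r\nCountry: us || Suscriptions: [X]\r\nCredits: YES\r\n"
    "=\r\nemail:tortyu@hotmail.com\r\nCountry: eu || Suscriptions: [X]\r\nCredits: NO\r\n"
)
joined = join_records(data)
print(joined)

# Why \r\n is spelled out: a line-break pattern that may match \r or \n alone
# slips past the lookarounds, while the literal \r\n does not.
LOOSE = re.compile(r"(?<!=)(?:\r\n|\r|\n)(?!=)")
STRICT = re.compile(r"(?<!=)\r\n(?!=)")
tail = "Credits: YES\r\n=\r\nemail"

print(LOOSE.findall(tail))   # ['\r', '\n']
print(STRICT.findall(tail))  # []
```

In the loose version, the `\r` after YES matches because it is followed by `\n` rather than by =, and the `\n` after the = line matches because it is preceded by `\r` rather than by =. The strict version leaves both line breaks around = untouched, which is the intended result.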
Cheers,
guy038
-