combining 3 commands for a text

Markomarin

I needed some commands and got help from helpful people in this community. I now have 3 commands and I want to combine them into one that performs all 3 actions at once. The commands are:
1.
Find: [ \t]+
Repl: (one space) makes sure that there is one space between words.
2.
Find:\x20\d±\x20
Repl:\r\n\r\n
this command makes sure that the numbers (with - character) disappear and start a new line with one empty line
3.
Find: \s\w)\s
Repl: \n
this command makes the characters with ) disappear and go to a new line.
Just to exemplify:
“I’m … to dust, so I tie a scarf around my face when I’m cleaning dusty places. A) emotional B) sensitive C) senseless D) sensible E) sensational 18- Not until they were far away from the dark forest did Mortimer … his grip on his mother’s hand. A) tie B) find C) lose D) break E) loosen”
The result I want is:
"I’m … to dust, so I tie a scarf around my face when I’m cleaning dusty places.
emotional
sensitive
senseless
sensible
sensational

Not until they were far away from the dark forest did Mortimer … his grip on his mother’s hand.
tie
find
lose
break
loosen"
Thanks in advance for your nice help

Markomarin

For the second command (In preview it seems like ±, it is “+ -” without spaces)

Scott Sumner

@Markomarin

For the second command (In preview it seems like ±, it is “+ -” without spaces)

Hint: Nobody is going to sort out those issues in order to help you.

You might get help if you express your regular expressions like this:

Find: \x20\d+-\x20

Details on how to do that may be found here.

Markomarin

@Scott-Sumner I see. Thanks

PeterJones

@Markomarin ,

If you want help, you are going to have to help us help you. @Scott-Sumner wasn’t just chastising you: the link he posted contains advice for how to format your post in a way that will make your data and regular expressions clear to us.

If you could post a reply to this thread, with the content marked up properly so that your text examples come through unedited and your regular expressions marked up as Scott showed so they come through inline and unedited, there is a much higher probability of someone offering help on your question, rather than just help on your formatting.

Until you have given us a reasonable representation of your data and regular expressions, help is not likely to be forthcoming.

Scott Sumner

@PeterJones said:

Scott-Sumner wasn’t just chastising you

“just”?

I think it would read better as: “Scott-Sumner wasn’t chastising you…just stating the facts”. Who is going to unravel a mess, or even worse, guess at it…before solving it?

:-)

PeterJones

Good point.

guy038

Hello, @markomarin, @peterjones, @scott-sumner, and All,

Scott, I’m a bit surprised that you consider markomarin’s post to look like … a “mess” !? I, personally, could figure out how he wants to format his text, without too many difficulties ;-))

So, assuming your text, below :

I’m … to dust, so I tie a scarf   around my face when I’m cleaning dusty places. A) emotional B) sensitive C) senseless D) sensible E) sensational 18- Not until they    were far away from the dark forest did Mortimer …			his grip on his mother’s hand. A) tie B) find C) lose D) break E) loosen

Note that I inserted some extra space characters before the word around, between the words they and were and some tabulation characters after the … symbol !

Now :

Open the Replace dialog ( Ctrl + H )
Type in the regex (?-i)\h*[A-Z]\)\h+(\w+)\h*|\h*(\d+)-\h*|(\h{2,}|\t) , in the Find what: zone
Type in the regex (?1\r\n\1)(?2\r\n\r\n)(?3\x20) , in the Replace with: zone
Preferably, tick the Wrap around option
Select the Regular expression search mode
Click on the Replace All button

Voilà ! Magically, you should get the following text, as you expected :

I’m … to dust, so I tie a scarf around my face when I’m cleaning dusty places.
emotional
sensitive
senseless
sensible
sensational

Not until they were far away from the dark forest did Mortimer … his grip on his mother’s hand.
tie
find
lose
break
loosen
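
For readers who want to try the same transformation outside Notepad++, here is a minimal Python sketch of the search/replace above. It is an approximation, not the Notepad++ behaviour itself: Python's `re` module has no `\h` class and no conditional replacement syntax, so `[ \t]` stands in for `\h` and a small function (the name `replacement` is made up for this example) plays the role of `(?1\r\n\1)(?2\r\n\r\n)(?3\x20)`. The sample string is a shortened version of the text above.

```python
import re

# [ \t] is used in place of \h (horizontal whitespace), which Python's re lacks.
# Python patterns are case sensitive by default, which corresponds to (?-i).
PATTERN = re.compile(
    r"[ \t]*[A-Z]\)[ \t]+(\w+)[ \t]*"   # group 1: the word after "A) ", "B) ", ...
    r"|[ \t]*(\d+)-[ \t]*"              # group 2: a question number such as "18-"
    r"|([ \t]{2,}|\t)"                  # group 3: a run of blanks or a single tab
)

def replacement(match):
    # Emulates the conditional replacement (?1...)(?2...)(?3...).
    if match.group(1) is not None:
        return "\r\n" + match.group(1)  # option word moves to its own line
    if match.group(2) is not None:
        return "\r\n\r\n"               # number is dropped, blank line inserted
    return " "                          # extra blanks collapse to one space

def reformat(text):
    return PATTERN.sub(replacement, text)

sample = (
    "I tie a scarf   around my face. A) emotional B) sensitive "
    "18- Not until they    were far away …" + "\t\t\t" + "his grip. A) tie B) find"
)
print(reformat(sample))
```

Because the three alternatives are tried left to right at each position, the option and number branches get the first chance to swallow the blanks around them, and only leftover runs of blanks reach the third branch.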

If this first try satisfies you, I could, next time, give you some hints of these regexes ;-)

Best regards,

guy038

Scott Sumner

@PeterJones , I guess I got an answer about who’s going to guess at it before solving it. :-)

@guy038 ,

If people are going to present regexes here that are clearly bogus because they can’t figure out simple markdown (I mean, how hard is that?), I for one am not going to waste time on it. It’s not a stable starting point, and it throws out all other credibility: if they show data also without markdown, how are we to know reading it that a pertinent character(s) is not missing?–I mean, sometimes we get a clue because the text suddenly goes italics, but…

Ideally I’d like to see two rules when asking for regex help, especially since regex help (unless really specific to N++'s regex engine and its many quirks) is outside the purpose of this forum:

if you are presenting a regex that is giving you trouble (which is MUCH-preferred–show us what you’ve already tried!), you present it formatted like this: (?-i)\h*[A-Z]\)\h+(\w+)\h*|\h*(\d+)-\h*|(\h{2,}|\t)
if you have no regex (or even if you do!), show us the following in this format:

BEFORE TEXT:

Dead, week first, person mark provide, drive began paragraph, especially!Enough
pick eye prepare protect or, bone!Store magnet motion group an length found all
mix.Laugh spring what they north consider small!Free way bell age quick ease.Win
cloud time measure walk rain hear reply earth than.Catch out gather search;
radio son

AFTER TEXT:

DEAd, wEEk fIrst, pErsOn mArk prOvIdE, drIvE bEgAn pArAgrAph, EspEcIAlly!EnOUgh
pIck EyE prEpArE prOtEct Or, bOnE!StOrE mAgnEt mOtIOn grOUp An lEngth fOUnd All
mIx.LAUgh sprIng whAt thEy nOrth cOnsIdEr smAll!FrEE wAy bEll AgE qUIck EAsE.WIn
clOUd tImE mEAsUrE wAlk rAIn hEAr rEply EArth thAn.CAtch OUt gAthEr sEArch;
rAdIO sOn

If you want help on a problem, assist those that are likely to give you that help.

Scott Sumner

@Markomarin

@guy038 said:

I could, next time, give you some hints of these regexes

Just so @guy038 doesn’t have to, here’s an explanation of his regexes:

(?-i)\h*[A-Z]\)\h+(\w+)\h*|\h*(\d+)-\h*|(\h{2,}|\t)

Match this alternative (attempting the next alternative only if this one fails) «(?-i)\h*[A-Z]\)\h+(\w+)\h*»
   Use these options for the whole regular expression «(?-i)»
      (hyphen inverts the meaning of the letters that follow) «-»
      Case sensitive «i»
   Match a single character that is a “horizontal whitespace character” (tab or any space in the active code page) «\h*»
      Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   Match a single character in the range between “A” and “Z” (case sensitive) «[A-Z]»
   Match the closing parenthesis character «\)»
   Match a single character that is a “horizontal whitespace character” (tab or any space in the active code page) «\h+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
   Match the regex below and capture its match into backreference number 1 «(\w+)»
      Match a single character that is a “word character” (letter, digit, or underscore in the active code page) «\w+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
   Match a single character that is a “horizontal whitespace character” (tab or any space in the active code page) «\h*»
      Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Or match this alternative (attempting the next alternative only if this one fails) «\h*(\d+)-\h*»
   Match a single character that is a “horizontal whitespace character” (tab or any space in the active code page) «\h*»
      Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   Match the regex below and capture its match into backreference number 2 «(\d+)»
      Match a single character that is a “digit” (any symbol with a decimal value in the active code page) «\d+»
         Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
   Match the character “-” literally «-»
   Match a single character that is a “horizontal whitespace character” (tab or any space in the active code page) «\h*»
      Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Or match this alternative (the entire match attempt fails if this one fails to match) «(\h{2,}|\t)»
   Match the regex below and capture its match into backreference number 3 «(\h{2,}|\t)»
      Match this alternative (attempting the next alternative only if this one fails) «\h{2,}»
         Match a single character that is a “horizontal whitespace character” (tab or any space in the active code page) «\h{2,}»
            Between 2 and unlimited times, as many times as possible, giving back as needed (greedy) «{2,}»
      Or match this alternative (the entire group fails if this one fails to match) «\t»
         Match the tab character «\t»

(?1\r\n\1)(?2\r\n\r\n)(?3\x20)

Check whether capturing group number 1 was matched «(?1\r\n\1)»
   If the group was matched then insert the following «\r\n\1»
      Insert a carriage return «\r»
      Insert a line feed «\n»
      Insert the text that was last matched by capturing group number 1 «\1»
Check whether capturing group number 2 was matched «(?2\r\n\r\n)»
   If the group was matched then insert the following «\r\n\r\n»
      Insert a carriage return «\r»
      Insert a line feed «\n»
      Insert a carriage return «\r»
      Insert a line feed «\n»
Check whether capturing group number 3 was matched «(?3\x20)»
   If the group was matched then insert the following «\x20»
      Insert the character “ ” which occupies position 0x20 (32 decimal) in the character set «\x20»

Created with RegexBuddy

guy038

Hello, @markomarin, @peterjones, @scott-sumner, and All,

@scott-sumner,

I do agree with you that people asking for help should write their posts in such a way that a minimum of ambiguity remains ! However, you surely understand that people who have just begun to post on the N++ community ( the OP has created just 6 posts, up to now ) don’t want to bother, first, with Markdown syntax !

Of course, if we cannot figure out their problem or if the syntax used corrupts their text, it’s quite normal that we invite them to look at the FAQ Desk, below :

https://notepad-plus-plus.org/community/topic/15739/faq-desk-request-for-help-without-sufficient-information-to-help-you

And to have an overview of the Markdown syntax and features, from the link :

https://daringfireball.net/projects/markdown/

We just have to hope that, after several visits, some of them will find benefit in improving the general appearance of their posts :-D

@markomarin and All,

Another, shorter formulation of my previous regex S/R could be :

SEARCH (?-i)\h*[A-Z]\)\h+(\w+)\h*|\h*\d+-\h*|(\h{2,}|\t)

REPLACE (?2\x20:\r\n(?1\1:\r\n))
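
As a rough illustration of how the nested conditional `(?2\x20:\r\n(?1\1:\r\n))` reads, here is a small Python sketch. It rests on the same assumptions as any port to Python's `re` module: `[ \t]` replaces `\h`, and a function (hypothetically named `short_replacement`) stands in for the conditional replacement, which `re` does not support. Note that the number is no longer captured, so the blank run becomes group 2.

```python
import re

# Only two capture groups remain: group 1 = option word, group 2 = blank run.
SHORT_PATTERN = re.compile(
    r"[ \t]*[A-Z]\)[ \t]+(\w+)[ \t]*"   # group 1: word after "A) ", "B) ", ...
    r"|[ \t]*\d+-[ \t]*"                # question number, not captured
    r"|([ \t]{2,}|\t)"                  # group 2: run of blanks or a single tab
)

def short_replacement(match):
    # (?2 ... : ... ) : if group 2 took part in the match, emit one space.
    if match.group(2) is not None:
        return " "
    # Otherwise always emit a line break, followed by
    # (?1\1:\r\n) : the option word if group 1 matched, else a second line break.
    word = match.group(1)
    return "\r\n" + (word if word is not None else "\r\n")

def reformat_short(text):
    return SHORT_PATTERN.sub(short_replacement, text)

sample = "dusty   places. A) emotional B) sensitive 18- Not until" + "\t" + "they were"
print(reformat_short(sample))
```

The saving comes from the fact that both the option branch and the number branch start their replacement with a line break, so that part is written once and only the remainder is conditional.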

@scott-sumner ( again ! )

The information provided by RegexBuddy, on regexes, is really awesome ! Valuable software, indeed ! So, I suppose that my second formulation will be easily understood from your previous RegexBuddy info !

BTW, Scott, to end with a pleasant note, regarding the way to change your BEFORE TEXT into the AFTER TEXT that you proposed… maybe this regex S/R could do the job :

SEARCH (?-i)[aeiou]

REPLACE \u$0

So, apparently, you do not consider that letter y is a vowel, do you ? Just joking ;-))
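
The vowel S/R above can be approximated in Python as follows. This is only a sketch: Python's `re` has no `\u` case-conversion escape in replacement strings, so a lambda calling `str.upper()` on the whole match (the equivalent of `$0`) is used instead, and the function name `upper_vowels` is invented for the example.

```python
import re

def upper_vowels(text):
    # Case-sensitive search for lowercase a, e, i, o, u (y is left alone);
    # each matched vowel is replaced by its uppercase form.
    return re.sub(r"[aeiou]", lambda m: m.group(0).upper(), text)

print(upper_vowels("Dead, week first, person mark provide"))
```

Vowels that are already uppercase are not matched at all, which is harmless since they would be unchanged anyway.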

Cheers,

guy038

Scott Sumner

@guy038

…information provided by RegexBuddy

I got the go-ahead approval from Jan Goyvaerts, author of RegexBuddy, to post its decodings here. In return I promised to give credit where credit is due and indicate where the decodings originated. Of course, I’m not going to do this all the time, but if someone asks for an explanation of a regex that is a bit, shall we say, above and beyond…then why not, eh?

…to end with a pleasant note…

Wha? I’m always pleasant, aren’t I? :-)

do not consider that letter y is a vowel

Hahaha. I actually struggled a bit with what I should do to turn my BEFORE text into AFTER text…