remove text

TN MC

I am hoping to have a script that will do the following.

Remove “Priority: Routine” and everything before it.
Remove “Priority: Urgent” and everything after it.
Remove “( )” brackets and everything between them.
Remove “[ ]” brackets and everything between them.
Select and copy what is left.

Thank you in advance.

Terry R

@TN-MC
Obviously you will continue to wait since you’ve provided no examples (before and after) so that we may help you.

I’d suggest reading one of our FAQ posts on how to include examples and also you need to be a bit more forthcoming in details, like the last step, copy. What do you want to do with the copy? That FAQ post is here.

Terry

TN MC

@Terry-R
Sorry about that. Not trying to be rude or non-compliant.

I can’t give an exact example due to the confidentiality of the documents. But here is what I am hoping to get help with.

BEFORE MACRO

Abc.
Lmb.
Wee.

Priority:Routine.

He is a good candidate [(sed(3a)ytr]. Consider. [Fgt]

Priority: Urgent.

Try.
Sf.

AFTER MACRO

He is a good candidate. Consider.

Terry R

@TN-MC

Sorry, but your example was not within the code block (as per the post I linked to). So we have no way of knowing if there are indentations in the data (not left justified). And many examples please, say between 5 and 10 individual situations/records.

We understand that sometimes data is confidential. In that case we ask you to edit the examples with generic text. It must be in the same context however (character replaces character and number replaces number to the same number of positions).

So I suggest you continue reading the FAQ posts and give us as much as you possibly can. Without it what tends to happen is we make a guess about the data, present the solution at which time you say, “oh, I forgot to say there is this type of data also…” and around the loop we go again.

So, whilst we will be helpful, you have to give us as much as possible to allow us to help.

There are several other FAQ posts there, also relevant to your question so maybe look through them also.

Terry

Mark Olson

Start recording a macro
Use the find/replace form:
Set Search mode to Regular expression
Find what: (?s-i).*^Priority:\h*Routine\.\R+|\s*\[[^\[\]]*\]\s*|\s*\([^()]*\)\s*|Priority:\h*Urgent\..*
Replace with: nothing
Select the entire document with Ctrl+A
Copy the entire document with Ctrl+C
Stop recording the macro

The regular expression above may need to be tweaked a bit to meet your needs, especially if you have some special rules that allow “escaped” square brackets or parentheses inside a [] or () block. It also assumes that Priority: Routine. takes up the entire line, with no leading or trailing whitespace (the Priority: Urgent. branch has no such restriction). The \h escape is useful here because it matches non-newline whitespace.
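For anyone who would rather script this outside Notepad++, here is a rough Python sketch of the same cleanup. Treat it as a translation under assumptions, not as what the macro does internally: Python's re module has no \h or \R, so they are approximated as [ \t] and (?:\r\n|\n|\r), the (?s-i) prefix is replaced by the re.DOTALL and re.MULTILINE flags (Python needs MULTILINE for ^ to match at line starts), and the name strip_report is made up for illustration.

```python
import re

# Assumed stand-ins for the Notepad++ (Boost) escapes that Python lacks:
#   \h -> [ \t]              (space or tab only, an approximation)
#   \R -> (?:\r\n|\n|\r)     (any common line ending)
PATTERN = re.compile(
    r".*^Priority:[ \t]*Routine\.(?:\r\n|\n|\r)+"  # everything through the last "Priority: Routine." line
    r"|\s*\[[^\[\]]*\]\s*"                         # [...] blocks plus surrounding whitespace
    r"|\s*\([^()]*\)\s*"                           # (...) blocks plus surrounding whitespace
    r"|Priority:[ \t]*Urgent\..*",                 # "Priority: Urgent." and everything after it
    re.DOTALL | re.MULTILINE,
)


def strip_report(text: str) -> str:
    """Hypothetical helper: delete every match of PATTERN, keep the rest."""
    return PATTERN.sub("", text)


# The BEFORE MACRO sample from earlier in the thread.
sample = (
    "Abc.\nLmb.\nWee.\n\n"
    "Priority:Routine.\n\n"
    "He is a good candidate [(sed(3a)ytr]. Consider. [Fgt]\n\n"
    "Priority: Urgent.\n\n"
    "Try.\nSf.\n"
)

print(strip_report(sample))  # He is a good candidate. Consider.
```

On that sample the result is the AFTER MACRO text. The select-and-copy step is left out because it depends on what you want to do with the result.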

TN MC

@Terry-R thank you. Something to keep in mind. TY.

TN MC

@Mark-Olson thank you. Will try and follow up.

TN MC

@Mark-Olson That is brilliant. It worked like a charm! Thank you. Thank you.

TN MC

@Mark-Olson Does that mean \h* would be redundant if, say, the text were only “Urgent” instead of “Priority: Urgent”?
Thank you.

Mark Olson

@TN-MC said in remove text:

Does that mean \h* would be redundant if, say, the text were only “Urgent” instead of “Priority: Urgent”?

The regex that matches the word Urgent is just Urgent.

I’ll just break down my regex above (see the user manual for a general guide):

(?s-i).*^Priority:\h*Routine\.\R+|\s*\[[^\[\]]*\]\s*|\s*\([^()]*\)\s*|Priority:\h*Urgent\..*

First, note the (?s-i) at the beginning. This is equivalent to checking the Match case and . matches newline checkboxes in the find/replace form, but I like using this approach because it minimizes my reliance on clicking boxes in the GUI.
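If it helps to see what those two switches do, here is a small Python sketch. It is only an analogy: Python is case-sensitive by default (the -i half), and as far as I know it does not accept the (?s-i) prefix in that exact form, so the sketch uses (?s) and (?i) separately.

```python
import re

# Without (?s), "." stops at a newline, so this search finds nothing.
no_dotall = re.search(r"a.b", "a\nb")

# With (?s), "." also matches the newline.
with_dotall = re.search(r"(?s)a.b", "a\nb")

# Case-sensitive (like "Match case" checked): "urgent" does not match "Urgent".
case_sensitive = re.search(r"urgent", "Urgent")

# (?i) makes the search case-insensitive (like "Match case" unchecked).
case_insensitive = re.search(r"(?i)urgent", "Urgent")

print(no_dotall is None, with_dotall is not None)
print(case_sensitive is None, case_insensitive is not None)
```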

Also note that there are a bunch of characters with \ before them, like \[ and \. and \]. These characters are “escaped” by the \ because they have special meanings in regular expressions.
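A quick way to see why the escaping matters, sketched in Python (the test string is invented for illustration):

```python
import re

text = "Urgent. UrgentX"

# Unescaped: "." means "any character", so "UrgentX" matches too.
unescaped = re.findall(r"Urgent.", text)

# Escaped: "\." matches only a literal period.
escaped = re.findall(r"Urgent\.", text)

# re.escape adds the backslashes for you when the text should be taken literally.
literal_brackets = re.escape("[Fgt]")

print(unescaped)         # ['Urgent.', 'UrgentX']
print(escaped)           # ['Urgent.']
print(literal_brackets)  # \[Fgt\]
```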

Next, the regex is divided into four branches, separated by |, which is the alternation (“or”) operator in regex. These branches do the following:

.*^Priority:\h*Routine\.\R+ matches all the text in the file (.*) up to and including the last line that consists of Priority:, then any amount of non-newline whitespace (\h*), then Routine. (written as Routine\. because . is a special character), followed by one or more newlines (\R+).
\s*\[[^\[\]]*\]\s* matches any amount of whitespace (\s*), then a [ character (written as \[ because it’s a special character in regex), then any number of characters that are not [ or ] ([^\[\]]*), then a ] (written as \] for the same reason), then any amount of trailing whitespace (\s*).
\s*\([^()]*\)\s* matches any amount of whitespace (\s*), then a ( character (written as \( because it’s a special character in regex), then any number of characters that are not ( or ) ([^()]*), then a ) (written as \) for the same reason), then any amount of trailing whitespace (\s*).
Priority:\h*Urgent\..* matches the exact text Priority:, then any amount of non-newline whitespace (\h*), then Urgent. (written as Urgent\. because . is a special character), then the rest of the file after that (.*).
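To see what each branch grabs on its own, here is a rough Python sketch that runs the four branches separately against the sample from earlier in the thread. The branch names are made up, and \h and \R are again approximated as [ \t] and (?:\r\n|\n|\r) because Python's re module does not have them.

```python
import re

FLAGS = re.DOTALL | re.MULTILINE      # stands in for (?s) plus line-anchored ^
NEWLINE = r"(?:\r\n|\n|\r)"           # assumed stand-in for \R

# Hypothetical labels for the four branches of the regex.
BRANCHES = {
    "before_routine": r".*^Priority:[ \t]*Routine\." + NEWLINE + "+",
    "square_brackets": r"\s*\[[^\[\]]*\]\s*",
    "parentheses": r"\s*\([^()]*\)\s*",
    "urgent_onward": r"Priority:[ \t]*Urgent\..*",
}

sample = (
    "Abc.\nLmb.\nWee.\n\n"
    "Priority:Routine.\n\n"
    "He is a good candidate [(sed(3a)ytr]. Consider. [Fgt]\n\n"
    "Priority: Urgent.\n\n"
    "Try.\nSf.\n"
)

# Collect what each branch would match if it were the only one in the regex.
matches = {name: re.findall(pattern, sample, FLAGS) for name, pattern in BRANCHES.items()}

for name, found in matches.items():
    print(name, found)
```

Run alone, the parentheses branch only finds (3a), because the outer ( is followed by another ( before any ). In the combined regex that does not matter for this sample, since the square-bracket branch removes the whole [(sed(3a)ytr] block first.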

TN MC

@Mark-Olson thank you, sir, for explaining. I would be lying if I said I understood it all. But I promise to go through it carefully and try to learn. Thank you for your time and detailed description.