How to delete blocks of similar text which don't contain a specific word?

veruc w

Fellow Notepad++ Users,

Could you please help me with the following search-and-replace problem I am having?

I have a large txt file which contains blocks of very similar text (each block begins and ends with the same words), and I would like to delete those blocks that don't contain a specific word. Alternatively, I would like to select those blocks which contain the desired word and copy them into a new file.

I don't know how to make Notepad++ recognize these blocks of similar text as separate entities in the same file.
Is it possible to do?

I am using the latest version of this software.

Thanks a lot

Lycan Thrope

@veruc-w ,

Start off by reading the FAQs about how to use these forums and the markup language to describe and show your problem, and then read the FAQ on how to use Regex and the Search and Replace capability of Notepad++. This isn't a mind-reading forum, nor is this a one-stop answering service for your vague descriptions. We are users helping users, not doing their work for them. Start here by reading the Online User Manual.

guy038

Hello, @veruc-w, @lycan-thrope and All,

In addition to @lycan-thrope’s advice, just one hint to begin with.

Simply replace the placeholders BEGIN_BOUNDARY and END_BOUNDARY with your actual boundaries, and the string ABSENT_WORD with the word which must not be present in the blocks to delete.

Then, follow the steps below :

Start Notepad++ and select the tab containing your file, or open the file
Open the Replace dialog ( Ctrl + H )
Untick all box options
SEARCH (?s-i)^\h*BEGIN_BOUNDARY((?!ABSENT_WORD).)+?END_BOUNDARY.*?$\R
REPLACE Leave EMPTY
Check the Wrap around box option
Select the Regular expression search mode
Click, once only, on the Replace All button

Here you are !
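
For anyone who would rather test the idea outside Notepad++ first, here is a minimal Python sketch of the same search. It is only an approximation: Python's `re` module has no `\h` or `\R`, so `[ \t]` and `\r?\n` are swapped in, and the boundaries `BEGIN` / `END`, the word `KEEP`, the sample text and the function name `delete_blocks_without_word` are all made up for illustration.

```python
import re


def delete_blocks_without_word(text, begin, end, word):
    """Remove every begin...end block that does NOT contain `word`.

    Rough port of (?s-i)^\\h*BEGIN((?!WORD).)+?END.*?$\\R to Python's re:
    [ \\t] stands in for \\h and \\r?\\n stands in for \\R (assumptions).
    """
    pattern = re.compile(
        r"^[ \t]*" + re.escape(begin)
        # Consume one character at a time, but only while `word`
        # does not start at the current position.
        + r"(?:(?!" + re.escape(word) + r").)+?"
        + re.escape(end)
        # Rest of the closing line plus its line break.
        + r".*?$\r?\n",
        re.DOTALL | re.MULTILINE,  # (?s) and line-based ^ / $
    )
    return pattern.sub("", text)


# Hypothetical sample data: three blocks, only the second contains KEEP.
sample = (
    "BEGIN one\n"
    "apple\n"
    "END\n"
    "BEGIN two\n"
    "banana KEEP\n"
    "END\n"
    "BEGIN three\n"
    "cherry\n"
    "END\n"
)

remaining = delete_blocks_without_word(sample, "BEGIN", "END", "KEEP")
print(remaining)
```

With this sample, only the second block (the one containing `KEEP`) is left. As with the Notepad++ expression, a final block that is not followed by a line break would not be matched.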

Best Regards,

guy038

Alan Kilborn

@guy038 said :

SEARCH (?s-i)^\h*BEGIN_BOUNDARY((?!ABSENT_WORD).)+?END_BOUNDARY.*?$\R

Maybe this is better without capturing into group 1, i.e.:

SEARCH (?s-i)^\h*BEGIN_BOUNDARY(?:(?!ABSENT_WORD).)+?END_BOUNDARY.*?$\R

Also, the OP said nothing about what comes before the word that begins the block or what comes after the word that ends it (the assumption is that the begin-word and the end-word are different), so maybe this is an even better expression:

SEARCH (?s-i)BEGIN_BOUNDARY(?:(?!ABSENT_WORD).)+?END_BOUNDARY
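
To see what dropping the `^\h*` and `.*?$\R` parts changes, here is a small Python sketch of this unanchored form. The boundaries `START` / `STOP`, the word `KEEP`, the sample line and the helper name `build_unanchored_pattern` are assumptions chosen only for the demonstration; `findall` is used as a preview of what a replace would remove.

```python
import re


def build_unanchored_pattern(begin, end, word):
    """Match begin...end spans that do not contain `word`, anywhere in a line."""
    return re.compile(
        re.escape(begin)
        # Non-capturing group: advance one character only if `word`
        # does not start here.
        + r"(?:(?!" + re.escape(word) + r").)+?"
        + re.escape(end),
        re.DOTALL,  # equivalent of (?s): a block may span several lines
    )


# Hypothetical sample: two blocks embedded in one line of other text.
sample = "intro [START alpha STOP] middle [START beta KEEP STOP] outro\n"
pattern = build_unanchored_pattern("START", "STOP", "KEEP")

doomed = pattern.findall(sample)   # blocks that a replace would delete
cleaned = pattern.sub("", sample)  # text left after deleting them

print(doomed)
print(cleaned)
```

Here only the span itself disappears: the surrounding brackets, the text around the block and the line break all stay, whereas the anchored expression also removes the leading blanks, the rest of the closing line and its line break.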

It’s a pity the OP never returned to either provide more specifics (and sample data) or to say whether or not the originally proposed solution was successful.