Find and Replace mass delete groups of lines that doesn't have specific word

Nguyen Quang

I do not know how to use Find and Replace to mass delete the groups that doesn’t have “Keepinfo”

Example:

{
   "model": "PartA/NameA",
  "predicate": {
    "model_data": 1234
  }
},
{
  "model": "PartA/KeepInfo/NameA",
  "predicate": {
    "model_data": 1234
  }
},
{
   "model": "PartA/Name-B",
  "predicate": {
    "model_data": 2345678
  }
},
{
  "model": "PartA/KeepInfo/Name-B",
  "predicate": {
    "model_data": 2345678
  }
},

—

moderator added code markdown around text; please don’t forget to use the </> button to mark example text as “code” so that characters don’t get changed by the forum

guy038

@nguyen-quang and All,

I suppose that marking the lines which contain the KeepInfo string as well as the subsequent lines of the same group is the best bet !

So follow the steps below :

Open your file in Notepad++
Open the Mark dialog ( Ctrl + M )
Type in the regex (?x-is) ^ { \R .+ KeepInfo .+ \R (?: .+ \R)+? (?= { | \Z ) in the Find what: zone
Uncheck all box options
Check the Purge for each search and Wrap around options only !
Select the Regular expression search mode
Click on the Mark All button
Then, click on the Search > Bookmark > Inverse Bookmark menu option
Finally, click on the Search > Bookmark > Remove Bookmark lines menu option
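
For readers who want to see the logic of these steps outside the dialog, here is a minimal Python sketch of the same idea: bookmark every line of a block whose second line contains the keyword, then drop all the other lines. The function name `remove_unbookmarked_lines` and the sample text are hypothetical, and the sketch assumes each block starts with a line holding only `{`, as in the example data. It is only a rough emulation, not what Notepad++ does internally.

```python
def remove_unbookmarked_lines(text, keyword="KeepInfo"):
    """Roughly emulate Mark All + Inverse Bookmark + Remove Bookmarked Lines."""
    lines = text.splitlines()
    # Assumption: a block starts at a line that is exactly "{" (no indentation).
    starts = [i for i, line in enumerate(lines) if line == "{"]
    bookmarked = [False] * len(lines)
    for n, start in enumerate(starts):
        end = starts[n + 1] if n + 1 < len(starts) else len(lines)
        # Like the regex, look for the keyword on the line right after "{".
        if start + 1 < end and keyword in lines[start + 1]:
            for i in range(start, end):
                bookmarked[i] = True
    # Removing the non-bookmarked lines leaves only the marked blocks.
    return "\n".join(l for l, keep in zip(lines, bookmarked) if keep)


sample = """{
   "model": "PartA/NameA",
  "predicate": {
    "model_data": 1234
  }
},
{
  "model": "PartA/KeepInfo/NameA",
  "predicate": {
    "model_data": 1234
  }
},
"""

result = remove_unbookmarked_lines(sample)
print(result)
```

The check is case-sensitive, matching the `-i` part of the `(?x-is)` modifiers in the regex above.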

Best Regards,

guy038

Oh ! Another solution could be :

Right after the Mark All operation, click on the Copy Marked Text button
Open a new tab ( Ctrl + N )
Paste the clipboard contents ( Ctrl + V )
Execute the simple regex S/R to delete the separation lines between each block :

SEARCH ^\R----\R

REPLACE Leave EMPTY

PeterJones

@Nguyen-Quang said in Find and Replace mass delete groups of lines that doesn’t have specific word:

how to use the Find and Replace to mass delete the groups that doesn’t have “Keepinfo”

Assuming each group starts with { at the beginning of the line as your example shows (no whitespace prefix) and each group ends with }, at the beginning of the line as your example shows (no whitespace prefix), then I would do the “simple” version:

FIND WHAT = (?is)^{(?:(?!KeepInfo).)*?^},
REPLACE WITH = (leave box empty)
SEARCH MODE = Regular Expression

I used (?is) to make sure . matches newline is on no matter what the state of your checkbox is, and to make sure match case is off no matter what the state of your checkbox is. I made it case insensitive because your data had KeepInfo but your statement was doesn't have "Keepinfo" – so since I couldn’t be sure of whether all the I from KeepInfo would be upper case, I made the search case insensitive.

My guess is that my assumptions will be too restrictive for your actual data, but this matches what you showed us, so…
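
To try the same pattern outside Notepad++, here is a small sketch with Python's `re` module. It is an approximation, not the Notepad++ engine: the `m` flag is added because Python's `^` only matches at line starts in multiline mode, and the braces are escaped to be safe. The sample text is a shortened copy of the example data.

```python
import re

# (?i) ignore case, (?s) let . match newlines, (?m) let ^ match at each line start.
pattern = re.compile(r"(?ism)^\{(?:(?!KeepInfo).)*?^\},")

sample = """{
   "model": "PartA/NameA",
  "predicate": {
    "model_data": 1234
  }
},
{
  "model": "PartA/KeepInfo/NameA",
  "predicate": {
    "model_data": 1234
  }
},
"""

# Replace every group that never mentions KeepInfo with nothing.
cleaned = pattern.sub("", sample)
print(cleaned)
```

The pattern stops at the `},` and does not consume the line break after it, so an empty line is left where each deleted group used to be.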

Good luck.

PeterJones

@guy038 said in Find and Replace mass delete groups of lines that doesn’t have specific word:

Search > Bookmark > Inverse Bookmark

For that to work, you also have to have the Bookmark line checkbox checkmarked… but your instructions said to not have it checkmarked.

guy038

Hi, @nguyen-quang, @peterjones and All,

Yes, indeed, I was mistaken in my first post and @peterjones is right : you must check the Bookmark option, too !

Ah…, Peter, your regex S/R does exactly what the OP wants and should be quicker than mine. Good point :-))

In free-spacing mode, your regex can be expressed as :

SEARCH (?xis) ^ { (?: (?! KeepInfo ) . )*? ^ },

REPLACE Leave Empty

BR

guy038

Mark Olson

I like the solutions proposed above because they are relatively simple and to-the-point. However, regex-replaces on JSON in general get pretty hairy due to factors including but not limited to:

how do you differentiate between a pattern in a key versus a string?
how do you take into account the fact that JSON doesn’t care about the order of keys whereas regexes do?
what if there are ] or } inside strings, such that the regex is fooled into thinking the JSON object ended?
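
A small Python sketch of the first point, under made-up data: the `note` field and both objects are hypothetical and only serve to show how a text-level search and a parser-level check can disagree about which objects mention KeepInfo.

```python
import json
import re

# Hypothetical data: the first object mentions KeepInfo only in a "note" field.
doc = """[
  {"model": "PartA/NameA", "note": "replaces the KeepInfo variant"},
  {"model": "PartA/KeepInfo/NameA", "note": ""}
]"""

# Text-level check: any line that contains the word, wherever it appears.
text_hits = [line for line in doc.splitlines() if re.search("KeepInfo", line)]

# Parser-level check: only objects whose "model" value contains the word.
json_hits = [o for o in json.loads(doc) if "KeepInfo" in o["model"]]

print(len(text_hits), len(json_hits))
```

The text search counts both objects, while the parsed check counts only the one whose model actually contains KeepInfo.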

Just to illustrate how annoying regex-replaces get once you start trying to satisfy all the syntactic requirements of JSON, here’s a regex-replace I came up with that ignores insignificant whitespace and the order of keys and removes any trailing commas. Note that, as written, it matches (and so deletes) the objects whose model contains KeepInfo, which is the inverse of what the OP asked for:

FIND WHAT = (?-i),\s*{(?:[^{]*{[^}]*}\s*,\s*"model"\s*:\s*"[^"]*KeepInfo[^"]*"\s*|\s*"model"\s*:\s*"[^"]*KeepInfo[^"]*"[^{]*{[^}]*}\s*)}\s*(,)?
REPLACE WITH = \1

JsonTools is a fine solution for this, if you’re able to use plugins.

Because it uses a JSON parser to parse the JSON, it is insensitive to the formatting of the JSON.

To filter JSON with the plugin, just use Alt-P-J-J (tap the keys in sequence, don’t hold them down simultaneously) to open the tree view, then enter one of the following queries into the query box and hit Ctrl+Enter, then Save query result.
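Two queries for this data (note that not(...) inverts the test, so remove it if you want the query result to hold the objects that do contain KeepInfo, which is what the OP asked to keep):
@[:][not(@.model =~ `(?i)keepinfo`)] selects the objects whose model does not include keepinfo, ignoring case.
@[:][not(@.model =~ KeepInfo)] selects the objects whose model does not include KeepInfo, in that exact case.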

While I’m here, PythonScript is also an efficient solution:

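import json
from Npp import editor

# Read the whole document; it must be a valid JSON array of objects.
txt = editor.getText()
j = json.loads(txt)
# Keep only the objects whose "model" contains KeepInfo (case-sensitive).
filtered = [o for o in j if 'KeepInfo' in o['model']]
filtered_text = json.dumps(filtered, indent=4)
editor.setText(filtered_text)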
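
One caveat: the example in the first post is a comma-separated fragment with a trailing comma, not a complete JSON array, so `json.loads` would reject it as posted. Below is a hedged, standalone sketch (no `Npp` module, so it can be tried outside Notepad++) that wraps such a fragment in brackets before parsing. The helper name `keep_models` is hypothetical, and the case-insensitive comparison is my assumption, chosen to mirror the `(?i)` used earlier in the thread.

```python
import json


def keep_models(text, keyword="KeepInfo"):
    """Return JSON text holding only the objects whose model contains keyword."""
    # Assumption: the input is either a JSON array or a comma-separated
    # run of objects with an optional trailing comma, as in the example.
    body = text.strip().rstrip(",")
    if not body.startswith("["):
        body = "[" + body + "]"
    objects = json.loads(body)
    # Case-insensitive test on the "model" value only.
    kept = [o for o in objects
            if keyword.lower() in o.get("model", "").lower()]
    return json.dumps(kept, indent=4)


fragment = """{
   "model": "PartA/NameA",
  "predicate": {
    "model_data": 1234
  }
},
{
  "model": "PartA/KeepInfo/NameA",
  "predicate": {
    "model_data": 1234
  }
},
"""

filtered_text = keep_models(fragment)
print(filtered_text)
```

Inside PythonScript, the same function could be fed `editor.getText()` and its result passed to `editor.setText()`, as in the script above.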