How to compare & remove all multiple blocks of lines?
-
Hi guys,
I have a new question and am looking for some help with it. This time I need a way to compare specific text content / lines enclosed by specific start / end markers. Below is an example of a few blocks.
<
Here some text with unknown length of chars & lines
>
<
Something here 1 2 3
>
As you can see above, there are 2 blocks, each using the start & end markers < > to enclose the block content. I have many files with such blocks, but the blocks are mixed up and many of them appear multiple times. My goal is to find and remove all duplicate blocks from the files. What I want to do is copy all blocks from all files into one single file, then find all identical blocks and remove the repeats. It is similar to finding and removing duplicate lines, but in this case everything is in blocks. How can I do that? Is there any plugin / Python script that can do that? If not, it might be a good idea to add something like this to a plugin or to npp itself, where the user can specify start & end markers like < > (in this case) to tell the plugin which areas to compare.
-
It would help if you gave us:
- some examples of what you actually want the final product to look like
- some more examples of things you want to match
- some examples of things you do not want to match
Right now I notice I am too confused by your question to be able to offer any help.
-
@Dean-Corso said in How to compare & remove all multiple blocks of lines?:
So as you can see above there are 2 blocks which using some start & end marker <> where the block content is stored. Now I have many files with those block content but they are somehow mixed and multiple times present and my goal is it now to find and remove all multiple blocks from the files. So what I wanna do is to copy all blocks of all files into one single file and then I wanna find all same blocks and remove them. Similar like to find double or more same lines and remove them but in this case I have all in blocks. How can I do that?
First create the combined file. You can do that in a Windows command prompt using the copy command.
Determine some character that does not appear anywhere in the combined file. For this example, I’ll use #, but you must pick an appropriate one. Then:
Find what: (?<!>)\R(?!<$)
Replace with: #
(Note: The first < in that expression is a special character indicating look-behind. The second < and the > are your start and end of block indicators.)
Now each block is on a single line; proceed as for removing matching single lines (e.g., Edit | Line Operations | Remove Duplicate Lines).
Finally:
Find what: #
Replace with: \r\n
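For anyone who would rather script this than do it in the Replace dialog, here is a rough Python sketch of the same three steps (join each block onto one line, drop duplicate lines, split the blocks again). It is only an illustration, not a tested tool. The function name dedupe_blocks and the default placeholder are made up for this example, and it assumes the < and > markers end their lines, as the expression above does. Python's re module has no \R, so line endings are normalised to \n first.

```python
import re


def dedupe_blocks(text: str, placeholder: str = "\x1f") -> str:
    """Remove repeated <...> blocks, keeping the first occurrence of each.

    `placeholder` plays the role of "#" in the manual steps; it is an
    assumption of this sketch and must not occur anywhere in the text.
    """
    if placeholder in text:
        raise ValueError("placeholder must not occur in the text")

    # Normalise line endings so a plain "\n" can stand in for \R.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Step 1: join the lines of each block. A line break is kept only when
    # it follows ">" or comes right before a line consisting of "<" alone.
    joined = re.sub(r"(?<!>)\n(?!<$)", placeholder, text, flags=re.MULTILINE)

    # Step 2: drop duplicate lines; dict.fromkeys keeps first-seen order.
    unique_lines = dict.fromkeys(joined.split("\n"))

    # Step 3: turn the placeholder back into line breaks.
    return "\n".join(unique_lines).replace(placeholder, "\n")


if __name__ == "__main__":
    sample = "<\nalpha\nbeta\n>\n<\ngamma\n>\n<\nalpha\nbeta\n>\n"
    print(dedupe_blocks(sample))
```

Unlike the manual steps, this writes \n rather than \r\n when it splits the blocks again, so convert the line endings afterwards if the files need Windows line breaks.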
-
Hi again,
thanks for your help guys. It seems to work using your method, @Coises: pick # or ## and turn every block into a single line. Very good idea, and at first glance it did everything correctly.
@Mark-Olson, I just wanted to find & remove all duplicate blocks, i.e. blocks that are identical. The 2 blocks I posted above are not identical, so just imagine them duplicated. Anyway, the solution from @Coises works well for me, and now I can merge all my files into one single file and clean it up so that every block is present only once.
PS: By the way, I have another small question about that combination of commands. I would like to create a macro from it. Do I have to type in every single character again each time I record the macro? I mean this, for example: “(?<!>)\R(?!<$)”. What if I have longer expressions that I need to use many times? Do I have to enter them again and again, character by character?