Search line without ending tag
-
Hello.
I'm very bad at regex. Can someone tell me how I can find lines with the following problem?
I have an XML document with a lot of data for a tax system. Some files were generated with errors: the closing tag goes to the next line, and I need to find these cases.
One more problem: there can be spaces or tabs before the beginning of the tag.
I need to find cases like <TypePost>Main, where the closing tag goes to the next line:
<Post>Manager</Post>
<subdivision><Marketing</subdivision>
<TypePost>Main
</TypePost>
-
The \R denotes a line ending and \h stands for horizontal whitespace, which can be spaces or tabs.
So, with this in mind, you might consider
find what: <TypePost>Main\R\h+
replace with: <TypePost>Main
-
@Ekopalypse TY, it's working.
But I'm sorry, I forgot one thing.
What if there can be any tag and any text here?
Something like this, where I need to find the UsersC and UUID lines, and some others with different names:
<UsersC>21
</UsersC>
<UUID>be9a1528-9a/6/0/-4917-8857-12896a7693de</UUID>
<Date>2020-01-20</Date>
<UUID>7f8e38ab-ceba-45c5-ab34-834b61bad840
</UUID>
-
If your data is consistent, then something like this
find what: <(\w+>)(.*)\R\h*(</\1)
replace with: <\1\2\3
might do it. So we are looking for
- a tag <(\w+>) (a less-than sign followed by any word followed by a greater-than sign)
- followed by any text (.*)
- followed by an end-of-line character \R
- followed by horizontal spaces, if any, \h*
- followed by the start of a closing tag </ followed by what was found in the starting tag \1 -> (</\1)
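For anyone who wants to try the same idea outside Notepad++, here is a rough Python sketch of the steps listed above. It is only an approximation: Python's re module has no \R or \h, so they are spelled out as (?:\r\n|\n|\r) and [ \t], and .* is written as [^\r\n]* so that it stops before a carriage return. The replacement writes the leading < back explicitly, because it sits outside group 1. The names EOL, HSPACE, SPLIT_TAG and join_split_tags are made up for this example.

```python
import re

# Assumption: Python's re has no \R or \h, so both are approximated here.
EOL = r"(?:\r\n|\n|\r)"   # stands in for \R
HSPACE = r"[ \t]"         # stands in for \h (space or tab only)

# Same shape as <(\w+>)(.*)\R\h*(</\1): tag name plus ">", text on the line,
# a line break, optional indentation, then the closing tag with the same name.
SPLIT_TAG = re.compile(r"<(\w+>)([^\r\n]*)" + EOL + HSPACE + r"*(</\1)")

def join_split_tags(text):
    """Pull a closing tag that slipped to the next line back onto its line."""
    # "<" is outside group 1, so it has to be written back explicitly.
    return SPLIT_TAG.sub(r"<\1\2\3", text)

sample = (
    "<UsersC>21\n"
    "</UsersC>\n"
    "<Date>2020-01-20</Date>\n"
    "<UUID>7f8e38ab-ceba-45c5-ab34-834b61bad840\n"
    "\t</UUID>\n"
)
print(join_split_tags(sample))
```

Lines whose closing tag is already in place, like the Date line, are left untouched.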
-
@Ekopalypse Amazing, it's working very well))
Thank you very much, and thank you for the description of the process.
-
Hello, @alex-mesch, @ekopalypse and All,
A second possibility, derived from @ekopalypse’s solution, would be :
- Open the Replace dialog (Ctrl + H)
- SEARCH <(\w+)>.*\K\R\h*(?=</\1>)
- REPLACE Leave EMPTY
- Now, choose one of the following:
  - Tick the Wrap around option if you want to process the S/R on the whole file, from beginning to end
  - Untick the Wrap around option to process the S/R from the current location to the end of the file
  - Do a normal selection of text first and then tick the In selection option
- Select the Regular expression search mode
- Click exclusively on the Replace All button, whatever your choice !
Notes :
- Due to the \K syntax inside this regex, the search process works correctly, but the "step by step" replacement, with the Replace button, is not functional :-(
- The search regex looks for a line-break, possibly followed by some blank characters (tabulation and/or space), ONLY IF :
  - It is preceded by <, then a tag name \w+, stored as group 1 because it is enclosed in parentheses, then > and any subsequent character(s) .*, even 0, up to the line-break
  - It is followed by the matching ending tag </...>, due to the positive look-ahead structure (?=</\1>) and the \1 syntax, which represents the tag name
- As the replacement zone is empty, the EOL and the possible blank chars are simply deleted !
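The look-ahead variant can also be tried outside Notepad++, with some changes. Python's re module has no \K, and its look-behind must have a fixed width, so the sketch below is an adaptation and not the regex above itself: it captures everything before the line-break in group 1 and writes it back, which has the same effect as \K with an empty replacement. As assumptions, \R is approximated with (?:\r\n|\n|\r), \h with [ \t], and .* with [^\r\n]*. The names BREAK_BEFORE_CLOSE and drop_break_before_closing_tag are invented for this example.

```python
import re

# Assumptions: \R, \h and "." are approximated; \K is emulated with a group.
EOL = r"(?:\r\n|\n|\r)"   # stands in for \R
HSPACE = r"[ \t]"         # stands in for \h (space or tab only)

# Group 1 plays the role of everything before \K, group 2 is the tag name.
# The look-ahead checks for the closing tag without consuming it.
BREAK_BEFORE_CLOSE = re.compile(
    r"(<(\w+)>[^\r\n]*)" + EOL + HSPACE + r"*(?=</\2>)"
)

def drop_break_before_closing_tag(text):
    """Delete the line break and indentation that precede a closing tag."""
    # Writing group 1 back means only the break and blanks are removed.
    return BREAK_BEFORE_CLOSE.sub(r"\1", text)

sample = "<TypePost>Main\n    </TypePost>\n<Post>Manager</Post>\n"
print(drop_break_before_closing_tag(sample))
```

Because the closing tag is only looked at and never consumed, it stays in the text as it was.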
Best Regards,
guy038
-
-
@guy038 Thanks)
Tomorrow I will study how it works) It's very hard for my brain)
-
Hi, @alex-mesch,
To begin with, click on the link, below :
https://community.notepad-plus-plus.org/topic/15765/faq-desk-where-to-find-regex-documentation
Cheers,
guy038