Search line without ending tag
-
Hello.
I’m very bad at regex. Can someone tell me how I can find lines with the following problem?
I have an XML document with a lot of info for a tax system. Some files were generated with errors: the closing tag goes to the next line, and I need to find these cases.
One more problem: there can be spaces or tabs before the closing tag on the next line.
Need to find cases like
<TypePost>Main *and here closing tag going to the next line
<Post>Manager</Post>
<subdivision><Marketing</subdivision>
<TypePost>Main
</TypePost>
-
The \R denotes line endings and \h stands for horizontal spaces, which can be spaces or tabs.
With this in mind, you might consider
find what: <TypePost>Main\R\h+
replace with: <TypePost>Main
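For readers who want to try the same idea outside Notepad++, here is a minimal Python sketch. It is only an approximation: Python's `re` module has no `\R` or `\h`, so the names `LINE_BREAK` and `H_SPACE` below are hypothetical stand-ins, and the sample text is made up for illustration.

```python
import re

# Assumption: Python's re module has no \R or \h, so both are spelled out.
LINE_BREAK = r"(?:\r\n|\n|\r)"  # stands in for \R
H_SPACE = r"[ \t]"              # stands in for \h (space or tab only)

# Same idea as: find what <TypePost>Main\R\h+
pattern = re.compile(r"<TypePost>Main" + LINE_BREAK + H_SPACE + r"+")

# Made-up sample: the closing tag sits on the next line after a tab.
sample = "<Post>Manager</Post>\n<TypePost>Main\n\t</TypePost>\n"

# Same idea as: replace with <TypePost>Main
fixed = pattern.sub("<TypePost>Main", sample)
print(fixed)
```

Note that `\h+` requires at least one space or tab before the closing tag. A closing tag that starts at the very beginning of the next line is only matched with `\h*`.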
-
@Ekopalypse TY, it's working.
But, I’m sorry) I forgot one thing.
What if there can be any tag and any text here?
Something like this, where I need to find the UsersC and UUID lines, and some others with different names.
<UsersC>21
</UsersC>
<UUID>be9a1528-9a/6/0/-4917-8857-12896a7693de</UUID>
<Date>2020-01-20</Date>
<UUID>7f8e38ab-ceba-45c5-ab34-834b61bad840
</UUID>
-
If your data is consistent, then something like this
find what: <(\w+>)(.*)\R\h*(</\1)
replace with: <\1\2\3
might do it. So we are looking for
- a tag <(\w+>) (a less-than sign followed by any word followed by a greater-than sign)
- followed by any text (.*)
- followed by an end-of-line char \R
- followed by optional horizontal spaces \h*
- followed by the start of a closing tag </ followed by what was found in the starting tag \1 -> (</\1)
The leading < of the opening tag is matched outside of any group, so the replacement writes it back before \1.
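The generic pattern can be sketched in Python as well. This is not Notepad++ syntax: `\R` and `\h` are replaced by the hypothetical helpers `LINE_BREAK` and `H_SPACE`, the function name `join_split_tags` is invented for this example, and the sample data is shortened. Because the opening `<` is matched outside of any capture group, the replacement in this sketch writes it back explicitly.

```python
import re

# Assumption: Python's re module has no \R or \h, so both are spelled out.
LINE_BREAK = r"(?:\r\n|\n|\r)"  # stands in for \R
H_SPACE = r"[ \t]"              # stands in for \h (space or tab only)

# Group 1: tag name plus '>', group 2: the text, group 3: the closing tag.
pattern = re.compile(r"<(\w+>)(.*)" + LINE_BREAK + H_SPACE + r"*(</\1)")


def join_split_tags(text: str) -> str:
    """Pull a closing tag back onto the line of its opening tag."""
    # The leading '<' is consumed by the match but not captured,
    # so it has to be re-inserted in front of group 1.
    return pattern.sub(r"<\1\2\3", text)


sample = (
    "<UsersC>21\n"
    "</UsersC>\n"
    "<Date>2020-01-20</Date>\n"
    "<UUID>7f8e38ab-ceba-45c5-ab34-834b61bad840\n"
    "  </UUID>\n"
)
print(join_split_tags(sample))
```

Lines whose closing tag is already on the same line, such as the `<Date>` element, are left untouched because no line break sits between the text and the matching closing tag.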
-
@Ekopalypse amazing, it’s working very well))
Thank you very much, and thank you for the description of the process.
-
Hello, @alex-mesch, @ekopalypse and All,
A second possibility, derived from @ekopalypse’s solution, would be:
- Open the Replace dialog (Ctrl + H)
- SEARCH <(\w+)>.*\K\R\h*(?=</\1>)
- REPLACE Leave EMPTY
- Now, choose one of the following:
  - Tick the Wrap around option if you want to process the S/R on the whole file, from beginning to end
  - Untick the Wrap around option to process the S/R from the current location to the end of the file
  - Do a normal selection of text first and then tick the In selection option
- Select the Regular expression search mode
- Click, exclusively, on the Replace All button, whatever your choice!
Notes:
- Due to the \K syntax inside this regex, the search process works correctly, but the “step by step” replacement, with the Replace button, is not functional :-(
- The search regex looks for a line-break, possibly followed by some blank characters (tabulation and/or space), ONLY IF:
  - It is preceded by <, then a tag name \w+, stored as group 1 because it is embedded in parentheses, then > and any subsequent character(s) .*, even 0, till the line-break
  - It is followed by the same ending tag </...>, due to the positive look-ahead structure (?=</\1>) and the \1 syntax, which represents the tag name
- As the replacement zone is empty, the EOL and the possible blank chars are simply deleted!
Best Regards,
guy038
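The look-ahead variant can also be approximated in Python, under a few assumptions: the standard `re` module supports `(?=...)` but not `\K`, `\R` or `\h`. In this sketch `\K` is emulated by capturing everything before the line break and writing it back, `LINE_BREAK` and `H_SPACE` are hypothetical stand-ins, and the function name and sample text are invented for illustration.

```python
import re

# Assumption: Python's re module has no \R, \h or \K.
LINE_BREAK = r"(?:\r\n|\n|\r)"  # stands in for \R
H_SPACE = r"[ \t]"              # stands in for \h (space or tab only)

# Group 1 captures "<tag>text" (the part that \K would discard from the match),
# group 2 is the tag name reused by the look-ahead.
pattern = re.compile(
    r"(<(\w+)>.*)" + LINE_BREAK + H_SPACE + r"*(?=</\2>)"
)


def remove_break_before_closing_tag(text: str) -> str:
    """Delete the line break and blanks that precede a matching closing tag."""
    # Writing group 1 back keeps the opening tag and its text;
    # the closing tag is only looked at, never consumed.
    return pattern.sub(r"\1", text)


sample = "<UUID>abc\n\t</UUID>\n<Date>2020-01-20</Date>\n"
print(remove_break_before_closing_tag(sample))
```

Because the closing tag sits inside the look-ahead, it is never part of the match, so only the line break and the blanks in front of it are removed.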
-
@guy038 thx)
Tomorrow I will study how it works) Very hard for my brain)
-
Hi, @alex-mesch,
To begin with, click on the link, below :
https://community.notepad-plus-plus.org/topic/15765/faq-desk-where-to-find-regex-documentation
Cheers,
guy038