Need to find a string that doesn't have any tags
-
I need to find a string/paragraph in my XML file when it doesn’t have any tags.
For example:
<para>my text</para>
My text
<para>my text</para>
…
I want to find that bolded text (the line without tags) using a regex.
Is there a way to write such a regex?
Thanks
Ganesan G -
Hello, @ganesan-govindarajan and All,
To get the text My text, with that exact case, when it is neither preceded by the <para> tag nor followed by the </para> tag, use the regex:
- SEARCH
(?-i)(?<!<para>)My text(?!</para>)
Note that if your search must be insensitive to case, change the leading modifier (?-i) to (?i).
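As a rough illustration outside Notepad++, the same lookaround pattern can be tried with Python's standard re module. This is only a sketch: the four sample lines and the names BOTH_TAGS_MISSING and sample_lines are made up for the demonstration, and the leading (?-i) is dropped because re is case-sensitive by default.

```python
import re

# Same lookbehind/lookahead idea as the SEARCH regex above.
# Python's re is case-sensitive by default, so (?-i) is left out.
BOTH_TAGS_MISSING = re.compile(r"(?<!<para>)My text(?!</para>)")

# Hypothetical sample lines, one per tag situation
sample_lines = [
    "<para>My text</para>",  # both tags present
    "My text",               # both tags missing
    "<para>My text",         # closing tag missing
    "My text</para>",        # opening tag missing
]

# Indices of the lines where the pattern finds a match
matching = [
    index
    for index, line in enumerate(sample_lines)
    if BOTH_TAGS_MISSING.search(line)
]
print(matching)  # → [1]
```

Only the line where both tags are absent is reported, which is the behaviour described for this first regex.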
Now, if you want to get the text My text, with that exact case, when at least one of the <para> and </para> tags is missing (either one or both), use either of these regexes:
- SEARCH
(?-i)(?<!<para>)(My text)|(?1)(?!</para>)
- SEARCH
(?-i)<para>My text</para>(*SKIP)(*F)|My text
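For readers who want to try the "at least one tag missing" idea outside Notepad++: as far as I know, Python's standard re module supports neither (*SKIP)(*F) nor the (?1) subroutine call, so the sketch below swaps in a different technique. The correctly tagged form is consumed by a first alternative without capturing, and only matches where the capture group took part are kept. The names PATTERN and lines_with_untagged and the sample text are assumptions for illustration.

```python
import re

# Alternative 1 consumes a correctly wrapped occurrence without capturing it.
# Alternative 2 captures any other "My text" in group 1.
PATTERN = re.compile(r"<para>My text</para>|(My text)")

def lines_with_untagged(text):
    """Return 1-based numbers of lines where "My text" lacks at least one tag."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Keep the line only if some match went through group 1
        if any(m.group(1) is not None for m in PATTERN.finditer(line)):
            hits.append(number)
    return hits

# Hypothetical sample, one tag situation per line
sample = "<para>My text</para>\nMy text\n<para>My text\nMy text</para>\n"
print(lines_with_untagged(sample))  # → [2, 3, 4]
```

The group check plays the role of (*SKIP)(*F): a fully tagged occurrence still matches, but it is discarded afterwards instead of being skipped by the regex engine.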
Finally, if your goal is to correct all the possible wrong syntaxes, use the following regex S/R:
- SEARCH
(?-i)(<para>)?My text(</para>)?
- REPLACE
(?1:<para>)$0?2:</para>
Of course, select the Regular expression search mode and tick, if necessary, the Wrap around option.
Test this S/R against this sample:
<para>My text</para>
My text
<para>My text
My text</para>
After replacement, you should obtain the expected result:
<para>My text</para>
<para>My text</para>
<para>My text</para>
<para>My text</para>
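The conditional replacement above is specific to the Notepad++ (Boost) replacement syntax. A minimal Python sketch of the same S/R, assuming a replacement function stands in for the (?1:…) and ?2:… conditionals, could look like this. The names PATTERN, _add_missing_tags and fix_tags are hypothetical.

```python
import re

# Optional opening tag in group 1, optional closing tag in group 2
PATTERN = re.compile(r"(<para>)?My text(</para>)?")

def _add_missing_tags(match):
    # Mirror the conditional replacement: add a tag only when its group is absent
    opening = "" if match.group(1) else "<para>"
    closing = "" if match.group(2) else "</para>"
    return opening + match.group(0) + closing

def fix_tags(text):
    """Wrap every "My text" occurrence in <para>…</para> if it is not already."""
    return PATTERN.sub(_add_missing_tags, text)

sample = "<para>My text</para>\nMy text\n<para>My text\nMy text</para>"
fixed = fix_tags(sample)
print(fixed)
```

Each of the four sample lines should come out as <para>My text</para>, as in the expected result above.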
Best Regards
guy038
-
Hi @guy038
Thanks for the help!
Sorry, “My text” was only an example. My intention is to find any sentence, like “This is a Notepad++ regex…”, that has no opening and closing tags. The rest of the XML file may have opening and closing tags, which I can easily identify.
Thanks
Ganesan. G -
Hi, @ganesan-govindarajan and All,
Ah… OK. So, whatever the contents of the tags, is that it?
Then the following generic regex should work nicely!
- SEARCH
(?-i)<(\w+)>(?2)</\1>(*SKIP)(*F)|(\QWhatever you want\E)
Note that the part between \Q (for Quote) and \E (for End) is simply treated as a literal string of characters!
So, for a very simple search text such as My text, the \Q and \E syntaxes are not necessary and you may use this practical regex:
- SEARCH
(?-i)<(\w+)>(?2)</\1>(*SKIP)(*F)|(My text)
When tested against the text below:
01 <para>My text</para>
02 <blockquote>My text <!-- MISSING tag -->
03 <abc>My text</xyz> <!-- NON-regular syntax -->
04 My text <!-- MISSING tags -->
05 <ganesan>My text</ganesan>
06 <123>My text<456> <!-- NON-regular syntax -->
07 My text</blockquote> <!-- MISSING tags -->
08 <h1>My text</h1>
09 (toto)My text(/toto) <!-- NON-regular syntax -->
10 (Test)My text[/test] <!-- NON-regular syntax -->
it matches the string My text only in cases of non-regular syntax or missing tags, that is, in lines 02, 03, 04, 06, 07, 09 and 10!
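The list of matching lines can be cross-checked with a short Python sketch. Since the standard re module has, to my knowledge, no (*SKIP)(*F) and no (?2) subroutine call, the search text is written out twice and the capture-group trick is used instead: a correctly tagged occurrence is consumed by the first alternative, and only matches that went through group 2 are reported. PATTERN, SAMPLE and flagged_line_numbers are names chosen for this example.

```python
import re

# <(\w+)>My text</\1> consumes a correctly tagged occurrence (any tag name).
# Otherwise group 2 captures a "My text" that is not correctly tagged.
PATTERN = re.compile(r"<(\w+)>My text</\1>|(My text)")

SAMPLE = """\
01 <para>My text</para>
02 <blockquote>My text <!-- MISSING tag -->
03 <abc>My text</xyz> <!-- NON-regular syntax -->
04 My text <!-- MISSING tags -->
05 <ganesan>My text</ganesan>
06 <123>My text<456> <!-- NON-regular syntax -->
07 My text</blockquote> <!-- MISSING tags -->
08 <h1>My text</h1>
09 (toto)My text(/toto) <!-- NON-regular syntax -->
10 (Test)My text[/test] <!-- NON-regular syntax -->
"""

def flagged_line_numbers(text):
    """Return the two-digit prefixes of lines with a wrongly tagged "My text"."""
    flagged = []
    for line in text.splitlines():
        if any(m.group(2) is not None for m in PATTERN.finditer(line)):
            flagged.append(line[:2])
    return flagged

print(flagged_line_numbers(SAMPLE))
# → ['02', '03', '04', '06', '07', '09', '10']
```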
Similarly, if you’re looking for wrong syntaxes of the sentence This is a Notepad++ regex. it’s better to use the syntax below, because the text to search for contains the + and . signs, which are regex symbols with a special meaning:
- SEARCH
(?-i)<(\w+)>(?2)</\1>(*SKIP)(*F)|(\QThis is a Notepad++ regex.\E)
Test it against this similar sample:
01 <para>This is a Notepad++ regex.</para>
02 <blockquote>This is a Notepad++ regex. <!-- MISSING tag -->
03 <abc>This is a Notepad++ regex.</xyz> <!-- NON-regular syntax -->
04 This is a Notepad++ regex. <!-- MISSING tags -->
05 <ganesan>This is a Notepad++ regex.</ganesan>
06 <123>This is a Notepad++ regex.<456> <!-- NON-regular syntax -->
07 This is a Notepad++ regex.</blockquote> <!-- MISSING tags -->
08 <h1>This is a Notepad++ regex.</h1>
09 (toto)This is a Notepad++ regex.(/toto) <!-- NON-regular syntax -->
10 (Test)This is a Notepad++ regex.[/test] <!-- NON-regular syntax -->
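The reason for quoting the sentence can be shown with a small Python sketch. Python's re module has no \Q…\E as far as I know, so re.escape is used here as the closest equivalent. The variable names are hypothetical.

```python
import re

sentence = "This is a Notepad++ regex."

# re.escape backslash-escapes regex metacharacters such as + and .
# so the whole sentence is matched literally, like \Q...\E in Notepad++
literal = re.compile(re.escape(sentence))

# The exact sentence is found, whatever surrounds it
found_exact = bool(literal.search("<abc>This is a Notepad++ regex.</xyz>"))

# A text without the final dot is not found, because the escaped "."
# only matches a real dot and not any character
found_without_dot = bool(literal.search("This is a Notepad++ regexp"))

print(found_exact, found_without_dot)  # → True False
```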
Best Regards,
guy038