finding a partial number with Wildcard
-
Hi, I have an XML file with multiple account numbers, and I want to find all lines where the <AcctNbr> tag has 222 as any part of the number. How would I search? For example:
<AcctNbr>111111222</AcctNbr>
-
-
@Alan-Kilborn said in finding a partial number with Wildcard:
Find:
(?-s)<AcctNbr>.*?222.*?</AcctNbr>
Search mode: Regular expression

I would lean towards using
(?-i)<AcctNbr>\d*222\d*</AcctNbr>
XML is case sensitive, which is why I use (?-i). It’s a number, so I use \d* instead of .*? which allows any character. As I don’t use the dot (.) operator, I don’t need the (?-s) nor the ? non-greedy modifier.

@Darren-Wartnaby - A web page that explains the various things shown here is at https://npp-user-manual.org/docs/searching/#regular-expressions
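If you want to sanity-check the two patterns outside Notepad++, here is a rough sketch in Python. It is only an approximation: Python's re module is not the regex engine Notepad++ uses, and it rejects a bare (?-i) or (?-s) prefix, so both are dropped (Python is case sensitive and keeps the dot from matching newlines by default). The sample lines are made up for illustration.

```python
import re

# Python's re is case-sensitive by default and "." does not match newlines
# by default, so the (?-i) and (?-s) prefixes from the thread are omitted.
LOOSE = re.compile(r"<AcctNbr>.*?222.*?</AcctNbr>")
STRICT = re.compile(r"<AcctNbr>\d*222\d*</AcctNbr>")

# Hypothetical sample lines, not taken from the original poster's file.
lines = [
    "<AcctNbr>111111222</AcctNbr>",  # digits only, ends with 222
    "<AcctNbr>12223</AcctNbr>",      # digits only, 222 in the middle
    "<AcctNbr>AB-222-9</AcctNbr>",   # contains non-digit characters
    "<AcctNbr>123456</AcctNbr>",     # no 222 at all
    "<acctnbr>222</acctnbr>",        # wrong case for an XML tag
]

loose_hits = [s for s in lines if LOOSE.search(s)]
strict_hits = [s for s in lines if STRICT.search(s)]

print("loose :", loose_hits)
print("strict:", strict_hits)
```

The .*? version should pick up the first three lines, while the \d* version should pick up only the first two, because it refuses anything that is not a digit between the tags.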
Alan’s version is better if there is a chance that whatever is between <AcctNbr> and </AcctNbr> has things other than numeric digits at times.
-
@Alan-Kilborn said in finding a partial number with Wildcard:
.*?222
Alan, thanks. So if I want to search for two numbers in that one string, do I repeat the .*?222 part?
-
@Darren-Wartnaby said in finding a partial number with Wildcard:
so if I want to search for two numbers in that one string, do I repeat the .*?222 part?
You could, but there are some issues with ordering; i.e., you would only find matches in the left-to-right order of the two numbers specified.
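To make the ordering issue concrete, here is a small sketch in Python (not the Notepad++ engine, so treat it as an approximation). The second number, 555, and both sample account numbers are made up for illustration.

```python
import re

# Repeating the ".*?<number>" part: 222 first, then the hypothetical 555.
# Python's re is case-sensitive and "." skips newlines by default, so no
# (?-s) or (?-i) prefix is used here.
ORDERED = re.compile(r"<AcctNbr>.*?222.*?555.*?</AcctNbr>")

samples = [
    "<AcctNbr>1222355579</AcctNbr>",  # 222 appears before 555
    "<AcctNbr>1555322279</AcctNbr>",  # 555 appears before 222
]

results = [bool(ORDERED.search(s)) for s in samples]
print(results)
```

Only the first sample should match: the second one contains both numbers, but in the opposite order from the one written in the pattern.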
-
Suppose you want to find text in AcctNbr tags containing the numbers 48, 17, and 34 in any order. You could enumerate every possible ordering of those numbers, but that would become increasingly intractable as the number of things to be matched got larger (e.g., there are 24 possible orderings of four distinct things). This is where forward lookahead really shines.
(?-si)<AcctNbr>(?=(?:(?!</AcctNbr>).)*?17)(?=(?:(?!</AcctNbr>).)*?34)(?=(?:(?!</AcctNbr>).)*?48).*?</AcctNbr>
Try that out on the text below to see what I mean.
<AcctNbr>1 2 3 4</AcctNbr>
<AcctNbr>1 2 23 14</AcctNbr>
<AcctNbr>27</AcctNbr>
<AcctNbr>1234</AcctNbr>
<AcctNbr>34 18</AcctNbr>
<AcctNbr>48734</AcctNbr>
<FooNbr>34 17 48</FooNbr>
<!-- last three should match -->
<AcctNbr>34 17 48</AcctNbr>
<AcctNbr>4817 34</AcctNbr>
<AcctNbr>483417</AcctNbr>
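For anyone who would rather check the lookahead pattern in a script, here is a sketch in Python using the same test text. Python's re module is not the engine Notepad++ uses and does not accept the leading (?-si), so that prefix is left out; Python's defaults (case sensitive, dot does not match newlines) behave the same way for this text.

```python
import re

# Each lookahead scans forward from just after <AcctNbr>, one character at a
# time, refusing to step over a closing </AcctNbr>, until it finds its number.
PATTERN = re.compile(
    r"<AcctNbr>"
    r"(?=(?:(?!</AcctNbr>).)*?17)"
    r"(?=(?:(?!</AcctNbr>).)*?34)"
    r"(?=(?:(?!</AcctNbr>).)*?48)"
    r".*?</AcctNbr>"
)

text = """\
<AcctNbr>1 2 3 4</AcctNbr>
<AcctNbr>1 2 23 14</AcctNbr>
<AcctNbr>27</AcctNbr>
<AcctNbr>1234</AcctNbr>
<AcctNbr>34 18</AcctNbr>
<AcctNbr>48734</AcctNbr>
<FooNbr>34 17 48</FooNbr>
<!-- last three should match -->
<AcctNbr>34 17 48</AcctNbr>
<AcctNbr>4817 34</AcctNbr>
<AcctNbr>483417</AcctNbr>
"""

# The pattern has no capturing groups, so findall returns whole matches.
matches = PATTERN.findall(text)
for m in matches:
    print(m)
```

Only the last three AcctNbr lines should be printed. The FooNbr line contains all three numbers but is skipped because the tag name differs.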
NOTE: this method allows overlap between things to be matched, e.g., 348 would count for both 34 and 48. There is no efficient way to prevent this overlap without making the regex much nastier.
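The overlap is easy to see in a quick Python sketch. The helper any_order_pattern below is hypothetical (my own name, not part of Notepad++ or any library); it just assembles one lookahead per number in the same shape as the regex above. Python's re is not the Notepad++ engine, so this is only an approximation.

```python
import re

def any_order_pattern(tag, numbers):
    """Hypothetical helper: build one lookahead per number for <tag>...</tag>."""
    close = re.escape(f"</{tag}>")
    lookaheads = "".join(
        f"(?=(?:(?!{close}).)*?{re.escape(n)})" for n in numbers
    )
    return re.compile(f"<{re.escape(tag)}>{lookaheads}.*?{close}")

pattern = any_order_pattern("AcctNbr", ["17", "34", "48"])

# In 17348 the digit 4 is shared: "34" and "48" overlap, yet both lookaheads pass.
shared = pattern.search("<AcctNbr>17348</AcctNbr>")
# Here 17 and 34 are present, but there is no 48 anywhere, so nothing matches.
separate = pattern.search("<AcctNbr>17 34 8</AcctNbr>")

print(bool(shared), bool(separate))
```

Each lookahead starts from the same position and knows nothing about what the others consumed, which is why a shared digit satisfies two of them at once.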