Regex: Find which tags do not contain English characters (ASCII/UTF-8)
-
Hello. I have HTML tags like these:
<title>I love cars | My name (am) </title>
<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>
I want to find which tags do not contain English characters (ASCII).
The output of the search should be line 2:
<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>
How can I do this, please?
-
Hello, @robin-cruise and All,
The regex below searches for any <title> •••••• </title> range if at least one character between <title> and </title> has a code point over \x{007F}:
SEARCH
(?s-i)<title>(?=.*?[^\x00-\x7f].*?</title>).+?</title>
The positive look-ahead (?=.*?[^\x00-\x7f].*?</title>), placed after the opening <title> tag, looks for a character with a code point over \x7F, located further on and before an ending </title> tag.
Best Regards
guy038
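For anyone who wants to try the same check outside Notepad++, here is a minimal Python sketch. It assumes Python's re module rather than Notepad++'s Boost engine, so the (?s-i) prefix is replaced by the re.DOTALL flag. The names PLAIN, TEMPERED and sample are made up for illustration. The tempered form (?:(?!</title>).) is an addition of this sketch, not part of the regex above: with dot-all active, a plain .*? inside the look-ahead can run past one </title> into a later title when several titles sit in the same text.

```python
import re

english_title = "<title>I love cars | My name (am) </title>"
amharic_title = "<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>"
sample = english_title + "\n" + amharic_title + "\n"

# Direct translation of the search regex; DOTALL stands in for (?s).
PLAIN = re.compile(
    r"<title>(?=.*?[^\x00-\x7f].*?</title>).+?</title>",
    re.DOTALL,
)

# Tempered variant (an assumption of this sketch): every consumed character
# must not begin a closing </title>, so the scan stays inside one element.
TEMPERED = re.compile(
    r"<title>(?=(?:(?!</title>).)*?[^\x00-\x7f])(?:(?!</title>).)+</title>",
    re.DOTALL,
)

plain_hits = PLAIN.findall(sample)
tempered_hits = TEMPERED.findall(sample)

# With both titles in one text, the plain look-ahead also accepts the
# English title, because it finds the non-ASCII character in the next one.
print(len(plain_hits))
# The tempered variant reports only the title holding a non-ASCII character.
print(tempered_hits)
```

If every file holds a single <title> element, both patterns should behave the same; the difference only shows up when several titles share one text.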
-
Super answer, @guy038, thanks.
I made a short version of yours:
<title>\K(?![^\x00-\x7F]+).*?\| - finds the first line
<title>\K([^\x00-\x7F]+).*?\| - finds the second line
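A quick way to see what the two short patterns test is this Python sketch. It rests on some assumptions: Python's re has no \K, so a fixed-width look-behind (?<=<title>) stands in for it, and the names STARTS_ASCII, STARTS_NON_ASCII and the mixed title are hypothetical. As far as I can tell, both patterns only look at the character right after <title>, so a title that starts in English and contains non-ASCII text later lands in the first group.

```python
import re

# (?<=<title>) replaces "<title>\K", which Python's re does not support.
STARTS_ASCII = re.compile(r"(?<=<title>)(?![^\x00-\x7F]+).*?\|")
STARTS_NON_ASCII = re.compile(r"(?<=<title>)([^\x00-\x7F]+).*?\|")

english = "<title>I love cars | My name (am) </title>"
amharic = "<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>"
# Hypothetical title: begins with ASCII, contains non-ASCII further on.
mixed = "<title>Cars የማይታይ | My name (am) </title>"

for label, text in [("english", english), ("amharic", amharic), ("mixed", mixed)]:
    print(
        label,
        "starts ASCII:", STARTS_ASCII.search(text) is not None,
        "starts non-ASCII:", STARTS_NON_ASCII.search(text) is not None,
    )
```

For the two sample lines in this thread the short patterns separate them as described, but the mixed title is reported as starting with ASCII, which the longer look-ahead regex would still catch as containing a character over \x7F.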