Regex: Find out which tags do not contain English characters (ASCII/UTF-8)
-
Hello. I have these HTML tags:
<title>I love cars | My name (am) </title>
<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>
I want to find out which tags do not contain English characters (ASCII).
The output (after the FIND) should be line 2:
<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>
How can I do this, please?
-
Hello, @robin-cruise and All,
The regex below searches for any range <title> •••••• </title> if at least one character between <title> and </title> has a code point over \x{007F}.
SEARCH (?s-i)<title>(?=.*?[^\x00-\x7f].*?</title>).+?</title>
The positive look-ahead (?=.*?[^\x00-\x7f].*?</title>), placed right after the opening <title> tag, looks for a character over \x7F located further on and before an ending </title> tag.
Best Regards,
guy038
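For anyone who wants to try the look-ahead idea outside Notepad++, here is a minimal Python sketch. It is not the regex above verbatim: the `(?s-i)` prefix is dropped, and the scan is bounded with `[^<]*` instead of `.*?`, because with dot-matches-newline a lazy `.*?` inside the look-ahead can, as far as I can tell, run past the first `</title>` and pick up a non-ASCII character from a later tag. This assumes the title text itself contains no `<`. The names `NON_ASCII_TITLE` and `find_non_ascii_titles` are made up for illustration.

```python
import re

# Variant of the look-ahead approach: [^<]* keeps the scan inside one tag
# (assumption: the title text contains no '<' character).
NON_ASCII_TITLE = re.compile(r"<title>(?=[^<]*[^\x00-\x7f])[^<]*</title>")


def find_non_ascii_titles(html):
    """Return every <title>...</title> holding at least one char above \\x7F."""
    return NON_ASCII_TITLE.findall(html)


sample = (
    "<title>I love cars | My name (am) </title>\n"
    "<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>\n"
)

for tag in find_non_ascii_titles(sample):
    print(tag)
```

Only the second tag should be printed, since the first one holds nothing above `\x7F`.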
-
Super answer, @guy038, thanks.
I made a short version of yours:
<title>\K(?![^\x00-\x7F]+).*?\| - finds the first line
<title>\K([^\x00-\x7F]+).*?\| - finds the second line
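A quick way to see what the two short patterns actually test is the Python sketch below. Python's `re` module has no `\K`, so a look-behind `(?<=<title>)` stands in for `<title>\K`; that substitution, and the names `ASCII_START`, `NON_ASCII_START` and `classify`, are assumptions made for illustration only. The point it shows is that both patterns look only at the first character after `<title>`, so a title that starts with English text and contains non-ASCII characters later is treated like the first line.

```python
import re

# (?<=<title>) replaces "<title>\K", which Python's re does not support.
ASCII_START = re.compile(r"(?<=<title>)(?![^\x00-\x7F]).*?\|")
NON_ASCII_START = re.compile(r"(?<=<title>)[^\x00-\x7F]+.*?\|")


def classify(line):
    """Report which of the two short patterns matches the given line."""
    if NON_ASCII_START.search(line):
        return "non-ascii start"
    if ASCII_START.search(line):
        return "ascii start"
    return "no match"


lines = [
    "<title>I love cars | My name (am) </title>",
    "<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>",
    "<title>Hello የማይታይ | My name (am) </title>",  # mixed: ASCII first
]

for line in lines:
    print(classify(line), "<-", line)
```

If mixed titles can occur in the file, the longer look-ahead approach is probably the safer choice.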