Regex: Find which tags contain non-English characters (outside ASCII)

5 Posts 2 Posters 227 Views
  • R
    Robin Cruise
    last edited by Jun 19, 2021, 4:19 PM

    Hello. I have HTML tags like these:

    <title>I love cars | My name (am) </title>

    <title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>

    I want to find which tags contain non-English (non-ASCII) characters.

    The search should match only the second line:

    <title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>

    How can I do this, please?

        • G
          guy038
          last edited by guy038 Jun 19, 2021, 6:45 PM Jun 19, 2021, 6:43 PM

          Hello, @robin-cruise and All,

          The regex below matches any <title> •••••• </title> range in which at least one character between <title> and </title> has a code point above \x{007F}:

          SEARCH (?s-i)<title>(?=.*?[^\x00-\x7f].*?</title>).+?</title>

          The positive look-ahead (?=.*?[^\x00-\x7f].*?</title>), placed right after the opening <title> tag, looks for a character above \x7F located further on, before a closing </title> tag.
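For readers who want to try the look-ahead outside Notepad++, here is a minimal sketch in Python. It rests on a few assumptions: Python's `re` module stands in for Notepad++'s regex engine, the inline `(?s-i)` is expressed as `re.DOTALL` (Python is case-sensitive by default), and the `TEMPERED` variant is an illustration added here, not part of the post. One thing to watch for: with dot-matches-newline on, the lazy `.*?` inside the look-ahead is free to run past a closing `</title>`, so when both sample lines are searched as one string the look-ahead for the first tag can be satisfied by a non-ASCII character in the second. The variant keeps the look-ahead inside a single tag.

```python
import re

# The two sample lines from the question, searched as one string.
SAMPLE = (
    "<title>I love cars | My name (am) </title>\n"
    "<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>\n"
)

# Pattern from the post; (?s-i) is replaced by re.DOTALL (assumption for Python).
ORIGINAL = re.compile(
    r"<title>(?=.*?[^\x00-\x7f].*?</title>).+?</title>", re.DOTALL
)

# Illustrative variant (not from the post): each character consumed by the
# look-ahead must not start a closing </title>, so the scan stays in one tag.
TEMPERED = re.compile(
    r"<title>(?=(?:(?!</title>).)*?[^\x00-\x7f]).+?</title>", re.DOTALL
)

original_hits = ORIGINAL.findall(SAMPLE)
tempered_hits = TEMPERED.findall(SAMPLE)

print(len(original_hits), "match(es) with the pattern as posted")
print(len(tempered_hits), "match(es) with the tempered look-ahead")
```

If only the non-ASCII title should be reported, comparing the two counts on your own data is a quick way to see whether the look-ahead is leaking across tags.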

          Best Regards

          guy038

          • R
            Robin Cruise
            last edited by Jun 19, 2021, 8:05 PM

            Super answer, @guy038, thanks.

            I made a short version of yours:

            <title>\K(?![^\x00-\x7F]+).*?\| - finds the first line

            <title>\K([^\x00-\x7F]+).*?\| - finds the second line
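A small Python sketch can show what the two short patterns actually test. Assumptions: Python's `re` has no `\K`, so a fixed-width look-behind `(?<=<title>)` stands in for it, and the `mixed` title is a hypothetical example that is not in the thread. Both patterns look only at the character immediately after `<title>`, so a title that starts with ASCII text and contains non-ASCII characters later is treated like the first line.

```python
import re

# \K is unavailable in Python's re; (?<=<title>) is used in its place (assumption).
STARTS_ASCII = re.compile(r"(?<=<title>)(?![^\x00-\x7F]+).*?\|")
STARTS_NON_ASCII = re.compile(r"(?<=<title>)([^\x00-\x7F]+).*?\|")

lines = {
    "ascii": "<title>I love cars | My name (am) </title>",
    "non_ascii": "<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>",
    # Hypothetical title: ASCII first, non-ASCII later.
    "mixed": "<title>Cars የማይታይ | My name (am) </title>",
}

# For each line: (matched by first pattern, matched by second pattern)
results = {
    name: (bool(STARTS_ASCII.search(text)), bool(STARTS_NON_ASCII.search(text)))
    for name, text in lines.items()
}

for name, flags in results.items():
    print(name, flags)
```

For titles that may mix scripts, a check that scans the whole tag content, as in the look-ahead approach, is the safer choice.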

            The Community of users of the Notepad++ text editor.