    Regex: Find which tags contain non-English characters (non-ASCII / UTF-8)

    5 Posts 2 Posters 223 Views
    • Robin Cruise

      Hello. I have HTML tags like these:

      <title>I love cars | My name (am) </title>

      <title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>

      I want to find the tags that contain non-English (non-ASCII) characters.

      The search should match only line 2:

      <title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>

      How can I do this, please?

          • guy038

            Hello, @robin-cruise and All,

            The regex below matches any <title> •••••• </title> range in which at least one character between <title> and </title> has a code point above \x{007F}

            SEARCH (?s-i)<title>(?=.*?[^\x00-\x7f].*?</title>).+?</title>

            The positive look-ahead (?=.*?[^\x00-\x7f].*?</title>), placed right after the opening <title> tag, checks that a character above \x7F occurs further on, before a closing </title> tag
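For readers who want to try the idea outside Notepad++, here is a minimal sketch in Python. It is an adaptation, not the Notepad++ search itself: `re.DOTALL` stands in for `(?s)`, the `-i` part is dropped because Python's `re` is case-sensitive by default, and the names `TITLE_WITH_NON_ASCII`, `STRICT_TITLE` and `find_titles` are made up for illustration. It also shows one caveat: when several title tags sit in the same text, the look-ahead can run past the first `</title>`, so a stricter variant (an assumption of this sketch, not part of the original answer) keeps the check inside a single tag.

```python
import re

# Python adaptation of the SEARCH regex above (hypothetical names).
TITLE_WITH_NON_ASCII = re.compile(
    r"<title>(?=.*?[^\x00-\x7f].*?</title>).+?</title>", re.DOTALL
)

# Stricter variant (assumption): the scan for a non-ASCII character
# is not allowed to cross a closing </title> tag.
STRICT_TITLE = re.compile(
    r"<title>(?:(?!</title>).)*?[^\x00-\x7f].*?</title>", re.DOTALL
)

lines = [
    "<title>I love cars | My name (am) </title>",
    "<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>",
]


def find_titles(pattern, text):
    """Return every full <title>...</title> match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


# One title per text (for example one <title> per HTML file):
# only the title with a character above \x7F is reported.
per_line = [find_titles(TITLE_WITH_NON_ASCII, line) for line in lines]

# Both titles in one text: the look-ahead of the first <title> finds
# the non-ASCII character of the second one, so both are reported.
joined = "\n".join(lines)
loose = find_titles(TITLE_WITH_NON_ASCII, joined)

# The stricter variant reports only the second title.
strict = find_titles(STRICT_TITLE, joined)

print(per_line)
print(loose)
print(strict)
```

If each file holds a single title tag, as is usual for HTML pages, the original regex and the stricter variant should give the same result.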

            Best Regards

            guy038

            • Robin Cruise

              Super answer, @guy038, thanks.

              I made a shorter version of yours:

              <title>\K(?![^\x00-\x7F]+).*?\| - finds the first line

              <title>\K([^\x00-\x7F]+).*?\| - finds the second line
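A rough Python sketch of these two short patterns follows. Python's `re` has no `\K`, so this adaptation uses the fixed-width look-behind `(?<=<title>)` instead, which is an assumption of the sketch; the names `ASCII_START` and `NON_ASCII_START` are hypothetical. Note that both patterns only test the character right after `<title>`, so a title that starts with ASCII text and contains non-ASCII characters later is treated like the first line.

```python
import re

# (?<=<title>) replaces <title>\K, which Python's re does not support.
ASCII_START = re.compile(r"(?<=<title>)(?![^\x00-\x7F]+).*?\|")
NON_ASCII_START = re.compile(r"(?<=<title>)([^\x00-\x7F]+).*?\|")

line1 = "<title>I love cars | My name (am) </title>"
line2 = "<title>የማይታይ እሳት በታላቅነት ከተያዘ ነፍስ | My name (am) </title>"
# Hypothetical mixed title: ASCII first, non-ASCII later.
mixed = "<title>Cars የማይታይ | My name (am) </title>"

for text in (line1, line2, mixed):
    ascii_hit = ASCII_START.search(text)
    non_ascii_hit = NON_ASCII_START.search(text)
    print(
        "ASCII start:", ascii_hit.group(0) if ascii_hit else None,
        "/ non-ASCII start:", non_ascii_hit.group(0) if non_ascii_hit else None,
    )
```

For titles that may mix scripts, the look-ahead version from the earlier reply is the safer choice, since it checks every character between the tags.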

              The Community of users of the Notepad++ text editor.