    Regex for Searching <HEAD> Section

    General Discussion
    • aksarben

      I want to use Notepad++ to find soft hyphen characters (ISO 8859: 0xAD, Unicode U+00AD SOFT HYPHEN, HTML: &shy; or &#173;) in the <head> section of my HTML files. I tried the two regular expressions below, but both return zero hits.

      <head>.*?­.*?</head>
      <head>.*­.*</head>
      

      Curiously, the following regex does find soft hyphens in <figcaption> sections:

      <figcaption>.*?­.*?</figcaption>
      

      I suspect the issue is that the <head> section contains newlines. I tried the search with the “. matches newline” option both checked and unchecked, and still got zero hits either way.

      Is there a way to do this kind of search in Notepad++?

      • Alan Kilborn

        @aksarben

        I think the code blocks you used above are hiding your soft-hyphen character, at least visually. I find that if I copy and paste them into Notepad++, the soft-hyphen character reappears.

        Anyway, I would try searching for: (?s)<head>.*?\x{00AD}.*?</head>

        I think there have been some recent postings about Unicode characters used explicitly in the Find-what box of the Find dialog not working correctly…?
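For readers who want to check the idea outside Notepad++, here is a minimal Python sketch of the same pattern. It is only an illustration under stated assumptions: the sample HTML is made up, and Python's `re` module spells the escape `\xad` rather than `\x{00AD}`. It shows that a multi-line <head> section is matched only when `.` is allowed to match newlines, which is what `(?s)` turns on.

```python
import re

SOFT_HYPHEN = "\u00ad"

# Hypothetical sample file content with one soft hyphen inside <head>.
html = (
    "<html>\n<head>\n"
    "<title>Soft" + SOFT_HYPHEN + "ware</title>\n"
    "</head>\n<body><p>text</p></body>\n</html>\n"
)

# Without (?s), "." stops at every newline, so a multi-line <head> cannot match.
no_dotall = re.search(r"<head>.*?\xad.*?</head>", html)

# With (?s), "." also matches newlines and the whole section is matched.
with_dotall = re.search(r"(?s)<head>.*?\xad.*?</head>", html)

print(no_dotall is None)
print(with_dotall is not None)
```

Writing the character as an escape also avoids depending on an invisible literal character surviving a copy and paste.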

        • guy038

          Hello, @aksarben, @alan-kilborn and All,

          Simply use this regex S/R:

          SEARCH (?s)(.*?<head>|\G)((?!</head>).)*?\K\xAD

          REPLACE Any SINGLE character or STRING

          Notes:

          • I assume, of course, that there is only one <head>........</head> section per file

          • The <head>........</head> section can either be on one line or split across several lines

          • Any soft hyphen found above the starting tag <head> is ignored

          • Any soft hyphen between the starting and the ending tag is found individually

          • Any soft hyphen found below the ending tag </head> is ignored

          • Preferably, when testing on a single file, tick the Wrap around option, which forces the S/R process to start from the very beginning of the file
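The behaviour described in the notes above can be reproduced in a short Python sketch. This is not the same technique: Python's standard `re` module has no `\G` or `\K`, so the sketch swaps in a two-step approach (locate the first <head>...</head> section, then look for soft hyphens only inside it). The function names and the sample text are hypothetical, chosen for illustration.

```python
import re

SOFT_HYPHEN = "\u00ad"
HEAD_RE = re.compile(r"(?s)<head>(.*?)</head>")


def soft_hyphens_in_head(text):
    """Return offsets of soft hyphens inside the first <head>...</head> section."""
    head = HEAD_RE.search(text)
    if head is None:
        return []
    start = head.start(1)
    return [start + m.start() for m in re.finditer(r"\xad", head.group(1))]


def replace_in_head(text, replacement):
    """Replace soft hyphens inside <head> only; text outside it is left untouched."""
    head = HEAD_RE.search(text)
    if head is None:
        return text
    body = head.group(1).replace(SOFT_HYPHEN, replacement)
    return text[:head.start(1)] + body + text[head.end(1):]


# Hypothetical sample: soft hyphens above, inside (twice) and below <head>.
sample = (
    "<!-- a\u00adb -->\n"
    "<head>\n<title>Soft\u00adware</title>\n"
    "<meta name='description' content='hy\u00adphen'>\n</head>\n"
    "<body>co\u00adop</body>\n"
)

offsets = soft_hyphens_in_head(sample)
cleaned = replace_in_head(sample, "-")
print(len(offsets))
```

Like the regex, the sketch assumes a single <head> section per file and ignores soft hyphens above the opening tag and below the closing tag.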

          Best Regards,

          guy038
