Regex for Searching <HEAD> Section

  • A
    aksarben
    last edited by Nov 1, 2019, 5:35 PM

    I want to use Notepad++ to find soft hyphen characters (ISO 8859: 0xAD, Unicode: U+00AD SOFT HYPHEN, HTML: &shy; or &#173;) in the <head> section of my HTML files. I tried the two regular expressions below, but both return zero hits.

    <head>.*?­.*?</head>
    <head>.*­.*</head>
    

    Curiously, the following regex does find soft hyphens in <figcaption> sections:

    <figcaption>.*?­.*?</figcaption>
    

    I suspect the issue is that the <head> section contains newlines. I tried the search with the “. matches newline” option both checked and unchecked, and still got zero hits either way.

    Is there a way to do this kind of search in Notepad++?

    • A
      Alan Kilborn @aksarben
      last edited by Nov 1, 2019, 5:44 PM

      @aksarben

      I think the code blocks you used above are hiding your soft-hyphen character, at least visually. I find that if I copy and paste them into Notepad++, the soft-hyphen character reappears.

      Anyway, I would try searching for: (?s)<head>.*?\x{00AD}.*?</head>

      I think there have been some recent postings about Unicode characters used explicitly in the Find-what box of the Find dialog not working correctly…?
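For readers who want to check the idea outside Notepad++, here is a minimal sketch in Python. It is not Notepad++ itself and uses a different regex engine: Python's `re` module writes the escape as `\xad` and does not accept the `\x{00AD}` form. The sample HTML and the names `SAMPLE_HTML`, `SINGLE_LINE`, `DOT_ALL`, and `head_has_soft_hyphen` are made up for illustration. It shows that the pattern only matches a multi-line <head> when `(?s)` lets `.` cross newlines.

```python
import re

# Hypothetical sample document; "\u00ad" is U+00AD SOFT HYPHEN.
SAMPLE_HTML = (
    "<html>\n"
    "<head>\n"
    "<title>Hyphen\u00adation test</title>\n"
    "</head>\n"
    "<body><p>No soft hyphens here.</p></body>\n"
    "</html>\n"
)

# Without (?s), "." stops at every newline, so a multi-line <head> cannot match.
SINGLE_LINE = re.compile(r"<head>.*?\xad.*?</head>")

# With (?s), "." also matches newlines. The soft hyphen is written as an
# escape (\xad) rather than typed literally, so it cannot silently vanish.
DOT_ALL = re.compile(r"(?s)<head>.*?\xad.*?</head>")


def head_has_soft_hyphen(html: str) -> bool:
    """Return True if a <head>...</head> span containing U+00AD is found."""
    return DOT_ALL.search(html) is not None


if __name__ == "__main__":
    print("single-line pattern matched:", SINGLE_LINE.search(SAMPLE_HTML) is not None)
    print("dot-all pattern matched:", head_has_soft_hyphen(SAMPLE_HTML))
```

Writing the character as an escape also sidesteps the problem of a literal soft hyphen being invisible in the pattern.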

      • G
        guy038
        last edited by guy038 Nov 1, 2019, 8:01 PM Nov 1, 2019, 7:24 PM

        Hello, @aksarben, @alan-kilborn and All,

        Simply use this regex S/R:

        SEARCH (?s)(.*?<head>|\G)((?!</head>).)*?\K\xAD

        REPLACE Any SINGLE character or STRING

        Notes:

        • I assume, of course, that there is only one <head>........</head> section per file

        • The <head>........</head> section can either be on one line or split across several lines

        • Any soft hyphen found above the starting tag <head> is ignored

        • Any soft hyphen between the starting and the ending tag is matched individually

        • Any soft hyphen found below the ending tag </head> is ignored

        • Preferably, when testing on a single file, tick the Wrap around option, which forces the S/R process to start from the very beginning of the file

        Best Regards,

        guy038
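The behaviour described in the notes above can be approximated outside Notepad++ with a short Python sketch. This is a different technique from the regex in the post: Python's standard `re` module has no `\G` or `\K`, so the sketch first locates the single <head> section and then handles the soft hyphens inside it. The function names and the one-<head>-per-file assumption are illustrative only.

```python
import re

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN

# (?s) lets "." cross newlines, so the section may span several lines.
# Assumption: at most one <head>...</head> section per file.
HEAD_SECTION = re.compile(r"(?s)<head>(.*?)</head>")


def soft_hyphen_offsets_in_head(html: str) -> list[int]:
    """Return the offsets of every soft hyphen between <head> and </head>."""
    match = HEAD_SECTION.search(html)
    if match is None:
        return []
    start, end = match.span(1)  # bounds of the text inside the tags
    return [i for i in range(start, end) if html[i] == SOFT_HYPHEN]


def replace_soft_hyphens_in_head(html: str, replacement: str = "") -> str:
    """Replace soft hyphens inside <head> only; leave the rest untouched."""
    match = HEAD_SECTION.search(html)
    if match is None:
        return html
    start, end = match.span(1)
    cleaned = html[start:end].replace(SOFT_HYPHEN, replacement)
    return html[:start] + cleaned + html[end:]
```

Soft hyphens before <head> or after </head> are left alone, which mirrors the notes; unlike the Notepad++ search, this sketch always starts from the beginning of the text, so there is no equivalent of the Wrap around setting.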
