Finding HTML tags and everything between them on multiple lines?



  • Can Notepad++ find (for example):
    <script> some stuff some more stuff (new line)
    and more stuff </script>

    I’m trying to get rid of the bloat-code on an HTML document. Thanks for any help!



  • Hello, @semicodin,

    A regex that matches a whole block, even one spanning several lines, written with either syntax :

    - <script .....>................... </script>
    
    or :
    
    - <script> ........................ </script>
    

    is (?s-i)<(script)( |>).*?</\1>


    Notes :

    • The initial part (?s-i) is a pair of modifiers, which force the regex engine to consider that :

      • The special dot character can match absolutely any character ( standard or End of Line characters )

      • The search is performed in a case-sensitive way ( = not case-insensitive )

    • Then, the first part <(script)( |>) tries to match either the string <script> OR the string <script followed by a space character. Note that the word script, enclosed in round parentheses, is stored as group 1

    • The third part </\1> matches the exact string </script>, because \1 is a back-reference to group 1

    • And the second part .*? matches the shortest possible range of characters between part 1 and part 3
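
    To see the three parts working together outside Notepad++, here is a minimal Python sketch. It is only an illustration, not part of the Notepad++ workflow: Python's re module does not accept the leading (?s-i) form, so the sketch assumes re.DOTALL as the equivalent of (?s) and relies on Python being case-sensitive by default for -i. The sample HTML and the names SCRIPT_BLOCK, html and cleaned are made up for the example.

```python
import re

# Same pattern as above, minus the (?s-i) prefix:
# re.DOTALL lets the dot match line breaks, like (?s).
# No re.IGNORECASE flag is given, so matching stays case-sensitive, like -i.
SCRIPT_BLOCK = re.compile(r"<(script)( |>).*?</\1>", re.DOTALL)

# Hypothetical sample document with two script blocks.
html = """<p>keep me</p>
<script type="text/javascript"> some stuff
and more stuff </script>
<p>keep me too</p>
<script>
var x = 1;
</script>
"""

# Replace every matched block with nothing, as an empty
# "Replace with" field would do in Notepad++.
cleaned = SCRIPT_BLOCK.sub("", html)
print(cleaned)
```

    Both blocks are removed, including the one that spans several lines, while the surrounding paragraphs are left alone.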

    IMPORTANT :

    Don’t forget that this simple regex assumes that no other block <script......</script> is nested inside the initial block !
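
    The nesting limitation can be reproduced with a short Python sketch. As an assumption, re.DOTALL stands in for the (?s) modifier, and the nested input string is invented purely for the demonstration.

```python
import re

# Python stand-in for (?s-i)<(script)( |>).*?</\1>
SCRIPT_BLOCK = re.compile(r"<(script)( |>).*?</\1>", re.DOTALL)

# Hypothetical input where one block sits inside another.
text = "<script> outer <script> inner </script> tail </script>"

# The lazy .*? stops at the FIRST closing tag it meets,
# so the match ends in the middle of the outer block.
match = SCRIPT_BLOCK.search(text)
print(match.group(0))

# Deleting the matches leaves the end of the outer block behind.
leftover = SCRIPT_BLOCK.sub("", text)
print(repr(leftover))
```

    The match ends at the inner </script>, so the tail of the outer block survives the replacement.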

    Best Regards,

    guy038

