How to filter incorrect html tags with Notepad++?



  • I have some text files that are being translated, and the final translated version contains malformed HTML tags, such as

    < p>
    </ strong>
    or even <\div>

    Is there a tool or plugin within Notepad++ to automatically correct these html tag errors?

    If that’s not available, is there a plugin that can remove all HTML tags except those on a list of allowed tags, such as <p></p> and <h1></h1>?

    Thank you for sharing your knowledge.



  • @NZ-Select ,

    Didn’t the regex presented to you last month (https://community.notepad-plus-plus.org/post/56054) accomplish most of your request (removing the excess spaces)? A second search/replace that replaces <\ with </ would fix the rest of your example, and it doesn’t even have to be in regex mode. (If it is in regex mode, the backslash has to be escaped, so the search should be <\\.)



  • @NZ-Select ,

    Further, if you want to automate it more (so that you don’t have to retype those regexes every time), just record a macro while you’re doing those two regex replacements, then assign it a keyboard shortcut. You can then easily apply this fix to any new file you want.



  • Yes, with not too much work, I was able to record a macro that took:

    This is title
    < p >There is unwanted space< /p >
    < h1 >There is unwanted space< /h1 >
    < div >There is unwanted space< /div >
    < div >There is unwanted space< \div >
    < p >There is unwanted space< \ p >
    

    to become

    This is title
    <p>There is unwanted space</p>
    <h1>There is unwanted space</h1>
    <div>There is unwanted space</div>
    <div>There is unwanted space</div>
    <p>There is unwanted space</p>
    
            <Macro name="ForNZ" Ctrl="no" Alt="no" Shift="no" Key="0">
                <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
                <Action type="3" message="1601" wParam="0" lParam="0" sParam="(?-s)&lt;\h*\\" />
                <Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
                <Action type="3" message="1602" wParam="0" lParam="0" sParam="&lt;/" />
                <Action type="3" message="1702" wParam="0" lParam="770" sParam="" />
                <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
                <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
                <Action type="3" message="1601" wParam="0" lParam="0" sParam="(?-s)&lt; *?(/?) *?(\w+) *?&gt;" />
                <Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
                <Action type="3" message="1602" wParam="0" lParam="0" sParam="&lt;$1$2&gt;" />
                <Action type="3" message="1702" wParam="0" lParam="770" sParam="" />
                <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
            </Macro>
    

    To add that to your macros: close all instances of Notepad++. Open a single instance, open %AppData%\Notepad++\shortcuts.xml in it, paste the above inside the <Macros>... section, save, and exit Notepad++. The next time you open Notepad++, your macro should exist.
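
If you want to sanity-check the two replacements outside Notepad++ before installing the macro, here is a rough Python sketch of the same two passes. It is only an approximation: Python's `re` module has no `\h`, so `[ \t]` stands in for it, the `(?-s)` prefix is dropped because neither pattern uses a dot, and the function name `fix_tags` is made up for this example.

```python
import re

# Pass 1: "<", optional spaces/tabs, then a backslash becomes "</".
# [ \t] is an assumed stand-in for Notepad++'s \h (horizontal whitespace).
BACKSLASH_CLOSE = re.compile(r"<[ \t]*\\")

# Pass 2: drop the spaces around an optional "/" and the tag name.
SPACED_TAG = re.compile(r"< *?(/?) *?(\w+) *?>")


def fix_tags(text: str) -> str:
    """Apply both passes in the same order as the macro (hypothetical helper)."""
    text = BACKSLASH_CLOSE.sub("</", text)
    return SPACED_TAG.sub(r"<\1\2>", text)


if __name__ == "__main__":
    sample = "< p >There is unwanted space< \\ p >"
    print(fix_tags(sample))
```

As in the macro, the order matters: the backslash is turned into a slash first, so the second pass only has to deal with stray spaces. Tags with attributes are not matched by the second pattern and are left unchanged.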



  • Thank you so much for the detailed guide.

