How to filter incorrect html tags with Notepad++?
-
I have some text files that are being translated, and the final translated version has malformed HTML tags, such as
< p>
</ strong>
or even <\div>
Is there a tool or plugin within Notepad++ to automatically correct these HTML tag errors?
If that’s not available, is there a plugin that can remove all HTML tags except a list of correct tags such as <p></p> <h1></h1>?
Thank you for sharing your knowledge.
-
Didn’t the regex presented to you last month (https://community.notepad-plus-plus.org/post/56054) accomplish most of your request (removing the excess spaces)? Running a second search/replace (it doesn’t even have to be in regex mode) that replaces <\ with </ would fix the rest of your example (if it is in regex mode, the search should be <\\ ).
-
Further, if you want to automate it more (so that you don’t have to retype those regex every time), just record a macro while you’re doing those two regex replacements, then assign a keyboard shortcut. You can then easily do this fix to any new file you want.
-
Yes, with not too much work, I was able to record a macro that took:
This is title < p >There is unwanted space< /p > < h1 >There is unwanted space< /h1 > < div >There is unwanted space< /div > < div >There is unwanted space< \div > < p >There is unwanted space< \ p >
to become
This is title <p>There is unwanted space</p> <h1>There is unwanted space</h1> <div>There is unwanted space</div> <div>There is unwanted space</div> <p>There is unwanted space</p>
<Macro name="ForNZ" Ctrl="no" Alt="no" Shift="no" Key="0">
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="(?-s)&lt;\h*\\" />
    <Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&lt;/" />
    <Action type="3" message="1702" wParam="0" lParam="770" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
    <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
    <Action type="3" message="1601" wParam="0" lParam="0" sParam="(?-s)&lt; *?(/?) *?(\w+) *?&gt;" />
    <Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
    <Action type="3" message="1602" wParam="0" lParam="0" sParam="&lt;$1$2&gt;" />
    <Action type="3" message="1702" wParam="0" lParam="770" sParam="" />
    <Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
</Macro>
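For anyone who wants to check the two regular expressions outside Notepad++, here is a minimal Python sketch of the same two replacements the macro performs. It is only an approximation under stated assumptions: Python's re module has no \h, so [ \t]* stands in for it, and the (?-s) prefix is dropped because it has no effect on these patterns. The function name fix_tags is made up for this illustration.

```python
import re

# Step 1: "<", optional horizontal whitespace, then a backslash  ->  "</".
# The macro's pattern is <\h*\\ ; [ \t]* is an assumed stand-in for \h*,
# which Python's re module does not support.
BACKSLASH_CLOSE = re.compile(r"<[ \t]*\\")

# Step 2: drop the spaces just inside the angle brackets of a simple tag.
# Same pattern as in the macro, minus the (?-s) prefix.
STRAY_SPACES = re.compile(r"< *?(/?) *?(\w+) *?>")


def fix_tags(text: str) -> str:
    """Apply both replacements in the same order as the macro."""
    text = BACKSLASH_CLOSE.sub("</", text)
    return STRAY_SPACES.sub(r"<\1\2>", text)


if __name__ == "__main__":
    sample = r"< div >There is unwanted space< \div > < p >There is unwanted space< \ p >"
    print(fix_tags(sample))
```

Like the macro, this only handles bare tags such as < p > or < /h1 >; a tag that carries attributes is left untouched by the second pattern.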
To add this macro to your macros: close all running instances of Notepad++. Open %AppData%\Notepad++\shortcuts.xml in a single Notepad++ instance, add the <Macro> element above inside the <Macros>... section, save, and exit Notepad++. Open Notepad++ again, and your macro should now exist.
-
Thank you so much for the detailed guide.