How to remove all HTML tags except <p>, <h1>, and <h2> tags?
I have several articles in txt files under a directory.
The articles’ HTML is somehow messed up.
I wish to remove all HTML tags except the <p>, <h1>, and <h2> tags.
The following code removes all HTML tags.
How do I add an exception?
It should keep any tag named p, h1, or h2.
Thank you in advance for your sharing of RegEx knowledge!
PeterJones last edited by
Thank you for the reply.
This code now replaces all HTML tags except the h1, h2, and p tags.
But I notice that it also replaces the closing </h1>, </h2>, and </p> tags.
I tried the patterns below to keep those closing tags, but they failed.
Would you advise how to keep the trailing tags?
I found that this code does the job.
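The snippets from this thread are not reproduced above, so the following is only a sketch of the general idea, written in Python rather than as an editor search pattern. It assumes a negative lookahead on the tag name; the pattern names, `strip_tags`, and the sample string are all hypothetical. It also shows the likely reason closing tags were lost: if the lookahead does not allow for an optional leading `/`, tags such as `</p>` are not recognised as exceptions.

```python
import re

# Assumed pattern (not taken from the thread): match any tag whose name
# is NOT p, h1 or h2. The optional "/" inside the lookahead protects
# closing tags such as </p>; \b stops "p" from also matching <pre>.
KEEP_OPEN_AND_CLOSE = re.compile(r"<(?!/?(?:p|h1|h2)\b)[^>]*>", re.IGNORECASE)

# Same idea without "/?": opening tags are kept, closing tags are removed.
KEEP_OPEN_ONLY = re.compile(r"<(?!(?:p|h1|h2)\b)[^>]*>", re.IGNORECASE)


def strip_tags(html, pattern=KEEP_OPEN_AND_CLOSE):
    """Delete every tag matched by `pattern`, leaving the text content."""
    return pattern.sub("", html)


# Hypothetical sample input for illustration.
sample = '<div><h1 class="t">Title</h1><p>Some <b>bold</b> text</p></div>'

print(strip_tags(sample))
# <h1 class="t">Title</h1><p>Some bold text</p>

print(strip_tags(sample, KEEP_OPEN_ONLY))
# <h1 class="t">Title<p>Some bold text
```

A regex like this treats tags purely as text, so it may misbehave on comments, scripts, or attribute values that contain `>`. For badly broken markup, an HTML parser is usually the safer tool.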