Regex: Parsing / Extract the content of a html tag and save it another file?
-
Hello, I have several HTML files. Is it possible to extract the content of the
<title></title>
tag in each file and save it all to another file? For example: file-1.html, file-2.html, file-3.html, … file-800.html
Each of them has the same tag, but with different content:
file-1.html
<title>My name is Prince</title>
file-2.html
<title>I love cars</title>
…
file-800.html
<title>My book is here</title>
So I need to extract the content of these tags and save it all to another file, for example save.txt
In save.txt I will have:
My name is Prince I love cars ... My book is here
The regex to select the content of all title tags is this:
(?s)<title>(.*?)<\/title>
What should I do next to save all the results automatically?
-
@Vasile-Caraus said in Regex: Parsing / Extract the content of a html tag and save it another file?:
What should I do next
I would run a Find in Files search, then Ctrl+a then Ctrl+c the output in the Search results window, then paste that into a new N++ tab and start processing that output with more regular expression replacements…
-
@Vasile-Caraus said in Regex: Parsing / Extract the content of a html tag and save it another file?:
(?s)<title>(.*?)</title>
OK, so I ran this regex:
(?s)(<title>)(.*?)(<\/title>)
in all files. I copied the results into save.txt and got something like this:
<title>My name is Prince</title> <title>I love cars</title> <title>My book is here</title>
Now I need to extract the content from the tags, so I use the same regex with a replacement:
Find:
(?s)(<title>)(.*?)(<\/title>)
Replace by:
\2
The output:
My name is Prince I love cars My book is here
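For anyone who would rather script that replace step than run it in N++, here is a minimal Python sketch of the same find/replace. The function name `strip_title_tags` and the sample string are made up for illustration; the pattern is the one from this post, and `\2` refers to the second group just as it does in the Replace field.

```python
import re

# Same pattern as above; (?s) lets "." also match line breaks.
TITLE_RE = re.compile(r"(?s)(<title>)(.*?)(<\/title>)")

def strip_title_tags(text):
    """Replace each <title>...</title> with just its inner text (group 2)."""
    return TITLE_RE.sub(r"\2", text)

# Hypothetical sample, mirroring the pasted search results.
sample = ("<title>My name is Prince</title> "
          "<title>I love cars</title> "
          "<title>My book is here</title>")
print(strip_title_tags(sample))
```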
Thanks. I thought it could be done in one move. :)
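Outside of N++, the "one move" version is possible with a short script. The following is only a sketch under some assumptions that are not from this thread: the files all sit in one folder, end in .html, and are UTF-8 encoded. The function names `extract_titles` and `collect_titles` are made up for illustration.

```python
import re
from pathlib import Path

# Same idea as the regex above, with one group around the tag content.
TITLE_RE = re.compile(r"(?s)<title>(.*?)</title>")

def extract_titles(html):
    """Return the text of every <title> tag found in one document."""
    return [t.strip() for t in TITLE_RE.findall(html)]

def collect_titles(folder, out_name="save.txt"):
    """Gather titles from all *.html files in folder, one per line in out_name."""
    titles = []
    # sorted() orders by name as text, so file-10.html comes before file-2.html.
    for path in sorted(Path(folder).glob("*.html")):
        # Assumption: UTF-8 input; undecodable bytes are replaced, not fatal.
        html = path.read_text(encoding="utf-8", errors="replace")
        titles.extend(extract_titles(html))
    Path(folder, out_name).write_text("\n".join(titles) + "\n", encoding="utf-8")
    return titles
```

Calling `collect_titles("path/to/folder")` would write save.txt into that same folder with one title per line. A regex is good enough for a simple, well-formed tag like this one; for messier HTML a real parser would be the safer choice.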