    Regex: Parsing / Extract the content of a html tag and save it another file?

    3 Posts 2 Posters 2.2k Views
    • Vasile Caraus

      Hello, I have several HTML files. Is it possible to extract the content of the <title></title> tag from each file and save all of them to another file? For example:

      file-1.html, file-2.html, file-3.html, … file-800.html

      each of them has the same tag, but with different content:

      file-1.html
      <title>My name is Prince</title>

      file-2.html
      <title>I love cars</title>
      …
      file-800.html
      <title>My book is here</title>

      So, I need to extract the content of these tags, and save them into another file, for example save.txt

      In save.txt I will have:

      My name is Prince
      I love cars
      ...
      My book is here
      

      The regex to select the content of all title tags is this: (?s)<title>(.*?)<\/title> What should I do next to save all the results automatically?

      • Alan Kilborn @Vasile Caraus

        @Vasile-Caraus said in Regex: Parsing / Extract the content of a html tag and save it another file?:

        What should I do next

        I would run a Find in Files search, press Ctrl+A and then Ctrl+C in the Search results window to copy the output, paste that into a new N++ tab, and start processing it with more regular expression replacements…

        • Vasile Caraus

          @Vasile-Caraus said in Regex: Parsing / Extract the content of a html tag and save it another file?:

          (?s)<title>(.*?)</title>

          OK, so I ran this regex in all files: (?s)(<title>)(.*?)(<\/title>) I copied the results into save.txt and got something like this:

          <title>My name is Prince</title>
          <title>I love cars</title>
          <title>My book is here</title>
          

          Now I need to extract the content from the tags, so I use the same regex with a replacement:

          Find: (?s)(<title>)(.*?)(<\/title>)
          Replace with: \2

          The output

          My name is Prince
          I love cars
          My book is here
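
For readers who prefer to script this step, the same find-and-replace can be sketched in Python. This is only an illustration under assumptions: the function name `strip_title_tags` and the sample string are hypothetical, and Python's `re` module stands in for Notepad++'s regex engine. The forward slash does not need escaping here.

```python
import re

# Same three-group pattern as in the thread; (?s) lets "." match newlines too.
TITLE_RE = re.compile(r"(?s)(<title>)(.*?)(</title>)")


def strip_title_tags(text: str) -> str:
    """Replace every <title>...</title> match with its second group (the content)."""
    return TITLE_RE.sub(r"\2", text)


# Hypothetical sample mirroring the pasted search results.
sample = (
    "<title>My name is Prince</title>\n"
    "<title>I love cars</title>\n"
    "<title>My book is here</title>\n"
)
result = strip_title_tags(sample)
print(result)
```

The `\2` in the replacement plays the same role as in the Replace field above: it keeps only the text captured between the opening and closing tags.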
          

          Thanks. I thought it could be done in one move. :)
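
Outside Notepad++, the whole task can be done in one pass with a short script. The following is a minimal sketch, not a method proposed in the thread: it assumes Python 3.9+, UTF-8 encoded files, and at most one title of interest per file. The names `extract_titles` and `save_titles` are hypothetical helpers.

```python
import re
from pathlib import Path

# The capturing regex from the first post; re.DOTALL is the equivalent of (?s).
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL)


def extract_titles(folder: Path, pattern: str = "*.html") -> list[str]:
    """Return the first <title> content of each matching file in the folder."""
    titles = []
    for path in sorted(folder.glob(pattern)):
        # Assumption: files are UTF-8; undecodable bytes are replaced, not fatal.
        text = path.read_text(encoding="utf-8", errors="replace")
        match = TITLE_RE.search(text)
        if match:
            # Collapse internal whitespace so each title occupies one line.
            titles.append(" ".join(match.group(1).split()))
    return titles


def save_titles(folder: Path, output: Path) -> int:
    """Write one title per line to the output file and return how many were written."""
    titles = extract_titles(folder)
    output.write_text("".join(title + "\n" for title in titles), encoding="utf-8")
    return len(titles)


# Example call (commented out so importing or running this has no side effects):
# save_titles(Path("."), Path("save.txt"))
```

Note that `sorted` orders file names as text, so file-10.html comes before file-2.html. If the numeric order matters, the sort key would need to parse the number out of the name. Files without a title tag are skipped silently.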

          The Community of users of the Notepad++ text editor.