Community

    copy xml blocks with contains string

    • zemaria523
      zemaria523 last edited by

      Hi, I have an XML file with millions of lines, made up of XML blocks like these:

      <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
      <plist version="1.0">
      <dict>
      	<key>relativePath</key>
      	<string>Library/Preferences/com.apple.Safari.SafeBrowsing.plist</string>
      	<key>birth</key>
      	<string>6955023</string>
      	<key>groupID</key>
      	<string>501</string>
      	<key>mode</key>
      	<string>33152</string>
      	<key>modified</key>
      	<string>4444</string>
      	<key>statusChanged</key>
      	<string>695502523</string>
      	<key>userID</key>
      	<string>501</string>
      	<key>size</key>
      	<string>103</string>
      </dict>
      </plist>","contents":{"signature":"BDuaupIWv889qLBr41lViCemOSlz","owner":"55555555","size":128,"reference_signature":"AYh7zng="UTF-8"?>
      <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
      <plist version="1.0">
      <dict>
      	<key>relativePath</key>
      	<string>Library/Preferences/com.apple.SetupAssistant.plist</string>
      	<key>birth</key>
      	<string>6945</string>
      	<key>groupID</key>
      	<string>501</string>
      	<key>mode</key>
      	<string>33152</string>
      	<key>modified</key>
      	<string>695045</string>
      	<key>statusChanged</key>
      	<string>44444</string>
      	<key>userID</key>
      	<string>501</string>
      	<key>size</key>
      	<string>161</string>
      </dict>
      </plist>","contents":{"signature":"fffdssffsfs","owner":"5555555","size":256," encoding="UTF-8"?>
      

      I want to copy every whole block <!DOCTYPE … </plist> that contains the word SetupAssistant. Is there a way to do that?

      • Mark Olson
        Mark Olson @zemaria523 last edited by Mark Olson

        @zemaria523
        It's kind of ugly, but try this: go to the Mark tab of the find/replace window, set the search mode to Regular expression, search for (?s-i:<!DOCTYPE\s+plist(?:(?!</plist>).)*?SetupAssistant(?:(?!</plist>).)*?</plist>), hit Mark All, and then hit Copy Marked Text.
        From the above example, the text copied will be

        <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
        <plist version="1.0">
        <dict>
        	<key>relativePath</key>
        	<string>Library/Preferences/com.apple.SetupAssistant.plist</string>
        	<key>birth</key>
        	<string>6945</string>
        	<key>groupID</key>
        	<string>501</string>
        	<key>mode</key>
        	<string>33152</string>
        	<key>modified</key>
        	<string>695045</string>
        	<key>statusChanged</key>
        	<string>44444</string>
        	<key>userID</key>
        	<string>501</string>
        	<key>size</key>
        	<string>161</string>
        </dict>
        </plist>
        

        You will get some lines with ---- in between each marked thing. I suppose you could find/replace ^----$ with nothing to get rid of those.

        Explanation:

        1. <!DOCTYPE\s+plist finds the doctype declaration.
        2. (?:(?!</plist>).)*?SetupAssistant matches all text up to and including SetupAssistant, but only if SetupAssistant comes before the next </plist> close tag.
        3. (?:(?!</plist>).)*?</plist> finds all text up to and including the close tag.
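For anyone who would rather script this than use the Mark dialog, the three steps above can be sketched in Python. This is only an illustration under assumptions: the function name `find_blocks` and the shortened sample text are made up here, and `re.DOTALL` stands in for the `(?s)` flag of the Notepad++ regex. Python's `re` is case-sensitive by default, so the `-i` part needs no counterpart.

```python
import re

# Same pattern as the Notepad++ regex, split into its three parts.
BLOCK_RE = re.compile(
    r"<!DOCTYPE\s+plist"      # 1. the doctype declaration that starts a block
    r"(?:(?!</plist>).)*?"    # 2. text that never steps over a </plist> ...
    r"SetupAssistant"         #    ... up to and including the required word
    r"(?:(?!</plist>).)*?"    # 3. the rest of the block ...
    r"</plist>",              #    ... up to and including the close tag
    re.DOTALL,                # let "." match newlines, like (?s)
)


def find_blocks(text):
    """Return every <!DOCTYPE plist ... </plist> block that contains SetupAssistant."""
    return BLOCK_RE.findall(text)


# Shortened, made-up sample with the same shape as the data in the question.
sample = (
    '<!DOCTYPE plist PUBLIC "x">\n<plist version="1.0">\n'
    "<string>com.apple.Safari.SafeBrowsing.plist</string>\n</plist>junk\n"
    '<!DOCTYPE plist PUBLIC "x">\n<plist version="1.0">\n'
    "<string>com.apple.SetupAssistant.plist</string>\n</plist>junk\n"
)

blocks = find_blocks(sample)
print(len(blocks))
```

The `(?!</plist>)` lookahead in front of every `.` is what keeps a match from starting in one block and ending in a later one, so the Safari block is skipped rather than swallowed into the match.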
        • Alan Kilborn
          Alan Kilborn @Mark Olson last edited by

          @Mark-Olson said in copy xml blocks with contains string:

          You will get some lines with ---- in between each marked thing

          This happens to delimit the matches when each match is multiline.
          For the situation here, it is pretty clear where the matches separate, but it is not always so clear, so the ---- is put in.
          At least, that’s my interpretation as to the “why” it is there.
          I don’t think I’ve ever seen this explained before.

          As you say, easy enough to remove with a regex replacement operation.
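If the copied text ends up being processed outside the editor, the same cleanup can be sketched in Python. This assumes each separator is a line holding exactly four hyphens, as described above; the helper name `strip_separators` is hypothetical.

```python
import re


def strip_separators(copied):
    """Remove every line that consists solely of ----, including its line ending."""
    # ^ with re.MULTILINE anchors at each line start; \r? tolerates Windows line endings;
    # (?:\n|\Z) requires the line to end right after the hyphens.
    return re.sub(r"^----\r?(?:\n|\Z)", "", copied, flags=re.MULTILINE)


cleaned = strip_separators("</plist>\n----\n<!DOCTYPE plist>\n")
```

Unlike replacing ^----$ with nothing, this also removes the line break, so no empty lines are left behind. Lines that merely start with four hyphens are left alone.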

          • zemaria523
            zemaria523 @Mark Olson last edited by

            @Mark-Olson Thank you very much. It solved my problem!

            Copyright © 2014 NodeBB Forums | Contributors