    copy xml blocks with contains string

    • zemaria523

      Hi, I have an XML file with millions of lines, made up of blocks like these:

      <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
      <plist version="1.0">
      <dict>
      	<key>relativePath</key>
      	<string>Library/Preferences/com.apple.Safari.SafeBrowsing.plist</string>
      	<key>birth</key>
      	<string>6955023</string>
      	<key>groupID</key>
      	<string>501</string>
      	<key>mode</key>
      	<string>33152</string>
      	<key>modified</key>
      	<string>4444</string>
      	<key>statusChanged</key>
      	<string>695502523</string>
      	<key>userID</key>
      	<string>501</string>
      	<key>size</key>
      	<string>103</string>
      </dict>
      </plist>","contents":{"signature":"BDuaupIWv889qLBr41lViCemOSlz","owner":"55555555","size":128,"reference_signature":"AYh7zng="UTF-8"?>
      <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
      <plist version="1.0">
      <dict>
      	<key>relativePath</key>
      	<string>Library/Preferences/com.apple.SetupAssistant.plist</string>
      	<key>birth</key>
      	<string>6945</string>
      	<key>groupID</key>
      	<string>501</string>
      	<key>mode</key>
      	<string>33152</string>
      	<key>modified</key>
      	<string>695045</string>
      	<key>statusChanged</key>
      	<string>44444</string>
      	<key>userID</key>
      	<string>501</string>
      	<key>size</key>
      	<string>161</string>
      </dict>
      </plist>","contents":{"signature":"fffdssffsfs","owner":"5555555","size":256," encoding="UTF-8"?>
      

      I want to copy every whole block <!DOCTYPE … </plist> that contains the word SetupAssistant. Is there a way to do that?

      • Mark Olson @zemaria523

        @zemaria523
        Kinda ugly, but try this: go to the Mark tab of the find/replace window, set Search Mode to Regular expression, search for (?s-i:<!DOCTYPE\s+plist(?:(?!</plist>).)*?SetupAssistant(?:(?!</plist>).)*?</plist>), hit Mark All, and then hit Copy Marked Text.
        From the above example, the text copied will be

        <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
        <plist version="1.0">
        <dict>
        	<key>relativePath</key>
        	<string>Library/Preferences/com.apple.SetupAssistant.plist</string>
        	<key>birth</key>
        	<string>6945</string>
        	<key>groupID</key>
        	<string>501</string>
        	<key>mode</key>
        	<string>33152</string>
        	<key>modified</key>
        	<string>695045</string>
        	<key>statusChanged</key>
        	<string>44444</string>
        	<key>userID</key>
        	<string>501</string>
        	<key>size</key>
        	<string>161</string>
        </dict>
        </plist>
        

        You will get some lines with ---- in between each marked thing. I suppose you could find/replace ^----$ with nothing to get rid of those.

        Explanation:

        1. <!DOCTYPE\s+plist finds the doctype declaration.
        2. (?:(?!</plist>).)*?SetupAssistant matches all text up to and including SetupAssistant without crossing a </plist> close tag, so the keyword has to sit inside the same block as the doctype.
        3. (?:(?!</plist>).)*?</plist> matches the rest of the block, up to and including the next </plist> close tag.
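
For anyone who would rather script this than use the Mark dialog, the three steps above can be sketched in Python. This is only a rough equivalent, not something from the thread: re.DOTALL stands in for the inline (?s) flag, Python's re is case-sensitive by default so the -i part is dropped, and the names BLOCK_RE, find_blocks and sample are made up for illustration.

```python
import re

# Same three pieces as the Notepad++ search expression.
BLOCK_RE = re.compile(
    r"<!DOCTYPE\s+plist"      # 1. the doctype declaration that opens a block
    r"(?:(?!</plist>).)*?"    # 2. any text that does not cross a </plist> ...
    r"SetupAssistant"         #    ... up to and including the keyword
    r"(?:(?!</plist>).)*?"    # 3. the rest of the block ...
    r"</plist>",              #    ... up to and including the close tag
    re.DOTALL,                # let "." match newlines, like (?s)
)


def find_blocks(text):
    """Return every <!DOCTYPE plist ... </plist> block containing the keyword."""
    return BLOCK_RE.findall(text)


# Shortened stand-in for the data in the question: two blocks, one with the keyword.
sample = (
    '<!DOCTYPE plist PUBLIC "x">\n<plist version="1.0">\n'
    "<string>Library/Preferences/com.apple.Safari.SafeBrowsing.plist</string>\n"
    '</plist>","contents":{}\n'
    '<!DOCTYPE plist PUBLIC "x">\n<plist version="1.0">\n'
    "<string>Library/Preferences/com.apple.SetupAssistant.plist</string>\n"
    '</plist>","contents":{}\n'
)

blocks = find_blocks(sample)
print(len(blocks))  # 1
```

The first block is skipped because the (?!</plist>) lookahead stops the match from running past its close tag to reach SetupAssistant in the next block. On a file with millions of lines this reads everything into memory, so it may be slow.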
        • Alan Kilborn @Mark Olson

          @Mark-Olson said in copy xml blocks with contains string:

          You will get some lines with ---- in between each marked thing

          The ---- lines are there to delimit the matches when a match spans multiple lines.
          Here it is pretty clear where one match ends and the next begins, but that is not always the case, so the ---- is put in.
          At least, that’s my interpretation of why it is there.
          I don’t think I’ve ever seen this explained before.

          As you say, easy enough to remove with a regex replacement operation.
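
If the copied text ends up being post-processed in a script, dropping the separators could look something like the Python sketch below. It assumes the separator is exactly ---- on a line of its own, as described above; strip_separators is a hypothetical helper name.

```python
def strip_separators(copied):
    """Drop lines that consist only of ---- (ignoring surrounding whitespace)."""
    kept = [
        line
        for line in copied.splitlines(keepends=True)
        if line.strip() != "----"
    ]
    return "".join(kept)


# Two one-line "matches" with a separator between them.
cleaned = strip_separators("</plist>\n----\n<!DOCTYPE plist>\n")
print(cleaned == "</plist>\n<!DOCTYPE plist>\n")  # True
```

Unlike replacing ^----$ with nothing, this also removes the line break after each separator, so no empty lines are left behind.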

          • zemaria523 @Mark Olson

            @Mark-Olson Thank you very much. It solved my problem!

            The Community of users of the Notepad++ text editor.