copy xml blocks with contains string

  • Z
    zemaria523
    last edited by Mar 28, 2023, 2:16 PM

    Hi, I have an XML file that contains millions of lines, made up of blocks like these:

    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
    	<key>relativePath</key>
    	<string>Library/Preferences/com.apple.Safari.SafeBrowsing.plist</string>
    	<key>birth</key>
    	<string>6955023</string>
    	<key>groupID</key>
    	<string>501</string>
    	<key>mode</key>
    	<string>33152</string>
    	<key>modified</key>
    	<string>4444</string>
    	<key>statusChanged</key>
    	<string>695502523</string>
    	<key>userID</key>
    	<string>501</string>
    	<key>size</key>
    	<string>103</string>
    </dict>
    </plist>","contents":{"signature":"BDuaupIWv889qLBr41lViCemOSlz","owner":"55555555","size":128,"reference_signature":"AYh7zng="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
    	<key>relativePath</key>
    	<string>Library/Preferences/com.apple.SetupAssistant.plist</string>
    	<key>birth</key>
    	<string>6945</string>
    	<key>groupID</key>
    	<string>501</string>
    	<key>mode</key>
    	<string>33152</string>
    	<key>modified</key>
    	<string>695045</string>
    	<key>statusChanged</key>
    	<string>44444</string>
    	<key>userID</key>
    	<string>501</string>
    	<key>size</key>
    	<string>161</string>
    </dict>
    </plist>","contents":{"signature":"fffdssffsfs","owner":"5555555","size":256," encoding="UTF-8"?>
    

    I want to copy every whole block (<!DOCTYPE … </plist>) that contains the word SetupAssistant. Is there a way to do that?

    • M
      Mark Olson @zemaria523
      last edited by Mark Olson Mar 28, 2023, 2:45 PM Mar 28, 2023, 2:44 PM

      @zemaria523
      Kinda ugly, but try this: go to the Mark tab of the find/replace window, set the search mode to Regular expression, search for (?s-i:<!DOCTYPE\s+plist(?:(?!</plist>).)*?SetupAssistant(?:(?!</plist>).)*?</plist>), hit Mark All, and then hit Copy Marked Text.
      From the above example, the text copied will be

      <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
      <plist version="1.0">
      <dict>
      	<key>relativePath</key>
      	<string>Library/Preferences/com.apple.SetupAssistant.plist</string>
      	<key>birth</key>
      	<string>6945</string>
      	<key>groupID</key>
      	<string>501</string>
      	<key>mode</key>
      	<string>33152</string>
      	<key>modified</key>
      	<string>695045</string>
      	<key>statusChanged</key>
      	<string>44444</string>
      	<key>userID</key>
      	<string>501</string>
      	<key>size</key>
      	<string>161</string>
      </dict>
      </plist>
      

      You will get some lines with ---- in between each marked thing. I suppose you could find/replace ^----$ with nothing to get rid of those.

      Explanation:

      1. <!DOCTYPE\s+plist finds the doctype declaration.
      2. (?:(?!</plist>).)*?SetupAssistant matches all text up to and including SetupAssistant, but only if SetupAssistant occurs before the next </plist> close tag.
      3. (?:(?!</plist>).)*?</plist> finds all text up to and including the close tag.
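
For anyone who would rather script this than use the Mark dialog, the three parts explained above can be sketched in Python. This is only an illustration under some assumptions: Python's re module is not the regex engine Notepad++ uses, the function name extract_blocks and the shortened sample text are made up for the example, and re.DOTALL stands in for the (?s) flag.

```python
import re

# Same three parts as the Notepad++ expression:
#   1. the doctype declaration
#   2. anything up to SetupAssistant, never crossing a </plist>
#   3. anything up to and including the closing </plist>
BLOCK_RE = re.compile(
    r"<!DOCTYPE\s+plist"
    r"(?:(?!</plist>).)*?SetupAssistant"
    r"(?:(?!</plist>).)*?</plist>",
    re.DOTALL,  # let "." match newlines, like (?s)
)


def extract_blocks(text):
    """Return every <!DOCTYPE ... </plist> block that mentions SetupAssistant."""
    return [m.group(0) for m in BLOCK_RE.finditer(text)]


# Hypothetical, shortened stand-in for the data in the question.
sample = (
    '<!DOCTYPE plist PUBLIC "x">\n<plist version="1.0">\n<dict>\n'
    "<string>Library/Preferences/com.apple.Safari.SafeBrowsing.plist</string>\n"
    '</dict>\n</plist>","contents":{"size":128}\n'
    '<!DOCTYPE plist PUBLIC "x">\n<plist version="1.0">\n<dict>\n'
    "<string>Library/Preferences/com.apple.SetupAssistant.plist</string>\n"
    '</dict>\n</plist>","contents":{"size":256}\n'
)

blocks = extract_blocks(sample)
```

With the sample above, only the second block should come back, because the first block reaches its </plist> before any SetupAssistant is seen.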
      • A
        Alan Kilborn @Mark Olson
        last edited by Mar 28, 2023, 2:51 PM

        @Mark-Olson said in copy xml blocks with contains string:

        You will get some lines with ---- in between each marked thing

        This happens to delimit the matches when each match is multiline.
        For the situation here, it is pretty clear where the matches separate, but it is not always so clear, so the ---- is put in.
        At least, that’s my interpretation as to the “why” it is there.
        I don’t think I’ve ever seen this explained before.

        As you say, easy enough to remove with a regex replacement operation.
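
If the copied text ends up being post-processed outside Notepad++, the same cleanup can be sketched in Python. This assumes each separator sits on its own line and consists of exactly ----, as described above; the function name strip_separators and the sample text are hypothetical.

```python
def strip_separators(copied):
    """Drop lines that consist solely of '----', keeping all other lines intact."""
    kept = [
        line
        for line in copied.splitlines(keepends=True)
        if line.strip() != "----"  # assumption: a separator is exactly four dashes
    ]
    return "".join(kept)


# Hypothetical copied text with one separator between two matches.
copied_text = "<plist>\n</plist>\n----\n<plist>\n</plist>\n"
cleaned = strip_separators(copied_text)
```

Lines that merely contain dashes alongside other text are left alone, which is the same intent as anchoring the pattern with ^ and $.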

        • Z
          zemaria523 @Mark Olson
          last edited by Mar 28, 2023, 2:54 PM

          @Mark-Olson Thank you very much. It solved my problem!

          The Community of users of the Notepad++ text editor.