copy xml blocks with contains string
-
Hi, I have an XML file that contains millions of lines like these XML blocks:
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>relativePath</key> <string>Library/Preferences/com.apple.Safari.SafeBrowsing.plist</string> <key>birth</key> <string>6955023</string> <key>groupID</key> <string>501</string> <key>mode</key> <string>33152</string> <key>modified</key> <string>4444</string> <key>statusChanged</key> <string>695502523</string> <key>userID</key> <string>501</string> <key>size</key> <string>103</string> </dict> </plist>","contents":{"signature":"BDuaupIWv889qLBr41lViCemOSlz","owner":"55555555","size":128,"reference_signature":"AYh7zng="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>relativePath</key> <string>Library/Preferences/com.apple.SetupAssistant.plist</string> <key>birth</key> <string>6945</string> <key>groupID</key> <string>501</string> <key>mode</key> <string>33152</string> <key>modified</key> <string>695045</string> <key>statusChanged</key> <string>44444</string> <key>userID</key> <string>501</string> <key>size</key> <string>161</string> </dict> </plist>","contents":{"signature":"fffdssffsfs","owner":"5555555","size":256," encoding="UTF-8"?>
I want to copy every whole block <!DOCTYPE … </plist> that contains the word SetupAssistant. Is there a way to do that?
-
@zemaria523
Kinda ugly, but try going to the Mark tab of the find/replace window, finding
(?s-i:<!DOCTYPE\s+plist(?:(?!</plist>).)*?SetupAssistant(?:(?!</plist>).)*?</plist>)
hitting Mark All, and then Copy Marked Text.
From the above example, the text copied will be
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>relativePath</key> <string>Library/Preferences/com.apple.SetupAssistant.plist</string> <key>birth</key> <string>6945</string> <key>groupID</key> <string>501</string> <key>mode</key> <string>33152</string> <key>modified</key> <string>695045</string> <key>statusChanged</key> <string>44444</string> <key>userID</key> <string>501</string> <key>size</key> <string>161</string> </dict> </plist>
You will get some lines with ---- in between each marked thing. I suppose you could find/replace ^----$ with nothing to get rid of those.
Explanation:
<!DOCTYPE\s+plist finds the doctype declaration.
(?:(?!</plist>).)*?SetupAssistant matches the text up to and including SetupAssistant, provided it comes before the next </plist> close tag.
(?:(?!</plist>).)*?</plist> finds all text up to and including the close tag.
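To see why the (?:(?!</plist>).)*? part matters, here is a rough Python sketch rather than anything from Notepad++ itself. It assumes Python's re module behaves like the editor's regex engine for this pattern, uses re.DOTALL in place of the (?s...) flag, and relies on Python matching case-sensitively by default in place of -i. The sample string is a shortened, made-up stand-in for the data in the question, and the names STEP, tempered and naive are just for illustration.

```python
import re

# Shortened stand-in for the poster's data: two plist blocks with
# JSON-like debris between them. Only the second mentions SetupAssistant.
sample = (
    '<!DOCTYPE plist PUBLIC "x"> <plist version="1.0"> <dict> '
    '<string>Library/Preferences/com.apple.Safari.SafeBrowsing.plist</string> '
    '</dict> </plist>","contents":{"size":128} '
    '<!DOCTYPE plist PUBLIC "x"> <plist version="1.0"> <dict> '
    '<string>Library/Preferences/com.apple.SetupAssistant.plist</string> '
    '</dict> </plist>","contents":{"size":256}'
)

# One character, consumed only if "</plist>" does not start at this position.
STEP = r'(?:(?!</plist>).)'

# Same structure as the regex in the post: the match can never run past
# a close tag on its way to SetupAssistant.
tempered = re.compile(
    r'<!DOCTYPE\s+plist' + STEP + r'*?SetupAssistant' + STEP + r'*?</plist>',
    re.DOTALL,
)

# For comparison: a plain lazy ".*?" is free to cross a close tag.
naive = re.compile(r'<!DOCTYPE\s+plist.*?SetupAssistant.*?</plist>', re.DOTALL)

tempered_blocks = tempered.findall(sample)
naive_blocks = naive.findall(sample)

for block in tempered_blocks:
    print(block)
```

With this sample, the tempered pattern returns only the SetupAssistant block, while the naive one starts at the first <!DOCTYPE and swallows the Safari block as well.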
-
@Mark-Olson said in copy xml blocks with contains string:
You will get some lines with ---- in between each marked thing
This happens to delimit the matches when each match is multiline.
For the situation here, it is pretty clear where the matches separate, but it is not always so clear, so the ---- is put in.
At least, that’s my interpretation as to the “why” it is there.
I don’t think I’ve ever seen this explained before.
As you say, easy enough to remove with a regex replacement operation.
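If the copied text ends up being post-processed outside the editor, the same ^----$ cleanup could be sketched in Python along these lines. The copied string below is a made-up placeholder, not real Copy Marked Text output, and SEPARATOR is a name chosen only for this example; the pattern also swallows the line ending so that no empty line is left behind.

```python
import re

# Hypothetical pasted result: two multiline matches with the ---- line between them.
copied = "first match line 1\nfirst match line 2\n----\nsecond match\n"

# (?m) lets ^ and $ anchor at line boundaries. The optional \r and the
# trailing \n? consume the separator's own line ending (LF or CRLF).
SEPARATOR = re.compile(r'(?m)^----\r?$\n?')

cleaned = SEPARATOR.sub('', copied)
print(cleaned)
```

Lines that merely contain ---- alongside other text are left alone, since the pattern only matches a line consisting of exactly four dashes.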
-
@Mark-Olson Thank you very much. It solved my problem!
-