Deleting specific elements and children from an entire xml file
-
I have a problem slightly more complex than a recent help request entitled “Deleting specific string from entire xml file”.
I have a large xml file like “Original.gpx” below, and I wish to use the XML Tools plugin (XSL Transformation) to generate something like “Result.gpx”, also below.
Original.gpx is…
<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<gpx version="1.1" creator="OsmAnd+" xmlns="http://www.topografix.com/GPX/1/1" >
<trk>
<trkseg>
<trkpt lat="45.4652056" lon="-73.6961991">
<ele>27.57</ele>
<time>2017-05-27T21:13:09Z</time>
<hdop>4.0</hdop>
<extensions>
<speed>24.5</speed>
</extensions>
</trkpt>
<trkpt lat="45.4643226" lon="-73.6958728">
<ele>30.57</ele>
<time>2017-05-27T21:13:12Z</time>
<hdop>3.0</hdop>
<extensions>
<speed>25.25</speed>
</extensions>
</trkpt>
</trkseg>
</trk>
</gpx>
Result.gpx is…
<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<gpx version="1.1" creator="OsmAnd+" xmlns="http://www.topografix.com/GPX/1/1" >
<trk>
<trkseg>
<trkpt lat="45.4652056" lon="-73.6961991">
<ele>27.57</ele>
<time>2017-05-27T21:13:09Z</time>
<hdop>4.0</hdop>
</trkpt>
<trkpt lat="45.4643226" lon="-73.6958728">
<ele>30.57</ele>
<time>2017-05-27T21:13:12Z</time>
<hdop>3.0</hdop>
</trkpt>
</trkseg>
</trk>
</gpx>
In summary, I want to remove the <extensions> element, including its children (<speed> or others). I used the XSL file below to transform it, but without any success. Actually, the transformation does not change anything in the file :-(
Transform.xsl file is…
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="extension"/>
</xsl:stylesheet>
Any idea on what is wrong with it?
-
Hello, @daniel-bégin,
You’re wrong, Daniel :-)) It’s not more difficult, indeed!
Just use this search regex
(?s-i)<extensions>.+?</extensions>\R
with the Regular expression option checked and an empty replacement zone, then click on the Replace All button.
Notes:
What does this regex mean? Well:
- At the beginning, the modifier -i forces the search to be case-sensitive ( => search of the word extensions, with that exact case ), and the modifier s means that the special dot . character matches absolutely any character ( standard AND EOL characters )
- The parts <extensions> and </extensions> simply match the literal strings <extensions> and </extensions>
- The regex part .+?, located between them, matches the shortest non-empty range of any characters between the strings <extensions> and </extensions>
- Finally, the \R syntax matches, among other characters, the EOL character(s) ( Windows \r\n, or Unix \n ) located after the string </extensions>
- As the replacement part is empty, everything from the string <extensions> to the string </extensions>, inclusive, together with the line break that follows, is deleted
Cheers,
guy038
P.S. :
Just notice that the slightly different regex
(?s-i)<extensions>.+</extensions>\R, without the question mark, would select the longest range of characters between two strings <extensions> and </extensions>. So, this range would start at the first string <extensions> of your file and end at the last string </extensions> of your file!
-
-
Thanks guy038, clever! I’ll use it :-)
However, in case I have a really more complex manipulation to do :-), is it possible to do the above using an XSL transformation from the XML Tools plugin?
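For the XSL route, here is a minimal sketch of a stylesheet that should do the job. It relies on two observations about the Transform.xsl posted above: its empty template matches extension (singular) while the element is named extensions, and, more importantly, every element of the GPX file lives in the default namespace http://www.topografix.com/GPX/1/1, so an unprefixed name in an XSLT 1.0 match pattern never matches it. The gpx prefix below is an arbitrary name chosen for illustration; only the namespace URI has to agree with the one on the <gpx> root element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:gpx="http://www.topografix.com/GPX/1/1">

  <!-- Identity template: copy every attribute and node unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: drop each GPX <extensions> element with all its
       children. The "gpx" prefix is an assumption made for this sketch;
       it is bound above to the namespace declared in the source file. -->
  <xsl:template match="gpx:extensions"/>

</xsl:stylesheet>
```

The whitespace that surrounded each removed element is still copied, so the result may contain a blank line where <extensions> used to be. Adding <xsl:strip-space elements="*"/> and <xsl:output indent="yes"/> at the top level of the stylesheet is one possible way to tidy that up, although the exact indentation depends on the XSLT processor.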