Move the entire tag to another location in the same XML file

Tư Mã Tần Quảng

I have an XML file in which the <id> tag comes last in each <data> block, like this:

<include>
	<data>
		<name>John</name>
		<age>35</age>
		<type>Teacher</type>
		<id>231001</id>
	</data>
	<data>
		<name>Vivi</name>
		<age>18</age>
		<type>Student</type>
		<id>231002</id>
	</data>
	...etc...
</include>

Now I want to move the whole <id> tag up above <name>, like this:

<include>
	<data>
		<id>231001</id>
		<name>John</name>
		<age>35</age>
		<type>Teacher</type>
	</data>
	<data>
		<id>231002</id>
		<name>Vivi</name>
		<age>18</age>
		<type>Student</type>
	</data>
	...etc...
</include>

Is there any method, plugin, or regex that can help me do that?
I would be very grateful for your answers.

Alan Kilborn

@Tư-Mã-Tần-Quảng

Let’s go with regex, maybe give this a try:

Open the Replace dialog by pressing Ctrl+h and then set up the following search parameters:

Find what box: (?s-i)^(\h+<name>.*?</type>\R)(\h+<id>.*?</id>\R)
Replace with box: \2\1
Search mode radiobutton: Regular expression
Wrap around checkbox: ticked
Match case checkbox: doesn’t matter (the -i in the (?s-i) leading off the Find what box already forces a case-sensitive search)
. matches newline checkbox: doesn’t matter (the s in the (?s-i) leading off the Find what box already makes . match line-break characters)

Then press the Replace All button.
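
For anyone who would rather script this than use the Replace dialog, here is a minimal sketch of the same swap in Python. It is a port, not what Notepad++ runs: Python's re module has no \h or \R, so [ \t] and (?:\r\n|\n|\r) stand in for them, and the names EOL, PATTERN, SAMPLE and move_id_first are made up for this illustration.

```python
import re

# Stand-in for \R (assumption of this sketch: only \r\n, \n and \r line ends).
EOL = r"(?:\r\n|\n|\r)"

# Port of: (?s-i)^(\h+<name>.*?</type>\R)(\h+<id>.*?</id>\R)
# re.DOTALL plays the role of (?s); Python is case-sensitive by default (-i).
PATTERN = re.compile(
    r"^([ \t]+<name>.*?</type>" + EOL + r")([ \t]+<id>.*?</id>" + EOL + r")",
    re.DOTALL | re.MULTILINE,
)


def move_id_first(xml_text: str) -> str:
    """Put the <id> line in front of the <name>...</type> lines of each block."""
    return PATTERN.sub(r"\2\1", xml_text)


SAMPLE = (
    "<include>\n"
    "\t<data>\n"
    "\t\t<name>John</name>\n"
    "\t\t<age>35</age>\n"
    "\t\t<type>Teacher</type>\n"
    "\t\t<id>231001</id>\n"
    "\t</data>\n"
    "\t<data>\n"
    "\t\t<name>Vivi</name>\n"
    "\t\t<age>18</age>\n"
    "\t\t<type>Student</type>\n"
    "\t\t<id>231002</id>\n"
    "\t</data>\n"
    "</include>\n"
)

if __name__ == "__main__":
    print(move_id_first(SAMPLE))
```

Like the dialog version, this assumes one tag per line and that <id> directly follows </type>.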

guy038

Hi, @tư-mã-tần-quảng, @alan-kilborn and All,

Since we know that the text contains only one tag per line, always preceded by blank characters, an almost symmetrical solution could be:

SEARCH (?s-i)(\h+<name>.+?)(\h+<id>.+?\R)

REPLACE \2\1

Where :

<id> is the tag to be moved
<name> is the tag which must follow the <id> tag

For instance, assuming this data list ( and I do !! )

<include>
	<data>
		<name>THEVENOT</name>
		<forename>Guy</forename>
		<age>26</age>
		<town>Streatham Hill</town>
		<country>England</country>
		<occupation>Student</occupation>
		<school>South London College</school>
		<address>108 Norfolk House Rd</address>
		<id>732104</id>
		<course>English</course>
		<year>1977-1978</year>
	</data>
</include>

And let’s suppose that we want to move the <address> tag right before the <town> tag.

Then, we have to build the following regex :

SEARCH (?s-i)(\h+<town>.+?)(\h+<address>.+?\R)

REPLACE \2\1

And we get the expected text, below:

<include>
	<data>
		<name>THEVENOT</name>
		<forename>Guy</forename>
		<age>26</age>
		<address>108 Norfolk House Rd</address>
		<town>Streatham Hill</town>
		<country>England</country>
		<occupation>Student</occupation>
		<school>South London College</school>
		<id>732104</id>
		<course>English</course>
		<year>1977-1978</year>
	</data>
</include>
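
The same pattern can be built for any pair of tags. The sketch below is a Python port of the idea, not the Notepad++ regex itself: [ \t] and (?:\r\n|\n|\r) replace \h and \R, which Python's re module lacks, and the names move_tag_before, tag_to_move, anchor_tag and RECORD are hypothetical helpers for this illustration.

```python
import re

# Stand-in for \R (assumption: only \r\n, \n and \r line ends occur).
EOL = r"(?:\r\n|\n|\r)"


def move_tag_before(text: str, tag_to_move: str, anchor_tag: str) -> str:
    """Move the one-line <tag_to_move> element right before <anchor_tag>.

    Only meant for the layout discussed here: one tag per line, each
    preceded by blanks, with tag_to_move located after anchor_tag.
    """
    pattern = re.compile(
        r"([ \t]+<" + re.escape(anchor_tag) + r">.+?)"
        r"([ \t]+<" + re.escape(tag_to_move) + r">.+?" + EOL + r")",
        re.DOTALL,  # same role as (?s): the dot also matches line breaks
    )
    return pattern.sub(r"\2\1", text)


RECORD = (
    "<include>\n"
    "\t<data>\n"
    "\t\t<name>THEVENOT</name>\n"
    "\t\t<forename>Guy</forename>\n"
    "\t\t<age>26</age>\n"
    "\t\t<town>Streatham Hill</town>\n"
    "\t\t<country>England</country>\n"
    "\t\t<occupation>Student</occupation>\n"
    "\t\t<school>South London College</school>\n"
    "\t\t<address>108 Norfolk House Rd</address>\n"
    "\t\t<id>732104</id>\n"
    "\t\t<course>English</course>\n"
    "\t\t<year>1977-1978</year>\n"
    "\t</data>\n"
    "</include>\n"
)

if __name__ == "__main__":
    print(move_tag_before(RECORD, "address", "town"))
```

One caveat that applies to the pattern in general: because the dot may cross line breaks, a <data> block that lacks the tag to move lets the lazy match run on into the next block, so it is worth checking the result on a copy of the file first.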

Note that all this information is true, even the <id> number of my old student card, ha ha ;-))

Cheers,

guy038

Tư Mã Tần Quảng

@Alan-Kilborn

Oh my god, it worked! I am very grateful to you, thank you very very much. ❤️

Tư Mã Tần Quảng

@guy038 Your way also worked, I am really grateful to you and @Alan-Kilborn! ❤️

guy038

Hello, @tư-mã-tần-quảng and All,

I forgot to explain my regex search/replace:

SEARCH (?s-i)(\h+<name>.+?)(\h+<id>.+?\R)

REPLACE \2\1

So :

First, the part (?s-i) has already been explained by @alan-kilborn!
Then, the part \h+<name> matches one or more horizontal blank characters, followed by the string <name>, with this exact case
Next, the part .+?, thanks to the s modifier, matches the shortest non-empty range of any characters, EOL ones included, up to the next run of blank characters followed by the string <id>, with this exact case
The two parts above are stored as group 1 because of the surrounding parentheses. Note that group 1 necessarily ends with EOL characters, since the <id> line comes right after it
Then, the part \h+<id> matches one or more horizontal blank characters, followed by the string <id>, with this exact case
Finally, the part .+?\R matches the shortest non-empty range of any characters up to the next EOL characters, which ends the <id> line
Again, the two parts above are stored as group 2 because of the surrounding parentheses
Note that the \R syntax stands for any kind of line break ( \r\n for Windows files, \n for Unix files and \r for old Mac files )
In the replacement, \2\1 rewrites, in reverse order, the single-line group 2 ( the <id> line ) and the multi-line group 1
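
To see what each group actually captures, here is a small Python sketch. It is only an approximation of the Notepad++ regex: [ \t] and (?:\r\n|\n|\r) stand in for \h and \R, which Python's re module does not support, and the names EOL, PATTERN and BLOCK are invented for this illustration.

```python
import re

# Stand-in for \R (assumption: only \r\n, \n and \r line ends occur).
EOL = r"(?:\r\n|\n|\r)"

# Port of: (?s-i)(\h+<name>.+?)(\h+<id>.+?\R)
PATTERN = re.compile(r"([ \t]+<name>.+?)([ \t]+<id>.+?" + EOL + r")", re.DOTALL)

BLOCK = (
    "\t<data>\n"
    "\t\t<name>John</name>\n"
    "\t\t<age>35</age>\n"
    "\t\t<type>Teacher</type>\n"
    "\t\t<id>231001</id>\n"
    "\t</data>\n"
)

match = PATTERN.search(BLOCK)
assert match is not None

group_1 = match.group(1)  # multi-line: the <name>, <age> and <type> lines
group_2 = match.group(2)  # single line: the <id> line with its line break

if __name__ == "__main__":
    print(repr(group_1))
    print(repr(group_2))
    print(repr(match.expand(r"\2\1")))  # the text that replaces the match
```

Printing with repr makes the tabs and line breaks visible, which shows that group 1 ends with a line break and that group 2 is exactly one line.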

Best Regards,

guy038