Insert / Move XML Rows
-
I searched quite a bit for threads on two things but could not find anything.
I am trying to do two things:
- Insert element and attribute
- Move existing element and attribute
I apologize if I have not identified the parts of the XML file that need to be modified.
Here is what I have now:
<App action='A' id='1'>
<BaseVehicle id='3702'/>
<SubModel id='430'/>
<EngineBase id='396'/>
<PartType id='5764'/>
<Part>17-0012</Part>
<Note>Turbo Wheel and Shaft Included<Note/>
</App>
First I need to insert a new element with data “<Qty>1</Qty>” above the <PartType element.
The second thing I need to do is move the entire <Note></Note> element with the data to a new position. This element and data should be positioned above quantity. The end result should look like:
<App action="A" id="1">
<BaseVehicle id="3702" />
<SubModel id="430" />
<EngineBase id="396" />
<Note>Turbo Wheel and Shaft Included</Note>
<Qty>1</Qty>
<PartType id="5764" />
<Part>17-0012</Part>
</App>
Any guidance on how I can do this on my own using Notepad++ would be great.
Thanks,
Johnny
-
Hello Johnny,
as already written in another post, regex isn’t my pet passion, but with
the provided data it might be solved like so.
First, add Qty above PartType:
Find what: (<PartType.*\/>)
Replace with: <Qty>1<\/Qty>\r\n\1
Second, move Note above Qty:
Find what: (<Qty>1<\/Qty>)\r\n(.*)\r\n(.*)\r\n(<Note>.*\/>)
Replace with: \4\r\n\1\r\n\2\r\n\3
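For anyone who wants to check these two replacements outside Notepad++, here is a minimal sketch using Python's `re` module. Python is an assumption here (the thread itself only uses Notepad++), and the helper name `claudia_two_step` is made up for illustration. The sample uses straight quotes and CRLF line endings, and the escaped `\/` is written as a plain `/`, since Python would leave `\/` untouched in a replacement template.

```python
import re

# Sample block from the first post, assumed to use Windows (CRLF) line endings.
SAMPLE = "\r\n".join([
    "<App action='A' id='1'>",
    "<BaseVehicle id='3702'/>",
    "<SubModel id='430'/>",
    "<EngineBase id='396'/>",
    "<PartType id='5764'/>",
    "<Part>17-0012</Part>",
    "<Note>Turbo Wheel and Shaft Included<Note/>",
    "</App>",
])


def claudia_two_step(text: str) -> str:
    """Hypothetical helper: run both search/replace steps in sequence."""
    # Step 1: insert the Qty line above every PartType line.
    text = re.sub(r"(<PartType.*/>)", r"<Qty>1</Qty>\r\n\1", text)
    # Step 2: move a Note that sits exactly three lines below Qty above it.
    # The trailing "/>" matches the "<Note/>" closing as it appears in the data.
    text = re.sub(
        r"(<Qty>1</Qty>)\r\n(.*)\r\n(.*)\r\n(<Note>.*/>)",
        r"\4\r\n\1\r\n\2\r\n\3",
        text,
    )
    return text


if __name__ == "__main__":
    print(claudia_two_step(SAMPLE))
```

The second pattern only matches when exactly two lines separate the Qty line from the Note line, which is why it is tied to this particular layout.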
The second regex assumes that the Note element is always
three nodes after Qty element. So if this isn’t always the case, then
you need to modify it.
Cheers
Claudia
-
Hello Juan and Claudia,
I think that I found a S/R which can process the two operations, at the same time.
In addition, this S/R doesn’t care about the number of lines, even zero, which may occur between the <PartType…/> line and the <Note>…</Note> line !
But, of course, in all cases, the <Note>…</Note> line must be located AFTER the <PartType…/> line !!
So :
-
Open the Replace dialog ( CTRL + H )
-
SEARCH
^(<PartType.*\R)(?s)(.*?)(?-s)(<Note>.*\R)
-
REPLACE
\3<Qty>1</Qty>\r\n\1\2
-
Check the Wrap around option, if necessary
-
Check the Regular expression search mode ( Important )
-
Click on the Replace All button
Et voilà !
Notes :
-
The first part ^(<PartType.*\R) represents the complete line <PartType…/>, with its EOL characters, and it’s referenced as group 1, as it is surrounded with round brackets
-
The final part (?-s)(<Note>.*\R) represents the complete line <Note>…</Note>, with its EOL, and it’s referenced as group 3, as it is surrounded with round brackets
-
The middle part (?s)(.*?) represents the smallest range of standard or EOL characters, in one or several complete lines, between the two lines described above. It’s referenced as group 2, as it is, too, surrounded with round brackets
-
The modifier (?s) means that the dot can match either a standard or an end of line character
-
Therefore the modifier (?-s), at the end of the regex, means that the dot represents, as usual, any standard character, only
And, in replacement, we have to re-write :
-
Firstly, the group 3 => \3 ( the <Note>…</Note> line )
-
Secondly, the line <Qty>1</Qty> with its EOL => <Qty>1</Qty>\r\n
-
Thirdly, the group 1 => \1 ( the <PartType…/> line )
-
Fourthly, the group 2 => \2 ( All the initial lines between the <PartType> one and the <Note> one )
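The group reordering described above can be tried outside Notepad++ with a rough Python equivalent. This is only a sketch under a few assumptions: Python's `re` module does not support `\R`, so the sample uses plain `\n` line endings, and the mid-pattern `(?s)` / `(?-s)` toggles are replaced by a scoped `(?s:...)` group. The name `insert_and_move` is made up for illustration.

```python
import re

# Sample block with plain LF line endings (an assumption for this sketch).
SAMPLE = "\n".join([
    "<App action='A' id='1'>",
    "<BaseVehicle id='3702'/>",
    "<SubModel id='430'/>",
    "<EngineBase id='396'/>",
    "<PartType id='5764'/>",
    "<Part>17-0012</Part>",
    "<Note>Turbo Wheel and Shaft Included</Note>",
    "</App>",
]) + "\n"

# Group 1: the PartType line. Group 2: the shortest run of text (possibly
# empty, newlines included) up to the Note line. Group 3: the Note line.
PATTERN = re.compile(
    r"^(<PartType.*\n)((?s:.*?))(<Note>.*\n)",
    re.MULTILINE,
)


def insert_and_move(text: str) -> str:
    """Hypothetical helper: emit Note, then a new Qty line, then groups 1 and 2."""
    return PATTERN.sub(r"\3<Qty>1</Qty>\n\1\2", text)


if __name__ == "__main__":
    print(insert_and_move(SAMPLE), end="")
```

Because group 2 is lazy and may be empty, the same pattern also covers the case where the Note line directly follows the PartType line.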
Best Regards,
guy038
P.S. :
Juan, you’ll find good documentation about the new Boost C++ Regex library ( similar to the Perl Compatible Regular Expressions ), used by Notepad++ since the 6.0 version, at the TWO addresses below :
http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html
http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html
-
The FIRST link explains the syntax, of regular expressions, in the SEARCH part
-
The SECOND link explains the syntax, of regular expressions, in the REPLACEMENT part
-
-
Cheers @guy038 and @Claudia Frank for your replies. I will attempt both of these over the next few days.
I will post my results.
Juan,
-
Claudia,
Thanks so much for the great advice. The <Qty> insert worked great! Unfortunately, I could not get the <Note> move to work.
Here is the situation after the <Qty> insertion:
<App action='A' id='1378'>
<BaseVehicle id='128520'/>
<SubModel id='3043'/>
<EngineBase id='2133'/>
<Qty>1</Qty>
<PartType id='6708'/>
<Part>G600096</Part>
</App>
<App action='A' id='1379'>
<BaseVehicle id='128520'/>
<SubModel id='296'/>
<EngineBase id='2133'/>
<Qty>1</Qty>
<PartType id='6708'/>
<Part>G600096</Part>
<Note>W/O TOC ATTACHED<Note/>
</App>
As you will notice not all instances of <App></App> have <Note>. However when present it is always in the same position.
-
@guy038 ,
Thanks for the advice. It was my fault for not explaining better. The <Note> element is not always present. However, when present, it is always in the same position.
Here is a better example for the single Find/Replace Regex function with what I am working with.
<App action='A' id='1378'>
<BaseVehicle id='128520'/>
<SubModel id='3043'/>
<EngineBase id='2133'/>
<PartType id='6708'/>
<Part>G600096</Part>
</App>
<App action='A' id='1379'>
<BaseVehicle id='128520'/>
<SubModel id='296'/>
<EngineBase id='2133'/>
<PartType id='6708'/>
<Part>G600096</Part>
<Note>W/O TOC ATTACHED<Note/>
</App>
Cheers,
-
Hi, Juan,
Well, I, finally, found the right regexes ! I say regexes ( and not regex ) because it’s simpler to split your problem into two consecutive actions, as you described in your first post :
-
Firstly, add the line <Qty>1</Qty>, before each <PartType…/> line
-
Secondly, if a <Note>…</Note> line is found, in a block <App … </App>, move it, upwards, before the <Qty>1</Qty> line
As I saw Claudia’s new post, I then updated this post. Indeed, she’s quite right about a possible indentation ! Therefore, I will rely on the code alignment proposed by Claudia for the regexes below !
If you don’t have any indentation, just remove the four spaces in the replacement part of the FIRST S/R, below
For the first S/R, I suggest that very simple regex :
SEARCH
^(?=\h*<PartType )
REPLACE
    <Qty>1</Qty>\r\n
( <Qty> is preceded by four spaces )
Notes :
-
We just search the zero length string ^ ( beginning of line ) if it’s followed with the string <PartType, preceded by possible horizontal blank characters and followed with a space, due to the look-ahead form (?=\h*<PartType )
-
The escape sequence \h represents any horizontal blank character ( the space (\x20), the tabulation (\x09), or the No-Break space (\xa0) )
-
Then, we, simply, replace this null string with the complete line <Qty>1</Qty>\r\n, preceded by 4 spaces
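The zero-length match can be illustrated with a small Python sketch. It rests on a few assumptions: Python's `re` module has no `\h`, so `[ \t]` stands in for it (space and tab only), the sample uses `\n` line endings and four-space indentation, and the name `add_qty` is made up for illustration.

```python
import re

# One indented block, assumed to use LF line endings and four-space indents.
SAMPLE = (
    "<App action='A' id='1378'>\n"
    "    <BaseVehicle id='128520'/>\n"
    "    <SubModel id='3043'/>\n"
    "    <EngineBase id='2133'/>\n"
    "    <PartType id='6708'/>\n"
    "    <Part>G600096</Part>\n"
    "</App>\n"
)


def add_qty(text: str) -> str:
    """Hypothetical helper: insert a Qty line before each PartType line."""
    # The pattern matches the empty string at the start of a line, but only
    # when optional blanks and "<PartType " follow (look-ahead, nothing consumed).
    return re.sub(
        r"^(?=[ \t]*<PartType )",
        "    <Qty>1</Qty>\n",
        text,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    print(add_qty(SAMPLE), end="")
```

Since the look-ahead consumes nothing, the PartType line itself is left exactly as it was, including its indentation.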
For the second S/R, I had to consider any block of text between the <Qty>1</Qty> line and the <Note>…</Note> line. However, there was a problem. Let’s suppose the current block <App doesn’t contain a <Note> line and that the next block contains such a <Note> line :
-
It would wrongly select all the text between the <Qty>1</Qty> line of the current block and the <Note>…</Note> line of the next block :-((
-
I would expect all the text between the <Qty>1</Qty> line of the NEXT block and the <Note>…</Note> line of the next block, too :-))
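The unwanted behaviour described above can be reproduced with a short Python sketch. It assumes a naive pattern that lets the range between the Qty line and the Note line run over anything, plain `\n` line endings, and no indentation; the names `NAIVE` and `naive_move` are made up for illustration.

```python
import re

# Two blocks after the Qty insertion: only the second one has a Note line.
TWO_BLOCKS = "\n".join([
    "<App action='A' id='1378'>",
    "<BaseVehicle id='128520'/>",
    "<SubModel id='3043'/>",
    "<EngineBase id='2133'/>",
    "<Qty>1</Qty>",
    "<PartType id='6708'/>",
    "<Part>G600096</Part>",
    "</App>",
    "<App action='A' id='1379'>",
    "<BaseVehicle id='128520'/>",
    "<SubModel id='296'/>",
    "<EngineBase id='2133'/>",
    "<Qty>1</Qty>",
    "<PartType id='6708'/>",
    "<Part>G600096</Part>",
    "<Note>W/O TOC ATTACHED<Note/>",
    "</App>",
]) + "\n"

# Naive range: nothing stops group 2 from running past the first </App>.
NAIVE = re.compile(r"^(<Qty>.+\n)((?s:.+?))^(<Note>.+\n)", re.MULTILINE)


def naive_move(text: str) -> str:
    """Hypothetical helper showing the cross-block mistake."""
    return NAIVE.sub(r"\3\1\2", text)


if __name__ == "__main__":
    # The Note of block 1379 ends up inside block 1378.
    print(naive_move(TWO_BLOCKS), end="")
```

The match starts at the Qty line of the first block, so the Note line of the second block is pulled into the first one.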
So, when the regex engine looks for any range of text, throughout several lines, (?s).+ , it must verify that, at ANY location of this range, the string </App> ( meaning the end of the current block ) does NOT occur ! If this string occurs, the match attempt fails and the regex engine goes forwards, to the next <App block, in order to find a possible range <Qty>1</Qty>…<Note>…</Note>
Therefore, we get the more complicated regex, below :
SEARCH
(?-s)^(\h*<Qty>.+\R)(?s)((?:(?!</App>).)+)(?-s)^(\h*<Note>.+\R)
REPLACE
\3\1\2
Notes :
-
The first part, (?-s)^(\h*<Qty>.+\R), looks for the complete <Qty>1</Qty> line, memorized as group 1
-
The final part, (?-s)^(\h*<Note>.+\R), looks for the complete <Note>…</Note> line, memorized as group 3
-
The middle part searches for a range of text, on several lines, between these two lines, and belonging to the same block. That is to say, where the string </App> can’t be found inside ! => the syntax (?s)((?:(?!</App>).)+)
-
The outer parentheses define the group 2
-
Inside, there is a non-capturing group (?:(?!</App>).) that represents a single character of that range ( the dot ), which can never be the < character starting an end of block </App> !
-
The form (?!</App>) is called a negative look-ahead and means that, at the cursor’s location, the string </App> must not occur !
-
Finally, in replacement, we just re-copy the different groups ( lines ), in a different order !
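As a way to see the negative look-ahead at work, here is a hedged Python sketch of the second S/R. Several things are assumptions: `[ \t]` replaces `\h` and `\n` replaces `\R` (Python's `re` supports neither), the `(?s)` toggle becomes a scoped `(?s:...)` group, the sample is indented with four spaces, and the name `move_note` is made up for illustration.

```python
import re

# Two indented blocks after the Qty insertion: only the second has a Note.
TWO_BLOCKS = "\n".join([
    "<App action='A' id='1378'>",
    "    <BaseVehicle id='128520'/>",
    "    <SubModel id='3043'/>",
    "    <EngineBase id='2133'/>",
    "    <Qty>1</Qty>",
    "    <PartType id='6708'/>",
    "    <Part>G600096</Part>",
    "</App>",
    "<App action='A' id='1379'>",
    "    <BaseVehicle id='128520'/>",
    "    <SubModel id='296'/>",
    "    <EngineBase id='2133'/>",
    "    <Qty>1</Qty>",
    "    <PartType id='6708'/>",
    "    <Part>G600096</Part>",
    "    <Note>W/O TOC ATTACHED<Note/>",
    "</App>",
]) + "\n"

# Group 2 advances one character at a time, and each step first checks
# that "</App>" does not start at the current position.
PATTERN = re.compile(
    r"^([ \t]*<Qty>.+\n)((?s:(?:(?!</App>).)+))^([ \t]*<Note>.+\n)",
    re.MULTILINE,
)


def move_note(text: str) -> str:
    """Hypothetical helper: move Note above Qty, within one block only."""
    return PATTERN.sub(r"\3\1\2", text)


if __name__ == "__main__":
    print(move_note(TWO_BLOCKS), end="")
```

With this guard, the attempt that starts at the Qty line of block 1378 fails at its own </App>, so only block 1379 is rewritten.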
To sum up, just run the two S/R successively, clicking exclusively on the Replace All button !
Cheers,
guy038
-
-
Hello Juan,
could it be that the xml is laid out more like
<App action='A' id='1378'>
    <BaseVehicle id='128520'/>
    <SubModel id='3043'/>
    <EngineBase id='2133'/>
    <Qty>1</Qty>
    <PartType id='6708'/>
    <Part>G600096</Part>
</App>
<App action='A' id='1379'>
    <BaseVehicle id='128520'/>
    <SubModel id='296'/>
    <EngineBase id='2133'/>
    <Qty>1</Qty>
    <PartType id='6708'/>
    <Part>G600096</Part>
    <Note>W/O TOC ATTACHED<Note/>
</App>
if so, I didn’t take care that the Note node may start with spaces in front.
The find regex needs to be changed like
(.*Qty>1<\/Qty>)\R(.*)\R(.*)\R(.*Note>.*\/>)
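A quick way to check this indentation-tolerant variant is the Python sketch below. It assumes `\n` line endings (Python's `re` has no `\R`), four-space indentation, a replacement that mirrors the earlier `\4 \1 \2 \3` order, and a made-up helper name `move_note_indented`.

```python
import re

# Two indented blocks: the first without a Note line, the second with one.
TWO_BLOCKS = "\n".join([
    "<App action='A' id='1378'>",
    "    <BaseVehicle id='128520'/>",
    "    <SubModel id='3043'/>",
    "    <EngineBase id='2133'/>",
    "    <Qty>1</Qty>",
    "    <PartType id='6708'/>",
    "    <Part>G600096</Part>",
    "</App>",
    "<App action='A' id='1379'>",
    "    <BaseVehicle id='128520'/>",
    "    <SubModel id='296'/>",
    "    <EngineBase id='2133'/>",
    "    <Qty>1</Qty>",
    "    <PartType id='6708'/>",
    "    <Part>G600096</Part>",
    "    <Note>W/O TOC ATTACHED<Note/>",
    "</App>",
]) + "\n"


def move_note_indented(text: str) -> str:
    """Hypothetical helper: the leading .* absorbs any indentation."""
    return re.sub(
        r"(.*Qty>1</Qty>)\n(.*)\n(.*)\n(.*Note>.*/>)",
        r"\4\n\1\n\2\n\3",
        text,
    )


if __name__ == "__main__":
    print(move_note_indented(TWO_BLOCKS), end="")
```

The dot does not cross line breaks here, so a block whose third line after Qty is </App> rather than a Note line is simply left alone.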
Cheers
Claudia
P.S. I tried to modify guy038’s regex but didn’t get it done :-(
but I learned to replace \r\n with \R :-)