Regen: To replace XML elements
-
@Wan-Lung-Ho said in Regen: To replace XML elements:
Hello, how can I assign a date to “DateStart” and “DateEnd” when “DateStart” and “DateEnd” are blank, and “OrgHeader” is not Blank?
It should be simple with a regex (stands for REGular EXpression, it’s not regen). However, in spite of you saying the DateStart and DateEnd are blank, your example shows them with values already.
So do you wish to replace all of these codes with a made-up value, regardless of whether there is a value already, or ONLY insert values where there isn’t one? And is the value something you assign, or is it a calculated value which will change? If calculated, then regex isn’t going to work as it cannot calculate values.
Terry
-
@Terry-R said in Regen: To replace XML elements:
It should be simple with a regex (stands for REGular EXpression, it’s not regen). However, in spite of you saying the DateStart and DateEnd are blank, your example shows them with values already.
So do you wish to replace all of these codes with a made-up value, regardless of whether there is a value already, or ONLY insert values where there isn’t one? And is the value something you assign, or is it a calculated value which will change? If calculated, then regex isn’t going to work as it cannot calculate values.
Terry
Hi Terry, thanks for your prompt reply.
In my example, there are 3 parts, and I only want to update Part 3, where DateStart and DateEnd have no date and OrgHeader contains a Code.
The value is not calculated.
DateStart = 2000-01-01T00:00:00
DateEnd = 2000-12-31T00:00:00

# Part 1
<CusClassPartPivot Action="MERGE">
  <DateStart />
  <DateEnd />
  <OrgHeader />
</CusClassPartPivot>
# Part 2
<CusClassPartPivot Action="MERGE">
  <DateStart>2021-01-01T00:00:00</DateStart>
  <DateEnd>2021-01-31T00:00:00</DateEnd>
  <OrgHeader>
    <Code>YAZAKISZO1</Code>
  </OrgHeader>
</CusClassPartPivot>
# Part 3
<CusClassPartPivot Action="MERGE">
  <DateStart />
  <DateEnd />
  <OrgHeader>
    <Code>YAZPARSHO5</Code>
  </OrgHeader>
</CusClassPartPivot>
-
@Wan-Lung-Ho said in Regen: To replace XML elements:
In my example, there are 3 parts, and I only want to update Part 3, where DateStart and DateEnd have no date and OrgHeader contains a Code.
I see what was confusing. In part #3 you have codes like <DateStart />; note there is a space between the word and the /. When you complete the insert, this code changes. That is significant and wasn’t mentioned in the first post, hence my thought was that you were changing an existing value.
So to confirm then: wherever the code has the “space” in it and a following code such as YAZPARSHO5, you want to insert the date/time codes AND also change the code slightly, correct?
So the <DateStart /> line will be changed to <DateStart>2000-01-01T00:00:00</DateStart>, and similarly the next line as well.
Terry
-
@Wan-Lung-Ho said in Regen: To replace XML elements:
In my example, there are 3 parts, and I only want to update Part 3, where DateStart and DateEnd have no date and OrgHeader contains a Code.
So my initial solution would be, using the Replace function:
Find What: (?-si)^(\x20*)(<DateStart />)(\R\x20*)(<DateEnd />)(\R\x20*<OrgHeader>\R\x20*<Code>[^<]+</Code>)
Replace With: \1<DateStart>2000-01-01T00:00:00</DateStart>\3<DateEnd>2000-12-31T00:00:00</DateEnd>\5
As this is a regular expression, the search mode must be “regular expression”.
If unsure on what it is doing, have a file opened in Notepad++, enter the above details, then click on the “Find” button which will show the first instance to replace. Then click on “Replace” and it will replace the values (or insert if none present). At this point the Find will advance to the next occurrence in the file. To check the change you will need to move the view back to the previous replacement.
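For readers who want to try the expression outside Notepad++, here is a minimal sketch in Python. It is an approximation, not the Notepad++ behaviour itself: Python’s re module has no \R, so \r?\n stands in for it, the leading (?-si) is dropped because Python is case sensitive and keeps the dot off line breaks by default, and the sample text is a shortened, assumed version of Part 1 and Part 3.

```python
import re

# Assumption: \r?\n replaces \R, which Python's re module does not support.
FIND = re.compile(
    r"^(\x20*)(<DateStart />)(\r?\n\x20*)(<DateEnd />)"
    r"(\r?\n\x20*<OrgHeader>\r?\n\x20*<Code>[^<]+</Code>)",
    re.MULTILINE,  # lets ^ match at the start of every line
)

# Python spells group references \g<1>; the Notepad++ dialog uses \1.
REPLACE = (
    r"\g<1><DateStart>2000-01-01T00:00:00</DateStart>"
    r"\g<3><DateEnd>2000-12-31T00:00:00</DateEnd>\g<5>"
)

# Shortened sample: the first section has an empty OrgHeader, the second has a Code.
SAMPLE = """\
<CusClassPartPivot Action="MERGE">
  <DateStart />
  <DateEnd />
  <OrgHeader />
</CusClassPartPivot>
<CusClassPartPivot Action="MERGE">
  <DateStart />
  <DateEnd />
  <OrgHeader>
    <Code>YAZPARSHO5</Code>
  </OrgHeader>
</CusClassPartPivot>
"""

# subn returns the new text and the number of replacements made.
result, count = FIND.subn(REPLACE, SAMPLE)
print(count)
print(result)
```

Only the section whose OrgHeader is immediately followed by a Code line should change, because the pattern requires the three lines to be adjacent.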
Terry
PS I need to point out that it will find DateStart BUT NOT datestart or any other variation. If this is a problem you can change the first part from (?-si) to (?-s). It’s the i after the minus sign which makes it case sensitive. To force a case-insensitive search you can use (?i-s) instead.
-
Hi Terry, thanks for your help. Your solution works on my example. However, my bad, I found that the DateStart and DateEnd elements in my real case are different from what I provided in the previous post. Besides, if I have more elements in between DateEnd and OrgHeader, how should I change your solution to make it work? Thanks!
<CusClassPartPivot Action="MERGE">
  <LastAuditedDate></LastAuditedDate>
  <LastAuditedUser></LastAuditedUser>
  <TariffNum>8544492000</TariffNum>
  <TariffChangePending>false</TariffChangePending>
  <SupplementalTariff></SupplementalTariff>
  <DateStart></DateStart>
  <DateEnd></DateEnd>
  <ConcessionOrder></ConcessionOrder>
  <PrimaryPreference></PrimaryPreference>
  <SecondaryPreference></SecondaryPreference>
  <RelatedIndicator></RelatedIndicator>
  <ValuationCode></ValuationCode>
  <ValuationMarkup>0.000</ValuationMarkup>
  <UsageComment></UsageComment>
  <NAddInfo></NAddInfo>
  <NDescription></NDescription>
  <Description></Description>
  <PartPivotUOM></PartPivotUOM>
  <CusUSClassificationCollection>
    <CusUSClassification Action="MERGE">
      <PK>3f2549ad-9bba-4054-b2bf-3e82b294a6ab</PK>
    </CusUSClassification>
  </CusUSClassificationCollection>
  <OrgHeader>
    <Code>GRUYAZCUU5</Code>
    <OrgCusCodeCollection>
      <OrgCusCode>
        <CustomsRegNo>MXGRUYAZ101ROS</CustomsRegNo>
        <CodeType>MID</CodeType>
        <CountryDefault>false</CountryDefault>
        <CodeCountry TableName="RefCountry">
          <Code>US</Code>
        </CodeCountry>
      </OrgCusCode>
    </OrgCusCodeCollection>
  </OrgHeader>
  <CusClassification />
  <Country TableName="RefCountry">
    <Code>US</Code>
  </Country>
  <CountryOfOrigin TableName="RefCountry" />
  <CountryOfExport TableName="RefCountry" />
</CusClassPartPivot>
-
@Wan-Lung-Ho said in Regen: To replace XML elements:
However, my bad, I found that the DateStart and DateEnd elements in my real case are different than what I provided in the previous post. Besides, if I have more elements in between DateEnd and OrgHeader, how to change your solution in order to make it work?
Yes your bad! I don’t intend on firing up my PC this evening, work has finished. If you are lucky maybe another member might provide a new solution. You should apply the change to your latest example and show that as well so someone has something to work with.
Be absolutely certain that what your example now shows is correct, otherwise you will quickly lose members’ offers to help.
Terry
-
Hello, @wan-lung-ho, @terry-r and All,
If we assume that the tags:
- <DateStart> + 0 or more blank char(s) + </DateStart>
- <DateStart + 0 or more blank char(s) + />
- <DateEnd> + 0 or more blank char(s) + </DateEnd>
- <DateEnd + 0 or more blank char(s) + />
can be considered as empty, eligible tags,
and that the tag:
<OrgHeader> + 0 or more blank char(s) + 1 or more non-blank chars + 0 or more blank char(s) + </OrgHeader>
is a non-empty tag needed, somewhere after the date tags, to allow a replacement,
then a possible solution would be:
SEARCH
(?s-i)(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)(?=.+?<OrgHeader>\s*\S+\s*</OrgHeader>)
REPLACE
?1<DateStart>2000-01-01T00\:00\:00</DateStart>:<DateEnd>2000-12-31T00\:00\:00</DateEnd>
Just give it a try !
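The REPLACE field above relies on a conditional replacement: ?1 means “if group 1 took part in the match, use the text before the unescaped colon, otherwise the text after it”, which is why the colons inside the date values are escaped. Python’s re module has no such syntax, so the sketch below emulates only that mechanism with a callback. The look-ahead is deliberately left out, and the function name and sample string are made up for illustration.

```python
import re

# Group 1 takes part in the match only when the empty tag is a DateStart.
EMPTY_DATE = re.compile(
    r"(<DateStart>\s*</DateStart>|<DateStart\s*/>)"
    r"|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>"
)


def pick_replacement(match: "re.Match[str]") -> str:
    """Emulate the conditional replacement ?1A:B with a callback."""
    if match.group(1) is not None:
        return "<DateStart>2000-01-01T00:00:00</DateStart>"
    return "<DateEnd>2000-12-31T00:00:00</DateEnd>"


# Hypothetical input covering all four empty forms.
text = "<DateStart />\n<DateEnd></DateEnd>\n<DateStart>  </DateStart>\n<DateEnd/>"
converted = EMPTY_DATE.sub(pick_replacement, text)
print(converted)
```

One pattern with one optional group is enough to handle both tags in a single pass, which is the design choice the ?1 syntax makes possible.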
Best Regards,
guy038
-
Thanks Terry and @guy038 for your help and comment!
I tried to modify the solution. It finds #Part 3, but it also finds #Part 1, which I do not want. Could you please advise what the problem is with the expression? Thanks!
(?s-i)(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)(?=.+?OrgHeader>\R\x20*<Code>[^<]+</Code>)
<Product version="2.0">
  <OrgSupplierPart Action="MERGE">
    <CusClassPartPivotCollection>
      #Part 1
      <CusClassPartPivot Action="MERGE">
        <DateStart />
        <DateEnd />
        <OrgHeader />
      </CusClassPartPivot>
      #Part 2
      <CusClassPartPivot Action="MERGE">
        <DateStart>2021-01-01T00:00:00</DateStart>
        <DateEnd>2021-01-31T00:00:00</DateEnd>
        <OrgHeader>
          <Code>YAZAKISZO1</Code>
        </OrgHeader>
      </CusClassPartPivot>
      #Part 3
      <CusClassPartPivot Action="MERGE">
        <LastAuditedDate></LastAuditedDate>
        <LastAuditedUser></LastAuditedUser>
        <TariffNum>8544492000</TariffNum>
        <TariffChangePending>false</TariffChangePending>
        <SupplementalTariff></SupplementalTariff>
        <DateStart></DateStart>
        <DateEnd></DateEnd>
        <ConcessionOrder></ConcessionOrder>
        <PrimaryPreference></PrimaryPreference>
        <SecondaryPreference></SecondaryPreference>
        <RelatedIndicator></RelatedIndicator>
        <ValuationCode></ValuationCode>
        <ValuationMarkup>0.000</ValuationMarkup>
        <UsageComment></UsageComment>
        <NAddInfo></NAddInfo>
        <NDescription></NDescription>
        <Description></Description>
        <PartPivotUOM></PartPivotUOM>
        <CusUSClassificationCollection>
          <CusUSClassification Action="MERGE">
            <PK>3f2549ad-9bba-4054-b2bf-3e82b294a6ab</PK>
          </CusUSClassification>
        </CusUSClassificationCollection>
        <OrgHeader>
          <Code>GRUYAZCUU5</Code>
          <OrgCusCodeCollection>
            <OrgCusCode>
              <CustomsRegNo>MXGRUYAZ101ROS</CustomsRegNo>
              <CodeType>MID</CodeType>
              <CountryDefault>false</CountryDefault>
              <CodeCountry TableName="RefCountry">
                <Code>US</Code>
              </CodeCountry>
            </OrgCusCode>
          </OrgCusCodeCollection>
        </OrgHeader>
        <CusClassification />
        <Country TableName="RefCountry">
          <Code>US</Code>
        </Country>
        <CountryOfOrigin TableName="RefCountry" />
        <CountryOfExport TableName="RefCountry" />
      </CusClassPartPivot>
    </CusClassPartPivotCollection>
  </OrgSupplierPart>
</Product>
-
Hi, @wan-lung-ho, @terry-r and All,
I’m wondering how my previous regex S/R could have matched something in part #3 of your last example, because the final part of my regex was erroneous!?
Indeed, to identify a non-empty <OrgHeader> tag, I used the regex <OrgHeader>\s*\S+\s*</OrgHeader>
I was wrong and I should have used the regex <OrgHeader>.*?\S+.*?</OrgHeader>
Now, in order to restrict the overall search to a single section:
<CusClassPartPivot Action="MERGE">
.............
.............
.............
.............
</CusClassPartPivot>
we need to add a condition on the characters found after an empty date tag and before the opening non-empty tag <OrgHeader>, which is: it must NOT cross an ending tag </CusClassPartPivot>
So the regex equivalent is the following S/R, which should work as expected:
SEARCH
(?s-i)(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)(?=(?:(?!</CusClassPartPivot>).)+?<OrgHeader>.*?\S+.*?</OrgHeader>)
REPLACE
?1<DateStart>2000-01-01T00\:00\:00</DateStart>:<DateEnd>2000-12-31T00\:00\:00</DateEnd>
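To see why the extra condition matters, the following Python sketch compares a look-ahead that may run across sections with the restricted one from the SEARCH above. It rests on assumptions: the sample is a shortened version of the three parts, the “loose” pattern is a hybrid written only for the comparison, and re.DOTALL stands in for (?s) because Python does not accept a global (?s-i) (it is case sensitive by default).

```python
import re

# The four empty-tag forms; group 1 marks the DateStart variants.
DATE_TAGS = (
    r"(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)"
    r"|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)"
)

# Loose look-ahead (for comparison only): may run past </CusClassPartPivot>.
LOOSE = re.compile(
    DATE_TAGS + r"(?=.+?<OrgHeader>.*?\S+.*?</OrgHeader>)", re.DOTALL
)

# Restricted look-ahead: each consumed character must not start the closing tag.
STRICT = re.compile(
    DATE_TAGS
    + r"(?=(?:(?!</CusClassPartPivot>).)+?<OrgHeader>.*?\S+.*?</OrgHeader>)",
    re.DOTALL,
)

# Shortened, assumed sample of Part 1, Part 2 and Part 3.
SAMPLE = """\
<CusClassPartPivot Action="MERGE">
  <DateStart />
  <DateEnd />
  <OrgHeader />
</CusClassPartPivot>
<CusClassPartPivot Action="MERGE">
  <DateStart>2021-01-01T00:00:00</DateStart>
  <DateEnd>2021-01-31T00:00:00</DateEnd>
  <OrgHeader>
    <Code>YAZAKISZO1</Code>
  </OrgHeader>
</CusClassPartPivot>
<CusClassPartPivot Action="MERGE">
  <TariffNum>8544492000</TariffNum>
  <DateStart></DateStart>
  <DateEnd></DateEnd>
  <ConcessionOrder></ConcessionOrder>
  <OrgHeader>
    <Code>GRUYAZCUU5</Code>
  </OrgHeader>
</CusClassPartPivot>
"""


def count_matches(pattern: "re.Pattern[str]", text: str) -> int:
    """Count how many empty date tags the pattern selects."""
    return sum(1 for _ in pattern.finditer(text))


def fill_dates(pattern: "re.Pattern[str]", text: str) -> str:
    """Replace each selected tag; group 1 decides which date is written."""
    return pattern.sub(
        lambda m: "<DateStart>2000-01-01T00:00:00</DateStart>"
        if m.group(1) is not None
        else "<DateEnd>2000-12-31T00:00:00</DateEnd>",
        text,
    )


loose_count = count_matches(LOOSE, SAMPLE)
strict_count = count_matches(STRICT, SAMPLE)
updated = fill_dates(STRICT, SAMPLE)
print(loose_count, strict_count)
print(updated)
```

With the loose pattern, the empty tags of Part 1 are also selected, because the look-ahead reaches the OrgHeader of Part 2. The restricted pattern stops at the end of each section, so only Part 3 changes.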
If we use the free-spacing mode (?x), the search regex can also be expressed as:
(?xs-i)                                   #  FREE-SPACING mode | DOT matches ANY char | Search SENSITIVE to CASE
(?:                                       #  Start 1st NON-CAPTURING group                          |
  (                                       #    Start Group 1                                        |
      <DateStart> \s* </DateStart>        #      EMPTY DateStart tag                                |
    |                                     #        OR                                               |
      <DateStart \s* />                   #      EMPTY DateStart tag                                |  WHAT we search...
  )                                       #    End Group 1                                          |
    |                                     #        OR                                               |
      <DateEnd> \s* </DateEnd>            #      EMPTY DateEnd tag                                  |
    |                                     #        OR                                               |
      <DateEnd \s* />                     #      EMPTY DateEnd tag                                  |
)                                         #  End 1st NON-CAPTURING group                            |
(?=                                       #  Start of a POSITIVE LOOK-AHEAD                         |
  (?:                                     #    Start 2nd NON-CAPTURING group                        |
    (?!                                   #      Start of a NEGATIVE LOOK-AHEAD                     |
      </CusClassPartPivot>                #        ENDING tag </CusClassPartPivot>                  |  ...With the CONDITIONS
    )                                     #      End of a NEGATIVE LOOK-AHEAD                       |
    .                                     #      ANY character, including LINE-BREAK chars          |
  )+?                                     #    End 2nd NON-CAPTURING group, REPEATED 1 or MORE times |
  <OrgHeader> .*? \S+ .*? </OrgHeader>    #    A NON-EMPTY MULTI-LINES tag <OrgHeader>.......</OrgHeader> |
)                                         #  End of the POSITIVE LOOK-AHEAD                         |
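Python offers a comparable free-spacing mode through re.VERBOSE. The sketch below checks, on a small assumed sample, that a commented layout selects exactly the same spans as the one-line form. The flags re.DOTALL and re.VERBOSE replace the leading (?xs-i), and the comments are shortened paraphrases rather than the original annotations.

```python
import re

# One-line form of the search, with re.DOTALL standing in for (?s).
COMPACT = re.compile(
    r"(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)"
    r"|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)"
    r"(?=(?:(?!</CusClassPartPivot>).)+?<OrgHeader>.*?\S+.*?</OrgHeader>)",
    re.DOTALL,
)

# Same expression in free-spacing layout: unescaped whitespace is ignored
# and everything after # on a line is a comment.
VERBOSE = re.compile(
    r"""
    (?:
        ( <DateStart> \s* </DateStart>      # empty DateStart, paired form
        | <DateStart \s* />                 # empty DateStart, self-closing form
        )
      | <DateEnd> \s* </DateEnd>            # empty DateEnd, paired form
      | <DateEnd \s* />                     # empty DateEnd, self-closing form
    )
    (?=
        (?: (?! </CusClassPartPivot> ) . )+?    # stay inside the current section
        <OrgHeader> .*? \S+ .*? </OrgHeader>    # a non-empty OrgHeader follows
    )
    """,
    re.DOTALL | re.VERBOSE,
)

# Assumed sample: first section has an empty OrgHeader, second a filled one.
SAMPLE = (
    '<CusClassPartPivot Action="MERGE">\n'
    "  <DateStart />\n  <DateEnd />\n  <OrgHeader />\n"
    "</CusClassPartPivot>\n"
    '<CusClassPartPivot Action="MERGE">\n'
    "  <DateStart></DateStart>\n  <DateEnd />\n"
    "  <OrgHeader>\n    <Code>GRUYAZCUU5</Code>\n  </OrgHeader>\n"
    "</CusClassPartPivot>\n"
)

compact_spans = [m.span() for m in COMPACT.finditer(SAMPLE)]
verbose_spans = [m.span() for m in VERBOSE.finditer(SAMPLE)]
print(compact_spans == verbose_spans, len(verbose_spans))
```

Because whitespace is ignored in this mode, a literal space in the pattern would have to be written as \x20 or escaped, which is worth remembering when converting an existing expression.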
- Do a normal selection, from (?xs-i) till # End of the POSITIVE LOOK-AHEAD
- Open the Find dialog (Ctrl + F)
  => The whole multi-line block is automatically pasted in the Find what: zone
- Just click on the Find Next button to move to each match
- Now, if you open the Replace dialog (Ctrl + H):
  - REPLACE ?1<DateStart>2000-01-01T00\:00\:00</DateStart>:<DateEnd>2000-12-31T00\:00\:00</DateEnd>
  - Tick the Wrap around option
  - Click on the Replace All button, ONLY
Voila!
Best Regards,
guy038
-
Hi @guy038, I really appreciate your help. It not only resolved my problem but also taught me something through your detailed explanation.