    Regen: To replace XML elements

    • Wan Lung Ho

      @Terry-R said in Regen: To replace XML elements:

      It should be simple with a regex (stands for REGular EXpression, it’s not regen). However in spite of you saying the DateStart and DateEnd are blank your example shows them with values already.
      So do you wish to replace all of these codes with a made-up value, regardless of whether there is a value already, or ONLY insert values where there isn’t one? And is the value something you assign, or is it a calculated value which will change? If calculated, then regex isn’t going to work, as it cannot calculate values.
      Terry

      Hi Terry, thanks for your prompt reply.

      In my example, there are 3 parts, and I only want to update Part 3, where DateStart and DateEnd have no date and OrgHeader contains a Code.

      The value is not calculated.
      DateStart = 2000-01-01T00:00:00
      DateEnd = 2000-12-31T00:00:00

      # Part 1
         <CusClassPartPivot Action="MERGE">
          <DateStart />
          <DateEnd />
          <OrgHeader />
         </CusClassPartPivot>
      
      # Part 2
         <CusClassPartPivot Action="MERGE">
          <DateStart>2021-01-01T00:00:00</DateStart>
          <DateEnd>2021-01-31T00:00:00</DateEnd>
          <OrgHeader>
           <Code>YAZAKISZO1</Code>
          </OrgHeader>
         </CusClassPartPivot>
      
      # Part 3
         <CusClassPartPivot Action="MERGE">
          <DateStart />
          <DateEnd />
          <OrgHeader>
           <Code>YAZPARSHO5</Code>
          </OrgHeader>
         </CusClassPartPivot>
      
      
      • Terry R

        @Wan-Lung-Ho said in Regen: To replace XML elements:

        In my example, there are 3 parts, and I only want to update Part 3, where DateStart and DateEnd have no date and OrgHeader contains a Code.

        I see what was confusing. In part #3 you have codes like <DateStart />; note there is a space between the word and the /. When you complete the insert, this code changes. That is significant and wasn’t mentioned in the first post, hence my thought that you were changing an existing value.

        So to confirm then: wherever the code has the “space” in it and is followed by a code such as YAZPARSHO5, you want to insert the date/time values AND also change the code slightly, correct?

        So <DateStart /> line will be changed to <DateStart>2000-01-01T00:00:00</DateStart>, similarly the next line as well.

        Terry

        • Terry R @Wan Lung Ho

          @Wan-Lung-Ho said in Regen: To replace XML elements:

          In my example, there are 3 parts, and I only want to update Part 3, where DateStart and DateEnd have no date and OrgHeader contains a Code.

          So my initial solution would be, using the Replace function:

          Find What: (?-si)^(\x20*)(<DateStart />)(\R\x20*)(<DateEnd />)(\R\x20*<OrgHeader>\R\x20*<Code>[^<]+</Code>)

          Replace With: \1<DateStart>2000-01-01T00:00:00</DateStart>\3<DateEnd>2000-12-31T00:00:00</DateEnd>\5

          As this is a regular expression, the search mode must be “regular expression”.
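For anyone who wants to check the pattern outside Notepad++, here is a minimal Python sketch of the same search and replace. It is a port, not the Notepad++ behaviour itself: Python's `re` module has no `\R`, so the sketch assumes `\r?\n` as a stand-in, and it drops the leading `(?-si)` because Python is case-sensitive and keeps `.` off line breaks by default. The names `PATTERN`, `REPLACEMENT` and `SAMPLE` are only illustrative.

```python
import re

# Port of the Find What pattern. Assumptions: \R becomes \r?\n, and
# re.MULTILINE is needed so that ^ matches at the start of every line.
PATTERN = re.compile(
    r"^(\x20*)(<DateStart />)(\r?\n\x20*)(<DateEnd />)"
    r"(\r?\n\x20*<OrgHeader>\r?\n\x20*<Code>[^<]+</Code>)",
    re.MULTILINE,
)

# Port of the Replace With string: groups 1, 3 and 5 are kept,
# groups 2 and 4 (the empty tags) are replaced by filled tags.
REPLACEMENT = (
    r"\1<DateStart>2000-01-01T00:00:00</DateStart>"
    r"\3<DateEnd>2000-12-31T00:00:00</DateEnd>\5"
)

# The three parts from the first example.
SAMPLE = """\
   <CusClassPartPivot Action="MERGE">
    <DateStart />
    <DateEnd />
    <OrgHeader />
   </CusClassPartPivot>
   <CusClassPartPivot Action="MERGE">
    <DateStart>2021-01-01T00:00:00</DateStart>
    <DateEnd>2021-01-31T00:00:00</DateEnd>
    <OrgHeader>
     <Code>YAZAKISZO1</Code>
    </OrgHeader>
   </CusClassPartPivot>
   <CusClassPartPivot Action="MERGE">
    <DateStart />
    <DateEnd />
    <OrgHeader>
     <Code>YAZPARSHO5</Code>
    </OrgHeader>
   </CusClassPartPivot>
"""

result, count = PATTERN.subn(REPLACEMENT, SAMPLE)
print(count)   # number of sections that were changed
print(result)
```

Only Part 3 is changed here: Part 1 has `<OrgHeader />` rather than `<OrgHeader>` followed by a `<Code>` line, and Part 2 has no empty date tags.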

          If you are unsure what it is doing, have a file opened in Notepad++, enter the above details, then click on the “Find” button, which will show the first instance to replace. Then click on “Replace” and it will replace the values (or insert them if none are present). At this point the Find will advance to the next occurrence in the file. To check the change you will need to move the view back to the previous replacement.

          Terry

          PS I need to point out that it will find DateStart BUT NOT datestart or any other variation. If this is a problem you can change the first part from (?-si) to (?-s) and leave “Match case” unticked, or to (?i-s) to force a case-insensitive search. It’s the -i which makes it case sensitive.
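The effect of the case modifier can be illustrated with a short Python sketch. This is only an analogy: Python's `re` is case-sensitive by default (comparable to `(?-i)`), and the inline `(?i)` switches that off. The variable names are illustrative.

```python
import re

text = "<datestart />"

# Case-sensitive search: the lowercase tag is not found.
sensitive = re.search(r"<DateStart />", text)

# Case-insensitive search via the inline (?i) modifier: the tag is found.
insensitive = re.search(r"(?i)<DateStart />", text)

print(sensitive is None)
print(insensitive is not None)
```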

          • Wan Lung Ho

            Hi Terry, thanks for your help; your solution works on my example. However, my bad, I found that the DateStart and DateEnd elements in my real case differ from what I provided in the previous post. Besides, if I have more elements in between DateEnd and OrgHeader, how should your solution be changed to make it work? Thanks!

                      <CusClassPartPivot Action="MERGE">
                        <LastAuditedDate></LastAuditedDate>
                        <LastAuditedUser></LastAuditedUser>
                        <TariffNum>8544492000</TariffNum>
                        <TariffChangePending>false</TariffChangePending>
                        <SupplementalTariff></SupplementalTariff>
                        <DateStart></DateStart>
                        <DateEnd></DateEnd>
                        <ConcessionOrder></ConcessionOrder>
                        <PrimaryPreference></PrimaryPreference>
                        <SecondaryPreference></SecondaryPreference>
                        <RelatedIndicator></RelatedIndicator>
                        <ValuationCode></ValuationCode>
                        <ValuationMarkup>0.000</ValuationMarkup>
                        <UsageComment></UsageComment>
                        <NAddInfo></NAddInfo>
                        <NDescription></NDescription>
                        <Description></Description>
                        <PartPivotUOM></PartPivotUOM>
                        <CusUSClassificationCollection>
                          <CusUSClassification Action="MERGE">
                            <PK>3f2549ad-9bba-4054-b2bf-3e82b294a6ab</PK>
                          </CusUSClassification>
                        </CusUSClassificationCollection>
                        <OrgHeader>
                          <Code>GRUYAZCUU5</Code>
                          <OrgCusCodeCollection>
                            <OrgCusCode>
                              <CustomsRegNo>MXGRUYAZ101ROS</CustomsRegNo>
                              <CodeType>MID</CodeType>
                              <CountryDefault>false</CountryDefault>
                              <CodeCountry TableName="RefCountry">
                                <Code>US</Code>
                              </CodeCountry>
                            </OrgCusCode>
                          </OrgCusCodeCollection>
                        </OrgHeader>
                        <CusClassification />
                        <Country TableName="RefCountry">
                          <Code>US</Code>
                        </Country>
                        <CountryOfOrigin TableName="RefCountry" />
                        <CountryOfExport TableName="RefCountry" />
                      </CusClassPartPivot>
            
            • Wan Lung Ho

              @Terry-R

              • Terry R @Wan Lung Ho

                @Wan-Lung-Ho said in Regen: To replace XML elements:

                However, my bad, I found that the DateStart and DateEnd elements in my real case differ from what I provided in the previous post. Besides, if I have more elements in between DateEnd and OrgHeader, how should your solution be changed to make it work?

                Yes, your bad! I don’t intend on firing up my PC this evening; work has finished. If you are lucky, maybe another member might provide a new solution. You should apply the change to your latest example and show that as well, so someone has something to work with.

                Be absolutely certain that what your example now shows is correct, otherwise you will quickly lose members’ offers to help.

                Terry

                • guy038

                  Hello, @wan-lung-ho, @terry-r and All,

                  If we assume that the tags:

                  • <DateStart> + 0 or more blank char(s) + </DateStart>

                  • <DateStart + 0 or more blank char(s) + />

                  • <DateEnd> + 0 or more blank char(s) + </DateEnd>

                  • <DateEnd + 0 or more blank char(s) + />

                  can be considered as empty eligible tags

                  And that the tag :

                  • <OrgHeader> + 0 or more blank char(s) + 1 or more non-blank chars + 0 or more blank char(s) + </OrgHeader>

                  is a non-empty tag needed, somewhere after the date tags, to allow a replacement

                  A possible solution would be :

                  SEARCH (?s-i)(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)(?=.+?<OrgHeader>\s*\S+\s*</OrgHeader>)

                  REPLACE ?1<DateStart>2000-01-01T00\:00\:00</DateStart>:<DateEnd>2000-12-31T00\:00\:00</DateEnd>

                  Just give it a try !

                  Best Regards,

                  guy038

                  • Wan Lung Ho

                    Thanks Terry and @guy038 for your help and comment!

                    I tried to modify the solution. It finds the tags in #Part 3, but it also finds those in #Part 1, which I do not want. Could you please advise what the problem is with the statement? Thanks!

                    (?s-i)(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)(?=.+?OrgHeader>\R\x20*<Code>[^<]+</Code>)

                    <Product version="2.0">
                     <OrgSupplierPart Action="MERGE">
                      <CusClassPartPivotCollection>
                    
                    #Part 1
                       <CusClassPartPivot Action="MERGE">
                        <DateStart />
                        <DateEnd />
                        <OrgHeader />
                       </CusClassPartPivot>
                    
                    #Part 2
                       <CusClassPartPivot Action="MERGE">
                        <DateStart>2021-01-01T00:00:00</DateStart>
                        <DateEnd>2021-01-31T00:00:00</DateEnd>
                        <OrgHeader>
                         <Code>YAZAKISZO1</Code>
                        </OrgHeader>
                       </CusClassPartPivot>
                    
                    #Part 3
                                <CusClassPartPivot Action="MERGE">
                                <LastAuditedDate></LastAuditedDate>
                                <LastAuditedUser></LastAuditedUser>
                                <TariffNum>8544492000</TariffNum>
                                <TariffChangePending>false</TariffChangePending>
                                <SupplementalTariff></SupplementalTariff>
                                <DateStart></DateStart>
                                <DateEnd></DateEnd>
                                <ConcessionOrder></ConcessionOrder>
                                <PrimaryPreference></PrimaryPreference>
                                <SecondaryPreference></SecondaryPreference>
                                <RelatedIndicator></RelatedIndicator>
                                <ValuationCode></ValuationCode>
                                <ValuationMarkup>0.000</ValuationMarkup>
                                <UsageComment></UsageComment>
                                <NAddInfo></NAddInfo>
                                <NDescription></NDescription>
                                <Description></Description>
                                <PartPivotUOM></PartPivotUOM>
                                <CusUSClassificationCollection>
                                  <CusUSClassification Action="MERGE">
                                    <PK>3f2549ad-9bba-4054-b2bf-3e82b294a6ab</PK>
                                  </CusUSClassification>
                                </CusUSClassificationCollection>
                                <OrgHeader>
                                  <Code>GRUYAZCUU5</Code>
                                  <OrgCusCodeCollection>
                                    <OrgCusCode>
                                      <CustomsRegNo>MXGRUYAZ101ROS</CustomsRegNo>
                                      <CodeType>MID</CodeType>
                                      <CountryDefault>false</CountryDefault>
                                      <CodeCountry TableName="RefCountry">
                                        <Code>US</Code>
                                      </CodeCountry>
                                    </OrgCusCode>
                                  </OrgCusCodeCollection>
                                </OrgHeader>
                                <CusClassification />
                                <Country TableName="RefCountry">
                                  <Code>US</Code>
                                </Country>
                                <CountryOfOrigin TableName="RefCountry" />
                                <CountryOfExport TableName="RefCountry" />
                              </CusClassPartPivot>
                    
                    
                    
                      </CusClassPartPivotCollection>
                     </OrgSupplierPart>
                    </Product>
                    
                    
                    • guy038

                      Hi, @wan-lung-ho, @terry-r and All,

                      I’m wondering how my previous regex S/R could have matched something in part #3 of your last example because the final part of my regex was erroneous !?

                      Indeed, to identify a non-empty <OrgHeader> tag, I used the regex <OrgHeader>\s*\S+\s*</OrgHeader>. I was wrong: I should have used the regex <OrgHeader>.*?\S+.*?</OrgHeader>
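The difference between the two sub-patterns can be seen in a small Python sketch. It assumes `re.DOTALL` as the equivalent of the `(?s)` modifier; `OLD`, `NEW` and the two sample strings are illustrative names. The point is that `\S+` cannot span whitespace, so the first form only accepts an `<OrgHeader>` whose content is a single run without blanks or line breaks.

```python
import re

# First attempt: exactly one whitespace-free run between the tags.
OLD = re.compile(r"<OrgHeader>\s*\S+\s*</OrgHeader>")

# Corrected form: anything, as long as at least one non-blank char occurs.
NEW = re.compile(r"<OrgHeader>.*?\S+.*?</OrgHeader>", re.DOTALL)

single_child = "<OrgHeader>\n  <Code>YAZPARSHO5</Code>\n</OrgHeader>"
nested = (
    "<OrgHeader>\n"
    "  <Code>GRUYAZCUU5</Code>\n"
    "  <OrgCusCodeCollection>\n"
    "    <OrgCusCode />\n"
    "  </OrgCusCodeCollection>\n"
    "</OrgHeader>"
)

old_hits = [bool(OLD.search(s)) for s in (single_child, nested)]
new_hits = [bool(NEW.search(s)) for s in (single_child, nested)]
print(old_hits)  # [True, False]
print(new_hits)  # [True, True]
```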


                      Now, in order to restrict the overall search to a single section :

                                <CusClassPartPivot Action="MERGE">
                                   .............
                                   .............
                                   .............
                                   .............
                                </CusClassPartPivot>
                      

                      we need to add a condition onto the characters found after an empty date tag and before the opening non-empty tag <OrgHeader>, which is :

                      It must NOT cross an ending tag </CusClassPartPivot>
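Why this condition matters can be sketched in Python, again assuming `re.DOTALL` in place of `(?s)`. The names `LOOSE`, `STRICT` and `SAMPLE` are illustrative. With a plain `.+?` the look-ahead is free to run past `</CusClassPartPivot>` and find an `<OrgHeader>` in a later section; the form `(?:(?!</CusClassPartPivot>).)+?` consumes a character only where that closing tag does not start.

```python
import re

EMPTY_DATE = (
    r"(?:<DateStart>\s*</DateStart>|<DateStart\s*/>"
    r"|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)"
)
NON_EMPTY_ORG = r"<OrgHeader>.*?\S+.*?</OrgHeader>"

# Unrestricted look-ahead: may cross into the next section.
LOOSE = re.compile(EMPTY_DATE + r"(?=.+?" + NON_EMPTY_ORG + r")", re.DOTALL)

# Restricted look-ahead: stops at the end of the current section.
STRICT = re.compile(
    EMPTY_DATE + r"(?=(?:(?!</CusClassPartPivot>).)+?" + NON_EMPTY_ORG + r")",
    re.DOTALL,
)

SAMPLE = """\
<CusClassPartPivot Action="MERGE">
 <DateStart />
 <DateEnd />
 <OrgHeader />
</CusClassPartPivot>
<CusClassPartPivot Action="MERGE">
 <DateStart>2021-01-01T00:00:00</DateStart>
 <DateEnd>2021-01-31T00:00:00</DateEnd>
 <OrgHeader>
  <Code>YAZAKISZO1</Code>
 </OrgHeader>
</CusClassPartPivot>
<CusClassPartPivot Action="MERGE">
 <DateStart></DateStart>
 <DateEnd></DateEnd>
 <OrgHeader>
  <Code>GRUYAZCUU5</Code>
 </OrgHeader>
</CusClassPartPivot>
"""

loose_count = len(LOOSE.findall(SAMPLE))    # also hits the first section
strict_count = len(STRICT.findall(SAMPLE))  # only hits the third section
print(loose_count, strict_count)  # 4 2
```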


                      So the regex equivalent is the following S/R, which should work as expected :

                      SEARCH (?s-i)(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)(?=(?:(?!</CusClassPartPivot>).)+?<OrgHeader>.*?\S+.*?</OrgHeader>)

                      REPLACE ?1<DateStart>2000-01-01T00\:00\:00</DateStart>:<DateEnd>2000-12-31T00\:00\:00</DateEnd>
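The conditional replacement `?1…:…` means: if group 1 took part in the match, use the first part, otherwise the second. The colons inside the times are escaped as `\:` so that they are not read as that separator. Python's `re` has no such replacement syntax, so the sketch below assumes a replacement function as a stand-in; `SEARCH`, `fill_date` and `SAMPLE` are illustrative names, and `re.DOTALL` replaces `(?s)`.

```python
import re

SEARCH = re.compile(
    r"(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)"
    r"|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)"
    r"(?=(?:(?!</CusClassPartPivot>).)+?<OrgHeader>.*?\S+.*?</OrgHeader>)",
    re.DOTALL,
)

def fill_date(match):
    # Group 1 is set only when one of the DateStart alternatives matched,
    # which plays the role of the "?1 ... : ..." test in the REPLACE field.
    if match.group(1) is not None:
        return "<DateStart>2000-01-01T00:00:00</DateStart>"
    return "<DateEnd>2000-12-31T00:00:00</DateEnd>"

SAMPLE = """\
<CusClassPartPivot Action="MERGE">
 <DateStart />
 <DateEnd />
 <OrgHeader />
</CusClassPartPivot>
<CusClassPartPivot Action="MERGE">
 <DateStart></DateStart>
 <DateEnd></DateEnd>
 <ConcessionOrder></ConcessionOrder>
 <OrgHeader>
  <Code>GRUYAZCUU5</Code>
 </OrgHeader>
</CusClassPartPivot>
"""

result = SEARCH.sub(fill_date, SAMPLE)
print(result)
```

The first section keeps its empty tags because its `<OrgHeader />` is empty; the second section gets both dates filled in.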


                      If we use the free-spacing mode (?x), the search regex can also be expressed as :

                      (?xs-i)  #  FREE-SPACING mode | DOT matches ANY char | Search SENSITIVE to CASE
                      
                      (?:                                         #  Start 1st NON-CAPTURING group  |
                          (                                       #      Start Group 1              |
                              <DateStart>  \s*  </DateStart>      #          EMPTY DateStart tag    |
                            |                                     #        OR                       |
                              <DateStart   \s*  />                #          EMPTY DateStart tag    |  WHAT we search...
                          )                                       #      End Group 1                |  
                        |                                         #    OR                           |  
                          <DateEnd>  \s*  </DateEnd>              #      EMPTY DateEnd tag          |  
                        |                                         #    OR                           |  
                          <DateEnd  \s*  />                       #      EMPTY DateEnd tag          |  
                      )                                           #  End 1st NON-CAPTURING group    |  
                      
                      (?=                                         #  Start of a POSITIVE LOOK-AHEAD                                |
                        (?:                                       #    Start 2nd NON-CAPTURING group                               |
                          (?!                                     #      Start of a NEGATIVE LOOK-AHEAD                            |
                            </CusClassPartPivot>                  #        ENDING tag </CusClassPartPivot>                         |  ...With the CONDITIONS
                          )                                       #      End of a NEGATIVE LOOK-AHEAD                              |
                          .                                       #      ANY character, including LINE-BREAK chars                 |
                        )+?                                       #    End 2nd NON-CAPTURING group, REPEATED 1 or MORE times       |
                        <OrgHeader>  .*?  \S+  .*?  </OrgHeader>  #    A NON-EMPTY MULTI-LINES tag <OrgHeader>.......</OrgHeader>  |
                      )                                           #  End of the POSITIVE LOOK-AHEAD                                |
                      

                      • Do a normal selection, from (?xs-i) till # End of the POSITIVE LOOK-AHEAD

                      • Open the Find dialog ( Ctrl + F )

                      => All this multi-lines block is automatically pasted in the Find what: zone

                      • Just click on the Find Next button to move to each match

                      • Now, if you open the Replace dialog ( Ctrl + H )

                        • REPLACE ?1<DateStart>2000-01-01T00\:00\:00</DateStart>:<DateEnd>2000-12-31T00\:00\:00</DateEnd>

                        • Tick the Wrap around option

                        • Click on the Replace All button, ONLY

                      Voila !
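The free-spacing idea also exists in Python as `re.VERBOSE`. The sketch below is a way to cross-check, outside Notepad++, that a commented layout matches the same text as the one-line regex. It assumes the flags `re.VERBOSE | re.DOTALL` in place of `(?xs-i)`, since Python does not accept a global `-i`; `COMPACT`, `VERBOSE` and `SAMPLE` are illustrative names.

```python
import re

COMPACT = re.compile(
    r"(?:(<DateStart>\s*</DateStart>|<DateStart\s*/>)"
    r"|<DateEnd>\s*</DateEnd>|<DateEnd\s*/>)"
    r"(?=(?:(?!</CusClassPartPivot>).)+?<OrgHeader>.*?\S+.*?</OrgHeader>)",
    re.DOTALL,
)

VERBOSE = re.compile(
    r"""
    (?:
        (                                     # group 1: empty DateStart tag
            <DateStart> \s* </DateStart>
          | <DateStart  \s* />
        )
      | <DateEnd> \s* </DateEnd>              # or an empty DateEnd tag
      | <DateEnd  \s* />
    )
    (?=                                       # look-ahead, same section only
        (?: (?! </CusClassPartPivot> ) . )+?
        <OrgHeader> .*? \S+ .*? </OrgHeader>  # a non-empty OrgHeader
    )
    """,
    re.VERBOSE | re.DOTALL,
)

SAMPLE = """\
<CusClassPartPivot Action="MERGE">
 <DateStart />
 <DateEnd />
 <OrgHeader />
</CusClassPartPivot>
<CusClassPartPivot Action="MERGE">
 <DateStart />
 <DateEnd></DateEnd>
 <OrgHeader>
  <Code>YAZPARSHO5</Code>
 </OrgHeader>
</CusClassPartPivot>
"""

compact_spans = [m.span() for m in COMPACT.finditer(SAMPLE)]
verbose_spans = [m.span() for m in VERBOSE.finditer(SAMPLE)]
print(compact_spans == verbose_spans)
```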

                      Best Regards,

                      guy038

                      • Wan Lung Ho

                        Hi @guy038, I really appreciate your help. It not only resolved my problem, but your detailed explanation also taught me something.
