    extract XMl with regex

    31 Posts 5 Posters 3.2k Views
    • ?
      A Former User
      last edited by

      Hi,

      ns:name
      ns:locationasfsafs</ns:location>
      ns:locationeventxxxx</ns:locationevent>
      ns:Action
      ns:nameabc</ns:name>
      </ns:Action>
      ns:Action
      ns:nameghy</ns:name>
      </ns:Action>
      ns:Coverage
      ns:Action
      ns:namedeg</ns:name>
      </ns:Action>
      </ns:Coverage>
      </ns:locationevent>
      </ns:name>

      ns:name
      ns:locationasfsafs</ns:location>
      ns:locationeventzzzz</ns:locationevent>
      ns:Action
      ns:nameabc</ns:name>
      </ns:Action>
      ns:Action
      ns:namedef</ns:name>
      </ns:Action>
      ns:Coverage
      ns:Action
      ns:nameefg</ns:name>
      </ns:Action>
      </ns:Coverage>
      </ns:locationevent>
      </ns:name>

      ns:name
      ns:locationasfsafs</ns:location>
      ns:locationeventyyyy</ns:locationevent>
      ns:Action
      ns:nameabc</ns:name>
      </ns:Action>
      ns:Action
      ns:namedef</ns:name>
      </ns:Action>
      ns:Coverage
      ns:Action
      ns:namedef</ns:name>
      </ns:Action>
      </ns:Coverage>
      </ns:locationevent>
      </ns:name>

      I have the XML blocks above in a file, and I want to keep only the blocks that match the conditions below. Can anyone help me do this with a regex?

      I need only the XML block in which <ns:locationevent> is yyyy, at least one <ns:Action> has <ns:name> set to def, and the <ns:Action> inside <ns:Coverage> also has <ns:name> set to def. In this case it should return only the block below:

      ns:name
      ns:locationasfsafs</ns:location>
      ns:locationeventyyyy</ns:locationevent>
      ns:Action
      ns:nameabc</ns:name>
      </ns:Action>
      ns:Action
      ns:namedef</ns:name>
      </ns:Action>
      ns:Coverage
      ns:Action
      ns:namedef</ns:name>
      </ns:Action>
      </ns:Coverage>
      </ns:locationevent>
      </ns:name>

      • ?
        A Former User @A Former User
        last edited by

        Corrected XML:
        <ns:name>
        <ns:location>asfsafs</ns:location>
        <ns:locationevent>xxxx</ns:locationevent>
        <ns:Action>
        <ns:name>abc</ns:name>
        </ns:Action>
        <ns:Action>
        <ns:name>ghy</ns:name>
        </ns:Action>
        <ns:Coverage>
        <ns:Action>
        <ns:name>deg</ns:name>
        </ns:Action>
        </ns:Coverage>
        </ns:locationevent>
        </ns:name>

        <ns:name>
        <ns:location>asfsafs</ns:location>
        <ns:locationevent>zzzz</ns:locationevent>
        <ns:Action>
        <ns:name>abc</ns:name>
        </ns:Action>
        <ns:Action>
        <ns:name>def</ns:name>
        </ns:Action>
        <ns:Coverage>
        <ns:Action>
        <ns:name>efg</ns:name>
        </ns:Action>
        </ns:Coverage>
        </ns:locationevent>
        </ns:name>

        <ns:name>
        <ns:location>asfsafs</ns:location>
        <ns:locationevent>yyyy</ns:locationevent>
        <ns:Action>
        <ns:name>abc</ns:name>
        </ns:Action>
        <ns:Action>
        <ns:name>def</ns:name>
        </ns:Action>
        <ns:Coverage>
        <ns:Action>
        <ns:name>def</ns:name>
        </ns:Action>
        </ns:Coverage>
        </ns:locationevent>
        </ns:name>

        expected xml:

        <ns:name>
        <ns:location>asfsafs</ns:location>
        <ns:locationevent>yyyy</ns:locationevent>
        <ns:Action>
        <ns:name>abc</ns:name>
        </ns:Action>
        <ns:Action>
        <ns:name>def</ns:name>
        </ns:Action>
        <ns:Coverage>
        <ns:Action>
        <ns:name>def</ns:name>
        </ns:Action>
        </ns:Coverage>
        </ns:locationevent>
        </ns:name>

        • guy038G
          guy038
          last edited by guy038

          Hello, @vijay-s, and All,

          Apparently, your expected XML file would be :

          <ns:name>
              <ns:location>asfsafs</ns:location>
              <ns:locationevent>yyyy</ns:locationevent>
                  <ns:Action>
                      <ns:name>abc</ns:name>
                  </ns:Action>
                  <ns:Action>
                      <ns:name>def</ns:name>
                  </ns:Action>
                  <ns:Coverage>
                      <ns:Action>
                          <ns:name>def</ns:name>
                      </ns:Action>
                  </ns:Coverage>
              </ns:locationevent>
          </ns:name>
          

          Note that, to display code text correctly, you need to insert it between two lines of 3 grave accents ( \x{0060} ), like below :

          ```
          Code text
          ```


          So, for a multi-line block <ns:name>............</ns:name> of the input file to be kept in the output file :

          • The block must contain a line <ns:locationevent>yyyy</ns:locationevent>

          • The block must contain a multi-line part <ns:Coverage> .... <ns:Action> .... <ns:name>def</ns:name>

          • The block must contain a multi-line part <ns:Action> .... <ns:name>def</ns:name>

          Regarding this third condition, I just noticed that the block you’re expecting also contains the part <ns:Action> .... <ns:name>abc</ns:name>, with a value different from def. So, I guess that, as long as the third condition is true, a block may also contain other multi-line parts <ns:Action> .... <ns:name>xxx</ns:name>, where xxx is different from def. Am I right about it ?
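
For readers who would rather script this check than run it inside the editor, the three conditions above can be sketched in Python. This is only an illustration under stated assumptions: the helper names `block_matches` and `filter_blocks` are made up here, each top-level `<ns:name>` block is assumed to open and close on a line of its own (as in the sample), and the third condition is read as "an `<ns:Action>` outside `<ns:Coverage>`". Plain regex checks are used because the sample, as posted, does not look like well-formed XML (`</ns:locationevent>` appears twice per block).

```python
import re

# Assumption: a top-level block starts with "<ns:name>" alone on a line and
# ends with "</ns:name>" alone on a line. Nested "<ns:name>abc</ns:name>"
# lines never match either anchor, so they stay inside their block.
BLOCK_RE = re.compile(
    r"^[ \t\r]*<ns:name>[ \t\r]*$.*?^[ \t\r]*</ns:name>[ \t\r]*$",
    re.DOTALL | re.MULTILINE,
)
COVERAGE_RE = re.compile(r"<ns:Coverage>.*?</ns:Coverage>", re.DOTALL)
ACTION_DEF_RE = re.compile(r"<ns:Action>\s*<ns:name>def</ns:name>")


def block_matches(block):
    """Return True when one block satisfies all three conditions."""
    # Condition 1: the location event is yyyy.
    if "<ns:locationevent>yyyy</ns:locationevent>" not in block:
        return False
    # Condition 2: some Action inside a Coverage section is named def.
    coverage_parts = COVERAGE_RE.findall(block)
    if not any(ACTION_DEF_RE.search(part) for part in coverage_parts):
        return False
    # Condition 3 (assumed reading): some Action outside Coverage is named def.
    outside_coverage = COVERAGE_RE.sub("", block)
    return ACTION_DEF_RE.search(outside_coverage) is not None


def filter_blocks(text):
    """Return every top-level block of `text` that passes block_matches."""
    return [m.group(0) for m in BLOCK_RE.finditer(text) if block_matches(m.group(0))]
```

Each block is tested in isolation, so a condition met in one block cannot be satisfied by text from a neighbouring block.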

          See you later,

          Best Regards,

          guy038

          • ?
            A Former User
            last edited by

            Hi,

            You are right.

            <ns:name> … <ns:Action> … <ns:name>def</ns:name> is one of the action names under <ns:name>.
            Regarding your third condition: yes, the block may contain other multi-line parts.

            Can you provide the solution as soon as possible? It is very urgent.

            • ?
              A Former User
              last edited by

              @guy038 said in extract XMl with regex:

              \x{0060}

              \x{0060}\x{0060}\x{0060}
              ns:name…ns:Action … ns:namedef</ns:name>
              \x{0060}\x{0060}\x{0060}

              • guy038G
                guy038
                last edited by guy038

                Hi @vijay-s, and All,

                If, in addition, we assume that the section <ns:Coverage> .... </ns:Coverage> is always near the end of the main <ns:name> ... </ns:name> section, here is a way to do it :

                • Start Notepad++ ( your N++ version must be 7.8 or higher : press the F1 key to verify )

                • Open the Mark dialog ( Search > Mark... menu option )

                • SEARCH (?s)<ns:name>((?!</ns:locationevent>).)*?<ns:locationevent>yyyy</ns:locationevent>.*?<ns:Action>.*?<ns:name>def</ns:name>.*?<ns:Coverage>.*?<ns:Action>.*?<ns:name>def</ns:name>.*?</ns:locationevent>.*?</ns:name>

                • Tick the Bookmark line option

                • Tick the Purge for each search option

                • Tick the Wrap around option

                • Select the Regular expression search mode

                • Click, once, on the Mark All button

                => Normally, all lines of the main <ns:name> ... </ns:name> blocks, which satisfy the 3 conditions, discussed in my previous post, should be bookmarked
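
Outside Notepad++, essentially the same search can be tried with Python's `re` module, which also accepts the `(?s)` flag and the negative look-ahead used here. The snippet below is a minimal sketch, not part of the original answer: `PATTERN` and `extract_matching_blocks` are names chosen for illustration, and Python's regex engine is not the Boost engine that Notepad++ uses, so behaviour on unusual input may differ.

```python
import re

# The SEARCH regex from the steps above, split over several lines for
# readability. (?s) lets "." match newlines; ((?!</ns:locationevent>).)*?
# stops the first part from running past a closing </ns:locationevent>.
PATTERN = re.compile(
    r"(?s)<ns:name>((?!</ns:locationevent>).)*?"
    r"<ns:locationevent>yyyy</ns:locationevent>.*?"
    r"<ns:Action>.*?<ns:name>def</ns:name>.*?"
    r"<ns:Coverage>.*?<ns:Action>.*?<ns:name>def</ns:name>.*?"
    r"</ns:locationevent>.*?</ns:name>"
)


def extract_matching_blocks(text):
    """Return the full text of every match, like copying the bookmarked lines."""
    return [m.group(0) for m in PATTERN.finditer(text)]
```

One caveat worth checking on real data: the lazy `.*?` parts are free to run past a block's closing `</ns:name>`, so a yyyy block that lacks the Coverage condition may be matched together with a later block that contains `<ns:name>def</ns:name>`.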

                Now :

                • Run the menu option Search > Bookmark > Copy Bookmarked lines

                • Open a new tab ( Ctrl + N )

                • Paste all the bookmarked lines ( Ctrl + V )

                Et voilà !

                I’ll give you additional information about the regex search syntax, later, if you want to ;-))


                When I said that the grave accent is \x{0060}, I just mentioned its Unicode value. But, of course, you must insert the ` character, literally, 3 times, before and after the piece of code !

                Best Regards

                guy038

                • ?
                  A Former User
                  last edited by

                  @guy038 said in extract XML with regex:

                  I’ll give you additional information about the regex search syntax, later, if you want to

                  Thanks a lot, guy038. It is working like a charm!

                  Yes, can you please explain the regex used?

                  I need help with two more things:

                  In case there is an <ID></ID> under the <ns:coverage> tag, somewhere in the middle, I need to pull only the XML that matches a given ID.

                  Also,
                  
                  
                  <ns:name>
                  <ns:location>asfsafs</ns:location>
                  <ns:locationevent>zzzz</ns:locationevent>
                  <ns:locations>
                  <ns:location1>
                  <ns:locationphase>
                  <ns:Action>
                  <ns:name>Prior</ns:name>
                  </ns:Action>
                  <ns:Status>Completed</ns:Status>
                  </ns:locationphase>
                  </ns:location1>
                  <ns:location1>
                  <ns:locationphase>
                  <ns:Action>
                  <ns:name>Current</ns:name>
                  </ns:Action>
                  <ns:Status>Completed</ns:Status>
                  </ns:locationphase>
                  </ns:location1>
                  <ns:location1>
                  <ns:locationphase>
                  <ns:Action>
                  <ns:name>Future</ns:name>
                  </ns:Action>
                  <ns:Status>Pending</ns:Status>
                  </ns:locationphase>
                  </ns:location1>
                  </ns:locations>
                  <ns:Action>
                  <ns:name>abc</ns:name>
                  </ns:Action>
                  <ns:Action>
                  <ns:name>ghy</ns:name>
                  </ns:Action>
                  <ns:Coverage>
                  <ns:Action>
                  <ns:name>deg</ns:name>
                  </ns:Action>
                  <ns:Action>
                  <ns:name>def</ns:name>
                  </ns:Action>
                  </ns:Coverage>
                  </ns:name>
                  
                  
                  <ns:name>
                  <ns:location>asfsafs</ns:location>
                  <ns:locationevent>yyyy</ns:locationevent>
                  <ns:locations>
                  <ns:location1>
                  <ns:locationphase>
                  <ns:Action>
                  <ns:name>Prior</ns:name>
                  </ns:Action>
                  <ns:Status>Completed</ns:Status>
                  </ns:locationphase>
                  </ns:location1>
                  <ns:location1>
                  <ns:locationphase>
                  <ns:Action>
                  <ns:name>Current</ns:name>
                  </ns:Action>
                  <ns:Status>Completed</ns:Status>
                  </ns:locationphase>
                  </ns:location1>
                  <ns:location1>
                  <ns:locationphase>
                  <ns:Action>
                  <ns:name>Future</ns:name>
                  </ns:Action>
                  <ns:Status>Pending</ns:Status>
                  </ns:locationphase>
                  </ns:location1>
                  </ns:locations>
                  <ns:Action>
                  <ns:name>abc</ns:name>
                  </ns:Action>
                  <ns:Action>
                  <ns:name>def</ns:name>
                  </ns:Action>
                  <ns:Coverage>
                  <ns:Action>
                  <ns:name>deg</ns:name>
                  </ns:Action>
                  <ns:Action>
                  <ns:name>ddd</ns:name>
                  </ns:Action>
                  </ns:Coverage>
                  </ns:name>
                  
                  
                  
                  <ns:name>
                  <ns:location>asfsafs</ns:location>
                  <ns:locationevent>yyyy</ns:locationevent>
                  <ns:locations>
                  <ns:location1>
                  <ns:locationphase>
                  <ns:Action>
                  <ns:name>Prior</ns:name>
                  </ns:Action>
                  <ns:Status>Completed</ns:Status>
                  </ns:locationphase>
                  </ns:location1>
                  <ns:location1>
                  <ns:locationphase>
                  <ns:Action>
                  <ns:name>Current</ns:name>
                  </ns:Action>
                  <ns:Status>Completed</ns:Status>
                  </ns:locationphase>
                  </ns:location1>
                  <ns:location1>
                  <ns:locationphase>
                  <ns:Action>
                  <ns:name>Future</ns:name>
                  </ns:Action>
                  <ns:Status>Pending</ns:Status>
                  </ns:locationphase>
                  </ns:location1>
                  </ns:locations>
                  <ns:Action>
                  <ns:name>abc</ns:name>
                  </ns:Action>
                  <ns:Action>
                  <ns:name>def</ns:name>
                  </ns:Action>
                  <ns:Coverage>
                  <ns:Action>
                  <ns:name>deg</ns:name>
                  </ns:Action>
                  <ns:Action>
                  <ns:name>def</ns:name>
                  </ns:Action>
                  </ns:Coverage>
                  </ns:name>
                  
                  
                  
                  
                  For the file given above, I need to pick only the XML that has <ns:locationphase> with <ns:Action> <ns:name>Future and <ns:Status>Pending, and <ns:name> .. <ns:Action> <ns:name>def, and <ns:Coverage> <ns:Action> <ns:name>def. Can you please help?
                  
                  • ?
                    A Former User
                    last edited by

                    @vijay-S said in extract XMl with regex:

                    <ns:locationevent>yyyy</ns:locationevent>

                    Please also include <ns:locationevent>yyyy</ns:locationevent> in your answer to my second request.
                    • guy038G
                      guy038
                      last edited by guy038

                      @vijay-s, and All,

                      Regarding your first request, do you mean :

                      In case there is an <ID>xxxxx</ID> under the <ns:coverage> tag, somewhere in the middle, how to pull only the multi-line <ns:name> ...... </ns:name> block which contains the <ID>xxxxx</ID> tag ?
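
If that reading of the first request is right, a per-block check could look like the Python sketch below. Everything in it is an assumption made for illustration: the function name `blocks_with_id`, the idea that the `<ID>` tag carries no `ns:` prefix, and the idea that each top-level `<ns:name>` block opens and closes on a line of its own. The Coverage tag is matched case-insensitively because the request spells it `<ns:coverage>` while the samples use `<ns:Coverage>`.

```python
import re

# Assumption: top-level blocks open and close with the tag alone on a line.
BLOCK_RE = re.compile(
    r"^[ \t\r]*<ns:name>[ \t\r]*$.*?^[ \t\r]*</ns:name>[ \t\r]*$",
    re.DOTALL | re.MULTILINE,
)
COVERAGE_RE = re.compile(
    r"<ns:Coverage>.*?</ns:Coverage>", re.DOTALL | re.IGNORECASE
)


def blocks_with_id(text, wanted_id):
    """Return the blocks whose Coverage section holds <ID>wanted_id</ID>."""
    # re.escape keeps characters of the id from being read as regex syntax.
    id_re = re.compile(r"<ID>\s*" + re.escape(wanted_id) + r"\s*</ID>")
    selected = []
    for match in BLOCK_RE.finditer(text):
        block = match.group(0)
        # Only look inside Coverage sections, not in the rest of the block.
        if any(id_re.search(part) for part in COVERAGE_RE.findall(block)):
            selected.append(block)
    return selected
```

For example, `blocks_with_id(text, "12345")` would keep only the blocks whose Coverage section contains `<ID>12345</ID>`; the id value here is a placeholder.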


                      Regarding your second request, here is your last XML input file, properly indented, with a number on each line. I put a ● symbol in front of each line that I guessed you want to extract

                      If I forgot some lines, just tell me their numbers. For a range, you may shorten it to : need lines xxx-yyy

                          001<ns:name>
                          002    <ns:location>asfsafs</ns:location>
                          003    <ns:locationevent>zzzz</ns:locationevent>
                          004    <ns:locations>
                          005        <ns:location1>
                          006            <ns:locationphase>
                          007                <ns:Action>
                          008                    <ns:name>Prior</ns:name>
                          009                </ns:Action>
                          010                <ns:Status>Completed</ns:Status>
                          011            </ns:locationphase>
                          012        </ns:location1>
                          013        <ns:location1>
                          014            <ns:locationphase>
                          015                <ns:Action>
                          016                    <ns:name>Current</ns:name>
                          017                </ns:Action>
                          018                <ns:Status>Completed</ns:Status>
                          019            </ns:locationphase>
                          020        </ns:location1>
                          021        <ns:location1>
                          022            <ns:locationphase>
                          023                <ns:Action>
                          024                    <ns:name>Future</ns:name>
                          025                </ns:Action>
                          026                <ns:Status>Pending</ns:Status>
                          027            </ns:locationphase>
                          028        </ns:location1>
                          029    </ns:locations>
                          030    <ns:Action>
                          031        <ns:name>abc</ns:name>
                          032    </ns:Action>
                          033    <ns:Action>
                          034        <ns:name>ghy</ns:name>
                          035    </ns:Action>
                          036    <ns:Coverage>
                          037        <ns:Action>
                          038            <ns:name>deg</ns:name>
                          039        </ns:Action>
                          040        <ns:Action>
                          041            <ns:name>def</ns:name>
                          042        </ns:Action>
                          043    </ns:Coverage>
                          044</ns:name>
                          045
                          046
                          047<ns:name>
                          048    <ns:location>asfsafs</ns:location>
                          049    <ns:locationevent>yyyy</ns:locationevent>
                          050    <ns:locations>
                          051        <ns:location1>
                          052            <ns:locationphase>
                          053                <ns:Action>
                          054                    <ns:name>Prior</ns:name>
                          055                </ns:Action>
                          056                <ns:Status>Completed</ns:Status>
                          057            </ns:locationphase>
                          058        </ns:location1>
                          059        <ns:location1>
                          060            <ns:locationphase>
                          061                <ns:Action>
                          062                    <ns:name>Current</ns:name>
                          063                </ns:Action>
                          064                <ns:Status>Completed</ns:Status>
                          065            </ns:locationphase>
                          066        </ns:location1>
                          067        <ns:location1>
                       ●  068            <ns:locationphase>
                       ●  069                <ns:Action>
                       ●  070                    <ns:name>Future</ns:name>
                       ●  071                </ns:Action>
                       ●  072                <ns:Status>Pending</ns:Status>
                          073            </ns:locationphase>
                          074        </ns:location1>
                          075    </ns:locations>
                          076    <ns:Action>
                          077        <ns:name>abc</ns:name>
                          078    </ns:Action>
                       ●  079    <ns:Action>
                       ●  080        <ns:name>def</ns:name>
                          081    </ns:Action>
                          082    <ns:Coverage>
                          083        <ns:Action>
                          084            <ns:name>deg</ns:name>
                          085        </ns:Action>
                          086        <ns:Action>
                          087            <ns:name>ddd</ns:name>
                          088        </ns:Action>
                          089    </ns:Coverage>
                          090</ns:name>
                          091
                          092<ns:name>
                          093    <ns:location>asfsafs</ns:location>
                          094    <ns:locationevent>yyyy</ns:locationevent>
                          095    <ns:locations>
                          096        <ns:location1>
                          097            <ns:locationphase>
                          098                <ns:Action>
                          099                    <ns:name>Prior</ns:name>
                          100                </ns:Action>
                          101                <ns:Status>Completed</ns:Status>
                          102            </ns:locationphase>
                          103        </ns:location1>
                          104        <ns:location1>
                          105            <ns:locationphase>
                          106                <ns:Action>
                          107                    <ns:name>Current</ns:name>
                          108                </ns:Action>
                          109                <ns:Status>Completed</ns:Status>
                          110            </ns:locationphase>
                          111        </ns:location1>
                          112        <ns:location1>
                       ●  113            <ns:locationphase>
                       ●  114                <ns:Action>
                       ●  115                    <ns:name>Future</ns:name>
                       ●  116                </ns:Action>
                       ●  117                <ns:Status>Pending</ns:Status>
                          118            </ns:locationphase>
                          119        </ns:location1>
                          120    </ns:locations>
                          121    <ns:Action>
                          122        <ns:name>abc</ns:name>
                          123    </ns:Action>
                       ●  124    <ns:Action>
                       ●  125        <ns:name>def</ns:name>
                          126    </ns:Action>
                       ●  127    <ns:Coverage>
                          128        <ns:Action>
                          129            <ns:name>deg</ns:name>
                          130        </ns:Action>
                       ●  131        <ns:Action>
                       ●  132            <ns:name>def</ns:name>
                          133        </ns:Action>
                          134    </ns:Coverage>
                          135</ns:name>
                      

                      See you later,

                      guy038

                      • guy038G
                        guy038
                        last edited by

                        Hi, @vijay-s, and All,

                        From your posts :

                        https://community.notepad-plus-plus.org/post/49506

                        I understood that, finally, from the original XML text below, you want to only extract the lines, with the ● symbol :

                        <!----------------------------------------------------------->
                            001<ns:name>
                            002    <ns:location>asfsafs</ns:location>
                            003    <ns:locationevent>zzzz</ns:locationevent>
                            004    <ns:locations>
                            005        <ns:location1>
                            006            <ns:locationphase>
                            007                <ns:Action>
                            008                    <ns:name>Prior</ns:name>
                            009                </ns:Action>
                            010                <ns:Status>Completed</ns:Status>
                            011            </ns:locationphase>
                            012        </ns:location1>
                            013        <ns:location1>
                            014            <ns:locationphase>
                            015                <ns:Action>
                            016                    <ns:name>Current</ns:name>
                            017                </ns:Action>
                            018                <ns:Status>Completed</ns:Status>
                            019            </ns:locationphase>
                            020        </ns:location1>
                            021        <ns:location1>
                            022            <ns:locationphase>
                            023                <ns:Action>
                            024                    <ns:name>Future</ns:name>
                            025                </ns:Action>
                            026                <ns:Status>Pending</ns:Status>
                            027            </ns:locationphase>
                            028        </ns:location1>
                            029    </ns:locations>
                            030    <ns:Action>
                            031        <ns:name>abc</ns:name>
                            032    </ns:Action>
                            033    <ns:Action>
                            034        <ns:name>ghy</ns:name>
                            035    </ns:Action>
                            036    <ns:Coverage>
                            037        <ns:Action>
                            038            <ns:name>deg</ns:name>
                            039        </ns:Action>
                            040        <ns:Action>
                            041            <ns:name>def</ns:name>
                            042        </ns:Action>
                            043    </ns:Coverage>
                            044</ns:name>
                            045
                            046
                        <!----------------------------------------------------------->
                         ●  047<ns:name>
                         ●  048    <ns:location>asfsafs</ns:location>
                         ●  049    <ns:locationevent>yyyy</ns:locationevent>
                         ●  050    <ns:locations>
                        <!----------------------------------------------------------->
                            051        <ns:location1>
                            052            <ns:locationphase>
                            053                <ns:Action>
                            054                    <ns:name>Prior</ns:name>
                            055                </ns:Action>
                            056                <ns:Status>Completed</ns:Status>
                            057            </ns:locationphase>
                            058        </ns:location1>
                            059        <ns:location1>
                            060            <ns:locationphase>
                            061                <ns:Action>
                            062                    <ns:name>Current</ns:name>
                            063                </ns:Action>
                            064                <ns:Status>Completed</ns:Status>
                            065            </ns:locationphase>
                            066        </ns:location1>
                        <!----------------------------------------------------------->
                         ●  067        <ns:location1>
                         ●  068            <ns:locationphase>
                         ●  069                <ns:Action>
                         ●  070                    <ns:name>Future</ns:name>
                         ●  071                </ns:Action>
                         ●  072                <ns:Status>Pending</ns:Status>
                         ●  073            </ns:locationphase>
                         ●  074        </ns:location1>
                         ●  075    </ns:locations>
                        <!----------------------------------------------------------->
                            076    <ns:Action>
                            077        <ns:name>abc</ns:name>
                            078    </ns:Action>
                        <!----------------------------------------------------------->
                         ●  079    <ns:Action>
                         ●  080        <ns:name>def</ns:name>
                         ●  081    </ns:Action>
                        <!----------------------------------------------------------->
                            082    <ns:Coverage>
                            083        <ns:Action>
                            084            <ns:name>deg</ns:name>
                            085        </ns:Action>
                            086        <ns:Action>
                            087            <ns:name>ddd</ns:name>
                            088        </ns:Action>
                            089    </ns:Coverage>
                        <!----------------------------------------------------------->
                         ●  090</ns:name>
                        <!----------------------------------------------------------->
                            091
                        <!----------------------------------------------------------->
                         ●  092<ns:name>
                         ●  093    <ns:location>asfsafs</ns:location>
                         ●  094    <ns:locationevent>yyyy</ns:locationevent>
                         ●  095    <ns:locations>
                        <!----------------------------------------------------------->
                            096        <ns:location1>
                            097            <ns:locationphase>
                            098                <ns:Action>
                            099                    <ns:name>Prior</ns:name>
                            100                </ns:Action>
                            101                <ns:Status>Completed</ns:Status>
                            102            </ns:locationphase>
                            103        </ns:location1>
                            104        <ns:location1>
                            105            <ns:locationphase>
                            106                <ns:Action>
                            107                    <ns:name>Current</ns:name>
                            108                </ns:Action>
                            109                <ns:Status>Completed</ns:Status>
                            110            </ns:locationphase>
                            111        </ns:location1>
                        <!----------------------------------------------------------->
                         ●  112        <ns:location1>
                         ●  113            <ns:locationphase>
                         ●  114                <ns:Action>
                         ●  115                    <ns:name>Future</ns:name>
                         ●  116                </ns:Action>
                         ●  117                <ns:Status>Pending</ns:Status>
                         ●  118            </ns:locationphase>
                         ●  119        </ns:location1>
                         ●  120    </ns:locations>
                        <!----------------------------------------------------------->
                            121    <ns:Action>
                            122        <ns:name>abc</ns:name>
                            123    </ns:Action>
                        <!----------------------------------------------------------->
                         ●  124    <ns:Action>
                         ●  125        <ns:name>def</ns:name>
                         ●  126    </ns:Action>
                        <!----------------------------------------------------------->
                         ●  127    <ns:Coverage>
                            128        <ns:Action>
                            129            <ns:name>deg</ns:name>
                            130        </ns:Action>
                         ●  131        <ns:Action>
                         ●  132            <ns:name>def</ns:name>
                         ●  133        </ns:Action>
                         ●  134    </ns:Coverage>
                         ●  135</ns:name>
                        <!----------------------------------------------------------->
                        

                        It's important to note that, this time, I'll use a different logic than before! Indeed, the regex searches below delete all the unwanted zones of text, so that, at the end, only your expected text remains.

                        Thus, assuming the initial XML text:

                        <ns:name>
                            <ns:location>asfsafs</ns:location>
                            <ns:locationevent>zzzz</ns:locationevent>
                            <ns:locations>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Prior</ns:name>
                                        </ns:Action>
                                        <ns:Status>Completed</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Current</ns:name>
                                        </ns:Action>
                                        <ns:Status>Completed</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Future</ns:name>
                                        </ns:Action>
                                        <ns:Status>Pending</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                            </ns:locations>
                            <ns:Action>
                                <ns:name>abc</ns:name>
                            </ns:Action>
                            <ns:Action>
                                <ns:name>ghy</ns:name>
                            </ns:Action>
                            <ns:Coverage>
                                <ns:Action>
                                    <ns:name>deg</ns:name>
                                </ns:Action>
                                <ns:Action>
                                    <ns:name>def</ns:name>
                                </ns:Action>
                            </ns:Coverage>
                        </ns:name>
                        
                        
                        <ns:name>
                            <ns:location>asfsafs</ns:location>
                            <ns:locationevent>yyyy</ns:locationevent>
                            <ns:locations>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Prior</ns:name>
                                        </ns:Action>
                                        <ns:Status>Completed</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Current</ns:name>
                                        </ns:Action>
                                        <ns:Status>Completed</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Future</ns:name>
                                        </ns:Action>
                                        <ns:Status>Pending</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                            </ns:locations>
                            <ns:Action>
                                <ns:name>abc</ns:name>
                            </ns:Action>
                            <ns:Action>
                                <ns:name>def</ns:name>
                            </ns:Action>
                            <ns:Coverage>
                                <ns:Action>
                                    <ns:name>deg</ns:name>
                                </ns:Action>
                                <ns:Action>
                                    <ns:name>ddd</ns:name>
                                </ns:Action>
                            </ns:Coverage>
                        </ns:name>
                        
                        <ns:name>
                            <ns:location>asfsafs</ns:location>
                            <ns:locationevent>yyyy</ns:locationevent>
                            <ns:locations>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Prior</ns:name>
                                        </ns:Action>
                                        <ns:Status>Completed</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Current</ns:name>
                                        </ns:Action>
                                        <ns:Status>Completed</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Future</ns:name>
                                        </ns:Action>
                                        <ns:Status>Pending</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                            </ns:locations>
                            <ns:Action>
                                <ns:name>abc</ns:name>
                            </ns:Action>
                            <ns:Action>
                                <ns:name>def</ns:name>
                            </ns:Action>
                            <ns:Coverage>
                                <ns:Action>
                                    <ns:name>deg</ns:name>
                                </ns:Action>
                                <ns:Action>
                                    <ns:name>def</ns:name>
                                </ns:Action>
                            </ns:Coverage>
                        </ns:name>
                        

                        After successively running these 4 regex S/R, in that order, against the text right above,

                        Regex A :

                        SEARCH (?s)^\h*<ns:name>.*?<ns:locationevent>(?!yyyy).*?</ns:Coverage>\R\h*</ns:name>\R

                        REPLACE Leave EMPTY

                        Regex B :

                        SEARCH (?s)^\h+<ns:location(\d+)>((?!Future).)+?</ns:location\1>\R

                        REPLACE Leave EMPTY

                        Regex C :

                        SEARCH (?s)^\h+<ns:Action>((?!def|Future).)+?</ns:Action>\R

                        REPLACE Leave EMPTY

                        Regex D :

                        SEARCH (?s)^\h*<(.+)>\s*</\1>\R

                        REPLACE Leave EMPTY

                        with the Wrap around option ticked and the Regular expression search mode selected, and using the Replace All button,

                        you should get your expected text:

                        
                        
                        <ns:name>
                            <ns:location>asfsafs</ns:location>
                            <ns:locationevent>yyyy</ns:locationevent>
                            <ns:locations>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Future</ns:name>
                                        </ns:Action>
                                        <ns:Status>Pending</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                            </ns:locations>
                            <ns:Action>
                                <ns:name>def</ns:name>
                            </ns:Action>
                        </ns:name>
                        
                        <ns:name>
                            <ns:location>asfsafs</ns:location>
                            <ns:locationevent>yyyy</ns:locationevent>
                            <ns:locations>
                                <ns:location1>
                                    <ns:locationphase>
                                        <ns:Action>
                                            <ns:name>Future</ns:name>
                                        </ns:Action>
                                        <ns:Status>Pending</ns:Status>
                                    </ns:locationphase>
                                </ns:location1>
                            </ns:locations>
                            <ns:Action>
                                <ns:name>def</ns:name>
                            </ns:Action>
                            <ns:Coverage>
                                <ns:Action>
                                    <ns:name>def</ns:name>
                                </ns:Action>
                            </ns:Coverage>
                        </ns:name>
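
For anyone who wants to try the four passes outside Notepad++, the sketch below replays them with Python's `re` module on a shortened sample. It is only an approximation under stated assumptions: Python has no `\h` or `\R`, so they are spelled out as `[ \t]` and `\r?\n`, and `re.M` stands in for the editor's line-anchored `^`. The names `PASSES`, `prune`, `SAMPLE` and `result` are made up for this illustration.

```python
import re

# Python's re lacks \h and \R; these are hand-written equivalents (an
# adaptation of the Boost syntax used in Notepad++, not the original).
H = r"[ \t]"
R = r"\r?\n"

PASSES = [
    # A: delete whole blocks whose locationevent is not yyyy
    rf"^{H}*<ns:name>.*?<ns:locationevent>(?!yyyy).*?</ns:Coverage>{R}{H}*</ns:name>{R}",
    # B: delete <ns:locationN> sections that do not mention Future
    rf"^{H}+<ns:location(\d+)>((?!Future).)+?</ns:location\1>{R}",
    # C: delete <ns:Action> sections that mention neither def nor Future
    rf"^{H}+<ns:Action>((?!def|Future).)+?</ns:Action>{R}",
    # D: delete tag pairs left empty by the previous passes
    rf"^{H}*<(.+)>\s*</\1>{R}",
]

def prune(text):
    """Apply the four deletions in order and return what is left."""
    for pattern in PASSES:
        text = re.sub(pattern, "", text, flags=re.M | re.S)
    return text

# Shortened, hypothetical sample: one zzzz block, then one yyyy block.
SAMPLE = """\
<ns:name>
    <ns:locationevent>zzzz</ns:locationevent>
    <ns:Action>
        <ns:name>def</ns:name>
    </ns:Action>
    <ns:Coverage>
        <ns:Action>
            <ns:name>def</ns:name>
        </ns:Action>
    </ns:Coverage>
</ns:name>
<ns:name>
    <ns:locationevent>yyyy</ns:locationevent>
    <ns:locations>
        <ns:location1>
            <ns:Action>
                <ns:name>Prior</ns:name>
            </ns:Action>
        </ns:location1>
        <ns:location1>
            <ns:Action>
                <ns:name>Future</ns:name>
            </ns:Action>
        </ns:location1>
    </ns:locations>
    <ns:Action>
        <ns:name>abc</ns:name>
    </ns:Action>
    <ns:Action>
        <ns:name>def</ns:name>
    </ns:Action>
    <ns:Coverage>
        <ns:Action>
            <ns:name>deg</ns:name>
        </ns:Action>
    </ns:Coverage>
</ns:name>
"""

result = prune(SAMPLE)
print(result)
```

One caveat, as far as I can tell: Regex A depends on the order of the blocks. Because `.*?` may cross block boundaries under `(?s)`, a `yyyy` block immediately followed by a non-`yyyy` block would be matched, and deleted, together with it.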
                        

                        Now, you said :

                        1st help:
                        In the first solution you gave, I want to add the cf:coverage …<ID> condition too, along with the existing conditions.

                        • Sorry, it's still not clear: do you mean the string <cf:coverage> or the string <ns:Coverage>, with this exact case?

                        • On the other hand, could you give me a real example of a <ns:Coverage> ... </ns:Coverage> section containing an <ID> .... </ID> block, and tell me which text you expect to get? Thanks.

                        • Finally, generally speaking, is case important to you?

                        For instance, are you looking for the expression <ns:locationevent>yyyy</ns:locationevent>, with this exact case?

                        Or are all these other syntaxes, below, still correct?

                        <ns:locationevent>yyyy</ns:locationevent>
                        <ns:locationevent>YyyY</ns:locationevent>
                        <ns:LocationEvent>yyyy</ns:LocationEvent>
                        <NS:locationevent>Yyyy</ns:locationevent> …
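
Should the answer be that tags keep their exact case while values may vary, one way to express it is a case-insensitive group scoped to the value only. The sketch below uses Python's `re` (3.6 or later, for the `(?i:...)` syntax) purely as an illustration; `PATTERN`, `candidates` and `matches` are names invented here. Notepad++ offers the `(?i)` and `(?-i)` modifiers for a similar effect.

```python
import re

# Tag names are matched literally (case-sensitive); only the value sits
# inside the scoped case-insensitive group (?i:...).
PATTERN = re.compile(r"<ns:locationevent>(?i:yyyy)</ns:locationevent>")

candidates = [
    "<ns:locationevent>yyyy</ns:locationevent>",
    "<ns:locationevent>YyyY</ns:locationevent>",
    "<ns:LocationEvent>yyyy</ns:LocationEvent>",
    "<NS:locationevent>Yyyy</ns:locationevent>",
]

# One boolean per candidate: does the whole string match the pattern?
matches = [PATTERN.fullmatch(c) is not None for c in candidates]

for text, ok in zip(candidates, matches):
    print("match   " if ok else "no match", text)
```

Under that assumption, only the first two variants are accepted, because the last two change the case of a tag name.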

                        Best Regards,

                        guy038

                        • ?
                          A Former User @guy038
                          last edited by

                          @guy038

                          Sorry if I was not clear.
                          
                          For the given XML
                          <ns:name>
                              002    <ns:location>asfsafs</ns:location>
                              003    <ns:locationevent>zzzz</ns:locationevent>
                              004    <ns:locations>
                              005        <ns:location1>
                              006            <ns:locationphase>
                              007                <ns:Action>
                              008                    <ns:name>Prior</ns:name>
                              009                </ns:Action>
                              010                <ns:Status>Completed</ns:Status>
                              011            </ns:locationphase>
                              012        </ns:location1>
                              013        <ns:location1>
                              014            <ns:locationphase>
                              015                <ns:Action>
                              016                    <ns:name>Current</ns:name>
                              017                </ns:Action>
                              018                <ns:Status>Completed</ns:Status>
                              019            </ns:locationphase>
                              020        </ns:location1>
                              021        <ns:location1>
                              022            <ns:locationphase>
                              023                <ns:Action>
                              024                    <ns:name>Future</ns:name>
                              025                </ns:Action>
                              026                <ns:Status>Pending</ns:Status>
                              027            </ns:locationphase>
                              028        </ns:location1>
                              029    </ns:locations>
                              030    <ns:Action>
                              031        <ns:name>abc</ns:name>
                              032    </ns:Action>
                              033    <ns:Action>
                              034        <ns:name>ghy</ns:name>
                              035    </ns:Action>
                              036    <ns:Coverage>
                              037        <ns:Action>
                              038            <ns:name>deg</ns:name>
                              039        </ns:Action>
                              040        <ns:Action>
                              041            <ns:name>def</ns:name>
                              042        </ns:Action>
                              043    </ns:Coverage>
                              044</ns:name>
                              045
                              046
                              047<ns:name>
                              048    <ns:location>asfsafs</ns:location>
                              049    <ns:locationevent>yyyy</ns:locationevent>
                              050    <ns:locations>
                              051        <ns:location1>
                              052            <ns:locationphase>
                              053                <ns:Action>
                              054                    <ns:name>Prior</ns:name>
                              055                </ns:Action>
                              056                <ns:Status>Completed</ns:Status>
                              057            </ns:locationphase>
                              058        </ns:location1>
                              059        <ns:location1>
                              060            <ns:locationphase>
                              061                <ns:Action>
                              062                    <ns:name>Current</ns:name>
                              063                </ns:Action>
                              064                <ns:Status>Completed</ns:Status>
                              065            </ns:locationphase>
                              066        </ns:location1>
                              067        <ns:location1>
                           ●  068            <ns:locationphase>
                           ●  069                <ns:Action>
                           ●  070                    <ns:name>Future</ns:name>
                           ●  071                </ns:Action>
                           ●  072                <ns:Status>Pending</ns:Status>
                              073            </ns:locationphase>
                              074        </ns:location1>
                              075    </ns:locations>
                              076    <ns:Action>
                              077        <ns:name>abc</ns:name>
                              078    </ns:Action>
                           ●  079    <ns:Action>
                           ●  080        <ns:name>def</ns:name>
                              081    </ns:Action>
                              082    <ns:Coverage>
                              083        <ns:Action>
                              084            <ns:name>deg</ns:name>
                              085        </ns:Action>
                              086        <ns:Action>
                              087            <ns:name>ddd</ns:name>
                              088        </ns:Action>
                              089    </ns:Coverage>
                              090</ns:name>
                              091
                              092<ns:name>
                              093    <ns:location>asfsafs</ns:location>
                              094    <ns:locationevent>yyyy</ns:locationevent>
                              095    <ns:locations>
                              096        <ns:location1>
                              097            <ns:locationphase>
                              098                <ns:Action>
                              099                    <ns:name>Prior</ns:name>
                              100                </ns:Action>
                              101                <ns:Status>Completed</ns:Status>
                              102            </ns:locationphase>
                              103        </ns:location1>
                              104        <ns:location1>
                              105            <ns:locationphase>
                              106                <ns:Action>
                              107                    <ns:name>Current</ns:name>
                              108                </ns:Action>
                              109                <ns:Status>Completed</ns:Status>
                              110            </ns:locationphase>
                              111        </ns:location1>
                              112        <ns:location1>
                           ●  113            <ns:locationphase>
                           ●  114                <ns:Action>
                           ●  115                    <ns:name>Future</ns:name>
                           ●  116                </ns:Action>
                           ●  117                <ns:Status>Pending</ns:Status>
                              118            </ns:locationphase>
                              119        </ns:location1>
                              120    </ns:locations>
                              121    <ns:Action>
                              122        <ns:name>abc</ns:name>
                              123    </ns:Action>
                           ●  124    <ns:Action>
                           ●  125        <ns:name>def</ns:name>
                              126    </ns:Action>
                           ●  127    <ns:Coverage>
                              128        <ns:Action>
                              129            <ns:name>deg</ns:name>
                              130        </ns:Action>
                           ●  131        <ns:Action>
                           ●  132            <ns:name>def</ns:name>
                              133        </ns:Action>
                              134    </ns:Coverage>
                              135</ns:name>
                          
                          I only need the one below (this is not about extracting text from inside the XML). I need to pick the whole XML block, which is the last one in the given XML.
                          
                          <ns:name>
                              093    <ns:location>asfsafs</ns:location>
                              094    <ns:locationevent>yyyy</ns:locationevent>
                              095    <ns:locations>
                              096        <ns:location1>
                              097            <ns:locationphase>
                              098                <ns:Action>
                              099                    <ns:name>Prior</ns:name>
                              100                </ns:Action>
                              101                <ns:Status>Completed</ns:Status>
                              102            </ns:locationphase>
                              103        </ns:location1>
                              104        <ns:location1>
                              105            <ns:locationphase>
                              106                <ns:Action>
                              107                    <ns:name>Current</ns:name>
                              108                </ns:Action>
                              109                <ns:Status>Completed</ns:Status>
                              110            </ns:locationphase>
                              111        </ns:location1>
                              112        <ns:location1>
                           ●  113            <ns:locationphase>
                           ●  114                <ns:Action>
                           ●  115                    <ns:name>Future</ns:name>
                           ●  116                </ns:Action>
                           ●  117                <ns:Status>Pending</ns:Status>
                              118            </ns:locationphase>
                              119        </ns:location1>
                              120    </ns:locations>
                              121    <ns:Action>
                              122        <ns:name>abc</ns:name>
                              123    </ns:Action>
                           ●  124    <ns:Action>
                           ●  125        <ns:name>def</ns:name>
                              126    </ns:Action>
                           ●  127    <ns:Coverage>
                              128        <ns:Action>
                              129            <ns:name>deg</ns:name>
                              130        </ns:Action>
                           ●  131        <ns:Action>
                           ●  132            <ns:name>def</ns:name>
                              133        </ns:Action>
                              134    </ns:Coverage>
                              135</ns:name>
                          
                          I need to pick the above XML based on the following conditions:
                          
                          Condition 1: <ns:name>...<ns:locationevent>yyyy</ns:locationevent>

                          Condition 2: <ns:name>.....<ns:locations>
                                           <ns:location1>
                                               <ns:locationphase>
                                                   <ns:Action>
                                                       <ns:name>Future</ns:name>
                                                   </ns:Action>
                                                   <ns:Status>Pending</ns:Status>
                                               </ns:locationphase>
                                           </ns:location1>
                                       </ns:locations>

                          Condition 3: <ns:name>...<ns:Action>
                                           <ns:name>def</ns:name>  -- there may be more than one Action, with other name values
                                       </ns:Action>

                          Condition 4: <ns:name>...<ns:Coverage>
                                           <ns:Action>
                                               <ns:name>def</ns:name>  -- there may be more than one Action, with other name values
                                           </ns:Action>
                                       </ns:Coverage>

                          In this case, only the third XML block matches the above conditions. I need the full third block.

                          Case sensitivity does not need to apply to the values, only to the tags.

                          I will send you the real example for <ID> soon.
                          
                          • PeterJonesP
                            PeterJones
                            last edited by

                            @vijay-S said in extract XMl with regex:

                            Can you provide the solution asap
                            as Its very urgent.

                            BTW: I hadn’t had a chance to respond to this earlier, but such a request is considered exceedingly rude in any help forum I’ve ever visited.

                            That compounds with the fact that you have shown no effort: guy038 provides you with an answer that works (or comes as close as he can guess, given the inaccurate or incomplete information you provide), and then you change the rules without attempting to modify what he has already given you; and he replies with an update, and this keeps repeating; at some point, you are going to wear out even his patience. I recommend a change in tactics before you’ve burned all bridges here.

                            • guy038G
                              guy038
                              last edited by guy038

                              Hello, @vijay-s,

                              Oh my God! I have misunderstood your request from the very beginning :-(( Now I see what you want:

                              You would like to pick the whole of any main <ns:name> ... </ns:name> block, ONLY IF it satisfies ALL the conditions below, in this priority order:

                              • It contains a tag and value <ns:locationevent>yyyy</ns:locationevent>

                              • It contains a tag and value <ID>123</ID>

                              • It contains a tag and value <ns:name>Future</ns:name>

                              • It contains at least one tag and value <ns:name>def</ns:name>, BEFORE the <ns:Coverage> .... </ns:Coverage> block

                              • It contains at least one tag and value <ns:name>def</ns:name>, INSIDE the <ns:Coverage> .... </ns:Coverage> block

                              Additional rule: tags are case-sensitive and values are case-insensitive

                              Have I formulated everything correctly?
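
Assuming the five conditions above are right, the selection could be prototyped in Python along the following lines before settling on a single Notepad++ regex. This is a rough sketch, not a tested solution for the real file: it assumes every main block opens with `<ns:name>` alone on its line and closes with a `</ns:name>` that starts a line, and the names `BLOCK`, `has`, `keep`, `SAMPLE` and `selected` are hypothetical.

```python
import re

# A main block: <ns:name> alone on its line, up to the first line that starts
# with </ns:name>. Inner <ns:name>value</ns:name> lines never start with "</".
BLOCK = re.compile(r"^[ \t]*<ns:name>[ \t]*\r?\n.*?^[ \t]*</ns:name>", re.M | re.S)

def has(tag, value, text):
    """True if text holds <tag>value</tag>; tag case-sensitive, value not."""
    t, v = re.escape(tag), re.escape(value)
    return re.search(rf"<{t}>(?i:{v})</{t}>", text) is not None

def keep(block):
    """Check the five conditions on one main block."""
    before, opener, rest = block.partition("<ns:Coverage>")
    if not opener:  # no Coverage section at all
        return False
    inside = rest.partition("</ns:Coverage>")[0]
    return (has("ns:locationevent", "yyyy", block)
            and has("ID", "123", block)
            and has("ns:name", "Future", block)
            and has("ns:name", "def", before)
            and has("ns:name", "def", inside))

# Hypothetical sample: the first block has no def inside Coverage,
# the second satisfies everything (with upper-case values).
SAMPLE = """\
<ns:name>
    <ns:locationevent>yyyy</ns:locationevent>
    <ns:Action>
        <ns:name>Future</ns:name>
    </ns:Action>
    <ns:Action>
        <ns:name>def</ns:name>
    </ns:Action>
    <ID>123</ID>
    <ns:Coverage>
        <ns:Action>
            <ns:name>ddd</ns:name>
        </ns:Action>
    </ns:Coverage>
</ns:name>

<ns:name>
    <ns:locationevent>YYYY</ns:locationevent>
    <ns:Action>
        <ns:name>Future</ns:name>
    </ns:Action>
    <ns:Action>
        <ns:name>def</ns:name>
    </ns:Action>
    <ID>123</ID>
    <ns:Coverage>
        <ns:Action>
            <ns:name>DEF</ns:name>
        </ns:Action>
    </ns:Coverage>
</ns:name>
"""

selected = [block for block in BLOCK.findall(SAMPLE) if keep(block)]
print("\n\n".join(selected))
```

Splitting into blocks first and testing each condition separately keeps every check readable, at the cost of not being a one-shot search in the editor.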


                              If that is correct, then, from your last example above and from the one in this post:

                              https://community.notepad-plus-plus.org/post/49516

                              I tried to build a realistic example covering all the cases, which gives:

                              <!----------------  INITIAL TEXT --------------------->
                              <ns:name>
                                  <ns:location>asfsafs</ns:location>
                                  <ns:locationevent>zzzz</ns:locationevent>
                                  <ns:locations>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Prior</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Current</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Future</ns:name>
                                              </ns:Action>
                                              <ns:Status>Pending</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                  </ns:locations>
                                  <ns:Action>
                                      <ns:name>abc</ns:name>
                                  </ns:Action>
                                  <ns:Action>
                                      <ns:name>ghy</ns:name>
                                  </ns:Action>
                                  <ID>123</ID>
                                  <ns:Coverage>
                                      <ns:Action>
                                          <ns:name>deg</ns:name>
                                      </ns:Action>
                                      <ns:Action>
                                          <ns:name>def</ns:name>
                                      </ns:Action>
                                  </ns:Coverage>
                              </ns:name>
                              
                              
                              <ns:name>
                                  <ns:location>asfsafs</ns:location>
                                  <ns:locationevent>yyyy</ns:locationevent>
                                  <ns:locations>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Prior</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Current</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Future</ns:name>
                                              </ns:Action>
                                              <ns:Status>Pending</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                  </ns:locations>
                                  <ns:Action>
                                      <ns:name>abc</ns:name>
                                  </ns:Action>
                                  <ns:Action>
                                      <ns:name>def</ns:name>
                                  </ns:Action>
                                  <ID>1234</ID>
                                  <ns:Coverage>
                                      <ns:Action>
                                          <ns:name>deg</ns:name>
                                      </ns:Action>
                                      <ns:Action>
                                          <ns:name>ddd</ns:name>
                                      </ns:Action>
                                  </ns:Coverage>
                              </ns:name>
                              
                              <ns:name>
                                  <ns:location>asfsafs</ns:location>
                                  <ns:locationevent>yyyy</ns:locationevent>
                                  <ns:locations>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Prior</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Current</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Future</ns:name>
                                              </ns:Action>
                                              <ns:Status>Pending</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                  </ns:locations>
                                  <ns:Action>
                                      <ns:name>abc</ns:name>
                                  </ns:Action>
                                  <ns:Action>
                                      <ns:name>def</ns:name>
                                  </ns:Action>
                                  <ID>123</ID>
                                  <ns:Coverage>
                                      <ns:Action>
                                          <ns:name>deg</ns:name>
                                      </ns:Action>
                                      <ns:Action>
                                          <ns:name>def</ns:name>
                                      </ns:Action>
                                  </ns:Coverage>
                              </ns:name>
                              

                              IMPORTANT :

                              • I changed, again, the logic of my regexes. This time, if the condition contained in a regex is true, the S/R adds a digit, as a marker, right after the present main <ns:name> .... </ns:name> block

                              • For all the S/R, below, click on the Replace All button, exclusively ( Do not use the Replace button. But you may use the Find Next button to see the different matches )

                              • And, as usual, tick the Wrap around option and select the Regular expression search mode


                              So, if you apply, successively, these 5 regexes, in this order, it will add a different number, right after the ending tag </ns:name> of each main block ( and possible other digits )

                              • Regex 1 :

                                • SEARCH (?s-i)^\h*<ns:name>.*?<ns:locationevent>(?i:yyyy)</ns:locationevent>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                • REPLACE 1

                              • Regex 2 :

                                • SEARCH (?s-i)^\h*<ns:name>.*?<ID>123</ID>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                • REPLACE 2

                              • Regex 3 :

                                • SEARCH (?s-i)^\h*<ns:name>.*?<ns:name>(?i:Future)</ns:name>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                • REPLACE 3

                              • Regex 4 :

                                • SEARCH (?s-i)^\h*<ns:name>.*?<ns:name>(?i:def)</ns:name>.+?<ns:Coverage>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                • REPLACE 4

                              • Regex 5 :

                                • SEARCH (?s-i)^\h*<ns:name>.*?<ns:Coverage>.+?<ns:name>(?i:def)</ns:name>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                • REPLACE 5

                              You should get this temporary text :

                              <!----------------  OUTPUT TEXT --------------------->
                              <ns:name>
                                  <ns:location>asfsafs</ns:location>
                                  <ns:locationevent>zzzz</ns:locationevent>
                                  <ns:locations>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Prior</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Current</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Future</ns:name>
                                              </ns:Action>
                                              <ns:Status>Pending</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                  </ns:locations>
                                  <ns:Action>
                                      <ns:name>abc</ns:name>
                                  </ns:Action>
                                  <ns:Action>
                                      <ns:name>ghy</ns:name>
                                  </ns:Action>
                                  <ID>123</ID>
                                  <ns:Coverage>
                                      <ns:Action>
                                          <ns:name>deg</ns:name>
                                      </ns:Action>
                                      <ns:Action>
                                          <ns:name>def</ns:name>
                                      </ns:Action>
                                  </ns:Coverage>
                              </ns:name>235
                              
                              
                              <ns:name>
                                  <ns:location>asfsafs</ns:location>
                                  <ns:locationevent>yyyy</ns:locationevent>
                                  <ns:locations>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Prior</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Current</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Future</ns:name>
                                              </ns:Action>
                                              <ns:Status>Pending</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                  </ns:locations>
                                  <ns:Action>
                                      <ns:name>abc</ns:name>
                                  </ns:Action>
                                  <ns:Action>
                                      <ns:name>def</ns:name>
                                  </ns:Action>
                                  <ID>1234</ID>
                                  <ns:Coverage>
                                      <ns:Action>
                                          <ns:name>deg</ns:name>
                                      </ns:Action>
                                      <ns:Action>
                                          <ns:name>ddd</ns:name>
                                      </ns:Action>
                                  </ns:Coverage>
                              </ns:name>134
                              
                              <ns:name>
                                  <ns:location>asfsafs</ns:location>
                                  <ns:locationevent>yyyy</ns:locationevent>
                                  <ns:locations>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Prior</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Current</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Future</ns:name>
                                              </ns:Action>
                                              <ns:Status>Pending</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                  </ns:locations>
                                  <ns:Action>
                                      <ns:name>abc</ns:name>
                                  </ns:Action>
                                  <ns:Action>
                                      <ns:name>def</ns:name>
                                  </ns:Action>
                                  <ID>123</ID>
                                  <ns:Coverage>
                                      <ns:Action>
                                          <ns:name>deg</ns:name>
                                      </ns:Action>
                                      <ns:Action>
                                          <ns:name>def</ns:name>
                                      </ns:Action>
                                  </ns:Coverage>
                              </ns:name>12345
                              

                              You certainly noticed that, after replacement, the 3 main <ns:name> ....</ns:name> blocks end as below :

                              ...
                              </ns:name>235
                              ...
                              ...
                              </ns:name>134
                              ...
                              ...
                              </ns:name>12345
                              

                              The number, after </ns:name>, recapitulates all the conditions which are TRUE for each block


                              Now, it’s elementary ! We just have to :

                              • Delete any main <ns:name> ....</ns:name> block which does not satisfy all the conditions, i.e. does not have the string 12345 after </ns:name>

                              • Delete the string 12345 after the ending tag of all the blocks which do satisfy all the 5 conditions

                              This can be done with the following S/R :

                              SEARCH (?s-i)^\h*<ns:name>((?!</ns:Coverage>).)+?</ns:Coverage>\R\h*</ns:name>(?!12345)\d*\R|</ns:name>\K12345

                              REPLACE Leave EMPTY
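For readers who would rather run this last step outside Notepad++, here is a minimal Python sketch of the same final S/R. It is only an approximation: Python's `re` module has no `\h`, `\R` or `\K`, so they are replaced by `[ \t]`, `\r?\n` and a look-behind. The `BLOCK` template and its `id` values are hypothetical, shortened stand-ins for the tagged blocks shown above.

```python
import re

# Port of the final S/R. Substitutions made for Python's re module:
#   \h -> [ \t]    \R -> \r?\n    </ns:name>\K12345 -> (?<=</ns:name>)12345
FINAL = re.compile(
    r"^[ \t]*<ns:name>(?:(?!</ns:Coverage>).)+?</ns:Coverage>\r?\n"
    r"[ \t]*</ns:name>(?!12345)\d*\r?\n"
    r"|(?<=</ns:name>)12345",
    re.S | re.M,  # re.S: dot matches newlines, re.M: ^ matches at each line start
)


def keep_fully_tagged(text):
    """Delete blocks whose marker is not 12345, then strip the 12345 marker."""
    return FINAL.sub("", text)


# Hypothetical, shortened tagged block (the real blocks are much longer).
BLOCK = """\
<ns:name>
    <ID>{id}</ID>
    <ns:Coverage>
        <ns:Action>
            <ns:name>def</ns:name>
        </ns:Action>
    </ns:Coverage>
</ns:name>{marker}
"""

TAGGED = (
    BLOCK.format(id="1", marker="235")
    + BLOCK.format(id="2", marker="134")
    + BLOCK.format(id="3", marker="12345")
)

if __name__ == "__main__":
    print(keep_fully_tagged(TAGGED))
```

Only the block tagged 12345 should remain, without its marker. As in the S/R itself, this relies on each main block ending with </ns:Coverage> immediately followed by </ns:name>.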

                              And you’ll get your expected text :

                              
                              
                              
                              <ns:name>
                                  <ns:location>asfsafs</ns:location>
                                  <ns:locationevent>yyyy</ns:locationevent>
                                  <ns:locations>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Prior</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Current</ns:name>
                                              </ns:Action>
                                              <ns:Status>Completed</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                      <ns:location1>
                                          <ns:locationphase>
                                              <ns:Action>
                                                  <ns:name>Future</ns:name>
                                              </ns:Action>
                                              <ns:Status>Pending</ns:Status>
                                          </ns:locationphase>
                                      </ns:location1>
                                  </ns:locations>
                                  <ns:Action>
                                      <ns:name>abc</ns:name>
                                  </ns:Action>
                                  <ns:Action>
                                      <ns:name>def</ns:name>
                                  </ns:Action>
                                  <ID>123</ID>
                                  <ns:Coverage>
                                      <ns:Action>
                                          <ns:name>deg</ns:name>
                                      </ns:Action>
                                      <ns:Action>
                                          <ns:name>def</ns:name>
                                      </ns:Action>
                                  </ns:Coverage>
                              </ns:name>
                              

                              Remark :

                              With this final regex, you could, instead, just keep all the blocks which satisfy the conditions, let’s say, 1, 3 and 4

                              In this specific case, the S/R would become :

                              SEARCH (?s-i)^\h*<ns:name>((?!</ns:Coverage>).)+?</ns:Coverage>\R\h*</ns:name>(?!134)\d*\R|</ns:name>\K134

                              REPLACE Leave EMPTY

                              Best Regards,

                              guy038

                              P.S. : When we get a complete solution, I’ll try to explain the different regexes :-))

                              • ?
                                A Former User
                                last edited by

                                Thanks guy038
                                
                                I will also try from my end.
                                
                                But I want to clarify the requirement. Your understanding of the conditions below is correct.
                                You would like to pick the totality of any main <ns:name> ... </ns:name> block, ONLY IF it respects ALL the below conditions, in this priority order :
                                
                                It contains a tag and value <ns:locationevent>yyyy</ns:locationevent>
                                
                                It contains a tag and value <ID>123</ID>
                                
                                It contains a tag and value <ns:name>Future</ns:name> followed by <ns:Status>Pending</ns:Status> (please add the Status part to this condition too).
                                
                                It contains, at least, one tag and value <ns:name>def</ns:name>, BEFORE the <ns:Coverage> .... </ns:Coverage> block
                                
                                It contains, at least, one tag and value <ns:name>def</ns:name>, INSIDE the <ns:Coverage> .... </ns:Coverage> block
                                
                                
                                
                                
                                **Requirement**: I want to pick only the XML which matches the given conditions 1-5 from a list of files (when I say file, I mean a log file which contains other text too)
                                
                                I need only one command that I will use in the Find in Files option to get the expected XML.
                                
                                • Alan KilbornA
                                  Alan Kilborn @A Former User
                                  last edited by

                                  @vijay-S

                                  If this is just going to be a continue-to-sponge-off-of-Guy type, maybe it is best to take it offline into private emails. I know that Guy has given up his email address in the past in postings, so maybe he will this time as well.

                                  • ?
                                    A Former User
                                    last edited by

                                    guy038

                                    Can you please provide your email address?

                                    • guy038G
                                      guy038
                                      last edited by guy038

                                      Hello, @vijay-s,

                                      You said :

                                      I need only one command that I will use in the Find in Files option to get the expected XML.

                                      I’m really sorry but I cannot ! Even when I tried to concatenate these 5 regexes into a single one, with the free-spacing regex mode, I got erroneous results, just because the process is ordered !


                                      To explain this fact, consider the simple S/R below, which tries to search for 2 conditions, simultaneously, and adds, right after the ending tag </ns:name> the letter A if the string abcd is found OR the letter B if the string efgh is found :

                                      SEARCH (?s)<ns:name>.+?(?:(abcd)|(efgh)).+?</ns:name>\l*\K

                                      REPLACE (?1A)(?2B)

                                      Against this sample text, below :

                                      <ns:name>
                                         This text contains, both, strings "efgh" and "abcd"
                                      </ns:name>
                                      <ns:name>
                                         This text contains, both, strings "efgh" and "abcd"
                                      </ns:name>
                                      

                                      Even if you click several times on the Replace All button, you’ll just find letters B, after </ns:name>, because, when scanning the sample text from left to right, the regex engine meets the efgh string first !

                                      Now, let’s suppose you run this first S/R :

                                      SEARCH (?s)<ns:name>.+?abcd.+?</ns:name>\l*\K

                                      REPLACE A

                                      Then process this second S/R :

                                      SEARCH (?s)<ns:name>.+?efgh.+?</ns:name>\l*\K

                                      REPLACE B

                                      You get, as expected, the string AB, after </ns:name>, meaning that the two conditions are true for each block !
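The difference between the single S/R and the two successive ones can be reproduced outside Notepad++. The following is a minimal Python sketch, not the original S/R: Python's `re` has no `\l` or `\K`, so `[A-Z]*` is used here (my assumption) to skip any letters already appended, and the replacement simply re-emits the whole match followed by the new letter.

```python
import re

# The two-block sample text from the post.
SAMPLE = (
    "<ns:name>\n"
    '   This text contains, both, strings "efgh" and "abcd"\n'
    "</ns:name>\n"
) * 2


def single_pass(text):
    """One regex with an alternation: only the alternative met first is recorded."""
    pattern = re.compile(r"<ns:name>.+?(?:(abcd)|(efgh)).+?</ns:name>[A-Z]*", re.S)

    def tag(match):
        letters = ("A" if match.group(1) else "") + ("B" if match.group(2) else "")
        return match.group(0) + letters

    return pattern.sub(tag, text)


def two_passes(text):
    """One regex per condition, applied in order: each pass appends its letter."""
    for needle, letter in (("abcd", "A"), ("efgh", "B")):
        pattern = re.compile(
            r"<ns:name>.+?" + needle + r".+?</ns:name>[A-Z]*", re.S
        )
        text = pattern.sub(lambda m, letter=letter: m.group(0) + letter, text)
    return text


if __name__ == "__main__":
    print(single_pass(SAMPLE))  # each block ends with </ns:name>B
    print(two_passes(SAMPLE))   # each block ends with </ns:name>AB
```

In the single pass, the lazy `.+?` stops at the first position where either alternative matches, which is `efgh` in this sample, so the `abcd` group never participates in the match.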


                                      Thus, your problem seems beyond the scope of a single regex and needs to be solved with a scripting language or an XML parsing tool !
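As an illustration of the scripting route, here is a minimal Python sketch that pulls the main blocks out of free-form log text and keeps those meeting the five conditions discussed in this thread. Several things are assumptions on my side: a main block starts on a line holding only <ns:name> and ends at the </ns:name> that directly follows </ns:Coverage>, as in the sample text; <ns:Status> comes right after the </ns:Action> enclosing the Future name; and `TEMPLATE` and `LOG` are hypothetical, shortened test data. An XML parser is not used because the sample has no declaration for the ns: prefix and is mixed with other log text.

```python
import re

# A main block: a line holding only <ns:name>, then everything up to the
# </ns:name> that directly follows </ns:Coverage> (layout of the sample text).
BLOCK = re.compile(
    r"^[ \t]*<ns:name>[ \t]*\r?\n(?:(?!</ns:Coverage>).)+?</ns:Coverage>\s*</ns:name>",
    re.S | re.M,
)
EVENT = re.compile(r"<ns:locationevent>(?i:yyyy)</ns:locationevent>")
DEF = re.compile(r"<ns:name>(?i:def)</ns:name>")
# Assumption: <ns:Status> follows the </ns:Action> that closes the "Future" name.
FUTURE = re.compile(
    r"<ns:name>(?i:Future)</ns:name>\s*</ns:Action>\s*"
    r"<ns:Status>(?i:Pending)</ns:Status>"
)


def satisfies_all(block):
    """Return True when the block meets the five conditions of the thread."""
    head, _, coverage = block.partition("<ns:Coverage>")
    return all((
        EVENT.search(block) is not None,    # condition 1
        "<ID>123</ID>" in block,            # condition 2
        FUTURE.search(block) is not None,   # condition 3
        DEF.search(head) is not None,       # condition 4: before <ns:Coverage>
        DEF.search(coverage) is not None,   # condition 5: inside <ns:Coverage>
    ))


def matching_blocks(text):
    """Extract every main block from the text and keep the matching ones."""
    return [m.group(0) for m in BLOCK.finditer(text) if satisfies_all(m.group(0))]


# Hypothetical, shortened block used only to exercise the functions above.
TEMPLATE = """\
<ns:name>
    <ns:locationevent>yyyy</ns:locationevent>
    <ns:locationphase>
        <ns:Action>
            <ns:name>Future</ns:name>
        </ns:Action>
        <ns:Status>Pending</ns:Status>
    </ns:locationphase>
    <ns:Action>
        <ns:name>def</ns:name>
    </ns:Action>
    <ID>{id}</ID>
    <ns:Coverage>
        <ns:Action>
            <ns:name>{cov}</ns:name>
        </ns:Action>
    </ns:Coverage>
</ns:name>
"""

LOG = (
    "some log line before\n"
    + TEMPLATE.format(id="1234", cov="ddd")
    + "some log line between\n"
    + TEMPLATE.format(id="123", cov="def")
    + "some log line after\n"
)

if __name__ == "__main__":
    for block in matching_blocks(LOG):
        print(block)
```

Because every condition is tested independently on an already isolated block, the order of the tests no longer matters. Reading each log file and passing its content to `matching_blocks` is left out of the sketch.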

                                      Best Regards

                                      guy038

                                      P.S. :

                                      In my multi-regex solution, I found yet another error of logic. So, after correction and considering your last requirement, I ended up with these 5 S/R, below :

                                      • SEARCH (?s-i)^\h*<ns:name>((?!</ns:Coverage>).)+?<ns:locationevent>(?i:yyyy)</ns:locationevent>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                      • REPLACE 1

                                      • SEARCH (?s-i)^\h*<ns:name>((?!</ns:Coverage>).)+?<ID>123</ID>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                      • REPLACE 2

                                      • SEARCH (?s-i)^\h*<ns:name>((?!</ns:Coverage>).)+?<ns:name>(?i:Future)</ns:name>.+?<ns:Status>(?i:Pending)</ns:Status>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                      • REPLACE 3

                                      • SEARCH (?s-i)^\h*<ns:name>((?!</ns:Coverage>).)+?<ns:name>(?i:def)</ns:name>.+?<ns:Coverage>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                      • REPLACE 4

                                      • SEARCH (?s-i)^\h*<ns:name>((?!</ns:Coverage>).)+?<ns:Coverage>.+?<ns:name>(?i:def)</ns:name>.+?</ns:Coverage>\R\h*</ns:name>\d*\K

                                      • REPLACE 5

                                      And the last regex, which deletes all main <ns:name> .....</ns:name> blocks, which do not satisfy these 5 conditions, remains identical :

                                      SEARCH (?s-i)^\h*<ns:name>((?!</ns:Coverage>).)+?</ns:Coverage>\R\h*</ns:name>(?!12345)\d*\R|</ns:name>\K12345

                                      REPLACE Leave EMPTY

                                      • ?
                                        A Former User
                                        last edited by

                                        Hi,
                                        
                                        Thanks for your help.
                                        
                                        For the following xml,
                                        
                                        
                                        <ns:Input>
                                        <ns:location>asfsafs</ns:location>
                                        <ns:locationevent>xxxx</ns:locationevent>
                                         <ns:Action>
                                        <ns:name>abc</ns:name>
                                        </ns:Action>
                                        <ns:Action>
                                        <ns:name>ghy</ns:name>
                                        </ns:Action>
                                        <ns:Coverage>
                                        <ns:Action>
                                        <ns:name>deg</ns:name>
                                        </ns:Action>
                                        </ns:Coverage>
                                        </ns:locationevent>
                                        <ns:PPLID>121</ns:PPLID>
                                        </ns:Input>
                                        
                                        <ns:Input>
                                        <ns:location>asfsafs</ns:location>
                                        <ns:locationevent>yyyy</ns:locationevent>
                                          <ns:Action>
                                        <ns:name>abc</ns:name>
                                        </ns:Action>
                                        <ns:Action>
                                        <ns:name>def</ns:name>
                                        </ns:Action>
                                        <ns:Coverage>
                                        <ns:Action>
                                        <ns:name>deg</ns:name>
                                        </ns:Action>
                                        </ns:Coverage>
                                        <ns:PPLID>124</ns:PPLID>
                                        </ns:Input>
                                        
                                        
                                        <ns:Input>
                                        <ns:location>asfsafs</ns:location>
                                        <ns:locationevent>yyyy</ns:locationevent>
                                         <ns:Action>
                                        <ns:name>abc</ns:name>
                                        </ns:Action>
                                        <ns:Action>
                                        <ns:name>def</ns:name>
                                        </ns:Action>
                                        <ns:Coverage>
                                        <ns:Action>
                                        <ns:name>def</ns:name>
                                        </ns:Action>
                                        </ns:Coverage>
                                        <ns:PPLID>123</ns:PPLID>
                                        </ns:Input>
                                        I found the command to pick the xml which should match the following conditions
                                        <ns:Input>..<ns:locationevent>yyyy</ns:locationevent>..<ns:Action>..<ns:name>def</ns:name>..<ns:Coverage>..<ns:Action>..<ns:name>def</ns:name>..<ns:PPLID>124</ns:PPLID>..</ns:Input>
                                        
                                        if I use the below command
                                        
                                        (?s)<ns:Input>((?!</ns:Input>).)*?<ns:locationevent>yyyy</ns:locationevent.*?<ns:Action>.*?<ns:name>def</ns:name>.*?<ns:Coverage>.*?<ns:Action>.*?<ns:name>def</ns:name>.*?124.*?</ns:Input>
                                        
                                        it doesn't find the second block, which should match in the given XML.

                                        Whereas if I use the command below,
                                        
                                        (?s)<ns:Input>((?!</ns:Input>).)*?<ns:locationevent>yyyy</ns:locationevent.*?<ns:Action>.*?<ns:name>def</ns:name>.*?<ns:Coverage>.*?<ns:Action>.*?<ns:name>def</ns:name>.*?123.*?</ns:Input>
                                        
                                        it selects both the second and the third blocks. In this case it should pick only the third. Can you check this?
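One way to see what may be going on, sketched with Python's `re` module rather than Notepad++'s own regex engine (which can differ in details): in the pattern above only the first gap is guarded by `((?!</ns:Input>).)*?`, so each later `.*?` is free to run past `</ns:Input>` into the next block. The `text` below is a shortened stand-in for the last two blocks and assumes the second block also has `locationevent` `yyyy`; note that its Coverage name is `deg`, not `def`. `GUARD` and `guarded_pattern` are names made up for this illustration.

```python
import re

# Shortened stand-in for the last two blocks (assumption: block 1 of this
# sample, the "second XML" of the post, also has locationevent yyyy).
text = (
    "<ns:Input>\n"
    "<ns:locationevent>yyyy</ns:locationevent>\n"
    "<ns:Action>\n<ns:name>def</ns:name>\n</ns:Action>\n"
    "<ns:Coverage>\n<ns:Action>\n<ns:name>deg</ns:name>\n</ns:Action>\n</ns:Coverage>\n"
    "<ns:PPLID>124</ns:PPLID>\n"
    "</ns:Input>\n"
    "<ns:Input>\n"
    "<ns:locationevent>yyyy</ns:locationevent>\n"
    "<ns:Action>\n<ns:name>def</ns:name>\n</ns:Action>\n"
    "<ns:Coverage>\n<ns:Action>\n<ns:name>def</ns:name>\n</ns:Action>\n</ns:Coverage>\n"
    "<ns:PPLID>123</ns:PPLID>\n"
    "</ns:Input>\n"
)

# The "123" pattern from the post, copied verbatim.
leaky_pattern = (
    r"(?s)<ns:Input>((?!</ns:Input>).)*?<ns:locationevent>yyyy</ns:locationevent"
    r".*?<ns:Action>.*?<ns:name>def</ns:name>.*?<ns:Coverage>.*?<ns:Action>"
    r".*?<ns:name>def</ns:name>.*?123.*?</ns:Input>"
)

# A gap that may not step over a closing </ns:Input>.
GUARD = r"(?:(?!</ns:Input>).)*?"

def guarded_pattern(pplid):
    """Same sequence of conditions, but with EVERY gap guarded (hypothetical helper)."""
    parts = [
        "<ns:Input>",
        "<ns:locationevent>yyyy</ns:locationevent>",
        "<ns:Action>",
        "<ns:name>def</ns:name>",
        "<ns:Coverage>",
        "<ns:Action>",
        "<ns:name>def</ns:name>",
        "<ns:PPLID>" + pplid + "</ns:PPLID>",
        "</ns:Input>",
    ]
    return "(?s)" + GUARD.join(re.escape(p) for p in parts)

leaky = re.search(leaky_pattern, text)
print(leaky.group(0).count("<ns:Input>"))    # 2: the match spills into the next block

guarded = re.search(guarded_pattern("123"), text)
print(guarded.group(0).count("<ns:Input>"))  # 1: only the block with PPLID 123

print(re.search(guarded_pattern("124"), text))  # None: that block's Coverage name is "deg"
```

If this reading is right, the `124` search finds nothing because the block with PPLID 124 does not actually satisfy the Coverage condition, and the `123` search starts in that block and only ends in the following one.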
                                        
                                        • guy038

                                          Hi, @vijay-s,

                                          From your last post, I see that you have, once again, changed the general layout of your text:

                                          • The main <ns:name> .... </ns:name> blocks seem to have been replaced with main <ns:Input> .... </ns:Input> blocks

                                          • The <ns:locations>....</ns:locations> parts are absent

                                          • You added <ns:PPLID>xxx</ns:PPLID> zones between the </ns:Coverage> and </ns:Input> lines

                                          • You added another condition, as you want to search for a particular value, 124, in <ns:PPLID>.....</ns:PPLID>

                                          Moreover, you tried to build a single regex that takes all your conditions into account simultaneously, although I explained, in my previous post, that this approach will not work in the general case with the regexes I presented.


                                          So, once and for all, could you please:

                                          • Give us a sample text that covers ALL the possible cases found in your real data (I cannot guess them, obviously!)

                                          • Explain ALL the conditions required for a main XML block to be considered correct

                                          Beware that your full set of requirements may exceed the power of regular expressions and may need other tools!
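As an illustration of what "other tools" could look like, here is a minimal Python sketch that first cuts the text into `<ns:Input>` blocks and then tests each condition separately, instead of packing everything into one regex. It assumes blocks are never nested and that each `<ns:name>` sits on one line; `tag_value`, `matching_blocks` and `sample` are hypothetical names for this example, and the reading of the conditions (one name before `<ns:Coverage>`, one inside it) is an assumption. Plain string operations are used because the fragments shown have no root element or namespace declaration, which a strict XML parser would normally reject.

```python
import re

BLOCK = re.compile(r"<ns:Input>.*?</ns:Input>", re.S)  # one non-nested block
NAME = re.compile(r"<ns:name>(.*?)</ns:name>")          # name values on one line

def tag_value(block, tag):
    """Return the stripped text of the first <ns:tag>...</ns:tag>, or None."""
    m = re.search(r"<ns:%s>(.*?)</ns:%s>" % (tag, tag), block, re.S)
    return m.group(1).strip() if m else None

def matching_blocks(text, event, action_name, coverage_name, pplid):
    """Collect the blocks that satisfy all four conditions."""
    hits = []
    for block in BLOCK.findall(text):
        # Split the block into the part before Coverage and the Coverage body.
        head, _, rest = block.partition("<ns:Coverage>")
        coverage = rest.partition("</ns:Coverage>")[0]
        if (tag_value(block, "locationevent") == event
                and action_name in NAME.findall(head)
                and coverage_name in NAME.findall(coverage)
                and tag_value(block, "PPLID") == pplid):
            hits.append(block)
    return hits

# Shortened, well-formed stand-in for two blocks of the real data.
sample = (
    "<ns:Input>\n<ns:locationevent>yyyy</ns:locationevent>\n"
    "<ns:Action>\n<ns:name>def</ns:name>\n</ns:Action>\n"
    "<ns:Coverage>\n<ns:Action>\n<ns:name>deg</ns:name>\n</ns:Action>\n</ns:Coverage>\n"
    "<ns:PPLID>124</ns:PPLID>\n</ns:Input>\n"
    "<ns:Input>\n<ns:locationevent>yyyy</ns:locationevent>\n"
    "<ns:Action>\n<ns:name>def</ns:name>\n</ns:Action>\n"
    "<ns:Coverage>\n<ns:Action>\n<ns:name>def</ns:name>\n</ns:Action>\n</ns:Coverage>\n"
    "<ns:PPLID>123</ns:PPLID>\n</ns:Input>\n"
)

print(len(matching_blocks(sample, "yyyy", "def", "def", "123")))  # 1
print(len(matching_blocks(sample, "yyyy", "def", "def", "124")))  # 0
```

Because every condition is checked inside a single block, a match can never spill from one block into the next, and each condition can be changed without rewriting the others.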


                                          Just consider all the time wasted by giving, each time, only a part of the whole problem!

                                          When requirements are well defined and all cases well identified, generally, most of the job is done ;-))

                                          BR

                                          guy038

                                          P.S.: Regexes are very sensitive to the text. Even one additional space character, somewhere, may prevent a regular expression from matching the expected piece of text!

                                          • A Former User

                                            
                                            Hi,
                                            
                                            Fixing the latest problem will fix all the others; I will take care of those. Please let me know if this can be fixed. For the given XML:

                                            I need to pick the XML block that meets the conditions <ns:Input>..<ns:locationevent>yyyy</ns:locationevent>..<ns:Action>..<ns:name>def</ns:name>..<ns:Coverage>..<ns:Action>..<ns:name>def</ns:name>..<ns:PPLID>124</ns:PPLID>..</ns:Input>  -- which is the second occurrence in the given XML

                                            I need to pick the XML block that meets the conditions <ns:Input>..<ns:locationevent>yyyy</ns:locationevent>..<ns:Action>..<ns:name>def</ns:name>..<ns:Coverage>..<ns:Action>..<ns:name>def</ns:name>..<ns:PPLID>123</ns:PPLID>..</ns:Input>  -- which is the third occurrence in the given XML
                                            
                                            
                                            
                                            The Community of users of the Notepad++ text editor.