Find All in Current Document doesn't provide the father object



  • I have a structured XML file in which each <object> holds many pieces of information.
    Example:

    <object name="A">
    info1 = 3
    info2 = 5
    info3 = 7
    </object>
    <object name="B">
    info1 = 5
    info2 = 8
    info3 = 0
    </object>

    When I use “Find All in Current Document” to search for “info1”, I get every line that contains the text “info1”:

    info1=3
    info1=5

    but I don’t know which father (parent) object each line belongs to. Since I have thousands of objects and thousands of info entries, it is difficult to tell which object has info1 set to 3 and which has info1 set to 5. So I wonder whether there is a way to get the following result instead:

    <object name="A"> / info1=3
    <object name="B"> / info1=5

    Is there any way to do this?
    Thanks a lot



  • @Massimo-Araldi

    It’s kind of hard to know what you are looking for here. I mean, we kind-of know, but we can’t read your mind and know what your ultimate end goal is (this happens A LOT with this type of question). I’m sensing that it will be hard to please you fully with any solution on this…

    Also, because you didn’t bother to put your sample data into a code-block (where it will be represented verbatim, instead of processed by the markdown engine on this website), maybe your data doesn’t truly look like what a reader here thinks it does…

    A lot of supposition here…

    So maybe you are just interested in knowing what the “pairs” are in a concise format…

    You might try this:

    Find what zone: (?s-i)<object name="([^"]+)">.*?info1 = (\d+).*?</object>
    Replace with zone: \1 -> \2
    Search mode: Regular expression
    Action: Replace one at a time, or Replace All (possibly with Wrap around ticked)

    This would change data like you’ve specified:

    <object name="A">
    info1 = 3
    info2 = 5
    info3 = 7
    </object>
    <object name="B">
    info1 = 5
    info2 = 8
    info3 = 0
    </object>
    <object name="C">
    info1 = 12
    info2 = 8
    info3 = 0
    </object>
    

    into the “pairs” relation that follows:

    A -> 3
    B -> 5
    C -> 12
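
    For readers who would rather check the pattern outside Notepad++, here is a minimal Python sketch of the same idea. It assumes the data looks exactly like the sample above; `SAMPLE`, `PAIR` and `pairs` are names made up for this illustration. Python's `re` module does not accept the `(?s-i)` prefix, so `re.DOTALL` is passed as a flag instead.

```python
import re

# Sample data copied from the post above.
SAMPLE = """<object name="A">
info1 = 3
info2 = 5
info3 = 7
</object>
<object name="B">
info1 = 5
info2 = 8
info3 = 0
</object>
<object name="C">
info1 = 12
info2 = 8
info3 = 0
</object>
"""

# Same pattern as the "Find what" zone. re.DOTALL stands in for (?s), and
# Python's re is case-sensitive by default, which covers the -i part.
PAIR = re.compile(r'<object name="([^"]+)">.*?info1 = (\d+).*?</object>', re.DOTALL)


def pairs(text):
    """Return one 'name -> value' string per <object> block that matched."""
    return [f"{name} -> {value}" for name, value in PAIR.findall(text)]


print("\n".join(pairs(SAMPLE)))
```

    Like the replacement itself, this only behaves well when every object really contains an info1 line.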
    


  • Hello, @massimo-araldi, @scott-sumner and All,

    I think that this task would be easily achieved with a scripting language, for instance through the Python or Lua N++ plugins !

    However, here are 3 simple work-arounds that approximate what you need :-))

    We’ll always use the sample text below, throughout this post :

    <object name="A">
    info1 = 3
    info2 = 5
    info3 = 7
    </object>
    <object name="B">
    info1 = 5
    info3 = 0
    info5 = 10
    info6 = 
    info7 = -2
    </object>
    <object name="C">
    info2 = 9
    info7 = 1
    </object>
    <object name="D">
    info1 = 4
    info3 = 2
    </object>
    <object name="E">
    info2 = 8
    info4 = 6
    </object>
    

    A) If you just want to know which father object(s) contain a specific info# value, in the Find result panel :

    Use the generic regex (?-i)<object name="\K\w+(?="[^/]+?<Text to Find>), where <Text to Find> must be replaced with any info# value

    So, if the Find what: zone contains, for instance, the regex (?-i)<object name="\K\w+(?="[^/]+?info3), with the Wrap around option ticked and the Regular expression search mode selected, clicking on the Find All in Current Document button would display, in the Find result panel :

    Search "(?-i)<object name="\K\w+(?="[^/]+?info3)" (3 hits in 1 file)
      new 3 (3 hits)
    	Line 1: <object name="A">
    	Line 6: <object name="B">
    	Line 17: <object name="D">
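
    The same lookup can be sketched in Python, for anyone who wants to test it on a copy of the data first. This is only an approximation of method A) : Python's `re` has no `\K`, so a capturing group is used instead, and `SAMPLE` and `objects_containing` are hypothetical names for this example.

```python
import re

# The sample text of this post, one list item per line.
SAMPLE = "\n".join([
    '<object name="A">', 'info1 = 3', 'info2 = 5', 'info3 = 7', '</object>',
    '<object name="B">', 'info1 = 5', 'info3 = 0', 'info5 = 10', 'info6 = ',
    'info7 = -2', '</object>',
    '<object name="C">', 'info2 = 9', 'info7 = 1', '</object>',
    '<object name="D">', 'info1 = 4', 'info3 = 2', '</object>',
    '<object name="E">', 'info2 = 8', 'info4 = 6', '</object>',
])


def objects_containing(text, needle):
    """Return (line number, object name) for each object that contains needle."""
    # [^/]+? cannot cross the '/' of </object>, so the look-ahead stays
    # inside the current object, exactly as in the Notepad++ regex.
    pattern = re.compile(
        r'<object name="(\w+)(?="[^/]+?' + re.escape(needle) + ')'
    )
    hits = []
    for match in pattern.finditer(text):
        line_no = text.count("\n", 0, match.start()) + 1
        hits.append((line_no, match.group(1)))
    return hits


for line_no, name in objects_containing(SAMPLE, "info3"):
    print(f'Line {line_no}: <object name="{name}">')
```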
    

    B) If you want to get both the info# = # line AND its corresponding father object, in the Find result panel :

    • Copy all your file contents into a new tab

    • Switch to that new tab

    • Open the Replace Dialog

    SEARCH \R(?!<object)

    REPLACE \x20

    • With the same options as above, click on the Replace All button

    You should get the following text :

    <object name="A"> info1 = 3 info2 = 5 info3 = 7 </object>
    <object name="B"> info1 = 5 info3 = 0 info5 = 10 info6 =  info7 = -2 </object>
    <object name="C"> info2 = 9 info7 = 1 </object>
    <object name="D"> info1 = 4 info3 = 2 </object>
    <object name="E"> info2 = 8 info4 = 6 </object>
    
    • Now, in the Find tab, simply search for info3 and click on the Find All in Current Document button

    You’ll get the expected text :

    Search "info3" (3 hits in 1 file)
      new 3 (3 hits)
    	Line 1: <object name="A"> info1 = 3 info2 = 5 info3 = 7 </object>
    	Line 2: <object name="B"> info1 = 5 info3 = 0 info5 = 10 info6 =  info7 = -2 </object>
    	Line 4: <object name="D"> info1 = 4 info3 = 2 </object>
    

    Instead of highlighting the info3 zone alone, you may prefer to highlight a larger zone ! In that case, with the Regular expression search mode and the Wrap around option set, use one of the following regexes :

    info3 = \d+

    "\w+.+info3

    "\w+.+info3 = \d+

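    As a rough Python equivalent of method B), the sketch below joins each object onto one line and then lists the lines that contain the searched text. It is written under the assumption that the file uses LF or CRLF line endings, since `\R` does not exist in Python's `re`; `SAMPLE`, `flatten` and `find_all` are names invented for this example.

```python
import re

# The sample text of this post, one list item per line.
SAMPLE = "\n".join([
    '<object name="A">', 'info1 = 3', 'info2 = 5', 'info3 = 7', '</object>',
    '<object name="B">', 'info1 = 5', 'info3 = 0', 'info5 = 10', 'info6 = ',
    'info7 = -2', '</object>',
    '<object name="C">', 'info2 = 9', 'info7 = 1', '</object>',
    '<object name="D">', 'info1 = 4', 'info3 = 2', '</object>',
    '<object name="E">', 'info2 = 8', 'info4 = 6', '</object>',
])


def flatten(text):
    """Replace every line break that is not followed by <object with a space."""
    # \r?\n covers LF and CRLF files; it plays the role of \R here.
    return re.sub(r'\r?\n(?!<object)', ' ', text)


def find_all(text, needle):
    """Return (line number, line) for each flattened line containing needle."""
    lines = flatten(text).splitlines()
    return [(n, line) for n, line in enumerate(lines, 1) if needle in line]


for n, line in find_all(SAMPLE, "info3"):
    print(f"Line {n}: {line}")
```
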

    C) If you want to get results like Object X -> info# = #, without using the Find result panel (Scott's method) :

    • Copy all your file contents into a new tab

    • Switch to that new tab

    • Open the Replace Dialog

    SEARCH (?s-i)<object name="(\w+)">[^/]+?(info1 = \d+).*?</object>|<object.+?</object>\R

    REPLACE ?1Object \1 -> \2

    • With the same options as above, click on the Replace All button

    You should obtain :

    Object A -> info1 = 3
    Object B -> info1 = 5
    Object D -> info1 = 4
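
    For comparison, here is a hedged Python sketch of the same two-alternative replacement. The conditional replacement `?1...` is emulated with a callback function, and the trailing line break of the second alternative is made optional so that a last object without a final line break is removed too (that is a small deviation from the regex above). `SAMPLE`, `PATTERN` and `summarize` are hypothetical names.

```python
import re

# The sample text of this post, one list item per line.
SAMPLE = "\n".join([
    '<object name="A">', 'info1 = 3', 'info2 = 5', 'info3 = 7', '</object>',
    '<object name="B">', 'info1 = 5', 'info3 = 0', 'info5 = 10', 'info6 = ',
    'info7 = -2', '</object>',
    '<object name="C">', 'info2 = 9', 'info7 = 1', '</object>',
    '<object name="D">', 'info1 = 4', 'info3 = 2', '</object>',
    '<object name="E">', 'info2 = 8', 'info4 = 6', '</object>',
])

PATTERN = re.compile(
    # 1st alternative: an object that contains info1 (groups 1 and 2 are set).
    r'<object name="(\w+)">[^/]+?(info1 = \d+).*?</object>'
    # 2nd alternative: any other object, plus its line break if there is one.
    r'|<object.+?</object>(?:\r?\n)?',
    re.DOTALL,
)


def summarize(text):
    """Rewrite matching objects as 'Object X -> info1 = N' and drop the others."""
    def replacement(match):
        if match.group(1):
            return f"Object {match.group(1)} -> {match.group(2)}"
        return ""
    return PATTERN.sub(replacement, text)


print(summarize(SAMPLE))
```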
    

    Remarks to Scott :

    • Like you, I was searching for a solution and found the A) and B) solutions, first, because I thought that Massimo wanted to use the Find result dialog, exclusively. I read your solution, and I tried your regex, of course !

    • I, then, realized that if some objects do not have a specific info# value, like in my sample text, your regex does not delete the corresponding <object name…</object> areas ! Hence the second alternative of my regex, <object.+?</object>\R, which still runs under the leading (?s) modifier : it matches those objects and, as the replacement is empty for them, deletes them ;-))

    • Moreover, the [^/]+? part means that the area between the object name and the searched info# must not contain any / symbol ! That is to say, the match cannot run past the closing </object> tag, so it stays inside a single <object name...</object> area, of course !

    Best Regards,

    guy038



  • @guy038

    Yeah…hard to know really what is wanted or what direction something could be going… In an effort to be helpful, I take a few minutes and put something quick together which hopefully elicits more info from the OP, or maybe (rarely) manages to hit the bullseye. Sure, we can guess and propose 3 or 4 (or 9 or 10!) alternatives on what might be useful to the posters, but I’m not doing that! :-)

    I don’t know if you’ve noticed on these type of postings, I’ll reply, and then if the OP replies to that and completely changes the specification for what is wanted, I’ll bail out because I’m simply not interested in “chasing the wind”…



  • Hi, @scott-sumner,

    I do agree with your last sentence. For sure, people should take care to explain their needs and provide examples in such a way that as little ambiguity as possible remains !

    Your last post wasn’t useless, anyway ! Indeed, as I’m French, I’ve just learned some other English-American words and expressions : to elicit, to hit the bull’s eye, to bail out and the very comprehensive expression “to chase the wind” ;-))

    Cheers,

    guy038



  • Hello,
    sorry for my late answer… I had some unexpected issues in the meantime.
    Sorry for my unclear description; it’s my first time on the forum. By the way, I think that your suggestions hit my target, I just need time to better understand how to implement them…
    In any case, I understand that I need to work more on this, because it is possible to obtain the result by implementing your suggestions… I just need to work hard.
    Thanks a lot



  • @Massimo Araldi
    The information about the father object is available using Notepad++ and the XPatherizerNpp plugin:
    all you have to do is search with an XPath expression and watch the results window.
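
    The underlying idea (parse the file as XML and let the parser keep track of the parent) can also be sketched outside the plugin. The Python example below uses only the standard library and says nothing about how XPatherizerNpp itself works. It assumes the info lines are plain text inside each <object>, and it wraps the sample in a made-up <root> element because the fragment has several top-level elements; `FRAGMENT` and `parents_of` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Sample from the first post, written with straight quotes.
FRAGMENT = "\n".join([
    '<object name="A">', 'info1 = 3', 'info2 = 5', 'info3 = 7', '</object>',
    '<object name="B">', 'info1 = 5', 'info2 = 8', 'info3 = 0', '</object>',
])


def parents_of(fragment, key):
    """Return (object name, value) for every 'key = value' line found."""
    # Assumption: a wrapper element is needed to make the fragment well-formed.
    root = ET.fromstring(f"<root>{fragment}</root>")
    hits = []
    for obj in root.findall("./object"):
        for line in (obj.text or "").splitlines():
            name, sep, value = line.partition("=")
            if sep and name.strip() == key:
                hits.append((obj.get("name"), value.strip()))
    return hits


for name, value in parents_of(FRAGMENT, "info1"):
    print(f'<object name="{name}"> / info1={value}')
```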

