Find parent tag's id if that parent tag has specific tags inside it
Actually, if a <revst> tag is found outside a <task.....>........</task> section, OR if the <task ........>........</task> sections do not contain at least one <revst> tag, like below:
... <task .......> .... .... .... </task> ... ... ... <revst> ... ... ... <task .......> .... .... .... </task> ...
then the regex will not match anything! I suppose that this is the expected behavior?
Now, I need additional text to get a general idea of what you need. Could you provide a real complete example, as a text, in reverse video ?
Could you indicate the zones to mark and the different mandatory conditions to respect ?
Hi @guy038 ,
I am really sorry for the late response.
I am sharing a real data sample here to make the requirement clearer.
<em ..............>
.......................
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
  <effect effrg="710/ALL" efftext="710/ALL"></effect>
  <title>sample text</title>
  <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <title>sample text</title>
    <para><revst>sample para<revend></para>.....
  </subtask>
</task>
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200001" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
  <effect effrg="710/ALL" efftext="710/ALL"></effect>
  <title>sample text</title>
  <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200003" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <title>sample text</title>
    <para>sample para</para>.....
  </subtask>
</task>
. . . . . . . .
</em>
As per the above example, the XML file may contain one or more consecutive tasks. I need to find <revst> between <task>…</task> and get the respective task's “key” value. If a task does not contain a <revst> tag, we don't need that task's key value.
In the above example, the expected output is: TASK-712101200000 as <revst> found in the first task.
Your previous regex works correctly, but only for the first instance of task; it does not find the remaining instances.
I hope this is clear. Kindly let me know if any more data is required.
Thanks for your patience!
Ganesan G
Hello, @ganesan-govindarajan,
Indeed, it’s been a very long time since your last post, in this other topic, which seems related to the same problem!
So, from your example, I understood that you want the TASK-712101200000 value because the current <task.........</task> section contains the <revst> tag!
Now, this section contains the TASK-712101200000 value twice. So:
- Do you want the regex to match the two TASK-712101200000 values?
- Do you want the regex to match the first TASK-712101200000 value only?
And, if the regex must match all the instances of the block, may this case happen :
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
  <effect effrg="710/ALL" efftext="710/ALL"></effect>
  <para><revst>sample para<revend></para>.....
  <title>sample text</title>
  <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <title>sample text</title>
  </subtask>
</task>
Where the <revst> tag comes after the first TASK-############, but before the second TASK-############?
Best Regards,
Hi @guy038 ,
Thanks for responding after such a long time. Yes, that was a very old post of mine.
I need your help on this!!
Actually, each TASK and SUBTASK key value is unique and is never repeated or duplicated anywhere. In fact, the key values of all element types are unique. The duplicate in my sample was a typo, sorry for that.
In my case, I need only the TASK’s key value, irrespective of the other key values inside the <task>…</task>, if one or more <revst> tags exist.
Also, the earlier topic that I asked about and that you pointed to (post 80495) does not exactly match this requirement; that's why I am raising it here.
Thanks again!!
Ganesan. G
Ah…, OK! So, I duplicated your recent example three times, giving the text below, with some leading indentations:
<em ..............>
. .
  <task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <effect effrg="710/ALL" efftext="710/ALL"></effect>
    <title>sample text</title>
    <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200002" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
      <title>sample text</title>
      <para><revst>sample para<revend></para>.....
    </subtask>
  </task>
  <task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200001" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <effect effrg="710/ALL" efftext="710/ALL"></effect>
    <title>sample text</title>
    <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200003" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
      <title>sample text</title>
      <para>sample para</para>.....
    </subtask>
  </task>
  <task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101300000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <effect effrg="710/ALL" efftext="710/ALL"></effect>
    <title>sample text</title>
    <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101300002" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
      <title>sample text</title>
      <para><revst>sample para<revend></para>.....
    </subtask>
  </task>
  <task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101300001" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <effect effrg="710/ALL" efftext="710/ALL"></effect>
    <title>sample text</title>
    <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101300003" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
      <title>sample text</title>
      <para>sample para</para>.....
    </subtask>
  </task>
  <task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101400001" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <effect effrg="710/ALL" efftext="710/ALL"></effect>
    <title>sample text</title>
    <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101400003" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
      <title>sample text</title>
      <para>sample para</para>.....
    </subtask>
  </task>
  <task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101400000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <effect effrg="710/ALL" efftext="710/ALL"></effect>
    <title>sample text</title>
    <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101400002" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
      <title>sample text</title>
      <para><revst>sample para<revend></para>.....
    </subtask>
  </task>
. .
</em>
To get a fair example, I also modified, on purpose, the numbers after the TASK- string, so that they are all unique:
TASK-712101200000 TASK-712101200002 TASK-712101200001 TASK-712101200003 TASK-712101300000 TASK-712101300002 TASK-712101300001 TASK-712101300003 TASK-712101400001 TASK-712101400003 TASK-712101400000 TASK-712101400002
I also swapped the two blocks of the last section, so:
- The <task.... </task> block not containing the <revst> tag comes first
- The <task.... </task> block containing the <revst> tag comes second
And finally, in the last <task.... </task> block, I moved up the line <para><revst>sample para<revend></para>..... right after the line:
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101400000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
And I still confirm that my regex, provided in my last post and reported below:
- Does find the first occurrence of the key value TASK-############, if the present <task.........</task> block does contain a <revst> tag
- Does not find anything if the present <task.........</task> block does not contain the <revst> tag
(?-si)^\h*<task .+"\KTASK.+?(?=(?s:"((?!</task>).)+?)<revst>)
To be convinced:
- Paste the above text in a new tab (Ctrl + N)
- Open the Mark dialog (Ctrl + M)
- Un-tick all box options
- Tick the Wrap around option
- Type in the regex (?-si)^\h*<task .+"\KTASK.+?(?=(?s:"((?!</task>).)+?)<revst>) in the Find what: zone
- Select the Regular expression search mode
- Click on the Mark All button
=> You should get the message: Mark: 3 matches in entire file
- Click on the Copy Marked Text button
- Paste the results in the new tab
=> You should get the additional text, below:
TASK-712101200000 TASK-712101300000 TASK-712101400000
Which corresponds to the first TASK-############ occurrences, found in sections which do contain a <revst> tag!
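For anyone who wants to check the same logic outside Notepad++, here is a minimal Python sketch of that regex. It is only an approximation, under these assumptions: Python's `re` module has no `\K` and no `\h`, so the key is captured in group 1 and `\h*` becomes `[ \t]*`; the inner group of the lookahead is made non-capturing so that `findall` returns plain strings; the names `TASK_KEY_RE`, `keys_of_tasks_with_revst` and the shortened `SAMPLE` text are made up for this illustration.

```python
import re

# Hypothetical, shortened sample: two tasks contain <revst>, one does not.
SAMPLE = """\
<task chapnbr="71" key="TASK-712101200000" seq="000">
  <title>sample text</title>
  <subtask chapnbr="71" key="TASK-712101200002" seq="000">
    <para><revst>sample para<revend></para>
  </subtask>
</task>
<task chapnbr="71" key="TASK-712101200001" seq="000">
  <subtask chapnbr="71" key="TASK-712101200003" seq="000">
    <para>sample para</para>
  </subtask>
</task>
  <task chapnbr="71" key="TASK-712101400000" seq="000">
    <para><revst>sample para<revend></para>
  </task>
"""

# Approximate port of the Notepad++ regex:
#   (?m)        : ^ matches at every line start
#   [ \t]*      : stands in for \h* (optional leading indentation)
#   (TASK.+?)   : captured key value, replaces the \K trick
#   (?=...)     : from the closing quote, <revst> must be reachable
#                 without crossing a </task> (the dot spans lines via (?s:...))
TASK_KEY_RE = re.compile(
    r'(?m)^[ \t]*<task .+"(TASK.+?)(?=(?s:"(?:(?!</task>).)+?)<revst>)'
)


def keys_of_tasks_with_revst(text):
    """Return the key of each <task> block that contains a <revst> tag."""
    return TASK_KEY_RE.findall(text)


if __name__ == "__main__":
    for key in keys_of_tasks_with_revst(SAMPLE):
        print(key)
```

As in the Mark dialog test, the task without a `<revst>` tag is skipped and the subtask keys are never reported, because `<subtask` does not match `<task `.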
Tell me if something seems unclear or if I still misunderstood the whole story !
Hi @guy038
Thank you so much for the prompt responses!!
In the above,
I also swapped the two blocks of the last section, so :
The <task.... </task> block not containing the <revst> tag, comes first
The <task.... </task> block containing the <revst> tag, comes in second
And finally, in the last <task.... </task> block, I move up the line <para><revst>sample para<revend></para>..... right after the line :
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101400000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
These shuffles may not be possible, because there will be tons of tasks in a single file and I would be unable to locate them.
Could you kindly clarify this? Maybe I have overlooked something.
Ganesan.G
Hello, @ganesan-govindarajan,
Well, when I decided to modify your example, by copying it three times, then using only unique key-values, swapping the cases which do and do not match and finally moving up the line containing the <revst> tag, it was just in order to show you that my regex did work with a more general text. So just a proof, in some way!
Now, I don’t want you to be upset by my modifications. So, I can just use your original text which is :
<em ..............>
.......................
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
  <effect effrg="710/ALL" efftext="710/ALL"></effect>
  <title>sample text</title>
  <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <title>sample text</title>
    <para><revst>sample para<revend></para>.....
  </subtask>
</task>
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200001" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
  <effect effrg="710/ALL" efftext="710/ALL"></effect>
  <title>sample text</title>
  <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200003" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
    <title>sample text</title>
    <para>sample para</para>.....
  </subtask>
</task>
. . . . . . . .
</em>
And using my regex against the above text, it would find one occurrence only (TASK-712101200000), on the first line:
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
Now, your example cannot be qualified as a real example. I need a bunch of text of, let’s say,
lines in order to safely test my regex in real conditions!
Best regards,
Hi @guy038
Thanks for checking!
I will provide a real example with more content later.
For now, based on your existing suggestions, I have an idea for getting those IDs, as follows:
- Mark all <task …>…</task>.
- Mark all occurrences of the keyword <revst>.
- Now “Copy Marked Text”.
- Paste into a new file and mark all <task…>…</task> with <revst>.
- Now “Copy Marked Text” again.
- Paste into a new file, remove <revst>, and get the key values of those <task…>…</task>.
Please suggest if you have a better way!!
Ganesan G
Hello, @ganesan-govindarajan,
Yes, I understand your point of view ! So, just follow this road map :
- Open your file in Notepad++
- Open the Mark dialog (Ctrl + M)
- Untick all box options
- Enter the regex (?s)<task (?:(?!</task).)+?<revst>.+?</task> in the Find what: zone
- Tick the Wrap around option
- Select the Regular expression search mode
- Click on the Mark All button
- Click on the Copy Marked Text button
- Open a new file (Ctrl + N)
- Paste the clipboard contents (Ctrl + V) in this new file
- Re-open, if necessary, the Mark dialog, with this new text as current file
- Enter the regex <task .+?\Kkey=".+?" in the Find what: zone
- Keep the other options unchanged
- Re-click on the Mark All button
- Re-click on the Copy Marked Text button
- Open a second new file (Ctrl + N)
- Paste the clipboard contents (Ctrl + V) in this last new file
Here you are : you should get all the key values !!
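The two passes of this road map can also be sketched in Python, for readers who prefer a script to the Mark dialog. This is only a rough equivalent, not a replacement for the steps above: the first regex is used as is, while the second one captures `key="..."` in a group because Python's `re` has no `\K`. The names `BLOCK_RE`, `KEY_RE`, `task_keys_two_pass` and the shortened `SAMPLE` text are assumptions made for the illustration.

```python
import re

# Hypothetical, shortened sample: two tasks contain <revst>, one does not.
SAMPLE = """\
<task chapnbr="71" key="TASK-712101200000" seq="000">
  <subtask chapnbr="71" key="TASK-712101200002" seq="000">
    <para><revst>sample para<revend></para>
  </subtask>
</task>
<task chapnbr="71" key="TASK-712101200001" seq="000">
  <subtask chapnbr="71" key="TASK-712101200003" seq="000">
    <para>sample para</para>
  </subtask>
</task>
  <task chapnbr="71" key="TASK-712101400000" seq="000">
    <para><revst>sample para<revend></para>
  </task>
"""

# Pass 1: whole <task ...>...</task> blocks that contain a <revst> tag.
BLOCK_RE = re.compile(r'(?s)<task (?:(?!</task).)+?<revst>.+?</task>')

# Pass 2: the key="..." attribute of the opening <task ...> tag.
# The capture group stands in for the \K of the Notepad++ regex.
KEY_RE = re.compile(r'<task .+?(key=".+?")')


def task_keys_two_pass(text):
    """Mimic the two Mark All / Copy Marked Text passes."""
    results = []
    for block in BLOCK_RE.findall(text):   # pass 1: keep matching blocks only
        match = KEY_RE.search(block)        # pass 2: first key of the block
        if match:
            results.append(match.group(1))
    return results


if __name__ == "__main__":
    print("\n".join(task_keys_two_pass(SAMPLE)))
```

Like the second regex of the road map, `KEY_RE` expects at least one attribute before `key=` in the opening tag, which is the case in the samples of this thread.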
Best Regards,
See also the FAQ about not using regexes to parse XML documents.
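As an illustration of that advice, here is a minimal parser-based sketch in Python. In the samples as posted, `<revst>` and `<revend>` have no closing tags, so a strict XML parser would most likely reject the text; the sketch therefore uses the lenient `html.parser` module from the standard library instead. The assumptions are that `<task>` elements are not nested inside one another and that every `<task>` carries a `key` attribute; the names `RevstTaskFinder`, `find_task_keys` and the shortened `SAMPLE` text are made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical, shortened sample: two tasks contain <revst>, one does not.
SAMPLE = """\
<task chapnbr="71" key="TASK-712101200000" seq="000">
  <title>sample text</title>
  <subtask chapnbr="71" key="TASK-712101200002" seq="000">
    <para><revst>sample para<revend></para>
  </subtask>
</task>
<task chapnbr="71" key="TASK-712101200001" seq="000">
  <subtask chapnbr="71" key="TASK-712101200003" seq="000">
    <para>sample para</para>
  </subtask>
</task>
<task chapnbr="71" key="TASK-712101400000" seq="000">
  <para><revst>sample para<revend></para>
</task>
"""


class RevstTaskFinder(HTMLParser):
    """Collect the key of every <task> that contains at least one <revst>."""

    def __init__(self):
        super().__init__()
        self.current_key = None  # key of the <task> being read, if any
        self.keys = []           # keys of the tasks that contain a <revst>

    def handle_starttag(self, tag, attrs):
        if tag == "task":
            # Remember the key of the task we have just entered.
            self.current_key = dict(attrs).get("key")
        elif tag == "revst" and self.current_key is not None:
            # Record each task once, even if it holds several <revst> tags.
            if self.current_key not in self.keys:
                self.keys.append(self.current_key)

    def handle_endtag(self, tag):
        if tag == "task":
            # Leaving the task: a later <revst> no longer belongs to it.
            self.current_key = None


def find_task_keys(text):
    """Return the keys of the <task> elements that contain a <revst> tag."""
    finder = RevstTaskFinder()
    finder.feed(text)
    finder.close()
    return finder.keys


if __name__ == "__main__":
    print(find_task_keys(SAMPLE))
```

Because the parser tracks which `<task>` is currently open, a `<revst>` found outside any task is simply ignored, and the subtask keys never interfere.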