Find parent tag's id if that parent tag has specific tags inside it
-
@neil-schipper said in Find parent tag's id if that parent tag has specific tags inside it:
Why didn’t you clear up the confusion about a second bold item in your original post?
This is perhaps better restated as: People that don’t know how to adequately ask for help don’t deserve to receive any.
Start (over) by reading THIS.
-
@neil-schipper
Sorry, I missed that…
Yes, it is absolutely correct as you stated.
For the second bold item, I used the B button at the top of the message box. But I am not sure what went wrong…
Thanks for reiterating my question.
-
Hum…, I’m certainly wrong about your needs ! But the worst thing is that your explanations still confuse me !
- Regarding the second bold item, are you speaking about the value BBAACC, right after the <revst> tag, in the first block ?
- You also said :
we need to copy that key value and search next task respectively
Then, do you mean that, in addition to the condition about the two bold strings, it should also match the complete next <task>•••••</task> block ?
So, @ganesan-govindarajan, before answering us, the best would be to read these general posts, in that order :
Best Regards,
guy038
-
-
Hi, @guy038
I am really sorry for whatever I did wrong in this request for help. I didn’t realize my words could hurt people reading them.
Okay, let me explain in a little more detail with my limited knowledge; please forgive me.
Notepad++ v8.1.9.2 (64-bit)
Build time : Nov 21 2021 - 04:30:20
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line :
Admin mode : OFF
Local Conf mode : OFF
Cloud Config : OFF
OS Name : Windows 10 Enterprise (64-bit)
OS Version : 1809
OS Build : 17763.1935
Current ANSI codepage : 1252
Plugins : ComparePlugin.dll DSpellCheck.dll IndentByFold.dll mimeTools.dll NppConverter.dll NppExec.dll NppExport.dll NppToolBucket.dll Python Indent.dll XMLTools.dll
I have tried to restate my question below.
Here is the data I currently have (“before” data):
<task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000000800”> <title>AAA</title> <para>BBAACC</para> <para><revst>BBAACC</para> </task> <task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000000801”> <title>AAA</title> <para>BBAACC</para> <para>BBAACC</para> </task> <task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000000805”> <title>AAA</title> <para>BBAACC</para> <para>BBAACC</para> </task> <task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000002801”> <title>AAA</title> <para><revst>BBAACC</para> <para>BBAACC</para><revst><para>BBAACC</para> <para>BBAACC</para> </task> <task chapnbr=“71” chg=“U” func=“000” key=“TASK-710002003801”> <title>AAA</title> <para>BBAACC</para> <para>BBAACC</para><para>BBAACC</para> <para><revst>BBAACC</para> </task> <task chapnbr=“71” chg=“U” func=“000” key=“TASK-725000002801”> <title>AAA</title> <para>BBAACC</para> <para>BBAACC</para><para>BBAACC</para> <para>BBAACC</para> </task>
Here is how I would like that data to look (“after” data):
TASK-710000000800 TASK-710000002801 TASK-710002003801
To accomplish this, I have tried using the following Find/Replace expressions and settings
Find What = <revst>
Search Mode = REGULAR EXPRESSION
Dot Matches Newline = NOT CHECKED
Thanks
-
Hello, @ganesan-govindarajan and All,
I suppose that your before text was this one, with the use of the normal double quote character " :
<task chapnbr="71" chg="U" func="000" key="TASK-710000000800"> <title>AAA</title> <para>BBAACC</para> <para><revst>BBAACC</para> </task> <task chapnbr="71" chg="U" func="000" key="TASK-710000000801"> <title>AAA</title> <para>BBAACC</para> <para>BBAACC</para> </task> <task chapnbr="71" chg="U" func="000" key="TASK-710000000805"> <title>AAA</title> <para>BBAACC</para> <para>BBAACC</para> </task> <task chapnbr="71" chg="U" func="000" key="TASK-710000002801"> <title>AAA</title> <para><revst>BBAACC</para> <para>BBAACC</para><revst><para>BBAACC</para> <para>BBAACC</para> </task> <task chapnbr="71" chg="U" func="000" key="TASK-710002003801"> <title>AAA</title> <para>BBAACC</para> <para>BBAACC</para><para>BBAACC</para> <para><revst>BBAACC</para> </task> <task chapnbr="71" chg="U" func="000" key="TASK-725000002801"> <title>AAA</title> <para>BBAACC</para> <para>BBAACC</para><para>BBAACC</para> <para>BBAACC</para> </task>
and not with the smart double quotes “ and ” ( Unicode characters \x{201C} and \x{201D} )
If so :
- Open the Mark dialog ( Ctrl + M )
- Type in (?-s)\h*<task .+"\K.+(?=(?s:">((?!</task>).)+?)<revst>) in the Find what : zone
- Tick the Bookmark line option
- Tick the Wrap around option
- Select the Regular expression search mode
- Click on the Mark All button
You’ll get this text :
- Click on the Copy Marked Text button
- Open a new tab ( Ctrl + N )
- Paste all the TASK-71xxxxxxxxxx items
Now, if you really use the smart double quotes, just tell me and I’ll slightly modify the regex !
Best Regards,
guy038
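If the data does turn out to contain smart quotes, one option is to normalize them before searching rather than changing the regex. The following is only a minimal Python sketch of that idea, run outside Notepad++; the function name and the sample string are made up for illustration.

```python
# Map the two smart double quotes (U+201C, U+201D) to the plain double quote.
SMART_TO_PLAIN = str.maketrans({"\u201C": '"', "\u201D": '"'})


def normalize_quotes(text: str) -> str:
    """Return text with smart double quotes replaced by the ASCII double quote."""
    return text.translate(SMART_TO_PLAIN)


# Hypothetical sample line, shortened from the data posted in this thread.
sample = "<task chapnbr=\u201C71\u201D key=\u201CTASK-710000000800\u201D>"
print(normalize_quotes(sample))
# prints: <task chapnbr="71" key="TASK-710000000800">
```

Once the quotes are normalized, the regex above can be used unchanged.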
-
-
Hi @guy038
Thanks much for the help!!
Yes, those are normal quotes, not smart quotes; they were converted automatically by another application.
But I forgot to tell you one thing: after the key value, each task also contains other attributes, as follows,
<task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000000801” date="1234567" chg="R"..etc>
Due to this, if I use the above regex, it only finds the first line of the file.
Please advise me how to proceed with the above regex.
Thanks
-
Hi, @ganesan-govindarajan and All,
Ah… OK ! So, this second regex version, where I explicitly search for the string TASK, with that exact case, becomes :
(?-si)^\h*<task .+"\KTASK.+?(?=(?s:"((?!</task>).)+?)<revst>)
For instance, in the text below, where, again, I changed some smart quotes to normal quotes :
<task chapnbr="71" chg="U" func="000" key="TASK-710000000801" date="1234567" chg="R"> <title>AAA</title> <para>BBAACC</para> <para><revst>BBAACC</para> </task>
It’ll highlight the string TASK-710000000801
Note that I also assume that there is only ONE key attribute in the entire <task ......... line !
BR
guy038
If you would like additional information about how this regex works, just tell me !
-
Please note: The following is only true if you accidentally wrote <revst> instead of <revst/> or <revst> ... </revst>. If your document only contains <revst>, it is not a well-formed XML document because it lacks the related closing tag, and the method I’m explaining below doesn’t work.
A more reliable and much more hassle-free method to search for things in XML documents is XPath, a query language that was explicitly designed for that use case.
To be able to use XPath you need to install the XML Tools plugin, available via Npp’s built-in Plugins Admin. After installing it, navigate to (menu) Plugins -> XML Tools -> Evaluate XPath expression. In the dialog box popping up, enter //task[descendant::revst]/@key into the upper input field. The plugin will generate the result in the lower ListView.
You can copy the result to your clipboard by clicking the corresponding button. After pasting it to an empty document, you only have to do some column selection to remove the content of the ListView’s first two columns, which are useless for you but have been copied to the clipboard as well.
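For readers who want to try the same idea outside Notepad++, here is a rough Python sketch using the standard library's ElementTree. It only works under the condition stated above (well-formed XML, so revst is written as an empty element). ElementTree supports only a subset of XPath, so the descendant test is done with find() instead of the descendant:: axis. The sample document, its root element and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample: <revst/> is written as an empty element.
SAMPLE = """<root>
  <task key="TASK-710000000800"><title>AAA</title><para><revst/>BBAACC</para></task>
  <task key="TASK-710000000801"><title>AAA</title><para>BBAACC</para></task>
  <task key="TASK-710002003801"><title>AAA</title><para>BBAACC</para><para><revst/>BBAACC</para></task>
</root>"""


def keys_of_tasks_with_revst(xml_text):
    """Return the key attribute of every task element that contains a revst element."""
    root = ET.fromstring(xml_text)
    keys = []
    for task in root.iter("task"):
        # ".//revst" looks for a revst element at any depth below this task.
        if task.find(".//revst") is not None:
            keys.append(task.get("key"))
    return keys


print(keys_of_tasks_with_revst(SAMPLE))
# prints: ['TASK-710000000800', 'TASK-710002003801']
```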
-
Hi @guy038,
Thanks much for your help, it’s working fine!!
However, I didn’t realize that the format of the task element may differ from file to file, as the position of the key attribute may change from task to task, as shown below,
<task breaknbr="00" chapnbr="71" chg="U" func="000" key="TASK-710000000800" pgblknbr="301" revdate="19950301" sectnbr="00" seq="800" subjnbr="00">
The key attribute comes at the 4th or 5th position, or its position changes from file to file. Due to this, the above regex cannot find it in some of the sgm files.
Can you please help me out on this?
Thanks again.
-
Hi @dinkumoil ,
Thanks for asking!!
Actually, we don’t worry about the closing tag, as we only need to find the opening tag. Also, this is an sgm file and the closing tag is different: “<revend>”.
thanks.
-
Hello @ganesan-govindarajan and All,
Sorry, but I do not understand :-(
If I use, for instance, this text :
<task breaknbr="00" chapnbr="71" chg="U" func="000" key="TASK-710000000800" pgblknbr="301" revdate="19950301" sectnbr="00" seq="800" subjnbr="00"> <title>AAA</title> <para>BBAACC</para> <para><revst>BBAACC</para> </task>
The regex does mark the string TASK-710000000800. No difference from before ! I’m probably missing some details.
BR
guy038
-
Hi @guy038 ,
When I tried the regex, it only selected the first line of the file. That’s the reason I thought the issue was caused by the task format.
I don’t know why it selects only the first line of the file.
-
Hi @guy038 ,
Do we need to maintain the same text format as in the above example underneath the “task”? I ask because my new file has a lot of content and tags after the task element.
-
Hi, @ganesan-govindarajan and All,
I’ll try to briefly explain how this regex works and you should be able to verify if it meets your needs !
So, the regex (?-si)^\h*<task .+"\KTASK.+?(?=(?s:"((?!</task>).)+?)<revst>), used with the Mark dialog, means :
- From the beginning of a line, it first finds optional leading blank characters, followed with the string <task, in lower case, followed with a space char, followed by anything till a double quote ( The one right before the string TASK ! )
- Then the \K regex syntax resets the present search. Thus, it just looks for the string TASK, in upper case, followed with any non-null range of chars, till the nearest double quote char ( The one right after the string TASK-xxxxxxxxxx )
- But ONLY IF two additional conditions exist :
  - It must be followed, further on, with a <revst> tag
  - No ending tag </task> may occur between the key value and that <revst> tag ( the look-ahead itself does not check for the </task> which comes after the <revst> tag )
- This implies that, IF the </task> string is found before a <revst> tag, the regex will not match !
Globally, it can find any TASKxx....xxx string, if it is followed, downwards, with a <revst> tag located before the next ending </task> tag ! So :
- You may have many lines, or none, between the line <task.........> and the <revst> tag
- You may have many lines, or none, after the <revst> tag till the ending tag </task>
For instance, even the short line, below :
<task key="TASK:AB12345"><para><revst>BBAACC</para></task>
Or this minimum one :
<task key="TASK:AB12345"><revst></task>
would mark the string TASK:AB12345 !
Of course, I assume that :
- The part <task ................> begins a line, after possible blank characters, only
- The part key="TASKxxxxx" is located on a single line
If other cases may occur, just tell me !
Best Regards,
guy038
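The explanation above can be checked outside Notepad++ with a rough Python translation. This is only a sketch, not the regex used in the Mark dialog: Python's re module has neither \K nor \h, so the key value is taken from a capturing group and [ \t]* stands in for the horizontal blanks. The scoped (?s:...) group needs Python 3.6 or later. The multi-line sample is adapted from the text posted earlier in this thread.

```python
import re

# Python's re has no \K and no \h: use a capturing group and [ \t]* instead.
TASK_KEY = re.compile(
    r'''
    ^[ \t]*<task\ .+"                 # line start, blanks, "<task ", anything up to a double quote
    (TASK.+?)                         # the key value: TASK, then the shortest run of chars ...
    (?=                               # ... which must be followed by:
        (?s:"(?:(?!</task>).)+?)      # its closing quote, then chars (newlines too) not crossing </task>
        <revst>                       # up to a <revst> tag
    )
    ''',
    re.MULTILINE | re.VERBOSE,
)

SAMPLE = '''\
<task chapnbr="71" chg="U" func="000" key="TASK-710000000801" date="1234567" chg="R">
<title>AAA</title>
<para>BBAACC</para>
<para><revst>BBAACC</para>
</task>
<task chapnbr="71" chg="U" func="000" key="TASK-710000000805" date="1234567" chg="R">
<title>AAA</title>
<para>BBAACC</para>
</task>
'''

print(TASK_KEY.findall(SAMPLE))
# prints: ['TASK-710000000801']

# The two short one-line examples given above behave the same way.
print(TASK_KEY.findall('<task key="TASK:AB12345"><revst></task>'))
# prints: ['TASK:AB12345']
```

The second task is skipped because its </task> is reached before any <revst> tag.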
-
-
Hi @guy038 ,
Your explanation is very clear!! Thanks much for that!!
This implies that, IF the </task> string is found before a <revst> tag, the regex will not match ! )
Yes, a <revst> tag may appear before or after the <task>…</task> element inside the file, as other elements also contain the same <revst> tags. Is this the reason the regex does not work?
-
Hi @guy038
May be my tasks starts and ends with as below,
-
Actually, if a <revst> tag is found outside a <task.....>........</task> section OR if <task ........>........</task> sections do not contain, at least, one <revst> tag, like below :
... <task .......> .... .... .... </task> ... ... ...<revst>... ... ... <task .......> .... .... .... </task> ...
The regex will not match anything ! I suppose that this behavior is the one which is expected !?
Now, I need additional text to get a general idea of what you need. Could you provide a real complete example, as a text, in reverse video ?
Could you indicate the zones to mark and the different mandatory conditions to respect ?
Thanks,
BR
guy038
-
Hi @guy038 ,
I really sorry for the late response.
I am just trying to share the real time data sample here to understand clearly.
<em ..............> ....................... <task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01"> <effect effrg="710/ALL" efftext="710/ALL"></effect> <title>sample text</title> <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01"> <title>sample text</title> <para><revst>sample para<revend></para>..... </subtask> </task> <task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200001" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01"> <effect effrg="710/ALL" efftext="710/ALL"></effect> <title>sample text</title> <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200003" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01"> <title>sample text</title> <para>sample para</para>..... </subtask> </task> . . . . . . . . </em>
As per the above example, the xml file may contain one or more consecutive tasks. I need to find <revst> between <task> and </task> and get the respective task’s “key” value. If a task does not contain a <revst> tag, we don’t need that task’s key value.
In the above example, the expected output is TASK-712101200000, as <revst> is found in the first task.
Your previous regex works correctly, but only for the first instance of task; it does not find the rest of the instances.
I hope you understand. Kindly let me know if any more data is required.
Thanks for your patience!
Ganesan G
-
Hello, @ganesan-govindarajan,
Indeed, it’s been a very long time since your last post, in this other topic :
https://community.notepad-plus-plus.org/post/80495
Which seems related to your same problem !
So, from your example, I understood that you want the TASK-712101200000 value because the current <task.........</task> section contains the <revst> tag !
Now, this section contains the TASK-712101200000 value, twice. So :
- Do you want the regex to match the two values TASK-############ ?
- Do you want the regex to match the first TASK-############ value, only ?
And, if the regex must match all the instances of the block, may this case happen :
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01"> <effect effrg="710/ALL" efftext="710/ALL"></effect> <para><revst>sample para<revend></para>..... <title>sample text</title> <subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01"> <title>sample text</title> </subtask> </task>
Where the <revst> tag comes after the first TASK-############, but before the second TASK-############ ?
Best Regards,
guy038
-
-
Hi @guy038 ,
Thanks for the response after a long time. Yes its been a very old post that have asked for.
I need your help on this!!
Actually, the key values of each TASK and SUBTASK are unique and are never repeated or duplicated anywhere. In fact, the key values of all types of elements are unique. That was my typo, sorry for that.
In my case, I need only the TASK’s key value, irrespective of other key values inside the <task>…</task>, if one or more “<revst>” tags exist there.
Also, the existing topic that I asked in and that you pointed to (post 80495) does not exactly match this requirement. That’s why I am raising it here.
Thanks again!!
Ganesan. G
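The requirement as restated here (the key of each <task> whose block contains a <revst> tag, wherever the key attribute sits, ignoring <subtask> keys) can be sketched in Python as below. This is not a regex from the thread and not a Notepad++ search; it is only an illustration under stated assumptions: <task> elements are never nested inside each other, every <task ...> has a matching </task>, and no attribute value contains a > character. The sample is shortened from the data posted above, and the subtask key TASK-712101200002 is invented so that keys are unique.

```python
import re

# Assumptions (not confirmed by the thread): <task> elements are not nested,
# each one is closed by </task>, and attribute values contain no ">".
TASK_BLOCK = re.compile(r"<task\b([^>]*)>(.*?)</task>", re.DOTALL)
KEY_ATTR = re.compile(r'\bkey="([^"]*)"')


def task_keys_with_revst(text):
    """Return the key of every <task> whose content includes a <revst> tag."""
    keys = []
    for block in TASK_BLOCK.finditer(text):
        attributes, body = block.group(1), block.group(2)
        if "<revst>" in body:
            # The key is read from the opening <task ...> tag only, so its
            # position among the attributes and any <subtask> keys do not matter.
            found = KEY_ATTR.search(attributes)
            if found:
                keys.append(found.group(1))
    return keys


SAMPLE = '''\
<task breaknbr="00" chapnbr="71" key="TASK-712101200000" pgblknbr="801">
<title>sample text</title>
<subtask breaknbr="00" key="TASK-712101200002" seq="000">
<para><revst>sample para<revend></para>
</subtask>
</task>
<task breaknbr="00" chapnbr="71" key="TASK-712101200001" pgblknbr="801">
<subtask breaknbr="00" key="TASK-712101200003" seq="000">
<para>sample para</para>
</subtask>
</task>
'''

print(task_keys_with_revst(SAMPLE))
# prints: ['TASK-712101200000']
```

Working block by block means every <task> in the file is examined, not only the first one, and <revst> tags outside any <task> block are ignored.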