Find parent tag's id if that parent tag has specific tags inside it
-
@guy038 and All
There is no fixed order or format for the key attribute; it can appear anywhere in the tag, since the file is treated as plain text. If we search for the tag <revst> and it is present inside any task (the key value comes inside the task, and a task may occur multiple times, as in your example), we need to copy that key attribute and then move on to the next task, and so on.
Note: <revst> may occur multiple times inside the same task, but if it occurs at least once we need to copy that key value.
Please let me know if this is still not clear.
Thanks in advance!!
-
I will try to restate the requirement:
For each record starting with <task and ending with </task>, if the record contains one or more occurrences of <revst>, then match the argument to key=
Is this correct?
Why didn’t you clear up the confusion about a second bold item in your original post?
-
@neil-schipper said in Find parent tag's id if that parent tag has specific tags inside it:
Why didn’t you clear up the confusion about a second bold item in your original post?
This is perhaps better restated as: People that don’t know how to adequately ask for help don’t deserve to receive any.
Start (over) by reading THIS.
-
@neil-schipper
Sorry, I missed that…
Yes, absolutely correct as you stated.
For the second bold item, I used the B button at the top of the message box, but I am not sure what went wrong…
Thanks for restating my question.
-
Hum…, I’m certainly wrong about your needs ! But the worst thing is that your explanations still confuse me !
- Regarding the second bold item, are you speaking about the value BBAACC, right after the <revst> tag, in the first block ?
- You also said :
we need to copy that key value and search next task respectively
Then, do you mean that, in addition to the condition about the two bold strings, it should also match the complete next <task>•••••</task> block ?
So, @ganesan-govindarajan, before answering us, the best would be to read these general posts, in that order :
Best Regards,
guy038
-
-
Hi, @guy038
I am really sorry for whatever I did wrong in this thread. I didn’t realize my words could upset people reading them.
Okay, let me explain in a little more detail with my limited knowledge; please forgive me.
Notepad++ v8.1.9.2 (64-bit)
Build time : Nov 21 2021 - 04:30:20
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line :
Admin mode : OFF
Local Conf mode : OFF
Cloud Config : OFF
OS Name : Windows 10 Enterprise (64-bit)
OS Version : 1809
OS Build : 17763.1935
Current ANSI codepage : 1252
Plugins : ComparePlugin.dll DSpellCheck.dll IndentByFold.dll mimeTools.dll NppConverter.dll NppExec.dll NppExport.dll NppToolBucket.dll Python Indent.dll XMLTools.dll
I have tried to restate my request below.
Here is the data I currently have (“before” data):
<task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000000800”>
<title>AAA</title>
<para>BBAACC</para>
<para><revst>BBAACC</para>
</task>
<task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000000801”>
<title>AAA</title>
<para>BBAACC</para>
<para>BBAACC</para>
</task>
<task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000000805”>
<title>AAA</title>
<para>BBAACC</para>
<para>BBAACC</para>
</task>
<task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000002801”>
<title>AAA</title>
<para><revst>BBAACC</para>
<para>BBAACC</para><revst><para>BBAACC</para>
<para>BBAACC</para>
</task>
<task chapnbr=“71” chg=“U” func=“000” key=“TASK-710002003801”>
<title>AAA</title>
<para>BBAACC</para>
<para>BBAACC</para><para>BBAACC</para>
<para><revst>BBAACC</para>
</task>
<task chapnbr=“71” chg=“U” func=“000” key=“TASK-725000002801”>
<title>AAA</title>
<para>BBAACC</para>
<para>BBAACC</para><para>BBAACC</para>
<para>BBAACC</para>
</task>
Here is how I would like that data to look (“after” data):
TASK-710000000800
TASK-710000002801
TASK-710002003801
To accomplish this, I have tried using the following Find/Replace expressions and settings:
Find What = <revst>
Search Mode = REGULAR EXPRESSION
Dot Matches Newline = NOT CHECKED
Thanks
-
Hello, @ganesan-govindarajan and All,
I suppose that your before text was this one, with the use of the normal double quote character " :
<task chapnbr="71" chg="U" func="000" key="TASK-710000000800">
<title>AAA</title>
<para>BBAACC</para>
<para><revst>BBAACC</para>
</task>
<task chapnbr="71" chg="U" func="000" key="TASK-710000000801">
<title>AAA</title>
<para>BBAACC</para>
<para>BBAACC</para>
</task>
<task chapnbr="71" chg="U" func="000" key="TASK-710000000805">
<title>AAA</title>
<para>BBAACC</para>
<para>BBAACC</para>
</task>
<task chapnbr="71" chg="U" func="000" key="TASK-710000002801">
<title>AAA</title>
<para><revst>BBAACC</para>
<para>BBAACC</para><revst><para>BBAACC</para>
<para>BBAACC</para>
</task>
<task chapnbr="71" chg="U" func="000" key="TASK-710002003801">
<title>AAA</title>
<para>BBAACC</para>
<para>BBAACC</para><para>BBAACC</para>
<para><revst>BBAACC</para>
</task>
<task chapnbr="71" chg="U" func="000" key="TASK-725000002801">
<title>AAA</title>
<para>BBAACC</para>
<para>BBAACC</para><para>BBAACC</para>
<para>BBAACC</para>
</task>
and not with the smart double quotes “ and ” ( Unicode characters \x{201C} and \x{201D} )
If so :
- Open the Mark dialog ( Ctrl + M )
- Type in (?-s)\h*<task .+"\K.+(?=(?s:">((?!</task>).)+?)<revst>) , in the Find what : zone
- Tick the Bookmark line option
- Tick the Wrap around option
- Select the Regular expression search mode
- Click on the Mark All button
You’ll get this text :
- Click on the Copy Marked Text button
- Open a new tab ( Ctrl + N )
- Paste all the TASK-71xxxxxxxxxx items
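For readers who want to check the result outside Notepad++, the marking steps above can be approximated in Python. This is only a sketch under stated assumptions: Python's re module has no \K or \h, so the pattern below swaps in a capture group and [ \t], and the sample is a shortened copy of the straight-quote data (three of the six tasks). The names SAMPLE, PATTERN and marked_keys are made up for this illustration.

```python
import re

# Shortened copy of the straight-quote sample, one tag per line.
SAMPLE = '''\
<task chapnbr="71" chg="U" func="000" key="TASK-710000000800">
<title>AAA</title>
<para>BBAACC</para>
<para><revst>BBAACC</para>
</task>
<task chapnbr="71" chg="U" func="000" key="TASK-710000000801">
<title>AAA</title>
<para>BBAACC</para>
<para>BBAACC</para>
</task>
<task chapnbr="71" chg="U" func="000" key="TASK-710000002801">
<title>AAA</title>
<para><revst>BBAACC</para>
<para>BBAACC</para><revst><para>BBAACC</para>
</task>
'''

# Adapted pattern: [ \t] stands in for \h, and group 1 stands in for the
# text that \K would leave as the match in Notepad++.
PATTERN = re.compile(r'[ \t]*<task .+"(.+)(?=(?s:">((?!</task>).)+?)<revst>)')


def marked_keys(text):
    """Return the key values that the Mark All step would highlight."""
    return [m.group(1) for m in PATTERN.finditer(text)]


print("\n".join(marked_keys(SAMPLE)))
```

Only the two tasks that contain a <revst> tag contribute a key; the task in the middle is skipped because its </task> comes before any <revst>.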
Now, if you really use the smart double quotes, just tell me to slightly modify the regex !
Best Regards,
guy038
-
-
Hi @guy038
Thanks much for the help!!
Yes, those are normal quotes, not smart quotes; the text was converted automatically by another application.
But I forgot to tell you one thing: after the key value, each task has other attributes as well, as follows:
<task chapnbr=“71” chg=“U” func=“000” key=“TASK-710000000801” date="1234567" chg="R"..etc>
Because of this, if I use the above regex, it only finds the first line of the file.
Please advise me how to proceed with the above regex.
Thanks
-
Hi, @ganesan-govindarajan and All,
Ah… OK ! So, this second regex version, where I explicitly search for the string TASK, with that exact case, becomes :
(?-si)^\h*<task .+"\KTASK.+?(?=(?s:"((?!</task>).)+?)<revst>)
For instance, in the text below, where, again, I changed some smart quotes to normal quotes :
<task chapnbr="71" chg="U" func="000" key="TASK-710000000801" date="1234567" chg="R">
<title>AAA</title>
<para>BBAACC</para>
<para><revst>BBAACC</para>
</task>
It’ll highlight the string TASK-710000000801
Note that I also assume that there is only ONE key attribute in the entire <task ......... line !
BR
guy038
If you would like additional information about how this regex works, just tell me !
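As a rough way to see the difference between the two versions, here is a Python sketch, not the Notepad++ engine itself. It assumes the usual substitutions for Python's re module (a capture group instead of \K, [ \t] instead of \h, re.MULTILINE so that ^ matches at each line start), and the names LINE, FIRST, SECOND and find_keys are hypothetical.

```python
import re

# The example task from the post, with attributes after the key.
LINE = (
    '<task chapnbr="71" chg="U" func="000" key="TASK-710000000801" '
    'date="1234567" chg="R">\n'
    ' <title>AAA</title>\n'
    ' <para>BBAACC</para>\n'
    ' <para><revst>BBAACC</para>\n'
    '</task>\n'
)

# First version: grabs whatever sits before the final "> of the tag line.
FIRST = re.compile(r'[ \t]*<task .+"(.+)(?=(?s:">((?!</task>).)+?)<revst>)')

# Second version: anchored on the literal TASK, lazy up to the nearest quote.
SECOND = re.compile(
    r'^[ \t]*<task .+"(TASK.+?)(?=(?s:"((?!</task>).)+?)<revst>)',
    re.MULTILINE,
)


def find_keys(pattern, text):
    """Return group 1 of every match, i.e. the text that would be marked."""
    return [m.group(1) for m in pattern.finditer(text)]


print(find_keys(FIRST, LINE))
print(find_keys(SECOND, LINE))
```

In this Python adaptation the first pattern captures the value of the last attribute (R) rather than the key, while the second captures TASK-710000000801 regardless of what follows the key on the same line.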
-
Please note: The following is only true if you accidentally wrote <revst> instead of <revst/> or <revst> ... </revst>. If your document only contains <revst>, it is not a valid XML document because it lacks the related closing tag and the method I’m explaining below doesn’t work.
A more reliable and much more hassle-free method to search for things in XML documents is XPath, a query language that was explicitly designed for that use case.
To be able to use XPath you need to install the XML Tools plugin, available via Npp’s built-in Plugins Admin. After installing it, navigate to (menu) Plugins -> XML Tools -> Evaluate XPath expression. In the dialog box popping up, enter //task[descendant::revst]/@key into the upper input field. The plugin will generate the result in the lower ListView.
You can copy the result to your clipboard by clicking the corresponding button. After pasting it to an empty document, you only have to do some column selection to remove the content of the ListView’s first two columns, which are useless for you but have been copied to the clipboard as well.
-
Hi @guy038,
Thanks much for your help, it’s working fine!!
However, I didn’t realize that the format of the task tag may differ from file to file, as the position of the key attribute may change from task to task, as below:
<task breaknbr="00" chapnbr="71" chg="U" func="000" key="TASK-710000000800" pgblknbr="301" revdate="19950301" sectnbr="00" seq="800" subjnbr="00">
The key attribute comes at the 4th or 5th position, or changes dynamically from file to file. Because of this, the above regex cannot find matches in some of the sgm files.
Can you please help me out with this?
Thanks again.
-
Hi @dinkumoil ,
Thanks for asking!!
Actually, we are not concerned about the closing tag, as we only need to find the opening tag. Also, this is an sgm file, and the closing tag is a different one: “<revend>”.
Thanks.
-
Hello @ganesan-govindarajan and All,
Sorry, but I do not understand :-(
If I use, for instance, this text :
<task breaknbr="00" chapnbr="71" chg="U" func="000" key="TASK-710000000800" pgblknbr="301" revdate="19950301" sectnbr="00" seq="800" subjnbr="00">
<title>AAA</title>
<para>BBAACC</para>
<para><revst>BBAACC</para>
</task>
The regex does mark the string TASK-710000000800 ! No difference from before ! I am probably missing some details.
BR
guy038
-
Hi @guy038 ,
When I tried the regex, it selected only the first line of the file. That’s the reason I thought the issue was caused by the task format.
I don’t know why it selects only the first line of the file.
-
Hi @guy038 ,
Do we need to maintain the same format of text underneath the “task” as in the above example? My new file has a lot of content and tags after the task element.
-
Hi, @ganesan-govindarajan and All,
I’ll try to briefly explain how this regex works and you should be able to verify if it meets your needs !
So, the regex (?-si)^\h*<task .+"\KTASK.+?(?=(?s:"((?!</task>).)+?)<revst>) , used with the Mark dialog, means :
- From the beginning of a line, it first finds optional leading blank characters, followed with the string <task , in lower case, followed with a space char, followed by anything till a double quote ( The one right before the string TASK ! )
- Then the \K regex syntax resets the present search. Thus, it just looks for the string TASK , in upper case, followed with any non-null range of chars, till the nearest double quote char ( The one right after the string TASK-xxxxxxxxxx )
- But ONLY IF two additional conditions exist :
  - It must be followed, further on, with a <revst> tag, first
  - Then, the ending tag </task> may only come further on, after that <revst> tag
  - ( This implies that, IF the </task> string is found before a <revst> tag, the regex will not match ! )
Globally, it can find any TASKxx....xxx string, if it is followed, downwards, with a <revst> tag then, downwards, with an ending </task> tag ! So :
- You may have many lines, or none, between the line <task.........> and the <revst> tag
- You may have many lines, or none, after the <revst> tag till the ending tag </task>
For instance, even the short line, below :
<task key="TASK:AB12345"><para><revst>BBAACC</para></task>
Or this minimum one :
<task key="TASK:AB12345"><revst></task>
would mark the string TASK:AB12345 !
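The walk-through can be mirrored piece by piece in a commented Python pattern. This is a sketch under the usual assumptions: Python's re module lacks the reset and horizontal-blank escapes used in Notepad++, so a capture group and [ \t] stand in for them, and the names WALKTHROUGH and marked are invented for this example.

```python
import re

WALKTHROUGH = re.compile(
    r'''
    ^[ \t]*<task[ ]        # line start, optional blanks, "<task" and a space
    .+"                    # anything up to the double quote just before TASK
    (TASK.+?)              # group 1: TASK up to the nearest double quote ...
    (?=                    # ... but only if what comes next is:
      (?s:"                #   that closing double quote,
        ((?!</task>).)+?   #   any chars, newlines included, never crossing </task>
      )
      <revst>              #   and then a <revst> tag
    )
    ''',
    re.MULTILINE | re.VERBOSE,
)


def marked(text):
    """Return the strings that would be marked in the given text."""
    return [m.group(1) for m in WALKTHROUGH.finditer(text)]


print(marked('<task key="TASK:AB12345"><para><revst>BBAACC</para></task>'))
print(marked('<task key="TASK:AB12345"><revst></task>'))
print(marked('<task key="TASK:AB12345"></task><revst>'))
```

The first two calls report TASK:AB12345, and the third reports nothing, because there the </task> tag is met before any <revst> tag.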
Of course, I assume that :
- The part <task ................> begins a line, after possible blank characters, only
- The part key="TASKxxxxx" is located on a single line
If other cases may occur, just tell me !
Best Regards,
guy038
-
-
Hi @guy038 ,
Your explanation is so good!! Thanks much for that!!
This implies that, IF the </task> string is found before a <revst> tag, the regex will not match ! )
Yes, the <revst> tag may appear before or after the <task—></task> element in the file, as other elements also contain <revst> tags. Is this the reason this regex does not work?
-
Hi @guy038
Maybe my tasks start and end as below,
-
Actually, if a <revst> tag is found outside a <task.....>........</task> section OR if <task ........>........</task> sections do not contain, at least, one <revst> tag, like below :
... <task .......> .... .... .... </task> ... ... ...<revst>... ... ... <task .......> .... .... .... </task> ...
The regex will not match anything ! I suppose that this behavior is the one which is expected !?
Now, I need additional text to get a general idea of what you need. Could you provide a real complete example, as a text, in reverse video ?
Could you indicate the zones to mark and the different mandatory conditions to respect ?
Thanks,
BR
guy038
-
Hi @guy038 ,
I am really sorry for the late response.
I am sharing a real data sample here to make things clearer.
<em ..............>
.......................
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
<effect effrg="710/ALL" efftext="710/ALL"></effect>
<title>sample text</title>
<subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200000" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
<title>sample text</title>
<para><revst>sample para<revend></para>.....
</subtask>
</task>
<task breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200001" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
<effect effrg="710/ALL" efftext="710/ALL"></effect>
<title>sample text</title>
<subtask breaknbr="00" chapnbr="71" chg="U" func="200" key="TASK-712101200003" pgblknbr="801" revdate="20191115" sectnbr="21" seq="000" subjnbr="01">
<title>sample text</title>
<para>sample para</para>.....
</subtask>
</task>
. . . . . . . .
</em>
As per the above example, the xml file may contain one or more tasks in succession. I need to find <revst> between <task>…</task> and get the respective task “key” value. If a task does not contain a <revst> tag, we don’t need that task’s key value.
In the above example, the expected output is: TASK-712101200000, as <revst> is found in the first task.
Your previous regex works correctly, but it works only for the first instance of task and does not find the remaining instances.
I hope you understand. Kindly let me know if any more data is required.
Thanks for your patience!
Ganesan G
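-
The requirement as stated in the last post (report the key of every <task> block that contains at least one <revst>, wherever the key attribute sits) can also be sketched as a small Python script. This is an illustration and not a method proposed in the thread. It assumes that <task> blocks are not nested inside each other, that attribute values use straight double quotes and contain no > character, and it uses a copy of the sample with the attribute lists shortened. The names SAMPLE, TASK_BLOCK, KEY_ATTR and keys_of_revised_tasks are made up.

```python
import re

# Sample from the post above, with the attribute lists shortened.
SAMPLE = '''\
<task breaknbr="00" chapnbr="71" key="TASK-712101200000" pgblknbr="801">
<title>sample text</title>
<subtask breaknbr="00" chapnbr="71" key="TASK-712101200000" pgblknbr="801">
<title>sample text</title>
<para><revst>sample para<revend></para>
</subtask>
</task>
<task breaknbr="00" chapnbr="71" key="TASK-712101200001" pgblknbr="801">
<title>sample text</title>
<subtask breaknbr="00" chapnbr="71" key="TASK-712101200003" pgblknbr="801">
<title>sample text</title>
<para>sample para</para>
</subtask>
</task>
'''

# One <task ...> ... </task> block. "<task" cannot match inside "<subtask",
# and "</task>" cannot match inside "</subtask>".
TASK_BLOCK = re.compile(r'<task\b([^>]*)>(.*?)</task>', re.DOTALL)
# The key attribute, wherever it sits among the other attributes.
KEY_ATTR = re.compile(r'\bkey="([^"]*)"')


def keys_of_revised_tasks(text):
    """Return the key of each <task> block whose body contains <revst>."""
    keys = []
    for attrs, body in TASK_BLOCK.findall(text):
        key = KEY_ATTR.search(attrs)
        if key and "<revst>" in body:
            keys.append(key.group(1))
    return keys


print(keys_of_revised_tasks(SAMPLE))
```

Splitting the work into "find the block" and "find the key inside the opening tag" avoids depending on the position of the key attribute or on the layout of the lines.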