How to extract the numbers between tags, then append some numbers?
-
I am trying to extract the numbers between two tags. Here they are `<startSample>somenumber</startSample>` and `<pitch>number with dot</pitch>`.

**Sample text:**

<audGrainData>
<startSample>253</startSample>
<pitch>14.1668348</pitch>
</audGrainData>
<audGrainData>
<startSample>2390</startSample>
<pitch>14.1668348</pitch>
</audGrainData>
<audGrainData>
<startSample>4521</startSample>
<pitch>14.1668348</pitch>
</audGrainData>
<audGrainData>
<startSample>6475</startSample>
<pitch>14.1668348</pitch>
</audGrainData>
<audGrainData>
<startSample>8547</startSample>
<pitch>14.2084284</pitch>
</audGrainData>

**And I want it to be like this:**

253 14.1668348 10000 10200
2390 14.1668348 10000 10200
4521 14.1668348 10000 10200
6475 14.1668348 10000 10200
8547 14.2084284 10000 10200
Hello, @fajar-zulianto and All,

Very easy with regexes!

So, starting with your INPUT text below, which I slightly changed in order to introduce possible leading indentation on any line:
<audGrainData>
    <startSample>253</startSample>
    <pitch>14.1668348</pitch>
</audGrainData>
<audGrainData>
    <startSample>2390</startSample>
    <pitch>14.1668348</pitch>
</audGrainData>
<audGrainData>
    <startSample>4521</startSample>
    <pitch>14.1668348</pitch>
</audGrainData>
<audGrainData>
    <startSample>6475</startSample>
    <pitch>14.1668348</pitch>
</audGrainData>
<audGrainData>
    <startSample>8547</startSample>
    <pitch>14.2084284</pitch>
</audGrainData>
- Open the Replace dialog (`Ctrl + H`)
- Uncheck all box options
- SEARCH `(?-is)^\h*<(audGrainData)>\R\h*<(startSample)>(.+)</\2>\R\h*<(pitch)>(.+)</\4>\R\h*</\1>`
- REPLACE `\3 \5 10000 10200`
- Check the `Wrap around` option
- Select the `Regular expression` search mode
- Click once on the `Replace All` button
=> You should get the expected OUTPUT text:

253 14.1668348 10000 10200
2390 14.1668348 10000 10200
4521 14.1668348 10000 10200
6475 14.1668348 10000 10200
8547 14.2084284 10000 10200

Voila!
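For anyone who wants to run the same replacement outside Notepad++, here is a minimal Python sketch of the idea. It is an adaptation, not the regex above verbatim: Python's `re` module does not support `\h` or `\R`, so `[ \t]*` and `\r?\n` are used instead (this drops the lone `\r` case). The names `PATTERN`, `convert` and `sample` are made up for the illustration.

```python
import re

# Adapted pattern: [ \t]* replaces \h* and \r?\n replaces \R,
# because Python's `re` module has neither escape.
# re.MULTILINE lets ^ match at the start of every line.
PATTERN = re.compile(
    r"^[ \t]*<(audGrainData)>\r?\n"
    r"[ \t]*<(startSample)>(.+)</\2>\r?\n"
    r"[ \t]*<(pitch)>(.+)</\4>\r?\n"
    r"[ \t]*</\1>",
    re.MULTILINE,
)


def convert(text: str) -> str:
    """Replace each audGrainData block with 'start pitch 10000 10200'."""
    # \3 and \5 are the two captured numbers, as in the REPLACE field.
    return PATTERN.sub(r"\3 \5 10000 10200", text)


sample = """\
<audGrainData>
    <startSample>253</startSample>
    <pitch>14.1668348</pitch>
</audGrainData>
<audGrainData>
    <startSample>2390</startSample>
    <pitch>14.1668348</pitch>
</audGrainData>
"""

print(convert(sample))
```

The line-break after each closing `</audGrainData>` is not part of the match, so every block ends up on its own output line.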
NOTES:

- The `(?-is)` syntax is a pair of in-line modifiers, which ensure that the search is case-sensitive and that the `.` represents one standard character only (never a line-break character)
- There are five groups, enclosed in parentheses: three of them store the names of the tags (`audGrainData`, `startSample` and `pitch`) and two of them, `(.+)`, store the numbers, in order to re-use them in the replacement with the `\3` and `\5` syntax
- The `\h*` syntax represents possible leading `space` or `tabulation` characters, on any line
- The `\R` syntax stands for any kind of line-break (`\r\n`, `\n` or `\r`)
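To make the group numbering in these notes concrete, the sketch below spells out a Python adaptation of the regex in verbose mode and prints what each group captures. As an assumption of this sketch, `[ \t]*` stands in for `\h*` and `\r?\n` for `\R`, since Python's `re` module lacks both. The names `BLOCK` and `block` are illustrative only.

```python
import re

# Verbose layout: whitespace outside character classes is ignored,
# and everything after an unescaped # is a comment.
BLOCK = re.compile(
    r"""
    ^[ \t]*<(audGrainData)>\r?\n          # group 1: outer tag name
    [ \t]*<(startSample)>(.+)</\2>\r?\n   # group 2: tag name, group 3: number
    [ \t]*<(pitch)>(.+)</\4>\r?\n         # group 4: tag name, group 5: number
    [ \t]*</\1>                           # closing tag, reusing group 1
    """,
    re.MULTILINE | re.VERBOSE,
)

# One block with mixed indentation (spaces and tabs) and Windows line-breaks.
block = (
    "  <audGrainData>\r\n"
    "\t<startSample>8547</startSample>\r\n"
    "\t<pitch>14.2084284</pitch>\r\n"
    "  </audGrainData>"
)

match = BLOCK.search(block)
groups = match.groups()
print(groups)

# The search is case-sensitive by default, like (?-i): a differently
# cased tag name no longer matches.
no_match = BLOCK.search(block.replace("pitch", "Pitch"))
print(no_match)
```

Only groups 3 and 5 are needed in the replacement; groups 1, 2 and 4 exist so that the closing tags can be written as back-references.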
Best Regards,
guy038