How to extract the numbers between tags, then append some numbers?

Fajar Zulianto

I am trying to extract the numbers between two tags. Here they are:
<startSample>somenumber</startSample> and <pitch>number with dot</pitch>


**sample text :**


**and I want it to be like this:**

  253 14.1668348 10000 10200
  2390 14.1668348 10000 10200
  4521 14.1668348 10000 10200
  6475 14.1668348 10000 10200
  8547 14.2084284 10000 10200

guy038

Hello, @fajar-zulianto and All,

Very easy with regexes !

So, starting with your INPUT text below, which I slightly changed in order to introduce possible leading indentation on any line :

<audGrainData>
<startSample>253</startSample>
<pitch>14.1668348</pitch>
</audGrainData>
    <audGrainData>
        <startSample>2390</startSample>
        <pitch>14.1668348</pitch>
    </audGrainData>
<audGrainData>
<startSample>4521</startSample>
			<pitch>14.1668348</pitch>
		</audGrainData>
<audGrainData>
<startSample>6475</startSample>
<pitch>14.1668348</pitch>
</audGrainData>
		    <audGrainData>
		    <startSample>8547</startSample>
		    <pitch>14.2084284</pitch>
		    </audGrainData>

Open the Replace dialog ( Ctrl + H )
Uncheck all the checkbox options
SEARCH (?-is)^\h*<(audGrainData)>\R\h*<(startSample)>(.+)</\2>\R\h*<(pitch)>(.+)</\4>\R\h*</\1>
REPLACE \3 \5 10000 10200
Check the Wrap around option
Select the Regular expression search mode
Click once on the Replace All button

=> You should get the expected OUTPUT text :

253 14.1668348 10000 10200
2390 14.1668348 10000 10200
4521 14.1668348 10000 10200
6475 14.1668348 10000 10200
8547 14.2084284 10000 10200

Voila !

NOTES :

The (?-is) syntax is a pair of in-line modifiers which ensure that the search is case-sensitive and that the . matches one standard character only, never a line-break character
There are five groups, delimited by parentheses : three of them store the names of the tags ( audGrainData, startSample and pitch ), re-used in the search with the \1, \2 and \4 back-references, and two of them store the numbers (.+), re-used in the replacement with the \3 and \5 syntax
The \h* syntax matches any leading space or tabulation characters, possibly none, on any line
The \R syntax stands for any kind of line-break ( \r\n, \n or \r )
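
For anyone who wants to run the same replacement outside Notepad++, here is a minimal Python sketch of the regex above. It is only an illustration, not part of the Notepad++ procedure : Python's re module does not support \h or \R, so they are spelled out as [ \t]* and (?:\r\n|\n|\r), and [^\r\n]+ stands in for .+ because Python's . can match \r. The names H, R, PATTERN, SAMPLE and convert are hypothetical helpers chosen for this example.

```python
import re

# Python's re module has no \h or \R, so both are spelled out explicitly.
H = r"[ \t]*"            # stands in for \h* (leading spaces or tabs, possibly none)
R = r"(?:\r\n|\n|\r)"    # stands in for \R  (any kind of line-break)

# Same five groups as the Notepad++ search regex:
# 1 = audGrainData, 2 = startSample, 3 = first number, 4 = pitch, 5 = second number.
# [^\r\n]+ replaces .+ so that a number can never run across a line-break.
PATTERN = re.compile(
    rf"^{H}<(audGrainData)>{R}"
    rf"{H}<(startSample)>([^\r\n]+)</\2>{R}"
    rf"{H}<(pitch)>([^\r\n]+)</\4>{R}"
    rf"{H}</\1>",
    re.MULTILINE,  # lets ^ match at the start of every line
)

# Two blocks taken from the INPUT text above, one of them indented.
SAMPLE = (
    "<audGrainData>\n"
    "<startSample>253</startSample>\n"
    "<pitch>14.1668348</pitch>\n"
    "</audGrainData>\n"
    "    <audGrainData>\n"
    "        <startSample>2390</startSample>\n"
    "        <pitch>14.1668348</pitch>\n"
    "    </audGrainData>\n"
)


def convert(text: str) -> str:
    """Replace each 4-line block by 'startSample pitch 10000 10200'."""
    return PATTERN.sub(r"\3 \5 10000 10200", text)


print(convert(SAMPLE), end="")
```

With the two sample blocks, this should print the first two lines of the expected OUTPUT text. The line-break after each closing </audGrainData> tag is not part of the match, so it is kept as is, exactly as in the Notepad++ replacement.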

Best Regards,

guy038