Automatic text transform with incremental numbering
-
Hello there!
I’m struggling with finding an automated process/workflow for a big data set of text input. I’m working in documentary film editing and have to transfer provided texts as subtitles into a current film.I’m practically new to this kind of text editing. So, please excuse my low level of knowledge with RegEx and np++ scripts.
My input data usually looks like this:
_**00:00:03,000_++ Text is written here. _**00:00:07,031_++ More text is written here. _**00:00:11,028_++ Text is written here.
The goal is to transform that into this:
1 00:00:00,000 --> 00:00:03,000 Text is written here. 2 00:00:03,000 --> 00:00:07,031 More text is written here. 3 00:00:070,31 --> 00:00:11,028 Text is written here.
The first steps are quite obvious:
- search & replace all “_++” with nothing
- search & replace all “_**” with “##\nv*#*”
Now I have “##” as a variable placeholder for incremental numbering in the next step. And “#” as another variable for later.
Next step would be to read out the provided timestamp (XX:XX:XX,XXX) and write it into my variable #. One row at a time.
Here my understanding of simple automation stops with the lack of knowledge with NP++ and RegEx.Could you help me out, or do I need to use another tool/write a script in another language?
Any lead is highly appreciated :) -
_\*\*([:,\d]+)_\+\+ .+? _\*\*([:,\d]+)_\+\+
- time range in$1
,$2
, but u must replace it to same format, at next step convert format of timestamps -
@Maxim-Abrossimow said in Automatic text transform with incremental numbering:
00:00:00,000
A python script solution might look like this
from Npp import editor import re transformed_text = '' counter = 1 current_timestamp = '00:00:00,000' replace_with = '''{} {} --> {} ''' lines = editor.getText().splitlines(True) for line in lines: m = re.search('(\d\d:\d\d:\d\d,\d\d\d)', line) if m: line = replace_with.format(counter, current_timestamp, m.group()) counter += 1 current_timestamp = m.group() transformed_text += line editor.setText(transformed_text)
-
cheater (:
-
@Ekopalypse well, this certainly does the trick. Thank you a lot!
-
@WinterSilence Thank you, too for your input! I tried with RegEx, but it is just so much easier with Python…