Automatic text transform with incremental numbering
-
Hello there!
I’m struggling to find an automated process/workflow for a large set of text input. I work in documentary film editing and have to transfer provided texts as subtitles into a current film. I’m practically new to this kind of text editing, so please excuse my limited knowledge of RegEx and Notepad++ scripts.
My input data usually looks like this:
_**00:00:03,000_++
Text is written here.

_**00:00:07,031_++
More text is written here.

_**00:00:11,028_++
Text is written here.

The goal is to transform that into this:

1
00:00:00,000 --> 00:00:03,000
Text is written here.

2
00:00:03,000 --> 00:00:07,031
More text is written here.

3
00:00:07,031 --> 00:00:11,028
Text is written here.

The first steps are quite obvious:
- search & replace all “_++” with nothing
- search & replace all “_**” with “##\nv*#*”
Now I have “##” as a variable placeholder for incremental numbering in the next step. And “#” as another variable for later.
Next step would be to read out the provided timestamp (XX:XX:XX,XXX) and write it into my variable #. One row at a time.
This is where my understanding of simple automation stops, for lack of knowledge of Notepad++ and RegEx. Could you help me out, or do I need to use another tool/write a script in another language?
Any lead is highly appreciated :) -
_\*\*([:,\d]+)_\+\+ .+? _\*\*([:,\d]+)_\+\+
This gives you the time range in $1 and $2, but you must replace it in the same format; converting the format of the timestamps is the next step.
-
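As a rough illustration of the pairing idea above (not the poster's exact workflow), the sketch below tries a lookahead variant of that pattern in plain Python. Because the second timestamp is only looked at and not consumed, every consecutive pair is captured; with the pattern as posted, a non-overlapping search would, as far as I can tell, only return every other pair. The sample string and the names PAIR, sample and ranges are made up for the example.

```python
import re

# Lookahead variant of the pattern: the second timestamp is captured
# without being consumed, so it can also start the next match.
PAIR = re.compile(r'_\*\*([:,\d]+)_\+\+(?=.+?_\*\*([:,\d]+)_\+\+)', re.DOTALL)

# Sample input, assumed to look like the data in the question.
sample = ('_**00:00:03,000_++ Text is written here. '
          '_**00:00:07,031_++ More text is written here. '
          '_**00:00:11,028_++ Text is written here.')

# Each tuple is (timestamp, following timestamp).
ranges = PAIR.findall(sample)
for start, end in ranges:
    print(start, '-->', end)
```

The last timestamp has no successor, so it does not open a range of its own.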
@Maxim-Abrossimow said in Automatic text transform with incremental numbering:
00:00:00,000
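A Python script solution might look like this:

    from Npp import editor
    import re

    transformed_text = ''
    counter = 1
    # The first subtitle starts at zero; each later one starts where the previous ended.
    current_timestamp = '00:00:00,000'
    # Subtitle number on one line, time range on the next.
    replace_with = '{}\n{} --> {}\n'

    # Keep the line endings so untouched lines are written back unchanged.
    lines = editor.getText().splitlines(True)
    for line in lines:
        m = re.search(r'(\d\d:\d\d:\d\d,\d\d\d)', line)
        if m:
            # Replace the whole timestamp line with the numbered header.
            line = replace_with.format(counter, current_timestamp, m.group())
            counter += 1
            current_timestamp = m.group()
        transformed_text += line

    editor.setText(transformed_text)

-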
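For readers who would rather run this outside Notepad++, here is a minimal sketch of the same numbering logic as a standalone Python 3 function. It is an adaptation, not the script posted above: the function name to_srt, the MARKER pattern and the start parameter are hypothetical, and it assumes every timestamp is wrapped in the _** and _++ markers shown in the question. It replaces each marker together with the whitespace around it, so the text may sit on the same line as the marker or on the line below.

```python
import re

# A timestamp wrapped in the markers from the question, plus surrounding whitespace.
MARKER = re.compile(r'\s*_\*\*(\d\d:\d\d:\d\d,\d\d\d)_\+\+\s*')


def to_srt(text, start='00:00:00,000'):
    """Turn marker-delimited text into numbered 'start --> end' blocks."""
    counter = 0
    previous = start

    def header(match):
        nonlocal counter, previous
        counter += 1
        current = match.group(1)
        block = '{}\n{} --> {}\n'.format(counter, previous, current)
        if counter > 1:
            # Blank line between consecutive subtitle blocks.
            block = '\n\n' + block
        previous = current
        return block

    return MARKER.sub(header, text).strip() + '\n'


if __name__ == '__main__':
    demo = ('_**00:00:03,000_++ Text is written here. '
            '_**00:00:07,031_++ More text is written here.')
    print(to_srt(demo), end='')
```

Using a replacement function with re.sub keeps the counter and the previous timestamp in one place, which is the part that a plain search and replace cannot do on its own.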
cheater (:
-
@Ekopalypse well, this certainly does the trick. Thank you a lot!
-
@WinterSilence Thank you, too for your input! I tried with RegEx, but it is just so much easier with Python…