Automatic text transform with incremental numbering

  • Hello there!
    I’m struggling to find an automated workflow for a large set of text input. I work in documentary film editing and have to turn provided texts into subtitles for a current film.

    I’m practically new to this kind of text editing, so please excuse my limited knowledge of RegEx and Notepad++ scripts.

    My input data usually looks like this:

    Text is written here.
    More text is written here.
    Text is written here.

    The goal is to transform that into this:

    00:00:00,000 --> 00:00:03,000
    Text is written here.
    00:00:03,000 --> 00:00:07,031
    More text is written here.
    00:00:07,031 --> 00:00:11,028
    Text is written here.

    The first steps are quite obvious:

    • search & replace all “_++” with nothing
    • search & replace all “_**” with “##\nv*#*”

    Now I have “##” as a placeholder for the incremental numbering in the next step, and “#” as another placeholder for later.
    The next step would be to read out the provided timestamp (XX:XX:XX,XXX) and write it into my placeholder #, one row at a time.
    This is where my understanding of simple automation ends, as I lack knowledge of Notepad++ and RegEx.

    Could you help me out, or do I need to use another tool/write a script in another language?
    Any lead is highly appreciated :)

  • _\*\*([:,\d]+)_\+\+ .+? _\*\*([:,\d]+)_\+\+ - this captures the time range in $1 and $2, but you must replace it in the same format and then convert the timestamp format in a next step
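
To see what the pattern above captures, here is a minimal Python sketch. The sample line is an assumption: the literal spaces in the pattern only match when both markers and the text sit on one line separated by single spaces, which may not be how the real input looks. The helper name `find_time_range` is made up for illustration.

```python
import re

# Pattern quoted from the reply above. The literal spaces around ".+?" mean
# it only matches when both markers and the text share one line.
TIME_RANGE = re.compile(r'_\*\*([:,\d]+)_\+\+ .+? _\*\*([:,\d]+)_\+\+')

def find_time_range(line):
    """Return (start, end) timestamps found in `line`, or None if no match."""
    m = TIME_RANGE.search(line)
    return (m.group(1), m.group(2)) if m else None

# Assumed sample layout, not taken from the original data set.
sample = '_**00:00:03,000_++ Text is written here. _**00:00:07,031_++'
print(find_time_range(sample))  # ('00:00:03,000', '00:00:07,031')
```

In Notepad++'s Replace dialog the same two groups would be available as $1 and $2.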

  • @Maxim-Abrossimow said in Automatic text transform with incremental numbering:


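    A Python script solution might look like this:

    from Npp import editor
    import re

    transformed_text = ''
    counter = 1
    current_timestamp = '00:00:00,000'
    # Template for one subtitle header: index, then "start --> end"
    replace_with = '{}\n{} --> {}\n'

    lines = editor.getText().splitlines(True)
    for line in lines:
        # Look for a timestamp of the form XX:XX:XX,XXX in the current line
        m = re.search(r'(\d\d:\d\d:\d\d,\d\d\d)', line)
        if m:
            # The previous timestamp is the start, the one just found is the end
            line = replace_with.format(counter, current_timestamp, m.group(1))
            counter += 1
            current_timestamp = m.group(1)
        transformed_text += line

    # Write the result back into the current document
    editor.setText(transformed_text)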
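
For trying the same idea outside Notepad++, the loop can be wrapped in a plain function that works on a string. This is only a sketch under assumptions: the function name `to_subtitles` is made up, and the sample input assumes each timestamp sits on its own line before the text it ends, which is what replacing the whole line implies.

```python
import re

TIMESTAMP = re.compile(r'(\d\d:\d\d:\d\d,\d\d\d)')

def to_subtitles(text, start='00:00:00,000'):
    """Replace every line containing a timestamp with 'N' and 'start --> end'."""
    out = []
    counter = 1
    current = start
    for line in text.splitlines(True):
        m = TIMESTAMP.search(line)
        if m:
            # Previous timestamp is the start, the one just found is the end.
            line = '{}\n{} --> {}\n'.format(counter, current, m.group(1))
            counter += 1
            current = m.group(1)
        out.append(line)
    return ''.join(out)

# Assumed input layout: marker lines alternate with text lines.
sample = (
    '_**00:00:03,000_++\n'
    'Text is written here.\n'
    '_**00:00:07,031_++\n'
    'More text is written here.\n'
)
print(to_subtitles(sample))
```

Because the whole matching line is replaced, any text sharing a line with a timestamp would be lost, so the input should be checked for that before running it on real data.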

  • cheater (:

  • @Ekopalypse well, this certainly does the trick. Thank you a lot!

  • @WinterSilence Thank you, too, for your input! I tried with RegEx, but it is just so much easier with Python…
