Automatic text transform with incremental numbering

  • M
    Maxim Abrossimow
    last edited by Jul 12, 2020, 12:05 PM

    Hello there!
    I’m struggling to find an automated workflow for a large set of text input. I work in documentary film editing and have to bring the provided texts into a current film as subtitles.

    I’m practically new to this kind of text editing, so please excuse my limited knowledge of RegEx and Notepad++ scripts.

    My input data usually looks like this:

    _**00:00:03,000_++
    Text is written here.
    _**00:00:07,031_++
    More text is written here.
    _**00:00:11,028_++
    Text is written here.
    

    The goal is to transform that into this:

    1
    00:00:00,000 --> 00:00:03,000
    Text is written here.
    2
    00:00:03,000 --> 00:00:07,031
    More text is written here.
    3
    00:00:07,031 --> 00:00:11,028
    Text is written here.
    

    The first steps are quite obvious:

    • search & replace all “_++” with nothing
    • search & replace all “_**” with “##\nv*#*”
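
The two cleanup steps above could also be done in a single pass outside the editor. What follows is only a minimal sketch, assuming every marker wraps a timestamp of the form HH:MM:SS,mmm. The names strip_markers and MARKED_TIMESTAMP are hypothetical and not from this thread, and the sketch leaves out the placeholder step.

```python
import re

# Hypothetical pattern: "_**" + timestamp + "_++", as in the sample input.
MARKED_TIMESTAMP = re.compile(r"_\*\*(\d\d:\d\d:\d\d,\d\d\d)_\+\+")


def strip_markers(text):
    """Reduce every _**HH:MM:SS,mmm_++ to the bare timestamp."""
    # \1 re-inserts only the captured timestamp, dropping both markers.
    return MARKED_TIMESTAMP.sub(r"\1", text)


sample = "_**00:00:03,000_++\nText is written here.\n"
cleaned = strip_markers(sample)
print(cleaned)
```

Lines without a marker pass through unchanged, so the subtitle text itself is not touched.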

    Now I have “##” as a placeholder for the incremental numbering in the next step, and “#” as another placeholder for later.
    The next step would be to read the provided timestamp (XX:XX:XX,XXX) and write it into the “#” placeholder, one row at a time.
    This is where my understanding of simple automation ends, because I don’t know enough about Notepad++ and RegEx.

    Could you help me out, or do I need to use another tool/write a script in another language?
    Any lead is highly appreciated :)

    • W
      WinterSilence
      last edited by Jul 12, 2020, 12:23 PM

      _\*\*([:,\d]+)_\+\+ .+? _\*\*([:,\d]+)_\+\+ gives you the time range in $1 and $2, but you must replace it in the same format; convert the format of the timestamps in the next step.
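
To illustrate how the capture group in the pattern above yields time ranges, here is a minimal sketch in plain Python rather than the Notepad++ replace dialog. It assumes the first range starts at 00:00:00,000, as in the desired output; time_ranges and STAMP are hypothetical names. Instead of matching two markers at once, it collects all timestamps and pairs each with its predecessor, which sidesteps the problem that consecutive ranges share a timestamp.

```python
import re

# One marked timestamp, captured without the "_**" and "_++" markers.
STAMP = re.compile(r"_\*\*([:,\d]+)_\+\+")


def time_ranges(text, start="00:00:00,000"):
    """Pair each timestamp with the one before it (hypothetical helper)."""
    stamps = STAMP.findall(text)
    # The start of each range is the previous timestamp; the first uses `start`.
    return list(zip([start] + stamps[:-1], stamps))


sample = (
    "_**00:00:03,000_++\nText is written here.\n"
    "_**00:00:07,031_++\nMore text is written here.\n"
)
for begin, end in time_ranges(sample):
    print(begin, "-->", end)
```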

      • E
        Ekopalypse @Maxim Abrossimow
        last edited by Ekopalypse Jul 12, 2020, 12:52 PM

        @Maxim-Abrossimow said in Automatic text transform with incremental numbering:

        00:00:00,000

        A Python script solution might look like this:

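        from Npp import editor
        import re

        # Matches a timestamp such as 00:00:03,000
        timestamp = re.compile(r'\d\d:\d\d:\d\d,\d\d\d')
        # Cue header: running number, then "start --> end"
        replace_with = '{}\n{} --> {}\n'

        transformed = []
        counter = 1
        # The first cue starts at zero
        current_timestamp = '00:00:00,000'

        for line in editor.getText().splitlines(True):
            m = timestamp.search(line)
            if m:
                # Replace the whole marker line with the cue header
                line = replace_with.format(counter,
                                           current_timestamp,
                                           m.group())
                counter += 1
                # This cue's end becomes the next cue's start
                current_timestamp = m.group()
            transformed.append(line)

        editor.setText(''.join(transformed))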
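
The script above depends on the Npp module, which is only available inside Notepad++ through the PythonScript plugin. To try the same loop elsewhere, it can be wrapped in a plain function. This is a sketch under that assumption; to_srt and its start parameter are hypothetical names, not part of the original answer.

```python
import re

TIMESTAMP = re.compile(r"\d\d:\d\d:\d\d,\d\d\d")


def to_srt(text, start="00:00:00,000"):
    """Same loop as the script above, without the Npp editor object."""
    out = []
    counter = 1
    previous = start
    for line in text.splitlines(True):
        m = TIMESTAMP.search(line)
        if m:
            # Replace the marker line with "<number>\n<start> --> <end>\n".
            line = "{}\n{} --> {}\n".format(counter, previous, m.group())
            counter += 1
            previous = m.group()
        out.append(line)
    return "".join(out)


sample = (
    "_**00:00:03,000_++\n"
    "Text is written here.\n"
    "_**00:00:07,031_++\n"
    "More text is written here.\n"
)
result = to_srt(sample)
print(result)
```

One caveat of this approach: any line that contains something shaped like a timestamp is treated as a marker line, so subtitle text that quotes a timecode would be replaced as well.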
        
        • W
          WinterSilence
          last edited by Jul 12, 2020, 2:41 PM

          cheater (:

          • M
            Maxim Abrossimow @Ekopalypse
            last edited by Jul 12, 2020, 4:19 PM

            @Ekopalypse well, this certainly does the trick. Thank you a lot!

            • M
              Maxim Abrossimow @WinterSilence
              last edited by Jul 12, 2020, 4:20 PM

              @WinterSilence Thank you, too, for your input! I tried it with RegEx, but it is just so much easier with Python…

              The Community of users of the Notepad++ text editor.