Copy or extract only part of text



  • I have multiple files with a set-up like this:

    My Friend, My Brother, My Son

    by Ael L. Bolt

    Category: Back to the Future
    Language: English
    Status: Completed
    Published: 2005-02-08
    Updated: 2005-02-20
    Packaged: 2013-08-25 09:57:35
    Rating: K+
    Chapters: 5
    Words: 4,835
    Publisher: www.fanfiction.net
    Story URL: http://www.fanfiction.net/s/2255042/1/
    Author URL: http://www.fanfiction.net/u/45054/Ael-L-Bolt
    Summary: George thinks on the mystery that is Marty Klein. [No slash, implied or otherwise.] [Was going to have eight installments, but I lost my notes, and it can stand as it is.]

    I need to copy the id-number from the Story URL line (2255042 in the example above) in all the files, and only that number. Can anyone help me with a regex?

    Thanks so much



  • Hello @tanja-correl,

    Hmm… You don’t say how many files you have, nor what their average size is.

    Also, do you mean that, in all your files, the id-number always sits on a line of the exact form below? (The dots stand for any text.)

    Story URL:......................./id-number/.../
    

    If so (that is, if only one /…/ zone follows the id-number before the end of the line), here is my first attempt:

    • Firstly, copy all your files into a new directory ( IMPORTANT )

    • Start N++ and open the Find in Files dialog ( Ctrl + Shift + F )

    • In the Find what: zone, type the regex (?s-i).+\RStory URL:(?-s).+/(.+)/.+/$(?s).+

    • In the Replace with: zone type \1

    • In the Filters zone choose *.txt or your own extension

    • In the Directory zone, indicate your new directory, containing a copy of all your files

    • Check the Regular expression search mode

    • Now, click on the Replace in Files button and confirm the Are you sure dialog

    => Each file should only contain, from now on, the id-number of its Story URL line
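
    If you would rather not modify the files at all, the same extraction can be sketched outside N++. The snippet below is only an illustration in Python, not part of the N++ procedure: the name extract_story_id is hypothetical, and it assumes the Story URL line starts at the beginning of a line and ends with /id-number/…/ as described above.

```python
import re

# Line-oriented equivalent of the "Story URL" part of the regex:
# capture the next-to-last path segment of the "Story URL:" line.
STORY_ID = re.compile(r"^Story URL:.+/(.+)/.+/$", re.MULTILINE)

def extract_story_id(text):
    """Return the id-number from the Story URL line, or None if absent."""
    match = STORY_ID.search(text)
    return match.group(1) if match else None

sample = (
    "Publisher: www.fanfiction.net\n"
    "Story URL: http://www.fanfiction.net/s/2255042/1/\n"
    "Author URL: http://www.fanfiction.net/u/45054/Ael-L-Bolt\n"
)
print(extract_story_id(sample))  # prints 2255042
```

    Because nothing is written back to disk here, the copy of the files is not strictly needed with this approach.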


    Now:

    • In the Find what: zone, type the simple regex .+

    • In the Replace with: zone, type $0 as a safeguard ( it rewrites each match unchanged if a replace is triggered by mistake )

    • Click on the Find All button

    => The Find result panel should appear, with all the id-numbers, in line 1, of all the files


    Finally, to get all the id-numbers in a single file, two solutions are possible :

    • First solution :

      • Right click, on any part of the Find result panel and select the Select All option

      • Right click, again, on any part of the Find result panel and select the Copy option

      • Open an N++ new tab ( Ctrl + N )

      • Paste the previous selection, in that new tab ( Ctrl + V )

    => Only the id-numbers are copied !

    • Second solution :

      • Right click, on any part of the Find result panel and select the Select All option

      • Hit the classical Ctrl + C shortcut

      • Open an N++ new tab ( Ctrl + N )

      • Paste the previous selection, in that new tab ( Ctrl + V )

    => This time, each id-number is written along with the name of its associated file !
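
    For comparison, here is a rough Python sketch that builds both kinds of list (id-numbers alone, or id-numbers with file names) directly from a directory. The names collect_ids and format_ids are hypothetical, and the sketch assumes *.txt files encoded in UTF-8; adjust both to your own files.

```python
import re
from pathlib import Path

# Capture the next-to-last path segment of the "Story URL:" line.
STORY_ID = re.compile(r"^Story URL:.+/(.+)/.+/$", re.MULTILINE)

def collect_ids(directory, pattern="*.txt"):
    """Return (file name, id-number) pairs, sorted by file name.

    Files without a matching Story URL line are skipped.
    Assumption: the files are UTF-8 text.
    """
    pairs = []
    for path in sorted(Path(directory).glob(pattern)):
        match = STORY_ID.search(path.read_text(encoding="utf-8"))
        if match:
            pairs.append((path.name, match.group(1)))
    return pairs

def format_ids(pairs, with_names=False):
    """One id-number per line, optionally preceded by its file name and a tab."""
    if with_names:
        return "\n".join(f"{name}\t{story_id}" for name, story_id in pairs)
    return "\n".join(story_id for _, story_id in pairs)
```

    Calling format_ids(collect_ids("your-directory")) would correspond to the first solution, and passing with_names=True to the second one.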


    Notes on the initial regex :

    • From the very beginning of the file, due to the modifiers (?s-i), the regex .+\RStory URL: grabs all characters ( including EOL ones ) up to the EOL character(s) followed by the string Story URL:, matched in that exact case

    • Then, due to the modifier (?-s), the regex .+/(.+)/.+/$ matches the remainder of the Story URL line, capturing the id-number (.+) in group 1

    • The final part (?s).+ matches all the lines after the Story URL line, till the very end of the current file

    • So, in replacement, the whole contents of each file are simply replaced by the id-number alone, \1
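
    To see the three parts of that regex at work outside N++, here is an approximate translation to Python's re module. It is a sketch under assumptions, not the N++ regex itself: Python has no \R, so \n is used ( line endings assumed already normalised ), the mid-pattern modifiers become scoped groups (?s:…), and the function name reduce_to_id is hypothetical.

```python
import re

# Part 1: (?s:.+)\nStory URL:  everything up to the Story URL line
# Part 2: .+/(.+)/.+/$         rest of that line, id-number in group 1
# Part 3: (?s:.+)              everything after, to the end of the text
WHOLE_FILE = re.compile(
    r"(?s:.+)\nStory URL:.+/(.+)/.+/$(?s:.+)",
    re.MULTILINE,  # lets $ match at the end of the Story URL line
)

def reduce_to_id(text):
    """Replace the whole text by the id-number; leave it unchanged if no match."""
    return WHOLE_FILE.sub(r"\1", text)

sample = (
    "My Friend, My Brother, My Son\n"
    "\n"
    "Publisher: www.fanfiction.net\n"
    "Story URL: http://www.fanfiction.net/s/2255042/1/\n"
    "Author URL: http://www.fanfiction.net/u/45054/Ael-L-Bolt\n"
)
print(reduce_to_id(sample))  # prints 2255042
```

    Python matching is case-sensitive by default, which plays the role of the -i modifier here.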

    Best Regards

    guy038

