Add the chapter number to each line with a regular expression



  • Given the following file :

    Chapter 1
    1 line 1
    2 line 2
    3 line 3

    Chapter 2
    1 line 4
    2 line 5

    I would like to prefix each line number with its chapter number :
    Chapter 1
    1-1 line 1
    1-2 line 2

    Chapter 2
    2-1 line 4

    Is it possible using regular expressions ?
    Thanks for any insight.



  • @Trucide-Polère

    A solution with two regular expressions might look like this (not really professional, but hopefully useful).

    First, change the line that directly follows each Chapter line:

    find: (Chapter\s)(\d+)(\s*\r\n)(\d+)
    replace: \1\2\3\2-\4
    

    Next, change the remaining lines:

    find: (\d+)(-\d+.*?\s*\r\n)(\d+)(\s)
    replace: \1\2\1-\3\4
    

    The first regex looks for
    Chapter followed by a single whitespace character (Chapter\s), followed by
    one or more digits (\d+), followed by
    optional whitespace and a Windows line ending (\s*\r\n),
    followed by one or more digits (\d+).

    Each pair of parentheses is a capture group, referenced in the replacement as \1, \2 and so on.
    So \1 holds what (Chapter\s) matched and \2 what the first (\d+) matched …
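To make the group numbering concrete, here is a minimal Python sketch of that first replacement. It is only an illustration: the thread is about the Notepad++ Replace dialog, the sample text is the one from the question, and the `\r\n` line endings and the names `text`, `FIRST` and `step1` are assumptions made for the example.

```python
import re

# Sample from the question, assuming Windows (\r\n) line endings.
text = (
    "Chapter 1\r\n1 line 1\r\n2 line 2\r\n3 line 3\r\n"
    "\r\n"
    "Chapter 2\r\n1 line 4\r\n2 line 5\r\n"
)

# Same four groups as the find expression; group 2 is the chapter number.
FIRST = re.compile(r"(Chapter\s)(\d+)(\s*\r\n)(\d+)")

# \1\2\3 put the matched text back, then \2-\4 prefixes the line number
# with the chapter number.
step1 = FIRST.sub(r"\1\2\3\2-\4", text)
print(step1)
```

Only the line directly below each Chapter line is changed here (`1 line 1` becomes `1-1 line 1`, `1 line 4` becomes `2-1 line 4`); the other lines are left for the second regex.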

    The second regex, (\d+)(-\d+.*?\s*\r\n)(\d+)(\s), means
    one or more digits (\d+), followed by
    a dash, one or more digits, any characters (non-greedy), optional whitespace and a Windows line ending (-\d+.*?\s*\r\n), followed by
    again one or more digits (\d+), followed by
    a single whitespace character (\s).

    The second regex has to be run repeatedly, because each pass numbers only one more line per chapter. Repeat it until nothing is replaced any more.
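The need to repeat the second replacement can be sketched in Python as a loop that stops once a pass changes nothing. This is an illustration under assumptions, not part of the Notepad++ workflow: the input is the question's sample as it looks after the first replacement, with `\r\n` line endings, and `number_remaining_lines` is a hypothetical helper name.

```python
import re

# Second find expression from the post, unchanged.
SECOND = re.compile(r"(\d+)(-\d+.*?\s*\r\n)(\d+)(\s)")

def number_remaining_lines(text):
    """Apply the second replacement until a pass changes nothing."""
    while True:
        new_text = SECOND.sub(r"\1\2\1-\3\4", text)
        if new_text == text:
            return text
        text = new_text

# The sample after the first replacement (assumed \r\n line endings).
after_first = (
    "Chapter 1\r\n1-1 line 1\r\n2 line 2\r\n3 line 3\r\n"
    "\r\n"
    "Chapter 2\r\n2-1 line 4\r\n2 line 5\r\n"
)

result = number_remaining_lines(after_first)
print(result)
```

Each pass can only extend the numbering by one line per chapter, because a match needs an already prefixed line directly above an unprefixed one. The number of passes therefore grows with the longest chapter.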

    Make a backup before testing.

    Cheers
    Claudia



  • Thanks.

    Good idea !

    but… there are a lot (and I mean a lot!) of lines in my file, and it only replaces one line per chapter. Anyway, it’s a good start, and at least I only have to replace the line for one chapter…

    Great thanks,
    Cheers
    Pierre



  • @Trucide-Polère

    If it is only one chapter, then a much faster approach would be to use
    block (column) mode instead of a regex.
    - Put the cursor just before the first line to change.
    - Scroll, using the scrollbar, to the last line to change.
    - Hold SHIFT+ALT and click in front of the last line;
      you should see a vertical line.
    - Type whatever you need, and all lines get edited.
    Done.

    Cheers
    Claudia



  • Hello, Trucide Polère and Claudia,

    Trucide, I also found two S/R ( search/replace operations ) that achieve what you want but, compared to Claudia’s solution, the second S/R needs to be run only once :-))) So, just click on the Replace All button, once for each S/R !

    My solution works :

    • Whatever the location of the word “Chapter”, in current line

    • Whatever the number following the word “Chapter” ( the chapters need not be in numeric order )

    • Whatever the case of the word “Chapter”

    • If blank line(s) is(are) inserted between two chapters

    • If blank line(s) is(are) inserted between two lines of a chapter

    Remark : These regexes need an extra character, which must NOT exist, yet, in your file. I chose the # symbol but any other symbol may be used. Just escape it if this symbol is a special regex character !


    So, given the example text, below, with 3 blank lines after “line 10” :

                 Chapter 156
    
    line 1
        line 2
    
    
    line 3
            line 4
    line 5
    
    Chapter 2
    
    line 5
    line 6
    
    
    This is a test : chapter 37
    
    line 7
    
    
    
        line 8
        line 9
    line 10
    

    The first regex S/R, below, adds a line made of a # symbol followed by the number of the current chapter, just before the line containing the next word chapter ( whatever its case ) OR at the very end of the file :

    SEARCH (?i)Chapter\h+(\d+)(?s).+?(?=(?-s).*Chapter|\z)

    REPLACE $0#\1\r\n

    So it gives the changed text, below :

                 Chapter 156
    
    line 1
        line 2
    
    
    line 3
            line 4
    line 5
    
    #156
    Chapter 2
    
    line 5
    line 6
    
    
    #2
    This is a test : chapter 37
    
    line 7
    
    
    
        line 8
        line 9
    line 10
    
    
    
    #37
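For readers who want to try the marker step outside Notepad++, here is a rough Python equivalent. It is a sketch under assumptions: Python's `re` has no `\h` or `\z` and does not accept a flag change in the middle of a pattern, so `[ \t]+`, `\Z` and the scoped group `(?s:...)` stand in for them. The sample is a shortened version of the example text, `\n` line endings are assumed, and `add_markers` is a hypothetical name.

```python
import re

# Approximate translation of the first S/R:
#   \h+ -> [ \t]+,  (?s).+? -> (?s:.+?),  \z -> \Z,  $0 -> \g<0>
MARK = re.compile(r"(?i)Chapter[ \t]+(\d+)(?s:.+?)(?=.*Chapter|\Z)")

def add_markers(text):
    """Append a '#<chapter number>' line at the end of every chapter."""
    return MARK.sub(r"\g<0>#\1\n", text)

# Shortened sample, assuming \n line endings.
sample = (
    "   Chapter 156\n"
    "\n"
    "line 1\n"
    "    line 2\n"
    "\n"
    "Chapter 2\n"
    "line 5\n"
    "\n"
    "This is a test : chapter 37\n"
    "line 7\n"
)

marked = add_markers(sample)
print(marked)
```

As in the example above, each marker lands just before the line holding the next chapter word, and the last one lands at the very end of the text.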
    

    The second regex S/R, below :

    • searches for a complete line beginning with a # symbol, and deletes it

    • searches for a non-empty line which does not contain the word Chapter, whatever its case, and which is followed, further on, by the nearest # symbol with its number, stored as group 1. During replacement, this number, followed by a dash, is added in front of the line

    SEARCH (?i-s)^#.+\R|(?!.*Chapter)^.+(?=(?s).+?#(\d+))
    REPLACE ?1\1-$0

    So, we obtain the final text, below :

                 Chapter 156
    
    156-line 1
    156-    line 2
    
    
    156-line 3
    156-        line 4
    156-line 5
    
    Chapter 2
    
    2-line 5
    2-line 6
    
    
    This is a test : chapter 37
    
    37-line 7
    
    
    
    37-    line 8
    37-    line 9
    37-line 10
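The second S/R can be approximated in Python as well. Again this is only a sketch under assumptions: `\R` becomes `\n`, the `m` flag is added so that `^` matches at every line start, the mid-pattern `(?s)` becomes the scoped group `(?s:...)`, and the conditional replacement `?1\1-$0` is emulated by a small function. The input is a shortened sample that already carries the `#` marker lines, and `apply_markers` is a hypothetical name.

```python
import re

# Alternative 1: a whole marker line (deleted).
# Alternative 2: a non-empty line without "Chapter"; the lookahead stores
# the number of the nearest following marker in group 1.
NUMBER = re.compile(r"(?im)^#.+\n|(?!.*Chapter)^.+(?=(?s:.+?)#(\d+))")

def apply_markers(marked):
    def repl(match):
        if match.group(1) is None:      # marker line: remove it
            return ""
        return match.group(1) + "-" + match.group(0)
    return NUMBER.sub(repl, marked)

# Shortened sample as it looks after the marker step (\n line endings).
marked = (
    "   Chapter 156\n"
    "\n"
    "line 1\n"
    "    line 2\n"
    "\n"
    "#156\n"
    "Chapter 2\n"
    "line 5\n"
    "\n"
    "#2\n"
    "This is a test : chapter 37\n"
    "line 7\n"
    "#37\n"
)

final = apply_markers(marked)
print(final)
```

One pass is enough because every line finds its chapter number by looking ahead to the marker, not by relying on the line above it. The lookahead scans forward from every line, so on a very large file this sketch may be slow.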
    

    Best Regards,

    guy038

    Note : as usual :

    • The modifier (?-s) means that any further dot character ( . ) stands for any single standard character, except for an End of Line character

    • The modifier (?s) means that any further dot character ( . ) stands for any single character, including an End of Line character

    • The modifier (?i) means that the regex search is performed in a case-insensitive way

    • The modifier (?-i) means that the regex search is performed in a case-sensitive way
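The effect of these modifiers can be checked quickly in Python, whose `re` module uses the same letters. This is a small sketch, assuming Python rather than Notepad++: Python only accepts a bare `(?s)` or `(?i)` at the start of a pattern, so the case-sensitive part uses the scoped form `(?-i:...)`. The variable names are made up for the example.

```python
import re

sample = "Chapter 1\nline 1"

# Default behaviour, like (?-s): the dot stops at the end of the line.
dot_default = re.match(r".+", sample).group()

# (?s): the dot also matches the End of Line character.
dot_all = re.match(r"(?s).+", sample).group()

# (?i): case-insensitive search.
any_case = re.findall(r"(?i)chapter", "Chapter CHAPTER chapter")

# (?-i:...): this part is case-sensitive again, inside an (?i) pattern.
exact_case = re.findall(r"(?i)(?-i:Chapter) \d", "Chapter 1 chapter 2")

print(repr(dot_default), repr(dot_all), any_case, exact_case)
```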

