Padding the result of a regular expression search



  • Hi regex ninjas,
    I have some source code that I want to “pretty-up”.
    Basically, I’m searching for something like this:
    ;;
    ;; foo bar
    ;;

    and I want it to be padded, so it looks like this:
    ;; foo bar ===============
    ;;

    –> current search reg ex: ;;\n;; (.+)\n
    –> current replace reg ex: ;; \1 ===============\n (I know I’m replacing the first “;;\n” as well, that’s on purpose)

    Unfortunately, I cannot just append a fixed run of “=”, because I don’t know beforehand how many are needed: the length of “foo bar” is not known, e.g.:
    ;; foo ===================
    ;;
    (this would need 4 more “=” because of " bar" missing)

    I found some solutions specifically for numbers (i.e. “foo” and “bar” are digits), but am unable to work it out.

    Your help is highly appreciated, thanks in advance!
    Evert
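
For comparison outside Notepad++, the padding arithmetic in the question can be sketched in Python, where the replacement may be a function that sees each match. This is only an illustration, not a Notepad++ feature: the name `pad_comment` and the target width of 26 columns (the length of the desired `;; foo bar` line with its 15 equal signs) are assumptions.

```python
import re

WIDTH = 26  # assumed target width: len(";; foo bar ") + 15 equal signs


def pad_comment(text: str, width: int = WIDTH) -> str:
    """Collapse ';;\\n;; title\\n' into ';; title ====\\n', padded to `width` columns."""

    def repl(m: re.Match) -> str:
        head = f";; {m.group(1)} "
        # The number of "=" depends on the title length, which is the
        # part a fixed replacement string cannot express.
        return head + "=" * max(0, width - len(head)) + "\n"

    return re.sub(r"^;;\n;; (.+)\n", repl, text, flags=re.MULTILINE)


print(pad_comment(";;\n;; foo\n;;\n"), end="")
```

A shorter title simply gets more equal signs, so `;; foo` receives 19 of them where `;; foo bar` receives 15.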



  • @EvertDB ,

    This can be done in 3 distinct step groupings:

    First,

    • On the first line of your file, lengthen the line by adding space characters out beyond (by a few) the column you want your last = character to end up in. This should be further to the right than any existing line in the file has data. I used column 82 for this example.
    • Make a mental note as to where horizontally the caret is on the screen. :-D
    • Invoke the Edit (menu) -> Begin/End Select command (beginning the selection).
    • Move to the last line of the file.
    • Alt+LeftClick at about the same horizontal position on the screen as noted before, on the last line.
    • Invoke the Edit (menu) -> Begin/End Select command again (ending the selection). Ideally you will obtain a zero-width tall skinny column caret, but it is okay if it has some small width to it (if you were not spot-on about being in the same column each time you invoked the command).
    • Insert a space; all lines will now be space-padded out to the length of the caret column.

    Second,

    • Move the caret back to the top of the file.
    • Perform the following Replace-All, making the number of = in the replacement greater than the number of columns you want the longest line to be. A lot greater is fine. I used 85 below.
    • Find-what box: (?-s)^(;;\h(?=\H).+?)\h+$
    • Replace-with box: \1 =====================================================================================
    • Search mode: Regular expression

    Third,

    • Move the caret back to the top of the file (actually, it should still be there).
    • Perform the following Replace-All, making the number in the braces exactly the length of the longest line you want to end up with.
    • Find-what box: (?-s)^(?=;;\h\H)(.{80}).*
    • Replace-with box: \1
    • Search mode: Regular expression

    Extras:

    • At this point I would do Edit (menu) -> Blank Operations -> Trim Trailing Space to remove spurious space added in the first steps.
    • Thinking about it, setting up Settings (menu) -> Preferences -> Editing -> Vertical Edge Settings -> Show Vertical Edge (check it) and setting Number of columns to an appropriate value (80 for my example) makes some of this column stuff in the first steps easier.
    • This solution doesn’t involve Tab characters as those were not mentioned as being used by the OP. I’m not thinking about tab characters because they are evil (no flame wars, please) and I don’t use them, except when forced to, in Make files for example.

    If this (or ANY) posting was useful, don’t post a “thanks”, just upvote ( click the ^ in the ^ 0 v area on the right ).



  • I think my earlier solution was overly complicated. :(
    Hopefully this revised one isn’t the same. @guy038 will tell us if it is. :-D
    Regardless, I think this takes TWO replacement operations, which is probably more than the OP wanted, but again maybe it can be done in one?

    If the original data doesn’t have trailing spaces on lines, then it becomes just a couple of steps:

    First,

    Make sure the caret is at the top of the file.
    Perform the following Replace-All operation, making the number of = in the replacement greater than the number of columns you want the longest line to be. A lot greater is fine. I used 85 below.
    Find-what box: (?-s)^;;\h.*
    Replace-with box: $0 =====================================================================================
    Search mode: Regular expression

    Second,

    Make sure the caret is at the top of the file.
    Perform the following Replace-All operation, making the number in the braces exactly the length of the longest line you want to end up with.
    Find-what box: (?-s)^(?=;;\h)(.{80}).*
    Replace-with box: \1
    Search mode: Regular expression

    Note: The \h in the regular expressions above is for “horizontal whitespace”; it ensures we only work on lines that have a space or tab after the ;;, so bare ;; lines are left alone.
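
The same append-then-truncate idea can be tried outside the editor. Here is a minimal Python sketch, assuming `\n` line endings and no trailing spaces; `pad_two_pass` and its parameters are made-up names, and `[ \t]` stands in for `\h`, which Python's `re` module does not provide.

```python
import re


def pad_two_pass(text: str, width: int = 80, filler: int = 85) -> str:
    # Pass 1: append a space and a generous run of "=" to every ";; ..." line.
    text = re.sub(r"^;;[ \t].*",
                  lambda m: m.group(0) + " " + "=" * filler,
                  text, flags=re.MULTILINE)
    # Pass 2: keep only the first `width` characters of those lines.
    return re.sub(r"^(?=;;[ \t])(.{%d}).*" % width, r"\1",
                  text, flags=re.MULTILINE)
```

The filler only has to be long enough that every padded line exceeds `width`; the second pass decides the final length.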




  • Hello, @evertdb, @scott-sumner and All,

    Here is my contribution to the general problem of padding at the end of lines. As in Scott’s solution, above, it uses two consecutive regex S/R.

    So, EvertDB, let’s suppose the original test example, below :

    ;; 1
    ;; 12
    ;; 123
    ;; 1234
    ;; 12345
    ;; 123456
    ;; 1234567
    ;; 12345678
    ;; 123456789
    ;; 1234567890
    ;; 12345678901
    ;; 123456789012
    ;; 1234567890123
    ;; 12345678901234
    ;; 123456789012345
    ;; 1234567890123456
    ;; 12345678901234567
    ;; 123456789012345678
    ;; 1234567890123456789
    ;; 12345678901234567890
    ;; 123456789012345678901
    

    As I suppose that the two leading semicolons are the line-comment syntax in your language, perform the first S/R, below :

    SEARCH (?-s)^;;.+

    REPLACE $0 ==============================

    Notes :

    • First, due to the (?-s) syntax, the special dot character will stand, strictly, for any single standard character ( not an End of Line character )

    • Then, in the searched part, we’re just looking for two semicolons, at beginning of lines, followed by a non-null amount of standard characters

    • In replacement, we simply rewrite the complete searched match, followed by a space character and 30 equal signs

    • In this test example, at least 21 equal signs must be added, so that the shortest line reaches the target width. But you don’t have to bother about estimating that minimum. Just add a large enough amount of this character, at the end of the replacement regex

    So, we get the modified text, below :

    ;; 1 ==============================
    ;; 12 ==============================
    ;; 123 ==============================
    ;; 1234 ==============================
    ;; 12345 ==============================
    ;; 123456 ==============================
    ;; 1234567 ==============================
    ;; 12345678 ==============================
    ;; 123456789 ==============================
    ;; 1234567890 ==============================
    ;; 12345678901 ==============================
    ;; 123456789012 ==============================
    ;; 1234567890123 ==============================
    ;; 12345678901234 ==============================
    ;; 123456789012345 ==============================
    ;; 1234567890123456 ==============================
    ;; 12345678901234567 ==============================
    ;; 123456789012345678 ==============================
    ;; 1234567890123456789 ==============================
    ;; 12345678901234567890 ==============================
    ;; 123456789012345678901 ==============================
    

    Now, we just have to delete the extra equal signs at the end of each line. To do so, this second S/R needs the number of characters to preserve on each line, after the two semicolons !

    From your example, below, you can visually determine that this number is 24 :

    ;; foo bar ===============
      123456789012345678901234
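
If counting by eye is error-prone, the same number can be derived from the desired line. A tiny Python sketch, assuming the target line carries 15 equal signs as in the original question:

```python
# The desired line from the question, built explicitly so the count is visible.
target = ";; foo bar " + "=" * 15

# Characters to preserve after the two leading semicolons.
keep = len(target) - len(";;")
print(keep)  # 24
```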
    

    So, we’ll use the following regex S/R, below :

    SEARCH (?-s)^;;.{24}\K.+

    REPLACE Leave Empty

    Notes :

    • For the first part (?-s), just refer to the notes, above

    • The part ^;;.{24} looks, from beginning of each line ( ^ ), for two semicolons, followed by the next 24 characters

    • Then the part \K discards everything matched so far from the final match, so these leading characters are left untouched

    • Therefore, the final regex match is, simply, .+, which stands for any non-null amount of standard characters, after the absolute location 24, till the end of each line

    • Empty replacement regex means that this ending amount of characters is just deleted

    IMPORTANT :

    For this second S/R, due to the \K regex feature, you must use the Replace All button exclusively ( NOT the step-by-step Replace button ! )
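
For readers who want to test the two S/R outside Notepad++, here is a hedged Python sketch. Python's `re` module does not support `\K`, so a capturing group rewritten with `\1` plays the same role; `pad_and_trim` and its parameters are hypothetical names, and `\n` line endings are assumed.

```python
import re


def pad_and_trim(text: str, keep: int = 24, filler: int = 30) -> str:
    # First S/R: rewrite each non-empty comment line, followed by a space
    # and `filler` equal signs (the $0 of the replacement is m.group(0) here).
    text = re.sub(r"^;;.+",
                  lambda m: m.group(0) + " " + "=" * filler,
                  text, flags=re.MULTILINE)
    # Second S/R: keep ";;" plus the next `keep` characters, delete the rest.
    # The group + \1 replaces the \K used in the Notepad++ regex.
    return re.sub(r"^(;;.{%d}).+" % keep, r"\1", text, flags=re.MULTILINE)
```

Because the whole match is replaced in one call, the Replace All restriction tied to `\K` has no counterpart in this sketch.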

    And, we obtain the final text, with a lined-up padding of equal characters, at the end of each line :

    ;; 1 =====================
    ;; 12 ====================
    ;; 123 ===================
    ;; 1234 ==================
    ;; 12345 =================
    ;; 123456 ================
    ;; 1234567 ===============
    ;; 12345678 ==============
    ;; 123456789 =============
    ;; 1234567890 ============
    ;; 12345678901 ===========
    ;; 123456789012 ==========
    ;; 1234567890123 =========
    ;; 12345678901234 ========
    ;; 123456789012345 =======
    ;; 1234567890123456 ======
    ;; 12345678901234567 =====
    ;; 123456789012345678 ====
    ;; 1234567890123456789 ===
    ;; 12345678901234567890 ==
    ;; 123456789012345678901 =
    

    Et voilà !


    Finally, EvertDB, to be exact, we need to get rid of any empty comment line, located right above each real comment-line. To that purpose, use the regex S/R, below :

    SEARCH (?-s)^;;\R(?=^;;.+)

    REPLACE Leave Empty

    Notes :

    • For the first part (?-s), just refer to the notes, above

    • The part ^;;\R looks for two semicolons, at beginning of each line, immediately followed by End of Line character(s), whatever it is/they are !

    • The part (?=........) is a positive look-ahead, in other words, a condition which must be true, for an overall match

    • The condition, to respect, is the regex ^;;.+, which represents a non-empty comment line, in your language ( two semicolons, followed by a non-null amount of standard characters, before the end of the line )

    • Due to the empty replacement, the matched text ( the null-comment line ) is, simply, deleted
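
The look-ahead version of this clean-up can also be sketched in Python. Two differences are assumptions of the sketch, not part of the Notepad++ regex: Python has no `\R`, so the line break is spelled out as `\r\n|\n|\r`, and the `^` inside the look-ahead is dropped because the match has just consumed a line break. The name `drop_empty_comment_above` is made up.

```python
import re


def drop_empty_comment_above(text: str) -> str:
    # Delete a bare ";;" line only when the next line is a non-empty comment.
    return re.sub(r"^;;(?:\r\n|\n|\r)(?=;;.+)", "", text, flags=re.MULTILINE)
```

The bare `;;` line that follows a title is kept, because the look-ahead finds no non-empty comment line after it.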

    Best Regards,

    guy038

    P.S. :

    For padding characters, at beginning of a list, you may refer to the topic, below :

    https://notepad-plus-plus.org/community/topic/13988/find-replace-issues/5

