Auto insert break text every x amount of lines

  • A
    Andrew Casey
    last edited by Oct 20, 2023, 2:35 AM

    Hi there

    Hope I can get some help

    I have a large text document and want to auto-insert some break text every 200 lines

    (ie lines 200, 400, 600 etc)

    The break text I wanted to use is a bunch of equal signs (ie like)

    =========================================

    Hope someone can point me in the right direction

    Thanks for the help

    • C
      Coises
      posted Oct 20, 2023, 5:01 AM, last edited by Coises Oct 20, 2023, 6:43 AM

      EDIT: @Andrew-Casey Ignore this post and see the post from @Terry-R below.

      I tried the obvious way to do this — using a regular expression to match groups of 200 lines — and it failed with an error message about the complexity of the search being too great. But I can think of a way, if what you want to do is have 200 lines of text, then a separator, then 200 lines of text, then a separator, etc. (meaning the separators will follow the original lines 200, 400, 600, etc, putting them at lines 201, 402, 603, etc. in the changed file).

      Put the cursor at the beginning of the file. Select Edit | Column Editor… from the menu. Select Number to Insert, fill in:
      Initial number: 0
      Increase by: 5
      Repeat: 1
      Leading: Zeros
      and be sure Format is Dec; then click OK.

      Make note of how many digits were added at the beginning of each line.

      Now select Search | Replace… from the menu and fill in:

      Find what : ^..000
      Replace with : -----==========\r\n-----

      but adjust the number of dots in the Find what expression to be three less than the number of digits that were added to each line, and adjust the number of dashes at the beginning and at the end of the Replace with string so that each group has the same number of dashes as there were digits added to each line. (The sample above would be correct if there were five digits added to each line.) Use however many equals signs you want.

      Set the Search Mode to Regular expression and click Replace All.

      You’ll have a line of equal signs you probably don’t want at the beginning of the file. You can delete that.

      Now, put the cursor at the very beginning of the file. Scroll all the way to the bottom using the scroll bar (don’t click in the file). Now, hold down the Shift and Alt keys and click between the end of the added number and the beginning of the original text on the very last line. That will create a rectangular selection enclosing the added numbers. Then press the Delete key.
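The selection logic behind this numbering trick can be checked outside Notepad++. The following is a minimal Python sketch, not Notepad++ code: `column_numbers` and `lines_needing_separator` are hypothetical helper names, and padding to the width of the largest number is an assumption about how the Column Editor behaves when Leading is set to Zeros.

```python
def column_numbers(line_count, start=0, step=5):
    """Numbers added to each line: start, start+step, ... (zero-padded).

    Assumption: padding width equals the width of the largest number.
    """
    values = [start + step * i for i in range(line_count)]
    width = len(str(values[-1]))
    return [str(v).zfill(width) for v in values]


def lines_needing_separator(line_count):
    """1-based numbers of the lines whose added number ends in 000."""
    numbers = column_numbers(line_count)
    return [i + 1 for i, n in enumerate(numbers) if n.endswith("000")]


print(lines_needing_separator(1312))
```

For a 1312-line file this prints `[1, 201, 401, 601, 801, 1001, 1201]`: a separator goes before lines 201, 401 and so on (that is, after original lines 200, 400, ...), plus the unwanted one before line 1. In this case only four digits are added, so the Find what expression would need a single dot (`^.000`).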

      • T
        Terry R @Coises
        posted Oct 20, 2023, 5:59 AM, last edited by Terry R Oct 20, 2023, 6:03 AM

        @Coises said in Auto insert break text every x amount of lines:

        and it failed with an error message about the complexity of the search being too great.

        I am interested in seeing what your regular expression (regex) was as when I tested my version, it worked fine.

        My regex is below (@Andrew-Casey, make sure the search mode is Regular expression and click Replace All):
        Find What: (?-s)^((.+)?\R){200}\K
        Replace With: ========================\r\n

        Did you use \K? Leaving it out may be what overloaded the regex engine.

        I also considered another version
        Find What: (?-s)^(((.+)?\R){20}\K){10}, with the same Replace string as above. The only benefit might be that the regex engine has to store only a subset of the lines before being reset by \K; the two counts (20 and 10) multiply to the required total of 200 lines.

        Terry

        PS my test line length was about 240 characters (every line)
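For anyone who wants to try the same idea in a script, here is a rough Python equivalent. It is only a sketch: Python's standard `re` module has no `\K` or `\R`, so the matched block is re-emitted through a replacement function (comparable to using `$0` in the Replace With field) and the line ending is written as `\r?\n`. The name `insert_every` and the 24-character separator are illustrative choices.

```python
import re

SEPARATOR = "=" * 24  # arbitrary length, chosen for illustration


def insert_every(text, n=200, sep=SEPARATOR, eol="\r\n"):
    """Append `sep` after every complete block of `n` lines in `text`."""
    # re.MULTILINE lets ^ match at each line start; '.' never matches '\n'
    # unless re.DOTALL is set, which mirrors the (?-s) prefix.
    pattern = re.compile(r"(?:^.*\r?\n){%d}" % n, re.MULTILINE)
    return pattern.sub(lambda m: m.group(0) + sep + eol, text)


sample = "".join("line %d\r\n" % i for i in range(1, 451))
result = insert_every(sample).split("\r\n")
print(result[199], "|", result[200][:5], "|", result[201])
```

With 450 input lines the separators end up on output lines 201 and 402, and the trailing 50 lines are left without one.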

        • C
          Coises @Terry R
          last edited by Oct 20, 2023, 6:42 AM

          @Terry-R said in Auto insert break text every x amount of lines:

          I am interested in seeing what your regular expression (regex) was as when I tested my version, it worked fine.

          Yours works fine for me, too.

          This is all very strange, and I haven’t yet figured out what is happening. My expression was:
          ((.*?)\r\n){199}
          replacing with:
          $0=========================================\r\n
          and the test file was 1312 lines, averaging around 150 characters each.

          Here’s where I’m baffled. That was in 8.5.7 64-bit. A little while after I wrote my comment, I happened to try the same thing in 8.4.8 32-bit, and it worked.

          So far, I’ve followed the code as far as seeing that it’s an error from boost::regex, and it’s at least somewhat related to a limiting value stored in a ptrdiff_t that at least some of the time is set to a hard-coded value of 100000000. That value is within the range of a signed 32-bit ptrdiff_t (maximum 2147483647), so overflow of the constant itself would not explain the difference between the 32-bit and 64-bit builds.

          • T
            Terry R @Coises
            posted Oct 20, 2023, 7:28 AM, last edited by Terry R Oct 20, 2023, 7:36 AM

            @Coises said in Auto insert break text every x amount of lines:

            Here’s where I’m baffled

            I recall a conversation some time ago, possibly a few years back, in which some posters (I think mostly the experienced ones) ran into similar issues with overwhelming the regex engine. I haven’t located it yet, but if I recall correctly @guy038 also posted in that thread.

            I think the outcome suggested that this issue cannot be predicted solely from the volume of characters processed. The environment (plugins, undo functionality and other settings) could all play a part in whether the error occurs.

            I can see your regex doesn’t use \K which may have affected your outcome although I’m not sure that hypothesis could be verified easily and be reproducible.

            Terry

            • M
              mkupper @Terry R
              last edited by Oct 20, 2023, 4:02 PM

              The thread “Is 27333 magical?” is one of the earlier threads that links to other threads on the topic of mysterious failures.

              For the OP I was thinking in terms of skipping lines without saving anything and so used both a non-capturing group and \K:
              Search: (?-s)^(?:.*?\R){199}\K
              Replace: =========================================\r\n
              This took about one second on both x32 and x64 builds of v8.5.8 on a 100,000 line file with each line having 500 characters (a 50,200,000 byte file).

              The same expression with (?s) instead of (?-s) also worked but then reported “Invalid Regular Expression.” That puzzled me, as I had intentionally used .*? so that I could test the same expression in both dot-not-matches-newline and dot-matches-newline modes. This works, as I knew the scanner would stop at the newlines: (?-s)^(?:.*\R){199}\K
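The "skip lines without saving anything" idea also translates to a streaming loop outside the editor. This is a minimal sketch under stated assumptions: `with_breaks` and `SEP` are hypothetical names, the separator length is arbitrary, and the input is any iterable of lines without their line endings. As with `{199}`, the separator is emitted after every 199 input lines, so it occupies output lines 200, 400, 600 and so on.

```python
SEP = "=" * 40  # arbitrary length, chosen for illustration


def with_breaks(lines, every=200, sep=SEP):
    """Yield the input lines, placing `sep` on output lines every, 2*every, ...

    Only a counter is kept; no lines are buffered.
    """
    count = 0
    for line in lines:
        yield line
        count += 1
        if count == every - 1:  # 199 real lines, then the separator
            yield sep
            count = 0


source = ("line %d" % i for i in range(1, 1001))
output = list(with_breaks(source))
print([i + 1 for i, text in enumerate(output) if text == SEP])
```

For 1,000 input lines this prints `[200, 400, 600, 800, 1000]`. Using `{200}` instead (as in the earlier post) would put the separators on output lines 201, 402, 603 and so on.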

              • C
                Coises @Terry R
                posted Oct 20, 2023, 5:04 PM, last edited by Coises Oct 20, 2023, 5:14 PM

                @Terry-R said in Auto insert break text every x amount of lines:

                I can see your regex doesn’t use \K which may have affected your outcome although I’m not sure that hypothesis could be verified easily and be reproducible.

                Apparently I needed some sleep before I tried to answer the original question. My expression was:

                ((.*?)\r\n){199}

                and, of course, should have been:

                ^(.*?\r\n){199}

                which doesn’t give a problem — so long as . matches newline is not checked; but given that condition, or the equivalent need for (?-s), it should have been .*, not .*? — so better yet:

                (?-s)^((.*)\r\n){199}

                with or without a trailing \K. The problem wasn’t the lack of \K, it was forgetting the caret at the beginning. The \K version does make more sense, though, unless you want to count or replace step-wise.

                From reading the boost::regex code and the comments, this error message comes up when a test within the matching process in boost::regex guesses that the number of potential alternatives is growing without bound. This is probably a special case of the infamous halting problem; if so, the only possible resolution for boost is to use a heuristic.

                • G
                  guy038
                  last edited by Oct 20, 2023, 6:07 PM

                  Hello, @andrew-casey, @coises, @terry-r, @mkupper and All,

                  I did some tests with N++ v8.5.4 64-bit. I used a text file of 11,212,425 bytes, containing 158,760 lines, with an average of 70 characters per line.

                  I decided to insert a line of equal signs, every 1,000 lines. Thus, the result should be 158 consecutive blocks of ( 1,000 lines + the line of = ) followed by the remaining 760 lines ( as 158,760 = 158 x 1,000 + 760 )


                  After verification, I can affirm that the two regexes S/R :

                  • SEARCH (?-s)^(?:.*\R){1000}\K

                  • REPLACE ========================\r\n

                  And :

                  • SEARCH (?-s)^(?:.*\R){1000}

                  • REPLACE $0========================\r\n

                  Do separate the text, as expected, into blocks of 1,000 lines, with a remainder of 760 lines
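The expected layout can be double-checked with a few lines of arithmetic. This is a small Python sketch; `expected_layout` and `separator_lines` are hypothetical helper names, and the second function assumes the separator follows each complete block, as in the two S/R above.

```python
def expected_layout(total_lines, block_size):
    """Return (number of complete blocks, number of leftover lines)."""
    return divmod(total_lines, block_size)


def separator_lines(total_lines, block_size):
    """1-based output line numbers occupied by the separators."""
    blocks, _ = expected_layout(total_lines, block_size)
    # Each block contributes block_size lines plus one separator line.
    return [k * (block_size + 1) for k in range(1, blocks + 1)]


print(expected_layout(158_760, 1_000))
positions = separator_lines(158_760, 1_000)
print(positions[0], positions[-1])
```

This prints `(158, 760)` and then `1001 158158`, that is, 158 separators with the first on output line 1,001 and the last on output line 158,158.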


                  Now, we can also use @coises’s initial method, with the Column Editor

                  • First, with the same file, we add a vertical separator at beginning of all lines, with the regex S/R :

                    • SEARCH (?-s)^(?=.)

                    • REPLACE \xA6

                  • Secondly, let’s run the Edit > Column Editor option

                    • Choose the Number to Insert option

                    • Type in 1 in the two zones Initial Number : and Increase by :

                    • Type 1,000 in the Repeat : zone

                    • Select the Zeros choice for the Leading : zone

                    • If necessary, choose the Dec format

                    • Click on the OK button

                    • Delete the isolated numbering of the last line ( value 159 )

                  • Thirdly, we’re going to add a separator line each time the leading numbering changes with the following regex S/R :

                    • SEARCH (?-s)^(\d+)\xA6.+\R\K(?!\1)(?!\Z)

                    • REPLACE ========================\r\n

                  • Finally, we get rid of this leading numbering with this simple regex S/R :

                    • SEARCH ^\d+\xA6

                    • REPLACE Leave EMPTY
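The four steps above can be sketched in Python to see why a separator appears exactly where the leading number changes. This is an illustration under assumptions, not a reproduction of Notepad++ behavior: `insert_on_number_change` is a hypothetical name, the block number is computed directly instead of being written into the text, and all input lines are assumed to be non-empty (the S/R above only marks non-empty lines).

```python
SEP = "=" * 24  # arbitrary length, chosen for illustration


def insert_on_number_change(lines, repeat, sep=SEP):
    """Insert `sep` between two lines whose block numbers differ."""
    # Column Editor step: 1 for the first `repeat` lines, 2 for the next, ...
    numbers = [i // repeat + 1 for i in range(len(lines))]
    out = []
    for idx, line in enumerate(lines):
        out.append(line)  # the numbering itself is never kept (last step)
        has_next = idx + 1 < len(lines)
        if has_next and numbers[idx + 1] != numbers[idx]:
            out.append(sep)  # number changes on the next line
    return out


rows = ["row %d" % i for i in range(1, 2501)]
result = insert_on_number_change(rows, 1000)
print(result.index(SEP), result.count(SEP), len(result))
```

Because a separator is only added when a following line exists, nothing is appended after the final line, which mirrors the `(?!\Z)` guard in the third S/R.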

                  The comparison with the above examples, just using ONE regex S/R, gave identical results !

                  Best Regards,

                  guy038

                  • A
                    Andrew Casey @Terry R
                    last edited by Oct 23, 2023, 10:13 PM

                    excellent worked great thank you
