
    Joining every 100000 lines into 1 line

    • M P

      Hi. I have 10 million lines and need to join every 100000 lines into 1 line, with the goal of having 100 lines in the end.

      I have found an approach that works like this:

      Find: (.)\r\n(.)\r\n(.)\r\n(.)\r\n(.*)\r\n
      Replace with: \1\2\3\4\5\r\n

      but after repeating that process multiple times the result is not quite what I want, as I don't end up with exactly 100000 lines joined into each line.

      Would love to get some help here.

      • Terry R @M P

        @M-P said in Joining every 100000 lines into 1 line:

        but after repeating that process multiple times the result is not quite what I want, as I don't end up with exactly 100000 lines joined into each line.
        Would love to get some help here.

        I found this question rather intriguing, fully expecting Notepad++ to be overwhelmed by the sheer number of lines, the number of characters produced when combining those lines, or both. I must say I have been pleasantly surprised.

        I created a 20-character line, then replicated it 10 times. I copied the result and replicated that 10 times, and continued in that way until I had my 10M lines with a grand total of 220M characters.

        I then ran the regex you see below. The first iteration took approximately 5 minutes to complete (I didn't think to start the stopwatch). The second iteration took about 30 seconds, the third about 8 seconds, and subsequent iterations took about 5 seconds each.

        And lo and behold, it actually worked, whereas I had expected it to crash.

        So, as you can see, I worked in powers of 10 and just ran the regex 6 times to get the 100 lines required. I was somewhat surprised to see you'd worked with 5 lines at a time, when I thought it would be obvious that 10 was the number of lines to aim for.
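A rough way to see why 10 lines per pass suits this target better than 5: if every pass joins the same number of lines, the joined groups grow as powers of that number, and 100000 is a power of 10 but not of 5. The sketch below only checks that arithmetic. The function name `reachable_by_repeated_joins` is made up for illustration, and it assumes every pass uses the same batch size.

```python
def reachable_by_repeated_joins(target: int, batch: int) -> bool:
    """Return True if repeatedly joining `batch` lines at a time can yield
    groups of exactly `target` original lines (same batch size every pass)."""
    if batch < 2:
        raise ValueError("batch must be at least 2")
    size = 1  # original lines contained in one joined line
    while size < target:
        size *= batch  # one more pass multiplies the group size by `batch`
    return size == target


TARGET = 100_000
for batch in (5, 10):
    print(batch, reachable_by_repeated_joins(TARGET, batch))
# prints:
# 5 False
# 10 True
```

With 5 lines per pass the group sizes run 5, 25, 125, ..., 78125, 390625 and skip over 100000, whereas with 10 lines per pass the fifth power lands on it exactly.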

        I note that your regex seems to only look for 1 character per line. If that’s true then you should have no problem with my solution.

        Find What: (?-s)(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+)
        Replace With: ${1}${2}${3}${4}${5}${6}${7}${8}${9}${10}
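For anyone who wants to try the idea on a small sample before running it on a huge file, here is a minimal Python sketch of one "Replace All" pass with the ten-group pattern above. It is an approximation, not Notepad++ itself: Python's `re` module has no `\R`, so `(?:\r\n|\n|\r)` stands in for it, and `[^\r\n]+` stands in for `(?-s).+` because Python's `.` would also match `\r`. The names `join_pass` and `sample` are made up for the example.

```python
import re

# Assumptions: (?:\r\n|\n|\r) approximates \R, and [^\r\n]+ approximates
# (?-s).+ (Python's "." would otherwise swallow the \r of a CRLF ending).
LINE = r"([^\r\n]+)"
EOL = r"(?:\r\n|\n|\r)"
# Ten captured lines separated by nine line breaks, as in the pattern above.
TEN_LINES = re.compile(EOL.join([LINE] * 10))


def join_pass(text: str) -> str:
    """One 'Replace All': merge each run of 10 consecutive lines into one."""
    return TEN_LINES.sub(lambda m: "".join(m.groups()), text)


# Small stand-in for the real file: 1000 CRLF-terminated lines.
sample = "\r\n".join(f"line{i:03d}" for i in range(1000)) + "\r\n"
once = join_pass(sample)
twice = join_pass(once)
print(len(sample.splitlines()), len(once.splitlines()), len(twice.splitlines()))
# prints: 1000 100 10
```

Each pass leaves the line break after the tenth line in place, so the line count drops by a factor of 10 per pass while no characters other than the removed line breaks are lost.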

        Terry

        • guy038

          Hello, @m-p, @terry-r and All,

          My solution is almost the same as @terry-r's!

          SEARCH (?-s)^(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R(.+\R)

          REPLACE $1$2$3$4$5$6$7$8$9$10

          Note that I include the line break in group 10.

          Unlike in @terry-r's solution, you just need to click on the Replace All button 5 times in a row.

          That is one click for each step from 10^1 to 10^5 lines joined per line!
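The count of five passes can be checked on a scaled-down sample. The Python sketch below approximates the anchored pattern above, with the line break captured inside group 10, and counts passes until the target line count is reached. As assumptions, `(?:\r\n|\n|\r)` replaces `\R` and `[^\r\n]+` replaces `(?-s).+`, since Python's `re` differs from the Notepad++ engine here. The function name `passes_needed` and the sample data are hypothetical.

```python
import re

# Assumptions: (?:\r\n|\n|\r) approximates \R; [^\r\n]+ approximates (?-s).+
EOL = r"(?:\r\n|\n|\r)"
LINE = r"[^\r\n]+"
# Nine captured lines, each followed by a line break, then a tenth group
# that captures its own line break so the replacement puts it back.
PATTERN = re.compile(
    "^" + f"({LINE}){EOL}" * 9 + f"({LINE}{EOL})",
    re.MULTILINE,
)


def passes_needed(text: str, target_lines: int):
    """Apply the 10-to-1 join until at most `target_lines` lines remain."""
    passes = 0
    while len(text.splitlines()) > target_lines:
        new_text = PATTERN.sub(lambda m: "".join(m.groups()), text)
        if new_text == text:
            raise ValueError("no further progress; line count is not a multiple of 10")
        text = new_text
        passes += 1
    return passes, text


# Scaled-down stand-in: 10^5 one-character lines joined down to a single line.
sample = "x\r\n" * 100_000
passes, final_text = passes_needed(sample, 1)
print(passes)
# prints: 5
```

Going from 10^5 lines to 1 line is the same factor of 10^5 as going from 10^7 lines to 100, so the same five passes apply to the full-size file.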

          Best Regards,

          guy038

          • M P @guy038

            @Terry-R @guy038 Thank you so much! You guys saved me a lot of time. It works perfectly.

            The Community of users of the Notepad++ text editor.