    RegEX - How can I make automatic a new line (break) at x character

    • Venus642V
      Venus642
      last edited by

      Hello,
      The macro recorder of Notepad++ and TextFX ReWrap are nice, but they can’t help with this problem.

      Example:
      My TXT

      "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna",
      "aliquyam erat, sed diam voluptua.",
      "At vero eos et accusam et",
      "Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing",
      "elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.",
      "At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem",
      "ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore",
      "et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. ",
      "Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet."
      

      My wish:

      • All lines max. 88 characters
      • If a line ends (for example at a period) before 88 characters, delete the ", and move text up from the next line until the line is at most 88 characters; break at the last space that fits and add a new ", there
        I hope this is understandable…

      Desired result:

      "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor",
      "invidunt ut labore et dolore magna aliquyam erat,sed diam voluptua. At vero eos et",
      "accusam et Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur",
      "sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et, dolore magna",
      "aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo",
      "dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem",
      "ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed",
      "diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam",
      "voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd", 
      "gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet."
      

      It’s all very complicated but if someone has a solution it would be very helpful.
      Many Thanks!

      • Alan KilbornA
        Alan Kilborn @Venus642
        last edited by

        @Venus642

        Seems the solution would be to:

        • Remove leading " and trailing ", from entire file (easy regex replacement)

        • Join all lines (N++ menu command)

        • Split lines so that maximum line length is 88 (N++ 7.9.3+ restores the Vertical Edge splitting capability, try the RC! – or this can be a not-super-easy regex replacement)

        • Reapply the leading and training delimiters (another easy regex operation)
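For anyone who would rather script it than click through the menus, the four steps above can be sketched in Python. This is only a rough sketch and not something posted in the thread: the function name `rewrap_quoted`, the use of `textwrap.wrap` for the splitting step, and leaving the comma off the last line are all my assumptions, and it assumes every input line looks like `"...",` as in the example.

```python
import textwrap


def rewrap_quoted(text: str, max_len: int = 88) -> str:
    """Hypothetical helper: re-wrap lines of the form "...", to max_len chars."""
    pieces = []
    for line in text.splitlines():
        line = line.strip()
        # Step 1: remove the trailing comma and the surrounding quotes.
        if line.endswith(","):
            line = line[:-1]
        line = line.strip('"').strip()
        if line:
            pieces.append(line)

    # Step 2: join everything into one long line.
    joined = " ".join(pieces)

    # Step 3: split at word boundaries, reserving 3 characters
    # for the opening quote, the closing quote and the comma.
    wrapped = textwrap.wrap(joined, width=max_len - 3)

    # Step 4: reapply the delimiters; the last line gets no comma (assumption).
    out = [f'"{chunk}",' for chunk in wrapped]
    if out:
        out[-1] = out[-1][:-1]
    return "\n".join(out)
```

Calling `rewrap_quoted(text)` with the default `max_len=88` should keep every output line, delimiters included, within 88 characters.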

        • Alan KilbornA
          Alan Kilborn @Alan Kilborn
          last edited by

          @Alan-Kilborn said in RegEX - How can I make automatic a new line (break) at x character:

          training

          typo: trailing!

          • PeterJonesP
            PeterJones @Venus642
            last edited by PeterJones

            @Alan-Kilborn beat me to the general outline, because I was giving specifics:

            @Venus642 ,

            I would do it in multiple steps:

            1. Get rid of the beginning and end quotes and end commas
              FIND = ^"|",\h*$
              REPLACE = empty
              SEARCH MODE = Regular Expression
              REPLACE ALL
            2. Merge to a single line: in editor, do keystrokes Ctrl+A Ctrl+J (Edit > Line Operations > Join Line)
            3. Split the lines into a max of 88:
              • If you have v7.9.3 (which is in release candidate right now), then use Settings > Preferences > Margins/Border/Edge and set Vertical Edge Setting to 88 and CLOSE, then Ctrl+I (Edit > Line Operations > Split Line)
              • If you have an earlier version, you could follow my advice in the first reply in this other thread, where you temporarily set the editor window’s width such that Ctrl+I (Split Line) will split at 88 characters. Though, because we will be adding back in the opening quote and the end quote and comma, set the width to 85 instead.
              • If that’s absolutely impossible for you, you can do it with regex, though it’s fragile:
                FIND = ^.{1,85}\K(\h|$)
                REPLACE = \r\n (I assume windows newline)
                MODE = Regular Expression
                REPLACE ALL
            4. Re-add the quotes and commas
              FIND = (^\h*$)|(^)|((?<!")$)
              REPLACE = (?1)(?2")(?3",)
              MODE = Regular Expression
              REPLACE ALL
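If it helps to see the regex steps outside the editor, here is a rough Python translation of steps 1, 3 and 4. Treat it as a sketch under assumptions, not the exact Notepad++ behavior: Python's `re` module has no `\K` or `\h`, so a capture group and `[ \t]` stand in for them, the `^` anchor is dropped in step 3 because `re.sub` works on the original string rather than re-scanning after each inserted newline, and the conditional replacement `(?1)(?2")(?3",)` is emulated with a callback. The function names `rewrap_with_regex` and `add_delims` are made up for illustration.

```python
import re


def rewrap_with_regex(text: str, width: int = 85) -> str:
    # Step 1: drop the opening quote and the closing quote-plus-comma per line.
    text = re.sub(r'^"|",[ \t]*$', "", text, flags=re.MULTILINE)

    # Step 2: join all lines into one (Ctrl+A, Ctrl+J in the editor).
    joined = " ".join(p.strip() for p in text.splitlines() if p.strip())

    # Step 3: greedily take up to `width` characters, then require a space
    # or the end of the text, and turn that space into a newline.
    # Like the original, this is fragile for words longer than `width`.
    split = re.sub(rf"(.{{1,{width}}})(?:[ \t]|$)", r"\1\n", joined)
    split = split.rstrip("\n")

    # Step 4: emulate the conditional replacement (?1)(?2")(?3",).
    def add_delims(m: re.Match) -> str:
        if m.group(1) is not None:   # blank line: leave it empty
            return ""
        if m.group(2) is not None:   # start of line: opening quote
            return '"'
        return '",'                  # end of line not already ending in a quote

    return re.sub(r'(^[ \t]*$)|(^)|((?<!")$)', add_delims, split,
                  flags=re.MULTILINE)
```

The lookbehind `(?<!")` is what keeps the final line from getting a second quote and a comma: the last input line ends in `"` without a comma, so step 1 leaves that quote in place.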

            With the text you showed, this process (even the regex for step 3) resulted in

            "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod",
            "tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero",
            "eos et accusam et Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur",
            "sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna",
            "aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea",
            "rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit",
            "amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod",
            "tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero",
            "eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea",
            "takimata sanctus est Lorem ipsum dolor sit amet."
            

            which does meet the 88 char restriction:

            123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x12345678
            

            This is slightly different than your example result:

            ...
            123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x12345678
            "aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo",
            "dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem",
            ...
            

            but I didn’t understand why you didn’t join more to that line

            ...
            123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x12345678
            "aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea", 
            "rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ...",
            

            caveat emptor

            This regex sequence shown here seemed to work for me, based on my understanding of your issue, and is published here to help you learn how to do this. I make no guarantees or warranties as to the functionality for you. You are responsible to save and backup all data before and after running this sequence.

            Additional Search/Replace Info

            Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the </> toolbar button or manual Markdown syntax. To make regex in red (and so they keep their special characters like *), use backticks, like `^.*?blah.*?\z`. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.

            • Michael VincentM
              Michael Vincent @Venus642
              last edited by

              @Venus642 said in RegEX - How can I make automatic a new line (break) at x character:

              My wish:

              All lines max. 88 character
              Point and less under 88 character than delete ", - make the line max. 88 (move up), last right spaces make a break an set a new ",
              I hope this is understandable…

              I think you’ll find this discussion very helpful.

              Cheers.

              • Alan KilbornA
                Alan Kilborn @PeterJones
                last edited by

                @PeterJones said in RegEX - How can I make automatic a new line (break) at x character:

                beat me to the general outline, because I was giving specifics

                Which of course makes yours much better! :-)

                My feeling is it could be best to give a general outline, and then if OP has trouble, they can ask for elaboration. Perhaps the general scheme suffices?

                Of course, Peter, you are free to do as you like.

                • Venus642V
                  Venus642 @PeterJones
                  last edited by

                  @PeterJones

                  Your RegEx is wonderful - Big thanks

                  I would like to thank you very much for your help!

                  • Venus642V
                    Venus642 @Alan Kilborn
                    last edited by

                    @Alan-Kilborn
                    Alan - Thanks for your help!

                    • Venus642V
                      Venus642
                      last edited by

                      @Alan-Kilborn said in RegEX - How can I make automatic a new line (break) at x character:

                      Which of course makes yours much better! :-)
                      My feeling is it could be best to give a general outline, and then if OP has trouble, they can ask for elaboration. Perhaps the general scheme suffices?
                      Of course, Peter, you are free to do as you like.

                      Peter and Alan!
                      You both helped very well.
                      Many thanks!

                      • Robin CruiseR
                        Robin Cruise
                        last edited by

                        @PeterJones said in RegEX - How can I make automatic a new line (break) at x character:

                        ^.{1,85}\K(\h|$)

                        Can anyone tell me what exactly the numbers 1,85 in the regex ^.{1,85}\K(\h|$) represent?

                        • PeterJonesP
                          PeterJones @Robin Cruise
                          last edited by PeterJones

                          @Robin-Cruise ,

                          Yes, the documentation that you were linked to could have described that to you.

                          The regex portion .{1,85} represents one (1) to eighty-five (85) instances of the . regex, which matches any character (except newline*). So the whole regex means:

                          • .{1,85} = find 1 to 85 characters from the start of the line (as many as it can while still matching the rest of the regex)
                          • \K = reset the match, so the first part won’t be replaced; this feature requires using REPLACE ALL
                          • (\h|$) = find either a space character, or the zero-width end of line (not including newline characters). This space or zero-width-match at the end of the line will be replaced by whatever is in the REPLACE expression.

                          Footnote *: whoops, I should have either specified disabling “. matches newline” or used (?-s) at the start of the regex. sorry: (?-s)^.{1,85}\K(\h|$)
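To see the "as many as it can while still matching the rest" behavior in action, here is a small Python sketch. It is an approximation and not the Notepad++ regex itself: the limit is shrunk to a hypothetical 10 characters instead of 85 so the effect is easy to see, a capture group replaces `\K`, and `[ \t]` replaces `\h`, since Python's `re` supports neither. In Python `.` already excludes newlines by default, so no `(?-s)` is needed.

```python
import re

# Same shape as ^.{1,85}\K(\h|$), with a limit of 10 for illustration.
PATTERN = re.compile(r"^(.{1,10})(?:[ \t]|$)")


def first_chunk(line: str) -> str:
    """Hypothetical helper: return the part of the line kept before the break."""
    m = PATTERN.match(line)
    return m.group(1) if m else ""


# The quantifier first grabs 10 characters ("the quick "), finds that the next
# character is not a space, and backs off one character at a time until a
# space (or the end of the line) follows.
print(first_chunk("the quick brown fox"))   # the quick
print(first_chunk("short"))                 # short
print(repr(first_chunk("abcdefghijklmnop")))  # '' (no space in the first 11 chars)
```

The last call shows why the regex approach was described as fragile: a word longer than the limit gives the pattern nowhere to break.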

                          • PeterJonesP
                            PeterJones @PeterJones
                            last edited by

                            @PeterJones said in RegEX - How can I make automatic a new line (break) at x character:

                            Yes, the documentation that you were linked to could have described that to you

                            specifically, the section https://npp-user-manual.org/docs/searching/#multiplying-operators.
