RegEX - How can I make automatic a new line (break) at x character



  • Hello,
    The macro recorder of Notepad++ and TextFX ReWrap are nice, but they can’t help with this problem.

    Example:
    My TXT

    "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna",
    "aliquyam erat, sed diam voluptua.",
    "At vero eos et accusam et",
    "Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing",
    "elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.",
    "At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem",
    "ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore",
    "et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. ",
    "Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet."
    

    My wish:

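    • All lines have at most 88 characters
    • If a line ends with a period and is shorter than 88 characters, delete the ", and pull text up from the next line until the line reaches the 88-character maximum; break at the last space before the limit and close the line with a new ",
      I hope this is understandable…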

    Desired result:

    "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor",
    "invidunt ut labore et dolore magna aliquyam erat,sed diam voluptua. At vero eos et",
    "accusam et Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur",
    "sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et, dolore magna",
    "aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo",
    "dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem",
    "ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed",
    "diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam",
    "voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd", 
    "gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet."
    

    It’s all very complicated but if someone has a solution it would be very helpful.
    Many Thanks!



  • @Venus642

    Seems the solution would be to:

    • Remove leading " and trailing ", from entire file (easy regex replacement)

    • Join all lines (N++ menu command)

    • Split lines so that maximum line length is 88 (N++ 7.9.3+ restores the Vertical Edge splitting capability, try the RC! – or this can be a not-super-easy regex replacement)

    • Reapply the leading and trailing delimiters (another easy regex operation)
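
    For readers who would rather script it, the four steps above can be sketched outside Notepad++. This is only a rough approximation in Python, not what the editor does internally: the function name `rewrap_quoted_lines` is made up for illustration, and `textwrap.wrap` stands in for the editor's Split Lines command. The wrap width is reduced by 3 to leave room for the opening quote, closing quote, and comma.

```python
import textwrap

def rewrap_quoted_lines(text: str, max_len: int = 88) -> str:
    """Hypothetical helper: strip delimiters, join, split, re-add delimiters."""
    pieces = []
    for line in text.splitlines():
        line = line.strip()
        # Step 1: remove the leading " and the trailing ", (or a bare trailing ")
        if line.startswith('"'):
            line = line[1:]
        if line.endswith('",'):
            line = line[:-2]
        elif line.endswith('"'):
            line = line[:-1]
        if line.strip():
            pieces.append(line.strip())
    # Step 2: join all lines into one
    joined = " ".join(pieces)
    # Step 3: split at word boundaries, reserving 3 characters for the delimiters
    wrapped = textwrap.wrap(joined, width=max_len - 3, break_on_hyphens=False)
    # Step 4: reapply the delimiters; the last line gets no trailing comma
    return ",\n".join(f'"{chunk}"' for chunk in wrapped)
```

    Calling `rewrap_quoted_lines(text)` with the default `max_len=88` should keep every output line, delimiters included, within 88 characters, as long as no single word is longer than 85 characters.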





  • @Alan-Kilborn beat me to the general outline, because I was giving specifics:

    @Venus642 ,

    I would do it in multiple steps:

    1. Get rid of the beginning and end quotes and end commas
      FIND = ^"|",\h*$
      REPLACE = empty
      SEARCH MODE = Regular Expression
      REPLACE ALL
    2. Merge to a single line: in editor, do keystrokes Ctrl+A Ctrl+J (Edit > Line Operations > Join Lines)
    3. Split the lines into a max of 88:
      • If you have v7.9.3 (which is in release candidate right now), then use Settings > Preferences > Margins/Border/Edge and set Vertical Edge Setting to 88 and CLOSE, then Ctrl+I (Edit > Line Operations > Split Lines)
      • If you have an earlier version, you could follow my advice in the first reply in this other thread, where you temporarily set the editor window’s width such that Ctrl+I (Split Lines) will split at 88 characters. Though, because we will be adding back in the opening quote and the end quote and comma, set the width to 85 instead.
      • If that’s absolutely impossible for you, you can do it with regex, though it’s fragile:
        FIND = ^.{1,85}\K(\h|$)
        REPLACE = \r\n (I assume Windows newlines)
        MODE = Regular Expression
        REPLACE ALL
    4. Re-add the quotes and commas
      FIND = (^\h*$)|(^)|((?<!")$)
      REPLACE = (?1)(?2")(?3",)
      MODE = Regular Expression
      REPLACE ALL
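
    The same four steps can be replayed outside the editor. The sketch below is an approximation in Python, with several substitutions that are my assumptions and not part of the Notepad++ recipe: Python's `re` module has no `\h`, `\K`, or conditional replacement, so `[ \t]` replaces `\h`, a capture group plus `\1` replaces `\K`, and a small callback replaces `(?1)(?2")(?3",)`. The step 3 pattern also drops the `^` anchor, because `re.sub` scans the original string and does not see the newlines it inserts. The function name `peter_steps` is hypothetical.

```python
import re

def peter_steps(text: str, inner_width: int = 85) -> str:
    # Step 1: drop the opening quote and the closing quote+comma on each line.
    # The last line has no comma, so its closing quote survives on purpose.
    text = re.sub(r'^"|",[ \t]*$', "", text, flags=re.MULTILINE)
    # Step 2: join everything into one line (roughly what Ctrl+J does).
    text = " ".join(part.strip() for part in text.splitlines() if part.strip())
    # Step 3: take up to inner_width characters followed by whitespace or the
    # end of the text, and turn that trailing whitespace into a newline.
    text = re.sub(r"(.{1,%d})(?:[ \t]+|$)" % inner_width, r"\1\n", text)
    text = text.rstrip("\n")

    # Step 4: add " at each line start and ", at each line end that does not
    # already end in a quote; blank lines (group 1) are left empty.
    def requote(match: re.Match) -> str:
        if match.group(2) is not None:
            return '"'
        if match.group(3) is not None:
            return '",'
        return ""

    return re.sub(r'(^[ \t]*$)|(^)|((?<!")$)', requote, text, flags=re.MULTILINE)
```

    Like the editor version, this is fragile: a single word longer than `inner_width` would be split at an arbitrary position.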

    With the text you showed, this process (even the regex for step 3) resulted in

    "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod",
    "tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero",
    "eos et accusam et Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur",
    "sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna",
    "aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea",
    "rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit",
    "amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod",
    "tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero",
    "eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea",
    "takimata sanctus est Lorem ipsum dolor sit amet."
    

    which does meet the 88 char restriction:

    123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x12345678
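
    If counting against a ruler by eye is too error-prone, the check can be automated. This is a small sketch in Python; both function names are invented for illustration.

```python
def ruler(width: int = 88) -> str:
    """Build a counting ruler like the one above: 123456789x repeated, cut to width."""
    return ("123456789x" * (width // 10 + 1))[:width]

def lines_over_limit(text: str, limit: int = 88) -> list[str]:
    """Return every line that is longer than the limit (empty list means all is well)."""
    return [line for line in text.splitlines() if len(line) > limit]
```

    An empty list from `lines_over_limit(result)` confirms that the restriction is met.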
    

    This is slightly different than your example result:

    ...
    123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x12345678
    "aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo",
    "dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem",
    ...
    

    but I didn’t understand why you didn’t join more to that line

    ...
    123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x12345678
    "aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea", 
    "rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ...",
    

    caveat emptor

    This regex sequence shown here seemed to work for me, based on my understanding of your issue, and is published here to help you learn how to do this. I make no guarantees or warranties as to the functionality for you. You are responsible to save and backup all data before and after running this sequence.

    Additional Search/Replace Info

    Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the </> toolbar button or manual Markdown syntax. To make regex in red (and so they keep their special characters like *), use backticks, like `^.*?blah.*?\z`. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.



  • @Venus642 said in RegEX - How can I make automatic a new line (break) at x character:

    My wish:

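    All lines have at most 88 characters
    If a line ends with a period and is shorter than 88 characters, delete the ", and pull text up from the next line until the line reaches the 88-character maximum; break at the last space before the limit and close the line with a new ",
    I hope this is understandable…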

    I think you’ll find this discussion very helpful.

    Cheers.



  • @PeterJones said in RegEX - How can I make automatic a new line (break) at x character:

    beat me to the general outline, because I was giving specifics

    Which of course makes yours much better! :-)

    My feeling is it could be best to give a general outline, and then if OP has trouble, they can ask for elaboration. Perhaps the general scheme suffices?

    Of course, Peter, you are free to do as you like.



  • @PeterJones

    Your RegEx is wonderful - Big thanks

    I would like to thank you very much for your help!



  • @Alan-Kilborn
    Alan - Thanks for your help!



  • @Alan-Kilborn said in RegEX - How can I make automatic a new line (break) at x character:

    Which of course makes yours much better! :-)
    My feeling is it could be best to give a general outline, and then if OP has trouble, they can ask for elaboration. Perhaps the general scheme suffices?
    Of course, Peter, you are free to do as you like.

    Peter and Alan!
    You both helped a lot.
    Many thanks!



  • @PeterJones said in RegEX - How can I make automatic a new line (break) at x character:

    ^.{1,85}\K(\h|$)

    Can anyone tell me what exactly the numbers 1,85 represent in the regex ^.{1,85}\K(\h|$) ?



  • @Robin-Cruise ,

    Yes, the documentation that you were linked to could have described that to you.

    The regex portion .{1,85} represents one (1) to eighty-five (85) instances of the . regex, which matches any character (except newline*). So the whole regex means:

    • .{1,85} = find 1 to 85 characters from the start of the line (as many as it can while still matching the rest of the regex)
    • \K = reset the match, so the first part won’t be replaced; this feature requires using REPLACE ALL
    • (\h|$) = find either a space character, or the zero-width end of line (not including newline characters). This space or zero-width-match at the end of the line will be replaced by whatever is in the REPLACE expression.

    Footnote *: whoops, I should have either specified disabling “. matches newline” or used (?-s) at the start of the regex. sorry: (?-s)^.{1,85}\K(\h|$)
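
    The greedy quantifier and the “. matches newline” pitfall can both be seen in a small experiment. The sketch below uses Python's `re` module, not Notepad++'s Boost engine, so two things are my substitutions: a lookahead `(?=[ \t]|$)` stands in for `\K(\h|$)`, and the `re.DOTALL` flag plays the role of the “. matches newline” checkbox. The limit is 12 instead of 85 to keep the sample text short, and `first_chunk` is a made-up name.

```python
import re

# 12 instead of 85, so the behaviour is visible on short strings
PATTERN = r"^.{1,12}(?=[ \t]|$)"

def first_chunk(text: str, dot_matches_newline: bool = False):
    """Return what the .{1,12} part grabs before the first break point."""
    flags = re.MULTILINE | (re.DOTALL if dot_matches_newline else 0)
    match = re.search(PATTERN, text, flags)
    return match.group(0) if match else None

# Greedy: takes as many characters as it can while a space (or end) still follows
print(first_chunk("alpha beta gamma"))

# Without "dot matches newline", the chunk stops at the end of the first line
print(first_chunk("ab\ncd ef"))

# With it, the chunk runs across the line break, which is the footnote's problem
print(repr(first_chunk("ab\ncd ef", dot_matches_newline=True)))
```

    The first call yields `alpha beta` (10 characters, the longest run of at most 12 that is followed by a space), the second yields `ab`, and the third yields the whole text including the newline.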



  • @PeterJones said in RegEX - How can I make automatic a new line (break) at x character:

    Yes, the documentation that you were linked to could have described that to you

    specifically, the section https://npp-user-manual.org/docs/searching/#multiplying-operators.

