Join Lines that are breaking

16 Posts 5 Posters 3.9k Views
    Terry R
    last edited by Jun 29, 2018, 3:28 AM

    If my previous idea doesn’t work because the line lengths are completely variable (that is, an actual line can be shorter than 50 characters while a continuation of a line can be longer than 50 characters), then another idea is to:
    Have the report include either another field that would not exist anywhere else in the line, or have it print a special character sequence at the start of each line, e.g. “xZXz”.
    Then do a search and replace looking for this sequence (let’s use my example) and replacing it with a line break followed by xZXz. At this point we know that every real line starts with a line break followed by the character sequence xZXz.
    We then remove all line breaks which do NOT occur immediately before the character sequence. Lastly, we can remove the special character sequence if you wish to get back to the original report.
    So we are talking about 3 separate find/replace runs to complete the job.

    So we have
    Find: xZXz
    Replace: \r\nxZXz (\R is only understood in the Find field, so the replacement spells out the line break; use \n for Unix line endings)
    followed by
    Find: \R(?!xZXz)
    Replace: leave this field empty
    and lastly
    Find: \RxZXz
    Replace: \r\n
    or
    Find: xZXz
    Replace: leave this field empty

    As always, it’s a good idea to back up the source data and test the result, generally by running it manually at first and also comparing to what you expected to get.
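For anyone who wants to try the three runs outside the editor, here is a minimal Python sketch of the same idea. It is only an illustration: the function name is made up, the marker `xZXz` is the example from above and is assumed never to occur in real data, and Python has no `\R`, so `\r?\n` stands in for it.

```python
import re

MARKER = "xZXz"  # example marker from the post; assumed never to occur in real data


def rejoin_with_marker(text: str, marker: str = MARKER) -> str:
    """Rejoin wrapped lines, assuming every real record starts with `marker`."""
    # Run 1: force a line break in front of every marker.
    text = text.replace(marker, "\n" + marker)
    # Run 2: delete every line break that is NOT directly followed by the marker.
    text = re.sub(r"\r?\n(?!" + re.escape(marker) + ")", "", text)
    # Run 3: strip the marker, leaving one record per line.
    text = text.replace(marker, "")
    # Run 1 also put a break before the very first marker; drop it.
    return text.lstrip("\n")


if __name__ == "__main__":
    report = "xZXzrecord one\ncontinued\nxZXzrecord two\nwraps\n"
    print(rejoin_with_marker(report))
```

Note that the pieces are joined with nothing in between, exactly as the empty replacement does in the editor, and that the line break at the very end of the file is removed as well.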

    Terry

      guy038
      last edited by guy038 Jun 30, 2018, 1:13 AM Jun 30, 2018, 1:08 AM

      Hello, @ryan-heatherly, @terry-r and All,

      Assuming that, in your reports, any second line, which is wrapped, does not exceed 30 characters, an easy solution is to use the following regex S/R :

      SEARCH (?-s)\R(?=.{1,30}$)

      REPLACE Leave EMPTY

      Of course select the Regular expression search mode and tick the Wrap around option

      Et voilà ! Your lines which are split in two parts should be joined and ready for additional treatments in Excel

      This regex tries to match any line-break ( \R ) but ONLY IF the following line contains between 1 and 30 characters maximum !

      If so, the line-break is, then, deleted => The two lines are joined
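The same lookahead can be tried in Python as a rough sketch. Assumptions: the text uses plain `\n` line endings, the function name is made up, and `re.MULTILINE` is needed so that `$` matches at the end of each line (in Python `.` already stops at a line break, which corresponds to `(?-s)`).

```python
import re


def join_short_continuations(text: str, max_len: int = 30) -> str:
    """Delete a line break when the line that follows has 1..max_len characters."""
    pattern = re.compile(rf"\n(?=.{{1,{max_len}}}$)", re.MULTILINE)
    return pattern.sub("", text)


if __name__ == "__main__":
    sample = "A" * 40 + "\nshort tail\n" + "B" * 40 + "\n"
    print(join_short_continuations(sample))
```

The obvious limitation is the one stated above: a genuine record that happens to be 30 characters or shorter would be glued to the previous line too.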

      Best regards,

      guy038

        Ryan Heatherly
        last edited by Jul 2, 2018, 9:38 AM

        Hi @guy038 @terry ,

        Thanks for the response.

        Both search queries find the lines, but there’s an issue: sometimes a line is split into 3 lines. The first line allows a maximum of 400 characters, so I amended “Find:(.{401})\R(.{1,50})\R” with 400.

        However, it’s not working perfectly, because the second line can have a maximum of 179 characters, after which it is split onto a third line. So effectively only the first and second lines would be merged, but the remaining third line wouldn’t.

        Would it be easier if we searched for a row that starts with “00~000000~0000~”? The zero characters could be any digits in that series.
        But then how would we have all the rows merged until it finds the next line that begins with “00~000000~0000~”?

        PS. “~” is my delimiter for this text to be imported into Excel.

          Terry R
          last edited by Jul 2, 2018, 11:36 AM

          You will see my first go at an answer did make some assumptions, one of which you have just confirmed. I think given that new information, it makes the line lengths too variable to use that to determine where the breaks were added. However the addition of another group of characters, namely your number group with “~” in the middle does help.

          So my new suggestion is we search for any CR/LF character which is NOT followed by your “number grouping” and remove them.
          Find: \R(?!\d{1,2}~\d{1,6}~\d{1,4}~)
          Replace: \1

          I’ve included the possibility that the number group you refer to could be shorter in length but not longer than your example. Thus 1~2~3~ would be just as valid as 12~345678~9012~. If that is not the case then change the {1,2} to {2}, the {1,6} to {6} and the {1,4} to {4}. I hope you get the concept.
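As a rough way to check the idea on sample data, here is a small Python sketch. It is an illustration only: the function name and the sample text are made up, and `\r?\n` stands in for `\R`, which Python does not support.

```python
import re

# Start-of-record pattern from the post: up to 2, 6 and 4 digits, each followed by "~".
RECORD_START = r"\d{1,2}~\d{1,6}~\d{1,4}~"


def join_until_next_record(text: str) -> str:
    """Delete every line break that is not directly followed by a record start."""
    return re.sub(r"\r?\n(?!" + RECORD_START + ")", "", text)


if __name__ == "__main__":
    sample = (
        "01~000002~0003~first part\n"
        "second part\n"
        "third part\n"
        "02~000004~0005~next record\n"
    )
    print(join_until_next_record(sample))
```

Because the test looks only at what follows each line break, it does not matter whether a record was split into two, three, or more lines. Two side effects are worth knowing: the final line break of the file is removed as well, and a continuation line that happens to start with the same number pattern would be treated as a new record.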

          I haven’t actually tested this, but give it a go. It’s sleepy time, so I’ll check the forum in another 8 hrs or so.

          Terry

            Ryan Heatherly
            last edited by Jul 2, 2018, 11:59 AM

            Hi Terry,

            Awesome Job ! :)

            That worked like a charm on data of 783 rows!

            I used this;

            Find: \R(?!\d{2}~\d{6}~\d{4}~)

            Because my data would always be like the following format;
            “01~000002~0003”

            Thanks a million guys !!! This has literally saved me a lot of time, especially since I’ve got to keep running the report and updating it!

            I can now move on to my next report which has a similar issue ! :) This has save

              Ryan Heatherly
              last edited by Jul 2, 2018, 12:30 PM

              What would be the best way to go about joining broken lines with the following format?

              Status,Company Number,Order Description,Owner,Order Number,Ordered Date,Creation Date,Supplier Name,Supplier Number,Product Code,Product,Quantity Ordered,Quantity Unit,Net Unit Price,Required Delivery Date,Quantity Received,Actual Delivery Date,Cost Centre,Cost Centre Description,Account Code,Account Code Description,Analysis Code,Analysis Code Description,Sub Account Code,Sub Account Code Description,Line Net Sum,Line VAT Sum,Line Gross Sum,Cluster,Region,

              (there are three commas after region, but it’s not showing up here)

                Terry R
                last edited by Jul 2, 2018, 7:45 PM

                Glad my regex did so well, and I hadn’t even tested it. I’m actually new to this as well. I had a similar issue and others helped me so I feel it’s time to pay that forward (where I can).

                As to your latest question. The concept of the regex I supplied will work with some minor amendments. Some assumptions are made:

                1. the 3 commas will only appear at the end of the line
                2. nothing will ever appear between the 3 commas

                Find: (?<!,)\R
                Replace: <empty line> so nothing goes in the replace field

                In my previous answer I had \1 in the replace field. I don’t think that was necessary, since that regex has no capture group, and the same goes for this regex.

                So the expression says look for a CR/LF and so long as it doesn’t appear directly behind 3 commas, then delete it.

                I am a bit concerned though that you have 3 commas together and the format suggests that commas are used between all fields. Are you sure that the 3 commas will never have data between them?

                Terry

                  Terry R
                  last edited by Jul 2, 2018, 7:53 PM

                  Slight typo in my last regex, it should have been
                  (?<!,)\R
                  Somehow only 1 comma had appeared where there should have been 3 together.

                  Terry

                    Terry R
                    last edited by Jul 2, 2018, 11:09 PM

                    So I’m not going mad. Something is happening to my typing. I know for certain that I typed 3 commas together in that last regex. Yet on my screen it’s only showing 1.
                    This will be a test. In all cases 3 commas should appear together.
                    ,
                    “,”
                    /,/,/,/
                    /, , ,/
                    Where I’m typing I have the 3 commas, on the right side is a preview, and that only shows 1 in the first 2 cases. The other 2 have another character between commas so they show correctly in preview.

                    Now I need some assistance. Can anyone provide me an answer to why my characters are going missing?

                    Terry

                      Terry R
                      last edited by Jul 3, 2018, 2:31 AM

                      Ryan, I have the answer as to why your 3 commas (and mine) didn’t show. The box you type in is an interpreter and displays what it believes you really wanted. I found another post which alerted me to what will be my favourite character on this forum going forward, the grave accent. That’s the one on the same key as your ~, thus “`”.
                      If I type what my regex should have been using this around the characters it should come out correctly. It will also come out highlighted.

                      Find: (?<!,,,)\R

                      My preview is showing it correctly, let’s hope it posts it the same way.
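The intent described earlier (delete a line break unless it comes directly after the 3 commas) can be sketched in Python as follows. This is an illustration under assumptions: the function names and sample rows are made up, and the text is assumed to use plain `\n` line endings. The "not preceded by" test is written as a negative lookbehind, `(?<!,,,)`. A negative lookahead placed in front of the line break, `(?!,,,)`, inspects the line break itself, which is never a comma, so it succeeds at every line break.

```python
import re


def join_unless_after_three_commas(text: str) -> str:
    """Delete a line break unless the three characters before it are ',,,'."""
    return re.sub(r"(?<!,,,)\n", "", text)


def join_with_lookahead(text: str) -> str:
    """For comparison: the lookahead form matches (and deletes) every line break."""
    return re.sub(r"(?!,,,)\n", "", text)


if __name__ == "__main__":
    sample = "Open,42,Widget\norder,,,\nClosed,43,Gadget,,,\n"
    print(join_unless_after_three_commas(sample))
    print(join_with_lookahead(sample))
```

With Windows line endings the lookbehind would have to account for the `\r` sitting between the commas and the `\n`; in Notepad++ the `\R` token handles the pair as one unit.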

                      Can anyone provide a FAQ on this markup/Markdown (I’ve seen something that says it’s called Markdown) interpreter? I’m looking but unsure of where it should be.

                      Terry

                        PeterJones
                        last edited by Jul 3, 2018, 1:04 PM

                        When you reply to a post, in the upper-right corner of the typing window is the word COMPOSE followed by a question mark. Click on that question mark. It pops up a window with a link to the Markdown documentation.

                          guy038
                          last edited by Jul 3, 2018, 1:33 PM

                          Hi, @terry-r and All,

                          In addition to Peter’s information about Markdown syntax, there is also a N++ Markdown Viewer plugin, though I have not tested it, yet !

                          Refer to :

                          https://github.com/nea/MarkdownViewerPlusPlus

                          And the latest v.0.8.2 release, in both 32-bit and 64-bit builds, can be downloaded from below :

                          https://github.com/nea/MarkdownViewerPlusPlus/releases

                          Best Regards,

                          guy038

                            PeterJones
                            last edited by PeterJones Jul 3, 2018, 2:15 PM Jul 3, 2018, 2:12 PM

                            @guy038, @terry-r,

                            I occasionally use the MarkdownViewer++ plugin… it works fairly well for rendering markdown that I’m editing in Notepad++ (though, often for posts, I just edit them in the forum / browser window with the preview visible). But the MarkdownViewer++ plugin doesn’t collapse multiple commas, whereas this forum does (both at preview and after posting). I spent about 10min googling markdown comma and similar, but I wasn’t able to find any documentation that claims markdown does it. Unfortunately, nodebb comma didn’t find anything saying it’s the forum software, either. So the collapsing commas are a mystery to me.

                            Also, @Scott-Sumner was thinking about turning this old post into an entry on our FAQ Desk , but apparently hasn’t found the Round Tuit™ yet.

                              Scott Sumner @PeterJones
                              last edited by Jul 9, 2018, 1:59 PM

                              @PeterJones

                              Also, @Scott-Sumner was thinking about turning this old post into an entry on our FAQ Desk, but apparently hasn’t found the Round Tuit™ yet.

                              I have begun getting “roun-tuit” but until it is in a form that has at least as much value as the old post there is little value in publishing it. It’s a time-available thing…if you have time, you are of course free to make it way better than I ever could–I’d gladly delete my in-progress draft if something good by someone else magically appears as a FAQ Desk posting. :-)

                              The Community of users of the Notepad++ text editor.