• Login
Community
  • Login

replace \r\n with ' ' for lines enclosed in " "

Scheduled Pinned Locked Moved Help wanted · · · – – – · · ·
7 Posts 4 Posters 400 Views
Loading More Posts
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • B
    brenda gross
    last edited by Dec 16, 2024, 4:48 PM

    Re: replace \r\n with ’ ’ for lines enclosed in " "

    I have some sql queries in a text doc that needs some editing. most of the queries are on a single line. But some of them have \r\n between clauses. these are enclosed in " "

    Example below are 3 sql queries. the second query (enclosed in " ") is on multiple lines. I want all sql queries on a single line

    select * from table1\r\n
    “select *\r\n
    from table1\r\n
    where column1 in (1,2,3)”\r\n
    select * from table3\r\n

    D T B 3 Replies Last reply Dec 16, 2024, 6:03 PM Reply Quote 0
    • D
      dr ramaanand @brenda gross
      last edited by Dec 16, 2024, 6:03 PM

      @brenda-gross Please let us know how the output/desired result should be

      1 Reply Last reply Reply Quote 1
      • T
        Terry R @brenda gross
        last edited by Dec 17, 2024, 2:01 AM

        @brenda-gross said in replace \r\n with ' ' for lines enclosed in " ":

        I want all sql queries on a single line

        I will first point out that you needed to display the example in a “code box”, reference the FAQ post here.

        In your example you used double quotes and this forum tends to muck around with these so what actually posted isn’t likely to be what is copied for us to test against. Secondly, I presume you typed the \r\n in the example in an attempt to portray the actual line feed and carriage return shown in your real data. I also presumed that the 2nd (and 3rd) line of the multi line query are left adjusted in the real data, so no leading space, just as your example shows. 3 good reasons for using the code box to show examples.

        Also as the other post asks, showing the example after the data has been changed would be very helpful. For example the below solution does NOT remove the double quotes. No doubt that may be something you wanted. If so, you should be able to achieve that in a separate search/replace step.

        There is a generic regex provided, again in the FAQ section here. Unfortunately if you had used it, it would still have not worked entirely correctly. As it stands, my test suggests it would have consumed the line feed/carriage return immediately following the multi-line query. I had to adjust it slightly to compensate. I will acknowledge @guy038 as the originator of that regex in the hopes he may check it and consider if it really should be adjusted for a \R immediately following the ESR (end search region) as I believe it should.

        If I have read the question correctly I think the following regex should work for you. Obviously test it on a copy of your data just in case my assumptions are incorrect or you have more information on the situation which may alter the outcome.

        Using the Replace function (search mode is regular expression) we have:
        Find What:(?-si:"(?!\R)|(?!\A)\G)(?s-i:(?!").)*?\K(?-si:\R)
        Replace With: there is a single space here. If my assumption above on the leading space is incorrect, remove this space, so an empty field.

        Terry

        B 1 Reply Last reply Dec 17, 2024, 8:46 PM Reply Quote 3
        • B
          brenda gross @brenda gross
          last edited by Dec 17, 2024, 8:38 PM

          @brenda-gross

          desired output.

          select * from table1\r\n
          “select * from table1 where column1 in (1,2,3)”\r\n
          select * from table3\r\n
          
          1 Reply Last reply Reply Quote 0
          • B
            brenda gross @Terry R
            last edited by Dec 17, 2024, 8:46 PM

            @Terry-R I have a lot of testing to do (my input file is 30mb). But that seems to work for my test sample. Thanks!

            T 1 Reply Last reply Dec 17, 2024, 9:20 PM Reply Quote 0
            • T
              Terry R @brenda gross
              last edited by Terry R Dec 17, 2024, 9:21 PM Dec 17, 2024, 9:20 PM

              @brenda-gross said in replace \r\n with ' ' for lines enclosed in " ":

              I have a lot of testing to do (my input file is 30mb). But that seems to work for my test sample.

              Good that you have shown the desired output in a code box. Unfortunate that the double quotes aren’t what I used in my solution but from your post it seems you did adjust for that. I will say though, that you still entered the \r\n. Was that intentional as in are these characters at the end of the line, as well as the actual line feed/carriage return characters?

              I would suggest turning on View, Show Symbol, show all characters, or at least show end of line. If you do, you will see something like this at the end of every line:
              d13c9440-9afc-41f0-b1e9-3a824db4e01f-image.png
              This is the actual carriage return and line feeds which you duplicated by typing in the \r and \n characters. It is advisable not to enter additional control characters such as these unless they are actually in the line, not as I suspect, the end of line/new line control string.

              As you have a very large file to process, you will want to consider some other steps:

              1. Before processing count the number of multi-line queries. That might seem hard but using the Find function, you will see another button called count. If you use the following in the Find What field and press count it will tell you the number.
                Find What:(?-s)^“[^“”\r\n]*$
                This looks for any opening double quotes without a following ending double quote on the same line, so a multi-line query. After processing you could repeat this count to confirm none left. The count in step 1 should not be larger than the processing count. If it was then definitely some multi-line queries were not changed.

              There may be other steps you can take such as spot checks to confirm before accepting the results. Use the last regex in the Mark function and also click on “bookmark line”. Click Mark All. If the number is 0 then great, otherwise bookmarking will allow you to quickly progress through the file using the F2 and Shift-F2 keys to jump from one bookmark to the next.

              I would suggest that if the double quotes are to be eliminated, leave that as the last step once you have completed the steps above to verify validity of the changed data.

              Good luck
              Terry

              1 Reply Last reply Reply Quote 2
              • G
                guy038
                last edited by guy038 Dec 18, 2024, 3:51 PM Dec 18, 2024, 12:13 PM

                Hello, @brenda-gross, @dr-ramaanand, @terry-r and All,

                @terry-r, your use of the search regex (?-si:"(?!\R)|(?!\A)\G)(?s-i:(?!").)*?\K(?-si:\R) is just clever, because :

                • Any string may occur after the first " character ( the word Select or else )

                • It ensure that, if the global replacement is re-run, no change will occur again, thanks to the negative look-ahead (?!\R) after the double-quote character

                Now, Terry, we still can tune this search regex !

                • In all parts of this regex, alphabetic characters are not involved. So, we can omit the -i modifiers

                • In the first and last part of this regex, the regex . character is not involved, too. Thus, we can omit the -s modifiers as well !

                • We do not need the non-capturing group around the searched string \R, too

                Thus, the shortest syntax of the generic regex S/R is simply :

                FIND (?:"(?!\R)|(?!\A)\G)(?s:(?!").)*?\K\R

                REPLACE \x20

                Note that, if the OP speak of the smart quotes “ and ”, of Unicode value U+201C and U+201D, this regex S/R becomes :

                FIND (?:“(?!\R)|(?!\A)\G)(?s:(?!”).)*?\K\R

                REPLACE \x20


                So, @brenda-gross, just follow this road map :

                • Open your file in Notepad++

                • Add an empty line at the very beginning of your file, for security

                • Hit the Ctrl + Home shortcut to move at the very beginning of your file ( IMPORTANT )

                • Open the Replace dialog

                • Uncheck all box options

                • According to your case, use, either :

                  • FIND (?:"(?!\R)|(?!\A)\G)(?s:(?!").)*?\K\R

                  • FIND (?:“(?!\R)|(?!\A)\G)(?s:(?!”).)*?\K\R

                • REPLACE \x20

                • Select the Regular expression search mode

                • Click, once only, on the Replace All button

                => All the lines between regular or smart quotes should be displayed in single lines !


                Now, if you want to get rid of these quote characters, at beginning and end of some lines :

                • Move at the very beginning of your file ( IMPORTANT )

                • Find ^"|"$    or    ^“|”$

                • REPLACE Leave EMPTY

                • Select the Regular expression search mode

                • Click once on the Replace All button

                Best Regards,

                guy038

                1 Reply Last reply Reply Quote 2
                7 out of 7
                • First post
                  7/7
                  Last post
                The Community of users of the Notepad++ text editor.
                Powered by NodeBB | Contributors