
    Copy and paste sections with RegEx or macro?

    7 Posts 4 Posters 1.0k Views
    • Venus642

      Hello Community.

      I have a problem and hope someone can help me.

      I have a txt file with questions and answers

      Example TXT file:

      ##xx##
      ☐ Question1: L123orem i>psum dolor 
      ☐ Question2: sit a>sdfs>m213et, 
      ☑ Question3: con><<set>easdtur sadipscing 
      ##xy##
      ##xx##
      ☐ Question1: elifdf343tr, 1sed 23diam nonumyfdg
      ☐ Question2: 32423df34
      ☑ Question3: eirmd45345ocxfdvcxvd tem123por
      ☑ Question4: 543534 in34vifgdunt ut labore
      ##xy##
      

      and I want to turn it into this (for CSV import):

      "
      ☐ Question1: L123orem i>psum dolor 
      ☐ Question2: sit a>sdfs>m213et, 
      ☐ Question3: con><<set>easdtur sadipscing 
      ";"
      ☐ Question1: L123orem i>psum dolor 
      ☐ Question2: sit a>sdfs>m213et, 
      ☑ Question3: con><<set>easdtur sadipscing 
      "
      "
      ☐ Question1: elifdf343tr, 1sed 23diam nonumyfdg
      ☐ Question2: 32423df34
      ☐ Question3: eirmd45345ocxfdvcxvd tem123por
      ☐ Question4: 543534 in34vifgdunt ut labore
      ";"
      ☐ Question1: elifdf343tr, 1sed 23diam nonumyfdg
      ☐ Question2: 32423df34
      ☑ Question3: eirmd45345ocxfdvcxvd tem123por
      ☑ Question4: 543534 in34vifgdunt ut labore
      "
      

      Each section is delimited by these markers:
      ##xx##
      …
      ##xy##

      The content in between can contain all kinds of characters, and the number of lines varies.
      The first character in each line is ☐ or ☑.
      When copying, every ☑ must be changed to ☐ in the first copy.

      How can this be implemented with Notepad++? I’ve tried using a macro, cut and paste with a new TAB window, but that doesn’t work right. Does anyone have an idea with RegEx?
      I would appreciate some help.

      • PeterJones @Venus642

        @Venus642 ,

        I’ve tried using a macro, cut and paste with a new TAB window, but that doesn’t work right.

        I beg to differ.

        I just recorded a macro with the following sequence:

        1. Select All
        2. Copy
        3. File > New
        4. Paste (into new tab)
        5. Search > Replace: FIND = ☑, REPLACE = ☐, SEARCH MODE = Regular Expression => REPLACE ALL

        I then saved that macro. I restarted Notepad++, pasted in your example “before” text, and ran the macro: I ended up with a second file that had all the ☑ converted to ☐.

        As any regex search-and-replace can be recorded in a macro, I could have made a more complicated regex to accomplish your exact goal. But while I saw that you wanted to change some of the ☑ to ☐, others were not converted, and based on your description, I could not tell which ones should or should not be converted.

        If you need help refining the regex, you will need to provide a better description of when the ☑ should and should not be converted to ☐.

        • Neil Schipper @Venus642

          @Venus642,

          This looks like it can be solved in two regex phases.

          In the first phase, we would make exact copies of the text between record delimiters, while also altering the record delimiters.

          RecStart is ##xx## on a line, RecEnd is ##xy## on a line

          We match: <RecStart><all in-between text, captured to capture group 1, CG1><RecEnd>

          Replace with: <" on a line><CG1><";" on a line><CG1><" on a line>

          In the second phase, match all checked box characters in the second copy, and replace with unchecked box characters.

          Each record’s second copy is now bound by these record delimiters:

          RecStart is ";" on a line, RecEnd is " on a line

          The first phase is a fairly straightforward regex (I couldn’t tell from your post if you have enough experience to do this yourself).

          The second phase is complex, but has been solved in this community many times. If you search the community for posts with text “FR BR BSR ESR” you will find many examples, such as this one: https://community.notepad-plus-plus.org/topic/22667/how-to-delete-but-only-in-certain-lines

          Once you tune the two regexes, you can create a macro that performs both as a single action. You would do this in the working file tab, and then you are free to “save as…” or select-copy-paste to a new tab, undo (Ctrl+Z), and so on. If you wish, you could also have the macro copy everything and paste it into a new file tab, as Peter showed.
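
For anyone who prefers to do the conversion outside Notepad++, the two phases can also be folded into one pass with a small script. This is only a sketch, not the approach discussed in this thread: the function name `convert` is made up for illustration, it assumes every record starts with a `##xx##` line and ends with a `##xy##` line, and it unchecks the first copy, as in the expected output of the original post.

```python
def convert(text: str) -> str:
    """Turn ##xx## ... ##xy## records into quoted, duplicated blocks."""
    out = []
    record = None  # lines of the record being collected, or None outside one
    for line in text.splitlines():
        if line.strip() == "##xx##":
            record = []
        elif line.strip() == "##xy##" and record is not None:
            # First copy: a leading checked box becomes an unchecked box.
            unchecked = [
                ln.replace("☑", "☐", 1) if ln.startswith("☑") else ln
                for ln in record
            ]
            out += ['"', *unchecked, '";"', *record, '"']
            record = None
        elif record is not None:
            record.append(line)
    # Every output line, including the last, ends with a newline.
    return "".join(ln + "\n" for ln in out)


sample = "##xx##\n☐ Question1: a\n☑ Question2: b\n##xy##\n"
converted = convert(sample)
```

Only a box at the very start of a line is touched, so a ☑ that happens to appear inside the question text is left alone.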

          @peterjones, my guess is that they ran into trouble with copy and paste actions, some of which don’t work in macros as one would hope.

          • Venus642

            Thanks PeterJones and Neil Schipper for your help.

            I’ll try again and hope it works.

            • guy038

              Hello, @venus642, @peterjones, @neil-schipper and All,

              Like the other posters, I propose a solution in two regex phases!

              So, from your INPUT text :

              ##xx##
              ☐ Question1: L123orem i>psum dolor 
              ☐ Question2: sit a>sdfs>m213et, 
              ☑ Question3: con><<set>easdtur sadipscing 
              ##xy##
              ##xx##
              ☐ Question1: elifdf343tr, 1sed 23diam nonumyfdg
              ☐ Question2: 32423df34
              ☑ Question3: eirmd45345ocxfdvcxvd tem123por
              ☑ Question4: 543534 in34vifgdunt ut labore
              ##xy##
              

              Use this first regex S/R :

              • SEARCH (?s-i)^##xx##\R(.+?)^##xy##\R

              • REPLACE "\r\n\1";"\r\n\1"\r\n

              Which :

              • Duplicates the questions block
              • Changes ##xx## and ##xy## into a single " char
              • Separates each block from its duplicate one with a line ";"
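
To try this first step outside Notepad++, here is a rough Python equivalent. It is a sketch under a few assumptions: Python's `re` module has no `\R`, so `\r?\n` stands in for it (lone `\r` line endings are not handled), and the names `RECORD` and `duplicate_records` are invented for this example.

```python
import re

# Python's re has no \R, so Windows or Unix line breaks are spelled out.
NL = r"\r?\n"

# Same idea as the SEARCH above: lazily capture everything between the
# ##xx## line and the ##xy## line. DOTALL lets "." match line breaks,
# MULTILINE lets "^" match at the start of every line.
RECORD = re.compile(rf"^##xx##{NL}(.+?)^##xy##{NL}", re.DOTALL | re.MULTILINE)


def duplicate_records(text: str, eol: str = "\r\n") -> str:
    """Wrap each record in quote lines and duplicate it around a ";" line."""
    def repl(match):
        body = match.group(1)
        return f'"{eol}{body}";"{eol}{body}"{eol}'
    return RECORD.sub(repl, text)


sample = "##xx##\r\n☐ Question1: a\r\n☑ Question2: b\r\n##xy##\r\n"
duplicated = duplicate_records(sample)
```

Using a replacement function instead of a replacement string avoids any trouble with backslashes in the question text. As with the Notepad++ version, a final `##xy##` line that has no line break after it will not match.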

              So, you will get this temporary result :

              "
              ☐ Question1: L123orem i>psum dolor 
              ☐ Question2: sit a>sdfs>m213et, 
              ☑ Question3: con><<set>easdtur sadipscing 
              ";"
              ☐ Question1: L123orem i>psum dolor 
              ☐ Question2: sit a>sdfs>m213et, 
              ☑ Question3: con><<set>easdtur sadipscing 
              "
              "
              ☐ Question1: elifdf343tr, 1sed 23diam nonumyfdg
              ☐ Question2: 32423df34
              ☑ Question3: eirmd45345ocxfdvcxvd tem123por
              ☑ Question4: 543534 in34vifgdunt ut labore
              ";"
              ☐ Question1: elifdf343tr, 1sed 23diam nonumyfdg
              ☐ Question2: 32423df34
              ☑ Question3: eirmd45345ocxfdvcxvd tem123por
              ☑ Question4: 543534 in34vifgdunt ut labore
              "
              

              Then, use this second regex S/R :

              • SEARCH ☑(?=(?s:(?!").)+";")

              • REPLACE ☐

              which :

              • Replaces any ☑ character with ☐ ONLY IF it is followed, further on, by the string ";"

              • And, also, ONLY IF every character between the ☑ char and the string ";" is different from a double-quote ( " )
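
The lookahead can be checked quickly in Python, whose `re` module accepts the same scoped `(?s:...)` group. This is a small sketch on a shortened sample, and the name `FIRST_COPY_CHECKED` is made up for the example.

```python
import re

# A checked box, but only when a ";" separator line comes before any other
# double quote. That is true for the first copy of a block only.
FIRST_COPY_CHECKED = re.compile(r'☑(?=(?s:(?!").)+";")')

temporary = (
    '"\r\n'
    '☐ Question1: a\r\n'
    '☑ Question2: b\r\n'
    '";"\r\n'
    '☐ Question1: a\r\n'
    '☑ Question2: b\r\n'
    '"\r\n'
)
final = FIRST_COPY_CHECKED.sub("☐", temporary)
```

In the second copy, the next double quote after the ☑ is the closing `"` line, which is not followed by `;`, so the lookahead fails and the box stays checked.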

              So, here is your expected OUTPUT text :

              "
              ☐ Question1: L123orem i>psum dolor 
              ☐ Question2: sit a>sdfs>m213et, 
              ☐ Question3: con><<set>easdtur sadipscing 
              ";"
              ☐ Question1: L123orem i>psum dolor 
              ☐ Question2: sit a>sdfs>m213et, 
              ☑ Question3: con><<set>easdtur sadipscing 
              "
              "
              ☐ Question1: elifdf343tr, 1sed 23diam nonumyfdg
              ☐ Question2: 32423df34
              ☐ Question3: eirmd45345ocxfdvcxvd tem123por
              ☐ Question4: 543534 in34vifgdunt ut labore
              ";"
              ☐ Question1: elifdf343tr, 1sed 23diam nonumyfdg
              ☐ Question2: 32423df34
              ☑ Question3: eirmd45345ocxfdvcxvd tem123por
              ☑ Question4: 543534 in34vifgdunt ut labore
              "
              

              Best Regards

              guy038

              • Neil Schipper @guy038

                Hi @guy038,

                Your solution made me realize I made an error in specifying the substitution of the ☑s in the second group rather than the first!

                However, your solution will fail to replace checked boxes if the text in a question contains a " which is fairly foreseeable. (It’s also possible but much less likely that ";" appears in a question.) (The OP’s samples strongly suggest that question text should be considered “anything except newline”.) So it’s safer that the record delimiters include their newlines:

                ☑(?=(?s:(?!"\R).)+";"\R)
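
The difference between the two lookaheads shows up as soon as a question contains a double quote. Below is a small Python comparison, given as a sketch: `\r?\n` stands in for `\R`, which Python's `re` module does not support, and the names `LOOSE` and `STRICT` are invented for this example.

```python
import re

# Earlier version: any double quote ends the scan.
LOOSE = re.compile(r'☑(?=(?s:(?!").)+";")')
# Safer version: only a double quote at the end of a line ends the scan.
STRICT = re.compile(r'☑(?=(?s:(?!"\r?\n).)+";"\r?\n)')

text = (
    '"\n'
    '☑ Question1: say "hello" now\n'
    '";"\n'
    '☑ Question1: say "hello" now\n'
    '"\n'
)

loose_result = LOOSE.sub("☐", text)    # nothing is replaced
strict_result = STRICT.sub("☐", text)  # only the first copy is unchecked
```

The stricter pattern would still be fooled by a question line that ends in a double quote, which is the remaining edge case.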

                • guy038

                  Hi, @venus642, @neil-schipper and All,

                  Oh… yes, exactly right, and a clever initiative, Neil, indeed!

                  Of course, as the " character and the string ";" seemed to be separators, I presumed that they were not used within the content of the questions. But I agree that this new formulation is safer !

                  So, @venus642, the second regex S/R to use is, preferably :

                  • SEARCH ☑(?=(?s:(?!"\R).)+";"\R)

                  • REPLACE ☐

                  Cheers,

                  guy038
