
Move everything on a line after the occurrence of a string to a fixed column

  • A
    Anthony Bouttell
    last edited by Anthony Bouttell Jun 10, 2020, 8:25 PM Jun 10, 2020, 8:24 PM

    Hi, I have no doubt this has been asked before, but I can't find the question…
    I have a file, ( 4000+ entries )

     QUALIFIEDMORTGAGECODE QUALIFIED MORTGAGE CODE
     PURCHASEPRICE PURCHASE PRICE
     PROPERTYVALUEAMOUNT PROPERTY VALUE AMOUNT
     PROPERTYTYPE PROPERTY TYPE
    

    In which I want everything after the first space to start in column 32, using spaces, not tabs

     QUALIFIEDMORTGAGECODE            QUALIFIED MORTGAGE CODE
     PURCHASEPRICE                    PURCHASE PRICE
     PROPERTYVALUEAMOUNT              PROPERTY VALUE AMOUNT
     PROPERTYTYPE                     PROPERTY TYPE
    

    What regex should I use to achieve this goal?

    • P
      PeterJones @Anthony Bouttell
      last edited by Jun 10, 2020, 8:42 PM

      @Anthony-Bouttell,

      Personally(*), I would accomplish this task in two steps: first, make sure everything has at least 32 spaces after the first word; second, reduce those so that there are only 32 characters in the left section.

      1. make sure everything has at least 32 spaces after the first word

        • FIND = (?-s)^(.*?)\x20
        • REPLACE = ${1}\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20 (that's 32 of the \x20 escape, each of which encodes one space)
        • Search Mode = regular expression
      2. reduce those so that there are only 32 characters in the left section

        • FIND = (?-s)^(.{32})\x20*
        • REPLACE = ${1}
        • Search Mode = regular expression
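
For readers who want to check the two steps outside Notepad++, here is a minimal Python sketch of the same pad-then-trim idea. It is only an illustration: the function name `pad_then_trim` and the `width` parameter are made up for this example, Python's `re` module stands in for Notepad++'s regex engine, and it assumes the lines do not start with a space.

```python
import re

def pad_then_trim(text, width=32):
    """Two-step alignment: pad after the first word, then trim to `width` chars."""
    # Step 1: replace the first space on each line with `width` spaces,
    # so every line has at least `width` characters before the second word.
    padded = re.sub(r"^(.*?) ", lambda m: m.group(1) + " " * width,
                    text, flags=re.MULTILINE)
    # Step 2: keep the first `width` characters and drop the spaces after them.
    return re.sub(r"^(.{%d}) *" % width, r"\g<1>", padded, flags=re.MULTILINE)

sample = "QUALIFIEDMORTGAGECODE QUALIFIED MORTGAGE CODE\nPURCHASEPRICE PURCHASE PRICE"
for line in pad_then_trim(sample).splitlines():
    print(line)
```

With the default width, the second word lands at 0-based index 32, which is column 33 when counting from 1.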

      Based on your definition of “start in column 32”, I may be off by 1-2 chars in where the final placement is; change the {32} by +/- 1-2 if it’s not quite right. You might also want a few extra \x20 in the first replacement.

      I gave that caveat because I would call your example text having the second word start in column 34:

      123456789x123456789x123456789x123456789x
      QUALIFIEDMORTGAGECODE            QUALIFIED MORTGAGE CODE
      PURCHASEPRICE                    PURCHASE PRICE
      PROPERTYVALUEAMOUNT              PROPERTY VALUE AMOUNT
      PROPERTYTYPE                     PROPERTY TYPE
      

      If that space at the beginning was real, then we might need to tweak that first expression to ignore a space in the first column.

      123456789x123456789x123456789x123456789x
       QUALIFIEDMORTGAGECODE            QUALIFIED MORTGAGE CODE
       PURCHASEPRICE                    PURCHASE PRICE
       PROPERTYVALUEAMOUNT              PROPERTY VALUE AMOUNT
       PROPERTYTYPE                     PROPERTY TYPE
      

      This should give you a starting point. If it doesn’t work, explain how it was wrong (including numerical rulers, if necessary)

      *: Some would craft a single regex to do it. I have been known to try that, even in this forum. But really, for practical regex, it’s best to stick with what’s logical and most efficient, and with what you can easily think of and understand. It’s usually more efficient to do something like this in multiple steps, because a super-fancy single regex takes longer to craft and tends to be more fragile (less able to handle slight changes) than step-by-step ones.

      • P
        PeterJones @PeterJones
        last edited by Jun 10, 2020, 8:45 PM

        I would call your example text having the second word start in column 34

        Based on that same definition, I would actually call my result “start in 33” rather than “start in 32”.

        Yours would’ve been “start in 33” in 0-based counting, which is what would be required for mine to be “start in 32”.
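
To make the 0-based versus 1-based distinction concrete, here is a small Python sketch that rebuilds the ruler used above and reports where the second word starts under either convention. The helper names `ruler` and `second_word_column` are invented for this illustration, and the sample line is the first example line without a leading space.

```python
import re

def ruler(width):
    """Build a ruler like 123456789x123456789x... of the given width."""
    return "".join("x" if i % 10 == 0 else str(i % 10) for i in range(1, width + 1))

def second_word_column(line, one_based=True):
    """Return the column where the second word starts, or None if there is none."""
    m = re.match(r" *[^ ]+ +", line)  # optional indent, first word, run of spaces
    if m is None or m.end() == len(line):
        return None
    return m.end() + (1 if one_based else 0)

line = "QUALIFIEDMORTGAGECODE" + " " * 12 + "QUALIFIED MORTGAGE CODE"
print(ruler(40))
print(line)
print(second_word_column(line), second_word_column(line, one_based=False))
```

For this line the two calls give 34 and 33, matching the two ways of counting described above.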

        This should give you a starting point. If it doesn’t work, explain how it was wrong (including numerical rulers, if necessary)

        I apparently never finished that thought. I meant to go on and say:

        But before asking us for more help, try to tweak what I gave you to get closer to your goal. If you try, but still cannot get it quite right, show us what you tried, and we can help you tweak it more.

        • A
          Anthony Bouttell
          last edited by Jun 10, 2020, 9:43 PM

          @PeterJones Thanks!! It worked like a charm.

          I notice you use ${1}
          I generally use \1
          is ${1} the preferred method for labels now?

          FYI… The number 32 was just a guess, based on a small sample. Given some of the tags I just saw, it’ll be around 60.
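
Since the target column is still a moving number, a parameterised version may be handy. The following is a hedged Python sketch, not something from the replies in this thread: it skips regex entirely and uses `str.partition` and `str.ljust`. The function name `align_second_field` and its `column` parameter (1-based) are assumptions made for this example, and it assumes lines do not begin with a space.

```python
def align_second_field(text, column=60):
    """Start everything after the first space at 1-based `column`, using spaces."""
    out = []
    for line in text.splitlines():
        head, sep, tail = line.partition(" ")
        if not sep:               # no space on this line: leave it alone
            out.append(line)
            continue
        # Pad the first word to column - 1 characters, but always keep at
        # least one space if the word is longer than that.
        target = max(column - 1, len(head) + 1)
        out.append(head.ljust(target) + tail.lstrip(" "))
    return "\n".join(out)

print(align_second_field("PURCHASEPRICE PURCHASE PRICE\nPROPERTYTYPE PROPERTY TYPE", 60))
```

Changing the single `column` argument is then enough when a longer tag turns up.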

          • P
            PeterJones @Anthony Bouttell
            last edited by Jun 10, 2020, 10:12 PM

            @Anthony-Bouttell,

            @PeterJones Thanks!! It worked like a charm.

            Glad it works for you.

            I notice you use ${1}

            I learned regex in Perl, where the \1 notation has been deprecated in favor of $1 or ${1} for quite some time. It’s what I’m used to. And by using the braces, it always disambiguates between ${1}0 and ${10}, whereas $10 or \10 always means the former, but looks an awful lot like it means the latter. But it’s just a stylistic choice; they all work.
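
The same kind of ambiguity exists in other regex flavours. As an illustration only, this short sketch shows how Python's `re` module handles it; Python's rules are not the same as Notepad++'s, so treat it as an analogy. In Python the unambiguous spelling is `\g<1>0`, while `\10` is read as a reference to group 10.

```python
import re

# One capture group; the goal is "group 1 followed by a literal 0".
pattern = re.compile(r"(\d)")

# \g<1>0 is unambiguous: group 1, then the character "0".
print(pattern.sub(r"\g<1>0", "a1b2"))

# \10 is read as a reference to group 10, which this pattern does not have,
# so Python rejects the replacement template instead of guessing.
try:
    pattern.sub(r"\10", "a1b2")
    ambiguous_ok = True
except re.error:
    ambiguous_ok = False
print(ambiguous_ok)
```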

            • G
              guy038
              last edited by guy038 Jun 12, 2020, 11:02 AM Jun 11, 2020, 6:17 PM

              Hello, @anthony-bouttell, @peterjones and All,

              Like @peterjones, here is another two-step method. But:

              • As it seems that each line of your text begins with a space char, I slightly changed the first search regex

              • In the second search regex, we skip the first 31 characters of each line, then match any non-empty run of space chars

              So, assuming your initial text :

               QUALIFIEDMORTGAGECODE QUALIFIED MORTGAGE CODE
               PURCHASEPRICE PURCHASE PRICE
               PROPERTYVALUEAMOUNT PROPERTY VALUE AMOUNT
               PROPERTYTYPE PROPERTY TYPE
              

              This regex S/R :

              SEARCH (?<=\w)\x20.+

              REPLACE \x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20$0

              In replacement, note the $0 syntax, after the 30 consecutive space chars ( \x20 )

              gives :

               QUALIFIEDMORTGAGECODE                               QUALIFIED MORTGAGE CODE
               PURCHASEPRICE                               PURCHASE PRICE
               PROPERTYVALUEAMOUNT                               PROPERTY VALUE AMOUNT
               PROPERTYTYPE                               PROPERTY TYPE
              

              And with this second regex S/R

              SEARCH ^.{31}\K\x20+

              REPLACE Leave EMPTY

              You should get the expected data :

               QUALIFIEDMORTGAGECODE         QUALIFIED MORTGAGE CODE
               PURCHASEPRICE                 PURCHASE PRICE
               PROPERTYVALUEAMOUNT           PROPERTY VALUE AMOUNT
               PROPERTYTYPE                  PROPERTY TYPE
              

              IMPORTANT :

              • Because of the (?<=\w) look-behind and the \K syntax, you must use the Replace All button, exclusively, for these two S/R!

              • If you need to run these two regex S/R on only part of your text, select that range of text and tick the In selection option to restrict the scope of the Replace All button.

              • For the second regex S/R to work correctly, each line must already be padded with enough space characters. When in doubt, just run the first regex S/R again to add 30 more spaces before running the second S/R.
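
As a rough cross-check outside Notepad++, the two S/R above can be approximated in Python. This is a sketch under assumptions: the function name `align_indented` and its `keep` and `pad` parameters are invented here, and because Python's standard `re` module has no `\K`, the second step uses a capture group instead, which is a substitute technique rather than the regex given above.

```python
import re

def align_indented(text, keep=31, pad=30):
    """Approximate the two S/R for lines that begin with one space."""
    # Step 1: at the first space that follows a word character, insert `pad`
    # spaces in front of the rest of the line (the whole match, like $0).
    padded = re.sub(r"(?<=\w) .+", lambda m: " " * pad + m.group(0), text)
    # Step 2: keep the first `keep` characters and delete the run of spaces
    # that follows. A capture group stands in for \K here.
    return re.sub(r"^(.{%d}) +" % keep, r"\g<1>", padded, flags=re.MULTILINE)

sample = " QUALIFIEDMORTGAGECODE QUALIFIED MORTGAGE CODE\n PURCHASEPRICE PURCHASE PRICE"
for line in align_indented(sample).splitlines():
    print(line)
```

With the defaults, the second word starts at 0-based index 31, that is, column 32 when counting from 1.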

              Best Regards,

              guy038

              • A
                astrosofista @Anthony Bouttell
                last edited by Jun 11, 2020, 7:02 PM

                Hi @Anthony-Bouttell, @PeterJones, @guy038, All:

                For the sake of variety, look at this non-regex solution (it only requires the BetterMultiSelection and Elastic Tabstops plugins installed and enabled, which you can download and install via Plugins Admin):

                [Animated GIF (hosted on Giphy) demonstrating the steps below]

                First off, open Preferences -> Editing, and enable the Multi-Editing Settings.

                Then, do as the movie shows:

                1. Go to the first line and insert the required spaces to move the second word up to column 32.

                2. Press Home to place the caret at the beginning of this line.

                3. Press Shift + Alt and move the caret from the top line until the bottom of the list with the arrow down - you will get a giant caret blinking along 8 lines with no characters selected.

                4. Press Ctrl + Right twice. Eight carets will be blinking at the left of each second word.

                5. Press Shift + Left to backward select a space.

                6. Press Tab to magically align all the second words.

                7. Press Esc and then Down to release the multiselection.

                8. Select Plugins -> Elastic Tabstops -> Convert Tabstops to Spaces.

                That’s all. Hope you like it.

                Have fun!

                The Community of users of the Notepad++ text editor.