    Can Regex paste values into another file?

      Terry R

      The extraction of phone numbers would most likely be achieved by either bookmarking the relevant lines with a regex (regular expression) so they could be removed later, or more than likely with a regex to remove all but the phone numbers. So this would be a “destructive” search and replace.
      The insertion of these numbers into another file could also be achieved with a regex, but again it would require setting things up in a particular way.

      From this you might gather that it ain’t gonna be quick and would very likely be a bit messy.

      Of course N++ can utilize additional plugins such as python-script. This programming language could easily do what you require, but there are far fewer python users than regex users and you might not want to be left with something you could not support later on, without learning the language yourself.

      As the source is Excel, why not just stay in that environment and learn its macro language (VBA)? Either way you will have to learn something, so I’d suggest keeping it all in the one package.

      I’ve done Excel programming and whilst the initial learning path can be steep, there are lots of code snippets on the web. You may even find something very close to what you want and thus get off to a great start without much effort.

      Terry

        C Ananda Rao

        Thanks Terry R!
        The main reason I started down the Notepad++ route was that ‘Text to Columns’ in Excel was turning all digits separated by either a dash or a forward slash into all sorts of dates. I tried very hard (by way of YouTube and Excel forums), but I was unable to stop Excel from doing this.

        I do not want to turn this into an Excel discussion, but there are other problems too. Sometimes when I copy and paste between Excel and text editors, apostrophes appear here and there (in both directions).

        But my first two problems (appending a hash mark to all lines starting with a digit, and swapping the first and second blocks of text) were solved so elegantly with regex that I thought maybe this is the way to go. As for the process being messy or destructive, I have no issues. The original file is secure, and I am just using a lot of copies, and only copies of the address column, as I try to get my address book into shape.

        Best regards.

          Scott Sumner @C Ananda Rao

          @C-Ananda-Rao

          While Terry makes some good points, reading your OP makes me think that your request is doable. How about showing some sample data? [Best to mock up some data in Notepad++, then indent each line 4 spaces before pasting the data here]

            C Ananda Rao @Scott Sumner

            Dear Mr Scott Sumner
            I am pasting some lines from my address book after adding 4 spaces to the beginning of every line.
            The 10 digit numbers (continuous digits, no dash character) are mobile numbers. Those digits in the form of 09876-987654 are landline numbers. I replaced all comma space by just commas to make it uniform.

            What I would like to happen is for:-
            Another text file to be created with 12 empty lines.
            Paste only telephone numbers line by line.
            If there is no telephone number, just put “none” in that line.

            F Mothar(V),Hagrin(M),1234554321,1122334455
            W/o Torinfel Gerjip,#12-98/C, M Fasthut(V),Beliperi(M),0987-654321,9988776655
            #67-7-A,L Fandrims Colony,Pelletin(T&M),9182736450,01928-4456689
            Rastaferap,Railway Colony,Molarrint(U)
            Dentenalaity(V),Dalrufatha(M),01234-556677,08723-98765
            H/o Hirundinaines Thomme,Lombef(V&M),9786751342,0987-654321
            Gyrrichalon(V),#9-8-76/D,Algewive(M)
            Vasternalik(V),Lombef(M),9786756453,0789-124578
            #11-42-V,Yendraiken Juhar,Tricherenti(V),Doorgassu(R)
            Gotorana Colony,Symboll,Pallore(TM&D),9753124680
            Bavemisuram(V),Gaujunar Dettil(P),Votari(M),9247482263
            

            Thanks and best regards!

              Scott Sumner @C Ananda Rao

              @C-Ananda-Rao

              I would do it this way, which leaves lines with no phone numbers at the end totally blank; if you really need the word none there you could follow it up by replacing ^$ with none.

              Find what zone: (?-s)^.*?([0-9-]*)$
              Replace with zone: \1
              Match case checkbox: ticked or unticked doesn’t matter
              Wrap around checkbox: As you like it
              Search mode: Regular expression
              . matches newline checkbox: ticked or unticked doesn’t matter
              Action: Press Replace / Replace All button

              Here’s an explanation of how it works:

              THE FIND EXPRESSION:

              (?-s)^.*?([0-9-]*)$

              • [Use these options for the whole regular expression][1] (?-s)
                • [(hyphen inverts the meaning of the letters that follow)][1] -
                • [Dot doesn’t match line breaks][1] s
              • [Assert position at the beginning of a line (at beginning of the string or after a line break character) (carriage return and line feed, form feed)][2] ^
              • [Match any single character that is NOT a line break character (line feed, carriage return, form feed)][3] .*?
                • [Between zero and unlimited times, as few times as possible, expanding as needed (lazy)][4] *?
              • [Match the regex below and capture its match into backreference number 1][5] ([0-9-]*)
                • [Match a single character present in the list below][6] [0-9-]*
                  • [Between zero and unlimited times, as many times as possible, giving back as needed (greedy)][7] *
                  • [A character in the range between “0” and “9”][6] 0-9
                  • [The literal character “-”][8] -
              • [Assert position at the end of a line (at the end of the string or before a line break character) (carriage return and line feed, form feed)][2] $

              THE REPLACE EXPRESSION:

              \1

              • [Insert the text that was last matched by capturing group number 1][9] \1
              • [Your regular expression may find zero-length matches][10]
                • [boost::regex 1.54–1.57 allows a zero-length match at the position where the previous match ends.][10]
                • [boost::regex 1.54–1.57 attempts the next match at the same position as the previous match if it was zero-length and may find a non-zero-length match at the same position.][10]

              Created with RegexBuddy

              [1]: http://www.regular-expressions.info/modifiers.html
              [2]: http://www.regular-expressions.info/anchors.html
              [3]: http://www.regular-expressions.info/dot.html
              [4]: http://www.regular-expressions.info/repeat.html#lazy
              [5]: http://www.regular-expressions.info/brackets.html
              [6]: http://www.regular-expressions.info/charclass.html
              [7]: http://www.regular-expressions.info/repeat.html
              [8]: http://www.regular-expressions.info/characters.html
              [9]: http://www.regular-expressions.info/replacebackref.html
              [10]: http://www.regular-expressions.info/zerolength.html
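
For anyone who wants to try the find expression outside Notepad++, here is a rough Python sketch of the same idea. It is only an approximation: Python's `re` module is not the Boost engine Notepad++ uses, the `(?-s)` prefix is dropped because Python's dot already stops at line breaks, and the helper name `trailing_number` and its `placeholder` argument are made up for illustration. It shows that the capture group keeps only the run of digits and hyphens at the very end of a line, so a line holding two numbers yields just the last one.

```python
import re

# Same find expression as in the post, without the (?-s) prefix:
# in Python "." never matches "\n" unless re.DOTALL is requested.
FIND = re.compile(r"^.*?([0-9-]*)$")


def trailing_number(line, placeholder="none"):
    """Return the run of digits/hyphens that ends the line.

    Hypothetical helper: `placeholder` stands in for the follow-up
    replacement of empty lines (^$) with the word none.
    """
    match = FIND.match(line)
    kept = match.group(1) if match else ""
    return kept or placeholder


if __name__ == "__main__":
    for sample in (
        "F Mothar(V),Hagrin(M),1234554321,1122334455",
        "Rastaferap,Railway Colony,Molarrint(U)",
    ):
        print(trailing_number(sample))
```

The lazy `.*?` gives up as little as possible, so everything before the final digit run is discarded, including any earlier phone number on the same line.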

                Terry R

                Now that we start to see the data and the output it does certainly become more doable.

                I suppose I should have asked for more detail at the start. I had visions of the file into which the numbers were to be inserted already containing records, so that we’d have to artificially create some unique key in both files in order to “data match”.

                So I was considering creating a “key” for each number, something like 6 characters of a surname, followed by 2 of something else reasonably unique (Christian name or road name), then followed by “-0”.
                In the 2nd file (which I thought would also contain names etc.), make the same key, followed by “-1”. Merge the 2 files and then sort alphabetically. This would put the phone number line alongside its receiving record. Then a regex would move the number from the first line of the “couple” to the 2nd line.

                I am a bit confused now, though, as the “new” text file will be blank. I don’t see the sense in that. I’d just massage the data in the first file, destroy any unwanted information, then pad out with extra blank lines. There is no need to also copy that to another file if this is the only data that will be in that file.

                Terry

                  C Ananda Rao @Scott Sumner

                  Dear Mr Sumner
                  Where is the ^$ sequence in the ‘Find what’ zone you have specified? The caret appears after the 1st right parenthesis and the dollar sign appears at the end. They do not occur together in that sequence. It is important for me to have no blanks when I take the problem back into Excel. So it is important for me to have ‘none’ or ‘null’ or some such wherever there are no telephone number sequences.

                  Also, when I tried this, it only gives me one telephone number. Mind you, there are 2 telephone numbers in some lines. 2 mobile numbers or 2 landline numbers or 1 mobile and 1 landline number or only 1 telephone number - either mobile or a landline in yet other lines.

                  The rigorous aspect about these telephone numbers is that mobile numbers are a sequence of 10 continuous digits, without any spaces or other characters.

                  The landline numbers all begin with a zero (0) and have a ‘-’ character somewhere.

                  Or was my data not representative enough?

                  I have trimmed leading and trailing spaces.

                  Am I missing something?

                  Best regards!

                    C Ananda Rao @Terry R

                    Dear Mr Terry R

                    I am depending upon the sequence of lines. They all have a serial number. In other words, each and every address line is on a line by itself. The next address starts at the beginning of the next line.

                    In my imagination, I want to create a text file with the same number of lines as my Excel address book. The lines are all empty. Now I go to my text file where I have each address on a separate line. I then do 2 things:

                    1. Delete all characters from those lines without phone numbers AND put ‘null’ or ‘none’ or ‘empty’ in all those lines.
                    2. Paste all phone numbers (only) into other lines. Whether 1 or 2 telephone numbers or whether 1 mobile or 1 landline number.
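
The two steps above can also be sketched outside Notepad++, for example in Python. This is only an illustration under assumptions taken from the descriptions in this thread: a mobile number is exactly 10 consecutive digits, and a landline starts with 0 and contains a single hyphen. The names `phone_column` and `write_phone_file`, and the file names in the usage note, are hypothetical. A house number that happens to look like a landline (say `#07-5`) would be picked up by mistake, so real data should be checked.

```python
import re
from pathlib import Path

# Assumed rules (from the thread, not a general phone-number grammar):
#   mobile   = exactly 10 consecutive digits
#   landline = starts with 0 and contains one "-"
# The lookarounds stop a match from starting or ending inside a longer
# digit/hyphen run such as the house number "12-98".
PHONE = re.compile(r"(?<![\d-])(?:\d{10}|0\d*-\d+)(?![\d-])")


def phone_column(address_lines, placeholder="none"):
    """Return one output line per address: its numbers, or the placeholder."""
    column = []
    for line in address_lines:
        numbers = PHONE.findall(line)
        column.append(",".join(numbers) if numbers else placeholder)
    return column


def write_phone_file(source, target):
    """Read addresses from `source`, write the phone column to `target`."""
    lines = Path(source).read_text(encoding="utf-8").splitlines()
    Path(target).write_text("\n".join(phone_column(lines)) + "\n", encoding="utf-8")
```

A call such as `write_phone_file("addresses.txt", "phones.txt")` would keep the line order intact, so line N of the output still corresponds to serial number N in the address book.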

                    Best regards

                      guy038

                      Hello, @c-ananda-rao and All,

                      I think that the following regex could achieve what you want, easily :-))

                      As you said :

                      I replaced all comma space by just commas to make it uniform.

                      I then simply supposed that your phone numbers are separated by space(s), comma(s) or a mix of these symbols !


                      So, starting with your sample text, below, copied into a new file :

                      F Mothar(V),Hagrin(M),1234554321,1122334455
                      W/o Torinfel Gerjip,#12-98/C, M Fasthut(V),Beliperi(M),0987-654321,9988776655
                      #67-7-A,L Fandrims Colony,Pelletin(T&M),9182736450,01928-4456689
                      Rastaferap,Railway Colony,Molarrint(U)
                      Dentenalaity(V),Dalrufatha(M),01234-556677,08723-98765
                      H/o Hirundinaines Thomme,Lombef(V&M),9786751342,0987-654321
                      Gyrrichalon(V),#9-8-76/D,Algewive(M)
                      Vasternalik(V),Lombef(M),9786756453,0789-124578
                      #11-42-V,Yendraiken Juhar,Tricherenti(V),Doorgassu(R)
                      Gotorana Colony,Symboll,Pallore(TM&D),9753124680
                      Bavemisuram(V),Gaujunar Dettil(P),Votari(M),9247482263
                      
                      • Open the Replace dialog ( Ctrl + H )

                      • Select the Regular expression search mode

                      • Tick the Wrap around option

                      • Use the following regex S/R :

                      SEARCH (?-s)(?=^.*?(([, ]+[\d-]+)+)).+|.+

                      REPLACE ?1\1:None

                      • And, finally, click once on the Replace All button or several times on the Replace button

                      You should get your expected text :

                      ,1234554321,1122334455
                      ,0987-654321,9988776655
                      ,9182736450,01928-4456689
                      None
                      ,01234-556677,08723-98765
                      ,9786751342,0987-654321
                      None
                      ,9786756453,0789-124578
                      None
                      ,9753124680
                      ,9247482263
                      

                      Notes :

                      • As usual, the (?-s) modifier means any dot ( . ) matches one standard character only ( and not the line breaks )

                      • The regex is made up of two alternatives, separated with the | symbol

                        • The first alternative is the regex (?=^.*?(([, ]+[\d-]+)+)).+ which matches all the standard characters of the current line ( .+ ), ONLY IF the condition of the positive look-ahead is true. That is to say, if a non-empty run of consecutive phone numbers, each preceded by spaces and/or commas, exists in the current line

                        • If, at beginning of line, the previous alternative cannot be matched, then the regex engine tries the second alternative, .+, which, again, matches all the standard characters of the current line, this time, WITHOUT any condition

                      • In the look-ahead structure, the part ([, ]+[\d-]+)+ represents the consecutive list of phone numbers; it is itself surrounded with parentheses in order to store the result as group 1, used later in the replacement

                      • In Replacement, we use the conditional replacement structure ?1\1:None

                        • If group 1 exists, only the list of phone number(s) ( \1 ) is written

                        • If group1 does not exist, the string None replaces the entire line matched
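
As a cross-check of the logic described in these notes, the same search expression can be tried in Python. This is a sketch under assumptions, not a reproduction of Notepad++: Python's `re` has no `?1\1:None` conditional replacement, so a small callback plays that role, the `(?-s)` prefix is dropped because Python's dot already excludes line breaks, and the name `keep_phone_lists` is invented here.

```python
import re

# guy038's search expression without the (?-s) prefix.
# re.MULTILINE lets "^" match at the start of every line.
SEARCH = re.compile(r"(?=^.*?(([, ]+[\d-]+)+)).+|.+", re.MULTILINE)


def keep_phone_lists(text, placeholder="None"):
    """Replace each line by its phone list (group 1) or by the placeholder."""

    def replace(match):
        # Stand-in for the conditional replacement ?1\1:None
        phones = match.group(1)
        return phones if phones is not None else placeholder

    return SEARCH.sub(replace, text)


if __name__ == "__main__":
    sample = (
        "F Mothar(V),Hagrin(M),1234554321,1122334455\n"
        "Rastaferap,Railway Colony,Molarrint(U)\n"
        "Gotorana Colony,Symboll,Pallore(TM&D),9753124680"
    )
    print(keep_phone_lists(sample))
```

Note that the expression treats any run of digits or hyphens preceded by a comma or space as a phone number, so an address containing something like `, 12` would be captured as well.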

                      Remarks :

                      • Note that this regex is “auto-convergent” ! If you re-run it against the obtained results, every line is simply replaced with identical contents :-))

                      • Depending on your native language, you may change the English word None, as you like !

                      Best Regards,

                      guy038

                      P.S. :

                      Of course, in order to delete the first comma at the beginning of each line and, for example, change every other comma into three space characters, use the regex S/R :

                      ^,|(,)

                      ?1\x20\x20\x20

                      And you’ll get the text :

                      1234554321   1122334455
                      0987-654321   9988776655
                      9182736450   01928-4456689
                      None
                      01234-556677   08723-98765
                      9786751342   0987-654321
                      None
                      9786756453   0789-124578
                      None
                      9753124680
                      9247482263
                      

                      Notes :

                      • This regex searches, either, for :

                        • A comma, beginning the line ( ^, )

                        • Any other comma, stored as group 1

                      • In replacement :

                        • If the first alternative matches, nothing is rewritten

                        • If the second alternative matches, 3 space characters are rewritten.

                      • Note that you may enter true space chars, instead of the escape syntax \x20, in the replacement zone !
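
The cleanup step can be sketched in Python in the same spirit. Again this is only an approximation of what Notepad++ does: the callback replaces the `?1\x20\x20\x20` conditional, and `tidy_separators` with its `gap` parameter are hypothetical names.

```python
import re

# "^," is a comma that begins a line; "(,)" is any other comma (group 1).
CLEANUP = re.compile(r"^,|(,)", re.MULTILINE)


def tidy_separators(text, gap="   "):
    """Drop a leading comma on each line; turn other commas into `gap`."""
    return CLEANUP.sub(lambda m: gap if m.group(1) else "", text)


if __name__ == "__main__":
    print(tidy_separators(",1234554321,1122334455\nNone\n,9753124680"))
```

Passing a different `gap`, for example a tab character, may be more convenient if the result is going back into a spreadsheet.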

                        Scott Sumner @C Ananda Rao

                        @C-Ananda-Rao

                        Well, when I asked for data, I guess I should have asked for an explanation of that data to accompany it… I suppose I assumed that the phone number was the last part of every line where digits were present. Anyway, hopefully @guy038’s response provides the help you needed.
