Can a regex search use a word list stored in an external .txt/.csv file?



  • Hello guys, first of all, sorry for any mistakes in my English; I am Italian and I am writing with Google Translate.
    I have a database in .csv format.
    Each line contains all the personal data of one contact, including the email address.
    I also have a list of 300 email addresses of contacts to delete from the database.
    If I had 1 email to delete, I would search for ^.+(email1@email1.it).+$ to find the contact and replace the entire line with nothing.
    If I had 2 emails to delete, I would search for ^.+(email1@email1.it|email2@email2.it).+$ to find the 2 contacts and replace both lines with nothing.
    And so on.

    But if I have a list of 300 emails to delete, stored in an “email_canc.txt” file, can I reference that file from within the search regex, so that the pattern covers all the emails automatically?
    Or, if this is not possible, how can I match all 300 emails in my database with a simple pattern?

    Thanks for your help



  • Hello @pj and All,

    So, let's start with this simple database file:

    xxxx;yyyy;zzzzzzz;tt;xxxx.yyy@toto.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;yyy@abc.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;xxxx.yyy@gmail.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;hhh@123.com;aaaaaa;;;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;wwwwwwwwwwww.yy@hotmail.fr;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;hh.jjjjj@zzz.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;123456@789.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;zzzz.tttt@yahoo.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;kkk.lll.mmm@abcde.com;aaaaaa;bbbb;;cccccc
    

    And let’s suppose that your email_canc.txt file contains these 3 lines:

    xxxx.yyy@gmail.com
    wwwwwwwwwwww.yy@hotmail.fr
    zzzz.tttt@yahoo.com
    

    Here is a possible method:


    Right after the database contents:

    • Add the line =====, used as a separator

    • Then, append all the contents of the email_canc.txt file

    We now get this text:

    xxxx;yyyy;zzzzzzz;tt;xxxx.yyy@toto.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;yyy@abc.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;xxxx.yyy@gmail.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;hhh@123.com;aaaaaa;;;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;wwwwwwwwwwww.yy@hotmail.fr;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;hh.jjjjj@zzz.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;123456@789.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;zzzz.tttt@yahoo.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;kkk.lll.mmm@abcde.com;aaaaaa;bbbb;;cccccc
    =====
    xxxx.yyy@gmail.com
    wwwwwwwwwwww.yy@hotmail.fr
    zzzz.tttt@yahoo.com
    
    • Open the Replace dialog (Ctrl + H)

      • SEARCH (?s)(?-s:^.+;(.+?);.+)\R(?=.+^=====.+^\1$)|^=====.+

      • REPLACE Leave EMPTY

      • Tick the Wrap around option

      • Select the Regular expression search mode

      • Click on the Replace All button

    => You should get these final database contents:

    xxxx;yyyy;zzzzzzz;tt;xxxx.yyy@toto.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;yyy@abc.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;hhh@123.com;aaaaaa;;;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;hh.jjjjj@zzz.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;123456@789.com;aaaaaa;bbbb;;cccccc
    xxxx;yyyy;zzzzzzz;tt;kkk.lll.mmm@abcde.com;aaaaaa;bbbb;;cccccc
    

    As you can see, none of the remaining lines contains an e-mail address listed in the email_canc.txt example file ;-))
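
    If you would rather leave the database file untouched while testing, the same filtering can be sketched in a few lines of Python. This is only an illustration, not part of the Notepad++ method above: the function names, the output path and the UTF-8 encoding are all assumptions. Like the regex, it drops a line when any of its ;-separated fields equals an address from the list, and the comparison is exact and case-sensitive.

```python
from pathlib import Path


def load_emails(text):
    """Return the set of non-empty, stripped lines of the removal list."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def drop_listed_contacts(db_text, emails):
    """Keep only the lines where no ';'-separated field is a listed e-mail."""
    kept = []
    for line in db_text.splitlines():
        fields = line.split(";")
        if not any(field in emails for field in fields):
            kept.append(line)
    return kept


def filter_file(db_path, list_path, out_path, encoding="utf-8"):
    """File-based wrapper; the paths and the encoding are assumptions."""
    emails = load_emails(Path(list_path).read_text(encoding=encoding))
    db_text = Path(db_path).read_text(encoding=encoding)
    kept = drop_listed_contacts(db_text, emails)
    # Write the surviving lines to a separate file, one per line.
    Path(out_path).write_text("\n".join(kept) + "\n", encoding=encoding)
    return len(kept)
```

    For example, filter_file("database.csv", "email_canc.txt", "database_clean.csv") would write the surviving lines to a new file and return how many were kept; the two .csv names are placeholders for your own files.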

    Best Regards,

    guy038



  • @guy038 thanks,
    great solution :)

