Matching Phone Numbers
-
I am new to Notepad++ and trying to get some assistance on finding duplicate phone numbers.
I have two files. File (A) is a set of 2000 10-digit phone numbers, e.g. (202) 111-4456.
File (B) is a list of 500,000 phone numbers.
I need to see if any phone numbers in File (A) are present in File (B). Any help?
-
Sounds like a job for join, not N++.
PS VinsWorldcom ~\tmp > cat .\filea.txt
(202) 111-4456
(202) 111-4458
PS VinsWorldcom ~\tmp > cat .\fileb.txt
(202) 111-4456
(202) 111-4457
(202) 111-4458
(202) 111-4459
PS VinsWorldcom ~\tmp > join -ta .\filea.txt .\fileb.txt
(202) 111-4456
(202) 111-4458
You can find join in GnuWin32. Cheers.
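For anyone who would rather script the comparison, the same intersection can be sketched in Python. This is only a minimal sketch, not part of the suggestion above: the function name common_lines and the sample data are assumptions for illustration, and it strips line endings before comparing on the assumption that a stray carriage return could otherwise hide a match.

```python
def common_lines(lines_a, lines_b):
    """Return the lines of A that also occur in B, in A's order.

    Line endings (\r and \n) are stripped before comparing, so a file
    saved with Windows line endings still matches one saved with Unix ones.
    """
    seen_in_b = {line.rstrip("\r\n") for line in lines_b}
    return [
        line.rstrip("\r\n")
        for line in lines_a
        if line.rstrip("\r\n") in seen_in_b
    ]


# Sample data mirroring the two small files shown in the thread.
file_a = ["(202) 111-4456\r\n", "(202) 111-4458\r\n"]
file_b = [
    "(202) 111-4456\n",
    "(202) 111-4457\n",
    "(202) 111-4458\n",
    "(202) 111-4459\n",
]

print(common_lines(file_a, file_b))
```

With real files, the two arguments could be open file objects, since iterating over a file yields its lines. Unlike join, this sketch does not need the inputs to be sorted.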
-
You could also try the Compare plugin, though I think that’ll just give you a visual indicator, not an output of the intersection like my join suggestion above. Cheers.
-
@michael-vincent Thank you! I just tried it and it did not return any values. I don’t know if there is a limit on characters, but no dice. Any other suggestions?
-
@ali-0 said in Matching Phone Numbers:
I need to see if any phone numbers in File (A) are present in File (B).
As long as both files have exactly the same format for the phone numbers this should work. It is better to work on copies of the files as we need to combine them into 1 file. So open both files in Notepad++. Then copy the contents of 1 file into the other, doesn’t matter which way you do it.
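Then we need to order the lines so that identical numbers appear on consecutive lines. Use Edit, Line Operations, Sort Lines As Integers Ascending. If the lines contain non-digit characters such as parentheses, Sort Lines Lexicographically Ascending should group identical lines just as well.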
Then we mark those lines which are followed by a duplicate, using the Mark function. Go to Search, Mark, and in Find What enter:
^(.+)\R(?=\1)
This is a regular expression, so the search mode must be “Regular expression”. Select the “Bookmark line” option and then click on “Mark All”.
Although the sort may take a minute or so to complete, the marking is quite quick (in my test). The Mark dialog reports the number of matches, which gives you an initial indication of how many duplicates there are. The lines are also marked with a “blue sphere” in the left margin. Using the F2 key (or Shift+F2 to go backwards) you can step through them individually. To see them all together, use Search, Bookmark, Copy Bookmarked Lines, then create a new tab in Notepad++ and paste them there.
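To see what the sort-then-mark steps do, here is a minimal sketch in Python on the sample numbers from earlier in the thread. It is an illustration under stated assumptions, not the Notepad++ behaviour itself: Python's re module has no \R, so \n stands in for it, and the lookahead is written as \1$ so that a line only counts when the following line is identical rather than merely starting with the same text. The variable names are made up for the example.

```python
import re

# Sample data standing in for the two files.
file_a = ["(202) 111-4456", "(202) 111-4458"]
file_b = ["(202) 111-4456", "(202) 111-4457", "(202) 111-4458", "(202) 111-4459"]

# Steps 1 and 2: combine both files and sort, so equal numbers become adjacent.
combined = sorted(file_a + file_b)
text = "\n".join(combined)

# Step 3: find every line that is immediately followed by an identical line.
# re.MULTILINE makes ^ and $ match at the start and end of each line.
pattern = re.compile(r"^(.+)\n(?=\1$)", re.MULTILINE)
duplicates = [match.group(1) for match in pattern.finditer(text)]

print(duplicates)  # ['(202) 111-4456', '(202) 111-4458']
```

One thing this makes visible: a number that appears twice inside the same file would be reported as well, so the result is only the A/B intersection if neither file contains internal duplicates.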
Terry
-
I don’t know if notepad++ is suitable for this purpose.
See cygwin.com or Gnu/software for grep. The following command should work:
while IFS= read -r i; do x=$(grep -F "$i" largefile); echo "$i $x"; done < smallfile
There are also options in diff (GTYPE/LTYPE) which may be workable, but that’s (d)iff(ie); if you choose this route, sort the two files first. What may get in the way of using diff is the size difference between the two files.
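The loop above runs grep once per number in the small file, so the large file is scanned 2000 times. A possible alternative, sketched here in Python, reads the large file once and builds an index. The names digits_only and find_matches are hypothetical, and comparing on digits only is an assumption made in case the two files format numbers differently.

```python
import re
from collections import defaultdict


def digits_only(number):
    """Strip everything except digits, e.g. '(202) 111-4456' -> '2021114456'."""
    return re.sub(r"\D", "", number)


def find_matches(small_lines, large_lines):
    """Map each number in the small file to the line numbers where it
    occurs in the large file (1-based). Numbers with no match are omitted."""
    index = defaultdict(list)
    for line_no, line in enumerate(large_lines, start=1):
        key = digits_only(line)
        if key:  # skip blank lines and lines without digits
            index[key].append(line_no)

    matches = {}
    for line in small_lines:
        key = digits_only(line)
        if key in index:
            matches[line.strip()] = index[key]
    return matches


# Made-up sample data with mixed formatting.
small = ["(202) 111-4456", "(202) 111-4460"]
large = ["202-111-4456", "(202) 111-4457", "2021114456"]

print(find_matches(small, large))  # {'(202) 111-4456': [1, 3]}
```

If the files are known to share exactly the same format, digits_only could be replaced by a plain strip of the line, which keeps the comparison stricter.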
The simplest approach is probably to go with the solution posted by Vincent, above.
-
@arthur-schwarez said in Matching Phone Numbers:
I don’t know if notepad++ is suitable for this purpose.
In this case I’d say it is suitable.
We have people asking in this forum for a solution, so it makes sense to provide one within the context of Notepad++ where possible. Certainly there are sometimes applications elsewhere which may do the task just as well, sometimes even better. However, it would likely mean the person asking needs to learn yet another tool, one they may never need to use again. At least a Notepad++-based solution offers the person asking some ideas on how Notepad++ may help them with increasingly complex issues.
So whilst we try to keep to the Notepad++ theme, solution providers here will sometimes suggest an alternative, especially where Notepad++ is definitely not the best platform.
Terry