List frequency of duplicates
-
I currently have a list of words, with one word per line. I’d like to get a list of how many times each word was duplicated.
-
Notepad++ is a great tool, but some things are easier to solve with another tool.
You can easily do this in most of the “scripting” languages, such as Perl, Python, or Lua. If you have Perl (see Strawberry Perl for an easy-to-install version for Windows), save this file as
countWords.pl
#!/usr/bin/perl
use warnings;
use strict;

my %count;

# read every input line and tally it
while (<>) {
    chomp;
    ++$count{$_};
}

# most frequent first, ties in alphabetical order
printf "%-20s => %d\n", $_, $count{$_}
    for sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;
Then run
perl countWords.pl input.txt
(where input.txt is the file, with one word per line), and it will give you the list of words, sorted by frequency, then alphabetically.
Or, if you have some of the Linux-like tools (using GnuWin32, or the git-bash shell, or Cygwin, or the new WSL, the Windows Subsystem for Linux), then it could be done with a one-liner:
sort list.txt | uniq -c | sort -nr
(take the file, sort it so the same words are next to each other, output each distinct line prefixed with its count, and sort that in reverse numerical order).
Actually, if you grabbed sort, uniq, and wc from GnuWin32 and had them in your path (before c:\windows\system32\ in your path, otherwise it will use the Windows sort, not the GNU sort), then you could run that through the Notepad++ Run > Run dialog, using
cmd.exe /K "sort ^"$(FULL_CURRENT_PATH)^" | uniq -c | sort -nr"
Alternatively to the Perl or GnuWin32 examples shown here, you could do it with either Python or Lua inside Notepad++ if you have the PythonScript or LuaScript plugins for Notepad++.
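For anyone who prefers Python, here is a rough sketch of the same counting idea in plain, standalone Python (it does not use the PythonScript plugin API). The function name count_words is just made up for illustration, and unlike the Perl version it trims surrounding whitespace and ignores blank lines.

```python
from collections import Counter


def count_words(lines):
    """Return (word, count) pairs, most frequent first, ties alphabetical."""
    # Assumption: one word per line; surrounding whitespace is trimmed
    # and blank lines are skipped.
    counts = Counter(line.strip() for line in lines if line.strip())
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Small in-memory demo; a real run would pass an open file instead,
# e.g. count_words(open("input.txt", encoding="utf-8")).
demo = ["pear", "apple", "pear", "fig", "apple", "pear"]
for word, count in count_words(demo):
    print(f"{word:<20} => {count}")
```

The sort key (-count, word) mirrors the Perl sort block: descending by count, then ascending by word.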
-
Thank you very much. Do you know if there is any way to count the no. of times a phrase was duplicated? For example, for the list below, I would like to automatically get a count without inputting a phrase to look for. Please note that a username appears in front of each phrase, which I would not like to carry over.
BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS *2
History Of Manchester United DVD *2
NEWInstant Water Heater Shower Head
Comment:great er!!! no hassles!!!
rated on 07 Jun 2016
stanton261 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
Comment:great er!!! no hassles!!!
rated on 07 Jun 2016
Rarebit314 Kenwood Prospero Chef Kitchen 900W Mixer + ATTACHMENTS
Comment:great er!!! no hassles!!!
rated on 07 Jun 2016
Yvie240182 Manchester United - The Official History 1878 - 2008: 2 DVD Set
Comment:great er!!! no hassles!!!
rated on 07 Jun 2016
ubombo57 History Of Manchester United DVD
Comment:great er!!! no hassles!!!
rated on 07 Jun 2016
sanjayramroop4 History Of Manchester United DVD
Comment:great er!!! no hassles!!!
rated on 07 Jun 2016
TonyS129 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
Comment:great er!!! no hassles!!!
rated on 07 Jun 2016
-
Danger…warning…GETTING OFF-TOPIC…let’s get back to talking about Notepad++…
-
@Scott-Sumner This is about using Notepad++. How am I getting off topic?
-
I don’t see anything about Notepad++ in this entire thread. What I see are data-content manipulations that don’t involve Notepad++. @PeterJones provided you a non-Notepad++ solution (Perl-based). If you want to extend/change that, then super, but please take that to somewhere else (e.g. a Perl help site), because here we talk about Notepad++.
@PeterJones said it well: Notepad++ is a great tool, but some things are easier to solve with another tool. And the problem is, we don’t get into in-depth conversations here about other tools.
If you still don’t see why this thread isn’t Notepad++ related, please have a look at “baking cookies” in this FAQ.
-
@Scott-Sumner I asked for a solution. I can’t control the responses of others. So review your own statements before you put nonsense up. And don’t talk here if you’re not providing a solution; you’re just wasting my time and extending this thread for no reason.
-
I can’t control the responses of others.
Note that I am not negative about @PeterJones's response at all. If you like what he proposed, run with it. But since it isn’t about Notepad++, take further discussion about it elsewhere if you need to. Ideally, you should spend some time learning about the solution provided if it is an unfamiliar technology to you, so that you can solve your own problems, such as extending it to do something different.
Do you know if there is any way to count the no. of times a phrase was duplicated
In the event that this question is back to Notepad++ and isn’t related to the @PeterJones Perl solution to the first thing you asked (I really have no idea), then you should have created a new discussion thread for this. However, let’s assume that it is a question about Notepad++, which will be gladly answered here.
You can certainly count the number of times a phrase is duplicated. In the Find window there is a Count button which will provide this functionality. Put your search phrase in the Find what zone and press the Count button and the number of occurrences that match will appear in the Find window’s status bar. In your case I guess the Find what would be HANDSETS.
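The Count button only handles one phrase at a time. If what is wanted is a count for every phrase in one pass, that falls outside the Find window. As a rough sketch only, a few lines of plain Python could do it. It assumes the layout of the sample list above: item lines look like "username phrase", and the other lines start with "Comment:" or "rated on". The names count_phrases and SKIP_PREFIXES are invented for this example.

```python
from collections import Counter

# Assumption inferred from the sample list: lines with these prefixes
# are not items and should be ignored.
SKIP_PREFIXES = ("Comment:", "rated on")


def count_phrases(lines):
    """Count item phrases, dropping the leading username from each line."""
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(SKIP_PREFIXES):
            continue
        # Assumption: the first whitespace-delimited token is the username.
        _, _, phrase = line.partition(" ")
        phrase = phrase.strip()
        if phrase:
            counts[phrase] += 1
    # Most frequent first, ties in alphabetical order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample = [
    "stanton261 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS",
    "Comment:great er!!! no hassles!!!",
    "rated on 07 Jun 2016",
    "ubombo57 History Of Manchester United DVD",
    "sanjayramroop4 History Of Manchester United DVD",
    "TonyS129 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS",
]
for phrase, count in count_phrases(sample):
    print(f"{phrase} *{count}")
```

A line that has no username in front of the phrase would lose its first word under this assumption, so the real data may need a different rule for splitting off the username.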
-
It’s still an open question. I’m not further discussing it, so don’t say “warning, getting off topic”. And read the thread carefully: I said without inputting a statement.
-
@Muhammad-Khan said:
I’m not further discussing it
Why not, if you think it is something Notepad++ might be able to do? Your choice, though, but I think you may have developed an “attitude” that will preclude further discussion.
And read the thread carefully
Well, we try to but often we get posters here who aren’t that great at expressing their needs in English. So we try to make our best interpretation; sometimes things are missed/misinterpreted.
without inputting a statement
Well, you have to be a bit more precise about what you want. Without “inputting a statement”, do you want Notepad++ to read your mind about what you want to search for? Or is it supposed to search for every possible combination of characters or words or string of words that appear elsewhere in your document? We are definitely willing to assist with the things that Notepad++ can do for you, if you make that need clear.
-
You’re an absolute fucking retard
-
You’re an absolute f***ing retard
Maybe…but if so it doesn’t make sense that I have far more “reputation points” on this site than anyone else. Hmmmm…