List frequency of duplicates

  • M
    Muhammad Khan
    last edited by Dec 7, 2018, 2:22 AM

    I currently have a list of words with one word per line. I’d like to get a list of how many times each word was duplicated.

    • P
      PeterJones
      last edited by Dec 7, 2018, 2:27 PM

      Notepad++ is a great tool, but some things are easier to solve with another tool.

      You can easily do this in most of the “scripting” languages, such as Perl, Python, or Lua. If you have Perl (see Strawberry Perl for an easy-to-install version for Windows), save this file as countWords.pl:

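      #!/usr/bin/perl
      use warnings;
      use strict;
      my %count;    # line text => number of times it was seen
      
      # Read each line from the files named on the command line (or from STDIN).
      while($_ = <>) {
          chomp;            # drop the trailing newline
          ++$count{$_};     # tally this line
      }
      
      # Print the most frequent lines first; ties are listed alphabetically.
      printf "%-20s => %d\n", $_, $count{$_} for sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;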
      

      Then run perl countWords.pl input.txt (where input.txt is the file, with one word per line), and it will give you the list of words, sorted by frequency, then alphabetically.

      Or, if you have some of the Linux-like tools (using GnuWin32, the git-bash shell, cygwin, or the new WSL, the Windows Subsystem for Linux), then it could be done with a one-liner: sort list.txt | uniq -c | sort -nr (take the file, sort it so the same words are next to each other, output each distinct line prefixed with its count, and sort that in reverse numerical order).
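For readers without those tools, the three stages of that pipeline can be mimicked in Python. This is only a sketch: the function name sort_uniq_count is made up for illustration, and the order of lines with equal counts is not guaranteed to match what GNU sort -nr prints.

```python
from itertools import groupby


def sort_uniq_count(lines):
    """Mimic `sort | uniq -c | sort -nr` on an iterable of lines."""
    # Stage 1 (sort): bring identical lines next to each other.
    cleaned = sorted(line.rstrip("\n") for line in lines)
    # Stage 2 (uniq -c): collapse each run of identical lines into (count, text).
    counted = [(len(list(group)), text) for text, group in groupby(cleaned)]
    # Stage 3 (sort -nr): highest count first.
    return sorted(counted, key=lambda pair: pair[0], reverse=True)


for count, text in sort_uniq_count(["pear\n", "apple\n", "pear\n"]):
    print(count, text)
```

The first sort matters because groupby, like uniq, only merges lines that are adjacent.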

      Actually, if you grabbed the sort, uniq, and wc from GnuWin32 and had them in your path (before c:\windows\system32\ in your path, otherwise it will use Windows’ sort, not GNU sort), then you could run that through the Notepad++ Run > Run, using cmd.exe /K "sort ^"$(FULL_CURRENT_PATH)^" | uniq -c | sort -nr".

      As an alternative to the Perl or GnuWin32 examples shown here, you could do it with either Python or Lua inside Notepad++ if you have the PythonScript or LuaScript plugins for Notepad++.
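A rough Python counterpart of the Perl script might look like the sketch below. It is plain Python 3 meant to be run from a command line, not a PythonScript-plugin script, and the names count_lines and format_report are invented for this example. Unlike the Perl version, it trims surrounding whitespace and skips blank lines.

```python
from collections import Counter


def count_lines(lines):
    """Count how often each non-blank line occurs, ignoring surrounding whitespace."""
    return Counter(line.strip() for line in lines if line.strip())


def format_report(counts):
    """Return report rows: highest count first, ties in alphabetical order."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{word:<20} => {count}" for word, count in ordered]


# Small in-memory sample; a real run would pass an open file instead.
sample = ["apple", "pear", "apple", "fig", "pear", "apple"]
for row in format_report(count_lines(sample)):
    print(row)
```

To process a file, pass an open file object, for example count_lines(open("input.txt", encoding="utf-8")), since iterating a file yields one line at a time.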

      • M
        Muhammad Khan
        last edited by Dec 8, 2018, 1:55 AM

        Thank you very much. Do you know if there is any way to count the no. of times a phrase was duplicated? For example, from the list below, I’d like to automatically get a count like the following, without inputting a phrase to look for. Please note that a username appears in front of the phrase, which I would not like to carry over.

        BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS *2
        History Of Manchester United DVD *2

        NEWInstant Water Heater Shower Head
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        stanton261 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        Rarebit314 Kenwood Prospero Chef Kitchen 900W Mixer + ATTACHMENTS
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        Yvie240182 Manchester United - The Official History 1878 - 2008: 2 DVD Set
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        ubombo57 History Of Manchester United DVD
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        sanjayramroop4 History Of Manchester United DVD
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        TonyS129 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
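One way to read this request, sketched in Python under stated assumptions: lines starting with "Comment:" or "rated on" are not item lines, and the first space-separated token of every other line is a username to drop. The names SKIP_PREFIXES, count_phrases, and duplicates are invented for this sketch; none of them come from the thread.

```python
from collections import Counter

# Assumption: these prefixes mark lines that are not item titles.
SKIP_PREFIXES = ("Comment:", "rated on")


def count_phrases(lines):
    """Count item phrases after dropping the leading username token."""
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(SKIP_PREFIXES):
            continue
        # Assumption: everything before the first space is the username.
        _, _, phrase = line.partition(" ")
        phrase = phrase.strip()
        if phrase:
            counts[phrase] += 1
    return counts


def duplicates(counts):
    """Return 'phrase *N' rows for phrases seen more than once, most frequent first."""
    return [f"{phrase} *{n}" for phrase, n in counts.most_common() if n > 1]


sample = [
    "stanton261 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS",
    "Comment:great er!!! no hassles!!!",
    "rated on 07 Jun 2016",
    "ubombo57 History Of Manchester United DVD",
    "TonyS129 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS",
]
print("\n".join(duplicates(count_phrases(sample))))
```

A line such as "NEWInstant Water Heater Shower Head" also loses its first token under this rule, even though that token does not look like a username, so real data may need a stricter test for what counts as a username.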

        • S
          Scott Sumner @Muhammad Khan
          posted Dec 8, 2018, 2:55 AM, last edited by Scott Sumner Dec 8, 2018, 2:56 AM

          @Muhammad-Khan

          Danger…warning…GETTING OFF-TOPIC…let’s get back to talking about Notepad++…

          • M
            Muhammad Khan @Scott Sumner
            last edited by Dec 8, 2018, 1:09 PM

            @Scott-Sumner This is about using Notepad++. How am I getting off topic?

            • S
              Scott Sumner @Muhammad Khan
              posted Dec 8, 2018, 1:19 PM, last edited by Scott Sumner Dec 8, 2018, 1:20 PM

              @Muhammad-Khan

              I don’t see anything about Notepad++ in this entire thread. What I see are data-content manipulations that don’t involve Notepad++. @PeterJones provided you a non-Notepad++ solution (Perl-based). If you want to extend/change that, then super, but please take that to somewhere else (e.g. a Perl help site), because here we talk about Notepad++.

              @PeterJones said it well: Notepad++ is a great tool, but some things are easier to solve with another tool. And the problem is, we don’t get into in-depth conversations here about other tools.

              If you still don’t see why this thread isn’t Notepad++ related, please have a look at “baking cookies” in this FAQ.

              • M
                Muhammad Khan @Scott Sumner
                last edited by Dec 8, 2018, 2:52 PM

                @Scott-Sumner I asked for a solution. I can’t control the responses of others. So review your own statements before you put nonsense up. And don’t talk here if you’re not providing a solution; you’re just wasting my time and extending this thread for no reason.

                • S
                  Scott Sumner @Muhammad Khan
                  last edited by Dec 8, 2018, 3:46 PM

                  @Muhammad-Khan

                  I can’t control the responses of others.

                  Note that I am not negative about @PeterJones's response at all. If you like what he proposed, run with it. But since it isn’t about Notepad++, take further discussion about it elsewhere if you need to. Ideally, you should spend some time learning about the solution provided if it is an unfamiliar technology to you, so that you can solve your own problems, such as extending it to do something different.

                  Do you know if there is any way to count the no. of times a phrase was duplicated

                  In the event that this question is back to Notepad++ and isn’t related to the @PeterJones Perl solution to the first thing you asked (I really have no idea), then you should have created a new discussion thread for this. However, let’s assume that it is a question about Notepad++, which will be gladly answered here.

                  You can certainly count the number of times a phrase is duplicated. In the Find window there is a Count button which will provide this functionality. Put your search phrase in the Find what zone and press the Count button and the number of occurrences that match will appear in the Find window’s status bar. In your case I guess the Find what would be HANDSETS.

                  • M
                    Muhammad Khan
                    last edited by Dec 8, 2018, 6:14 PM

                    It’s still an open question. I’m not further discussing it, so don’t say “warning, getting off topic”. And read the thread carefully: I said without inputting a statement.

                    • S
                      Scott Sumner @Muhammad Khan
                      last edited by Dec 8, 2018, 8:27 PM

                      @Muhammad-Khan said:

                      I’m not further discussing it

                      Why not, if you think it is something Notepad++ might be able to do? Your choice, though, but I think you may have developed an “attitude” that will preclude further discussion.

                      And read the thread carefully

                      Well, we try to but often we get posters here who aren’t that great at expressing their needs in English. So we try to make our best interpretation; sometimes things are missed/misinterpreted.

                      without inputting a statement

                      Well, you have to be a bit more precise about what you want. Without “inputting a statement”, do you want Notepad++ to read your mind about what you want to search for? Or is it supposed to search for every possible combination of characters or words or string of words that appear elsewhere in your document? We are definitely willing to assist with the things that Notepad++ can do for you, if you make that need clear.

                      • M
                        Muhammad Khan
                        last edited by Dec 9, 2018, 4:04 AM

                        You’re an absolute fucking retard

                        • S
                          Scott Sumner @Muhammad Khan
                          last edited by Dec 9, 2018, 1:23 PM

                          @Muhammad-Khan

                          You’re an absolute f***ing retard

                          Maybe…but if so it doesn’t make sense that I have far more “reputation points” on this site than anyone else. Hmmmm…
