    List frequency of duplicates

    • Muhammad Khan
      Muhammad Khan last edited by

      I currently have a list of words with one word per line, I’d like to get a list of how many times each word was duplicated.

      • PeterJones
        PeterJones last edited by

        Notepad++ is a great tool, but some things are easier to solve with another tool.

        You can easily do this in most of the “scripting” languages, such as Perl, Python, or Lua. If you have Perl (see Strawberry Perl for an easy-to-install version for Windows), save this file as countWords.pl:

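        #!/usr/bin/perl
        use warnings;
        use strict;
        my %count;    # word => number of times seen

        # Read every line from the file(s) named on the command line (or STDIN)
        while (my $line = <>) {
            chomp $line;
            ++$count{$line};
        }

        # Most frequent first; ties are sorted alphabetically
        printf "%-20s => %d\n", $_, $count{$_}
            for sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;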
        

        Then run perl countWords.pl input.txt (where input.txt is the file, with one word per line), and it will give you the list of words with their counts, sorted by frequency (highest first) and then alphabetically.

        Or, if you have some of the Linux-like tools (using GnuWin32, or the git-bash shell, or Cygwin, or the new WSL, the Windows Subsystem for Linux), then it could be done with a one-liner: sort list.txt | uniq -c | sort -nr (take the file, sort it so identical words are next to each other, collapse each run of identical lines into one line prefixed with its count, and sort the result in reverse numerical order).

        Actually, if you grabbed sort, uniq, and wc from GnuWin32 and had them in your path (before c:\windows\system32\ in your path, otherwise it will use Windows’ sort, not GNU sort), then you could run that through the Notepad++ Run > Run dialog, using cmd.exe /K "sort ^"$(FULL_CURRENT_PATH)^" | uniq -c | sort -nr".

        As an alternative to the Perl and GnuWin32 examples shown here, you could do it with either Python or Lua inside Notepad++ if you have the PythonScript or LuaScript plugins for Notepad++.
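
For the Python route, a minimal sketch of the same counting logic as countWords.pl might look like the following. The function names `count_lines` and `format_counts` are made up for this illustration, and the sketch is plain Python rather than anything specific to the PythonScript plugin; inside the plugin the lines would presumably come from the editor buffer instead of a list or file.

```python
from collections import Counter


def count_lines(lines):
    """Count how often each line occurs, ignoring the trailing newline."""
    return Counter(line.rstrip("\r\n") for line in lines)


def format_counts(counts):
    """Return report rows: most frequent first, ties sorted alphabetically."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ["%-20s => %d" % (word, n) for word, n in ordered]


if __name__ == "__main__":
    # Hypothetical sample data; a real run would read the lines from a file.
    sample = ["pear\n", "apple\n", "pear\n", "fig\n", "apple\n", "pear\n"]
    for row in format_counts(count_lines(sample)):
        print(row)
```

To use it on a file with one word per line, pass an open file handle to `count_lines`, since iterating over a file yields its lines.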

        • Muhammad Khan
          Muhammad Khan last edited by

          Thank you very much. Do you know if there is any way to count the no. of times a phrase was duplicated? For example, for the list below, I’d like to automatically get a count without inputting a phrase to look for. Please note that a username appears in front of each phrase, which I would not like to carry over.

          BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS *2
          History Of Manchester United DVD *2

          NEWInstant Water Heater Shower Head
          Comment:great er!!! no hassles!!!
          rated on 07 Jun 2016
          stanton261 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
          Comment:great er!!! no hassles!!!
          rated on 07 Jun 2016
          Rarebit314 Kenwood Prospero Chef Kitchen 900W Mixer + ATTACHMENTS
          Comment:great er!!! no hassles!!!
          rated on 07 Jun 2016
          Yvie240182 Manchester United - The Official History 1878 - 2008: 2 DVD Set
          Comment:great er!!! no hassles!!!
          rated on 07 Jun 2016
          ubombo57 History Of Manchester United DVD
          Comment:great er!!! no hassles!!!
          rated on 07 Jun 2016
          sanjayramroop4 History Of Manchester United DVD
          Comment:great er!!! no hassles!!!
          rated on 07 Jun 2016
          TonyS129 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
          Comment:great er!!! no hassles!!!
          rated on 07 Jun 2016
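
One way to read this request is sketched below in Python. It rests on assumptions the thread does not confirm: lines starting with "Comment:" or "rated on" are treated as noise, and on every other non-blank line the first whitespace-separated token is taken to be the username. The names `count_items` and `duplicated` are hypothetical. Note that a line such as "NEWInstant Water Heater Shower Head" has no separate username, so this rule would wrongly drop "NEWInstant"; real data would need a more careful rule.

```python
from collections import Counter

# Assumed prefixes of lines that carry no item name.
NOISE_PREFIXES = ("Comment:", "rated on")


def count_items(lines):
    """Count item phrases after dropping the leading username token."""
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(NOISE_PREFIXES):
            continue
        parts = line.split(None, 1)  # [username, rest of line]
        if len(parts) == 2:
            counts[parts[1]] += 1
    return counts


def duplicated(counts):
    """Format items seen more than once as 'phrase *N', most frequent first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ["%s *%d" % (item, n) for item, n in ordered if n > 1]
```

Calling `duplicated(count_items(lines))` on the lines of the file yields one "phrase *N" entry per item that occurs more than once, without typing any phrase in.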

          • Scott Sumner
            Scott Sumner @Muhammad Khan last edited by Scott Sumner

            @Muhammad-Khan

            Danger…warning…GETTING OFF-TOPIC…let’s get back to talking about Notepad++…

            • Muhammad Khan
              Muhammad Khan @Scott Sumner last edited by

              @Scott-Sumner This is about using Notepad++. How am I getting off topic?

              • Scott Sumner
                Scott Sumner @Muhammad Khan last edited by Scott Sumner

                @Muhammad-Khan

                I don’t see anything about Notepad++ in this entire thread. What I see are data-content manipulations that don’t involve Notepad++. @PeterJones provided you a non-Notepad++ solution (Perl-based). If you want to extend/change that, then super, but please take that to somewhere else (e.g. a Perl help site), because here we talk about Notepad++.

                @PeterJones said it well: Notepad++ is a great tool, but some things are easier to solve with another tool. And the problem is, we don’t get into in-depth conversations here about other tools.

                If you still don’t see why this thread isn’t Notepad++ related, please have a look at “baking cookies” in this FAQ.

                • Muhammad Khan
                  Muhammad Khan @Scott Sumner last edited by

                  @Scott-Sumner I asked for a solution. I can’t control the responses of others. So review your own statements before you put nonsense up. And don’t talk here if you’re not providing a solution you’re just wasting my time and extending this thread for no reason.

                  • Scott Sumner
                    Scott Sumner @Muhammad Khan last edited by

                    @Muhammad-Khan

                    I can’t control the responses of others.

                    Note that I am not negative about @PeterJones 's response at all. If you like what he proposed, run with it. But since it isn’t about Notepad++, take further discussion about it elsewhere if you need to. Ideally, you should spend some time learning about the solution provided if it is an unfamiliar technology to you, so that you can solve your own problems, such as extending it to do something different.

                    Do you know if there is any way to count the no. of times a phrase was duplicated

                    In the event that this question is back to Notepad++ and isn’t related to the @PeterJones Perl solution to the first thing you asked (I really have no idea), then you should have created a new discussion thread for this. However, let’s assume that it is a question about Notepad++, which will be gladly answered here.

                    You can certainly count the number of times a phrase is duplicated. In the Find window there is a Count button which will provide this functionality. Put your search phrase in the Find what zone and press the Count button and the number of occurrences that match will appear in the Find window’s status bar. In your case I guess the Find what would be HANDSETS.

                    • Muhammad Khan
                      Muhammad Khan last edited by

                      It’s still an open question. I’m not further discussing it, so don’t say “warning, getting off topic”. And read the thread carefully: I said without inputting a statement.

                      • Scott Sumner
                        Scott Sumner @Muhammad Khan last edited by

                        @Muhammad-Khan said:

                        I’m not further discussing it

                        Why not, if you think it is something Notepad++ might be able to do? Your choice, though, but I think you may have developed an “attitude” that will preclude further discussion.

                        And read the thread carefully

                        Well, we try to but often we get posters here who aren’t that great at expressing their needs in English. So we try to make our best interpretation; sometimes things are missed/misinterpreted.

                        without inputting a statement

                        Well, you have to be a bit more precise about what you want. Without “inputting a statement”, do you want Notepad++ to read your mind about what you want to search for? Or is it supposed to search for every possible combination of characters or words or string of words that appear elsewhere in your document? We are definitely willing to assist with the things that Notepad++ can do for you, if you make that need clear.

                        • Muhammad Khan
                          Muhammad Khan last edited by

                          You’re an absolute fucking retard

                          • Scott Sumner
                            Scott Sumner @Muhammad Khan last edited by

                            @Muhammad-Khan

                            You’re an absolute f***ing retard

                            Maybe…but if so it doesn’t make sense that I have far more “reputation points” on this site than anyone else. Hmmmm…
