    List frequency of duplicates

    12 Posts 3 Posters 3.6k Views
    • PeterJones

      Notepad++ is a great tool, but some things are easier to solve with another tool.

      You can easily do this in most of the “scripting” languages, such as Perl, Python, or Lua. If you have Perl (see Strawberry Perl for an easy-to-install version for Windows), save this file as countWords.pl:

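      #!/usr/bin/perl
      use warnings;
      use strict;
      my %count;

      # Read every input line and tally how often each one occurs.
      while (my $line = <>) {
          chomp $line;
          ++$count{$line};
      }

      # Print by descending count, then alphabetically for equal counts.
      printf "%-20s => %d\n", $_, $count{$_}
          for sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;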
      

      Then run perl countWords.pl input.txt (where input.txt is the file, with one word per line), and it will give you the list of words, sorted by descending frequency and then alphabetically.

      Or, if you have some of the Linux-like tools (using GnuWin32, the git-bash shell, Cygwin, or the new WSL, Windows Subsystem for Linux), then it could be done with a one-liner: sort list.txt | uniq -c | sort -nr (take the file, sort it so that identical words are next to each other, collapse each run of identical lines into one line prefixed with its count, and sort the result in reverse numerical order).
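For anyone without those tools, the three stages of that pipeline can be mirrored in a few lines of Python. This is only a sketch of the same idea, not a replacement for the one-liner; the function name uniq_counts and the sample words are made up for illustration, and the order of lines with equal counts may differ from what GNU sort produces.

```python
from itertools import groupby


def uniq_counts(lines):
    """Mimic `sort | uniq -c | sort -nr` on a list of strings."""
    # Stage 1 (sort): bring identical lines next to each other.
    ordered = sorted(lines)
    # Stage 2 (uniq -c): collapse each run of identical lines into (count, line).
    counted = [(len(list(run)), line) for line, run in groupby(ordered)]
    # Stage 3 (sort -nr): highest count first.
    counted.sort(key=lambda pair: pair[0], reverse=True)
    return counted


words = ["pear", "apple", "pear", "fig", "apple", "pear"]
for count, line in uniq_counts(words):
    print("%7d %s" % (count, line))
```

The groupby step only merges neighbouring duplicates, which is exactly why the first sort is needed, just as with uniq.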

      Actually, if you grabbed sort and uniq from GnuWin32 and had them in your path (before c:\windows\system32\, otherwise cmd.exe will pick up Windows’ sort, not GNU sort), then you could run that pipeline through Notepad++’s Run > Run, using cmd.exe /K "sort ^"$(FULL_CURRENT_PATH)^" | uniq -c | sort -nr".

      As an alternative to the Perl or GnuWin32 examples shown here, you could do it with either Python or Lua inside Notepad++ if you have the PythonScript or LuaScript plugin installed.
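A rough idea of what the Python route might look like is sketched below. The helper name frequency_report and the sample text are hypothetical. Inside PythonScript the text would presumably come from the active buffer through the plugin's editor object rather than from a literal, and that part is an assumption left as a comment so the sketch also runs in a plain Python interpreter. Unlike the Perl script, it skips empty lines.

```python
from collections import Counter


def frequency_report(text):
    """Return 'line => count' rows, most frequent first, ties alphabetical."""
    # Skip empty lines so blank separators are not counted as a "word".
    counts = Counter(line for line in text.splitlines() if line.strip())
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ["%-20s => %d" % (line, n) for line, n in ordered]


# Assumption: in PythonScript the input would be something like
# text = editor.getText(); a literal stands in for it here.
sample = "pear\napple\npear\nfig\napple\npear\n"
print("\n".join(frequency_report(sample)))
```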

      • Muhammad Khan

        Thank you very much. Do you know if there is any way to count the number of times a phrase is duplicated? For example, from the list further below I would like to automatically get counts like the following two lines, without inputting a phrase to look for. Please note that a username appears in front of each phrase, which I would not like to carry over.

        BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS *2
        History Of Manchester United DVD *2

        NEWInstant Water Heater Shower Head
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        stanton261 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        Rarebit314 Kenwood Prospero Chef Kitchen 900W Mixer + ATTACHMENTS
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        Yvie240182 Manchester United - The Official History 1878 - 2008: 2 DVD Set
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        ubombo57 History Of Manchester United DVD
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        sanjayramroop4 History Of Manchester United DVD
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016
        TonyS129 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
        Comment:great er!!! no hassles!!!
        rated on 07 Jun 2016

        • Scott Sumner @Muhammad Khan

          @Muhammad-Khan

          Danger…warning…GETTING OFF-TOPIC…let’s get back to talking about Notepad++…

          • Muhammad Khan @Scott Sumner

            @Scott-Sumner This is about using Notepad++. How am I getting off topic?

            • Scott Sumner @Muhammad Khan

              @Muhammad-Khan

              I don’t see anything about Notepad++ in this entire thread. What I see are data-content manipulations that don’t involve Notepad++. @PeterJones provided you a non-Notepad++ solution (Perl-based). If you want to extend/change that, then super, but please take that to somewhere else (e.g. a Perl help site), because here we talk about Notepad++.

              @PeterJones said it well: Notepad++ is a great tool, but some things are easier to solve with another tool. And the problem is, we don’t get into in-depth conversations here about other tools.

              If you still don’t see why this thread isn’t Notepad++ related, please have a look at “baking cookies” in this FAQ.

              • Muhammad Khan @Scott Sumner

                @Scott-Sumner I asked for a solution. I can’t control the responses of others. So review your own statements before you put nonsense up. And don’t talk here if you’re not providing a solution; you’re just wasting my time and extending this thread for no reason.

                • Scott Sumner @Muhammad Khan

                  @Muhammad-Khan

                  “I can’t control the responses of others.”

                  Note that I am not negative about @PeterJones’s response at all. If you like what he proposed, run with it. But since it isn’t about Notepad++, take further discussion about it elsewhere if you need to. Ideally, you should spend some time learning about the solution provided if it is an unfamiliar technology to you, so that you can solve your own problems, such as extending it to do something different.

                  “Do you know if there is any way to count the no. of times a phrase was duplicated”

                  In the event that this question is back to Notepad++ and isn’t related to the @PeterJones Perl solution to the first thing you asked (I really have no idea), then you should have created a new discussion thread for this. However, let’s assume that it is a question about Notepad++, which will be gladly answered here.

                  You can certainly count the number of times a phrase is duplicated. In the Find window there is a Count button which will provide this functionality. Put your search phrase in the Find what zone and press the Count button and the number of occurrences that match will appear in the Find window’s status bar. In your case I guess the Find what would be HANDSETS.
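The Count button handles one phrase at a time. For the feedback list posted earlier in the thread, tallying every repeated item without typing each phrase would need a script; the Python sketch below is one possible approach, not something Notepad++ does on its own. It rests on guesses about the data that should be checked against the real file: the username is assumed to be the first whitespace-free token on an item line, and lines starting with "Comment:" or "rated on" are assumed to be feedback boilerplate. The function name repeated_items is made up, and the sample is an abridged copy of the posted list.

```python
from collections import Counter


def repeated_items(text):
    """Tally item names that occur more than once, ignoring the leading username."""
    counts = Counter()
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: feedback boilerplate always starts with these prefixes.
        if not line or line.startswith("Comment:") or line.startswith("rated on"):
            continue
        # Assumption: the username is the first token and contains no spaces.
        parts = line.split(None, 1)
        if len(parts) == 2:
            counts[parts[1]] += 1
    # Keep only duplicated items, most frequent first, then alphabetical.
    dupes = sorted((item for item in counts.items() if item[1] > 1),
                   key=lambda item: (-item[1], item[0]))
    return ["%s *%d" % (name, n) for name, n in dupes]


sample = """stanton261 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
Comment:great er!!! no hassles!!!
rated on 07 Jun 2016
ubombo57 History Of Manchester United DVD
Comment:great er!!! no hassles!!!
rated on 07 Jun 2016
sanjayramroop4 History Of Manchester United DVD
TonyS129 BELL TWIN DUO CORDLESS PHONES- AIR-02 2HANDSETS
"""
print("\n".join(repeated_items(sample)))
```

An entry with no username in front of it would lose its first word under the username assumption, so such lines would need separate handling.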

                  • Muhammad Khan

                    It’s still an open question. I’m not further discussing it, so don’t say “warning, getting off topic”. And read the thread carefully: I said without inputting a statement.

                    • Scott Sumner @Muhammad Khan

                      @Muhammad-Khan said:

                      “I’m not further discussing it”

                      Why not, if you think it is something Notepad++ might be able to do? Your choice, though, but I think you may have developed an “attitude” that will preclude further discussion.

                      “And read the thread carefully”

                      Well, we try to but often we get posters here who aren’t that great at expressing their needs in English. So we try to make our best interpretation; sometimes things are missed/misinterpreted.

                      “without inputting a statement”

                      Well, you have to be a bit more precise about what you want. Without “inputting a statement”, do you want Notepad++ to read your mind about what you want to search for? Or is it supposed to search for every possible combination of characters or words or string of words that appear elsewhere in your document? We are definitely willing to assist with the things that Notepad++ can do for you, if you make that need clear.

                      • Muhammad Khan

                        You’re an absolute fucking retard

                        • Scott Sumner @Muhammad Khan

                          @Muhammad-Khan

                          “You’re an absolute f***ing retard”

                          Maybe…but if so it doesn’t make sense that I have far more “reputation points” on this site than anyone else. Hmmmm…
