    Sentences and words

    4 Posts 3 Posters 317 Views
    • Amazon Books

      Hello, I have some paragraphs (one paragraph in each separate text file). The files are numbered sequentially as p1, p2, p3, … I wish to count the number of sentences in each paragraph in these files and the number of words in each of these sentences. For example, paragraph p1 might have 3 sentences (each identified by a full stop at its end), and the word counts of these 3 sentences might be 10, 23, 16. How do I go about counting and listing the number of sentences and the number of words in each sentence? If it can only be done individually for each text file (paragraph), that is also OK. Thanks.

      • Alan Kilborn @Amazon Books

        @Amazon-Books

        I have some paragraphs (one paragraph in each separate text file)

        Isn’t the number of “paragraphs” simply the number of files then?

        I wish to count the number of sentences in each paragraph in these files

        You could do it in ONE file like this (via the Count button), seeing the result on the status bar:

        b2249d42-2e22-48c6-9e39-6b9fbe905126-image.png

        To handle multiple files, switch to the Find in Files tab, run the same search against all of them, and read the counts in the Search results window:

        568a6661-9a37-4428-9863-41039d0218d2-image.png

        How to go about counting…words in each of these sentences?

        This would require writing a program.
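
A minimal sketch of such a program in Python, assuming the OP's own definition (a sentence ends at a full stop) and assuming the files are named `p1.txt`, `p2.txt`, … in one folder. The function names and the `.txt` extension are illustrative assumptions, not anything stated in the thread. It inherits the weaknesses of that definition (decimal numbers, abbreviations).

```python
from pathlib import Path

def words_per_sentence(text):
    """Return the word count of each full-stop-terminated sentence in text."""
    # Split on full stops and drop empty pieces (e.g. after the final stop).
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    # A "word" here is any run of non-whitespace characters.
    return [len(s.split()) for s in sentences]

def report(folder="."):
    """Print sentence and per-sentence word counts for every p*.txt file."""
    # Note: plain name sorting lists p10.txt before p2.txt.
    for path in sorted(Path(folder).glob("p*.txt")):
        counts = words_per_sentence(path.read_text(encoding="utf-8"))
        print(f"{path.name}: {len(counts)} sentences, words per sentence: {counts}")
```

Calling `report()` in the folder that holds the files prints one line per file; `words_per_sentence` can also be used on a single paragraph pasted in as a string.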

        • Neil Schipper @Alan Kilborn

          @Amazon-Books

          You can get word counts right with npp’s View -> Summary command, but that might be too much busy-work if there are lots of files and/or you want fresh updates for files that are frequently changed.

          Getting sentences right is not a trivial problem. You could examine your own data, determine a reasonable average number of words per sentence, and then estimate the sentence count by dividing the word count by that average.
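
A rough sketch of that estimate in Python. The sample figures 10, 23 and 16 are the OP's example word counts; the total of 490 words and the function name are made up purely for illustration.

```python
def estimate_sentence_count(word_count, avg_words_per_sentence):
    """Estimate sentences as total words divided by average words per sentence."""
    if avg_words_per_sentence <= 0:
        raise ValueError("average words per sentence must be positive")
    return round(word_count / avg_words_per_sentence)

# Average taken from a hand-counted sample of sentences.
sample_counts = [10, 23, 16]
avg = sum(sample_counts) / len(sample_counts)

# Hypothetical file whose word count (e.g. from View -> Summary) is 490.
estimate = estimate_sentence_count(490, avg)
```

The estimate is only as good as the sample: it works best when the files are written in a similar style to the text the average was taken from.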

          But sentence counting is surely a problem that has been reasonably well solved many times (i.e., with sophisticated models that give pretty good statistical results). Your best bet is to search for purpose-built tools.

          @Alan-Kilborn

          <run of non-periods><period>

          is not a great definition of a sentence due to false positives with decimal numbers, among other things.

          A plausible alternative might be

          <period><space or newline or end-of-file>

          but that also misfires, as with “Mr. Kilborn and Mr. Jones frequently comment in this community.”
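
Both definitions can be tried out as regular expressions. The sketch below uses Python's `re` module; the two patterns are one possible reading of the definitions above, and the decimal-number sentence is an invented test case.

```python
import re

# Definition 1: a run of non-periods followed by a period.
RUN_THEN_PERIOD = re.compile(r"[^.]+\.")
# Definition 2: a period followed by whitespace or the end of the text.
PERIOD_THEN_BREAK = re.compile(r"\.(?=\s|$)")

def count_matches(pattern, text):
    """Count how many 'sentences' the given pattern finds in text."""
    return len(pattern.findall(text))

decimal_text = "Pi is about 3.14. It is irrational."  # really 2 sentences
title_text = "Mr. Kilborn and Mr. Jones frequently comment in this community."  # really 1

print(count_matches(RUN_THEN_PERIOD, decimal_text))    # 3: the decimal point is counted
print(count_matches(PERIOD_THEN_BREAK, decimal_text))  # 2: correct here
print(count_matches(RUN_THEN_PERIOD, title_text))      # 3: each "Mr." is counted
print(count_matches(PERIOD_THEN_BREAK, title_text))    # 3: each "Mr." is counted
```

The second definition fixes the decimal case but neither handles abbreviations.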

          Isn’t the number of “paragraphs” simply the number of files then?

          OP did not ask for paragraph count.

          • Alan Kilborn @Neil Schipper

            @Neil-Schipper

            Isn’t the number of “paragraphs” simply the number of files then?

            OP did not ask for paragraph count.

            Correct. I was merely commenting on that, not supplying any kind of “solution” for it.

            <run of non-periods><period> is not a great definition of a sentence due to false positives with decimal numbers, among other things.

            We know this. I wanted the OP to try the solution and notice the problem, at which point we would explain that they are asking for a pretty much impossible task, at least with what Notepad++ can do.

            However, if the need is not for a 100% exact count, but possibly one that is somewhat close, simple algorithms might work.
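
One such simple algorithm, sketched in Python outside Notepad++: treat a period followed by whitespace or end of text as a sentence end, unless the word before it is a known abbreviation. The abbreviation list and function names are hypothetical and deliberately short; this is an approximation, not an exact counter.

```python
import re

# Hypothetical, deliberately short abbreviation list; extend it for real data.
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "prof", "etc"}

def split_sentences(text):
    """Split text at periods that are followed by whitespace or end of text,
    skipping periods that directly follow a listed abbreviation."""
    sentences = []
    start = 0
    for match in re.finditer(r"\.(?=\s|$)", text):
        preceding = text[start:match.start()].split()
        last_word = preceding[-1].lower() if preceding else ""
        if last_word in ABBREVIATIONS:
            continue  # e.g. "Mr." does not end the sentence
        sentences.append(text[start:match.end()].strip())
        start = match.end()
    return sentences

def words_in_each(text):
    """Return the word count of each sentence found by split_sentences."""
    return [len(s.split()) for s in split_sentences(text)]
```

Known gaps: trailing text with no final period is ignored, a sentence that genuinely ends in a listed abbreviation (such as "etc.") is merged with the next one, and "?" and "!" are not treated as sentence ends.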

            The Community of users of the Notepad++ text editor.