    Sentences and words

    • Amazon Books

      Hello, I have some paragraphs (one paragraph in each separate text file). The files are numbered sequentially as p1, p2, p3, … I wish to count the number of sentences in the paragraph in each file and the number of words in each of those sentences. For example, paragraph p1 might have 3 sentences (each identified by the full stop at its end), and the word counts of these 3 sentences might be 10, 23, 16. How do I go about counting and listing the sentences and the words in each sentence? If it can only be done individually for each text file (paragraph), that is fine too. Thanks.

      • Alan Kilborn

        @Amazon-Books

        I have some paragraphs (one paragraph in each separate text file)

        Isn’t the number of “paragraphs” simply the number of files then?

        I wish to count the number of sentences in each paragraph in these files

        You could do it in ONE file like this (via the Count button), seeing the result on the status bar:

        b2249d42-2e22-48c6-9e39-6b9fbe905126-image.png

        To cover multiple files, switch to the Find in Files tab, run the same search against all of them, and read the counts in the Search results window:

        568a6661-9a37-4428-9863-41039d0218d2-image.png

        How to go about counting…words in each of these sentences?

        This would require writing a program.
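        A minimal sketch of such a program in Python, not a definitive solution. It assumes the files are named p1.txt, p2.txt, … and sit in one folder (the .txt extension and the function names are assumptions, not something stated in this thread), and it uses the OP's own rule that a sentence is a run of text ending in a full stop. That rule miscounts decimal numbers and abbreviations.

```python
import re
from pathlib import Path


def words_per_sentence(text):
    """Return one word count per sentence.

    Assumption: a sentence is a run of non-period characters followed by
    a period. Trailing text with no final period is ignored.
    """
    sentences = re.findall(r"[^.]+\.", text)
    return [len(sentence.split()) for sentence in sentences]


def report(folder="."):
    """Print sentence and word counts for every p<number>.txt file in folder."""
    # Sort by the digits in the file name so that p10 comes after p9.
    files = sorted(
        Path(folder).glob("p[0-9]*.txt"),
        key=lambda p: int(re.sub(r"\D", "", p.stem)),
    )
    for path in files:
        counts = words_per_sentence(path.read_text(encoding="utf-8"))
        print(f"{path.name}: {len(counts)} sentences, words per sentence: {counts}")
```

        Calling report() with the folder that holds the files prints one line per file. Words are split on whitespace, so a token such as "full-stop" counts as one word.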

        • Neil Schipper

          @Amazon-Books

          You can get accurate word counts with Notepad++’s View -> Summary command, but that could be tedious if there are many files or if you need fresh counts for files that change often.

          Getting sentences right is not a trivial problem. You could examine your own data, determine a reasonable average number of words per sentence, and then estimate the sentence count by dividing the word count by that average.
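          As a rough illustration of that estimate in Python (the function name is hypothetical, and the sample figures reuse the OP's example counts of 10, 23 and 16 words purely for illustration):

```python
def estimate_sentence_count(word_count, avg_words_per_sentence):
    """Estimate the number of sentences from a total word count."""
    if avg_words_per_sentence <= 0:
        raise ValueError("average words per sentence must be positive")
    return round(word_count / avg_words_per_sentence)


# Average measured on a hand-counted sample of sentences.
sample = [10, 23, 16]
average = sum(sample) / len(sample)

# Estimated sentence count for another paragraph containing 98 words.
estimate = estimate_sentence_count(98, average)
```

          The estimate is only as good as the sample: a paragraph written in a very different style from the sampled text will be off.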

          But sentence counting is surely a problem that has been reasonably well solved many times (i.e., with sophisticated models that give pretty good statistical results). Your best bet is to search for purpose-built tools.

          @Alan-Kilborn

          <run of non-periods><period>

          is not a great definition of a sentence due to false positives with decimal numbers, among other things.

          A plausible alternative might be

          <period><space or newline or end-of-file>

          but that also misfires, as with “Mr. Kilborn and Mr. Jones frequently comment in this community.”
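          Both definitions can be tried out in a few lines of Python. The regular expressions below are one reading of the two descriptions above (an assumption, not necessarily the exact search used earlier in the thread), and the decimal test sentence is made up for illustration:

```python
import re

# <run of non-periods><period>
RUN_THEN_PERIOD = re.compile(r"[^.]+\.")

# <period><space or newline or end-of-file>
PERIOD_THEN_BREAK = re.compile(r"\.(?=[ \r\n]|\Z)")


def count_matches(pattern, text):
    """Count non-overlapping matches of pattern in text."""
    return len(pattern.findall(text))


decimal_text = "The price rose 2.5 percent."
titles_text = "Mr. Kilborn and Mr. Jones frequently comment in this community."

# Each text is really one sentence.
decimal_counts = (
    count_matches(RUN_THEN_PERIOD, decimal_text),    # splits at "2."
    count_matches(PERIOD_THEN_BREAK, decimal_text),  # handles the decimal
)
titles_counts = (
    count_matches(RUN_THEN_PERIOD, titles_text),     # splits at each "Mr."
    count_matches(PERIOD_THEN_BREAK, titles_text),   # also splits at each "Mr."
)
```

          The second pattern fixes the decimal case but still reports three sentences for the “Mr.” example.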

          Isn’t the number of “paragraphs” simply the number of files then?

          OP did not ask for paragraph count.

          • Alan Kilborn

            @Neil-Schipper

            Isn’t the number of “paragraphs” simply the number of files then?

            OP did not ask for paragraph count.

            Correct. I was merely commenting on that, not supplying any kind of “solution” for it.


            <run of non-periods><period> is not a great definition of a sentence due to false positives with decimal numbers.… among other things.

            We know this. I wanted the OP to try the solution and notice the problem, at which point we would explain that they are asking for a pretty much impossible task, at least with what Notepad++ can do.

            However, if an approximate count is acceptable rather than a 100% exact one, simple algorithms might work.
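            One such simple algorithm, sketched in Python under stated assumptions: a sentence ends at a period followed by whitespace or the end of the text, unless the word before the period is on a small abbreviation list. The list and the function names are hypothetical and would need tuning against real data.

```python
import re

# Hypothetical, deliberately tiny abbreviation list; extend it for your data.
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "prof", "etc", "e.g", "i.e"}


def split_sentences(text):
    """Split text into sentences with a period-plus-break heuristic."""
    sentences = []
    start = 0
    for match in re.finditer(r"\.(?=\s|\Z)", text):
        # Look at the word immediately before this period.
        words = text[start:match.start()].split()
        last_word = words[-1].lower() if words else ""
        if last_word in ABBREVIATIONS:
            continue  # treat as an abbreviation, not a sentence end
        sentence = text[start:match.end()].strip()
        if sentence:
            sentences.append(sentence)
        start = match.end()
    return sentences


def word_counts(text):
    """Return the number of whitespace-separated words in each sentence."""
    return [len(sentence.split()) for sentence in split_sentences(text)]
```

            This gets the “Mr.” and decimal cases right, but it still misfires when a sentence really does end in a listed abbreviation such as “etc.”, and it ignores trailing text that has no final period.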
