    Find identical paragraphs

    4 Posts 3 Posters 872 Views
    • Caro Chennouf

      Hi everyone,

      I’m new to this community and I’m not a developer at all.
      I use Notepad++ to edit song lyrics; it saves me time with things like deleting all punctuation, adding a character at the end of every line, etc.

      But there is one feature I can’t seem to find:

      I need to highlight all the repeated paragraphs in a text, with a different style for each paragraph. For example:

      I am thinking of you
      In my sleepless solitude tonight
      If it’s wrong to love you
      Then my heart just won’t let me be right
      I’d give my all to have
      Just one more night with you
      I’d risk my life to feel
      Your body next to mine

      Baby can you feel me
      Imagining I’m looking in your eyes
      I can see you clearly
      Vividly emblazoned in my mind
      'Cause I can’t go on
      Living in the memory of our song
      I’d give my all for your love tonight

      And yet you’re so far
      Like a distant star
      I’m wishing on tonight
      I’d give my all to have
      Just one more night with you
      I’d risk my life to feel
      Your body next to mine

      'Cause I can’t go on
      Living in the memory of our song
      I’d give my all for your love tonight

      Here you can see I found all the paragraphs that are repeated and applied a different style to each one.
      BUT additionally, and this is the point that’s not easy to explain:
      I need it to be done automatically, so that I don’t have to select paragraph 1, search for its repetitions, highlight them, then do the same thing with paragraph 2, etc…
      I hope I’m explaining clearly…

      It would help me find all the parts of a song that are identical (choruses, bridges…) at once.

      • Alan Kilborn

        Your explanation is very clear (thus far).
        Thank you for that.

        One problem is, there doesn’t seem to be any real delimiter between “paragraphs”. Everything runs together. This is going to cause trouble for what you want to do.

        Second problem: You are limited to five or maybe six different styles. Also, styles aren’t permanent, so what do you intend to do with this data that is styled?

        So maybe tell us a bit more about what your intentions are, to get full advice on how to proceed… (sorry, but know this may invoke MORE problems…)

        • PeterJones @Caro Chennouf

          @Caro-Chennouf ,

          Notepad++ is a text editor, not a word processor. It cannot store information like italics or bold in the file itself. So if you saved the text file and someone else opened it on another machine or in another editor, it would just be plain text again. In fact, if you did the highlighting in Notepad++ then exited out, when you reloaded the highlighting would be gone again.

          With some regex searching, you might be able to handle a semi-automated process… ahh, @Alan-Kilborn just posted, and started asking the questions that would need to be answered in order to proceed.

          (I tried coming up with a regex that would match one or more lines in a row that have an exact match later on, but with the “or more”, I couldn’t get it to work right on my first attempt; as I have time, I might think about it more… Or a better regex guru than I might be able to beat me to it)

          • PeterJones @PeterJones

            @Caro-Chennouf ,

            I figured out why my regex wasn’t working, so I was able to get it to properly match an N-line paragraph that has a repeat later.

            1. If you’re on v8.x, go to Settings > Preferences > Highlighting
              • Go to the Mark All section and uncheck Match whole word only
            2. Go to the first line (Ctrl+Home)
            3. FIND the first instance of each paragraph
              • Search > Find
              • FIND WHAT = (?-s)((^.+?(\R|\Z))+)(?=(?s:.*)\1)
              • Search Mode = ☑ Regular Expression
              • FIND NEXT
                => this highlights the first paragraph that is repeated somewhere else
            4. Use Search > Mark All > Using #th Style (or right click context menu > Style All Occurrences of Token)
              => all of the instances of that first paragraph should be marked with that style number
            5. Use Search > Next or F3 from the editor window, or the FIND NEXT button in the FIND dialog, to select the next “first instance of a paragraph”, and style using a different #th style. Repeat as necessary.

            So it’s not 100% automated, but it’s better than having to find each paragraph (chorus or bridge) manually. And remember, the styling will not be saved in the file; the next time you open it, you’ll have to do it again.
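
For anyone who would rather script the search than click through Find Next, here is a minimal Python sketch of the same idea, run outside Notepad++. It is an adaptation, not the regex from the steps above: Python's `re` module has no `\R`, so the pattern assumes plain `\n` line endings, and the function name `find_repeated_blocks` is made up for illustration. Like the original regex, it only reports the first instance of each repeated run of lines, and it does no highlighting.

```python
import re

# Adapted from the forum regex. Python's `re` has no \R, so this sketch
# assumes "\n" line endings (an assumption, not part of the original post).
REPEATED_BLOCK = re.compile(
    r"((?:^.+(?:\n|\Z))+)"   # group 1: one or more consecutive non-empty lines
    r"(?=(?s:.*)\1)",        # lookahead: the same run must occur again later
    re.MULTILINE,
)


def find_repeated_blocks(text):
    """Return (block, occurrence_count) for each run of lines that repeats.

    `find_repeated_blocks` is a hypothetical helper name. The count uses
    plain substring counting, so it is only as strict as the regex itself.
    """
    results = []
    for match in REPEATED_BLOCK.finditer(text):
        block = match.group(1)
        results.append((block, text.count(block)))
    return results


if __name__ == "__main__":
    # Tiny stand-in for a song: the two "chorus" lines repeat as a block.
    sample = "verse one\nchorus a\nchorus b\n\nverse two\nchorus a\nchorus b\n"
    for label, (block, count) in enumerate(find_repeated_blocks(sample), start=1):
        print(f"Block {label} appears {count} times:")
        print(block)
```

Two caveats with this sketch: the text should end with a newline, otherwise a block whose repeat is the very last line of the file will only match partially; and the lookahead rescans the rest of the text for every candidate, so it is fine for song lyrics but would be slow on very large files.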
