    Seperating words in between dash with a line break

    • Matthew Habash

      Fellow Notepad++ Users,

      Could you please help me with the following regular expression problem I am having?

      I am trying to separate words between dashes with a line break but I’ve scoured forums and message threads and haven’t turned up much.

      Here is a sample text line.

      This is  a sample text---these are the words in between dashes--this is the end of the sample text
      

      Here is how I would like it to look.

      This is  a sample text
      ---these are the words in between dashes---
      this is the end of the sample text
      

      Notice how the words in between dashes are on separate lines.

      To accomplish this, I have tried using the following Find/Replace expressions and settings

      • Find What = (?=---)(.*)(?=---)
      • Replace With = \n
      • Search Mode = REGULAR EXPRESSION
      • Dot Matches Newline = NOT CHECKED

      I can only get it to look like this:

      This is  a sample text
      ---these are the words in between dashes--this is the end of the sample text

      Unfortunately, this did not produce the output I desired, and I’m not sure why. Could you please help me understand what went wrong and help me find the solution?

      • PeterJones @Matthew Habash

        @Matthew-Habash said in Seperating words in between dash with a line break:

        Unfortunately, this did not produce the output I desired, and I’m not sure why. Could you please help me understand what went wrong and help me find the solution?

        I do not think you understand lookahead, nor fully understand your data.

        In your Find What, the (?=---) says “the next three characters must be hyphens, but DON’T SELECT THEM YET OR MOVE THE CURSOR”, so that puts the search cursor at the beginning of the ---these are the words in between dashes---. Then (.*) says “grab 0 or more of any type of character…”. Then the second (?=---) says “… until you reach a point where the next three characters are hyphens, but don’t select those final three hyphens”.

        In your example text, there is only one triple-hyphen; the second set of dashes is just a double hyphen. This means that the (.*)(?=---) matches 0 characters: if the .* doesn’t “eat up” any characters, then the lookahead will find the original triple-hyphen as the matching point. So you’ve got a 0-width selection just before the first and only ---.

        Here’s what it matches, using Find Next instead of Replace, so you can see the “selection” (which it indicates with ^ zero length match):
        [screenshot: Find Next result, a zero-length match marked with ^ just before the ---]

        Then your replacement says “replace everything that was matched” (ie, the emptiness just before the ---) “with a LF sequence”, which gives you:

        [screenshot: a line break inserted just before the ---]
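
For readers who want to reproduce the zero-length match outside Notepad++, here is a minimal sketch. It assumes Python's re module as a stand-in for Notepad++'s regex engine; the two differ in places, but lookahead and greedy .* should behave the same way for this pattern. The variable names are illustrative only.

```python
import re

# Sample line from the question: three hyphens first, only two hyphens later.
text = ("This is  a sample text---these are the words in between dashes"
        "--this is the end of the sample text")

# The original Find What expression, unchanged.
pattern = re.compile(r"(?=---)(.*)(?=---)")

match = pattern.search(text)
first_triple = text.index("---")

# The match starts and ends at the same offset: a zero-length match
# sitting immediately before the only "---" in the line.
print(match.span() == (first_triple, first_triple))  # True
print(repr(match.group(0)))  # ''

# Replacing that empty match with "\n" only inserts a line break there.
result = pattern.sub("\n", text)
print(result)
```

The printed result has the line break before the --- and nothing else changed, which matches what the original poster saw.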

        Please note: if your sample text had been what you thought it was, with three hyphens at the end of the phrase:

        This is  a sample text---these are the words in between dashes---this is the end of the sample text
        

        then your regex would have selected the first three hyphens, and the words in between:

        [screenshot: Find Next selecting the first --- and the words in between]

        And then the replace would have replaced all that selected text with a newline,
        [screenshot: the selected text replaced by a line break]

        … which still wouldn’t be what you want.
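
The three-hyphen case can be checked the same way. This is a sketch under the same assumption (Python's re module standing in for Notepad++'s engine), and the splice below models a single Replace rather than Notepad++'s actual Replace All logic.

```python
import re

# Corrected sample: three hyphens on both sides of the phrase.
text = ("This is  a sample text---these are the words in between dashes"
        "---this is the end of the sample text")

match = re.search(r"(?=---)(.*)(?=---)", text)

# Greedy .* runs to the end of the line, then backs off until the second
# lookahead sees "---" again, so the closing hyphens are not part of the match.
print(match.group(0))  # ---these are the words in between dashes

# Splice "\n" in where the match was, as a single Replace would.
result = text[:match.start()] + "\n" + text[match.end():]
print(result)
```

The opening hyphens and the phrase disappear, and only the closing --- survives at the start of the second line.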

        Based on your “how I would like it to look”, I think what you really intended to do was to put a newline before the first --- and another newline after the second --- (if there really were a second ---). To do that, I would do the following:

        • Find What: ---.*---
        • Replace With: \r\n$0\r\n
        • Search Mode: Regular Expression
        • . matches newline not checked

        The Find What is simpler: you want to match everything, including the hyphens, so no reason for the lookaheads and groups.
        In the Replace With, I am using $0 to say “include the contents of everything that was matched”, so that your original text isn’t lost. I am also using \r\n instead of just \n, because I am assuming you really have Windows-style newlines (CRLF), not Unix/Linux newlines (LF only). If you actually do have Linux LF only, then use Replace With: \n$0\n

        Results:
        FIND NEXT =>
        [screenshot: Find Next selecting ---these are the words in between dashes---]

        REPLACE ALL =>
        [screenshot: the document after Replace All]

        This is  a sample text
        ---these are the words in between dashes---
        this is the end of the sample text
        

        But my regex will not work for your original single line, because its second run of dashes has only two hyphens, not three, so it will find no matches and do no replacements.
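
Both outcomes can be sketched in Python, again assuming its re module as a stand-in for Notepad++'s engine. One syntax difference to be aware of: Python writes the whole match as \g<0> in a replacement, where Notepad++ uses $0. The variable names are illustrative only.

```python
import re

fixed_line = ("This is  a sample text---these are the words in between dashes"
              "---this is the end of the sample text")
original_line = ("This is  a sample text---these are the words in between dashes"
                 "--this is the end of the sample text")

# Find What: ---.*---
pattern = re.compile(r"---.*---")

# Replace With: \r\n$0\r\n, written with Python's \g<0> for the whole match.
replacement = r"\r\n\g<0>\r\n"

separated = pattern.sub(replacement, fixed_line)
print(separated.split("\r\n"))

# The original line has only one "---", so nothing matches and nothing changes.
unchanged = pattern.sub(replacement, original_line)
print(unchanged == original_line)  # True
```

Splitting on the CRLF shows the three pieces that end up on separate lines. For LF-only files, the replacement would be r"\n\g<0>\n" instead.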

        • Matthew Habash @PeterJones

          @PeterJones Thank you for the explanation, and my apologies for the confusion. I meant for both sides of the phrase to have the same number of dashes. I guess I did not catch that.

          Your regex code worked and that was what I needed. Thank you very much!
