    Regex: Select everything (the whole line) to the dot but not more dots

    • Vasile Caraus

      I have these 2 cases:

      1. I love you.
      
      2. I love you...
      

      I want to select with regex only the first line to the dot, but not the second that has more dots.

      I tried these regexes, but none of them works well: ^(.+?\.)\K{0,3}$ or ^(.+?\.)(?!{1,1})$ or ^.*?!\.{1,3}$

      Maybe someone can help me.

      • guy038

        Hello, @vasile-caraus and All,

        Simply, use this regex :

        SEARCH [^.\r\n]+\.(?!\.)

        Notes :

        • First, the part [^.\r\n]+ matches a non-empty run of characters that are neither a dot nor a line-break character

        • Then, the part \. tries to match a literal dot char

        • But ONLY IF it is not followed by a second dot char, due to the negative look-ahead (?!\.)
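
For readers who want to check this outside Notepad++, here is a minimal sketch in Python. It assumes Python's re module treats this particular pattern the same way as the Notepad++ engine (only a character class and a look-ahead are involved), and that the lines are separated by \n.

```python
import re

# guy038's pattern: a run of non-dot, non-line-break characters,
# then a dot that is NOT followed by another dot.
PATTERN = re.compile(r"[^.\r\n]+\.(?!\.)")

# The two cases from the question (assumption: "\n" line endings).
text = "I love you.\nI love you..."

matches = PATTERN.findall(text)
print(matches)  # ['I love you.']
```

The second line yields no match at all here, because every dot that could end a match is either followed by another dot or has no non-dot character right before it.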

        Best Regards,

        guy038

        • Vasile Caraus

          thanks @guy038

          And another case: suppose I have these 2 lines, each starting with a number (1. and 2.):

          1. I love you.

          2. I love you...

          In this case your regex [^.\r\n]+\.(?!\.) also selects part of the second line, up to its first dot (the 2.). But what if I don’t want to select anything at all on this second line, even though it also contains a single dot after the 2?

          • PeterJones @Vasile Caraus

            To future readers:

            It was nice to see that the original poster was willing to give what he tried. That is to be lauded.

            I am replying only so that other people will learn. I will explain what that individual did wrong in each case to help you future readers learn (and with the very weak hope that the original questioner will also try to learn from this example), and will explain my thought process when coming up with a working solution.

            ^(.+?\.)\K{0,3}$ will not work because {0,3} has nothing to modify – the find dialog will tell you that it was an invalid regex. See the red text at the bottom:

            [Screenshot: Find dialog with the invalid-regex message in red at the bottom]

            I believe what was intended was ^(.+?\.)\K\.{0,3}$, which says: find one or more of any character (non-greedy), followed by a literal dot; then reset the start of the match; then find 0 to 3 literal dots before the end of the line. But that still matches the last two dots of the ... on the second line. This is because the regex is saying you want 0 to 3 dots after the first-found dot, which is not what the original poster described in text.
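
To see what ends up in the reported match, here is a rough Python sketch. Python's standard re module has no \K, so this is an emulation and not the Notepad++ behavior itself: a second capture group stands in for "whatever is matched after \K". The helper name reported_match is made up for illustration.

```python
import re

# Emulation only: group 2 plays the role of the text matched after \K.
pattern = re.compile(r"^(.+?\.)(\.{0,3})$", re.MULTILINE)

def reported_match(line: str):
    """Return the part that would remain after the \\K reset, or None."""
    m = pattern.search(line)
    return None if m is None else m.group(2)

print(repr(reported_match("I love you.")))    # ''
print(repr(reported_match("I love you...")))  # '..'
```

So the three-dot line is not rejected: the pattern succeeds on it and what remains after the reset is the last two dots.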

            ^(.+?\.)(?!{1,1})$ is once again an invalid regular expression, again because {1,1} says “from 1 to 1 of the preceding token”, and no preceding token is specified. Even with a character there to quantify: why would you want to specify a quantity from 1 to 1? Just use the character. Assuming again that a literal dot was supposed to be there with quantity 1, that should have been written ^(.+?\.)(?!\.{1,1})$, or more simply ^(.+?\.)(?!\.)$, because you don’t need a quantifier if it’s always exactly one. But that still won’t work, because the regex means “one or more of any character, followed by a literal dot, not followed by another dot, then end-of-line”, and the third dot of the ... satisfies that: the non-greedy .+? can still grow to include the first two dots.
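
A small Python check of the simplified form, as a sketch. It assumes \n line endings and uses re.MULTILINE so that ^ and $ work per line, as they do in the Notepad++ find dialog.

```python
import re

# The "repaired" second attempt: still matches the three-dot line.
fixed_attempt = re.compile(r"^(.+?\.)(?!\.)$", re.MULTILINE)

text = "I love you.\nI love you..."

# finditer + group(0) gives the whole match, not just the capture group.
matched_lines = [m.group(0) for m in fixed_attempt.finditer(text)]
print(matched_lines)  # ['I love you.', 'I love you...']
```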

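            ^.*?!\.{1,3}$ was the only valid regex (i.e., the only one that didn’t trigger the invalid-regex message). It looks for 0 or more characters (non-greedy), then a literal ! (outside a construct like (?!...), the ! has no special meaning), then 1 to 3 dots at the end of the line. Neither example line contains a !, so it matches nothing there. And even without the !, you said you didn’t want to match if there was more than one dot, so allowing 1 to 3 dots contradicts your description of your problem.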
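
One detail worth checking in that third attempt: the ! sits outside any (?!...) group, so it is read as a literal exclamation mark. The sketch below tries the pattern in Python against the two example lines plus a hypothetical third line containing "!", added only to show what the pattern does accept.

```python
import re

third_attempt = re.compile(r"^.*?!\.{1,3}$", re.MULTILINE)

# The last sample is hypothetical; it is not from the original question.
samples = ["I love you.", "I love you...", "I love you!.."]

results = {s: bool(third_attempt.search(s)) for s in samples}
for sample, matched in results.items():
    print(f"{sample!r}: {matched}")
# 'I love you.': False
# 'I love you...': False
# 'I love you!..': True
```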

            I am going to assume that you want to select from the beginning of the line through the end of the line for any line that ends in exactly one dot. In text, I might say, “match from beginning of line, then zero or more characters, followed by a dot that isn’t preceded by another dot, followed by the end of a line.”

            • “match from beginning of line” = ^
            • “then zero or more characters” = .*
            • “followed by a dot that isn’t preceded by another dot” = (?<!\.)\. – I used the negative lookbehind to guarantee the character before isn’t a dot
            • “followed by the end of a line” = $

            Put it all together, and you have ^.*(?<!\.)\.$. This finds only one match in your example data:

            [Screenshot: the regex matching only one line of the example data]

            This selects the whole line, which I interpreted your description to mean.
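
The piece-by-piece construction can be reproduced in Python as a sketch. The assumptions are \n line endings and re.MULTILINE for per-line ^ and $; the variable names are only illustrative.

```python
import re

# Each piece corresponds to one phrase of the plain-text description.
parts = [
    r"^",          # match from beginning of line
    r".*",         # then zero or more characters
    r"(?<!\.)\.",  # a dot that isn't preceded by another dot
    r"$",          # followed by the end of a line
]
whole_line = re.compile("".join(parts), re.MULTILINE)

text = "1. I love you.\n2. I love you..."

found = whole_line.findall(text)
print(found)  # ['1. I love you.']
```

Keeping the pieces in a list with one comment each makes it easier to see which phrase of the description a failing regex got wrong.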

            Guy’s solution is a bit different, and took a different interpretation of the description: his finds the 1. as the first match, the I love you. (including the space before the I) as a second match, and the 2. as a third match. So his had 3 matches compared to my 1. This obviously means the original problem statement was unclear, because two reasonable people came up with two very different interpretations of the requirements.

            If you want to do regex search and replace, you have to think about the problem in little tiny steps, and be able to describe those steps to yourself, or to others, in little tiny detail. If you are unwilling to do this, you will never be good at regex, and we will get tired of helping you. Please learn from these thought processes.

            • guy038

              Hi, @vasile-caraus, @peterjones and All,

              I did notice that my previous search regex also matched the 2. at the beginning of the second line ! But I thought that you wanted to match any sentence ending with a period !

              But, of course, @peterjones’s regex is the right one ! Just a slight modification : I would use ^.+(?<!\.)\.$ as, probably, you don’t want to match a line that consists of a single dot only !
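
The difference between the two variants only shows up on a line that consists of nothing but a single dot. A short Python sketch, assuming \n line endings; the first line of the sample text is made up to trigger that case.

```python
import re

star = re.compile(r"^.*(?<!\.)\.$", re.MULTILINE)  # zero or more chars before the dot
plus = re.compile(r"^.+(?<!\.)\.$", re.MULTILINE)  # at least one char before the dot

# First line is a lone dot (hypothetical input for illustration).
text = ".\nI love you.\nI love you..."

star_matches = star.findall(text)
plus_matches = plus.findall(text)
print(star_matches)  # ['.', 'I love you.']
print(plus_matches)  # ['I love you.']
```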

              But what about cases like below ?

              3. I love you. I love you...
              
              4. I love you... I love you.
              

              So, just test my regex [^.\r\n]+\.(?!\.), Peter’s regex ^.*(?<!\.)\.$ and also this new one (?!\d)[^.\r\n]+\.(?!\.), against the text below, and see the differences !

              1. I love you.
              
              2. I love you...
              
              3. I love you. I love you...
              
              4. I love you... I love you.
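
For anyone who prefers to run this comparison in a script, here is a sketch in Python. It assumes \n line endings and that Python's re agrees with Notepad++ on these three patterns; the dictionary keys are just labels chosen for the example.

```python
import re

text = (
    "1. I love you.\n"
    "2. I love you...\n"
    "3. I love you. I love you...\n"
    "4. I love you... I love you."
)

regexes = {
    "guy038": re.compile(r"[^.\r\n]+\.(?!\.)"),
    "peter": re.compile(r"^.*(?<!\.)\.$", re.MULTILINE),
    "guy038_no_digit": re.compile(r"(?!\d)[^.\r\n]+\.(?!\.)"),
}

results = {name: rx.findall(text) for name, rx in regexes.items()}
for name, found in results.items():
    print(name, len(found), found)
# guy038 7 ['1.', ' I love you.', '2.', '3.', ' I love you.', '4.', ' I love you.']
# peter 2 ['1. I love you.', '4. I love you... I love you.']
# guy038_no_digit 3 [' I love you.', ' I love you.', ' I love you.']
```

The first and third regexes work sentence by sentence, while the second one decides per whole line, which is why line 4 is reported in full only by the second.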
              

              BR

              guy038

              @vasile-caraus,

              Like you, I may sometimes create a regex which returns the fatal message Find: Invalid regular expression ! But, unlike you, I learned, little by little, the basic features of regexes and, then, some advanced functionalities, all with the help of regex tutorials ! So, I’m quickly able to spot the part with wrong syntax or missing characters and can rebuild my regex correctly !

              Indeed ! For example, using a quantifier range as {0,3}, without some material to quantify, located before, should rather be considered as a noob error !

              After a fair practice of simple regular expressions, you will not reproduce this kind of wrong syntax ! Regex knowledge, like everything else, is like a Lego game : you first join two simple pieces together, then build a wall made of some pieces, then a house made of walls, and so on !!

              • Alan Kilborn @guy038

                @guy038 said in Regex: Select everything (the whole line) to the dot but not more dots:

                should rather be considered as a noob error

                After a fair practice of simple regular expressions, you will not reproduce this kind of wrong syntax !

                Some number of individuals are stuck in this “noob” state, forever.
                Sad but true.

                • Vasile Caraus

                  thank you all !

                  The Community of users of the Notepad++ text editor.