Restricting search to the end of line



  • Hi all,
    I want to select the following word(s and) combinations, each sitting on one line, but I can’t restrict the search to the end of the line. The following regex also finds the word(s and) combinations across two consecutive lines, where it applies, which I want to avoid. I want it to operate only within a line; putting neither $ nor \r\n seems to work.

    Find what (sorry, poor-man’s-way):

    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$

    contrast
    hands off
    computer aided software
    run out of examples
    put-down
    put-down slowly
    putting-remarks-here
    putting-remarks-here slowly
    set-out setting-computer
    

    What I want to achieve is to put one # at the beginning of each line, namely just before the word if there is only one word in the line, or before the very first word if there is more than one word, separated either by one white space or a “-”.

    Besides, I can’t figure out how to keep/put the found ones exactly in place. The combinations are beyond my comprehension :). For example, if it finds “putting-remarks-here slowly”, is it “$1$2”? But then, if it finds “run out of examples”, then how? $1$2$3$4? Can these be achieved with this poor-man’s-regex at all?

    Many thanks in advance!



  • @glossar

    Putting a certain white space after a “probable” word makes no sense:

    (\w+){0,1}(\s)

    But then,

    (\w+){0,1}(\s){0,1}

    doesn't work, either.


  • @glossar ,

    Your question as a whole is pretty confusing. But I’ll still answer a couple of pieces.

    I want it to operate only within a line

    Your regex doesn’t have any . characters, so you haven’t hit a . matches newline issue. However, your regex has a lot of \s in it. In case you didn’t know, \s matches any whitespace character, including spaces, tabs, vertical tabs, and newlines. If you really want that to match on just a single line, you need to be more restrictive. \h will match horizontal spaces (space, tab) only, so maybe try using that. Or if it’s just supposed to be a space character, then just type the space (or \x20 if you are posting in a forum where the number of spaces might be obscured).

    So the \s you use could explain why it doesn’t seem to be restricted to a single line (and thus answer the main question in your title).
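
    A minimal sketch of that difference, written in Python purely as an illustration. Python's re module is not the Boost engine that Notepad++ uses and has no \h, so [ \t] stands in for it here; the two-line sample text is made up.

```python
import re

# Two one-word lines, as in the sample data.
text = "hands\noff\n"

# \s also matches the newline, so a "word, space, word" pattern
# happily glues two consecutive lines together.
across_lines = re.findall(r"^\w+\s\w+$", text, flags=re.MULTILINE)

# [ \t] (standing in for Notepad++'s \h) cannot match a newline,
# so the same pattern stays inside a single line.
within_line = re.findall(r"^\w+[ \t]\w+$", text, flags=re.MULTILINE)

print(across_lines)  # ['hands\noff']
print(within_line)   # []
```

    In Notepad++ itself the equivalent change would be to replace each \s in the Find what field with \h or a literal space.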

    Further, I am not sure you realize that {0,1} means the match will allow 0 or 1 instance of whatever token comes before; that means that a lot of your capture groups can turn up empty and still have the regex match that line… which means you might not be matching as much of the line as you think you are. That could mean that it’s really only matching part of a line to get a small number of “words” when you think it’s matching multiple lines and eating up lots of words over many lines.

    is it “$1$2”? … $1$2$3$4

    I cannot tell exactly what you’re trying to ask there. But I am not sure you realize just how many numbered capture groups you have in your regex, and how many of them might turn up empty.

    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
     ^^^^^ = group $1 will contain 1 or more word characters
      
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
          ^^^^ = group $2 will contain one any-space (or no spaces thanks to allowing quantity 0)
           
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
              ^^^^^ = this says a quantity of 0 or 1 instance of the space.  
                    => Normally, that would be accomplished with ? instead of {0,1}
              
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                   ^^^^^^^^ = group $3 contains a hyphen or is empty (thanks to {0,1})
    
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                           ^^^^^^^^^^ = group $4 either contains 1 or more word character, or no characters, thanks to {0,1}.  
                                      => That would be much easier phrased as (\w*)
                                     
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                                     ^^^^ = group $5 will be a single space or newline
                                     
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                                         ^^^^^^^^ = group $6 will contain a hyphen, or nothing
                                         
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                                                 ^^^^^^^^^^ = group $7 will contain 1 or more word characters, or nothing
    
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                                                           ^^^^^^^^^ = group $8 will contain 0 or 1 any-space characters
    
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                                                                    ^^^^^^^^ = group $9 will contain 0 or 1 hyphens
                                                                    
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                                                                            ^^^^^^^^^^ = group ${10} will contain 1 or more word characters, or be empty
                                                                            
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                                                                                      ^^^^^^^^^ = group ${11} will contain 0 or one any-space characters
                                                                                      
    ^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$
                                                                                               ^^^^^^^^ = group ${12} will contain 0 or one hyphens
    

    So, I’d like to point out the 12 capture groups, many of which can be empty.

    • By allowing all those empty groups, you are making it easy to capture text that you didn’t intend.
    • You are putting a bunch of stuff (spaces, hyphens, and words) into numbered capture groups.
      Do you really need to capture all those entities? And if so, do they really all need to be in separate groups?
    • There are better ways of phrasing most of those than by using the {0,1} quantifier.
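
    The points above can be sketched in Python (again only an illustration, since Python's re is not Notepad++'s Boost engine). The simplified pattern is a hypothetical two-word version, not a drop-in replacement for the original regex. Note that Python reports a group that did not take part in the match as None, whereas a replacement in Notepad++ would simply substitute nothing for it.

```python
import re

# The original regex, split across two string literals for readability.
original = (r"^(\w+)(\s){0,1}(-){0,1}(\w+){0,1}(\s)(-){0,1}(\w+){0,1}"
            r"(\s){0,1}(-){0,1}(\w+){0,1}(\s){0,1}(-){0,1}$")
group_count = re.compile(original).groups
print(group_count)  # 12

# Hypothetical simplified pattern: ? replaces {0,1}, and (\w*) replaces (\w+){0,1}.
simplified = re.compile(r"^(\w+)(-)?(\w*)$")

hyphenated = simplified.match("put-down").groups()
single = simplified.match("contrast").groups()

print(hyphenated)  # ('put', '-', 'down')
print(single)      # ('contrast', None, '')
```

    Even with only three groups, a one-word line still matches with two of them empty, which is why counting on $1$2$3… to rebuild the line gets confusing quickly.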

    What I want to achieve is to put one # at the beginning of each line, namely just before the word if there is only one word in the line, or before the very first word if there is more than one word, separated either by one white space or a “-”.

    I cannot tell what you mean by that. Every line of text you showed either has a single word, or multiple words with hyphens or spaces separating them. So by my interpretation of your description, you want a # put before every line, which is a simple task; but I am sure that’s not what you intended to say. Your example text doesn’t tell us which lines you want to put a # before and which ones you want to remain unchanged. That’s why my advice, which I’ve posted often, and I think even in response to you, is to always show “before” and “after” data in two separate blocks, so we know how you want your data to be transformed. Further, make sure your data has examples of lines that will be changed and lines that will stay the same. Unless you follow the advice, you are not going to get answers that make you happy.
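
    For what it is worth, here is a sketch of that literal reading only (a # before every line that starts with a word), in Python rather than Notepad++, with made-up “before” data. It assumes nothing about which lines should really stay unchanged, because the question does not say.

```python
import re

before = "contrast\nhands off\nput-down slowly\n"

# Variant 1: zero-width match at the start of each line that begins with a
# word character. Nothing is consumed, so no capture groups are needed.
after = re.sub(r"^(?=\w)", "#", before, flags=re.MULTILINE)

# Variant 2: match the whole line and put it back with the whole-match
# reference (\g<0> in Python) instead of juggling $1$2$3...
after_whole_match = re.sub(r"^\w.*$", r"#\g<0>", before, flags=re.MULTILINE)

print(after)
```

    Both variants give the same result here. In Notepad++ the first variant should translate to Find what ^(?=\w) with Replace with #, and the whole-match reference in the second is written $0 there.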

    Please remember that this forum is about Notepad++ in all its aspects, and is not your personal regex-writing service, nor even a “regex help/tutorial forum”. 11/14 of your topics to date have been asking us to write or fix your regex for you, which is a pretty high percentage for something that’s a small fraction of what Notepad++ can do. Please remember that the npp-user-manual.org website, and the sites mentioned in the FAQ linked below, contain plenty of information and external resources for how to use regular expressions. We understand that regular expressions are new to a lot of users, and are generally willing to answer one or two regex questions for Notepad++ newbies, directing them to use the documentation we link for future reference. But the more regex questions a single user asks, the more strongly we point them to the documentation and encourage them that it’s time for them to start taking the initiative to learn regex on their own – the best way to learn is by doing: playing with what you’ve already been given, and the examples in the docs, and figuring out for yourself what the pieces do. If we keep handing you answers, you’ll never learn on your own. And I think 4 years and 11 questions is enough time for you to have become pretty good with regex. There are times that regex questions can be good questions in this forum – for example, “I have data X and want Y but I actually got Z; I read the docs on regex syntax S, and thought it meant that it would match blah from X and become yada from Y, but that doesn’t seem to be happening; what have I misunderstood?” is a reasonably good question, and might even be interesting to answer. But “write my regex for me; I’ll vaguely explain what I want my data to transform, without concrete examples” is a pretty bad question. Your questions lie somewhere between the two, but also don’t show a steady improvement over time, which makes us feel that we are wasting our time answering your questions.

    ----

    Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the </> toolbar button or manual Markdown syntax. To make regex in red (and so they keep their special characters like *), use backticks, like `^.*?blah.*?\z`. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.



  • Hello, @glossar, @peterjones and all,

    @peterjones said :

    I cannot tell what you mean by that. Every line of text you showed either has a single word, or multiple words with hyphens or spaces separating them. So by my interpretation of your description, you want a # put before every line, which is a simple task; but I am sure that’s not what you intended to say.

    I made the same deductions as Peter!

    Please try to describe your needs in a better way. As I always say:

    Once the hypotheses and purpose are well presented, more than half of the work is already done! And the creation of the appropriate regexes is then greatly facilitated!

    BR
    guy038



  • @PeterJones said in Restricting search to the end of line:

    …this forum is about Notepad++ in all its aspects, and is not your personal regex-writing service, nor even a “regex help/tutorial forum”. 11/14 of your topics to date have been asking us to write or fix your regex for you, which is a pretty high percentage…

    @glossar

    Do yourself and us a favor and find a dedicated regex discussion site, where the participants will never tire of writing your regex replacements for you.



  • Hi all,
    Thank you for your understanding and offers to help me further. I’ve solved the problem in the meantime - as usual in a poor-man’s way! :D

    Love & Peace,
    glossar

