regular expression - search until 1st occurrence of

  • Diego Di Filippo
    last edited Sep 19, 2018, 9:48 AM

    Hi all,

    I have a file with a content like this:

    BEGIN
    something here!!!
    END
    ELSE
    something here!!!
    something here!!!
    something here!!!
    something here!!!
    GO

    BEGIN
    something here!!!
    END
    ELSE
    something here!!!
    something here!!!
    GO

    I need to match only the ELSE ... GO blocks, each one as a separate match.

    My regex is this:
    https://regex101.com/r/hjCIiv/1

    but everything from the first ELSE onward is selected.

    Please help!

    D.

    • guy038
      last edited Sep 19, 2018, 11:40 AM

      Hello, @diego-di-filippo and All,

      Thanks for giving it a try before posting! You weren’t very far from the correct regex, anyway!

      So, instead of ^ELSE\b(\r\n.*){1,}\bGO, just try the regex ^ELSE\b(\r\n.*?){1,}\bGO

      Personally, I would prefer the shorter syntax (?s-i)^ELSE\R.+?\RGO\R

      Explanations:

      • The first part, the (?s-i) modifiers, means that:

        • The dot matches any single character, standard or EOL, so a match can span MULTIPLE lines, because of the s modifier

        • The regex engine performs the search in a case-sensitive way, because of the -i modifier

      • Then, the part ^ELSE\R matches the upper-case word ELSE at the beginning of a line, followed by its line-break character(s) \R ( = \r\n in Windows files, \n in Unix files or \r in old Mac files )

      • At the end of the regex, the part GO\R matches the upper-case word GO, followed by its line-break, too

      • In the middle, the part .+?\R matches the shortest non-empty range of any characters, followed by a line-break, up to the word GO. Note that if you omitted the ? symbol, it would instead match the longest non-empty range of characters up to the word GO, that is, up to the last GO in the file!
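
      For anyone who wants to try the shorter regex outside Notepad++, here is a minimal sketch in Python. It is an approximation, not the Notepad++ engine: Python's re module has no \R, so the hypothetical EOL helper spells out the three line-break forms, and the re.DOTALL and re.MULTILINE flags stand in for the (?s) modifier and for ^ matching at each line start. The sample text is a shortened version of the file from the question, assumed to use Windows line endings.

```python
import re

# Shortened sample from the question, assuming Windows (\r\n) line endings.
text = (
    "BEGIN\r\nsomething here!!!\r\nEND\r\nELSE\r\n"
    "something here!!!\r\nsomething here!!!\r\nGO\r\n"
    "\r\n"
    "BEGIN\r\nsomething here!!!\r\nEND\r\nELSE\r\n"
    "something here!!!\r\nGO\r\n"
)

# Python's re has no \R, so EOL (a name made up for this sketch)
# lists the Windows, Unix and old Mac line breaks explicitly.
EOL = r"(?:\r\n|\n|\r)"

# re.DOTALL plays the role of (?s); re.MULTILINE lets ^ match at each line start.
# Python matches case-sensitively by default, so -i needs no equivalent.
pattern = re.compile(rf"^ELSE{EOL}.+?{EOL}GO{EOL}", re.DOTALL | re.MULTILINE)

# EOL is a non-capturing group, so findall returns the whole matches.
blocks = pattern.findall(text)

for block in blocks:
    print(repr(block))
```

      With the lazy .+? each ELSE ... GO block comes back as its own match; replacing .+? with .+ in the pattern should give a single match running from the first ELSE to the last GO.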

      Best Regards,

      guy038

      • Scott Sumner @Diego Di Filippo
        last edited Sep 19, 2018, 12:56 PM

        @Diego-Di-Filippo

        So it seems your original try at a correct regex was simply missing a ? after the .*.

        Here’s the core difference:

        .* <— Match any single character that is NOT a line break character between zero and unlimited times, as MANY times as possible, GIVING BACK as needed (GREEDY)

        .*? <— Match any single character that is NOT a line break character between zero and unlimited times, as FEW times as possible, EXPANDING as needed (LAZY)
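
        The greedy versus lazy difference can be seen side by side with the two regexes from this thread. This is a sketch using Python's re module as a stand-in for the Notepad++ engine, so details may differ; re.MULTILINE is added so that ^ matches at each line start, and the sample text is a shortened version of the original file, assumed to use \r\n line endings.

```python
import re

# Two shortened BEGIN/ELSE/GO blocks, assuming Windows (\r\n) line endings.
text = (
    "BEGIN\r\nsomething here!!!\r\nEND\r\nELSE\r\n"
    "something here!!!\r\nGO\r\n"
    "\r\n"
    "BEGIN\r\nsomething here!!!\r\nEND\r\nELSE\r\n"
    "something here!!!\r\nGO\r\n"
)

# Original attempt: greedy .* inside the repeated group.
greedy_re = re.compile(r"^ELSE\b(\r\n.*){1,}\bGO", re.MULTILINE)

# Suggested fix: the same regex with a ? after .* (lazy).
lazy_re = re.compile(r"^ELSE\b(\r\n.*?){1,}\bGO", re.MULTILINE)

# finditer + group(0) is used because findall would return only
# the text captured by the parenthesised group.
greedy = [m.group(0) for m in greedy_re.finditer(text)]
lazy = [m.group(0) for m in lazy_re.finditer(text)]

print(len(greedy), "greedy match(es)")
print(len(lazy), "lazy match(es)")
```

        The greedy form grabs as many lines as it can and only gives text back until a final GO fits, so it swallows both blocks in one match. The lazy form stops at the first GO it can reach, so each block is matched separately.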

        • Diego Di Filippo
          last edited Sep 19, 2018, 3:46 PM

          thank you guys!!!
          D.

          The Community of users of the Notepad++ text editor.