regular expression - search until 1st occurrence of



  • Hi all,

    I have a file with a content like this:

    BEGIN
    something here!!!
    END
    ELSE
    something here!!!
    something here!!!
    something here!!!
    something here!!!
    GO

    BEGIN
    something here!!!
    END
    ELSE
    something here!!!
    something here!!!
    GO

    I need to match each ELSE ... GO block, and nothing else.

    My regex is this:
    https://regex101.com/r/hjCIiv/1

    But everything from the first ELSE down to the last GO is selected.

    Please help!

    D.



  • Hello, @diego-di-filippo and All,

    Thanks for making an attempt before posting! You weren't far from the correct regex.

    So, instead of ^ELSE\b(\r\n.*){1,}\bGO, just try the regex ^ELSE\b(\r\n.*?){1,}\bGO

    Personally, I would prefer the shorter syntax (?s-i)^ELSE\R.+?\RGO\R

    Explanations :

    • The first part, the (?s-i) modifiers, means that:

      • The dot matches any single character, standard or EOL, so a match can span multiple lines, because of the s modifier

      • The regex engine performs the search in a case-sensitive way, because of the -i modifier

    • Then, the part ^ELSE\R matches the upper-case word ELSE, at the beginning of a line, followed by its line-break character(s) \R ( = \r\n in Windows files, \n in Unix files or \r in old Mac files )

    • At the end of the regex, the part GO\R matches the upper-case word GO, followed by its line-break, too

    • In the middle, the part .+?\R matches the shortest non-empty range of any characters, followed by a line-break, up to the first GO line that follows. Note that if you omit the ? symbol, the regex would instead match the longest such range, up to the last GO line in the file!
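
For readers who want to try the idea outside the editor, here is a minimal Python sketch of the shorter regex. It is an adaptation, not the exact pattern above: Python's re module has no \R, so the hypothetical EOL fragment below is a simplified stand-in for it, and the (?s-i) modifiers are replaced by the re.DOTALL flag (Python is case-sensitive by default). The SAMPLE text is made up to mirror the question.

```python
import re

# Hypothetical sample mirroring the file from the question (Unix line endings).
SAMPLE = "\n".join([
    "BEGIN", "something here!!!", "END",
    "ELSE", "something here!!!", "something here!!!", "GO",
    "",
    "BEGIN", "something here!!!", "END",
    "ELSE", "something here!!!", "GO",
    "",
])

# Simplified stand-in for \R, which Python's re module does not support.
EOL = r"(?:\r\n|\n|\r)"

# DOTALL plays the role of the s modifier; MULTILINE lets ^ match at line starts.
ELSE_GO = re.compile(rf"^ELSE{EOL}.+?{EOL}GO{EOL}", re.DOTALL | re.MULTILINE)

# Each entry is one ELSE ... GO block, including its final line break.
blocks = ELSE_GO.findall(SAMPLE)

for block in blocks:
    print(repr(block))
```

Because the quantifier is lazy, each match stops at the first GO line after its ELSE, so the two blocks are returned separately.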

    Best Regards,

    guy038



  • @Diego-Di-Filippo

    So it seems your original attempt was simply missing a ? after the .*.

    Here’s the core difference:

    .* <-- Match any single character that is NOT a line break character between zero and unlimited times, as MANY times as possible, GIVING BACK as needed (GREEDY)

    .*? <-- Match any single character that is NOT a line break character between zero and unlimited times, as FEW times as possible, EXPANDING as needed (LAZY)
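
The greedy/lazy difference can be checked with a small Python sketch that runs both patterns from this thread over a made-up sample. The SAMPLE text and its Windows line endings are assumptions for illustration, and Python's re engine may differ in details from the editor's engine, though it behaves the same way for these two patterns.

```python
import re

# Hypothetical sample with Windows (\r\n) line endings, which both patterns expect.
SAMPLE = "\r\n".join([
    "BEGIN", "something here!!!", "END",
    "ELSE", "something here!!!", "something here!!!", "GO",
    "",
    "BEGIN", "something here!!!", "END",
    "ELSE", "something here!!!", "GO",
    "",
])

# Original attempt: greedy .* inside the repeated group.
GREEDY = re.compile(r"^ELSE\b(\r\n.*){1,}\bGO", re.MULTILINE)
# Corrected version: lazy .*? inside the repeated group.
LAZY = re.compile(r"^ELSE\b(\r\n.*?){1,}\bGO", re.MULTILINE)

greedy_matches = [m.group(0) for m in GREEDY.finditer(SAMPLE)]
lazy_matches = [m.group(0) for m in LAZY.finditer(SAMPLE)]

print(len(greedy_matches), "greedy match(es)")
print(len(lazy_matches), "lazy match(es)")
```

The greedy pattern produces a single match running from the first ELSE to the last GO, swallowing the second BEGIN block on the way; the lazy pattern produces one match per ELSE ... GO block.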



  • thank you guys!!!
    D.

