The problem with regex



  • The problem with regex.
    Greetings! I have a problem with a regular expression. I am replacing “.” with “_”, and Notepad++ reports an error: “Invalid regular expression”. By trial and error I found that it objects to “+” and “*”. Writing {0,} or {1,} instead is rejected as well. The same expression works in C#. I read up on which regex engine is used here:
    https://community.notepad-plus-plus.org/topic/15765/faq-desk-where-to-find-regex-documentation
    https://npp-user-manual.org/docs/searching/#regular-expressions
    The documentation even lists “+?” and “*?”, yet those do not work either.
    https://www.boost.org/doc/libs/1_70_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html
    I don’t understand what is wrong.

    Regex:
    (?<=[A-Za-z]+[.\w]*)(?<!www).(?!mq4|org|com|mq5|mqh|\W)

    Must not match:
    Indicator.mq4 … www.metatrader.org fx0t@yahoo.com
    3.3 .3 . . //. \. j5j8. .) ." j5j.

    Must match:
    qe.F
    qa.1
    L1.L
    qs3.3
    qd.d.4
    qd.d.f
    qd.4.f
    qd.4.3
    fib_SR_shadow_4.c
    fib.SR.shadow.4.c




  • @Владимир-Анисимов

    One thing that instantly comes to mind is that lookbehinds at the start of a regular expression aren’t going to work as you would think…



  • Hello, @владимир-анисимов ,

    I do understand why you get the message Invalid regular expression with your regex (?<=[A-Za-z]+[.\w]*)(?<!www).(?!mq4|org|com|mq5|mqh|\W) and I’ll explain it in my next post!

    However, your regex seems excessively complicated !?


    So, first, in order to get a simpler and coherent search regex, could you give some examples of:

    • Lines that you do want to match

    • Lines that you do not want to match

    When replying, please use, preferably, the </> option which enables you to easily insert code text !

    See you later,

    Best Regards,

    guy038



  • @guy038 The anti-spam filter is blocking my post. I’ll try again with a cleaned-up version.



  • @guy038 Hello. Here is some sample code.

    //+------------------------------------------------------------------+
    //|                                                  4 Period MA.mq4 |
    //|                 Copyright © 2006, tageiger aka fxid10t@yahoo.com |
    //+------------------------------------------------------------------+
    #property copyright "Copyright © 2006, tageiger aka fxid10t@yahoo.com"
    #property link      "mailto:fxid10t@yahoo.com"
    #property indicator_chart_window
     
    extern int p1.ma=5;//Period() in minutes
    extern int p2.ma=15;//Period() in minutes
    extern int p3.ma=30;//Period() in minutes
    extern int p4.ma=60;//Period() in minutes
    
    extern int STD.Rgres.length=56;
    extern double STD.width=0.809;
    
    extern int ma.applied.price=1;/*
    Applied price constants. It can be any of the following values:
    
    Constant       Value Description 
    PRICE_CLOSE    0     Close price. 
    PRICE_OPEN     1     Open price. 
    PRICE_HIGH     2     High price. 
    PRICE_LOW      3     Low price. 
    PRICE_MEDIAN   4     Median price, (high+low)/2. 
    PRICE_TYPICAL  5     Typical price, (high+low+close)/3. 
    PRICE_WEIGHTED 6     Weighted close price, (high+low+close+close)/4.*/ 
    extern int ma.Method=0;/*
    Moving Average Method
    Constant    Value Description 
    MODE_SMA    0     Simple moving average, 
    MODE_EMA    1     Exponential moving average, 
    MODE_SMMA   2     Smoothed moving average, 
    MODE_LWMA   3     Linear weighted moving average.   */
    
    extern int ma1.Length=13;
    extern int ma2.Length=21;
    extern int ma3.Length=34;
    extern int ma4.Length=55;
    extern int ma5.Length=89;
    extern int ma6.Length=144;
    extern int ma7.Length=233;
    
    extern int fib.SR.shadow.1=8;
    extern int fib.SR.shadow.2=13;
    extern int fib.SR.shadow.3=21;
    extern int fib.SR.shadow.4=34;
    extern int fib.SR.shadow.5=55;
    extern int fib.SR.shadow.6=89;
    extern int fib.SR.shadow.7=144;
    
    extern color fib.SR.shadow.1.c=AliceBlue;
    extern color fib.SR.shadow.2.c=LightBlue;
    extern color fib.SR.shadow.3.c=DodgerBlue;
    extern color fib.SR.shadow.4.c=RoyalBlue;
    extern color fib.SR.shadow.5.c=Blue;
    extern color fib.SR.shadow.6.c=MediumBlue;
    extern color fib.SR.shadow.7.c=DarkBlue;
    
    double ma1.p1, ma2.p1, ma3.p1, ma4.p1, ma5.p1, ma6.p1, ma7.p1;
    double ma1.p2, ma2.p2, ma3.p2, ma4.p2, ma5.p2, ma6.p2, ma7.p2;
    double ma1.p3, ma2.p3, ma3.p3, ma4.p3, ma5.p3, ma6.p3, ma7.p3;
    double ma1.p4, ma2.p4, ma3.p4, ma4.p4, ma5.p4, ma6.p4, ma7.p4;
    double bb1, bb2, bb3, bb4;
    double tmb1,tmb2,tmb3,tmb4,tmr1,tmr2,tmr3,tmr4;
    datetime t1.p1, t2.p1, t1.p2, t2.p2, t1.p3, t2.p3, t1.p4, t2.p4;
    

    Everything is as I described above. I need to replace “.” with “_” in variable names such as “ma1.p1” or “fib.SR.shadow.7.c”, giving “ma1_p1” or “fib_SR_shadow_7_c”. At the same time, fractional numbers such as “2.0” and web addresses must stay as they are. That is why I wrote the regular expression posted above. To repeat: it works in C#, but Notepad++ rejects it, and I can’t understand why. Understanding the problem matters more to me than a ready-made answer, so that I avoid the same mistake in the future, so please explain the error. For now I have solved the task this way:

    (?<=[A-Za-z])(?<!www)\.(?!mq4|org|com|mq5|mqh|\W|info)|(?<=[A-Za-z]\d)(?<!www)\.(?!mq4|org|com|mq5|mqh|\W|info)|(?<=\d)(?<!www)\.(?!mq4|org|com|mq5|mqh|\W|info)(?=[A-Za-z])
    

    Each kind of dot gets its own alternative.
    I still need to understand why the quantifier does not work.
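
    The three-branch workaround above can be checked outside Notepad++ with a small Python sketch. It assumes Python's built-in re module, which is a different engine from the Boost library used by Notepad++ but likewise accepts only fixed-width lookbehinds. The names KEEP, DOT and rename_dots are made up for this illustration.

```python
import re

# Lookahead shared by all three branches (copied from the regex above).
KEEP = r"(?!mq4|org|com|mq5|mqh|\W|info)"

# Every lookbehind has one fixed width, so the pattern compiles.
DOT = re.compile(
    r"(?<=[A-Za-z])(?<!www)\." + KEEP                 # letter before the dot
    + r"|(?<=[A-Za-z]\d)(?<!www)\." + KEEP            # letter and digit before the dot
    + r"|(?<=\d)(?<!www)\." + KEEP + r"(?=[A-Za-z])"  # digit before, letter after
)

def rename_dots(text):
    """Replace every dot matched by DOT with an underscore."""
    return DOT.sub("_", text)

print(rename_dots("extern color fib.SR.shadow.7.c=DarkBlue;"))
print(rename_dots("extern double STD.width=0.809;"))
print(rename_dots("tageiger aka fxid10t@yahoo.com"))
```

    The first line should come back with underscores in the variable name, while the fraction 0.809 and the e-mail address stay untouched.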



  • Hi, @владимир-анисимов and All,

    My approach uses completely different logic from yours! But, judging from what you said in your second post, my regex S/R seems to work nicely!


    • Firstly, I decided to match, as a whole, any # line, comment line or comment block (so the forms #...., //.... and /*....*/) in order to avoid changing any dot character(s) inside. This whole pattern, embedded in parentheses, defines group 1

    • Secondly, the regex engine looks for a literal . character, ONLY IF it is not followed by digits and a semicolon ;

    This leads to this regex S/R :

    SEARCH (?s)(^\h*(#|//).+?$|/\*.+?\*/)|\.(?!\d+;)

    REPLACE ?1$0:_

    • In replacement, the ?1$0:_ syntax is a conditional replacement

      • If group 1 exists, the overall match $0 is simply rewritten as is

      • If group 1 does not exist (case of the second alternative \.(?!\d+;)), the dot is simply replaced with the part after the :, that is to say the _ underscore character
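
    The same idea can be sketched in Python, as a rough check rather than a description of what Notepad++ does internally. Python's re module has neither \h nor the ?1$0:_ conditional replacement, so this sketch assumes [ \t] as a stand-in for \h and uses a replacement function to mimic the condition. SEARCH and replace_dots are illustrative names.

```python
import re

# re.S mirrors (?s); re.M lets ^ and $ match at line boundaries, which is
# assumed here to mirror how the Notepad++ search treats them.
SEARCH = re.compile(
    r"(^[ \t]*(?:#|//).+?$|/\*.+?\*/)"  # group 1: # or // line, or /* ... */ block
    r"|\.(?!\d+;)",                     # otherwise: a dot not followed by digits and ;
    re.S | re.M,
)

def replace_dots(text):
    """Keep group-1 matches unchanged, turn any other matched dot into '_'."""
    def choose(match):
        # Equivalent of ?1$0:_ : group 1 present -> whole match, else underscore.
        return match.group(0) if match.group(1) is not None else "_"
    return SEARCH.sub(choose, text)

sample = (
    "//|   4 Period MA.mq4 |\n"
    "extern int p1.ma=5;//Period() in minutes\n"
    "extern double STD.width=0.809;\n"
)
print(replace_dots(sample))
```

    The comment line is returned as is, while p1.ma and STD.width get their underscores and 0.809; keeps its dot.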


    After running this regex S/R, against your text, you should get :

    //+------------------------------------------------------------------+
    //|                                                  4 Period MA.mq4 |
    //|                 Copyright © 2006, tageiger aka fxid10t@yahoo.com |
    //+------------------------------------------------------------------+
    #property copyright "Copyright © 2006, tageiger aka fxid10t@yahoo.com"
    #property link      "mailto:fxid10t@yahoo.com"
    #property indicator_chart_window
     
    extern int p1_ma=5;//Period() in minutes
    extern int p2_ma=15;//Period() in minutes
    extern int p3_ma=30;//Period() in minutes
    extern int p4_ma=60;//Period() in minutes
    
    extern int STD_Rgres_length=56;
    extern double STD_width=0.809;
    
    extern int ma_applied_price=1;/*
    Applied price constants. It can be any of the following values:
    
    Constant       Value Description 
    PRICE_CLOSE    0     Close price. 
    PRICE_OPEN     1     Open price. 
    PRICE_HIGH     2     High price. 
    PRICE_LOW      3     Low price. 
    PRICE_MEDIAN   4     Median price, (high+low)/2. 
    PRICE_TYPICAL  5     Typical price, (high+low+close)/3. 
    PRICE_WEIGHTED 6     Weighted close price, (high+low+close+close)/4.*/ 
    extern int ma_Method=0;/*
    Moving Average Method
    Constant    Value Description 
    MODE_SMA    0     Simple moving average, 
    MODE_EMA    1     Exponential moving average, 
    MODE_SMMA   2     Smoothed moving average, 
    MODE_LWMA   3     Linear weighted moving average.   */
    
    extern int ma1_Length=13;
    extern int ma2_Length=21;
    extern int ma3_Length=34;
    extern int ma4_Length=55;
    extern int ma5_Length=89;
    extern int ma6_Length=144;
    extern int ma7_Length=233;
    
    extern int fib_SR_shadow_1=8;
    extern int fib_SR_shadow_2=13;
    extern int fib_SR_shadow_3=21;
    extern int fib_SR_shadow_4=34;
    extern int fib_SR_shadow_5=55;
    extern int fib_SR_shadow_6=89;
    extern int fib_SR_shadow_7=144;
    
    extern color fib_SR_shadow_1_c=AliceBlue;
    extern color fib_SR_shadow_2_c=LightBlue;
    extern color fib_SR_shadow_3_c=DodgerBlue;
    extern color fib_SR_shadow_4_c=RoyalBlue;
    extern color fib_SR_shadow_5_c=Blue;
    extern color fib_SR_shadow_6_c=MediumBlue;
    extern color fib_SR_shadow_7_c=DarkBlue;
    
    double ma1_p1, ma2_p1, ma3_p1, ma4_p1, ma5_p1, ma6_p1, ma7_p1;
    double ma1_p2, ma2_p2, ma3_p2, ma4_p2, ma5_p2, ma6_p2, ma7_p2;
    double ma1_p3, ma2_p3, ma3_p3, ma4_p3, ma5_p3, ma6_p3, ma7_p3;
    double ma1_p4, ma2_p4, ma3_p4, ma4_p4, ma5_p4, ma6_p4, ma7_p4;
    double bb1, bb2, bb3, bb4;
    double tmb1,tmb2,tmb3,tmb4,tmr1,tmr2,tmr3,tmr4;
    datetime t1_p1, t2_p1, t1_p2, t2_p2, t1_p3, t2_p3, t1_p4, t2_p4;
    

    Did you expect this output?


    Now, I do not know how the C# regex library is implemented, but I do know that the N++ regex engine uses the Boost regex library, which supports look-behinds containing fixed-length patterns only! Let me explain with a simple example:

    • The regex (?<=[A-Z]{3})\d+, against the string ABC7890    TUVWXYZ12345    AZ456    LMNOP12 finds the numbers 7890, 12345 and 12, but not the number 456 because it is preceded by only two upper-case letters. Logical!

    • But the regex (?<=[A-Z]+)\d+ against the same string simply returns the message Invalid regular expression! Why?

    Well :

    • In the former regex, the string matched by the pattern [A-Z]{3} consists of exactly 3 upper-case letters, so it is a fixed-length string!

    • In the latter regex, the string matched by the pattern [A-Z]+ may consist of 1, 2, 3 or any positive number of upper-case letters, so, obviously, it is not a fixed-length string. The result would have been the same with the regex (?<=[A-Z]{3,})\d+ !
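
    The same restriction can be observed outside Notepad++. The sketch below assumes Python's built-in re module, a different engine from Boost that happens to impose the same fixed-width rule on look-behinds. It only illustrates the behaviour and says nothing about how Boost itself reports the error; the helper name compiles is made up.

```python
import re

text = "ABC7890    TUVWXYZ12345    AZ456    LMNOP12"

# Fixed-length look-behind: exactly three upper-case letters before the digits.
print(re.findall(r"(?<=[A-Z]{3})\d+", text))  # ['7890', '12345', '12']

def compiles(pattern):
    """Return True if the re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Variable-length look-behinds are rejected before any text is searched.
print(compiles(r"(?<=[A-Z]+)\d+"))     # False
print(compiles(r"(?<=[A-Z]{3,})\d+"))  # False
```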

    If you really need the second regex, you must use the \K syntax and change the regex to:

    [A-Z]+\K\d+

    With that regex, once it has matched any non-empty range of upper-case letters [A-Z]+, the \K feature makes the regex engine forget everything that has been matched so far. It also resets the starting position of the match to the \K location in the pattern. So the overall match is finally just the \d+ part!

    So, this time, the regex [A-Z]+\K\d+ does match all the numbers in string ABC7890    TUVWXYZ12345    AZ456    LMNOP12, because the regex pattern means “find a number if it is preceded by upper-case letter(s)” ;-))
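
    Engines without \K can get the same result with a capturing group. A minimal sketch, assuming Python's built-in re module (which does not support \K): the letters are still required before the digits, but only the captured digits are returned.

```python
import re

text = "ABC7890    TUVWXYZ12345    AZ456    LMNOP12"

# [A-Z]+ must precede the digits; the parentheses play the role of "after \K".
numbers = re.findall(r"[A-Z]+(\d+)", text)
print(numbers)  # ['7890', '12345', '456', '12']

# For a replacement, keep the letters (group 1) and rewrite only the digits.
masked = re.sub(r"([A-Z]+)\d+", r"\g<1>#", text)
print(masked)   # ABC#    TUVWXYZ#    AZ#    LMNOP#
```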

    Best Regards,

    guy038



  • @guy038 Hi. Your regex only partially does the job.

    if(lots == 0.0)
    Comment_(" Нет дебаланса ордеров." );           //"All Postions Are Locked. Calculations cancelled."
    Comment_(" работаю...");
    ObjectSet("DEAD ZONE", OBJPROP_YDISTANCE, MP_Y+RazmerShrifta*2.3);
    

    Unfortunately, it makes replacements in these lines, although none are wanted.
    Thanks for trying to explain. I don’t know English, and Google Translate does not render your explanation accurately. I will study your regex, the replacement “?1$0:_” and “\K”.



  • Hello, @владимир-анисимов and All,

    Ah… OK ! So, change the regex S/R as below :

    SEARCH (?s)((#|//).+?$|/\*.+?\*/|".+?")|\.(?!\d+[;)])

    REPLACE ?1$0:_

    • I added the ".+?" alternative to avoid changing any dot . inside "......" zones

    • I changed the negative look-ahead to (?!\d+[;)]) to avoid changing numbers ending with the ; or ) character

    • I removed the ^\h* part to allow the search to find #.... or //.... comments that occur after some code
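
    As a rough cross-check of the revised search, here is the pattern in a Python sketch. It assumes Python's built-in re module, so a replacement function stands in for ?1$0:_ and re.M is assumed to match how Notepad++ treats $. SEARCH2 and replace_dots2 are illustrative names.

```python
import re

SEARCH2 = re.compile(
    r'((?:#|//).+?$|/\*.+?\*/|".+?")'  # group 1: # or // to end of line, /* */ block, "string"
    r"|\.(?!\d+[;)])",                 # otherwise: a dot not followed by digits and ; or )
    re.S | re.M,
)

def replace_dots2(text):
    """Leave group-1 matches alone and replace every other matched dot with '_'."""
    return SEARCH2.sub(lambda m: m.group(0) if m.group(1) is not None else "_", text)

lines = [
    "if(lots == 0.0)",
    'ObjectSet("DEAD ZONE", OBJPROP_YDISTANCE, MP_Y+RazmerShrifta*2.3);',
    "datetime t1.p1, t2.p1;",
]
for line in lines:
    print(replace_dots2(line))
```

    The first two lines should be printed unchanged, and only the variable names in the third line get underscores.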

    Best Regards,

    guy038



  • @guy038 Thanks for the help. You have broadened my view of what else can be done with regular expressions. Thank you.

