Regex: Search and Replace in Chess PGN



  • Hello all,
    I have a chess PGN file, and I want to search and replace under the following conditions:


    If the [FEN] string contains a w (near the end of the string, case-sensitive), then replace the [White “whatever inside quotation marks”] line above it with [White “wtm”].

    If the [FEN] string contains a b (near the end of the string, case-sensitive), then replace the [White “whatever inside quotation marks”] line above it with [White “btm”].


    Below is EXAMPLE TEXT:
    

    [Event “I’M Not A GM Speed Chess”]
    [Site “chess.com INT”]
    [Date “2021.01.13”]
    [Round “1.8”]
    [White “Canty, James”]
    [Black “Rozman, Levy”]
    [Result “1-0”]
    [ECO “B01”]
    [WhiteElo “2221”]
    [BlackElo “2343”]
    [Annotator “Admin”]
    [SetUp “1”]
    [FEN “3k4/8/5rKp/5N2/5PP1/4R2r/8/8 w - - 0 56”]
    [PlyCount “1”]
    [EventDate “2021.01.03”]

    1. Kxf6 1-0

    [Event “Titled Tuesday 12th Jan”]
    [Site “chess.com INT”]
    [Date “2021.01.12”]
    [Round “1”]
    [White “Kulkarni, R.”]
    [Black “Tharushi, T H D Niklesha”]
    [Result “1-0”]
    [ECO “B22”]
    [WhiteElo “2321”]
    [BlackElo “1712”]
    [Annotator “Admin”]
    [SetUp “1”]
    [FEN “6k1/3R4/5Kp1/2p2pN1/5P1P/1Pb3P1/4r3/8 w - - 0 48”]
    [PlyCount “1”]
    [EventDate “2021.01.12”]

    1. Kxg6 1-0

    [Event “Schools Friendly U11 2021”]
    [Site “lichess.org INT”]
    [Date “2021.01.11”]
    [Round “4”]
    [White “Kryshtafor, Maksym”]
    [Black “Qalabegashvili, Data”]
    [Result “1-0”]
    [ECO “B72”]
    [BlackElo “1308”]
    [Annotator “Admin”]
    [SetUp “1”]
    [FEN “8/p3p3/2ppR3/8/7P/1K6/2r2k2/8 w - - 0 39”]
    [PlyCount “1”]
    [EventDate “2021.01.11”]
    [WhiteTeam “Ukraine”]
    [BlackTeam “Georgia”]
    [WhiteTeamCountry “UKR”]
    [BlackTeamCountry “GEO”]

    1. Kxc2 1-0

    Thanks for reading, and have a nice day to all the community!



  • @Nguyen-Wanderoz said in Regex: Search and Replace in Chess PGN:

    I have a chess PGN file, and I want to search and replace under the following conditions:

    I think it’s achievable. However, since your data wasn’t in a “black window”, it may have been altered by the posting engine (it likes changing the double quotes "). To prevent that, please post your data again, showing one set each of the “b” and “w” cases. Then repeat the same examples in a separate window with the results you want (after the edits).

    So insert the “before” examples, select them, then click the </> button you see immediately above the posting window. Repeat this for the “after” set.

    Terry



  • @Nguyen-Wanderoz said in Regex: Search and Replace in Chess PGN:

    If the [FEN] string contains a w (near the end of the string, case-sensitive), then replace the [White “whatever inside quotation marks”] line above it with [White “wtm”].

    @Terry-R , thank you! I have reposted the data as you instructed. A long time ago, @guy038 helped me with a similar scheme: https://community.notepad-plus-plus.org/topic/16283/please-help-me-to-replace-text-like-the-following-condition?_=1619661518437
    However, it’s too hard for me to write a new one on my own.


    This is the set with b in [FEN]
    Before:

    [Event "11th Sharjah Women Prelim"]
    [Site "lichess.org INT"]
    [Date "2021.01.07"]
    [Round "10"]
    [White "Arabidze, M."]
    [Black "Janiashvili, Mariami"]
    [Result "1/2-1/2"]
    [ECO "A06"]
    [WhiteElo "2437"]
    [BlackElo "1963"]
    [Annotator "Admin"]
    [SetUp "1"]
    [FEN "r5k1/pp3pB1/8/3p4/4p3/1P2P1Pr/P4P2/R2R2K1 b - - 0 30"]
    [PlyCount "1"]
    [EventDate "2021.01.07"]
    
    30... Kxg7 1/2-1/2
    
    
    

    This is the set with w in [FEN]
    Before:

    [Event "1st Cappelle Online Blitz"]
    [Site "Europe-Echecs INT"]
    [Date "2021.01.10"]
    [Round "5"]
    [White "Bauer, Ch"]
    [Black "Mohammad, Nubairshah Shaikh"]
    [Result "1-0"]
    [ECO "B10"]
    [WhiteElo "2639"]
    [BlackElo "2445"]
    [Annotator "Admin"]
    [SetUp "1"]
    [FEN "2k5/P4R2/1rK2p2/4bp2/3N4/8/5P2/8 w - - 0 59"]
    [PlyCount "1"]
    [EventDate "2021.01.10"]
    [EventType "blitz"]
    
    59. Kxb6 1-0
    

    This is the result I want for the set with b in [FEN]. Please look at the changed [White "..."] line.
    After:

    [Event "11th Sharjah Women Prelim"]
    [Site "lichess.org INT"]
    [Date "2021.01.07"]
    [Round "10"]
    [White "btm"]
    [Black "Janiashvili, Mariami"]
    [Result "1/2-1/2"]
    [ECO "A06"]
    [WhiteElo "2437"]
    [BlackElo "1963"]
    [Annotator "Admin"]
    [SetUp "1"]
    [FEN "r5k1/pp3pB1/8/3p4/4p3/1P2P1Pr/P4P2/R2R2K1 b - - 0 30"]
    [PlyCount "1"]
    [EventDate "2021.01.07"]
    
    30... Kxg7 1/2-1/2
    
    

    This is the result I want for the set with w in [FEN]. Please look at the changed [White "..."] line.
    After:

    [Event "1st Cappelle Online Blitz"]
    [Site "Europe-Echecs INT"]
    [Date "2021.01.10"]
    [Round "5"]
    [White "wtm"]
    [Black "Mohammad, Nubairshah Shaikh"]
    [Result "1-0"]
    [ECO "B10"]
    [WhiteElo "2639"]
    [BlackElo "2445"]
    [Annotator "Admin"]
    [SetUp "1"]
    [FEN "2k5/P4R2/1rK2p2/4bp2/3N4/8/5P2/8 w - - 0 59"]
    [PlyCount "1"]
    [EventDate "2021.01.10"]
    [EventType "blitz"]
    
    59. Kxb6 1-0
    

    Thank you :)



  • @Nguyen-Wanderoz said in Regex: Search and Replace in Chess PGN:

    A long time ago, @guy038 helped me with a similar scheme: https://community.notepad-plus-plus.org/topic/16283/please-help-me-to-replace-text-like-the-following-condition?_=1619661518437
    However, it’s too hard for me to write a new one on my own.

    Thanks for showing the data in black boxes. It means we can trust that it has not been altered during posting.

    I wasn’t aware that @guy038 had supplied a solution for you that long ago, on the same data with slightly different needs. Now that you have mentioned him, I feel he will likely adjust that solution to fit your new request. If he hasn’t by the time I get back to my PC in about 10 hours, I will undertake to do so for you.

    Terry



  • Hi, @nguyen-wanderoz, @terry-r and All,

    @terry-r :

    You could have provided the solution yourself ! It is easy with @nguyen-wanderoz’s excellent post, which gives us the raw text of both the BEFORE and AFTER samples, for each of the two possibilities, in code blocks !


    So, in short, @nguyen-wanderoz has a list with a lot of sections, regarding chess tournaments. In each section, two lines are of interest for this topic :

    • The line describing the person playing as "White"

    • The line [FEN ••••••••], which describes, I presume, the interesting moves ( I may be wrong ! )

    Two configurations of a single section are possible :

    ....
    [White "Arabidze, M."]
    ....
    [FEN "r5k1/pp3pB1/8/3p4/4p3/1P2P1Pr/P4P2/R2R2K1 b - - 0 30"]
    ....
    

    As the FEN line contains the letter b, near the end of the line, he wants to replace the name in the White line with the btm string

    OR

    ....
    [White "Bauer, Ch"]
    ....
    [FEN "2k5/P4R2/1rK2p2/4bp2/3N4/8/5P2/8 w - - 0 59"]
    ....
    

    As the FEN line contains the letter w, near the end of the line, he wants to replace the name in the White line with the wtm string


    Thus, a possible regex S/R is :

    SEARCH : (?-si)(?<=^\x5bWhite\x20)".*?"(?=\x5d(?s:.+?)^\x5bFEN\x20.+?\x20(?:(w)|b))

    REPLACE : "(?1w:b)tm"

    As usual :

    • Tick the Wrap around option

    • Select the Regular expression search mode

    • Click on the Replace All button ( or the Replace button if your N++ release is v7.9.1 or later )


    IMPORTANT : Because of a quirk of the Markdown syntax on our NodeBB forum, I preferred to write the :

    • [ character as \x5b

    • ] character as \x5d


    Notes :

    • First, the part (?-si) means that :

      • The regex meta-character . matches a single standard character ( not an EOL char )

      • The whole search is case-sensitive

    • The main search is the part ".*?", which matches any zone surrounded by double quotes. However, this match is kept ONLY IF :

      • It is preceded by the string [White, beginning a line and followed by a space char, thanks to the look-behind structure (?<=^\x5bWhite\x20)

      • It is followed by the string ]•••••••••••••••[FEN, then a space char, then anything else up to a second space char and, finally, a lowercase letter w or b, thanks to the look-ahead structure (?=\x5d(?s:.+?)^\x5bFEN\x20.+?\x20(?:(w)|b))

    • Note that the ••••••••• part refers to the shortest non-empty multi-line range of chars, including EOL chars. Hence the need for the (?s) modifier, written here in its scoped form (?s:.+?)

    • Note also that the ending part of the regex, (?:(w)|b), is a non-capturing group which itself contains group 1, defined only when the letter w is matched

    • In the replacement regex, we rewrite :

      • The opening "

      • The letter w if group 1 is defined and the letter b if group 1 is not defined, thanks to the conditional syntax (?1w:b)

      • The tm string, with that exact case

      • The ending "
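    For anyone who wants to run the same replacement outside Notepad++, here is a minimal Python sketch of the idea described in the notes above. It is an adaptation, not the Notepad++ regex itself: Python's re module does not accept the leading (?-si) ( case-sensitive matching and a . that stops at EOL are already its defaults ) and it has no conditional replacement like (?1w:b), so a callback chooses between wtm and btm. The names PATTERN and mark_side_to_move are hypothetical, and the sketch assumes that every game contains a FEN line after its White line.

```python
import re

# Look-behind: the quoted zone must follow "[White " at the start of a line.
# Look-ahead: it must be followed by "]", then the shortest multi-line range
# up to a line starting with "[FEN ", then the first space followed by w or b.
# Group 1 is defined only when the letter after that space is "w".
PATTERN = re.compile(
    r'(?<=^\[White )".*?"(?=\](?s:.+?)^\[FEN .+? (?:(w)|b))',
    re.MULTILINE,
)


def mark_side_to_move(pgn: str) -> str:
    """Replace each White name with "wtm" or "btm", based on the FEN line."""
    # The callback plays the role of the (?1w:b) conditional replacement.
    return PATTERN.sub(lambda m: '"wtm"' if m.group(1) else '"btm"', pgn)


sample = '''[White "Arabidze, M."]
[Black "Janiashvili, Mariami"]
[WhiteElo "2437"]
[FEN "r5k1/pp3pB1/8/3p4/4p3/1P2P1Pr/P4P2/R2R2K1 b - - 0 30"]

30... Kxg7 1/2-1/2

[White "Bauer, Ch"]
[Black "Mohammad, Nubairshah Shaikh"]
[WhiteElo "2639"]
[FEN "2k5/P4R2/1rK2p2/4bp2/3N4/8/5P2/8 w - - 0 59"]

59. Kxb6 1-0
'''

result = mark_side_to_move(sample)
print(result)
```

    Because the look-behind requires a space right after White, lines such as [WhiteElo "2437"] are left alone, exactly as with the Notepad++ version.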

    Best Regards

    guy038



  • Hi @guy038 @Terry-R,

    Thank you!!! With your instructions, the search and replace works smoothly.


    Ah, in a chess database, [FEN ••••••••] describes the position, from upper left to lower right. All the content inside brackets holds information about the game. The main chess moves are presented afterward, outside the brackets.


    Once again, I’m very grateful to you for your help and explanations :)!
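    As a small illustration of why the w or b appears near the end of the FEN string: a standard FEN record has six space-separated fields ( piece placement, side to move, castling rights, en passant square, halfmove clock, fullmove number ), so the letter of interest is simply the second field. The sketch below is only an example under that assumption, and the function name side_to_move is hypothetical.

```python
def side_to_move(fen: str) -> str:
    """Return "w" or "b", the second space-separated field of a FEN string."""
    fields = fen.split()
    if len(fields) < 2 or fields[1] not in ("w", "b"):
        raise ValueError(f"not a usable FEN string: {fen!r}")
    return fields[1]


# The two FEN strings from the samples posted earlier in this thread.
print(side_to_move("r5k1/pp3pB1/8/3p4/4p3/1P2P1Pr/P4P2/R2R2K1 b - - 0 30"))
print(side_to_move("2k5/P4R2/1rK2p2/4bp2/3N4/8/5P2/8 w - - 0 59"))
```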
    Best Regards,
    wanderoz

