Adding } to the end of a text line



  • Hello,
    I need a little help with regular expressions.
    I have a UTF-8 text file with thousands of entries.

    Synonyms look like this: {Basel-City}.

    Due to an error, some of them look like this: {Basel-Country or {Northwestern Switzerland

    Is there a way to find these two kinds of occurrences and replace each with the first part + } ?

    Thanks a lot for your help.
    Uwe



  • Hello @uwe-helmer,

    It seems that the part inside your {....} entries is a run of characters made up of :

    • Normal word characters ( upper- or lower-case letters, accented letters, digits or the low-line character _ ) => \w

    • The hyphen-minus character -, as, for instance, in the string Basel-Country

    • The space character, as, for instance, in the string Northwestern Switzerland

    So, the question is : in your present file, which character(s) can follow an unbalanced entry {.... ?

    Once these characters are known ( they may simply be the usual End of Line characters, BTW ), it should be easy to find the right regex to get well-balanced entries {....} again !
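
For readers who prefer to answer that question with a script, here is a minimal Python sketch that tallies which character comes right after each entry that is not closed by }. It assumes the entries only contain the three kinds of characters listed above; the function name `chars_after_unbalanced` and the sample string are made up for illustration.

```python
import re
from collections import Counter

# "{" followed by word characters, spaces or hyphens, then capture the
# next character, if there is one. DOTALL lets "." capture a line break too.
ENTRY = re.compile(r"\{[\w -]+(.?)", re.DOTALL)


def chars_after_unbalanced(text: str) -> Counter:
    """Count the characters found right after entries that lack a closing }."""
    counts = Counter()
    for match in ENTRY.finditer(text):
        following = match.group(1)
        if following != "}":
            # An empty capture means the entry runs up to the end of the text.
            counts[following or "<end of text>"] += 1
    return counts


# Hypothetical sample with CR LF line endings.
sample = "{Basel-City}\r\n{Basel-Country\r\n{Northwestern Switzerland\r\n"
print(chars_after_unbalanced(sample))
```

If the only key reported is the carriage return, the unbalanced entries always stop at the end of a line, which keeps the repair regex simple.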

    You may also include a small part of your text in your next post !

    Cheers,

    guy038



  • Hi guy038,
    It is always CR and LF. Sometimes the closing } is missing. Wherever I have a {, I need a closing } before the CR and LF.

    {Ausgang
    {door
    {Eingang}
    {Eintritt}
    {entry
    {exit}
    {gate}
    {gateway}
    {ingress
    {Pforte}
    {Portal}
    {slammer}
    {Tor
    {Türe}
    {Zugang}
    Gatter
    Tor
    Torbogen
    Treppen
    Stufen
    Wände
    Gelaender
    {banister
    {handrail}
    {railing}
    Innenraum
    {interior}

    Thanks a lot.
    Uwe



  • Hi, @uwe-helmer,

    Ah, OK ! Sorry for my late reply, but I was studying the nagging problem of removing duplicate lines without changing the order of the file contents :-)) See below :

    https://notepad-plus-plus.org/community/topic/14729/deleting-lines-that-repeat-the-first-15-characters/13


    So, let’s consider your example text, below, in a new tab :

    {Ausgang
    {door
    {Eingang}
    {Eintritt}
    {entry
    {exit}
    {gate}
    {gateway}
    {ingress
    {Pforte}
    {Portal}
    {slammer}
    {Tor
    {Türe}
    {Zugang}
    Gatter
    Tor
    Torbogen
    Treppen
    Stufen
    Wände
    Gelaender
    {banister
    {handrail}
    {railing}
    Innenraum
    {interior}
    

    Open the Replace dialog ( Ctrl + H )

    Check the Regular expression search mode

    SEARCH (?-s)^\{.*[^}\r\n](?=\R)

    REPLACE $0}

    Click, once, on the Replace All button ( or, successively, on the Replace button )

    => You should get the text :

    {Ausgang}
    {door}
    {Eingang}
    {Eintritt}
    {entry}
    {exit}
    {gate}
    {gateway}
    {ingress}
    {Pforte}
    {Portal}
    {slammer}
    {Tor}
    {Türe}
    {Zugang}
    Gatter
    Tor
    Torbogen
    Treppen
    Stufen
    Wände
    Gelaender
    {banister}
    {handrail}
    {railing}
    Innenraum
    {interior}
    

    Et voilà !

    Notes :

    • First, the (?-s) modifier makes the special . character match any single character except line-break characters

    • Then, the part ^\{ looks for a literal { character at the beginning of a line ^. Note that the special { regex character has to be escaped !

    • Now, the part .* matches any number, possibly zero, of standard characters, up to …

    • A character different from the } character ( part [^}\r\n] ), which is followed by EOL characters ( look-ahead feature (?=\R) ). Because the look-ahead requires a line break, a final line with no EOL after it is not matched

      • The part [^}\r\n] is a negated character class, matching any single character other than the } character, the EOL character \r and the EOL character \n

      • The \R stands for any kind of line break ( \r\n, \n or \r )

    • In the replacement, $0 rewrites the whole matched string, and the } character is simply appended to it
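
For anyone who would rather run this outside Notepad++, the same idea can be sketched in Python. This is an adaptation, not the exact regex above: Python's `re` module has no `\R`, so the look-ahead spells out the line breaks, and `[^\r\n]*` replaces `.*` because Python's `.` would also match `\r`. Unlike the Notepad++ version, this sketch also closes a final line that has no EOL after it. The function name `close_braces` is made up for illustration.

```python
import re

# ^\{            a literal "{" at the start of a line (MULTILINE mode)
# [^\r\n]*       any characters on the same line
# [^}\r\n]       a last character that is neither "}" nor a line break
# (?=...)        followed by CR LF, LF, CR, or the end of the text
# Note: in Python, "^" only restarts after "\n", so files that use a lone
# "\r" as line ending are not handled by this sketch.
UNCLOSED = re.compile(r"^\{[^\r\n]*[^}\r\n](?=\r?\n|\r|\Z)", re.MULTILINE)


def close_braces(text: str) -> str:
    """Append "}" to every line that starts with "{" but does not end with "}"."""
    return UNCLOSED.sub(lambda m: m.group(0) + "}", text)


# Small sample with CR LF line endings, taken from the example text.
sample = "{Ausgang\r\n{door\r\n{Eingang}\r\nGatter\r\n{interior}\r\n"
print(close_braces(sample))
```

When reading and writing the real file, opening it with `newline=""` keeps the CR LF endings untouched, so the pattern sees them exactly as they are stored.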

    Best Regards,

    guy038

