Removing numbers and emoji



  • I want to remove the number, the dot (.) and the emoji at the start of each line.
    Can someone please suggest how to do this?

    BEFORE:

    1.🐯 to export
    2.🐲 chain
    3.🐲 process fees
    4.🐲 수고하다
    5.🐯 수입품
    6.🐲 수입하다

    AFTER:

    to export
    chain
    process fees
    수고하다
    수입품
    수입하다



  • Hi @Joss-Medina

    Try this:

    Place the caret at the beginning of the first line to change. Then open the Replace dialog (Ctrl+H) and enter the following:

    Find what: (?-s)^\d+[^ ]+?\x20
    Replace with: [leave empty]

    Select the Regular expression search mode, and click the Replace All button.

    Hope it helps.



  • Hello, @joss-medina, @astrosofista and All,

    @astrosofista, all the emoji symbols in the OP's text have a Unicode code point outside the Basic Multilingual Plane (BMP), that is, above \x{FFFF}, so our Boost regex engine is, generally, not able to find them properly :-(

    So, instead of a negated character class [^....], which obviously does not work here, I would suggest the classic regex below, which matches, from the beginning of each line, the shortest run of standard characters up to and including the first space character:

    SEARCH (?-s)^.+?\x20

    REPLACE Leave EMPTY

    Best Regards,

    guy038
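
To illustrate the point about the BMP, here is a minimal Python sketch (not Notepad++ code) that checks the two emoji from the sample. The helper name `utf16_units` is made up for this illustration. It only shows that these characters sit above \x{FFFF} and need two UTF-16 code units each; how exactly Notepad++'s Boost integration handles them internally is not shown here.

```python
def utf16_units(ch: str) -> int:
    """Return how many 16-bit code units UTF-16 needs for one character."""
    return len(ch.encode("utf-16-le")) // 2

# Both emoji from the sample are outside the BMP, so each one is stored
# as a surrogate pair (two code units) in UTF-16.
for ch in "🐯🐲":
    print(f"U+{ord(ch):X}", ord(ch) > 0xFFFF, utf16_units(ch))
```

A plain letter such as "a" needs a single code unit, which is consistent with ordinary text not being affected.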



  • Given that @Joss-Medina specifically indicated lines starting with a number, I’d recommend a hybrid of @astrosofista’s and @guy038’s solutions:

    • FIND WHAT = (?-s)^\d+.+?\x20

    Like @astrosofista’s solution, this requires that the line start with one or more digits. Like @guy038’s solution, it allows any characters between the digits and the space.
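
For anyone who wants to do the same clean-up outside Notepad++, the hybrid pattern can be sketched in Python as follows. This is an assumption-laden port, not the thread's method: Python's `re` module does not accept a bare `(?-s)`, so it is dropped (in Python, `.` already excludes newlines by default), and `re.MULTILINE` is added so that `^` matches at each line start. The names `LEADING_MARKER` and `strip_markers` are hypothetical.

```python
import re

# Port of the hybrid pattern: one or more digits at the start of a line,
# then the shortest run of any characters, then a single space.
LEADING_MARKER = re.compile(r"^\d+.+?\x20", re.MULTILINE)

def strip_markers(text: str) -> str:
    """Remove the leading 'digits ... first space' prefix from each line."""
    return LEADING_MARKER.sub("", text)

sample = "1.🐯 to export\n2.🐲 chain\n3.🐲 process fees\nno number here"
print(strip_markers(sample))
```

Lines that do not start with a digit, such as the last one in the sample, are left unchanged, which matches the intent of requiring `\d+` at the start.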



  • @guy038 said in Removing numbers and emoji:

    our Boost regex engine is, generally, not able to find them properly

    @guy038, yes, sorry, you’re right. I was misled by my PCRE background :(

