[Regex] Rounding numbers



  • Hello,

    I would like to round the decimals in a document so that each number has only three decimal places. I know that this can be done with regular expressions, but I have no idea how they work.

    Example:
    16.0129754 --> 16.013

    Sincerely,
    David



  • I know @guy038 can do some magical stuff with regex, and with a simple example, like the one you showed, there would only have to be about 20 checks ((first digit 0-9) x (second digit 0-4 or second digit 5-9)). But a regex that handles all of the following properly would be a monster.

    16.01217    -> 16.012
    16.01297    -> 16.013
    16.01949    -> 16.019
    16.01999    -> 16.020
    16.14999    -> 16.150
    16.19999    -> 16.200
    16.99999    -> 17.000
    19.99999    -> 20.000
    199.9999    -> 200.000
    

    That is because every digit that might possibly change needs another layer of checks, and any digit from the thousandths digit and higher could potentially change.
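
    For comparison, the carries in the table above come for free once the number is handled arithmetically. A minimal sketch in plain Python, assuming the standard `decimal` module and a hypothetical helper name `round3` (my own, not part of any plugin):

```python
from decimal import Decimal, ROUND_HALF_UP

def round3(text):
    """Round a decimal string to 3 places, ties away from zero (hypothetical helper)."""
    return str(Decimal(text).quantize(Decimal("0.001"), rounding=ROUND_HALF_UP))

# A few of the carry cases from the table above
for sample in ["16.01217", "16.01999", "16.99999", "199.9999"]:
    print(sample, "->", round3(sample))
```

    Working on the decimal string directly avoids binary floating point, so the carry from 16.99999 to 17.000 needs no special handling.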

    What about negative numbers, including ones that round to zero?

    -0.0009     -> -0.001
    -0.0004     -> -0.000
                or  0.000
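
    Either answer is easy to produce in code. A small sketch, assuming Python's `decimal` module; the function name `round3_signed` and the `keep_negative_zero` flag are hypothetical names chosen for illustration:

```python
from decimal import Decimal, ROUND_HALF_UP

def round3_signed(text, keep_negative_zero=True):
    """Round to 3 places; optionally drop the sign from a result of zero."""
    result = Decimal(text).quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)
    if result == 0 and not keep_negative_zero:
        result = result.copy_abs()  # -0.000 becomes 0.000
    return str(result)

print(round3_signed("-0.0009"))
print(round3_signed("-0.0004"))
print(round3_signed("-0.0004", keep_negative_zero=False))
```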
    

    Also, what do you want done with numbers that have fewer than 3 digits after the decimal? Extend or keep as-is?

    16.01       -> 16.010       # extend
    16.01       -> 16.01        # keep as-is
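
    Both behaviours can sit behind one switch. A minimal sketch, again assuming the `decimal` module; `round3_short` and its `extend` parameter are hypothetical names:

```python
from decimal import Decimal, ROUND_HALF_UP

def round3_short(text, extend=False):
    """Round to 3 places; numbers with 3 or fewer decimals are kept unless extend=True."""
    value = Decimal(text)
    # exponent is -2 for "16.01", so >= -3 means "three decimals or fewer"
    if not extend and value.as_tuple().exponent >= -3:
        return text
    return str(value.quantize(Decimal("0.001"), rounding=ROUND_HALF_UP))

print(round3_short("16.01", extend=True))   # extend
print(round3_short("16.01", extend=False))  # keep as-is
```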
    

    What do you want to do with something that looks numeric but really isn't, such as an IP address:

    10.1.117.9
    

    Most regex that you could come up with would probably stumble over that one.
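
    One way to sidestep dotted sequences is to wrap the number pattern in lookarounds. The sketch below is an assumption of mine, tried only against Python's `re` module (Notepad++ uses a different regex engine, so treat it as a starting point): a candidate is rejected if it is preceded by a digit or dot, or followed by another digit or dot-digit group.

```python
import re

# Hypothetical pattern: a decimal number that is not part of a longer dotted sequence
STANDALONE_NUMBER = re.compile(r"(?<![\d.])-?\d+\.\d+(?!\.?\d)")

sample = "addr 10.1.117.9 val 16.0129754 and -0.00049."
print(STANDALONE_NUMBER.findall(sample))
```

    The trailing period of the sentence is left alone because the lookahead only rejects a dot that is followed by a digit.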

    Honestly, this is where a full-fledged programming language shines. And since Python has easy access to the contents of a text file opened in Notepad++, using the Python Script plugin, that’s what I would recommend.

    Here’s a simple example, though it doesn’t handle the IP-address exception.

    # https://notepad-plus-plus.org/community/topic/15100/regex-rounding-numbers
    #   version 1
    
    #console.clear()
    #console.show()
    
    editor.documentEnd()
    end = editor.getCurrentPos()
    start = 0
    #console.write("start:{}\n".format(start))
    #console.write("end:  {}\n".format(end))
    #editor.documentStart()
    
    while start < end:
        position = editor.findText( FINDOPTION.REGEXP , start , end , r"-?\d+\.\d{4,}")    # find any number with more than 3 digits after the decimal point (at most one leading minus sign, so float() can parse it)
        # change to r"-?\d+\.\d+" to handle any with at least one digit after the decimal (so 1.1 will go to 1.100, but 1. will be assumed to be the end of a sentence, and stay as 1.)
    
        # error handling
        if position is None:
            #console.writeError("editor.position is NONE => not found.\n")
            break
    
        # grab the matched text
        #console.write("editor: findText @ " + str(position[0]) + ":" + str(position[1]) + "\n")
        text = editor.getTextRange(position[0], position[1])
        #console.write(text + "\n")
    
        # round it
        rounded = round( float(text) , 3 )
        text = "{:0.3f}".format(rounded)
        #console.write(text + "\n" )
    
        # replace it with rounded
        editor.setSel(position[0], position[1])
        editor.replaceSel(text)
    
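        # next: resume right after the replacement, and refresh the end position because the document length has changed
        start = position[0] + len(text)
        end = editor.getLength()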
    
    #console.writeError("DONE\n")
    

    (the commented-out console.write commands, and similar, were from my debugging; useful if you want to see what’s going on as it’s searching)
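
    If you want to try the same find-round-replace idea outside Notepad++, here is a rough stand-alone equivalent. It is only a sketch: it works on a plain string instead of the editor buffer, and the names `NUMBER` and `round_numbers` are made up for this example.

```python
import re

# Same idea as the script's search pattern: a number with more than 3 decimals
NUMBER = re.compile(r"-?\d+\.\d{4,}")

def round_numbers(text):
    """Replace every long decimal in text with its value formatted to 3 places."""
    return NUMBER.sub(lambda m: "{:0.3f}".format(float(m.group())), text)

print(round_numbers("x = 16.0129754, y = 1.5"))
```

    Numbers with three or fewer decimals, like the 1.5 here, are not matched and so stay untouched.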

    If you save this as roundNumbers.py, either in the Notepad++ directory under plugins\PythonScript\scripts\ or in %AppData%\Notepad++\plugins\Config\PythonScript\scripts\, you can run it using Plugins > PythonScript > Scripts > roundNumbers.



  • @PeterJones
    Wow, that is awesome! I tried it out and it really works perfectly. That is exactly what I needed. Thank you very much!

    And the good thing is that I don’t have to round up IP addresses ;D.

    Sincerely,
    David



  • Glad it works for you. (And I know you don’t have to round up IP addresses, but if you had a document that mixed your numbers and IP addresses, the script might get confused. Though since I’m looking for at least 4 digits after the decimal point, and each part of an IPv4 address has at most 3 digits, it shouldn’t match a valid IP. :-)



  • I haven’t really investigated it in the context of the current “rounding” problem, but off the top of my head it seems like the Pythonscript presented here was meant for doing this type of thing. Maybe. Maybe not. YMMV. :-D



  • Hello, @david-wanke, @peterjones and All,

    David, I saw your post yesterday but didn’t have time to reply, because I was still thinking about some of the rounding cases shown by Peter!

    Of course, a regex has no problem changing, for instance:

    • The floating number 12.45678 into the number 12.457

    • And the floating number 12.45623 into the number 12.456

    using the rules :

    • If the fourth digit after the decimal point is between 0 and 4, just truncate all decimal digits after the third one

    • If the fourth digit after the decimal point is between 5 and 9, increase the third digit by one and truncate all decimal digits after the third one

    But, as regular expressions mainly do string operations, the second rule would force me to account for every value of the third decimal digit: if 3 then replace by 4, if 7 then replace by 8, and so on! And what about rounding 12.99962, which should give 13.000?
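
    To make the carry problem concrete, here is a sketch of the two rules done digit by digit in Python. It is my own illustration (the name `round_digits` is hypothetical), and it assumes the input is a plain decimal string such as 12.99962:

```python
def round_digits(text, places=3):
    """Apply the two rules above to a decimal string, carrying 9s leftward."""
    sign = "-" if text.startswith("-") else ""
    whole, _, frac = text.lstrip("-").partition(".")
    if len(frac) <= places:
        return text  # nothing to truncate
    digits = [int(c) for c in whole + frac[:places]]
    if frac[places] >= "5":          # second rule: increase the last kept digit
        i = len(digits) - 1
        while i >= 0 and digits[i] == 9:
            digits[i] = 0            # a 9 rolls over and the carry moves left
            i -= 1
        if i >= 0:
            digits[i] += 1
        else:
            digits.insert(0, 1)      # the carry ran past the first digit
    joined = "".join(str(d) for d in digits)
    return sign + joined[:-places] + "." + joined[-places:]

for sample in ["12.45678", "12.45623", "12.99962"]:
    print(sample, "->", round_digits(sample))
```

    The loop over the 9s is exactly the part that a single search-and-replace cannot express without one alternative per digit position.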


    So it’s easy to see that rounding is, fundamentally, a mathematical operation, which is usually handled by a programming or scripting language. For instance, in Peter’s Python script, the line below does the mathematical operation (I’m not a Python expert, but I don’t think I’m wrong about it!!)

    while start < end:
        ....
        rounded = round( float(text) , 3 )
        ....
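
    One caveat about that line: `float` is binary, so a value that looks like an exact tie in the text may be stored slightly below it. A small sketch comparing the float route with the `decimal` module, assuming Python 3; the sample value 1.0005 is my own choice:

```python
from decimal import Decimal, ROUND_HALF_UP

text = "1.0005"

# Float route, as in the script: the stored double is slightly below 1.0005
via_float = "{:0.3f}".format(round(float(text), 3))

# Decimal route: works on the digits as written, ties go away from zero
via_decimal = str(Decimal(text).quantize(Decimal("0.001"), rounding=ROUND_HALF_UP))

print(via_float, via_decimal)
```

    Whether the difference matters depends on the data; it only shows up when the digits after the third decimal are exactly a 5 followed by nothing but zeros.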
    

    As Peter said, this is an example of the limitations of regular languages ;-))

    Best Regards,

    guy038

    P.S. :

    See the Wikipedia articles below:

    https://en.wikipedia.org/wiki/Regular_expression

    https://en.wikipedia.org/wiki/Regular_language

    Just note that I did not understand most of the second article, anyway! But it’s good, from time to time, to feel very intelligent… for a few seconds ;-))

