How to see hex value of character next to cursor?



  • @PeterJones said:

    Using the same script, but skipping the editor callback, would allow you to do it on-demand rather than “live”.

    Technically, you would have to replace the line

    editor.callback(callback_sci_UPDATEUI, [SCINTILLANOTIFICATION.UPDATEUI])
    

    with

    callback_sci_UPDATEUI(None)


  • Okay, I’m addicted. I got it somewhat debugged. This will work up to U+FFFF as a one-shot. (Or comment out the “on-demand” line and uncomment the “editor.callback” line to get it “live”)

    # encoding=utf-8
    
    def callback_sci_UPDATEUI(args):
        c = editor.getCharAt(editor.getCurrentPos())
        if c < 1 or c > 255:
            p = editor.getCurrentPos()
            q = editor.positionAfter(p)
            s = editor.getTextRange(p,q).decode('utf-8')
            try:
                c = ord(s)
            except:
                txt = "'{}' = {} char: ".format(s.encode('utf-8'), len(s))
                for ch in s:
                    c = ord(ch)
                    txt = txt + " HEX:0x{0:04X} DEC:{0} '{1}'".format(c, unichr(c).encode('utf-8') if c not in [13, 10, 0] else 'LINE-ENDING' if c != 0 else 'END-OF-FILE')
                notepad.setStatusBar(STATUSBARSECTION.DOCTYPE, txt)
                return
        try:
            info = "HEX:0x{0:04X} DEC:{0} '{1}'".format(c, unichr(c).encode('utf-8') if c not in [13, 10, 0] else 'LINE-ENDING' if c != 0 else 'END-OF-FILE')
        except ValueError:
            info = "HEX:?? DEC:?"
        notepad.setStatusBar(STATUSBARSECTION.DOCTYPE, info)
    
    callback_sci_UPDATEUI(None)     # per https://notepad-plus-plus.org/community/topic/17799/, want on-demand
    # editor.callback(callback_sci_UPDATEUI, [SCINTILLANOTIFICATION.UPDATEUI]) # per https://notepad-plus-plus.org/community/topic/14767/, want live
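
For readers without PythonScript at hand, the decode step can be tried outside Notepad++. The following is only a minimal standalone sketch in Python 3, not the script above: a `bytes` buffer stands in for the Scintilla document, and `utf8_char_length` and `describe_at` are hypothetical helper names that roughly play the role of `editor.positionAfter()` and the callback body.

```python
def utf8_char_length(lead_byte):
    """Number of bytes in the UTF-8 sequence that starts with lead_byte."""
    if lead_byte < 0x80:
        return 1
    if lead_byte >> 5 == 0b110:
        return 2
    if lead_byte >> 4 == 0b1110:
        return 3
    if lead_byte >> 3 == 0b11110:
        return 4
    raise ValueError("offset is not at the start of a UTF-8 character")


def describe_at(buffer, pos):
    """Describe the character starting at byte offset pos of a UTF-8 buffer."""
    if pos >= len(buffer):
        return "END-OF-FILE"
    # Stand-in for editor.positionAfter(): skip over the whole UTF-8 sequence.
    end = pos + utf8_char_length(buffer[pos])
    ch = buffer[pos:end].decode("utf-8")
    c = ord(ch)
    label = "LINE-ENDING" if c in (13, 10) else ch
    return "HEX:0x{0:04X} DEC:{0} '{1}'".format(c, label)


# 'a', the Euro sign (3 bytes), then CR LF: character starts are 0, 1, 4, 5.
text = "a\u20ac\r\n".encode("utf-8")
for offset in (0, 1, 4, 5, 6):
    print(offset, describe_at(text, offset))
```

Offset 6 is one past the last byte, so it reports the end of the buffer instead of decoding anything.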
    


  • Okay, I found the way to make it compatible with U+10000 and above:

    # encoding=utf-8
    
    def get_wide_ordinal(char):
        '''https://stackoverflow.com/a/7291240/5508606'''
        if len(char) != 2:
            return ord(char)
        return 0x10000 + (ord(char[0]) - 0xD800) * 0x400 + (ord(char[1]) - 0xDC00)
    
    def callback_sci_UPDATEUI(args):
        c = editor.getCharAt(editor.getCurrentPos())
        if c < 1 or c > 255:
            p = editor.getCurrentPos()
            q = editor.positionAfter(p)
            s = editor.getTextRange(p,q).decode('utf-8')
            c = get_wide_ordinal(s)
        else:
            s = unichr(c)
    
        try:
            info = "'{1}' = HEX:0x{0:04X} = DEC:{0} ".format(c, s.encode('utf-8') if c not in [13, 10, 0] else 'LINE-ENDING' if c != 0 else 'END-OF-FILE')
        except ValueError:
            info = "HEX:?? DEC:?"
        notepad.setStatusBar(STATUSBARSECTION.DOCTYPE, info)
    
    callback_sci_UPDATEUI(None)     # per https://notepad-plus-plus.org/community/topic/17799/, want on-demand
    # editor.callback(callback_sci_UPDATEUI, [SCINTILLANOTIFICATION.UPDATEUI]) # per https://notepad-plus-plus.org/community/topic/14767/, want live
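
The arithmetic in `get_wide_ordinal` is the standard UTF-16 surrogate-pair formula: a character above U+FFFF arrives as two code units when the interpreter stores strings as UTF-16, which is presumably the case here since the decoded string has length 2. A small sketch of the round trip, written for Python 3 and working on plain integers; `split_to_surrogates` and `combine_surrogates` are hypothetical names, not part of the script.

```python
def split_to_surrogates(code_point):
    """Split a code point above U+FFFF into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is inside the BMP or out of range")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def combine_surrogates(high, low):
    """Inverse operation; same arithmetic as get_wide_ordinal."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00)


high, low = split_to_surrogates(0x1F600)
print("U+1F600 -> 0x{:04X} 0x{:04X}".format(high, low))
print("and back -> 0x{:X}".format(combine_surrogates(high, low)))
```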
    


  • Addicted. Definitely addicted. :)



  • Hello, @PeterJones, and All,

    I’ve tried your script on a text containing Unicode characters beyond the BMP => Just perfect !

    Just a question : In which case the END-OF-FILE string is displayed in the status bar ? I initially thought it would be when opening an empty new 1 file, but no !

    Cheers,

    guy038



  • @guy038

    there is no end-of-file (EOF) string/char.
    EOF is more of a status, as when reading a file.
    When a file handle reaches the end of a file, it sets its EOF status to true
    to signal that the end has been reached.

    Or did I misunderstand your question?



  • @guy038 said:

    In which case the END-OF-FILE string is displayed in the status bar ?

    In @Scott-Sumner’s original, END-OF-FILE would show if you were at the last character of the file, or if the file were empty. However, my changes to trap Unicode characters made the end-of-file case raise errors.

    To fix that bug, change

    if c < 1 or c > 255:
    

    to

    if c < 0 or c > 255:


  • @Ekopalypse said:

    there is no end-of-file (EOF) string/char.

    @guy038 was talking about the else 'END-OF-FILE' that should go to the status bar – ie, output of the program, not input from the file.

    The editor.getCharAt() apparently returns 0 if the getCurrentPos() is at the end of the file. But because I introduced the bug of c<1 instead of c<0 after that line, the 0 condition in the info = ... was never hit, so my code didn’t recognize the end of the file. (It was actually printing the untrapped error to the PythonScript console, but I hadn’t seen it, because I never tested the end-of-file condition while making my Unicode-capable version of the script.)
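
The effect of `c < 1` versus `c < 0` can be reproduced without the editor. This is a sketch in Python 3 under one loud assumption: that at the end of the file the text range at the caret is an empty string. `code_at_caret` and `simulate_end_of_file` are hypothetical stand-ins for the branch inside the callback.

```python
def get_wide_ordinal(char):
    """Same arithmetic as the helper posted earlier in the thread."""
    if len(char) != 2:
        return ord(char)
    return 0x10000 + (ord(char[0]) - 0xD800) * 0x400 + (ord(char[1]) - 0xDC00)


def code_at_caret(c, text_at_caret, lower_bound):
    """Mimic the script's branch; lower_bound is 1 (buggy) or 0 (fixed)."""
    if c < lower_bound or c > 255:
        return get_wide_ordinal(text_at_caret)
    return c


def simulate_end_of_file(lower_bound):
    """Feed the end-of-file case: c == 0 and (assumed) empty text at the caret."""
    try:
        c = code_at_caret(0, "", lower_bound)
    except TypeError:
        # ord('') raises TypeError; the script only has `except ValueError`
        # around the formatting step, so there the error goes untrapped.
        return "untrapped TypeError"
    return "END-OF-FILE" if c == 0 else "0x{:04X}".format(c)


print("c < 1:", simulate_end_of_file(1))
print("c < 0:", simulate_end_of_file(0))
```

With the lower bound at 1, the value 0 is sent through the text lookup and `ord('')` fails; with the lower bound at 0, the value 0 falls through to the formatting step, where the END-OF-FILE label is chosen.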



  • @PeterJones

    ahh, I was looking through your code to see whether that statement was there, but missed that I had to scroll horizontally to see it; then I thought @guy038 was really talking about the EOF status - I should have known better :-) - SORRY @guy038, and thank you Peter for clarifying.



  • Hello, @PeterJones, @ekopalypse, and All,

    So, finally, using your modification :

        if c < 0 or c > 255:
    

    and your line, slightly modified :

            info = "  {1}    0x{0:04X}  -  {0} ".format(c, s.encode('utf-8') if c not in [13, 10, 0] else 'LINE-END' if c != 0 else 'FILE-END')
    

    Everything is OK ! And I have never hit the exception handler :-))

        except ValueError:
            info = "HEX:?? DEC:?"
    

    For instance :

    • Open a new file

    • Running the script shows '  FILE-END    0x0000  -  0' in the status bar

    • Add a Euro character

    • With cursor right before the currency symbol, I get '  €    0x20AC  -  8364'

    • With cursor right after the currency symbol, I get, again, '  FILE-END    0x0000  -  0'

    • Now, hit the Enter key

    • Move the caret right after the Euro character => This time, it answers '  LINE-END    0x000D  -  13'

    • Hit the Down arrow key. Again, we get '  FILE-END    0x0000  -  0'

    Another example :

    • In a new file, I added two characters, which give the same resulting glyph é !
    é   //  LATIN SMALL LETTER E WITH ACUTE
    é  //  LATIN SMALL LETTER E + DIACRITICAL COMBINING ACUTE ACCENT
    
    • With cursor right before the first single é character, it displays '  é    0x00E9  -  233'

    • With cursor right before the group of two characters, it displays '  e    0x0065  -  101'

    • Moving the cursor rightwards 1 position, “inside” the group of two characters, it displays '  ́    0x0301  -  769', as expected !

    Refer to the complete list of the Combining Diacritical marks, below, for information :

    http://www.unicode.org/charts/PDF/U0300.pdf

    BR

    guy038
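
The two spellings of é in the example above can also be inspected outside Notepad++. A short sketch using Python 3's standard `unicodedata` module; the variable names are only illustrative.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT

# List every code point of each spelling with its official Unicode name.
for label, text in (("precomposed", precomposed), ("decomposed", decomposed)):
    for ch in text:
        print("{}: 0x{:04X}  -  {}  {}".format(label, ord(ch), ord(ch), unicodedata.name(ch)))

# The strings differ code point by code point, but are canonically equivalent.
print("equal as typed:", precomposed == decomposed)
print("equal after NFC:", unicodedata.normalize("NFC", decomposed) == precomposed)
```

This is why the script reports one value for the first glyph and two separate values (0x0065, then 0x0301) for the second: it shows code points, not rendered glyphs.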



  • @Alan-Kilborn said:

    Addicted. Definitely addicted. :)

    I am admittedly addicted. But did you see that it helped me use Notepad++ to find a hidden character in someone else’s code? That right there justifies the time I put into it, in my opinion.

