REQUEST: Value of character in status line



  • I have Notepad++ 7.5.6 on Windows Server 2012.

    Can we have a new feature? I need to see the value of the character to the right of the cursor in the status line.

    Why? Because we get a lot of data from Word, Excel, Quark, PDF and other documents that we export to a text file, and different programs export some characters in different ways. So I need to see if a character is a real dash (x2D) or an M-dash (x97 in Windows-1252, which appears longer).

    Then I process that text file in Perl and change the M-dash to another special escaped sequence for yet another program which will import the processed data.

    Showing the value in hex and decimal would be helpful. I need both in different circumstances.

    Thanks!



  • @Blafulous-Crassley

    See here for some ideas on how to accomplish this.



  • @Blafulous-Crassley said:

    I have Notepad 7.5.6 on Windows Server 2012.
    Can we have a new feature? I

    It’s always best to check the newest version of software (v7.7.1 is currently the newest) before asking for a new feature. You can download the portable “zip” version, and unzip to a temporary directory, if you’re not ready to commit to a new version of Notepad++.

    That feature has not yet been added, but it’s easy enough to script (which is why it’s likely to never be added, even if you follow the FAQ’s instructions for requesting a new feature).

    If you have the PythonScript plugin (look here for instructions on installing it on v7.6.3 and newer, or just use the old Plugin Manager to install it on your old v7.5.6), you can grab the script I posted here. Any PythonScript can be aliased to a keystroke: use the Python Script Configuration dialog to Add the script to the Menu Items list; then, after restarting Notepad++, use Settings > Shortcut Mapper to assign a keyboard shortcut.

    If you leave the script as-is, it will show, on-demand, what character is under the cursor. If you uncomment the last line, save, and run it once, it will do a live update as you move the cursor around… but it’s not super efficient. On-demand is probably all you’ll really find you need…
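    For anyone who cannot install the plugin, here is a minimal standalone sketch of the lookup itself. It is not the script linked above: `describe_char` is a hypothetical helper name, and it takes the text and cursor index as plain arguments instead of reading them from the editor. It reports the code point in hex and decimal, as requested.

```python
import unicodedata


def describe_char(text, pos):
    """Describe the character at index pos in hex and decimal.

    Hypothetical helper for illustration; a real editor script would
    read the text and the cursor position from the editor instead.
    """
    if pos < 0 or pos >= len(text):
        return "no character at cursor"
    ch = text[pos]
    code = ord(ch)
    # Fall back to a placeholder for characters without a Unicode name.
    name = unicodedata.name(ch, "<unnamed>")
    return "U+{:04X}  hex 0x{:X}  dec {}  {}".format(code, code, code, name)


# In a Windows-1252 (ANSI) file the em dash is the single byte 0x97;
# once decoded, it is the same character as U+2014.
assert b"\x97".decode("cp1252") == "\u2014"

sample = "10\u201420 - 30"
print(describe_char(sample, 2))  # the em dash
print(describe_char(sample, 6))  # the plain hyphen-minus
```

    Run against a line copied out of the exported file, this is enough to tell a hyphen-minus from an em dash without touching the Notepad++ installation.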

    (@Alan-Kilborn gave the short-form of the answer while I was typing. Sometimes, I think he’s right that I’m too verbose in my help.)



  • I did manage to install the zip file into my \user\ME\bin\notepad++\ dir. It seems I can’t install PythonScript, even in my \users\ME directory, because IT has the system locked down, and they don’t like me installing stuff.



  • Ok, I got NP++ 7.7.1 running, went to install PythonScript and it’s not even listed in the plugins.



  • @Blafulous-Crassley said:

    PythonScript and it’s not even listed in the plugins.

    Which is why I said “look here [= https://notepad-plus-plus.org/community/topic/17256/guide-how-to-install-the-pythonscript-plugin-on-notepad-7-6-3-7-6-4-and-above ] for instructions on installing it on v7.6.3 and newer”, which includes on v7.7.1, which is newer than v7.6.3.



  • Ah, got it. I will have to install PythonScript later. :)



  • Hello, @blafulous-crassley, @PeterJones, @alan-kilborn and All,

    Here is the complete Unicode list of the dash characters :

    http://www.unicode.org/versions/Unicode12.0.0/ch06.pdf#G9697

    From that list, after some tests, it turns out that 9 characters do not display their glyph correctly in Notepad++ with most of the well-known fonts !

    So, below, I listed the remaining Unicode dash characters, sorting them by increasing width of the >< distance :

    
    # 3 CONSECUTIVE DASH characters, between ><
    
            05BE  Hebrew punctuation maqaf   >־־־<
    
    >⁻⁻⁻<   207B  Superscript minus
    >₋₋₋<   208B  Subscript minus
    >‑‑‑<   2011  Non-breaking hyphen
    >---<   002D  Hyphen-minus
    >‐‐‐<   2010  Hyphen
    >‒‒‒<   2012  Figure dash
    >–––<   2013  En dash
    >−−−<   2212  Minus sign
    >―――<   2015  Horizontal bar (= quotation dash)
    >———<   2014  Em dash
    
    >﹣﹣﹣<   FE63  Small hyphen-minus
    >－－－<   FF0D  Fullwidth hyphen-minus
    
    # ONE char, only, between >< 
    
    >~<   007E  Tilde (when used as swung dash)
    >〜<   301C  Wave dash
    >〰<   3030  Wavy dash
    

    So, @blafulous-crassley, you could, easily, replace all these esoteric dash characters with the usual dash character, \x2d, ( or a specific dash char ), using the following regex S/R :

    SEARCH [\x{2010}-\x{2015}\x{207B}\x{208B}\x{2212}\x{FE63}\x{FF0D}]

    REPLACE \x2D ( or another dash char with the syntax \x{####} )
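
    The same dash clean-up can be sketched outside Notepad++, for instance as a pre-processing step on the exported text. This is only an illustration using Python's standard `re` module with the character class from the SEARCH line above; `normalize_dashes` is a made-up name, and the default replacement is the plain hyphen-minus.

```python
import re

# Same character class as the SEARCH expression above, written with
# Python \u escapes instead of the \x{....} syntax.
DASHES = re.compile("[\u2010-\u2015\u207B\u208B\u2212\uFE63\uFF0D]")


def normalize_dashes(text, replacement="-"):
    """Replace every listed dash variant with one replacement string.

    Hypothetical helper for illustration. The lambda keeps the
    replacement literal, so backslashes in it are not interpreted.
    """
    return DASHES.sub(lambda match: replacement, text)


print(normalize_dashes("2019\u20132020 \u2014 x\u2212y"))
```

    Passing a different `replacement` (for example an escape sequence expected by the importing program) covers the "another dash char" case.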


    In the same way, here is the complete Unicode list of true space characters :

    http://www.unicode.org/versions/Unicode12.0.0/ch06.pdf#G17548

    And again, from that list, after tests, 3 characters ( OGHAM SPACE MARK, NARROW NO-BREAK SPACE, MEDIUM MATHEMATICAL SPACE ) do not display their glyph correctly in Notepad++ with most of the well-known fonts !

    So, below, I listed the remaining Unicode space characters, sorting them by increasing width of the space character :

    
    # 3 CONSECUTIVE SPACE characters, between ><
    
    >​​​<      200B  ZERO-WIDTH SPACE     ( ZWSP  Format character )
    
    >   <   200A  HAIR SPACE           ( HSP   character )
    
    >   <   2009  THIN SPACE           ( THSP  character )
    
    >   <   2006  SIX-PER-EM SPACE     ( 6/MSP character )
    
    >   <   2005  FOUR-PER-EM SPACE    ( 4/MSP character )
    
    >   <   2004  THREE-PER-EM SPACE   ( 3/MSP character )
    >   <   0020  SPACE                ( SP    character )
    >   <   00A0  NO-BREAK SPACE       ( NBSP  character )
    >   <   2008  PUNCTUATION SPACE
    
    >   <   2000  EN QUAD              ( NQSP  character )
    >   <   2002  EN SPACE             ( ENSP  character )
    
    >   <   2007  FIGURE SPACE         ( FSP   character )
    
    >   <   3000  IDEOGRAPHIC SPACE    ( IDSP  character )
    
    >   <   2001  EM QUAD              ( MQSP  character )
    >   <   2003  EM SPACE             ( EMSP  character )
    

    Of course, this time, you’ll be able to see the space width difference, ONLY IF you use a true proportional Unicode font, such as :

    • Lucida Sans Unicode, Microsoft Sans Serif or Tahoma ( Microsoft fonts )

    • MS PGothic or MS PMincho or MS UI Gothic ( Asiatic Microsoft proportional fonts )

    • MS Gothic or MS Mincho ( Asiatic Microsoft fixed fonts )

    I suppose that the most common of these ones is the proportional Tahoma font !

    Note that, with the last two fixed fonts, the EM QUAD, EM SPACE and IDEOGRAPHIC SPACE characters are exactly twice as wide as the usual space !

    So, in order to replace all these esoteric space characters, with the usual space character, \x20, ( or a specific space char ), use the following regex S/R :

    SEARCH [\x{00A0}\x{2000}-\x{200B}\x{3000}]

    REPLACE \x20 ( or another space char with the syntax \x{####} )
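
    As with the dashes, the space clean-up can be sketched in Python's standard `re` module, assuming the text has already been read into a string. `normalize_spaces` is a hypothetical name, and the character class simply mirrors the SEARCH line above.

```python
import re

# Same character class as the SEARCH expression above: NO-BREAK SPACE,
# U+2000 through U+200B, and IDEOGRAPHIC SPACE.
SPACES = re.compile("[\u00A0\u2000-\u200B\u3000]")


def normalize_spaces(text, replacement=" "):
    """Replace every listed space variant with one replacement string.

    Hypothetical helper for illustration. Note that, like the regex
    above, it also turns the ZERO-WIDTH SPACE (U+200B) into a visible
    space; pass replacement="" in a separate pass if that is unwanted.
    """
    return SPACES.sub(lambda match: replacement, text)


print(normalize_spaces("a\u00A0b\u2003c\u3000d"))
```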

    Best Regards,

    guy038



  • Thank you @guy038 that is very helpful! Since I got into the print industry I learned a lot about all the various dashes and spaces, and even non-breaking spaces.

    I will very likely be unable to install PythonScript to get that character value: I’m on a multi-user machine, and I don’t have permissions to install it in c:\program files (x86)\ or any variant thereof. Plus we are super busy right now.

