    "Summary" feature improvement

    • guy038

      Hi, Alan and All,

      ( Continuation of the previous post )

      Now, I’ve come across a problem with the encodings !

      Have you ever noticed that, when you decide to re-interpret the present encoding of a file with the Encoding > Character Set > ... feature, there are two possible scenarios ?

      • A) - The present encoding is a Unicode encoding with a BOM ( Byte Order Mark ). So, either the UTF-8-BOM, UCS-2 BE BOM or UCS-2 LE BOM encoding

      • B) - The present encoding is ANSI or UTF-8, so without a BOM

      In the first case, whatever the new encoding chosen ( one-byte or two-bytes encoding ), the file contents do not change and my script just respects the real encoding of the current file

      For example, with a UCS-2 LE BOM encoded file, if I re-interpret it with Encoding > Character Set > Western European > OEM-US, my new summary just considers that it’s still a true UCS-2 LE BOM encoded file, leading to a correct summary report !

      In the second case, the new encoding chosen does modify the current file contents in the editor window. In addition, the file is then automatically treated as a UTF-8 encoded file, leading to erroneous results in the summary report :-( However, the current file contents, saved on the disk, still seem unchanged !!

      For instance :

      • Open a new tab

      • Use the Encoding > Convert to UTF-8 feature, if necessary

      • Enter the four chars Aé☀𝜜, without any line-break, at the end

      • Save this file as Test-UTF8.txt

      • Using my script, you get, in a new tab :

      ---------------------------------------------------------------------------------------------
                                  SUMMARY on 2024-02-05 16:50:23.656000
      ---------------------------------------------------------------------------------------------
      
       FULL File Path    :  D:\@@\792\Test-UTF8.txt
      
       CREATION     Date :  Mon Feb  5 16:45:24 2024
       MODIFICATION Date :  Mon Feb  5 15:17:02 2024
      
       READ-ONLY flag    :  NO
       READ-ONLY editor  :  NO
      
      
       Current VIEW      :  MAIN View
      
       Current ENCODING  :  UTF-8
      
       Current LANGUAGE  :  TXT  (Normal text file)
      
       Current Line END  :  Windows (CR LF)
      
       Current WRAPPING  :  YES
      
      
       1-BYTE  Chars     :  1
       2-BYTES Chars     :  1
       3-BYTES Chars     :  1
      
       Sum BMP Chars     :  3
       4-BYTES Chars     :  1
      
       CHARS w/o CR & LF :  4
       EOL ( CR or LF )  :  0
      
       TOTAL characters  :  4
      
      
       BYTES Length      :  10 (0 * 1 + 1 * 1b + 1 * 2b + 1 * 3b + 1 * 4b)
       Byte Order Mark   :  0
      
       BUFFER Length     :  10
       Length on DISK    :  10
      
      
       NON-Blank Chars   :  4
      
       WORDS     Count   :  1 (Caution !)
      
       NON-SPACE Count   :  1
      
      
       True EMPTY lines  :  0
       True BLANK lines  :  0
      
       EMPTY/BLANK lines :  0
      
       NON-BLANK lines   :  1
       TOTAL Lines       :  1
      
      
       SELECTION(S)      :  0 selected char (0 selected byte) in 1 EMPTY range
      

      Everything is OK ( buffer length and length on disk are identical and the bytes length description shows one char for each number of bytes, without any EOL )
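
As a cross-check of the numbers above, here is a minimal sketch in plain Python 3, run outside Notepad++ (so it does not use the PythonScript `editor` object). It groups the four test characters by their UTF-8 length. The helper name `utf8_length_histogram` is hypothetical and is not part of the script.

```python
# Plain Python 3 sketch, independent of Notepad++ / PythonScript.
# 'utf8_length_histogram' is a hypothetical helper name, not part of the script.

def utf8_length_histogram(text):
    """Return {byte_length: number_of_chars} for the UTF-8 encoding of 'text'."""
    histogram = {1: 0, 2: 0, 3: 0, 4: 0}
    for char in text:
        histogram[len(char.encode('utf-8'))] += 1
    return histogram

sample = 'A\u00e9\u2600\U0001d71c'    # The four chars  Aé☀𝜜
histogram = utf8_length_histogram(sample)
total_bytes = sum(size * count for size, count in histogram.items())

print(histogram)      # one char for each length, from 1 to 4 bytes
print(total_bytes)    # 10, as in the 'BYTES Length' line
```

The total of 10 bytes matches both the BUFFER Length and the Length on DISK lines of the report.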

      • Now, switch back to the Test-UTF8.txt file

      • Run the Encoding > Character Set > Western European > OEM-US feature

      • Re-run my script. This time, in another new tab, you get :

      ---------------------------------------------------------------------------------------------
                                  SUMMARY on 2024-02-05 16:51:16.937000
      ---------------------------------------------------------------------------------------------
      
       FULL File Path    :  D:\@@\792\Test-UTF8.txt
      
       CREATION     Date :  Mon Feb  5 16:45:24 2024
       MODIFICATION Date :  Mon Feb  5 15:17:02 2024
      
       READ-ONLY flag    :  NO
       READ-ONLY editor  :  NO
      
      
       Current VIEW      :  MAIN View
      
       Current ENCODING  :  UTF-8
      
       Current LANGUAGE  :  TXT  (Normal text file)
      
       Current Line END  :  Windows (CR LF)
      
       Current WRAPPING  :  YES
      
      
       1-BYTE  Chars     :  1
       2-BYTES Chars     :  6
       3-BYTES Chars     :  3
      
       Sum BMP Chars     :  10
       4-BYTES Chars     :  0
      
       CHARS w/o CR & LF :  10
       EOL ( CR or LF )  :  0
      
       TOTAL characters  :  10
      
      
       BYTES Length      :  22 (0 * 1 + 1 * 1b + 6 * 2b + 3 * 3b + 0 * 4b)
       Byte Order Mark   :  0
      
       BUFFER Length     :  22
       Length on DISK    :  10
      
      
       NON-Blank Chars   :  10
      
       WORDS     Count   :  2 (Caution !)
      
       NON-SPACE Count   :  1
      
      
       True EMPTY lines  :  0
       True BLANK lines  :  0
      
       EMPTY/BLANK lines :  0
      
       NON-BLANK lines   :  1
       TOTAL Lines       :  1
      
      
       SELECTION(S)      :  0 selected char (0 selected byte) in 1 EMPTY range
      

      And, at the same time, a prompt displays this warning :

      CURRENT file re-interpreted as OEM-US => Possible ERRONEOUS results
      So, CLOSE the file WITHOUT saving, RESTORE it (CTRL + SHIFT + T) and RESTART script

      Indeed, this time, as the file contents are unchanged, the length on DISK is still correct but the BUFFER length is wrong, due to the re-interpretation of the characters by the OEM-US encoding. That’s why I preferred to add this warning at the end of the script !
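
The 22 bytes reported for the buffer can be reproduced with a short plain Python 3 sketch, run outside Notepad++. It rests on two assumptions that the thread does not state explicitly: that OEM-US corresponds to code page 437 (Python codec `cp437`), and that the editor keeps the re-interpreted text as UTF-8 internally.

```python
# Plain Python 3 sketch : simulate what the buffer may hold after the
# 'OEM-US' re-interpretation of the 10 bytes stored on disk.
# Assumption 1 : OEM-US corresponds to code page 437 (Python codec 'cp437').
# Assumption 2 : the re-interpreted text is kept as UTF-8 in the buffer.

on_disk = 'A\u00e9\u2600\U0001d71c'.encode('utf-8')    # Aé☀𝜜  =>  10 bytes

reinterpreted = on_disk.decode('cp437')                # each byte becomes one char
in_buffer = reinterpreted.encode('utf-8')              # chars re-encoded as UTF-8

counts = {1: 0, 2: 0, 3: 0}
for char in reinterpreted:
    counts[len(char.encode('utf-8'))] += 1

print(len(on_disk), len(reinterpreted), len(in_buffer))    # 10 10 22
print(counts)                                              # {1: 1, 2: 6, 3: 3}
```

Under these assumptions, the 10 bytes on disk become 10 characters ( 1 of one byte, 6 of two bytes and 3 of three bytes ), which gives the same figures as the second summary.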

      Now, do as it is said :

      • Close the Test-UTF8.txt file ( Ctrl + W )

      • Restore it ( Ctrl + Shift + T )

      • Again, you get the UTF-8 indication, for the Test-UTF8.txt file, at the right of the status bar

      • Re-run my script

      => This time, we get again a correct summary, without any prompt !


      Alan or other Python gurus, feel free to improve this last version and/or to test, on various files, whether all the numbers shown are consistent !

      Best Regards,

      guy038

      • guy038

        Hi All,

        I’ve just realized that, up to now, I simply improved my script with an old version of N++ ( v7.9.2 ). I apologize…

        So, I’m first going to update my last portable version, on my W10 laptop, from v8.5.4 to the v8.6.2 version and I will update my script and redo all the tests

        See you later !

        BR

        guy038

        • Alan Kilborn @guy038

          @guy038 said in Improved version of the "Summary" feature, ...:

          I’ve just realized that, up to now, I simply improved my script with an old version of N++ ( v7.9.2 ). I apologize…

          :-(

          You ought to close out these ancient versions…permanently.

          • guy038

            Hello, @alan-kilborn and All,

            I’ve just discovered that, since the v8.0 N++ version, the UCS-2 BE BOM and UCS-2 LE BOM encodings are able to handle all the characters over the BMP. Thus, these encodings were renamed, respectively, as UTF-16 BE BOM and UTF-16 LE BOM !

            Note that, with these two encodings, each character with code > \x{FFFF} is built with the surrogate pair mechanism, so with two 16-bit code units. Consequently, the total number of bytes in the buffer = 2 (BOM) + 2 x number of chars <= \x{FFFF} + 4 x number of chars > \x{FFFF}

            For example, the simple string Aé☀𝜜, without any EOL, in a UTF-16 BE BOM encoded file, is coded with 12 bytes as :

            
            FE FF 00 41 00 E9 26 00 D8 35 DF 1C
            ----- ----- ----- ----- -----------
             BOM    A     é     ☀       𝜜
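
The byte layout above can be reproduced with a minimal plain Python 3 sketch, run outside Notepad++. The variable names are illustrative only. It also checks the length rule : 2 bytes for the BOM, 2 bytes per BMP character and 4 bytes per character over \x{FFFF}.

```python
import codecs

sample = 'A\u00e9\u2600\U0001d71c'                        # Aé☀𝜜

# The 'utf-16-be' codec adds no BOM by itself, so it is prepended explicitly
data = codecs.BOM_UTF16_BE + sample.encode('utf-16-be')

bmp_chars = sum(1 for char in sample if ord(char) <= 0xFFFF)
non_bmp_chars = len(sample) - bmp_chars                   # chars needing a surrogate pair

expected = 2 + 2 * bmp_chars + 4 * non_bmp_chars          # BOM + BMP chars + surrogate pairs

print(' '.join('{:02X}'.format(byte) for byte in data))   # FE FF 00 41 00 E9 26 00 D8 35 DF 1C
print(len(data), expected)                                # 12 12
```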
            
            

            So, here is my final and updated version of the script, which works in all versions since the v8.0 one !

            # encoding=utf-8
            
            #-------------------------------------------------------------------------
            #                    STATISTICS about the CURRENT file ( v0.5 )
            #-------------------------------------------------------------------------
            
            from __future__ import print_function    # for Python2 compatibility
            
            from Npp import *
            
            import re
            
            import os, time, datetime
            
            import ctypes
            
            from ctypes.wintypes import BOOL, HWND, WPARAM, LPARAM, UINT
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            #  From @alan-kilborn, in post https://community.notepad-plus-plus.org/topic/21733/pythonscript-different-behavior-in-script-vs-in-immediate-mode/4
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            def npp_get_statusbar(statusbar_item_number):
            
                WNDENUMPROC = ctypes.WINFUNCTYPE(BOOL, HWND, LPARAM)
                FindWindowW = ctypes.windll.user32.FindWindowW
                FindWindowExW = ctypes.windll.user32.FindWindowExW
                SendMessageW = ctypes.windll.user32.SendMessageW
                LRESULT = LPARAM
                SendMessageW.restype = LRESULT
                SendMessageW.argtypes = [ HWND, UINT, WPARAM, LPARAM ]
                EnumChildWindows = ctypes.windll.user32.EnumChildWindows
                GetClassNameW = ctypes.windll.user32.GetClassNameW
                create_unicode_buffer = ctypes.create_unicode_buffer
            
                SBT_OWNERDRAW = 0x1000
                WM_USER = 0x400; SB_GETTEXTLENGTHW = WM_USER + 12; SB_GETTEXTW = WM_USER + 13
            
                npp_get_statusbar.STATUSBAR_HANDLE = None
            
                def get_result_from_statusbar(statusbar_item_number):
                    assert statusbar_item_number <= 5
                    retcode = SendMessageW(npp_get_statusbar.STATUSBAR_HANDLE, SB_GETTEXTLENGTHW, statusbar_item_number, 0)
                    length = retcode & 0xFFFF
                    type = (retcode >> 16) & 0xFFFF
                    assert (type != SBT_OWNERDRAW)
                    text_buffer = create_unicode_buffer(length + 1)  # + 1 : room for the terminating NUL written by SB_GETTEXTW
                    retcode = SendMessageW(npp_get_statusbar.STATUSBAR_HANDLE, SB_GETTEXTW, statusbar_item_number, ctypes.addressof(text_buffer))
                    retval = '{}'.format(text_buffer[:length])
                    return retval
            
                def EnumCallback(hwnd, lparam):
                    curr_class = create_unicode_buffer(256)
                    GetClassNameW(hwnd, curr_class, 256)
                    if curr_class.value.lower() == "msctls_statusbar32":
                        npp_get_statusbar.STATUSBAR_HANDLE = hwnd
                        return False  # stop the enumeration
                    return True  # continue the enumeration
            
                npp_hwnd = FindWindowW(u"Notepad++", None)
                EnumChildWindows(npp_hwnd, WNDENUMPROC(EnumCallback), 0)
                if npp_get_statusbar.STATUSBAR_HANDLE: return get_result_from_statusbar(statusbar_item_number)
                assert False
            
            St_bar = npp_get_statusbar(4)  # Zone 4 ( STATUSBARSECTION.UNICODETYPE )
            

            See next post for continuation !

            • guy038

              Hi, @alan-kilborn and All,

              Continuation of the script :

              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              def number(occ):
                  # Callback for editor.research() : called once per match, so 'num' ends up as the match count
                  global num
                  num += 1
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Curr_encoding = str(notepad.getEncoding())
              
              if Curr_encoding == 'ENC8BIT':
                  Curr_encoding = 'ANSI'
              
              if Curr_encoding == 'COOKIE':
                  Curr_encoding = 'UTF-8'
              
              if Curr_encoding == 'UTF8':
                  Curr_encoding = 'UTF-8-BOM'
              
              if Curr_encoding == 'UCS2BE':
                  Curr_encoding = 'UTF-16 BE BOM'
              
              if Curr_encoding == 'UCS2LE':
                  Curr_encoding = 'UTF-16 LE BOM'
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                  Line_title = 95
              else:
                  Line_title = 75
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              File_name = notepad.getCurrentFilename()
              
              if os.path.isfile(File_name) == True:
              
                  Creation_date = time.ctime(os.path.getctime(File_name))
              
                  Modif_date = time.ctime(os.path.getmtime(File_name))
              
                  Size_length = os.path.getsize(File_name)
              
                  RO_flag = 'YES'
              
                  if os.access(File_name, os.W_OK):
                      RO_flag = 'NO'
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              RO_editor = 'NO'
              
              if editor.getReadOnly() == True:
                  RO_editor = 'YES'
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              if notepad.getCurrentView() == 0:
                  Curr_view = 'MAIN View'
              else:
                  Curr_view = 'SECONDARY view'
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Curr_lang = notepad.getCurrentLang()
              
              Lang_desc = notepad.getLanguageDesc(Curr_lang)
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              if editor.getEOLMode() == 0:
                  Curr_eol = 'Windows (CR LF)'
              
              if editor.getEOLMode() == 1:
                  Curr_eol = 'Macintosh (CR)'
              
              if editor.getEOLMode() == 2:
                  Curr_eol = 'Unix (LF)'
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Curr_wrap = 'NO'
              
              if editor.getWrapMode() == 1:
                  Curr_wrap = 'YES'
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              if Curr_encoding == 'ANSI':
                  editor.research(r'[^\r\n]', number)
              
              if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                  editor.research(r'(?![\r\n])[\x{0000}-\x{007F}]', number)
              
              Total_1_byte = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                  editor.research(r'[\x{0080}-\x{07FF}]', number)
              
              if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                  editor.research(r'(?![\r\n\x{D800}-\x{DFFF}])[\x{0000}-\x{FFFF}]', number)  #  ALL BMP vchars ( With PYTHON, the [^\r\n\x{D800}-\x{DFFF}] syntax does NOT work properly !)
              
              Total_2_bytes = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                  editor.research(r'(?![\x{D800}-\x{DFFF}])[\x{0800}-\x{FFFF}]', number)
              
              Total_3_bytes = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Total_BMP = Total_1_byte + Total_2_bytes + Total_3_bytes
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              num = 0
              editor.research(r'[^\r\n]', number)
              
              Total_standard = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Total_4_bytes = 0  #  By default
              
              if Curr_encoding != 'ANSI':
                  Total_4_bytes = Total_standard - Total_BMP
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              editor.research(r'\r|\n', number)
              
              Total_EOL = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Total_chars = Total_EOL + Total_standard
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              if Curr_encoding == 'ANSI':
                  Bytes_length = Total_EOL + Total_1_byte
              
              if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                  Bytes_length = Total_EOL + Total_1_byte + 2 * Total_2_bytes + 3 * Total_3_bytes + 4 * Total_4_bytes
              
              if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                  Bytes_length = 2 * Total_EOL + 2 * Total_BMP + 4 * Total_4_bytes
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              BOM = 0  #  Default ANSI and UTF-8
              
              if Curr_encoding == 'UTF-8-BOM':
                  BOM = 3
              
              if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                  BOM = 2
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Buffer_length = Bytes_length + BOM
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              editor.research(r'[^\r\n\t\x20]', number)
              
              Non_blank_chars = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              editor.research(r'\w+', number)
              
              Words_count = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              
              if Curr_encoding == 'ANSI':
                  editor.research(r'((?!\s).)+', number)
              else:
                  editor.research(r'((?!\s).[\x{D800}-\x{DFFF}]?)+', number)
              
              Non_space_count = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              if Curr_encoding == 'ANSI':
                  editor.research(r'(?<!\f)^(?:\r\n|\r|\n)', number)
              else:
                  editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^(?:\r\n|\r|\n)', number)
              
              Empty_lines = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              if Curr_encoding == 'ANSI':
                  editor.research(r'(?<!\f)^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
              else:
                  editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
              
              Blank_lines = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Emp_blk_lines = Empty_lines + Blank_lines
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              num = 0
              if Curr_encoding == 'ANSI':
                  editor.research(r'(?-s)\r\n|\r|\n|(?:.|\f)\z', number)
              else:
                  editor.research(r'(?-s)\r\n|\r|\n|(?:.|[\f\x{0085}\x{2028}\x{2029}])\z', number)
              
              Total_lines = num
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Non_blk_lines = Total_lines - Emp_blk_lines
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              Num_sel = editor.getSelections()  # Get ALL selections ( EMPTY or NOT )
              
              # print ('Res = ', Num_sel)
              
              if Num_sel != 0:
              
                  Bytes_count = 0
                  Chars_count = 0
              
                  for n in range(Num_sel):
              
                      Bytes_count += editor.getSelectionNEnd(n) - editor.getSelectionNStart(n)
              
                      Chars_count += editor.countCharacters(editor.getSelectionNStart(n), editor.getSelectionNEnd(n))
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
                  if Chars_count < 2:
                      Txt_chars = ' selected char ('
              
                  else:
                      Txt_chars = ' selected chars ('
              
              
                  if Bytes_count < 2:
                      Txt_bytes = ' selected byte) in '
              
                  else:
                      Txt_bytes = ' selected bytes) in '
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
                  if Num_sel < 2 and Bytes_count == 0:
                      Txt_ranges = ' EMPTY range\n'
              
                  if Num_sel < 2 and Bytes_count > 0:
                      Txt_ranges = ' range\n'
              
                  if Num_sel > 1 and Bytes_count == 0:
                      Txt_ranges = ' EMPTY ranges\n'
              
                  if Num_sel > 1 and Bytes_count > 0:
                      Txt_ranges = ' ranges (EMPTY or NOT)\n'
              
              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
              
              line_list = []  # empty list
              
              line_list.append ('-' * Line_title)
              
              line_list.append (' ' * ((Line_title - 37) / 2) + 'SUMMARY on ' + str(datetime.datetime.now()))
              
              line_list.append ('-' * Line_title +'\n')
              
              line_list.append (' FULL File Path    :  ' + File_name + '\n')
              
              if os.path.isfile(File_name) == True:
              
                  line_list.append(' CREATION     Date :  ' + Creation_date)
              
                  line_list.append(' MODIFICATION Date :  ' + Modif_date + '\n')
              
                  line_list.append(' READ-ONLY flag    :  ' + RO_flag )
              
              line_list.append (' READ-ONLY editor  :  ' + RO_editor + '\n\n')
              
              line_list.append (' Current VIEW      :  ' + Curr_view + '\n')
              
              line_list.append (' Current ENCODING  :  ' + Curr_encoding + '\n')
              
              line_list.append (' Current LANGUAGE  :  ' + str(Curr_lang) + '  (' + Lang_desc + ')\n')
              
              line_list.append (' Current Line END  :  ' + Curr_eol + '\n')
              
              line_list.append (' Current WRAPPING  :  ' + Curr_wrap + '\n\n')
              
              line_list.append (' 1-BYTE  Chars     :  ' + str(Total_1_byte))
              
              line_list.append (' 2-BYTES Chars     :  ' + str(Total_2_bytes))
              
              line_list.append (' 3-BYTES Chars     :  ' + str(Total_3_bytes) + '\n')
              
              line_list.append (' Sum BMP Chars     :  ' + str(Total_BMP))
              
              line_list.append (' 4-BYTES Chars     :  ' + str(Total_4_bytes) + '\n')
              
              line_list.append (' CHARS w/o CR & LF :  ' + str(Total_standard))
              
              line_list.append (' EOL ( CR or LF )  :  ' + str(Total_EOL) + '\n')
              
              line_list.append (' TOTAL characters  :  ' + str(Total_chars) + '\n\n')
              
              if Curr_encoding == 'ANSI':
                  line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' x 1 + ' + str(Total_1_byte) + ' x 1b)')
              
              if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                  line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' x 1 + ' + str(Total_1_byte) + ' x 1b + '\
                  + str(Total_2_bytes) + ' x 2b + ' + str(Total_3_bytes) + ' x 3b + ' + str(Total_4_bytes) + ' x 4b)')
              
              if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                  line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' x 2 + ' + str(Total_BMP) + ' x 2b + ' + str(Total_4_bytes) + ' x 4b)')
              
              line_list.append (' Byte Order Mark   :  ' + str(BOM) + '\n')
              
              line_list.append (' BUFFER Length     :  ' + str(Buffer_length))
              
              if os.path.isfile(File_name) == True:
                  line_list.append (' Length on DISK    :  ' + str(Size_length) + '\n\n')
              else:
                  line_list.append ('\n')
              
              line_list.append (' NON-Blank Chars   :  ' + str(Non_blank_chars) + '\n')
              
              line_list.append (' WORDS     Count   :  ' + str(Words_count) + ' (Caution !)\n')
              
              line_list.append (' NON-SPACE Count   :  ' + str(Non_space_count) + '\n\n')
              
              line_list.append (' True EMPTY lines  :  ' + str(Empty_lines))
              
              line_list.append (' True BLANK lines  :  ' + str(Blank_lines) + '\n')
              
              line_list.append (' EMPTY/BLANK lines :  ' + str(Emp_blk_lines) + '\n')
              
              line_list.append (' NON-BLANK lines   :  ' + str(Non_blk_lines))
              
              line_list.append (' TOTAL Lines       :  ' + str(Total_lines) + '\n\n')
              
              line_list.append (' SELECTION(S)      :  ' + str(Chars_count) + Txt_chars + str(Bytes_count) + Txt_bytes + str(Num_sel) + Txt_ranges)
              
              editor.copyText ('\r\n'.join(line_list))
              
              notepad.new()
              
              editor.paste()
              
              editor.copyText('')
              
              if St_bar != 'ANSI' and St_bar != 'UTF-8' and St_bar != 'UTF-8-BOM' and St_bar != 'UTF-16 BE BOM' and St_bar != 'UTF-16 LE BOM':
              
                  if Curr_encoding == 'UTF-8':  #  SAME value for both an 'UTF-8' or 'ANSI' file, when RE-INTERPRETED with the 'Encoding > Character Set > ...' feature
              
                      notepad.prompt ('CURRENT file re-interpreted as ' + St_bar + '  =>  Possible ERRONEOUS results' + \
                                      '\nSo, CLOSE the file WITHOUT saving, RESTORE it (CTRL + SHIFT + T) and RESTART script', '!!! WARNING !!!', '')
              
              # ----Aé☀𝜜-----------------------------------------------------------------------------------------------------------------------------------------------------
              

              If you’re still working or doing tests with a N++ version prior to v8.0 :

              • First, replace every UTF-16 substring with UCS-2 in the Python script

              • And, of course, do not forget to remove any character above \x{FFFF} from your UCS-2 BE/LE BOM encoded files before using this script
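
If you would rather check a text for such characters programmatically, here is a minimal sketch in plain Python 3, meant to be run outside of the summary script. The helper name find_non_bmp is hypothetical. It lists every character whose code point is above U+FFFF, i.e. the ones that a UCS-2 encoded file cannot hold:

```python
def find_non_bmp(text):
    """Return (index, character) pairs for code points above U+FFFF."""
    return [(index, char) for index, char in enumerate(text) if ord(char) > 0xFFFF]


# Test string taken from this thread: only the last character lies above U+FFFF
sample = 'Aé☀𝜜'
hits = find_non_bmp(sample)
```

An empty list means that the text is safe for a UCS-2 BE/LE BOM file.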


              Note that the encoding problem described two posts ago, which occurs when re-interpreting any file without a BOM through an Encoding > Character Set > ... entry, still remains. Thus, the warning prompt is still present at the end of this final version !
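
To picture why a re-interpretation can mislead the summary, here is a rough illustration in plain Python 3. It is only an analogy and makes no claim about how Notepad++ works internally. The same UTF-8 bytes, decoded as code page 437 (the OEM-US character set), give a different number of characters, although the bytes themselves are unchanged:

```python
# The four test characters used earlier in this thread
original = 'Aé☀𝜜'

# Bytes of a UTF-8 file without BOM (several bytes for each non-ASCII character)
raw = original.encode('utf-8')

# Decoding the same bytes as code page 437: every single byte becomes one character
reinterpreted = raw.decode('cp437')

chars_before = len(original)
chars_after = len(reinterpreted)

# Encoding back with cp437 restores the very same bytes
round_trip = reinterpreted.encode('cp437')
```

Here chars_after equals the number of bytes, not the number of original characters, which is the kind of discrepancy the warning prompt is meant to flag.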


              Now, I’m going to update an old post where I explained the poor performance of the present summary feature. I’ll take the opportunity to include the instructions for understanding this improved script !

              Best Regards,

              guy038

              • Alan Kilborn @guy038

                @guy038

                You have this line in your script:

                line_list.append (' ' * ((Line_title - 37) / 2) + 'SUMMARY on ' + str(datetime.datetime.now()))
                

                I would suggest changing it to:

                line_list.append (' ' * int((Line_title - 37) / 2) + 'SUMMARY on ' + str(datetime.datetime.now()))
                

                This is because, without the int, under Python3 we see the following error:

                TypeError: can't multiply sequence by non-int of type 'float'
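
A minimal sketch of the difference, in plain Python 3. The value given to Line_title is hypothetical, as the real one comes from the script. Floor division with // is shown as an alternative to int() and, for non-negative widths, it gives the same padding:

```python
Line_title = 95  # hypothetical width, for illustration only

# Python 3: "/" always returns a float, and a string cannot be repeated a float number of times
try:
    padding = ' ' * ((Line_title - 37) / 2)
    error_message = ''
except TypeError as exc:
    error_message = str(exc)

# Fix 1: convert the quotient to an integer
padding_int = ' ' * int((Line_title - 37) / 2)

# Fix 2: floor division already returns an integer
padding_floor = ' ' * ((Line_title - 37) // 2)
```

Under Python 2, / between two integers already returned an integer, which is why the original line worked there.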
                
                • guy038

                  Hi, @alan-kilborn and All,

                  Just follow this link to find out why I decided to improve the View > Summary feature and to get the latest version of the Python script, which gives us a decent and exact Summary feature !

                  https://community.notepad-plus-plus.org/post/92794 ( 4 posts )

                  BR

                  guy038

                  • Alan Kilborn @guy038

                    @guy038 said:

                    Just follow this link

                    I’m MIGHTY confused as to why you felt the need to reanimate a several-years-old topic/thread to continue discussing what you dedicated this current thread to…
                    Why not just keep talking here?

                    • guy038
                      last edited by guy038

                      Hello, @alan-kilborn,

                      Sorry for confusing you. I’ll try to explain why I wanted to continue on the other thread !

                      • Firstly, I wanted to show where my script came from and why : the whole logic of the View > Summary feature needed to be completely rebuilt :-((

                      • Secondly, I wanted to update these old posts. Indeed, at that time, the v7.9.1 N++ version had just been released. So, I recently did some tests to verify whether, following the encoding improvements of the v8.0 version, the overall logic of the summary had been improved. Unfortunately, the View > Summary feature still gives wrong results, especially when the present file is a UTF-16 BE BOM or UTF-16 LE BOM encoded file :-((

                      • Thus, it seemed obvious to me to continue on that older thread and add the successive versions of my script !


                      Now, I realize that I could have stayed with this new thread and put a link to my initial post to help people understand the reasons for this Python script !

                      So, unless you’re terribly upset by my decision ( changing it would need a lot of modifications ), I suppose that I’ll go on posting any new versions of my script on the other thread !

                      To make things clearer, I could simply rename this present thread as Summary feature improvement and rename the other thread as Emulation of the "Summary" feature with Python script

                      Alan, what do you think of ?

                      Best Regards,

                      guy038

                      • Alan Kilborn @guy038

                        @guy038 said in Improved version of the "Summary" feature, ...:

                        what do you think of ?

                        I wouldn’t bother trying to rename things at this point.
                        It’s no problem simply because I was confused (that’s MY problem). :-)
                        Carry on… :-)
