"Summary" feature improvement

  • G
    guy038
    last edited by guy038 Feb 6, 2024, 9:01 AM Feb 5, 2024, 3:40 PM

    Hi, Alan and All,

    ( Continuation of the previous post )

    Now, I’ve come across a problem with the encodings !

    Have you ever noticed that, when you re-interpret the present encoding of a file with the Encoding > Character Set > ... feature, there are two possible scenarios ?

    • A) - The present encoding is a Unicode encoding with a BOM ( Byte Order Mark ). So, either the UTF-8-BOM, UCS-2 BE BOM or UCS-2 LE BOM encoding

    • B) - The present encoding is ANSI or UTF-8, so without a BOM

    In the first case, whatever the new encoding chosen ( one-byte or two-byte encoding ), the file contents do not change and my script just respects the real encoding of the current file

    For example, with a UCS-2 LE BOM encoded file, if I change its encoding to Encoding > Character Set > Western European > OEM-US, my new summary just considers that it’s still a true UCS-2 LE BOM encoded file, leading to a correct summary report !

    In the second case, the new encoding chosen does modify the current file contents in the editor window. In addition, the file is then automatically treated as a UTF-8 encoded file, leading to erroneous results in the summary report :-( However, the current file contents, saved on the disk, still seem unchanged !!

    For instance :

    • Open a new tab

    • Use the Encoding > Convert to UTF-8 feature, if necessary

    • Enter the four chars Aé☀𝜜, without any line-break, at the end

    • Save this file as Test-UTF8.txt

    • Using my script, you get, in a new tab :

    ---------------------------------------------------------------------------------------------
                                SUMMARY on 2024-02-05 16:50:23.656000
    ---------------------------------------------------------------------------------------------
    
     FULL File Path    :  D:\@@\792\Test-UTF8.txt
    
     CREATION     Date :  Mon Feb  5 16:45:24 2024
     MODIFICATION Date :  Mon Feb  5 15:17:02 2024
    
     READ-ONLY flag    :  NO
     READ-ONLY editor  :  NO
    
    
     Current VIEW      :  MAIN View
    
     Current ENCODING  :  UTF-8
    
     Current LANGUAGE  :  TXT  (Normal text file)
    
     Current Line END  :  Windows (CR LF)
    
     Current WRAPPING  :  YES
    
    
     1-BYTE  Chars     :  1
     2-BYTES Chars     :  1
     3-BYTES Chars     :  1
    
     Sum BMP Chars     :  3
     4-BYTES Chars     :  1
    
     CHARS w/o CR & LF :  4
     EOL ( CR or LF )  :  0
    
     TOTAL characters  :  4
    
    
     BYTES Length      :  10 (0 * 1 + 1 * 1b + 1 * 2b + 1 * 3b + 1 * 4b)
     Byte Order Mark   :  0
    
     BUFFER Length     :  10
     Length on DISK    :  10
    
    
     NON-Blank Chars   :  4
    
     WORDS     Count   :  1 (Caution !)
    
     NON-SPACE Count   :  1
    
    
     True EMPTY lines  :  0
     True BLANK lines  :  0
    
     EMPTY/BLANK lines :  0
    
     NON-BLANK lines   :  1
     TOTAL Lines       :  1
    
    
     SELECTION(S)      :  0 selected char (0 selected byte) in 1 EMPTY range
    

    Everything is OK ( buffer length and length on disk are identical and the bytes length description shows one char for each number of bytes, without any EOL )
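
As a cross-check of the BYTES Length line, here is a minimal sketch in plain Python 3 ( run outside Notepad++, it is not part of my script ) that groups the characters of the test string by their UTF-8 width. The helper name `utf8_width_histogram` is made up for this illustration.

```python
from collections import Counter

def utf8_width_histogram(text):
    """Count the characters of `text` by the number of bytes each needs in UTF-8."""
    return Counter(len(ch.encode("utf-8")) for ch in text)

# The four test characters: A, é (U+00E9), ☀ (U+2600) and 𝜜 (U+1D71C)
sample = "A\u00e9\u2600\U0001d71c"

histogram = utf8_width_histogram(sample)
total_bytes = sum(width * count for width, count in histogram.items())

print(dict(histogram))   # one character for each width from 1 to 4
print(total_bytes)       # 1 + 2 + 3 + 4 bytes
```

The sum of width x count is the same arithmetic as the one shown in parentheses on the BYTES Length line.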

    • Now, switch back to the Test-UTF8.txt file

    • Run the Encoding > Character Set > Western European > OEM-US feature

    • Re-run my script. This time, in another new tab, you get :

    ---------------------------------------------------------------------------------------------
                                SUMMARY on 2024-02-05 16:51:16.937000
    ---------------------------------------------------------------------------------------------
    
     FULL File Path    :  D:\@@\792\Test-UTF8.txt
    
     CREATION     Date :  Mon Feb  5 16:45:24 2024
     MODIFICATION Date :  Mon Feb  5 15:17:02 2024
    
     READ-ONLY flag    :  NO
     READ-ONLY editor  :  NO
    
    
     Current VIEW      :  MAIN View
    
     Current ENCODING  :  UTF-8
    
     Current LANGUAGE  :  TXT  (Normal text file)
    
     Current Line END  :  Windows (CR LF)
    
     Current WRAPPING  :  YES
    
    
     1-BYTE  Chars     :  1
     2-BYTES Chars     :  6
     3-BYTES Chars     :  3
    
     Sum BMP Chars     :  10
     4-BYTES Chars     :  0
    
     CHARS w/o CR & LF :  10
     EOL ( CR or LF )  :  0
    
     TOTAL characters  :  10
    
    
     BYTES Length      :  22 (0 * 1 + 1 * 1b + 6 * 2b + 3 * 3b + 0 * 4b)
     Byte Order Mark   :  0
    
     BUFFER Length     :  22
     Length on DISK    :  10
    
    
     NON-Blank Chars   :  10
    
     WORDS     Count   :  2 (Caution !)
    
     NON-SPACE Count   :  1
    
    
     True EMPTY lines  :  0
     True BLANK lines  :  0
    
     EMPTY/BLANK lines :  0
    
     NON-BLANK lines   :  1
     TOTAL Lines       :  1
    
    
     SELECTION(S)      :  0 selected char (0 selected byte) in 1 EMPTY range
    

    And, at the same time, a prompt displays this warning :

    CURRENT file re-interpreted as OEM-US => Possible ERRONEOUS results
    So, CLOSE the file WITHOUT saving, RESTORE it (CTRL + SHIFT + T) and RESTART script

    Indeed, this time, as the file contents are unchanged, the length on DISK is still correct but the BUFFER length is wrong, due to the re-interpretation of the characters by the OEM-US encoding. That’s why I preferred to add this warning at the end of the script !
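
The wrong BUFFER length can be reproduced outside Notepad++. The sketch below ( plain Python 3 ) rests on two assumptions of mine: that OEM-US corresponds to code page 437 ( Python's `cp437` codec ), and that the editor buffer holds the re-interpreted text as UTF-8. Both are consistent with the figures of the second report, but they are not taken from the Notepad++ sources.

```python
# The 10 bytes saved on disk for the string A, é, ☀, 𝜜 in UTF-8
on_disk = "A\u00e9\u2600\U0001d71c".encode("utf-8")

# Assumption: re-interpreting as OEM-US reads every byte as one cp437 character
reinterpreted = on_disk.decode("cp437")

# Assumption: the editor buffer then stores these characters as UTF-8
in_buffer = reinterpreted.encode("utf-8")

print(len(on_disk))         # length on disk, unchanged
print(len(reinterpreted))   # one character per original byte
print(len(in_buffer))       # what the script reports as buffer length
```

Each of the 10 bytes becomes one character, and nine of them need 2 or 3 bytes once stored as UTF-8, hence the larger buffer.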

    Now, do as the prompt says :

    • Close the Test-UTF8.txt file ( Ctrl + W )

    • Restore it ( Ctrl + Shift + T )

    • Again, you get the UTF-8 indication, for the Test-UTF8.txt file, at the right of the status bar

    • Re-run my script

    => This time, we again get a correct summary, without any prompt !


    Alan or other Python gurus, feel free to improve this last version and/or to check, on various files, that all the numbers shown are consistent !

    Best Regards,

    guy038

    • G
      guy038
      last edited by guy038 Feb 6, 2024, 2:36 PM Feb 6, 2024, 10:07 AM

      Hi All,

      I’ve just realized that, up to now, I simply improved my script with an old version of N++ ( v7.9.2 ). I apologize…

      So, I’m first going to update my last portable version, on my W10 laptop, from v8.5.4 to the v8.6.2 version and I will update my script and redo all the tests

      See you later !

      BR

      guy038

      • A
        Alan Kilborn @guy038
        last edited by Feb 6, 2024, 12:31 PM

        @guy038 said in Improved version of the "Summary" feature, ...:

        I’ve just realized that, up to now, I simply improved my script with an old version of N++ ( v7.9.2 ). I apologize…

        :-(

        You ought to close out these ancient versions…permanently.

        • G
          guy038
          last edited by guy038 Feb 10, 2024, 10:22 AM Feb 9, 2024, 2:53 PM

          Hello, @alan-kilborn and All,

          I’ve just discovered that, since the v8.0 N++ version, the UCS-2 BE BOM and UCS-2 LE BOM encodings are able to handle all the characters over the BMP. Thus, these encodings were renamed, respectively, as UTF-16 BE BOM and UTF-16 LE BOM !

          Note that, with these two encodings, each character with code > \x{FFFF} is built with the surrogate pair mechanism, so with two 16-bit code units. Consequently, the total number of bytes in the buffer = 2 (BOM) + number of chars <= \x{FFFF} x 2 + number of chars > \x{FFFF} x 4

          For example, the simple string Aé☀𝜜, without any EOL, in a UTF-16 BE BOM encoded file, is coded with 12 bytes as :

          
          FE FF 00 41 00 E9 26 00 D8 35 DF 1C
          ----- ----- ----- ----- -----------
           BOM    A     é     ☀       𝜜
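
The byte layout above can be verified with a few lines of plain Python 3 ( run outside Notepad++ ). This is only a sketch: the names `bmp`, `non_bmp` and `expected_length` are illustrative and do not come from my script.

```python
import codecs

# The four test characters: A, é (U+00E9), ☀ (U+2600) and 𝜜 (U+1D71C)
sample = "A\u00e9\u2600\U0001d71c"

# "utf-16-be" writes no BOM by itself, so the BOM is prepended explicitly
encoded = codecs.BOM_UTF16_BE + sample.encode("utf-16-be")

bmp = sum(1 for ch in sample if ord(ch) <= 0xFFFF)   # chars coded on 2 bytes
non_bmp = len(sample) - bmp                          # chars coded as a surrogate pair
expected_length = 2 + 2 * bmp + 4 * non_bmp          # BOM + BMP chars + non-BMP chars

print(" ".join("{:02X}".format(b) for b in encoded))
print(len(encoded), expected_length)
```

The last line prints the same value twice, which confirms the formula 2 (BOM) + 2 x BMP chars + 4 x chars over the BMP.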
          
          

          So, here is my final and updated version of the script, which works in all versions since the v8.0 one !

          # encoding=utf-8
          
          #-------------------------------------------------------------------------
          #                    STATISTICS about the CURRENT file ( v0.5 )
          #-------------------------------------------------------------------------
          
          from __future__ import print_function    # for Python2 compatibility
          
          from Npp import *
          
          import re
          
          import os, time, datetime
          
          import ctypes
          
          from ctypes.wintypes import BOOL, HWND, WPARAM, LPARAM, UINT
          
          # --------------------------------------------------------------------------------------------------------------------------------------------------------------
          #  From @alan-kilborn, in post https://community.notepad-plus-plus.org/topic/21733/pythonscript-different-behavior-in-script-vs-in-immediate-mode/4
          # --------------------------------------------------------------------------------------------------------------------------------------------------------------
          
          def npp_get_statusbar(statusbar_item_number):
          
              WNDENUMPROC = ctypes.WINFUNCTYPE(BOOL, HWND, LPARAM)
              FindWindowW = ctypes.windll.user32.FindWindowW
              FindWindowExW = ctypes.windll.user32.FindWindowExW
              SendMessageW = ctypes.windll.user32.SendMessageW
              LRESULT = LPARAM
              SendMessageW.restype = LRESULT
              SendMessageW.argtypes = [ HWND, UINT, WPARAM, LPARAM ]
              EnumChildWindows = ctypes.windll.user32.EnumChildWindows
              GetClassNameW = ctypes.windll.user32.GetClassNameW
              create_unicode_buffer = ctypes.create_unicode_buffer
          
              SBT_OWNERDRAW = 0x1000
              WM_USER = 0x400; SB_GETTEXTLENGTHW = WM_USER + 12; SB_GETTEXTW = WM_USER + 13
          
              npp_get_statusbar.STATUSBAR_HANDLE = None
          
              def get_result_from_statusbar(statusbar_item_number):
                  assert statusbar_item_number <= 5
                  retcode = SendMessageW(npp_get_statusbar.STATUSBAR_HANDLE, SB_GETTEXTLENGTHW, statusbar_item_number, 0)
                  length = retcode & 0xFFFF
                  type = (retcode >> 16) & 0xFFFF
                  assert (type != SBT_OWNERDRAW)
                  text_buffer = create_unicode_buffer(length)
                  retcode = SendMessageW(npp_get_statusbar.STATUSBAR_HANDLE, SB_GETTEXTW, statusbar_item_number, ctypes.addressof(text_buffer))
                  retval = '{}'.format(text_buffer[:length])
                  return retval
          
              def EnumCallback(hwnd, lparam):
                  curr_class = create_unicode_buffer(256)
                  GetClassNameW(hwnd, curr_class, 256)
                  if curr_class.value.lower() == "msctls_statusbar32":
                      npp_get_statusbar.STATUSBAR_HANDLE = hwnd
                      return False  # stop the enumeration
                  return True  # continue the enumeration
          
              npp_hwnd = FindWindowW(u"Notepad++", None)
              EnumChildWindows(npp_hwnd, WNDENUMPROC(EnumCallback), 0)
              if npp_get_statusbar.STATUSBAR_HANDLE: return get_result_from_statusbar(statusbar_item_number)
              assert False
          
          St_bar = npp_get_statusbar(4)  # Zone 4 ( STATUSBARSECTION.UNICODETYPE )
          

          See next post for continuation !

          • G
            guy038
            last edited by Feb 9, 2024, 2:57 PM

            Hi, @alan-kilborn and All,

            Continuation of the script :

            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            def number(occ):
                global num
                num += 1
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Curr_encoding = str(notepad.getEncoding())
            
            if Curr_encoding == 'ENC8BIT':
                Curr_encoding = 'ANSI'
            
            if Curr_encoding == 'COOKIE':
                Curr_encoding = 'UTF-8'
            
            if Curr_encoding == 'UTF8':
                Curr_encoding = 'UTF-8-BOM'
            
            if Curr_encoding == 'UCS2BE':
                Curr_encoding = 'UTF-16 BE BOM'
            
            if Curr_encoding == 'UCS2LE':
                Curr_encoding = 'UTF-16 LE BOM'
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                Line_title = 95
            else:
                Line_title = 75
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            File_name = notepad.getCurrentFilename()
            
            if os.path.isfile(File_name) == True:
            
                Creation_date = time.ctime(os.path.getctime(File_name))
            
                Modif_date = time.ctime(os.path.getmtime(File_name))
            
                Size_length = os.path.getsize(File_name)
            
                RO_flag = 'YES'
            
                if os.access(File_name, os.W_OK):
                    RO_flag = 'NO'
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            RO_editor = 'NO'
            
            if editor.getReadOnly() == True:
                RO_editor = 'YES'
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            if notepad.getCurrentView() == 0:
                Curr_view = 'MAIN View'
            else:
                Curr_view = 'SECONDARY view'
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Curr_lang = notepad.getCurrentLang()
            
            Lang_desc = notepad.getLanguageDesc(Curr_lang)
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            if editor.getEOLMode() == 0:
                Curr_eol = 'Windows (CR LF)'
            
            if editor.getEOLMode() == 1:
                Curr_eol = 'Macintosh (CR)'
            
            if editor.getEOLMode() == 2:
                Curr_eol = 'Unix (LF)'
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Curr_wrap = 'NO'
            
            if editor.getWrapMode() == 1:
                Curr_wrap = 'YES'
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            if Curr_encoding == 'ANSI':
                editor.research(r'[^\r\n]', number)
            
            if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                editor.research(r'(?![\r\n])[\x{0000}-\x{007F}]', number)
            
            Total_1_byte = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                editor.research(r'[\x{0080}-\x{07FF}]', number)
            
            if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                editor.research(r'(?![\r\n\x{D800}-\x{DFFF}])[\x{0000}-\x{FFFF}]', number)  #  ALL BMP vchars ( With PYTHON, the [^\r\n\x{D800}-\x{DFFF}] syntax does NOT work properly !)
            
            Total_2_bytes = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                editor.research(r'(?![\x{D800}-\x{DFFF}])[\x{0800}-\x{FFFF}]', number)
            
            Total_3_bytes = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Total_BMP = Total_1_byte + Total_2_bytes + Total_3_bytes
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            num = 0
            editor.research(r'[^\r\n]', number)
            
            Total_standard = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Total_4_bytes = 0  #  By default
            
            if Curr_encoding != 'ANSI':
                Total_4_bytes = Total_standard - Total_BMP
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            editor.research(r'\r|\n', number)
            
            Total_EOL = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Total_chars = Total_EOL + Total_standard
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            if Curr_encoding == 'ANSI':
                Bytes_length = Total_EOL + Total_1_byte
            
            if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                Bytes_length = Total_EOL + Total_1_byte + 2 * Total_2_bytes + 3 * Total_3_bytes + 4 * Total_4_bytes
            
            if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                Bytes_length = 2 * Total_EOL + 2 * Total_BMP + 4 * Total_4_bytes
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            BOM = 0  #  Default ANSI and UTF-8
            
            if Curr_encoding == 'UTF-8-BOM':
                BOM = 3
            
            if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                BOM = 2
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Buffer_length = Bytes_length + BOM
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            editor.research(r'[^\r\n\t\x20]', number)
            
            Non_blank_chars = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            editor.research(r'\w+', number)
            
            Words_count = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            
            if Curr_encoding == 'ANSI':
                editor.research(r'((?!\s).)+', number)
            else:
                editor.research(r'((?!\s).[\x{D800}-\x{DFFF}]?)+', number)
            
            Non_space_count = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            if Curr_encoding == 'ANSI':
                editor.research(r'(?<!\f)^(?:\r\n|\r|\n)', number)
            else:
                editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^(?:\r\n|\r|\n)', number)
            
            Empty_lines = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            if Curr_encoding == 'ANSI':
                editor.research(r'(?<!\f)^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
            else:
                editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
            
            Blank_lines = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Emp_blk_lines = Empty_lines + Blank_lines
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            num = 0
            if Curr_encoding == 'ANSI':
                editor.research(r'(?-s)\r\n|\r|\n|(?:.|\f)\z', number)
            else:
                editor.research(r'(?-s)\r\n|\r|\n|(?:.|[\f\x{0085}\x{2028}\x{2029}])\z', number)
            
            Total_lines = num
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Non_blk_lines = Total_lines - Emp_blk_lines
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            Num_sel = editor.getSelections()  # Get ALL selections ( EMPTY or NOT )
            
            # print ('Res = ', Num_sel)
            
            if Num_sel != 0:
            
                Bytes_count = 0
                Chars_count = 0
            
                for n in range(Num_sel):
            
                    Bytes_count += editor.getSelectionNEnd(n) - editor.getSelectionNStart(n)
            
                    Chars_count += editor.countCharacters(editor.getSelectionNStart(n), editor.getSelectionNEnd(n))
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
                if Chars_count < 2:
                    Txt_chars = ' selected char ('
            
                else:
                    Txt_chars = ' selected chars ('
            
            
                if Bytes_count < 2:
                    Txt_bytes = ' selected byte) in '
            
                else:
                    Txt_bytes = ' selected bytes) in '
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
                if Num_sel < 2 and Bytes_count == 0:
                    Txt_ranges = ' EMPTY range\n'
            
                if Num_sel < 2 and Bytes_count > 0:
                    Txt_ranges = ' range\n'
            
                if Num_sel > 1 and Bytes_count == 0:
                    Txt_ranges = ' EMPTY ranges\n'
            
                if Num_sel > 1 and Bytes_count > 0:
                    Txt_ranges = ' ranges (EMPTY or NOT)\n'
            
            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
            
            line_list = []  # empty list
            
            line_list.append ('-' * Line_title)
            
            line_list.append (' ' * ((Line_title - 37) / 2) + 'SUMMARY on ' + str(datetime.datetime.now()))
            
            line_list.append ('-' * Line_title +'\n')
            
            line_list.append (' FULL File Path    :  ' + File_name + '\n')
            
            if os.path.isfile(File_name) == True:
            
                line_list.append(' CREATION     Date :  ' + Creation_date)
            
                line_list.append(' MODIFICATION Date :  ' + Modif_date + '\n')
            
                line_list.append(' READ-ONLY flag    :  ' + RO_flag )
            
            line_list.append (' READ-ONLY editor  :  ' + RO_editor + '\n\n')
            
            line_list.append (' Current VIEW      :  ' + Curr_view + '\n')
            
            line_list.append (' Current ENCODING  :  ' + Curr_encoding + '\n')
            
            line_list.append (' Current LANGUAGE  :  ' + str(Curr_lang) + '  (' + Lang_desc + ')\n')
            
            line_list.append (' Current Line END  :  ' + Curr_eol + '\n')
            
            line_list.append (' Current WRAPPING  :  ' + Curr_wrap + '\n\n')
            
            line_list.append (' 1-BYTE  Chars     :  ' + str(Total_1_byte))
            
            line_list.append (' 2-BYTES Chars     :  ' + str(Total_2_bytes))
            
            line_list.append (' 3-BYTES Chars     :  ' + str(Total_3_bytes) + '\n')
            
            line_list.append (' Sum BMP Chars     :  ' + str(Total_BMP))
            
            line_list.append (' 4-BYTES Chars     :  ' + str(Total_4_bytes) + '\n')
            
            line_list.append (' CHARS w/o CR & LF :  ' + str(Total_standard))
            
            line_list.append (' EOL ( CR or LF )  :  ' + str(Total_EOL) + '\n')
            
            line_list.append (' TOTAL characters  :  ' + str(Total_chars) + '\n\n')
            
            if Curr_encoding == 'ANSI':
                line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' x 1 + ' + str(Total_1_byte) + ' x 1b)')
            
            if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' x 1 + ' + str(Total_1_byte) + ' x 1b + '\
                + str(Total_2_bytes) + ' x 2b + ' + str(Total_3_bytes) + ' x 3b + ' + str(Total_4_bytes) + ' x 4b)')
            
            if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' x 2 + ' + str(Total_BMP) + ' x 2b + ' + str(Total_4_bytes) + ' x 4b)')
            
            line_list.append (' Byte Order Mark   :  ' + str(BOM) + '\n')
            
            line_list.append (' BUFFER Length     :  ' + str(Buffer_length))
            
            if os.path.isfile(File_name) == True:
                line_list.append (' Length on DISK    :  ' + str(Size_length) + '\n\n')
            else:
                line_list.append ('\n')
            
            line_list.append (' NON-Blank Chars   :  ' + str(Non_blank_chars) + '\n')
            
            line_list.append (' WORDS     Count   :  ' + str(Words_count) + ' (Caution !)\n')
            
            line_list.append (' NON-SPACE Count   :  ' + str(Non_space_count) + '\n\n')
            
            line_list.append (' True EMPTY lines  :  ' + str(Empty_lines))
            
            line_list.append (' True BLANK lines  :  ' + str(Blank_lines) + '\n')
            
            line_list.append (' EMPTY/BLANK lines :  ' + str(Emp_blk_lines) + '\n')
            
            line_list.append (' NON-BLANK lines   :  ' + str(Non_blk_lines))
            
            line_list.append (' TOTAL Lines       :  ' + str(Total_lines) + '\n\n')
            
            line_list.append (' SELECTION(S)      :  ' + str(Chars_count) + Txt_chars + str(Bytes_count) + Txt_bytes + str(Num_sel) + Txt_ranges)
            
            editor.copyText ('\r\n'.join(line_list))
            
            notepad.new()
            
            editor.paste()
            
            editor.copyText('')
            
            if St_bar != 'ANSI' and St_bar != 'UTF-8' and St_bar != 'UTF-8-BOM' and St_bar != 'UTF-16 BE BOM' and St_bar != 'UTF-16 LE BOM':
            
                if Curr_encoding == 'UTF-8':  #  SAME value for both an 'UTF-8' or 'ANSI' file, when RE-INTERPRETED with the 'Encoding > Character Set > ...' feature
            
                    notepad.prompt ('CURRENT file re-interpreted as ' + St_bar + '  =>  Possible ERRONEOUS results' + \
                                    '\nSo, CLOSE the file WITHOUT saving, RESTORE it (CTRL + SHIFT + T) and RESTART script', '!!! WARNING !!!', '')
            
            # ----Aé☀𝜜-----------------------------------------------------------------------------------------------------------------------------------------------------
            

            If you’re still working or doing tests with an N++ version prior to v8.0 :

            • First, replace every UTF-16 sub-string with UCS-2, in the Python script

            • And, of course, do not forget to get rid of any character over \x{FFFF} in your UCS-2 BE/LE BOM encoded files, before using this script
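
To locate the characters over \x{FFFF} before removing them, something like the following sketch may help. It is plain Python 3 working on an already decoded string, not PythonScript code, and the function name `non_bmp_chars` is hypothetical.

```python
def non_bmp_chars(text):
    """Return (index, character) pairs for every code point above U+FFFF."""
    return [(index, ch) for index, ch in enumerate(text) if ord(ch) > 0xFFFF]

# The four test characters: A, é (U+00E9), ☀ (U+2600) and 𝜜 (U+1D71C)
sample = "A\u00e9\u2600\U0001d71c"

for index, ch in non_bmp_chars(sample):
    print(index, "U+{:04X}".format(ord(ch)))
```

An empty result means that the text only contains BMP characters, so it is safe for a UCS-2 BE/LE BOM file.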


            Note that the encoding problem, described in my earlier post, when re-interpreting any file without a BOM with an Encoding > Character Set > ... encoding, still remains. Thus, the warning prompt is still present at the end of this final version !


            Now, I’m going to update an old post where I explained the poor performance of the present summary feature. I’ll take the opportunity to include the instructions for understanding this improved script !

            Best Regards,

            guy038

            • A
              Alan Kilborn @guy038
              last edited by Feb 9, 2024, 4:17 PM

              @guy038

              You have this line in your script:

              line_list.append (' ' * ((Line_title - 37) / 2) + 'SUMMARY on ' + str(datetime.datetime.now()))
              

              I would suggest changing it to:

              line_list.append (' ' * int((Line_title - 37) / 2) + 'SUMMARY on ' + str(datetime.datetime.now()))
              

              This is because, without the int, under Python3 we see the following error:

              TypeError: can't multiply sequence by non-int of type 'float'
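As a side note, here is a minimal sketch of the difference, assuming Python 3. The value given to `Line_title` is purely hypothetical (the real one is defined earlier in the script). Floor division with `//` is shown as a possible alternative to `int(...)`; it gives an integer under both Python 2 and Python 3.

```python
Line_title = 95   # hypothetical value, for illustration only

half_float = (Line_title - 37) / 2        # a float under Python 3
half_int = int((Line_title - 37) / 2)     # the suggested fix: truncate to an int
half_floor = (Line_title - 37) // 2       # alternative: floor division

try:
    ' ' * half_float                      # raises TypeError under Python 3
    float_error = None
except TypeError as exc:
    float_error = str(exc)

padding = ' ' * half_floor                # works with either integer form
print(float_error)
print(len(padding))
```

For a negative numerator, `int(x / 2)` truncates toward zero while `x // 2` rounds down, but multiplying a string by any negative integer gives an empty string, so the padding would be the same.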
              
              • G
                guy038
                last edited by Feb 10, 2024, 3:17 AM

                Hi, @alan-kilborn and All,

                Just follow this link to find out why I decided to improve the View > Summary feature and to get the latest version of the Python script, which gives us a decent and exact Summary feature !

                https://community.notepad-plus-plus.org/post/92794 ( 4 posts )

                BR

                guy038

                • A
                  Alan Kilborn @guy038
                  last edited by Feb 11, 2024, 12:04 PM

                  @guy038 said:

                  Just follow this link

                  I’m MIGHTY confused as to why you felt the need to reanimate a several-years-old topic/thread to continue discussing what you dedicated this current thread to…
                  Why not just keep talking here?

                  • G
                    guy038
                    last edited by guy038 Feb 11, 2024, 7:28 PM Feb 11, 2024, 7:26 PM

                    Hello, @alan-kilborn,

                    Sorry for confusing you. I’ll try to explain why I wanted to continue on the other thread !

                    • Firstly, I wanted to show where my script came from and why : the whole logic of the View > Summary feature needed to be completely rebuilt :-((

                    • Secondly, I wanted to update these old posts. Indeed, at that time, the v7.9.1 N++ version had just been released. So, I recently did some tests to verify whether, following the encoding improvements of the v8.0 version, the global logic of the summary had been improved. Unfortunately, the View > Summary feature still gives wrong results, especially when the present file is a UTF-16 BE BOM or UTF-16 LE BOM encoded file :-((

                    • Thus, it seemed obvious to me to continue on that thread and add the successive versions of my script !


                    Now, I realize that I could have stayed on this new thread and added a link to my initial post, to help people understand the reasons for this Python script !

                    So, unless you’re terribly upset by my decision ( undoing it would need a lot of modifications ), I suppose that I’ll go on posting any new versions of my script on the other thread !

                    To make things clearer, I could simply rename this present thread as Summary feature improvement and rename the other thread as Emulation of the "Summary" feature with Python script

                    Alan, what do you think ?

                    Best Regards,

                    guy038

                    • A
                      Alan Kilborn @guy038
                      last edited by Feb 11, 2024, 8:06 PM

                      @guy038 said in Improved version of the "Summary" feature, ...:

                      what do you think ?

                      I wouldn’t bother trying to rename things at this point.
                      It’s no problem simply because I was confused (that’s MY problem). :-)
                      Carry on… :-)
