"Summary" feature improvement

  • P
    PeterJones @guy038
    last edited by Jan 26, 2024, 3:27 PM

    @guy038 said in Improved version of the "Summary" feature, ...:

    I, of course, would prefer to paste all these results directly on the clipboard !

    If you mean you want to put the text you’ve created into the clipboard, so you can later paste it, editor.copyText(stringVar) will put the contents of stringVar into the Windows clipboard.

    • G
      guy038
      last edited by guy038 Jan 26, 2024, 4:05 PM Jan 26, 2024, 4:04 PM

      Hi, @alan-kilborn, @peterjones and All,

      Peter, thanks for the tip! I understand that I could replace each of the print lines this way. For example, the line:

      print (' TOTAL Lines       : ', Total_lines,'\n\n')
      

      would become these two:

      l = ' TOTAL Lines       : ' + str(Total_lines) + '\n\n'
      
      editor.copyText(l)
      

      But I was thinking of first storing all the results in, say, a text variable, and then using a single editor.copyText(text) command. Might that also be quicker to execute?

      BR

      guy038

      • G
        guy038
        last edited by guy038 Jan 26, 2024, 4:26 PM Jan 26, 2024, 4:25 PM

        Hi, @peterjones, and All,

        I think I understand how to do it:

        Just replace, for instance, these 3 lines:

        print (' True EMPTY lines  : ', Empty_lines)
        
        print (' True BLANK lines  : ', Blank_lines, '\n')
        
        print (' EMPTY/BLANK lines : ', Emp_blk_lines,'\n')
        

        with these 4 lines:

        t = ' True EMPTY lines  : ' + str(Empty_lines) + '\n'
        
        t = t + ' True BLANK lines  : ' + str(Blank_lines) + '\n\n'
        
        t = t + ' EMPTY/BLANK lines : ' + str(Emp_blk_lines) + '\n\n'
        
        editor.copyText(t)
        

        Is that right?

        BR

        guy038

        • P
          PeterJones @guy038
          last edited by Jan 26, 2024, 5:13 PM

          @guy038 ,

          Yes, I was thinking something along those lines.

          • A
            Alan Kilborn @PeterJones
            last edited by Alan Kilborn Jan 26, 2024, 8:47 PM Jan 26, 2024, 7:45 PM

            @guy038

            Or… you could do a list of lines, adding a line to the list each time it is calculated, then join the lines together into a string and copy that to the clipboard:

            line_list = []  # empty list
            line_list.append(' True EMPTY lines  : ' + str(Empty_lines))
            line_list.append(' True BLANK lines  : ' + str(Blank_lines))
            line_list.append(' EMPTY/BLANK lines : ' + str(Emp_blk_lines))
            editor.copyText('\r\n'.join(line_list))
            

            The line endings don’t have to match the current document, since they are going to the clipboard.
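The list-and-join idea can be sketched as a small helper. Note that each editor.copyText call replaces the clipboard contents rather than appending to them, so collecting everything first and copying once matters for correctness as well as speed. The helper name build_summary and the sample values are made up for this illustration; only the final editor.copyText call (commented out here) needs Notepad++ with PythonScript.

```python
def build_summary(rows, eol='\r\n'):
    """Join (label, value) pairs into one aligned, clipboard-ready string."""
    # Pad every label to the width of the longest one so the colons line up
    width = max(len(label) for label, _ in rows)
    lines = [' {0} : {1}'.format(label.ljust(width), value) for label, value in rows]
    return eol.join(lines)


# Hypothetical sample values, standing in for the counters of the script
rows = [
    ('True EMPTY lines', 3),
    ('True BLANK lines', 1),
    ('EMPTY/BLANK lines', 4),
]
text = build_summary(rows)

# Inside Notepad++ with PythonScript, a single call fills the clipboard:
# editor.copyText(text)
```

Keeping the rows as data until the very end also makes it easy to add or reorder entries without touching the formatting code.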

            • G
              guy038
              last edited by guy038 Jan 27, 2024, 1:11 PM Jan 27, 2024, 1:01 PM

              Hi, All,

              I’m off, tomorrow, with a dozen friends from the ski club, for three days, full board, at the ‘Villages Vacances Familles’ in Monetier-les-bains, in one of France’s great ski areas, near Briançon!

              The weather forecast for Sunday, Monday and Tuesday is fine, with sunshine most of the time but some cloud on Tuesday. The snow, although a bit hard, is there: 116 cm on the summits, from 2800 m down to 2100 m, and 40 cm in the resorts.

              To whet your appetite, here’s an interactive piste map. Once you’ve switched to full screen, you can even ignore certain markers (pistes, ski lifts or other details) by clicking on the icon at the top left !

              https://www.serre-chevalier.com/en/ski-area/interactive-trail-map

              So, see you next Tuesday evening or Wednesday!

              BR

              guy038

              You may also have a look at this site:

              https://en.wikipedia.org/wiki/Serre_Chevalier

              • A
                Alan Kilborn @guy038
                last edited by Jan 28, 2024, 1:00 PM

                @guy038 said in Improved version of the "Summary" feature, ...:

                I’m off … with a dozen friends from the ski club, for three days

                Oh, great… starts writing an interesting script, publishes it here partially-finished, then goes on vacation… :-(

                Enjoy, but I hope the script doesn’t get forgotten about!

                • G
                  guy038
                  last edited by guy038 Feb 10, 2024, 10:20 AM Feb 2, 2024, 12:20 PM

                  Hello, @alan-kilborn and All,

                  Alan, you can be reassured: I’m back. Our stay went well, although the snow was very hard after rain on the previous days. And, on the first day, on the Luc Alphand black run at Chante-Merle, I was not very proud! Basically, conditions were OK between 10:00 am and 2:30 pm at most!


                  Now, let’s get back to our beloved editor !

                  So, here is the third (and still incomplete) version of my Python script, which improves the View > Summary feature:

                  # encoding=utf-8
                  
                  #----------------------------------------------------------------------------
                  #                    STATISTICS about the CURRENT file ( v0.2 )
                  #----------------------------------------------------------------------------
                  
                  from __future__ import print_function    # for Python2 compatibility
                  
                  import re
                  
                  import os, time
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  def number(occ):
                      global num
                      num += 1
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                      Line_title = 93
                  else:
                      Line_title = 71
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  File_name = notepad.getCurrentFilename()
                  
                  if os.path.isfile(File_name) == True:
                  
                      Creation_date = time.ctime(os.path.getctime(File_name))
                  
                      Modif_date = time.ctime(os.path.getmtime(File_name))
                  
                      Size_length = os.path.getsize(File_name)
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Curr_encoding = str(notepad.getEncoding())
                  
                  if Curr_encoding == 'ENC8BIT':
                      Curr_encoding = 'ANSI'
                  
                  if Curr_encoding == 'COOKIE':
                      Curr_encoding = 'UTF-8'
                  
                  if Curr_encoding == 'UTF8':
                      Curr_encoding = 'UTF8-BOM'
                  
                  if Curr_encoding == 'UCS2BE':
                      Curr_encoding = 'UCS-2 BE BOM'
                  
                  if Curr_encoding == 'UCS2LE':
                      Curr_encoding = 'UCS-2 LE BOM'
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Curr_lang = notepad.getCurrentLang()
                  
                  Lang_desc = notepad.getLanguageDesc(Curr_lang)
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                      editor.research(r'[^\r\n]', number)
                  
                  if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                      editor.research(r'(?![\r\n])[\x{0000}-\x{007F}]', number)
                  
                  Total_1_byte = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                      editor.research(r'[\x{0080}-\x{07FF}]', number)
                  
                  if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                      editor.research(r'[^\r\n]', number)
                  
                  Total_2_bytes = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                      editor.research(r'(?![\x{D800}-\x{DFFF}])[\x{0800}-\x{FFFF}]', number)
                  
                  Total_3_bytes = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Total_BMP = Total_1_byte + Total_2_bytes + Total_3_bytes
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  num = 0
                  editor.research(r'[^\r\n]', number)
                  
                  Total_Standard = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Total_4_bytes = 0  #  By default
                  
                  if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                      Total_4_bytes = Total_Standard - Total_BMP
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  editor.research(r'\r|\n', number)
                  
                  Total_EOL = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Total_chars = Total_Standard + Total_EOL
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Bytes_length = Total_EOL + Total_1_byte  #  Default ANSI
                  
                  if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                      Bytes_length = 2 * Total_chars 
                  
                  if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                      Bytes_length = Total_EOL + Total_1_byte + 2 * Total_2_bytes + 3 * Total_3_bytes + 4 * Total_4_bytes
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  BOM = 0  #  Default ANSI and UTF-8
                  
                  if notepad.getEncoding() == BUFFERENCODING.UTF8:
                      BOM = 3
                  
                  if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                      BOM = 2
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Buffer_length = Bytes_length + BOM
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  editor.research(r'[^\r\n\t\x20]', number)
                  
                  Non_blank_chars = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  editor.research(r'\w+', number)
                  
                  Words_count = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  
                  if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                      editor.research(r'((?!\s).[\x{D800}-\x{DFFF}]?)+', number)
                  else:
                      editor.research(r'((?!\s).)+', number)
                  
                  Non_space_count = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                      editor.research(r'(?<!\f)^(?:\r\n|\r|\n)', number)
                  else:
                      editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^(?:\r\n|\r|\n)', number)
                  
                  Empty_lines = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                      editor.research(r'(?<!\f)^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
                  else:
                      editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
                  
                  Blank_lines = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Emp_blk_lines = Empty_lines + Blank_lines
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  num = 0
                  if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                      editor.research(r'(?-s)\r\n|\r|\n|(?:.|\f)\z', number)
                  else:
                      editor.research(r'(?-s)\r\n|\r|\n|(?:.|[\f\x{0085}\x{2028}\x{2029}])\z', number)
                  
                  Total_lines = num
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Non_blk_lines = Total_lines - Emp_blk_lines
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  
                  Num_sel = editor.getSelections()  # Get ALL selections ( EMPTY or NOT )
                  
                  # Scintilla always reports at least one selection ( possibly EMPTY ),
                  # so the counters are initialised unconditionally
                  Bytes_count = 0
                  Chars_count = 0
                  
                  for n in range(Num_sel):
                  
                      Bytes_count += editor.getSelectionNEnd(n) - editor.getSelectionNStart(n)
                  
                      Chars_count += editor.countCharacters(editor.getSelectionNStart(n), editor.getSelectionNEnd(n))
                  
                  if Chars_count < 2:
                      Txt_chars = ' selected char ('
                  
                  else:
                      Txt_chars = ' selected chars ('
                  
                  
                  if Bytes_count < 2:
                      Txt_bytes = ' selected byte) in '
                  
                  else:
                      Txt_bytes = ' selected bytes) in '
                  
                  
                  if Num_sel < 2 and Bytes_count == 0:
                      Txt_ranges = ' EMPTY range\n'
                  
                  if Num_sel < 2 and Bytes_count > 0:
                      Txt_ranges = ' range\n'
                  
                  if Num_sel > 1 and Bytes_count == 0:
                      Txt_ranges = ' EMPTY ranges\n'
                  
                  if Num_sel > 1 and Bytes_count > 0:
                      Txt_ranges = ' ranges (EMPTY or NOT)\n'
                  
                  # ----Aé☀𝜜----------------------------------------------------------------------------------------------------------
                  
                  line_list = []  # empty list
                  
                  line_list.append ('-' * Line_title)
                  
                  line_list.append (' ' * ((Line_title - 7) // 2) + 'Summary')  # '//' keeps the repeat count an integer under Python 3 too
                  
                  line_list.append ('-' * Line_title +'\n')
                  
                  line_list.append (' Full File Path    : ' + File_name + '\n')
                  
                  if os.path.isfile(File_name) == True:
                      line_list.append(' Creation Date     : ' + Creation_date)
                  
                      line_list.append(' Modification Date : ' + Modif_date + '\n\n')
                  else:
                      line_list.append ('\n')
                  
                  line_list.append (' Current ENCODING  : ' + Curr_encoding + '\n')
                  
                  line_list.append (' Current LANGUAGE  : ' + str(Curr_lang) + '  (' + Lang_desc + ')\n\n')
                  
                  line_list.append (' 1-BYTE  Chars     : ' + str(Total_1_byte))
                  
                  line_list.append (' 2-BYTES Chars     : ' + str(Total_2_bytes))
                  
                  line_list.append (' 3-BYTES Chars     : ' + str(Total_3_bytes) + '\n')
                  
                  line_list.append (' Sum BMP Chars     : ' + str(Total_BMP))
                  
                  line_list.append (' 4-BYTES Chars     : ' + str(Total_4_bytes) + '\n')
                  
                  line_list.append (' Chars w/o CR & LF : ' + str(Total_Standard))
                  
                  line_list.append (' EOL ( CR or LF )  : ' + str(Total_EOL) + '\n')
                  
                  line_list.append (' TOTAL characters  : ' + str(Total_chars) + '\n\n')
                  
                  if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                      line_list.append (' BYTES Length      : ' + str(Bytes_length) + ' (1 * ' + str(Total_1_byte) + ' + 1 * ' + str(Total_EOL) + ' + 2 * ' + str(Total_2_bytes) + ' + 3 * ' + str(Total_3_bytes) + ' + 4 * ' + str(Total_4_bytes) + ')')
                  
                  if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                      line_list.append (' BYTES Length      : ' + str(Bytes_length) + ' (2 * ' + str(Total_chars) + ')')
                  
                  if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                      line_list.append (' BYTES Length      : ' + str(Bytes_length) + ' (1 * ' + str(Total_chars) + ')')
                  
                  line_list.append (' Byte Order Mark   : '+ str(BOM) + '\n')
                  
                  line_list.append (' BUFFER Length     : '+ str(Buffer_length))
                  
                  if os.path.isfile(File_name) == True:
                      line_list.append (' Length on DISK    : '+ str(Size_length) + '\n\n')
                  else:
                      line_list.append ('\n\n')
                  
                  line_list.append (' NON-Blank Chars   : ' + str(Non_blank_chars) + '\n')
                  
                  line_list.append (' Words     Count   : ' + str(Words_count) + '\n')
                  
                  line_list.append (' NON-Space Count   : ' + str(Non_space_count) + '\n\n')
                  
                  line_list.append (' True EMPTY lines  : ' + str(Empty_lines))
                  
                  line_list.append (' True BLANK lines  : ' + str(Blank_lines) + '\n')
                  
                  line_list.append (' EMPTY/BLANK lines : ' + str(Emp_blk_lines) + '\n')
                  
                  line_list.append (' NON-BLANK lines   : ' + str(Non_blk_lines))
                  
                  line_list.append (' TOTAL Lines       : ' + str(Total_lines) + '\n\n')
                  
                  line_list.append (' SELECTION(S)      : ' + str(Chars_count) + Txt_chars + str(Bytes_count) + Txt_bytes + str(Num_sel) + Txt_ranges)
                  
                  editor.copyText ('\r\n'.join(line_list))
                  
                  # ---------------------------------------------------------------------------------------------------------------
                  

                  Now, two points are still unclear to me:


                  - A: To get the language of the current file, I use the notepad.getCurrentLang() function. But I also saw the notepad.getLangType() function, which seems to return the same string. Which function would be best for this specific script?


                  - B: For the current encoding, I would like to read the encoding zone at the right of the status bar. I did see notepad.setStatusBar(statusBarSection, text), but I don’t see the counterpart notepad.getStatusBar(statusBarSection)!

                  For instance, with a default UTF-8 file, typing notepad.getEncoding() in the Python console returns Npp.BUFFERENCODING.COOKIE.

                  Now, if you change the way this file’s bytes are interpreted with, for example, Encoding > Character Set > Western European > OEM-US, typing notepad.getEncoding() again in the Python console still returns Npp.BUFFERENCODING.COOKIE, although I would have expected Npp.BUFFERENCODING.OEM-US or perhaps just OEM-US!

                  TIA for any hint !

                  Best Regards,

                  guy038

                  • A
                    Alan Kilborn @guy038
                    last edited by Alan Kilborn Feb 2, 2024, 1:16 PM Feb 2, 2024, 12:34 PM

                    @guy038 said:

                    notepad.getCurrentLang() versus notepad.getLangType()

                    If you’re interested in the currently active tab, you can use either one.
                    If you wanted a tab that is not the active one, notepad.getLangType() allows you to specify a buffer id for that other tab.
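To make the difference concrete, here is a minimal sketch that reports the language of every open tab, not just the active one. It assumes that notepad.getFiles() yields (filename, bufferID, index, view) tuples, as the PythonScript documentation describes. The helper name languages_of_open_files is made up for this illustration, and the notepad object is passed in as a parameter so the function can also be tried outside Notepad++ with a stand-in object.

```python
def languages_of_open_files(npp):
    """Map each open file name to the language of its tab.

    `npp` is expected to behave like PythonScript's `notepad` object.
    """
    result = {}
    for entry in npp.getFiles():
        filename, buffer_id = entry[0], entry[1]
        # Passing a buffer id asks about that specific tab, active or not
        result[filename] = str(npp.getLangType(buffer_id))
    return result


# Inside Notepad++ (PythonScript), assuming `notepad` is available:
# print(languages_of_open_files(notepad))
```

For a script that only ever looks at the active tab, such as the summary script in this thread, either call is enough.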


                    but I don’t see the counterpart notepad.getStatusBar(statusBarSection)

                    My script for that is HERE.


                    …still returns Npp.BUFFERENCODING.COOKIE, although I would have expected …

                    Some good “cookie” discussion is HERE, and it also includes the get-status-bar technique.

                    Some more good discussions about encoding are found in these threads:

                    • https://community.notepad-plus-plus.org/topic/25175/how-can-i-get-the-encoding-of-current-document
                    • https://community.notepad-plus-plus.org/topic/24560/new-plugin-multireplace

                    Quoting @Coises from one of those threads:

                    Notepad++ doesn’t handle character sets internally the way it appears to a user. For editing, everything is either in the user default code page (“ANSI”) or in UTF-8. Whenever you’re not using the default code page, Notepad++ uses UTF-8 (so it is possible to enter and see characters that aren’t in the code page). Translation to other code pages is done when reading and writing the file.

                    This could be the reason you aren’t obtaining the results you expect. Now, Notepad++ could, if it wanted to, provide the info that it itself knows about (from the status bar). For whatever reason, it chooses not to (probably lack of developer attention to this detail).
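Because the buffer is UTF-8 whenever it is not ANSI, the byte length of such a document follows directly from the code point of each character. The sketch below is plain Python 3 (an assumption: a narrow Python 2 build would see characters beyond the BMP as two surrogates), and utf8_class_counts is a hypothetical helper, not part of the script in this thread. It sorts characters by their UTF-8 encoded size using the same ranges as the script's regexes, except that CR and LF are counted here as ordinary 1-byte characters.

```python
def utf8_class_counts(text):
    """Count the characters of `text` by their UTF-8 encoded size (1 to 4 bytes)."""
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:          # U+0000 - U+007F
            counts[1] += 1
        elif cp < 0x800:       # U+0080 - U+07FF
            counts[2] += 1
        elif cp < 0x10000:     # U+0800 - U+FFFF (rest of the BMP)
            counts[3] += 1
        else:                  # U+10000 and above
            counts[4] += 1
    return counts


# 'A', U+00E9, U+2600 and U+1D71C: one character of each size
sample = 'A\u00e9\u2600\U0001d71c'
counts = utf8_class_counts(sample)

# The weighted sum should match what the codec itself produces
total_bytes = sum(size * n for size, n in counts.items())
matches_codec = (total_bytes == len(sample.encode('utf-8')))
```

Comparing the weighted sum with len(text.encode('utf-8')) gives an independent check on regex-based counts like the ones in the script.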

                    • G
                      guy038
                      last edited by guy038 Feb 10, 2024, 10:20 AM Feb 3, 2024, 3:02 PM

                      Hello, @alan-kilborn and All,

                      I’m still improving my Python script, which can be used instead of the View > Summary feature. I added:

                      • The current date and time in the title

                      • The Read-only file flag status

                      • The Notepad++ Read only status

                      • The current view, the current Line End and the current wrap mode

                      • The script now automatically opens a new tab and pastes the contents of the summary into it


                      Here is the fourth (and still incomplete) version:

                      # encoding=utf-8
                      
                      #-------------------------------------------------------------------------
                      #                    STATISTICS about the CURRENT file ( v0.3 )
                      #-------------------------------------------------------------------------
                      
                      from __future__ import print_function    # for Python2 compatibility
                      
                      import re
                      
                      import os, time
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      def number(occ):
                          global num
                          num += 1
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                          Line_title = 93
                      else:
                          Line_title = 71
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      File_name = notepad.getCurrentFilename()
                      
                      if os.path.isfile(File_name) == True:
                      
                          Creation_date = time.ctime(os.path.getctime(File_name))
                      
                          Modif_date = time.ctime(os.path.getmtime(File_name))
                      
                          Size_length = os.path.getsize(File_name)
                      
                          RO_flag = 'YES'
                      
                          if os.access(File_name, os.W_OK):
                              RO_flag = 'NO'
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      RO_editor = 'NO'
                      
                      if editor.getReadOnly() == True:
                          RO_editor = 'YES'
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      if notepad.getCurrentView() == 0:
                          Curr_view = 'MAIN View'
                      else:
                          Curr_view = 'SECONDARY view'
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Curr_encoding = str(notepad.getEncoding())
                      
                      if Curr_encoding == 'ENC8BIT':
                          Curr_encoding = 'ANSI'
                      
                      if Curr_encoding == 'COOKIE':
                          Curr_encoding = 'UTF-8'
                      
                      if Curr_encoding == 'UTF8':
                          Curr_encoding = 'UTF8-BOM'
                      
                      if Curr_encoding == 'UCS2BE':
                          Curr_encoding = 'UCS-2 BE BOM'
                      
                      if Curr_encoding == 'UCS2LE':
                          Curr_encoding = 'UCS-2 LE BOM'
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Curr_lang = notepad.getCurrentLang()
                      
                      Lang_desc = notepad.getLanguageDesc(Curr_lang)
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      if editor.getEOLMode() == 0:
                          Curr_eol = 'Windows (CR LF)'
                      
                      if editor.getEOLMode() == 1:
                          Curr_eol = 'Macintosh (CR)'
                      
                      if editor.getEOLMode() == 2:
                          Curr_eol = 'Unix (LF)'
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Curr_wrap = 'NO'
                      
                      if editor.getWrapMode() == 1:
                          Curr_wrap = 'YES'
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                          editor.research(r'[^\r\n]', number)
                      
                      if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                          editor.research(r'(?![\r\n])[\x{0000}-\x{007F}]', number)
                      
                      Total_1_byte = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                          editor.research(r'[\x{0080}-\x{07FF}]', number)
                      
                      if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                          editor.research(r'[^\r\n]', number)
                      
                      Total_2_bytes = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                          editor.research(r'(?![\x{D800}-\x{DFFF}])[\x{0800}-\x{FFFF}]', number)
                      
                      Total_3_bytes = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Total_BMP = Total_1_byte + Total_2_bytes + Total_3_bytes
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      num = 0
                      editor.research(r'[^\r\n]', number)
                      
                      Total_Standard = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Total_4_bytes = 0  #  By default
                      
                      if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                          Total_4_bytes = Total_Standard - Total_BMP
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      editor.research(r'\r|\n', number)
                      
                      Total_EOL = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Total_chars = Total_Standard + Total_EOL
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Bytes_length = Total_EOL + Total_1_byte  #  Default ANSI
                      
                      if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                          Bytes_length = 2 * Total_chars
                      
                      if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                          Bytes_length = Total_EOL + Total_1_byte + 2 * Total_2_bytes + 3 * Total_3_bytes + 4 * Total_4_bytes
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      BOM = 0  #  Default ANSI and UTF-8
                      
                      if notepad.getEncoding() == BUFFERENCODING.UTF8:
                          BOM = 3
                      
                      if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                          BOM = 2
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Buffer_length = Bytes_length + BOM
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      editor.research(r'[^\r\n\t\x20]', number)
                      
                      Non_blank_chars = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      editor.research(r'\w+', number)
                      
                      Words_count = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      
                      if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                          editor.research(r'((?!\s).[\x{D800}-\x{DFFF}]?)+', number)
                      else:
                          editor.research(r'((?!\s).)+', number)
                      
                      Non_space_count = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                          editor.research(r'(?<!\f)^(?:\r\n|\r|\n)', number)
                      else:
                          editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^(?:\r\n|\r|\n)', number)
                      
                      Empty_lines = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                          editor.research(r'(?<!\f)^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
                      else:
                          editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
                      
                      Blank_lines = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Emp_blk_lines = Empty_lines + Blank_lines
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      num = 0
                      if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                          editor.research(r'(?-s)\r\n|\r|\n|(?:.|\f)\z', number)
                      else:
                          editor.research(r'(?-s)\r\n|\r|\n|(?:.|[\f\x{0085}\x{2028}\x{2029}])\z', number)
                      
                      Total_lines = num
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Non_blk_lines = Total_lines - Emp_blk_lines
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      Num_sel = editor.getSelections()  # Get ALL selections ( EMPTY or NOT )
                      
                      # print ('Res = ', Num_sel)
                      
                      if Num_sel != 0:
                      
                          Bytes_count = 0
                          Chars_count = 0
                      
                          for n in range(Num_sel):
                      
                              Bytes_count += editor.getSelectionNEnd(n) - editor.getSelectionNStart(n)
                      
                              Chars_count += editor.countCharacters(editor.getSelectionNStart(n), editor.getSelectionNEnd(n))
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                          if Chars_count < 2:
                              Txt_chars = ' selected char ('
                      
                          else:
                              Txt_chars = ' selected chars ('
                      
                      
                          if Bytes_count < 2:
                              Txt_bytes = ' selected byte) in '
                      
                          else:
                              Txt_bytes = ' selected bytes) in '
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                          if Num_sel < 2 and Bytes_count == 0:
                              Txt_ranges = ' EMPTY range\n'
                      
                          if Num_sel < 2 and Bytes_count > 0:
                              Txt_ranges = ' range\n'
                      
                          if Num_sel > 1 and Bytes_count == 0:
                              Txt_ranges = ' EMPTY ranges\n'
                      
                          if Num_sel > 1 and Bytes_count > 0:
                              Txt_ranges = ' ranges (EMPTY or NOT)\n'
                      
                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                      
                      line_list = []  # empty list
                      
                      line_list.append ('-' * Line_title)
                      
                      line_list.append (' ' * ((Line_title - 37) // 2) + 'SUMMARY on ' + str(datetime.datetime.now()))  # '//' keeps the repeat count an integer under Python 3 too
                      
                      line_list.append ('-' * Line_title +'\n')
                      
                      line_list.append (' FULL File Path    :  ' + File_name + '\n')
                      
                      if os.path.isfile(File_name) == True:
                      
                          line_list.append(' CREATION     Date :  ' + Creation_date)
                      
                          line_list.append(' MODIFICATION Date :  ' + Modif_date + '\n')
                      
                          line_list.append(' READ-ONLY flag    :  ' + RO_flag )
                      
                      line_list.append (' READ-ONLY editor  :  ' + RO_editor + '\n\n')
                      
                      line_list.append (' Current VIEW      :  ' + Curr_view + '\n')
                      
                      line_list.append (' Current ENCODING  :  ' + Curr_encoding + '\n')
                      
                      line_list.append (' Current LANGUAGE  :  ' + str(Curr_lang) + '  (' + Lang_desc + ')\n')
                      
                      line_list.append (' Current Line END  :  ' + Curr_eol + '\n')
                      
                      line_list.append (' Current WRAPPING  :  ' + Curr_wrap + '\n\n')
                      
                      line_list.append (' 1-BYTE  Chars     :  ' + str(Total_1_byte))
                      
                      line_list.append (' 2-BYTES Chars     :  ' + str(Total_2_bytes))
                      
                      line_list.append (' 3-BYTES Chars     :  ' + str(Total_3_bytes) + '\n')
                      
                      line_list.append (' Sum BMP Chars     :  ' + str(Total_BMP))
                      
                      line_list.append (' 4-BYTES Chars     :  ' + str(Total_4_bytes) + '\n')
                      
                      line_list.append (' CHARS w/o CR & LF :  ' + str(Total_Standard))
                      
                      line_list.append (' EOL ( CR or LF )  :  ' + str(Total_EOL) + '\n')
                      
                      line_list.append (' TOTAL characters  :  ' + str(Total_chars) + '\n\n')
                      
                      if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                          line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (1 * ' + str(Total_1_byte) + ' + 1 * ' + str(Total_EOL) + ' + 2 * ' + str(Total_2_bytes) + ' + 3 * ' + str(Total_3_bytes) + ' + 4 * ' + str(Total_4_bytes) + ')')
                      
                      if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                          line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (2 * ' + str(Total_chars) + ')')
                      
                      if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                          line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (1 * ' + str(Total_chars) + ')')
                      
                      line_list.append (' Byte Order Mark   :  ' + str(BOM) + '\n')
                      
                      line_list.append (' BUFFER Length     :  ' + str(Buffer_length))
                      
                      if os.path.isfile(File_name) == True:
                          line_list.append (' Length on DISK    :  ' + str(Size_length) + '\n\n')
                      else:
                          line_list.append ('\n')
                      
                      line_list.append (' NON-Blank Chars   :  ' + str(Non_blank_chars) + '\n')
                      
                      line_list.append (' WORDS     Count   :  ' + str(Words_count) + '\n')
                      
                      line_list.append (' NON-SPACE Count   :  ' + str(Non_space_count) + '\n\n')
                      
                      line_list.append (' True EMPTY lines  :  ' + str(Empty_lines))
                      
                      line_list.append (' True BLANK lines  :  ' + str(Blank_lines) + '\n')
                      
                      line_list.append (' EMPTY/BLANK lines :  ' + str(Emp_blk_lines) + '\n')
                      
                      line_list.append (' NON-BLANK lines   :  ' + str(Non_blk_lines))
                      
                      line_list.append (' TOTAL Lines       :  ' + str(Total_lines) + '\n\n')
                      
                      line_list.append (' SELECTION(S)      :  ' + str(Chars_count) + Txt_chars + str(Bytes_count) + Txt_bytes + str(Num_sel) + Txt_ranges)
                      
                      editor.copyText ('\r\n'.join(line_list))
                      
                      notepad.new()
                      
                      editor.paste()
                      
                      # ----Aé☀𝜜-----------------------------------------------------------------------------------------------------------------------------------------------------
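
As a side check of the UTF-8 arithmetic behind the BYTES Length line above (1, 2, 3 or 4 bytes per character, with CR and LF counted apart), here is a minimal stand-alone sketch for plain Python 3, run outside Notepad++. The names `utf8_byte_classes` and `utf8_bytes_length` are hypothetical helpers, not part of the script, and the sample string simply holds one character of each size:

```python
# -*- coding: utf-8 -*-
# Stand-alone sketch (plain Python 3, no Npp module needed).
# utf8_byte_classes() and utf8_bytes_length() are hypothetical helper names.

def utf8_byte_classes(text):
    """Count characters by UTF-8 size, with CR/LF counted apart (as Total_EOL is)."""
    counts = {'eol': 0, 1: 0, 2: 0, 3: 0, 4: 0}
    for ch in text:
        if ch in '\r\n':
            counts['eol'] += 1
        else:
            counts[len(ch.encode('utf-8'))] += 1
    return counts

def utf8_bytes_length(counts):
    """Same formula as Bytes_length in the script, for the UTF-8 case."""
    return (counts['eol'] + counts[1]
            + 2 * counts[2] + 3 * counts[3] + 4 * counts[4])

# One 1-byte, one 2-byte, one 3-byte and one 4-byte character, then CR LF
sample = 'A\u00e9\u2600\U0001D71C\r\n'
counts = utf8_byte_classes(sample)

# The per-class sum should agree with the real encoded length
assert utf8_bytes_length(counts) == len(sample.encode('utf-8'))
print(counts, utf8_bytes_length(counts))
```

If the assertion holds for your own test strings too, the formula and the regex ranges used in the script agree with what Python's encoder produces.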
                      

                      Now, Alan, I could of course incorporate your script, partially shown below, into my own script, in order to get the exact encoding used by the current file !

                      # -*- coding: utf-8 -*-
                      from __future__ import print_function
                      
                      from Npp import *
                      import ctypes
                      from ctypes.wintypes import BOOL, HWND, WPARAM, LPARAM, UINT
                      
                      def npp_get_statusbar(statusbar_item_number):
                      
                          WNDENUMPROC = ctypes.WINFUNCTYPE(BOOL, HWND, LPARAM)
                          FindWindowW = ctypes.windll.user32.FindWindowW
                          FindWindowExW = ctypes.windll.user32.FindWindowExW
                          SendMessageW = ctypes.windll.user32.SendMessageW
                          LRESULT = LPARAM
                          SendMessageW.restype = LRESULT
                          SendMessageW.argtypes = [ HWND, UINT, WPARAM, LPARAM ]
                          EnumChildWindows = ctypes.windll.user32.EnumChildWindows
                          GetClassNameW = ctypes.windll.user32.GetClassNameW
                          create_unicode_buffer = ctypes.create_unicode_buffer
                      .....
                      .....
                          npp_hwnd = FindWindowW(u"Notepad++", None)
                          EnumChildWindows(npp_hwnd, WNDENUMPROC(EnumCallback), 0)
                          if npp_get_statusbar.STATUSBAR_HANDLE: return get_result_from_statusbar(statusbar_item_number)
                          assert False
                      
                      print(npp_get_statusbar(4))  # Zone 4 ( STATUSBARSECTION.UNICODETYPE )
                      

                      But, given that I’m only interested in the fourth zone of the Status Bar, can’t we use a simplified version of your script to do so ?

                      TIA again !

                      Best Regards,

                      guy038

                      • A
                        Alan Kilborn @guy038
                        last edited by Feb 3, 2024, 4:21 PM

                        @guy038 said in Improved version of the "Summary" feature, ...:

                        given that I’m only interested in the fourth zone of the Status Bar, can’t we use a simplified version of your script to do so ?

                        I’m not sure how it could be simplified, but if you have ideas on that, please do it and publish it as part of your script.

                        It could be put in its own module, and imported if you’d like…
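
A minimal sketch of that layout, assuming the function has been saved unchanged in a file called `NppStatusBar.py` (a hypothetical name) somewhere on the PythonScript import path; `current_encoding_label` is likewise only an illustrative wrapper:

```python
# Assumption: npp_get_statusbar() lives, unchanged, in "NppStatusBar.py"
# (hypothetical file name) in a folder that PythonScript can import from.
try:
    from NppStatusBar import npp_get_statusbar
except ImportError:
    npp_get_statusbar = None  # module not found: use the fallback below

def current_encoding_label(default='(unknown)'):
    """Return the text of status-bar zone 4, or `default` if the helper is missing."""
    if npp_get_statusbar is None:
        return default
    return npp_get_statusbar(4)

St_bar = current_encoding_label()
```

The main script then only needs the import and the one call, and the ctypes details stay out of sight.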

                        • A
                          Alan Kilborn @guy038
                          last edited by Alan Kilborn Feb 5, 2024, 11:55 AM Feb 5, 2024, 11:54 AM

                          @guy038 said :

                          …returns Npp.BUFFERENCODING.COOKIE, although I would have expected Npp.BUFFERENCODING.OEM-US or perhaps just OEM-US…

                          Rather than relying on a kludged read of the Notepad++ status bar, perhaps you should open a GitHub issue against Notepad++, stating that NPPM_GETBUFFERENCODING doesn’t provide the data you need/expect, and maybe this plugin command will be enhanced for you?

                          • G
                            guy038
                            last edited by guy038 Feb 10, 2024, 10:19 AM Feb 5, 2024, 3:34 PM

                            Hello, @alan-kilborn and All,

                            Alan, first of all, I could have told you that I didn’t want to bother modifying your script and that I’d integrated it as is. But the truth is that I’m still a long way from understanding your script and seeing any possible simplifications :-((


                            So, here is my final version of this Python script, which can be used instead of the View > Summary feature. It contains the @alan-kilborn section, which reads the right-hand part of the status bar showing the current encoding.

                            I have to split this script into two consecutive posts !

                            # encoding=utf-8
                            
                            #-------------------------------------------------------------------------
                            #                    STATISTICS about the CURRENT file ( v0.4 )
                            #-------------------------------------------------------------------------
                            
                            from __future__ import print_function    # for Python2 compatibility
                            
                            from Npp import *
                            
                            import re
                            
                            import os, time
                            
                            import ctypes
                            
                            from ctypes.wintypes import BOOL, HWND, WPARAM, LPARAM, UINT
                            
                            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                            #  From @alan-kilborn, in post https://community.notepad-plus-plus.org/topic/21733/pythonscript-different-behavior-in-script-vs-in-immediate-mode/4
                            # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                            
                            def npp_get_statusbar(statusbar_item_number):
                            
                                WNDENUMPROC = ctypes.WINFUNCTYPE(BOOL, HWND, LPARAM)
                                FindWindowW = ctypes.windll.user32.FindWindowW
                                FindWindowExW = ctypes.windll.user32.FindWindowExW
                                SendMessageW = ctypes.windll.user32.SendMessageW
                                LRESULT = LPARAM
                                SendMessageW.restype = LRESULT
                                SendMessageW.argtypes = [ HWND, UINT, WPARAM, LPARAM ]
                                EnumChildWindows = ctypes.windll.user32.EnumChildWindows
                                GetClassNameW = ctypes.windll.user32.GetClassNameW
                                create_unicode_buffer = ctypes.create_unicode_buffer
                            
                                SBT_OWNERDRAW = 0x1000
                                WM_USER = 0x400; SB_GETTEXTLENGTHW = WM_USER + 12; SB_GETTEXTW = WM_USER + 13
                            
                                npp_get_statusbar.STATUSBAR_HANDLE = None
                            
                                def get_result_from_statusbar(statusbar_item_number):
                                    assert statusbar_item_number <= 5
                                    retcode = SendMessageW(npp_get_statusbar.STATUSBAR_HANDLE, SB_GETTEXTLENGTHW, statusbar_item_number, 0)
                                    length = retcode & 0xFFFF
                                    type = (retcode >> 16) & 0xFFFF
                                    assert (type != SBT_OWNERDRAW)
                                    text_buffer = create_unicode_buffer(length + 1)  # + 1 leaves room for the terminating null written by SB_GETTEXTW
                                    retcode = SendMessageW(npp_get_statusbar.STATUSBAR_HANDLE, SB_GETTEXTW, statusbar_item_number, ctypes.addressof(text_buffer))
                                    retval = '{}'.format(text_buffer[:length])
                                    return retval
                            
                                def EnumCallback(hwnd, lparam):
                                    curr_class = create_unicode_buffer(256)
                                    GetClassNameW(hwnd, curr_class, 256)
                                    if curr_class.value.lower() == "msctls_statusbar32":
                                        npp_get_statusbar.STATUSBAR_HANDLE = hwnd
                                        return False  # stop the enumeration
                                    return True  # continue the enumeration
                            
                                npp_hwnd = FindWindowW(u"Notepad++", None)
                                EnumChildWindows(npp_hwnd, WNDENUMPROC(EnumCallback), 0)
                                if npp_get_statusbar.STATUSBAR_HANDLE: return get_result_from_statusbar(statusbar_item_number)
                                assert False
                            
                            St_bar = npp_get_statusbar(4)  # Zone 4 ( STATUSBARSECTION.UNICODETYPE )
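
For readers puzzled by the `& 0xFFFF` and `>> 16` lines in the function above: the value returned for SB_GETTEXTLENGTHW packs two 16-bit fields, the text length in the low word and the drawing type in the high word. Here is a small stand-alone sketch of that unpacking, runnable in plain Python; `split_textlength_result` is a hypothetical helper name, not part of Alan's function:

```python
SBT_OWNERDRAW = 0x1000  # same constant as in the function above

def split_textlength_result(retcode):
    """Hypothetical helper: unpack a packed SB_GETTEXTLENGTHW result."""
    length = retcode & 0xFFFF             # low word: text length, in characters
    draw_type = (retcode >> 16) & 0xFFFF  # high word: how the text is drawn
    return length, draw_type

# A plain 5-character section: high word is 0
print(split_textlength_result(5))

# An owner-drawn section: the function above refuses these with an assert
print(split_textlength_result((SBT_OWNERDRAW << 16) | 3))
```

The length excludes the terminating null, so the buffer handed to SB_GETTEXTW needs room for one extra character.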
                            

                            See next post for continuation !

                            • G
                              guy038
                              last edited by Feb 5, 2024, 3:35 PM

                              Hi @alan-kilborn and All,

                              Continuation of my script :

                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              def number(occ):
                                  # Callback for editor.research() : called once per match, so 'num' ends up holding the match count
                                  global num
                                  num += 1
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                                  Line_title = 93
                              else:
                                  Line_title = 71
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              File_name = notepad.getCurrentFilename()
                              
                              if os.path.isfile(File_name) == True:
                              
                                  Creation_date = time.ctime(os.path.getctime(File_name))
                              
                                  Modif_date = time.ctime(os.path.getmtime(File_name))
                              
                                  Size_length = os.path.getsize(File_name)
                              
                                  RO_flag = 'YES'
                              
                                  if os.access(File_name, os.W_OK):
                                      RO_flag = 'NO'
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              RO_editor = 'NO'
                              
                              if editor.getReadOnly() == True:
                                  RO_editor = 'YES'
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              if notepad.getCurrentView() == 0:
                                  Curr_view = 'MAIN View'
                              else:
                                  Curr_view = 'SECONDARY view'
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Curr_encoding = str(notepad.getEncoding())
                              
                              if Curr_encoding == 'ENC8BIT':
                                  Curr_encoding = 'ANSI'
                              
                              if Curr_encoding == 'COOKIE':
                                  Curr_encoding = 'UTF-8'
                              
                              if Curr_encoding == 'UTF8':
                                  Curr_encoding = 'UTF8-BOM'
                              
                              if Curr_encoding == 'UCS2BE':
                                  Curr_encoding = 'UCS-2 BE BOM'
                              
                              if Curr_encoding == 'UCS2LE':
                                  Curr_encoding = 'UCS-2 LE BOM'
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Curr_lang = notepad.getCurrentLang()
                              
                              Lang_desc = notepad.getLanguageDesc(Curr_lang)
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              if editor.getEOLMode() == 0:
                                  Curr_eol = 'Windows (CR LF)'
                              
                              if editor.getEOLMode() == 1:
                                  Curr_eol = 'Macintosh (CR)'
                              
                              if editor.getEOLMode() == 2:
                                  Curr_eol = 'Unix (LF)'
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Curr_wrap = 'NO'
                              
                              if editor.getWrapMode() == 1:
                                  Curr_wrap = 'YES'
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                                  editor.research(r'[^\r\n]', number)
                              
                              if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                                  editor.research(r'(?![\r\n])[\x{0000}-\x{007F}]', number)
                              
                              Total_1_byte = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                                  editor.research(r'[\x{0080}-\x{07FF}]', number)
                              
                              if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                                  editor.research(r'[^\r\n]', number)
                              
                              Total_2_bytes = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                                  editor.research(r'(?![\x{D800}-\x{DFFF}])[\x{0800}-\x{FFFF}]', number)
                              
                              Total_3_bytes = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Total_BMP = Total_1_byte + Total_2_bytes + Total_3_bytes
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              num = 0
                              editor.research(r'[^\r\n]', number)
                              
                              Total_Standard = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Total_4_bytes = 0  #  By default
                              
                              if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                                  Total_4_bytes = Total_Standard - Total_BMP
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              editor.research(r'\r|\n', number)
                              
                              Total_EOL = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Total_chars = Total_Standard + Total_EOL
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Bytes_length = Total_EOL + Total_1_byte  #  Default ANSI
                              
                              if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                                  Bytes_length = 2 * Total_chars
                              
                              if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                                  Bytes_length = Total_EOL + Total_1_byte + 2 * Total_2_bytes + 3 * Total_3_bytes + 4 * Total_4_bytes
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              BOM = 0  #  Default ANSI and UTF-8
                              
                              if notepad.getEncoding() == BUFFERENCODING.UTF8:
                                  BOM = 3
                              
                              if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                                  BOM = 2
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Buffer_length = Bytes_length + BOM
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              editor.research(r'[^\r\n\t\x20]', number)
                              
                              Non_blank_chars = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              editor.research(r'\w+', number)
                              
                              Words_count = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              
                              if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                                  editor.research(r'((?!\s).[\x{D800}-\x{DFFF}]?)+', number)
                              else:
                                  editor.research(r'((?!\s).)+', number)
                              
                              Non_space_count = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                                  editor.research(r'(?<!\f)^(?:\r\n|\r|\n)', number)
                              else:
                                  editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^(?:\r\n|\r|\n)', number)
                              
                              Empty_lines = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                                  editor.research(r'(?<!\f)^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
                              else:
                                  editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
                              
                              Blank_lines = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Emp_blk_lines = Empty_lines + Blank_lines
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              num = 0
                              if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                                  editor.research(r'(?-s)\r\n|\r|\n|(?:.|\f)\z', number)
                              else:
                                  editor.research(r'(?-s)\r\n|\r|\n|(?:.|[\f\x{0085}\x{2028}\x{2029}])\z', number)
                              
                              Total_lines = num
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Non_blk_lines = Total_lines - Emp_blk_lines
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              Num_sel = editor.getSelections()  # Get ALL selections ( EMPTY or NOT )
                              
                              # print ('Res = ', Num_sel)
                              
                              if Num_sel != 0:
                              
                                  Bytes_count = 0
                                  Chars_count = 0
                              
                                  for n in range(Num_sel):
                              
                                      Bytes_count += editor.getSelectionNEnd(n) - editor.getSelectionNStart(n)
                              
                                      Chars_count += editor.countCharacters(editor.getSelectionNStart(n), editor.getSelectionNEnd(n))
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                                  if Chars_count < 2:
                                      Txt_chars = ' selected char ('
                              
                                  else:
                                      Txt_chars = ' selected chars ('
                              
                              
                                  if Bytes_count < 2:
                                      Txt_bytes = ' selected byte) in '
                              
                                  else:
                                      Txt_bytes = ' selected bytes) in '
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                                  if Num_sel < 2 and Bytes_count == 0:
                                      Txt_ranges = ' EMPTY range\n'
                              
                                  if Num_sel < 2 and Bytes_count > 0:
                                      Txt_ranges = ' range\n'
                              
                                  if Num_sel > 1 and Bytes_count == 0:
                                      Txt_ranges = ' EMPTY ranges\n'
                              
                                  if Num_sel > 1 and Bytes_count > 0:
                                      Txt_ranges = ' ranges (EMPTY or NOT)\n'
                              
                              # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              line_list = []  # empty list
                              
                              line_list.append ('-' * Line_title)
                              
                              line_list.append (' ' * ((Line_title - 37) // 2) + 'SUMMARY on ' + str(datetime.datetime.now()))  #  '//' keeps an INTEGER count under Python 2 and 3
                              
                              line_list.append ('-' * Line_title +'\n')
                              
                              line_list.append (' FULL File Path    :  ' + File_name + '\n')
                              
                              if os.path.isfile(File_name):
                              
                                  line_list.append(' CREATION     Date :  ' + Creation_date)
                              
                                  line_list.append(' MODIFICATION Date :  ' + Modif_date + '\n')
                              
                                  line_list.append(' READ-ONLY flag    :  ' + RO_flag )
                              
                              line_list.append (' READ-ONLY editor  :  ' + RO_editor + '\n\n')
                              
                              line_list.append (' Current VIEW      :  ' + Curr_view + '\n')
                              
                              line_list.append (' Current ENCODING  :  ' + Curr_encoding + '\n')
                              
                              line_list.append (' Current LANGUAGE  :  ' + str(Curr_lang) + '  (' + Lang_desc + ')\n')
                              
                              line_list.append (' Current Line END  :  ' + Curr_eol + '\n')
                              
                              line_list.append (' Current WRAPPING  :  ' + Curr_wrap + '\n\n')
                              
                              line_list.append (' 1-BYTE  Chars     :  ' + str(Total_1_byte))
                              
                              line_list.append (' 2-BYTES Chars     :  ' + str(Total_2_bytes))
                              
                              line_list.append (' 3-BYTES Chars     :  ' + str(Total_3_bytes) + '\n')
                              
                              line_list.append (' Sum BMP Chars     :  ' + str(Total_BMP))
                              
                              line_list.append (' 4-BYTES Chars     :  ' + str(Total_4_bytes) + '\n')
                              
                              line_list.append (' CHARS w/o CR & LF :  ' + str(Total_Standard))
                              
                              line_list.append (' EOL ( CR or LF )  :  ' + str(Total_EOL) + '\n')
                              
                              line_list.append (' TOTAL characters  :  ' + str(Total_chars) + '\n\n')
                              
                              if notepad.getEncoding() == BUFFERENCODING.UTF8 or notepad.getEncoding() == BUFFERENCODING.COOKIE:
                                  line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' * 1 + ' + str(Total_1_byte) + ' * 1b + '\
                                  + str(Total_2_bytes) + ' * 2b + ' + str(Total_3_bytes) + ' * 3b + ' + str(Total_4_bytes) + ' * 4b)')
                              
                              if notepad.getEncoding() == BUFFERENCODING.UCS2BE or notepad.getEncoding() == BUFFERENCODING.UCS2LE:
                                  line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_chars) + ' * 2b)')
                              
                              if notepad.getEncoding() == BUFFERENCODING.ENC8BIT:
                                  line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_chars) + ' * 1b)')
                              
                              line_list.append (' Byte Order Mark   :  ' + str(BOM) + '\n')
                              
                              line_list.append (' BUFFER Length     :  ' + str(Buffer_length))
                              
                              if os.path.isfile(File_name):
                                  line_list.append (' Length on DISK    :  ' + str(Size_length) + '\n\n')
                              else:
                                  line_list.append ('\n')
                              
                              line_list.append (' NON-Blank Chars   :  ' + str(Non_blank_chars) + '\n')
                              
                              line_list.append (' WORDS     Count   :  ' + str(Words_count) + ' (Caution !)\n')
                              
                              line_list.append (' NON-SPACE Count   :  ' + str(Non_space_count) + '\n\n')
                              
                              line_list.append (' True EMPTY lines  :  ' + str(Empty_lines))
                              
                              line_list.append (' True BLANK lines  :  ' + str(Blank_lines) + '\n')
                              
                              line_list.append (' EMPTY/BLANK lines :  ' + str(Emp_blk_lines) + '\n')
                              
                              line_list.append (' NON-BLANK lines   :  ' + str(Non_blk_lines))
                              
                              line_list.append (' TOTAL Lines       :  ' + str(Total_lines) + '\n\n')
                              
                              line_list.append (' SELECTION(S)      :  ' + str(Chars_count) + Txt_chars + str(Bytes_count) + Txt_bytes + str(Num_sel) + Txt_ranges)
                              
                              editor.copyText ('\r\n'.join(line_list))
                              
                              notepad.new()
                              
                              editor.paste()
                              
                              if St_bar != 'ANSI' and St_bar != 'UTF-8' and St_bar != 'UTF-8-BOM' and St_bar != 'UCS-2 BE BOM' and St_bar != 'UCS-2 LE BOM':
                              
                                  if Curr_encoding == 'UTF-8':  #  SAME value for both an 'UTF-8' or 'ANSI' file, when RE-INTERPRETED with the 'Encoding > Character Set > ...' feature
                              
                                      notepad.prompt ('CURRENT file re-interpreted as ' + St_bar + '  =>  Possible ERRONEOUS results' + \
                                                      '\nSo, CLOSE the file WITHOUT saving, RESTORE it (CTRL + SHIFT + T) and RESTART script', '!!! WARNING !!!', '')
                              
                              # ----Aé☀𝜜-----------------------------------------------------------------------------------------------------------------------------------------------------
                              
                              • G
                                guy038
                                last edited by guy038 Feb 6, 2024, 9:01 AM Feb 5, 2024, 3:40 PM

                                Hi, Alan and All,

                                ( Continuation of the previous post )

                                Now, I’ve come across a problem with the encodings !

                                Have you ever noticed that, when you re-interpret the present encoding of a file with the Encoding > Character Set > ... feature, there are two possible scenarios ?

                                • A) - The present encoding is a Unicode encoding with a BOM ( Byte Order Mark ), so either the UTF-8-BOM, UCS-2 BE BOM or UCS-2 LE BOM encoding

                                • B) - The present encoding is ANSI or UTF-8, so without a BOM

                                In the first case, whatever the new encoding chosen ( one-byte or two-byte encoding ), the file contents do not change and my script just respects the real encoding of the current file

                                For example, with a UCS-2 LE BOM encoded file, if I change its encoding with Encoding > Character Set > Western European > OEM-US, my new summary just considers that it’s still a true UCS-2 LE BOM encoded file, leading to a correct summary report !

                                In the second case, the new encoding chosen does modify the current file contents in the editor window. In addition, it automatically supposes that the current file is a UTF-8 encoded file, leading to erroneous results in the summary report :-( However, the current file contents, saved on the disk, still seem unchanged !!

                                For instance :

                                • Open a new tab

                                • Use the Encoding > Convert to UTF-8 feature, if necessary

                                • Enter the four chars Aé☀𝜜, without any line-break, at the end

                                • Save this file as Test-UTF8.txt

                                • Using my script, you get, in a new tab :

                                ---------------------------------------------------------------------------------------------
                                                            SUMMARY on 2024-02-05 16:50:23.656000
                                ---------------------------------------------------------------------------------------------
                                
                                 FULL File Path    :  D:\@@\792\Test-UTF8.txt
                                
                                 CREATION     Date :  Mon Feb  5 16:45:24 2024
                                 MODIFICATION Date :  Mon Feb  5 15:17:02 2024
                                
                                 READ-ONLY flag    :  NO
                                 READ-ONLY editor  :  NO
                                
                                
                                 Current VIEW      :  MAIN View
                                
                                 Current ENCODING  :  UTF-8
                                
                                 Current LANGUAGE  :  TXT  (Normal text file)
                                
                                 Current Line END  :  Windows (CR LF)
                                
                                 Current WRAPPING  :  YES
                                
                                
                                 1-BYTE  Chars     :  1
                                 2-BYTES Chars     :  1
                                 3-BYTES Chars     :  1
                                
                                 Sum BMP Chars     :  3
                                 4-BYTES Chars     :  1
                                
                                 CHARS w/o CR & LF :  4
                                 EOL ( CR or LF )  :  0
                                
                                 TOTAL characters  :  4
                                
                                
                                 BYTES Length      :  10 (0 * 1 + 1 * 1b + 1 * 2b + 1 * 3b + 1 * 4b)
                                 Byte Order Mark   :  0
                                
                                 BUFFER Length     :  10
                                 Length on DISK    :  10
                                
                                
                                 NON-Blank Chars   :  4
                                
                                 WORDS     Count   :  1 (Caution !)
                                
                                 NON-SPACE Count   :  1
                                
                                
                                 True EMPTY lines  :  0
                                 True BLANK lines  :  0
                                
                                 EMPTY/BLANK lines :  0
                                
                                 NON-BLANK lines   :  1
                                 TOTAL Lines       :  1
                                
                                
                                 SELECTION(S)      :  0 selected char (0 selected byte) in 1 EMPTY range
                                

                                Everything is OK ( buffer length and length on disk are identical and the bytes length description shows one char for each number of bytes, without any EOL )
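
These figures can be cross-checked outside Notepad++ with a minimal standalone Python 3 sketch. It is not part of the script above, and the variable names are only illustrative. It counts how many UTF-8 bytes each of the four test characters needs:

```python
# Standalone Python 3 sketch, NOT PythonScript code : names are illustrative
from collections import Counter

text = 'A\u00e9\u2600\U0001d71c'   # The four test chars : A, é, ☀ and 𝜜

# Number of UTF-8 bytes needed by each character
utf8_sizes = [len(ch.encode('utf-8')) for ch in text]

# How many characters fall in each byte-size class ( 1, 2, 3 or 4 bytes )
size_classes = Counter(utf8_sizes)

# Total length, in bytes, of the UTF-8 encoded text ( no BOM, no EOL )
bytes_length = sum(utf8_sizes)

print(utf8_sizes)
print(dict(size_classes))
print(bytes_length)
```

It should give one character in each size class and a total of 10 bytes, in line with the BYTES Length line of the report.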

                                • Now, switch back to the Test-UTF8.txt file

                                • Run the Encoding > Character Set > Western European > OEM-US feature

                                • Re-run my script. This time, in another new tab, you get :

                                ---------------------------------------------------------------------------------------------
                                                            SUMMARY on 2024-02-05 16:51:16.937000
                                ---------------------------------------------------------------------------------------------
                                
                                 FULL File Path    :  D:\@@\792\Test-UTF8.txt
                                
                                 CREATION     Date :  Mon Feb  5 16:45:24 2024
                                 MODIFICATION Date :  Mon Feb  5 15:17:02 2024
                                
                                 READ-ONLY flag    :  NO
                                 READ-ONLY editor  :  NO
                                
                                
                                 Current VIEW      :  MAIN View
                                
                                 Current ENCODING  :  UTF-8
                                
                                 Current LANGUAGE  :  TXT  (Normal text file)
                                
                                 Current Line END  :  Windows (CR LF)
                                
                                 Current WRAPPING  :  YES
                                
                                
                                 1-BYTE  Chars     :  1
                                 2-BYTES Chars     :  6
                                 3-BYTES Chars     :  3
                                
                                 Sum BMP Chars     :  10
                                 4-BYTES Chars     :  0
                                
                                 CHARS w/o CR & LF :  10
                                 EOL ( CR or LF )  :  0
                                
                                 TOTAL characters  :  10
                                
                                
                                 BYTES Length      :  22 (0 * 1 + 1 * 1b + 6 * 2b + 3 * 3b + 0 * 4b)
                                 Byte Order Mark   :  0
                                
                                 BUFFER Length     :  22
                                 Length on DISK    :  10
                                
                                
                                 NON-Blank Chars   :  10
                                
                                 WORDS     Count   :  2 (Caution !)
                                
                                 NON-SPACE Count   :  1
                                
                                
                                 True EMPTY lines  :  0
                                 True BLANK lines  :  0
                                
                                 EMPTY/BLANK lines :  0
                                
                                 NON-BLANK lines   :  1
                                 TOTAL Lines       :  1
                                
                                
                                 SELECTION(S)      :  0 selected char (0 selected byte) in 1 EMPTY range
                                

                                And, at the same time, a prompt displays this warning :

                                CURRENT file re-interpreted as OEM-US => Possible ERRONEOUS results
                                So, CLOSE the file WITHOUT saving, RESTORE it (CTRL + SHIFT + T) and RESTART script

                                Indeed, this time, as the file contents are unchanged, the length on DISK is still correct but the BUFFER length is wrong, due to the re-interpretation of the characters by the OEM-US encoding. That’s why I preferred to add this warning at the end of the script !
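
The wrong numbers can be reproduced with a small standalone Python 3 sketch. It rests on two assumptions that are mine, not confirmed anywhere in this thread : that Python's 'cp437' codec corresponds to the OEM-US character set of Notepad++, and that the editor keeps the re-interpreted text as UTF-8 in its buffer. Each of the 10 bytes on disk is read as one OEM-US character, then stored again as UTF-8 :

```python
# Standalone Python 3 sketch, NOT PythonScript code
# ASSUMPTION : the 'cp437' codec matches the OEM-US character set of Notepad++

original = 'A\u00e9\u2600\U0001d71c'            # A, é, ☀ and 𝜜
disk_bytes = original.encode('utf-8')          # The 10 bytes saved on disk

# Re-interpretation : EVERY byte on disk becomes ONE OEM-US character
reinterpreted = disk_bytes.decode('cp437')

# ASSUMPTION : the buffer stores these new characters as UTF-8
buffer_bytes = reinterpreted.encode('utf-8')

# UTF-8 size of each re-interpreted character
sizes = [len(ch.encode('utf-8')) for ch in reinterpreted]

print(len(disk_bytes), len(reinterpreted), len(buffer_bytes))
print(sizes.count(1), sizes.count(2), sizes.count(3))
```

Under these assumptions, the sketch gives 10 characters split as 1 + 6 + 3 and a buffer of 22 bytes, which matches the second report, while the disk length stays at 10.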

                                Now, do as it is said :

                                • Close the Test-UTF8.txt file ( Ctrl + W )

                                • Restore it ( Ctrl + Shift + T )

                                • Again, you get the UTF-8 indication, for the Test-UTF8.txt file, at the right of the status bar

                                • Re-run my script

                                => This time, we get again a correct summary, without any prompt !


                                Alan or other Python gurus, feel free to improve this last version and/or test, on various files, whether all the numbers shown are coherent !

                                Best Regards,

                                guy038

                                • G
                                  guy038
                                  last edited by guy038 Feb 6, 2024, 2:36 PM Feb 6, 2024, 10:07 AM

                                  Hi All,

                                  I’ve just realized that, up to now, I simply improved my script with an old version of N++ ( v7.9.2 ). I apologize…

                                  So, I’m first going to update my last portable version, on my W10 laptop, from v8.5.4 to the v8.6.2 version and I will update my script and redo all the tests

                                  See you later !

                                  BR

                                  guy038

                                  • A
                                    Alan Kilborn @guy038
                                    last edited by Feb 6, 2024, 12:31 PM

                                    @guy038 said in Improved version of the "Summary" feature, ...:

                                    I’ve just realized that, up to now, I simply improved my script with an old version of N++ ( v7.9.2 ). I apologize…

                                    :-(

                                    You ought to close out these ancient versions…permanently.

                                    • G
                                      guy038
                                      last edited by guy038 Feb 10, 2024, 10:22 AM Feb 9, 2024, 2:53 PM

                                      Hello, @alan-kilborn and All,

                                      I’ve just discovered that, since the v8.0 N++ version, the UCS-2 BE BOM and UCS-2 LE BOM encodings are able to handle all the characters over the BMP. Thus, these encodings were renamed, respectively, as UTF-16 BE BOM and UTF-16 LE BOM !

                                      Note that, with these two encodings, each character with code > \x{FFFF} is built with the surrogate pair mechanism, so with two 16-bit code units ( 4 bytes ). Consequently, the total number of bytes in the buffer = 2 (BOM) + number of chars <= \x{FFFF} x 2 + number of chars > \x{FFFF} x 4

                                      For example, the simple string Aé☀𝜜, without any EOL, in a UTF-16 BE BOM encoded file, is coded with 12 bytes as :

                                      
                                      FE FF 00 41 00 E9 26 00 D8 35 DF 1C
                                      ----- ----- ----- ----- -----------
                                       BOM    A     é     ☀       𝜜
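
The byte layout above can be cross-checked with a short sketch run outside Notepad++. It only uses Python's standard `codecs` module; the variable names are illustrative and not part of the script that follows.

```python
# Python 3 sketch : check the UTF-16 BE byte layout with the standard codecs
import codecs

text = 'A\u00e9\u2600\U0001D71C'       # the string  Aé☀𝜜  written with escapes

data = codecs.BOM_UTF16_BE + text.encode('utf-16-be')

bmp_chars    = sum(1 for c in text if ord(c) <= 0xFFFF)   # 2 bytes each
astral_chars = sum(1 for c in text if ord(c) >  0xFFFF)   # 4 bytes each ( surrogate pair )

expected = 2 + 2 * bmp_chars + 4 * astral_chars           # 2 = BOM

print(' '.join('{:02X}'.format(b) for b in data))
print(len(data), expected)
```

The first line printed is the hex dump of the encoded bytes, and the second shows that the measured length and the length given by the formula agree.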
                                      
                                      

                                      So, here is my final and updated version of the script, which works in all versions since the v8.0 one !

                                      # encoding=utf-8
                                      
                                      #-------------------------------------------------------------------------
                                      #                    STATISTICS about the CURRENT file ( v0.5 )
                                      #-------------------------------------------------------------------------
                                      
                                      from __future__ import print_function    # for Python2 compatibility
                                      
                                      from Npp import *
                                      
                                      import re
                                      
                                      import os, time, datetime
                                      
                                      import ctypes
                                      
                                      from ctypes.wintypes import BOOL, HWND, WPARAM, LPARAM, UINT
                                      
                                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                      #  From @alan-kilborn, in post https://community.notepad-plus-plus.org/topic/21733/pythonscript-different-behavior-in-script-vs-in-immediate-mode/4
                                      # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                      
                                      def npp_get_statusbar(statusbar_item_number):
                                      
                                          WNDENUMPROC = ctypes.WINFUNCTYPE(BOOL, HWND, LPARAM)
                                          FindWindowW = ctypes.windll.user32.FindWindowW
                                          FindWindowExW = ctypes.windll.user32.FindWindowExW
                                          SendMessageW = ctypes.windll.user32.SendMessageW
                                          LRESULT = LPARAM
                                          SendMessageW.restype = LRESULT
                                          SendMessageW.argtypes = [ HWND, UINT, WPARAM, LPARAM ]
                                          EnumChildWindows = ctypes.windll.user32.EnumChildWindows
                                          GetClassNameW = ctypes.windll.user32.GetClassNameW
                                          create_unicode_buffer = ctypes.create_unicode_buffer
                                      
                                          SBT_OWNERDRAW = 0x1000
                                          WM_USER = 0x400; SB_GETTEXTLENGTHW = WM_USER + 12; SB_GETTEXTW = WM_USER + 13
                                      
                                          npp_get_statusbar.STATUSBAR_HANDLE = None
                                      
                                          def get_result_from_statusbar(statusbar_item_number):
                                              assert statusbar_item_number <= 5
                                              retcode = SendMessageW(npp_get_statusbar.STATUSBAR_HANDLE, SB_GETTEXTLENGTHW, statusbar_item_number, 0)
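                                              length = retcode & 0xFFFF              # low word  : text length, in characters
                                              item_type = (retcode >> 16) & 0xFFFF   # high word : drawing type of the section
                                              assert (item_type != SBT_OWNERDRAW)
                                              text_buffer = create_unicode_buffer(length + 1)  # + 1 : SB_GETTEXTW also writes the terminating NUL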
                                              retcode = SendMessageW(npp_get_statusbar.STATUSBAR_HANDLE, SB_GETTEXTW, statusbar_item_number, ctypes.addressof(text_buffer))
                                              retval = '{}'.format(text_buffer[:length])
                                              return retval
                                      
                                          def EnumCallback(hwnd, lparam):
                                              curr_class = create_unicode_buffer(256)
                                              GetClassNameW(hwnd, curr_class, 256)
                                              if curr_class.value.lower() == "msctls_statusbar32":
                                                  npp_get_statusbar.STATUSBAR_HANDLE = hwnd
                                                  return False  # stop the enumeration
                                              return True  # continue the enumeration
                                      
                                          npp_hwnd = FindWindowW(u"Notepad++", None)
                                          EnumChildWindows(npp_hwnd, WNDENUMPROC(EnumCallback), 0)
                                          if npp_get_statusbar.STATUSBAR_HANDLE: return get_result_from_statusbar(statusbar_item_number)
                                          assert False
                                      
                                      St_bar = npp_get_statusbar(4)  # Zone 4 ( STATUSBARSECTION.UNICODETYPE )
                                      

                                      See next post for continuation !

                                      • G
                                        guy038
                                        last edited by Feb 9, 2024, 2:57 PM

                                        Hi, @alan-kilborn and All,

                                        Continuation of the script :

                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        def number(occ):
                                            # Callback for editor.research() : called once per match, it just increments the global counter
                                            global num
                                            num += 1
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Curr_encoding = str(notepad.getEncoding())
                                        
                                        if Curr_encoding == 'ENC8BIT':
                                            Curr_encoding = 'ANSI'
                                        
                                        if Curr_encoding == 'COOKIE':
                                            Curr_encoding = 'UTF-8'
                                        
                                        if Curr_encoding == 'UTF8':
                                            Curr_encoding = 'UTF-8-BOM'
                                        
                                        if Curr_encoding == 'UCS2BE':
                                            Curr_encoding = 'UTF-16 BE BOM'
                                        
                                        if Curr_encoding == 'UCS2LE':
                                            Curr_encoding = 'UTF-16 LE BOM'
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                                            Line_title = 95
                                        else:
                                            Line_title = 75
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        File_name = notepad.getCurrentFilename()
                                        
                                        if os.path.isfile(File_name) == True:
                                        
                                            Creation_date = time.ctime(os.path.getctime(File_name))
                                        
                                            Modif_date = time.ctime(os.path.getmtime(File_name))
                                        
                                            Size_length = os.path.getsize(File_name)
                                        
                                            RO_flag = 'YES'
                                        
                                            if os.access(File_name, os.W_OK):
                                                RO_flag = 'NO'
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        RO_editor = 'NO'
                                        
                                        if editor.getReadOnly() == True:
                                            RO_editor = 'YES'
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        if notepad.getCurrentView() == 0:
                                            Curr_view = 'MAIN View'
                                        else:
                                            Curr_view = 'SECONDARY view'
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Curr_lang = notepad.getCurrentLang()
                                        
                                        Lang_desc = notepad.getLanguageDesc(Curr_lang)
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        if editor.getEOLMode() == 0:
                                            Curr_eol = 'Windows (CR LF)'
                                        
                                        if editor.getEOLMode() == 1:
                                            Curr_eol = 'Macintosh (CR)'
                                        
                                        if editor.getEOLMode() == 2:
                                            Curr_eol = 'Unix (LF)'
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Curr_wrap = 'NO'
                                        
                                        if editor.getWrapMode() == 1:
                                            Curr_wrap = 'YES'
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        if Curr_encoding == 'ANSI':
                                            editor.research(r'[^\r\n]', number)
                                        
                                        if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                                            editor.research(r'(?![\r\n])[\x{0000}-\x{007F}]', number)
                                        
                                        Total_1_byte = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                                            editor.research(r'[\x{0080}-\x{07FF}]', number)
                                        
                                        if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                                            editor.research(r'(?![\r\n\x{D800}-\x{DFFF}])[\x{0000}-\x{FFFF}]', number)  #  ALL BMP chars ( With PYTHON, the [^\r\n\x{D800}-\x{DFFF}] syntax does NOT work properly !)
                                        
                                        Total_2_bytes = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                                            editor.research(r'(?![\x{D800}-\x{DFFF}])[\x{0800}-\x{FFFF}]', number)
                                        
                                        Total_3_bytes = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Total_BMP = Total_1_byte + Total_2_bytes + Total_3_bytes
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        num = 0
                                        editor.research(r'[^\r\n]', number)
                                        
                                        Total_standard = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Total_4_bytes = 0  #  By default
                                        
                                        if Curr_encoding != 'ANSI':
                                            Total_4_bytes = Total_standard - Total_BMP
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        editor.research(r'\r|\n', number)
                                        
                                        Total_EOL = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Total_chars = Total_EOL + Total_standard
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        if Curr_encoding == 'ANSI':
                                            Bytes_length = Total_EOL + Total_1_byte
                                        
                                        if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                                            Bytes_length = Total_EOL + Total_1_byte + 2 * Total_2_bytes + 3 * Total_3_bytes + 4 * Total_4_bytes
                                        
                                        if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                                            Bytes_length = 2 * Total_EOL + 2 * Total_BMP + 4 * Total_4_bytes
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        BOM = 0  #  Default ANSI and UTF-8
                                        
                                        if Curr_encoding == 'UTF-8-BOM':
                                            BOM = 3
                                        
                                        if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                                            BOM = 2
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Buffer_length = Bytes_length + BOM
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        editor.research(r'[^\r\n\t\x20]', number)
                                        
                                        Non_blank_chars = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        editor.research(r'\w+', number)
                                        
                                        Words_count = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        
                                        if Curr_encoding == 'ANSI':
                                            editor.research(r'((?!\s).)+', number)
                                        else:
                                            editor.research(r'((?!\s).[\x{D800}-\x{DFFF}]?)+', number)
                                        
                                        Non_space_count = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        if Curr_encoding == 'ANSI':
                                            editor.research(r'(?<!\f)^(?:\r\n|\r|\n)', number)
                                        else:
                                            editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^(?:\r\n|\r|\n)', number)
                                        
                                        Empty_lines = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        if Curr_encoding == 'ANSI':
                                            editor.research(r'(?<!\f)^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
                                        else:
                                            editor.research(r'(?<![\f\x{0085}\x{2028}\x{2029}])^[\t\x20]+(?:\r\n|\r|\n|\z)', number)
                                        
                                        Blank_lines = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Emp_blk_lines = Empty_lines + Blank_lines
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        num = 0
                                        if Curr_encoding == 'ANSI':
                                            editor.research(r'(?-s)\r\n|\r|\n|(?:.|\f)\z', number)
                                        else:
                                            editor.research(r'(?-s)\r\n|\r|\n|(?:.|[\f\x{0085}\x{2028}\x{2029}])\z', number)
                                        
                                        Total_lines = num
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Non_blk_lines = Total_lines - Emp_blk_lines
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        Num_sel = editor.getSelections()  # Get ALL selections ( EMPTY or NOT )
                                        
                                        # print ('Res = ', Num_sel)
                                        
                                        if Num_sel != 0:
                                        
                                            Bytes_count = 0
                                            Chars_count = 0
                                        
                                            for n in range(Num_sel):
                                        
                                                Bytes_count += editor.getSelectionNEnd(n) - editor.getSelectionNStart(n)
                                        
                                                Chars_count += editor.countCharacters(editor.getSelectionNStart(n), editor.getSelectionNEnd(n))
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                            if Chars_count < 2:
                                                Txt_chars = ' selected char ('
                                        
                                            else:
                                                Txt_chars = ' selected chars ('
                                        
                                        
                                            if Bytes_count < 2:
                                                Txt_bytes = ' selected byte) in '
                                        
                                            else:
                                                Txt_bytes = ' selected bytes) in '
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                            if Num_sel < 2 and Bytes_count == 0:
                                                Txt_ranges = ' EMPTY range\n'
                                        
                                            if Num_sel < 2 and Bytes_count > 0:
                                                Txt_ranges = ' range\n'
                                        
                                            if Num_sel > 1 and Bytes_count == 0:
                                                Txt_ranges = ' EMPTY ranges\n'
                                        
                                            if Num_sel > 1 and Bytes_count > 0:
                                                Txt_ranges = ' ranges (EMPTY or NOT)\n'
                                        
                                        # --------------------------------------------------------------------------------------------------------------------------------------------------------------
                                        
                                        line_list = []  # empty list
                                        
                                        line_list.append ('-' * Line_title)
                                        
                                        line_list.append (' ' * ((Line_title - 37) // 2) + 'SUMMARY on ' + str(datetime.datetime.now()))  # '//' : INTEGER division, needed with Python3
                                        
                                        line_list.append ('-' * Line_title +'\n')
                                        
                                        line_list.append (' FULL File Path    :  ' + File_name + '\n')
                                        
                                        if os.path.isfile(File_name) == True:
                                        
                                            line_list.append(' CREATION     Date :  ' + Creation_date)
                                        
                                            line_list.append(' MODIFICATION Date :  ' + Modif_date + '\n')
                                        
                                            line_list.append(' READ-ONLY flag    :  ' + RO_flag )
                                        
                                        line_list.append (' READ-ONLY editor  :  ' + RO_editor + '\n\n')
                                        
                                        line_list.append (' Current VIEW      :  ' + Curr_view + '\n')
                                        
                                        line_list.append (' Current ENCODING  :  ' + Curr_encoding + '\n')
                                        
                                        line_list.append (' Current LANGUAGE  :  ' + str(Curr_lang) + '  (' + Lang_desc + ')\n')
                                        
                                        line_list.append (' Current Line END  :  ' + Curr_eol + '\n')
                                        
                                        line_list.append (' Current WRAPPING  :  ' + Curr_wrap + '\n\n')
                                        
                                        line_list.append (' 1-BYTE  Chars     :  ' + str(Total_1_byte))
                                        
                                        line_list.append (' 2-BYTES Chars     :  ' + str(Total_2_bytes))
                                        
                                        line_list.append (' 3-BYTES Chars     :  ' + str(Total_3_bytes) + '\n')
                                        
                                        line_list.append (' Sum BMP Chars     :  ' + str(Total_BMP))
                                        
                                        line_list.append (' 4-BYTES Chars     :  ' + str(Total_4_bytes) + '\n')
                                        
                                        line_list.append (' CHARS w/o CR & LF :  ' + str(Total_standard))
                                        
                                        line_list.append (' EOL ( CR or LF )  :  ' + str(Total_EOL) + '\n')
                                        
                                        line_list.append (' TOTAL characters  :  ' + str(Total_chars) + '\n\n')
                                        
                                        if Curr_encoding == 'ANSI':
                                            line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' x 1 + ' + str(Total_1_byte) + ' x 1b)')
                                        
                                        if Curr_encoding == 'UTF-8' or Curr_encoding == 'UTF-8-BOM':
                                            line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' x 1 + ' + str(Total_1_byte) + ' x 1b + '\
                                            + str(Total_2_bytes) + ' x 2b + ' + str(Total_3_bytes) + ' x 3b + ' + str(Total_4_bytes) + ' x 4b)')
                                        
                                        if Curr_encoding == 'UTF-16 BE BOM' or Curr_encoding == 'UTF-16 LE BOM':
                                            line_list.append (' BYTES Length      :  ' + str(Bytes_length) + ' (' + str(Total_EOL) + ' x 2 + ' + str(Total_BMP) + ' x 2b + ' + str(Total_4_bytes) + ' x 4b)')
                                        
                                        line_list.append (' Byte Order Mark   :  ' + str(BOM) + '\n')
                                        
                                        line_list.append (' BUFFER Length     :  ' + str(Buffer_length))
                                        
                                        if os.path.isfile(File_name):
                                            line_list.append (' Length on DISK    :  ' + str(Size_length) + '\n\n')
                                        else:
                                            line_list.append ('\n')
                                        
                                        line_list.append (' NON-Blank Chars   :  ' + str(Non_blank_chars) + '\n')
                                        
                                        line_list.append (' WORDS     Count   :  ' + str(Words_count) + ' (Caution !)\n')
                                        
                                        line_list.append (' NON-SPACE Count   :  ' + str(Non_space_count) + '\n\n')
                                        
                                        line_list.append (' True EMPTY lines  :  ' + str(Empty_lines))
                                        
                                        line_list.append (' True BLANK lines  :  ' + str(Blank_lines) + '\n')
                                        
                                        line_list.append (' EMPTY/BLANK lines :  ' + str(Emp_blk_lines) + '\n')
                                        
                                        line_list.append (' NON-BLANK lines   :  ' + str(Non_blk_lines))
                                        
                                        line_list.append (' TOTAL Lines       :  ' + str(Total_lines) + '\n\n')
                                        
                                        line_list.append (' SELECTION(S)      :  ' + str(Chars_count) + Txt_chars + str(Bytes_count) + Txt_bytes + str(Num_sel) + Txt_ranges)
                                        
                                        editor.copyText ('\r\n'.join(line_list))
                                        
                                        notepad.new()
                                        
                                        editor.paste()
                                        
                                        editor.copyText('')
                                        
                                        if St_bar != 'ANSI' and St_bar != 'UTF-8' and St_bar != 'UTF-8-BOM' and St_bar != 'UTF-16 BE BOM' and St_bar != 'UTF-16 LE BOM':
                                        
                                            if Curr_encoding == 'UTF-8':  #  SAME value for both an 'UTF-8' or 'ANSI' file, when RE-INTERPRETED with the 'Encoding > Character Set > ...' feature
                                        
                                                notepad.prompt ('CURRENT file re-interpreted as ' + St_bar + '  =>  Possible ERRONEOUS results' + \
                                                                '\nSo, CLOSE the file WITHOUT saving, RESTORE it (CTRL + SHIFT + T) and RESTART script', '!!! WARNING !!!', '')
                                        
                                        # ----Aé☀𝜜-----------------------------------------------------------------------------------------------------------------------------------------------------
                                        

                                        If you’re still working or running tests with an N++ version prior to v8.0 :

                                        • First, replace every occurrence of the sub-string UTF-16 with UCS-2 in the Python script

                                        • And, of course, do not forget to remove any character above \x{FFFF} from your UCS-2 BE/LE BOM encoded files before using this script
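To locate the characters above \x{FFFF} mentioned in the second point, something like the following could be used. This is only a minimal sketch: the helper name `find_non_bmp` and the sample string are made up for illustration and are not part of the script above.

```python
# -*- coding: utf-8 -*-

def find_non_bmp(text):
    """Return the offsets of the characters that do not fit in UCS-2."""
    offsets = []
    for i, ch in enumerate(text):
        cp = ord(ch)
        # cp > 0xFFFF      : character outside the BMP (Python 3 / wide builds)
        # 0xD800 .. 0xDBFF : high surrogate, i.e. the first half of such a
        #                    character on a narrow Python 2 build
        if cp > 0xFFFF or 0xD800 <= cp <= 0xDBFF:
            offsets.append(i)
    return offsets

# Hypothetical sample : 'A', 'é', '☀' and '𝜜' (U+1D71C, outside the BMP)
sample = u'A\u00e9\u2600\U0001D71C'

print(find_non_bmp(sample))   # [3]
```

Inside PythonScript, the text to check would presumably come from `editor.getText()`, decoded to a Unicode string first if needed. The offsets returned are character offsets in that string, not byte positions in the file.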


                                        Note that the encoding problem described two posts ago, which occurs when a file without a BOM is re-interpreted with an Encoding > Character Set > ... encoding, still remains. Thus, the warning prompt is still present at the end of this final version !
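As a general illustration only (not a reproduction of the exact problem discussed earlier), the sketch below shows why counts become unreliable once the same bytes are read with another character set. The sample string and variable names are made up, and ISO 8859-1 is chosen simply as an example of a single-byte character set.

```python
# -*- coding: utf-8 -*-

# Hypothetical sample : 'A', 'é' and '☀', i.e. 1-, 2- and 3-byte chars in UTF-8
text = u'A\u00e9\u2600'

raw = text.encode('utf-8')          # the bytes as stored on disk, without a BOM

as_utf8   = raw.decode('utf-8')     # the intended interpretation
as_latin1 = raw.decode('latin-1')   # the same bytes read as ISO 8859-1

print(len(raw))         # 6  bytes
print(len(as_utf8))     # 3  characters
print(len(as_latin1))   # 6  characters, one per byte
```

The bytes never change, but the character counts do, so any summary computed after such a re-interpretation should be treated with caution.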


                                        Now, I’m going to update an old post where I explained the poor performance of the present summary feature. I’ll take the opportunity to include the instructions for understanding this improved script !

                                        Best Regards,

                                        guy038

                                        • A
                                          Alan Kilborn @guy038
                                          last edited by Feb 9, 2024, 4:17 PM

                                          @guy038

                                          You have this line in your script:

                                          line_list.append (' ' * ((Line_title - 37) / 2) + 'SUMMARY on ' + str(datetime.datetime.now()))
                                          

                                          I would suggest changing it to:

                                          line_list.append (' ' * int((Line_title - 37) / 2) + 'SUMMARY on ' + str(datetime.datetime.now()))
                                          

                                          This is because, without the int, under Python3 we see the following error:

                                          TypeError: can't multiply sequence by non-int of type 'float'
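For what it's worth, here is a minimal sketch of the difference, with a made-up value for `Line_title` (the real value comes from the script). Under Python 3, `/` always returns a float, so either `int(...)` or the floor-division operator `//` is needed. Both forms also work under Python 2, and they agree whenever the difference is not negative.

```python
import datetime

Line_title = 95   # hypothetical width, for illustration only

pad_int   = int((Line_title - 37) / 2)   # truncates the float result
pad_floor = (Line_title - 37) // 2       # integer floor division

print(pad_int, pad_floor)

# Either value can be used to repeat the padding string
header = ' ' * pad_floor + 'SUMMARY on ' + str(datetime.datetime.now())
print(header)
```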
                                          
                                          The Community of users of the Notepad++ text editor.