@B-M ,
I was actually cogitating over this for the past day, because I thought a simplified version of your request might be doable in a scripting plugin:
If you are willing to use a script for the PythonScript plugin and run it manually, then this might be enough for you.
23883-highlight-frequent-words.py PythonScript InstructionsWhen you run the script, it will analyze the active file’s wordlist, and assign a foreground color to the N most-frequent words from the active file. (It will ignore 1 and 2-letter words, and anything in the list of stopwords. The default stopwords include some linking verbs, conjunctions, articles, and prepositions; there are probably other words that should be in the default list, but it was enough for proof-of-concept.)
You can edit the script to change the “stopwords” by adding words into the string. You can change the palette by editing the list in def createPalette(): , and by doing that, you can change how many words will be colored.
The 2022-Dec-13 source for the script is included in the post. The link above will always take you to my most recent version (in case I make future bugfixes or feature additions).
Script
# encoding=utf-8 """in response to https://community.notepad-plus-plus.org/topic/23883/ When run, this script will count all the words (except for stopwords), and assign a color to each word in the list of Top N Colors Edit the contents of the stopwords variable to change which words are ignored - the default stopwords are some linking verbs, conjunctions, articles, and prepositions - words of 1 or two characters are automatically annoyed Edit the list of appended colors in createPalette() to change the color palette - you can change the colors to your liking - you can add or remove colors to change the number of words that will be colored - the default color list is based on mspaint.exe's palette on my laptop Based heavily on these: - @Alan-Kilborn https://community.notepad-plus-plus.org/post/61801 - @Ekopalypse https://github.com/Ekopalypse/NppPythonScripts/blob/master/npp/EnhanceAnyLexer.py """ from Npp import editor,notepad,console, INDICATORSTYLE, INDICFLAG, INDICVALUE from random import randint class FrequentHighlighter(object): INDICATOR_ID = 0 stopwords = """ be am are is was were been has have and but or nor for yet so the an aboard about above across after against along amid among around as at before behind below beneath beside between beyond but by concerning considering despite down during except following for from in inside into like minus near next of off on onto opposite out outside over past per plus regarding round save since than through till to toward under underneath unlike until up upon versus via with within without """.split() @staticmethod def rgb(r, g, b): ''' Helper function Retrieves rgb color triple and converts it into its integer representation Args: r = integer, red color value in range of 0-255 g = integer, green color value in range of 0-255 b = integer, blue color value in range of 0-255 Returns: integer ''' return (b << 16) + (g << 8) + r def match_found(self, m): self.word_matches.append(editor.getTextRange(m.span(0)[0], m.span(0)[1]).lower()) def createPalette(self): self.palette = [] self.palette.append(self.rgb(237,28,36)) self.palette.append(self.rgb(255,174,201)) self.palette.append(self.rgb(255,127,39)) self.palette.append(self.rgb(255,201,14)) self.palette.append(self.rgb(255,242,0)) self.palette.append(self.rgb(239,228,176)) self.palette.append(self.rgb(34,177,76)) self.palette.append(self.rgb(181,230,29)) self.palette.append(self.rgb(0,162,232)) self.palette.append(self.rgb(153,217,234)) self.palette.append(self.rgb(63,72,204)) self.palette.append(self.rgb(112,146,190)) self.palette.append(self.rgb(163,73,164)) self.palette.append(self.rgb(200,191,231)) self.word_palette = {} # word palette will be populated later def go(self): self.createPalette() editor.indicSetStyle(self.INDICATOR_ID, INDICATORSTYLE.TEXTFORE) editor.indicSetFlags(self.INDICATOR_ID, INDICFLAG.VALUEFORE) self.word_matches = [] self.histogram_dict = {} editor.research('[[:alpha:]]{3,}', self.match_found) for word in self.word_matches: if word not in self.histogram_dict: self.histogram_dict[word] = 1 elif word not in self.stopwords: self.histogram_dict[word] += 1 elif word in self.stopwords: if word in self.histogram_dict: del self.histogram_dict[word] output_list = [] for word in sorted(self.histogram_dict, key=self.histogram_dict.get, reverse=True): if len(self.palette) > 0: self.word_palette[word] = self.palette.pop(0) # equivalent to perl's shift, where pop() without argument is perl's pop output_list.append('{}={} [0x{:06x}]'.format(word, self.histogram_dict[word], self.word_palette[word])) elif self.histogram_dict[word] > 1: output_list.append('{}={}'.format(word, self.histogram_dict[word])) #console.write('\r\n'.join(output_list) + "\r\n\r\n") # do the coloring self.colored = 0 editor.research('[[:alpha:]]{3,}', self.match_colorize) def match_colorize(self, m): a,b = m.span(0) word = editor.getTextRange(m.span(0)[0], m.span(0)[1]).lower() if word in self.histogram_dict: if not word in self.word_palette: return # console.write("{:<3d} => 0x{:06x}\tword='{}'[{}] from {}..{} = len:{}\n".format(self.colored, self.word_palette[word], word, self.histogram_dict[word],a,b,b-a)) editor.setIndicatorCurrent(self.INDICATOR_ID) editor.setIndicatorValue(self.word_palette[word]) editor.indicatorFillRange(a, b-a) self.colored = self.colored + 1 FrequentHighlighter().go()