Determining if the active document is mono-spaced
-
I thought it would be worth asking if anyone has already solved this problem.
From a plugin, how can I determine if the active document is mono-spaced: meaning that for each style used by the current lexer, the font for that style is mono-spaced, and all the fonts have the same character width?
Motivation: It turns out the “full analysis” step of elastic tabstops processing in my Columns++ plugin can be at least 65 times faster if it can count characters rather than measure text widths… so it would make elastic tabstops useful for considerably larger files if I could verify that every character will occupy the same width on the display. The alternatives would be either forcing the user to set an “assume fixed pitch” toggle to enable fast processing of larger files, or trying to guess at the size of spans of text and (somehow) detect and correct false guesses that affect the results without taking as much time as it would to measure in the first place.
Anybody know some clever, fast way to do it?
-
@Coises ,
That allows you to check the font name, the font size, and whether the font for a given styleID in the active lexer is monospaced. So you would just have to loop over all the styleID values for the active lexer and check those message results for each one (shouldn’t take too long; probably don’t have to run that check very often unless you’re afraid the user is frequently changing their Style Configurator)
(Don’t be too impressed with my arcane knowledge: I didn’t actually know about that third message until you asked the question. I was just looking up the naming for the first two, which I was confident should likely exist, and noticed the third in the same group, which makes the whole thing simpler.)
-
@PeterJones said in Determining if the active document is mono-spaced:
That allows you to check the font name, the font size, and whether the font for a given styleID in the active lexer is monospaced. So you would just have to loop over all the styleID values for the active lexer and check those message results for each one (shouldn’t take too long; probably don’t have to run that check very often unless you’re afraid the user is frequently changing their Style Configurator)
That makes sense. Thanks. It does lead to another question: How can one enumerate “all the styleID values for the active lexer”? I don’t see any way to ask either Notepad++ or Scintilla directly for that information; so I suspect Notepad++ must ask the lexers themselves when it figures out how to populate the Style Configurator dialog. Hopefully I can use that as a model — or does one just loop through all 256 possibilities and assume that any that aren’t used will be set to the default anyway, so they won’t change the result? If that assumption is valid, getting the STYLE_DEFAULT font name first and then skipping anything that’s the same would be reasonably efficient and simpler than figuring out how to communicate with a lexer.
(Don’t be too impressed with my arcane knowledge: I didn’t actually know about that third message until you asked the question. I was just looking up the naming for the first two, which I was confident should likely exist, and noticed the third in the same group, which makes the whole thing simpler.)
OK. ;-) BTW, it doesn’t work that way… closer reading (and testing, to be sure I wasn’t misreading) indicates that SCI_STYLE[GET|SET]CHECKMONOSPACED is only about telling Scintilla whether to bother to check if the font might be monospaced; it doesn’t reflect what Scintilla determined.
-
@Coises said:
That makes sense. Thanks. It does lead to another question: How can one enumerate “all the styleID values for the active lexer”? I don’t see any way to ask either Notepad++ or Scintilla directly for that information; so I suspect Notepad++ must ask the lexers themselves when it figures out how to populate the Style Configurator dialog.
AFAIK, Notepad++ looks through stylers.xml for the current theme, and any styleIDs that are defined will be present in the Style Configurator; you could do the same.
and assume that any that aren’t used will be set to the default anyway, so they won’t change the result?
Any that aren’t in stylers.xml will not have their font/size/color changed by Notepad++.
If that assumption is valid, getting the STYLE_DEFAULT font name first and then skipping anything that’s the same would be reasonably efficient and simpler than figuring out how to communicate with a lexer.
STYLE_DEFAULT is style #32 in Scintilla; style #0 is the lexer's own default style (for most, if not all, lexers)
OK. ;-) BTW, it doesn’t work that way… closer reading (and testing, to be sure I wasn’t misreading) indicates that SCI_STYLE[GET|SET]CHECKMONOSPACED is only about telling Scintilla whether to bother to check if the font might be monospaced; it doesn’t reflect what Scintilla determined.
Ah, okay; I didn’t study it deeply; on the first skim, it sounded like it might be what you wanted. Sorry.
-
It looks to me from reading the Scintilla doc like a combination of SCI_STYLEGETCHECKMONOSPACED and a quick scan to see if all your characters are ASCII would be enough for most purposes. The potential pitfall I see here is weird stuff like BEL and ENQ that are technically ASCII but are represented as big boxes in Notepad++. So you’d probably have to exclude those in your calculation.
-
@PeterJones said in Determining if the active document is mono-spaced:
Any that aren’t in stylers.xml will not have their font/size/color changed by Notepad++.
I see. It looks like Notepad++ does a SCI_STYLECLEARALL after setting up STYLE_DEFAULT and before setting up other styles, so there won’t be “noise” in unused styles. A fairly naïve approach — just getting the width of a blank in STYLE_DEFAULT and then looping through all 256 possible styles to check whether the widths of a blank and a W are equal to that — is looking like it might work.
Thank you for pointing me in a good direction.
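A minimal sketch of the naïve approach described above, using Scintilla's SCI_TEXTWIDTH message to measure a blank and a W in each style. The names `SciCall` and `uniformCharWidth` are hypothetical, `SciCall` is an assumed stand-in for the plugin's way of messaging Scintilla, and the numeric constants are written out from memory of Scintilla.h (real code should include that header).

```cpp
#include <cstdint>
#include <functional>

// Assumed to match Scintilla.h; include that header in real code.
constexpr unsigned SCI_TEXTWIDTH = 2276;
constexpr int STYLE_DEFAULT = 32;
constexpr int STYLE_MAX = 255;

// Hypothetical wrapper around SendMessage (or Scintilla's direct function).
using SciCall = std::function<std::intptr_t(unsigned msg,
                                            std::uintptr_t wParam,
                                            std::intptr_t lParam)>;

// Returns the shared character width in pixels if a blank and a 'W' measure
// the same as a STYLE_DEFAULT blank in every style; returns 0 otherwise.
int uniformCharWidth(const SciCall& sci) {
    auto width = [&](int style, const char* text) {
        return static_cast<int>(sci(SCI_TEXTWIDTH,
                                    static_cast<std::uintptr_t>(style),
                                    reinterpret_cast<std::intptr_t>(text)));
    };
    const int expected = width(STYLE_DEFAULT, " ");
    if (expected <= 0) return 0;  // measurement failed or was meaningless
    for (int style = 0; style <= STYLE_MAX; ++style) {
        if (width(style, " ") != expected || width(style, "W") != expected) {
            return 0;  // this style is proportional or uses a different width
        }
    }
    return expected;
}
```

The loop follows the post and checks all 256 styles. Some of Scintilla's predefined styles, such as the line-number margin style, do not apply to document text, so a stricter version might skip those to avoid false negatives.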
-
A demonstrated use case for SCI_STYLEGETCHECKMONOSPACED is more efficient line wrapping: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/10193#issuecomment-1114819246
In practice, it’s only effective when integrated with a suite of line-measuring optimizations. If line wrapping is slowing things down, the developer of Notepad2 would be the best person to ask.
Otherwise, I don’t think it has much to do with the OP’s problem statement. Line wrapping is the editor’s job, not a plugin’s.
-
@Mark-Olson said in Determining if the active document is mono-spaced:
The potential pitfall I see here is weird stuff like BEL and ENQ that are technically ASCII but are represented as big boxes in Notepad++. So you’d probably have to exclude those in your calculation.
Good point — even if all the fonts in use are fixed pitch, this case (and probably others) could yield an exception. At this point I’m thinking the practical answer will be to get a good guess as to whether the file is “mostly” mono-spaced, and then find a way to identify and adjust for the exceptions as they scroll into view.
Thanks.
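One cheap way to flag the exceptions discussed in this thread is a byte scan that accepts only printable ASCII. The sketch below is an assumption-laden illustration: `isPlainAscii` is a hypothetical helper, and it assumes the text passed in is a single cell's contents with tabs already split off. Anything it rejects (control characters such as BEL and ENQ, DEL, or any non-ASCII byte) would fall back to real width measurement.

```cpp
#include <string_view>

// True if every byte is printable ASCII (0x20 through 0x7E).
// Control characters, DEL, and all non-ASCII bytes cause a false result.
bool isPlainAscii(std::string_view text) {
    for (unsigned char c : text) {
        if (c < 0x20 || c > 0x7E) return false;
    }
    return true;
}
```

Text that passes this check can be sized by counting characters when every style shares one character width; text that fails can be measured the slow way.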