Odd (?) caret placement with auto-indent
-
@Alan-Kilborn seems strange to me
I imagine most people are not keeping the whitespace characters visible.
They’d likely think the auto-indent failed or doesn’t apply, add their own “missing” whitespace and keep going.
-
@Alan-Kilborn said in Odd (?) caret placement with auto-indent:
Does anyone else find this odd?
Most definitely.
Or, if that’s not judged odd, isn’t it strange that the caret is now followed by two tab characters?
And in this situation I don’t really care about it…but…what happened to my space character?
The space that you pushed forward was expanded into the start-of-line indentation tabs.
It’s more obvious if you have “three x”: [screenshot: before] => [screenshot: after]
If the three-space version had turned into an indent with the caret after the indent, it would have made sense to me. If the three-space version had turned into an indent with the caret after the indent and the space still after the caret, I would have considered it “correct”. With the current implementation, I’d call it a bug.
For the three-space-x, I am not sure whether I consider it a bug or not, because I could easily imagine that some users would expect the next “word” to be at the indented location, rather than having an extra space… but the caret ending up at column 0 instead of after the indent definitely seems buggish to me. (For the caret to end up in a different location depending on whether ENTER is hit before or after the space in three-space-x is highly inconsistent behavior.)
-
I would say auto-indentation is odd, period. It’s taken pretty directly from Scintilla’s notion of indentation. It produces annoying anomalies if you sometimes edit files that use tabs for indentation and sometimes edit files that use spaces, and you don’t keep changing the tabs/spaces for indent setting (even when switching from one document tab to another, if you have documents of both kinds open at the same time).
I think what’s happening is consistent; it’s just that the consistency of it depends on thinking like Scintilla, rather than like a human being.
When you press the Enter key, first Notepad++ just inserts a line ending at the caret, regardless of the auto-indent setting. At this point, the caret will be at the beginning of the new line; there might or might not be tabs and/or spaces following the caret.
Next, if auto-indent is enabled Notepad++ checks to see if the indentation of the new line (with an appropriate increment or decrement for languages where curly braces are recognized) is already correct; if it isn’t, it asks Scintilla to adjust the indentation. Scintilla sees any tabs and blanks at the beginning of the line (regardless of where the caret is) as existing indentation; it replaces the indentation according to its prevailing rules (tabs or spaces).
If the caret is at the beginning of the line and indentation follows it, Scintilla leaves the caret at the beginning of the line when it adjusts the indentation. If the caret is at the beginning of the line and there is no indentation, Scintilla inserts the indentation before the caret.
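A rough way to picture the sequence described above is the toy model below. It is only a sketch of that description in Python, not Notepad++ or Scintilla code: the function names, the tab width of 4 and the "copy the previous line's indentation" target (i.e. no curly-brace increment or decrement) are all assumptions made for illustration.

```python
def split_indent(text):
    """Split text into (leading run of spaces/tabs, remainder)."""
    rest = text.lstrip(" \t")
    return text[:len(text) - len(rest)], rest


def indent_width(ws, tab_width=4):
    """Visual width, in columns, of a run of spaces and tabs."""
    col = 0
    for ch in ws:
        if ch == "\t":
            col += tab_width - (col % tab_width)  # advance to next tab stop
        else:
            col += 1
    return col


def make_indent(width, use_tabs=True, tab_width=4):
    """Build indentation of the given width using the configured style."""
    if use_tabs:
        return "\t" * (width // tab_width) + " " * (width % tab_width)
    return " " * width


def press_enter(line, caret, use_tabs=True, tab_width=4):
    """Toy model of Enter with auto-indent (hypothetical helper).

    Returns (first_line, new_line, caret_index_in_new_line).
    """
    # Step 1: the line ending is inserted at the caret.
    first, new = line[:caret], line[caret:]
    # Step 2: the wanted indentation is that of the line just left.
    target = indent_width(split_indent(first)[0], tab_width)
    existing, rest = split_indent(new)
    if indent_width(existing, tab_width) == target:
        return first, new, 0  # already correct: nothing is adjusted
    # Step 3: any leading spaces/tabs count as "existing indentation"
    # and are replaced, wherever the caret happens to be.
    indent = make_indent(target, use_tabs, tab_width)
    # The caret stays at the line start if indentation followed it;
    # otherwise the new indentation is inserted before the caret.
    caret_index = 0 if existing else len(indent)
    return first, indent + rest, caret_index


# Enter pressed before the space in "foo bar" on a line indented by two tabs:
print(press_enter("\t\tfoo bar", 5))
# Enter pressed after the space:
print(press_enter("\t\tfoo bar", 6))
```

In this model the first call loses the space and leaves the caret at the start of the line, while the second keeps the caret after the indentation, which is the inconsistency discussed earlier in the thread.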
Documentation note:
The user manual says that Auto-indent applies when you “hit Enter at the end of a line that’s indented.” I had always assumed that was how it worked and hadn’t noticed or remembered otherwise; but clearly, from this demonstration, it applies when the caret is within a line as well. (And when the curly braces / Advanced Auto-indent is in effect, the original line doesn’t have to be indented, either.)
Very much my personal feeling: Scintilla’s concept of indentation is not so good. (The same use of tabs or spaces without regard to context that occurs with auto-indent can occur in some other situations: such as when typing in virtual space, as in a rectangular or thin selection, when some of the lines crossed have no characters other than blanks and tabs. Existing tabs or blanks get converted to whatever is set as the indentation character.) I don’t yet have a simple proposal to make this behave more rationally, though.
-
@Coises said in Odd (?) caret placement with auto-indent:
The user manual says that Auto-indent applies when you “hit Enter at the end of a line that’s indented.”
@PeterJones, the user manual should get updated as the auto-indent stuff also happens when you hit enter in the middle of a line that is indented. I suspect the manual can just say “when you hit enter” and not have the part about being at the end of a line.
If at the time you press enter there were spaces and/or tabs following the caret position then the editor (Notepad++ and/or Scintilla, I’m not sure which) replaces those spaces/tabs with whatever it’s currently configured for indenting and also leaves the caret at column 1 instead of indenting it. The part about the caret at column 1 seems like a glitch. The resulting indentation tabs/spaces look good and any visible text that had followed those spaces/tabs is indented.
The space/tab replacement behavior can astonish people who had expected the original spaces/tabs to be kept in the text and that the resulting line would be indented by the normal indent plus some leading spaces/tabs that you had there.
The space replacement and then positioning at column 1 behavior also happens if the caret is in the initial or leading spaces/tabs of a line. In that case though you end up with a line with only spaces/tabs and that also defines the indenting for the new line.
-
@mkupper said in Odd (?) caret placement with auto-indent:
the user manual should get updated as the auto-indent stuff also happens when you hit enter in the middle of a line that is indented. I suspect the manual can just say “when you hit enter” and not have the part about being at the end of a line.
I could’ve sworn when testing/documenting the new indentation feature in v8.7 that I verified that it didn’t auto-indent when the ENTER was in the middle of the line, but I just confirmed that in v8.7.2 back through v8.6.9, they all do their auto-indent even in the middle; I guess I somehow messed up my test. I’ll go work on a fix.
update: my updated text:
I will leave the PR open for a while, if you want to comment on it.
-
@PeterJones, I feel the Note part should make it clearer that the behavior only happens when there were spaces/tabs immediately following the caret at the time enter was pressed.
FWIW - I did some testing with the other characters matched by a regexp \s and found that only spaces (\x20) and tabs (\x09) get replaced when they are immediately following the caret’s position at the time someone presses enter. Whitespace characters other than space and tab are retained in the text.
\s seems to match the following, which I had used for testing:
ch    Unicode    Title
~\t~  \x{0009}   ^I - HT - CHARACTER TABULATION
~\n~  \x{000A}   ^J - LF - LINE FEED (LF)
~~    \x{000B}   ^K - VT - LINE TABULATION
~~    \x{000C}   ^L - FF - FORM FEED (FF)
~\r~  \x{000D}   ^M - CR - CARRIAGE RETURN (CR)
~ ~   \x{0020}   SPACE
~ ~   \x{00A0}   NBSP - NO-BREAK SPACE
~ ~   \x{1680}   OSPM - OGHAM SPACE MARK
~~    \x{180E}   MVS - MONGOLIAN VOWEL SEPARATOR
~ ~   \x{2000}   NOSP - EN QUAD
~ ~   \x{2001}   MOSP - EM QUAD (mutton quad; nominally, the height of the font)
~ ~   \x{2002}   ENSP - EN SPACE (nut; half an em)
~ ~   \x{2003}   EMSP - EM SPACE (mutton; nominally, a space equal to the type size in points; may scale by the condensation factor of a font)
~ ~   \x{2004}   3/MSP - THREE-PER-EM SPACE (thick space)
~ ~   \x{2005}   4/MSP - FOUR-PER-EM SPACE (mid space)
~ ~   \x{2006}   6/MSP - SIX-PER-EM SPACE (in computer typography sometimes equated to thin space)
~ ~   \x{2007}   FSP - FIGURE SPACE (space equal to tabular width of a font; this is equivalent to the digit width of fonts with fixed-width digits)
~ ~   \x{2008}   PSP - PUNCTUATION SPACE (space equal to narrow punctuation of a font)
~ ~   \x{2009}   THSP - THIN SPACE (a fifth of an em (or sometimes a sixth))
~ ~   \x{200A}   HSP - HAIR SPACE (thinner than a thin space; in traditional typography, the thinnest space available)
~ ~   \x{2028}   LS - LINE SEPARATOR
~ ~   \x{2029}   PS - PARAGRAPH SEPARATOR
~ ~   \x{202F}   NNBSP - NARROW NO-BREAK SPACE (commonly abbreviated NNBSP; a narrow form of a no-break space, typically the width of a thin space or a mid space)
~ ~   \x{205F}   MMSP - MEDIUM MATHEMATICAL SPACE (abbreviated MMSP; four-eighteenths of an em)
~ ~   \x{3000}   ISSP - IDEOGRAPHIC SPACE
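To make the finding above concrete, here is a small Python sketch of "only spaces and tabs right after the caret are absorbed". The helper name is hypothetical and this only mirrors the observed behavior reported in this post; it is not taken from Notepad++ or Scintilla.

```python
def absorbed_and_kept(text_after_caret):
    """Split the text following the caret into (absorbed, kept).

    Assumption from the tests described above: only a leading run of
    U+0020 (space) and U+0009 (tab) is treated as indentation and replaced;
    every other whitespace character stays in the text.
    """
    kept = text_after_caret.lstrip(" \t")
    absorbed = text_after_caret[:len(text_after_caret) - len(kept)]
    return absorbed, kept


# A space and a tab are absorbed, but the no-break space that follows is kept.
print(absorbed_and_kept(" \t\u00a0word"))
# An em space directly after the caret is not absorbed at all.
print(absorbed_and_kept("\u2003word"))
```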
-
@mkupper said in Odd (?) caret placement with auto-indent:
I feel the Note part should make it clearer that the behavior only happens when there were spaces/tabs immediately following the caret at the time enter was pressed.
How can I make it more clear? I already say, “For either Basic or Advanced mode, if you hit <kbd>Enter</kbd> in the middle of an indented line, before one or more space or tab characters (like before the space in two words)…”. I suppose I could explicitly call out the caret, saying “For either Basic or Advanced mode, if you hit <kbd>Enter</kbd> when the caret is in the middle of an indented line, before one or more space or tab characters (like before the space in two words)” … but I thought it was already pretty explicit even without the word “caret” being included.
Do you have an alternate suggestion?
-
Sorry, I’ve been AFK since originally posting, but nice to see thoughtful replies on my topic.
It’s enough to make one want to script their own auto-indent. :-)
(But not post that here, because my script wouldn’t take tab characters into account, as I don’t use them)
-
@PeterJones said in Odd (?) caret placement with auto-indent:
before one or more space or tab characters
The anal part of me :-) keeps getting stuck on “before one or more space or tab characters.” I had used “immediately before” rather than the more vague “before” which leaves open that there may be one or more characters between the caret and the spaces/tabs.
Overall, are we trying to document “defined” or “expected” behavior? Should the space/tab replacement thing be considered a “feature”, a “bug”, a “glitch”, or an “anomaly”? If the manual’s intent is to document defined/expected behavior then it may be better to remain silent on the space/tab replacement detail.
Likewise, the part about leaving the caret at column 1 seems to me to be a Notepad++ bug. Should that be included in the manual?
-
“immediately before”
I wouldn’t’ve thought it would be needed, but I also don’t object to that phrasing.
Overall, are we trying to document “defined” or “expected” behavior?
Yes. ;-)
If the manual’s intent is to document defined/expected behavior then it may be better to remain silent on the space/tab replacement detail.
It’s hard to say: if it’s a bug that will be immediately fixed, then I can understand not documenting it. But if it’s going to be a bug that gets ignored by the developer for the next 5+ years, I think it should be documented.
Recently, a lot of the improvements to the manual have been about things that were assumed to be common knowledge, or clarifying behavior that was not obviously intuitive to the new user.
But if you think that the space clarification and the cursor-to-start-of-line would be more of a distraction than a help in the UM, I could just take out the new Note, and just leave the corrected phrasing of the individual modes.
-
@PeterJones said in Odd (?) caret placement with auto-indent:
But if it’s going to be a bug that gets ignored by the developer for the next 5+ years, I think it should be documented.
I second this idea.
-
@Alan-Kilborn, @mkupper, @PeterJones:
Would it make sense to anyone besides me to request changing the implementation of Auto-indent: Basic so that instead of relying on Scintilla’s indentation rules, it simply checked the line in which the Enter key was pressed for leading tabs and spaces, and, if it found any, inserted a copy of those before the caret in the new line, without regard to whether tabs or blanks are set as the indentation mode and without changing anything following the caret?
Since Basic never needs to create, expand or reduce indentation (like Advanced can) — it’s always duplicating the existing indentation, except for the case where spaces or tabs follow the caret — it shouldn’t be a problem to make it just do that: duplicate the existing indentation and not be reliant on having the tabs/spaces setting right for the specific file. And if you had tabs or spaces after the caret, they’d still be there following the caret in the new line, as a normal human being would expect.
I can request this and prepare an appropriate PR to implement it, but first I wanted to ask if others would find it logical. The major argument against it that I can see would be that it should really work this way for Advanced, too — but that’s more complex to implement, with more “OK, so what do we do if…” cases, and might thus be less likely to be deemed acceptable.
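For anyone who wants to picture the proposed Basic behavior, here is a minimal Python sketch of the idea. It is an illustration only, not the commit: the function name is hypothetical, and it assumes that the indentation to copy is the run of tabs/spaces at the start of the text before the caret.

```python
def proposed_basic_enter(line, caret):
    """Sketch of the proposed Auto-indent: Basic (hypothetical helper).

    Returns (first_line, new_line, caret_index_in_new_line).
    """
    first, rest = line[:caret], line[caret:]
    # Copy the leading tabs/spaces verbatim, ignoring the configured
    # tabs-or-spaces setting.
    stripped = first.lstrip(" \t")
    indent = first[:len(first) - len(stripped)]
    # Insert the copy before the caret; whatever followed the caret
    # (including spaces or tabs) is left untouched.
    return first, indent + rest, len(indent)


# Enter pressed before the space in "foo bar" on a line indented by two tabs:
print(proposed_basic_enter("\t\tfoo bar", 5))
# Mixed indentation is duplicated exactly as found:
print(proposed_basic_enter("  \tfoo", 6))
```

In this sketch the space after the caret survives and the caret ends up after the copied indentation, whichever indentation characters the file happens to use.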
-
I said in Odd (?) caret placement with auto-indent:
Would it make sense to anyone besides me to request changing the implementation of Auto-indent: Basic so that instead of relying on Scintilla’s indentation rules, it simply checked the line in which the Enter key was pressed for leading tabs and spaces, and, if it found any, inserted a copy of those before the caret in the new line, without regard to whether tabs or blanks are set as the indentation mode and without changing anything following the caret?
I have submitted this as Issue #15843. Please comment if you think it does, or does not, make sense.
-
I also added Issue #15845, which directly notes the problem described in the original post. (Hopefully @Alan-Kilborn doesn’t mind that I did that.)
I suggested a solution, but not a specific commit, as the Auto-indent: Advanced code addresses many special cases, and without further examination I am not sure exactly where the fix is correctly applied.
-
@Coises said in Odd (?) caret placement with auto-indent:
Hopefully @Alan-Kilborn doesn’t mind that I did that.
Of course not.
-
You’ve proposed changing “basic” and “advanced” auto-indent in two separate GitHub issues. I understand why you did this (in case one is accepted and the other is not).
An offshoot from this is, if the basic one is accepted and the advanced one is not, it does me no good. Why? Well, because auto-indent is a global feature, not a file-type specific feature.
In theory I would turn basic auto-indent on for “normal” text files, to get the benefits of a “fix” for such files. However, for Python and C++ files, I want/need advanced auto-indent. But, I can’t do this (again, it’s global: basic OR advanced).
I recently lobbied the N++ developer to make basic/advanced a per-filetype setting, but I lost (I was told “let’s see if other users need this”).
-
@Alan-Kilborn said in Odd (?) caret placement with auto-indent:
In theory I would turn basic auto-indent on for “normal” text files, to get the benefits of a “fix” for such files. However, for Python and C++ files, I want/need advanced auto-indent. But, I can’t do this (again, it’s global: basic OR advanced).
The code for Auto-indent: Advanced splits into three cases: C-like languages, Python and everything else; everything else uses the same code as Basic. The same change I proposed for Basic could be made there.
The reason I separated the two cases is that what I propose doing to Basic is simple and straightforward. I want to ignore the configured indent settings and just duplicate the indentation of the previous line. I was able, with reasonable confidence, to devise a commit which I believe implements that without causing side-effects.
The code for Advanced (except when it falls back to Basic) is far more complex, and of course it can’t ignore the indentation settings. The reason I didn’t suggest applying my change for Basic to Advanced when it falls back to Basic is that Advanced doesn’t just copy the existing indentation, it recomputes it, whether it is changing or not, according to the indentation settings (tabs or spaces). I thought it might be bizarre to have Auto-indent: Advanced sometimes change the indentation style to match the configuration and sometimes not.
I suggested something as to how Auto-indent: Advanced could be changed to fix the specific problem you mentioned, but I did not give an example commit, because I have not yet been successful at creating one. So far, for each attempt I’ve made I’ve quickly found some unwanted side-effect.
Part of the difficulty, also, is that after some experimentation with the unmodified version, I found some corner cases for which I do not follow the logic of the current behavior (or know whether some of it might be unintended). Example:
void func() { return; }
Start with that in language C++. The results of placing the caret after “{” and pressing Enter, then placing the caret after “;” and pressing Enter are not the same as the results of placing the caret after “;” and pressing Enter, then placing the caret after “{” and pressing Enter. In the latter order, the blank in “; }” is not absorbed; however, if you start with the line indented one level and do the same thing, the blank is absorbed. Either way, the caret winds up at the beginning of the line, as per your initial complaint.