Colored "Find what:" zone
-
@guy038 said in Colored "Find what:" zone:
just pay attention to the small box, colored or not, between the mention Find what: and the field where you type in your search regex. This box would not be a button at all
I think it actually has more value to truly BE a button, or, more accurately, a horizontally-narrow dropdown, where one could select, for example:
N for normal mode search
E for extended mode search
R for regular expression mode search (with . does not match newline)
R. for regular expression mode search (with . matches newline)
Of course, when dropped, it could appear like this for maximum user help:
N : Normal Search Mode
E : Extended Search Mode
R : Regular Expression Search Mode (. does not match newline)
R. : Regular Expression Search Mode (. matches newline)
With such a change, the Search Mode box near the bottom of the window could be eliminated, making room for future searching goodie options! :-)
Most users could just ignore it, and if it is small that is easy to do (just leave it set at N). But for users that switch, something small and to the left of the text to find (Guy’s idea!) seems definitely worth it.
I can’t believe I would get excited about a Find UI change that I know devs won’t like, because they don’t like any of these types of things.
I did not notice that you spoke about the five Mark Style # ( with style 2 in your picture ). Indeed, I do like this solution, too !
I suppose I had forgotten that “styling” is controlled by match-case and match-whole-word settings. I think maybe that makes my suggested solution somewhat less great than I first thought. :-(
Somewhat like @astrosofista , I use automation to help keep these boxes clear when invoking a new Find, so I think that’s why I forgot.
-
Hi, @alan-kilborn and All,
Regarding the Mark style feature, unfortunately, it cannot take into account a range of more than 2,047 characters either, just like the Find what: zone :-(
Just duplicate a zone of, let's say, 5,000 characters or so ! I had never done such a test before ;-))
BR
guy038
-
@guy038 said in Colored "Find what:" zone:
Regarding the Mark style feature, unfortunately, it cannot take into account a range of more than 2,047 characters, like the Find what: zone
Well, I had not encountered this before (having not attempted such a large “styling”) but it does not surprise me because clearly “styling” is going to involve Notepad++'s internal find routines to do its job. And those, as we know, have this 2047 limit.
But really, it isn’t much of a limitation to what we’re discussing (your usage when doing before/after regex replacement “compares”), right? Perhaps you were thinking that the “styling” method may be better because it might not have this limitation?
As an alternative mechanism for such compares, I use an independent compare utility. The utility can do quick-to-invoke compares on the last two things copied to the clipboard, so it is perfect for your described application. Everyone touts the N++ Compare plugin, but I find a separate utility outside of N++, with possibly some hooks “into” N++ (via PythonScript), to be even more useful. Nothing against N++'s Compare plugin, however.
But, back to the 2047 (or is it 2046? I can’t remember) limit…
Is it truly 2047 characters, or is it 2047 bytes? Also a “can’t remember” for me. If it is “bytes” then, worst case for UTF-8 data, it might be as little as 2047-divided-by-4, or roughly 512 characters.
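To make that worst case concrete, here is a small Python sketch. It assumes the limit is a budget of UTF-8 bytes, which is only a guess at this point in the thread, and the names `LIMIT` and `max_chars` are purely illustrative (nothing here comes from Notepad++ itself).

```python
# Assumption: the limit is a byte budget applied to UTF-8 encoded text.
LIMIT = 2047  # figure quoted above; it may actually be 2046


def max_chars(sample_char: str, limit: int = LIMIT) -> int:
    """Return how many whole copies of sample_char fit in `limit` UTF-8 bytes."""
    bytes_per_char = len(sample_char.encode("utf-8"))
    return limit // bytes_per_char


# One sample character for each UTF-8 length, from 1 to 4 bytes.
for ch in ("A", "\u00e9", "\u25a3", "\U0001F3B7"):
    size = len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X}: {size} byte(s) each, at most {max_chars(ch)} chars")
```

With a 4-byte character the budget allows 2047 // 4 = 511 whole characters, which matches the "roughly 512" estimate.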
But even if it is characters, is such a limit “too small” for today’s conditions?
Maybe lobbying the N++ devs for an increase in this number is a reasonable thing to do?
-
@Alan-Kilborn said in Colored "Find what:" zone:
I think it actually has more value to truly BE a button, or, more accurately, a horizontally-narrow dropdown,
One thing that my earlier proposal does NOT consider, is changing modes via keyboard.
I don’t currently have a great idea for this, without increasing the size on the UI.
But probably it is all pointless anyway, as Find UI changes are rarely considered by the devs.
-
Hi, @alan-kilborn and All,
I did a series of tests and I’ve found out an interesting point about the Find what: filling zone !!
A) First case :
- Make a normal selection of some text or use the current selection
- Hit the Ctrl + F, Ctrl + H, Ctrl + Shift + F or Ctrl + M shortcut. So, this selection usually fills in the Find what: zone, automatically
=> In this case, the maximum size of this zone is 2,046 bytes, whatever the characters stored and the number of bytes needed to encode each character. For instance, the string Aé▣🎷 contains 1 + 2 + 3 + 4 bytes, in a UTF-8 file. So, 10 bytes are inserted in the Find what: zone
B) Second case :
- Copy the current selection to the clipboard, with Ctrl + C
- Cancel the current selection
- Hit the Ctrl + F, Ctrl + H, Ctrl + Shift + F or Ctrl + M shortcut
- Delete the contents of the Find what: zone, whatever it is
- Paste the contents of the clipboard with Ctrl + V, in the Find what: zone
=> In that case, the maximum size of this zone is 2,046 chars and :
- Each character, with Unicode code-point <= U+FFFF, stands for one character
- Each character, with Unicode code-point > U+FFFF, stands for two characters !
So the same string Aé▣🎷 contains 1 + 1 + 1 + 2 chars. Thus, 5 pseudo characters are inserted in the Find what: zone
Remark that this case B) occurs, also, for the Replace with: zone, as we need, necessarily, to fill in this zone with clipboard contents, anyway ! Therefore, the maximum size of the Replace with: zone is 2,046 characters, too, with the above distinction between characters within or outside the BMP !
BTW, we get the same results whatever the current search mode used !
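The two ways of counting described in cases A) and B) can be reproduced with a short Python sketch. It assumes that case A) counts UTF-8 bytes and that the "pseudo characters" of case B) are UTF-16 code units; the second point is an inference from the observed behavior, not something confirmed from the Notepad++ source. The helper names are illustrative only.

```python
def utf8_bytes(text: str) -> int:
    """Size of the text in UTF-8 bytes (the counting assumed for case A)."""
    return len(text.encode("utf-8"))


def utf16_units(text: str) -> int:
    """Size of the text in UTF-16 code units (the counting assumed for case B)."""
    return len(text.encode("utf-16-le")) // 2


# The sample string from the post: A, e-acute, U+25A3, U+1F3B7 (saxophone).
sample = "A\u00e9\u25a3\U0001F3B7"

print([utf8_bytes(ch) for ch in sample], "->", utf8_bytes(sample), "bytes")
print([utf16_units(ch) for ch in sample], "->", utf16_units(sample), "code units")
```

Under these assumptions the string gives 1 + 2 + 3 + 4 = 10 bytes in the first count and 1 + 1 + 1 + 2 = 5 units in the second, the same figures as above.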
Best Regards,
guy038
P.S. :
For a quick test, note the differences between cases A) and B) with the one-line text of the ▣ character (U+25A3
), below :▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣▣
First case :
- Select all the above text ( 2,109 chars )
- Open the Find dialog ( The Find what: zone is filled in, automatically )
- Tick the Wrap around option
- Click once only on the Find Next button
=> 682 characters ▣ are selected ( each char is coded with three bytes E2 96 A3 => 682 * 3 = 2,046 bytes )
Second case :
- Select again all this text ( 2,109 chars )
- Hit Ctrl + C
- Cancel the selection
- Open the Find dialog
- Delete anything in the Find what: zone
- Paste the clipboard contents with Ctrl + V, in the Find what: zone
- Tick the Wrap around option
- Click on the Find Next button
=> 2,046 characters ▣ are selected ( each char counts for itself, as its code-point is <= U+FFFF )
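The 682 versus 2,046 results can be simulated with a minimal Python sketch. It assumes the zone keeps only whole characters up to a budget of 2,046, counted in UTF-8 bytes in the first case and in UTF-16 code units in the second; how Notepad++ really truncates is not known here, and the function names are hypothetical.

```python
LIMIT = 2046                # maximum size observed in both tests
sample = "\u25a3" * 2109    # same length as the one-line test text above


def truncate(text: str, limit: int, encoding: str, unit_size: int) -> str:
    """Keep whole characters while their encoded size stays within `limit` units."""
    used = 0
    kept = 0
    for ch in text:
        cost = len(ch.encode(encoding)) // unit_size
        if used + cost > limit:
            break
        used += cost
        kept += 1
    return text[:kept]


# First case (assumed): the budget is counted in UTF-8 bytes.
by_bytes = truncate(sample, LIMIT, "utf-8", 1)
# Second case (assumed): the budget is counted in UTF-16 code units.
by_units = truncate(sample, LIMIT, "utf-16-le", 2)

print(len(by_bytes), "chars kept,", len(by_bytes.encode("utf-8")), "bytes")
print(len(by_units), "chars kept")
```

Since U+25A3 needs three bytes in UTF-8 but a single UTF-16 code unit, the first call keeps 2046 // 3 = 682 characters and the second keeps 2,046.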
-
@guy038 said in Colored "Find what:" zone:
I did a series of tests and I’ve found out an interesting point about the Find What: filling zone !!
This sounds like more than “interesting” behavior.
It sounds like “buggy” behavior.
And it sounds like possibly several bugs. :-(
-
Hi, @alan-kilborn and All,
Yeah, I admit that it’s really borderline ! Now, which case seems more logical and which case seems more interesting ?
I would say that :
- The first case seems more logical as it considers the total amount of bytes inserted in the Find what: zone, which is strictly equal to the total amount of bytes of the current selection, before calling the Find dialog
- Now, the second case, pasting contents in the Find what: zone, is more interesting, of course, because we can search for a greater range of characters ( at least, 2 times more for non-ASCII characters ) !
Best Regards,
guy038
-
I would say, that if the developers are going to set some kind of limit (and often in software a limit must be set), then for user convenience and understanding, it should be a character limit. Users don’t understand characters versus bytes (unless those numbers are strictly the same, and with UTF-8 and other encodings they are NOT).
And, different methods of entry should of course not alter the amount of data that can be accepted.
-
Hello, @alan-kilborn and All,
Yes, Alan, the character point of view should be preferred to the byte one, like, for instance, the sel : number, in status bar, which refers to characters ( not bytes ) !
So, the second case should, therefore, be preferred. However, note that, presently, there is still a difference between chars in the BMP, counting for one char, and characters outside the BMP, counting for two. A bit weird, BTW ?
BR
guy038
-
@guy038 said in Colored "Find what:" zone:
the character point of view should be preferred to the byte one, like, for instance, the sel : number, in status bar, which refers to characters ( not bytes ) !
And the Pos : number, in the status bar, bothers me somewhat, as it seems intuitively like it should be one character = one “position” change as you cursor over it. But for multibyte characters it is NOT a change of one.
The PythonScript programmer in me sort of understands this, however. Meaning how Scintilla deals with “position”.
chars in the BMP, counting for one char and characters outside the BMP, counting for two.
You might understand that way better than me.
The “two” makes me think of “surrogate pairs” but here is where I back off because I don’t know what I’m talking about. :-)
-
Hello, @Alan-kilborn and All,
Thanks, Alan, for your feedback !
Yes, I know that the Pos number, in the status bar, refers to the exact position (starting from 0), in the current file, of the first byte of the sequence needed to write a character, in a specific encoding ! For instance, the UTF-8 sequence of the 🎷 character, representing a saxophone, is the four-byte sequence (F0 9F 8E B7). So, if you insert, in a new tab, the string A🎷Z0, you can jump, with the Search > Go to... feature, when the Offset radio button is set, to :
- Pos 0, right before the A letter
- Pos 1, right before the 🎷 character
- Pos 5, right before the Z letter
- Pos 6, right before the 0 digit
And, if you try the offset 2, 3 or 4, which are all within the UTF-8 encoding of the 🎷 character, you would just jump to the next Z char !
This behavior is now correct, because I created an issue about this problem. Refer to this issue !
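The offsets listed above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the file is encoded in UTF-8; it does not call Notepad++ or Scintilla, and `utf8_offsets` is an illustrative helper name.

```python
def utf8_offsets(text: str) -> list[tuple[int, str]]:
    """Return (byte offset, character) pairs for text encoded in UTF-8."""
    offsets = []
    position = 0
    for ch in text:
        offsets.append((position, ch))
        position += len(ch.encode("utf-8"))  # advance by the encoded size
    return offsets


sample = "A\U0001F3B7Z0"  # A, saxophone (U+1F3B7), Z, 0

# Hexadecimal UTF-8 bytes of the saxophone character.
print(" ".join(f"{byte:02X}" for byte in "\U0001F3B7".encode("utf-8")))

for position, ch in utf8_offsets(sample):
    print(f"Pos {position}: right before U+{ord(ch):04X}")
```

The saxophone starts at offset 1 and occupies four bytes, so the next valid offset is 5, and offsets 2, 3 and 4 fall inside its encoding.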
Now, I think that you’re right regarding your assumption about the “two” counted for a char outside the BMP : this has really something to do with the two 16-bit code units of the surrogate mechanism !
For instance :
- The regex to get the 🎷 character uses its surrogate pair \x{D83C}\x{DFB7} ( as we cannot use its complete hexadecimal code \x{1F3B7} )
- The general regex (?-s).[\x{D800}-\x{DFFF}] finds any character outside the BMP ( Basic Multilingual Plane ), so with code-point over \x{FFFF}
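For reference, the surrogate pair of any character outside the BMP follows from the standard UTF-16 arithmetic. The sketch below (the function name `surrogate_pair` is illustrative) prints the pair in the \x{....} form used in the regex above.

```python
def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Return the (high, low) UTF-16 surrogates for a code point above U+FFFF."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP have a surrogate pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = surrogate_pair(0x1F3B7)  # the saxophone character
print(f"\\x{{{high:04X}}}\\x{{{low:04X}}}")
```

For U+1F3B7 this gives \x{D83C}\x{DFB7}, the pair quoted in the first bullet.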
BR
guy038
-