Notepad++ doesn't find the accented characters with the "Find in files" function
-
Hello,
I’m French, and there has been a really nasty issue for many versions now. It’s no longer possible to properly use the “Find in Files” function to search for terms with accented characters. The last working version was 7.6.6. No accented character is recognized, so the search doesn’t find anything unless the files are already opened, which defeats the very purpose of this function. There isn’t really a category for bugs in the tool itself (just plugin ones), so please tell me if I’m not posting my request in the right place.
-
Hello, @atelier-traduction and All,
I’m French too, but I’ve never seen that behavior, even with the latest 7.8.7 N++ version!? Of course, my default ANSI code page is Windows-1252, where most of the accented characters are in the range [\xC0-\xFF].
And a Find in Files search of, for instance, the regex [\xC0-\xFF]+ does match any run of consecutive accented characters!
Best Regards,
guy038
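As a side illustration of the range mentioned in this reply, here is a minimal Python sketch (not Notepad++ code) that encodes a sample sentence in Windows-1252 and applies the byte-level equivalent of the regex [\xC0-\xFF]+. The sample sentence and the variable names are made up for the example.

```python
import re

# Hypothetical sample text, stored the way an ANSI (Windows-1252) file would hold it.
text = "créée à Noël"
raw = text.encode("cp1252")

# Byte-level equivalent of the regex [\xC0-\xFF]+ quoted in the post.
pattern = re.compile(rb"[\xC0-\xFF]+")

# Collect each run of consecutive bytes in that range, decoded back for display.
runs = [m.group().decode("cp1252") for m in pattern.finditer(raw)]
print(runs)
```

In Windows-1252 the letters é, à and ë are the single bytes 0xE9, 0xE0 and 0xEB, so they all fall inside the range the regex targets.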
-
-
@Atelier-Traduction said in Notepad++ doesn't find the accented characters with the "Find in files" function:
We can see the exact problem here
How does that show the problem?
I think there are going to be people that will really put in an extreme amount of effort to help you, but you’re going to have to give them more specifics to go on.
Oftentimes people experience an issue and feel that “everyone must see the same thing, it is so obvious” and that means (because it is so obvious) no details are needed. In truth, perhaps no one is seeing it besides the person (or 3) with the problem.
-
Ok, I’m on the 32-bit version of Notepad++ (the 64-bit version has the same problem) on Windows 10 64-bit (Windows 7 64-bit has the same problem), and I’m using the default settings. And yeah, 3/3 users in the same case is pretty curious in my opinion, but maybe not for you, I suppose. I also mentioned the last working version (7.6.6). Everything in the folders is in its default state too, with the default plugins.
Here’s how it shows the issue: if I search for “rétablissement” in “Find in Files” with any filter and, once again, the default settings, no “rétablissement” is found in any file not opened in Notepad++, because the ‘é’ character itself is not recognized. The image above shows a □ character instead of ‘é’ when I just search for the term “em”. That is proof that the versions after 7.6.6 don’t recognize accented characters, or extended ASCII ones if you prefer.
And well, I found another very interesting fact. The bug occurs only when Notepad++ searches unopened files that are encoded in ANSI. If it searches unopened UTF-8 (without BOM) files in which the accented characters are properly UTF-8 encoded (so real UTF-8 and not ANSI), all the accented characters are recognized without any problem.
So, as I see it, unopened files without a BOM are treated as UTF-8 (without BOM) and never as ANSI, which is why Notepad++ can’t interpret the encoding of these characters.
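To make that hypothesis concrete, here is a minimal Python sketch of what happens when ANSI bytes are read as if they were UTF-8. It only illustrates the suspected mechanism and is not how Notepad++ is actually implemented; the sample sentence and the helper name `search_as` are invented for the example, and Windows-1252 is assumed as the ANSI code page.

```python
needle = "rétablissement"
sentence = "Le rétablissement est terminé."

# The same sentence saved two ways: ANSI (Windows-1252 assumed) and UTF-8 without BOM.
ansi_bytes = sentence.encode("cp1252")
utf8_bytes = sentence.encode("utf-8")


def search_as(data: bytes, term: str, assumed_encoding: str):
    """Decode `data` with the assumed encoding (bad bytes become U+FFFD) and search."""
    decoded = data.decode(assumed_encoding, errors="replace")
    return term in decoded, decoded


# ANSI file wrongly treated as UTF-8: the lone byte 0xE9 is not valid UTF-8.
found_wrong, shown_wrong = search_as(ansi_bytes, needle, "utf-8")
found_em, _ = search_as(ansi_bytes, "em", "utf-8")

# Correct guesses for comparison.
found_ansi, _ = search_as(ansi_bytes, needle, "cp1252")
found_utf8, _ = search_as(utf8_bytes, needle, "utf-8")

print(found_wrong, found_em, found_ansi, found_utf8)
print(shown_wrong)
```

Under these assumptions the accented search term is missed only in the mis-decoded ANSI case, while “em” still matches and the ‘é’ is displayed as a replacement character, which is consistent with the squares visible in the screenshot.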
Of course, if you need more details, I will be glad to provide them.
Otherwise, I’m here to try to get in touch with the developer himself, because he doesn’t want to be bothered by mail for bug reports and there is not an appropriate category here for sending bug reports about the tool itself.
-
I don’t know about other readers, but the additional exposition definitely helped me understand more about it. Thank you.
I think @guy038 might be in the best position to help you explore this. Hopefully he continues to contribute to this thread.
Meanwhile, of interest might be THIS . Of course, you could be the author of that; I can’t tell from the usernames, but they are certainly dissimilar (means nothing, though).
I’m here to try to get in touch with the developer himself, because he doesn’t want to be bothered by mail for bug reports
Good luck with that. Many have tried. Many have failed. :-)
there is not an appropriate category in here to send bug reports of the tool itself.
Correct. It wouldn’t be “here”. THIS describes the actual bug report process.
-
The last working version was the 7.6.6 one.
I see a “big” version bump in Scintilla between N++ 7.6.6 (uses Scintilla 3.5.6) and N++ 7.7 (uses Scintilla 4.1.4). Maybe there is something in that which drives this behavior change? Of course, maybe not as well…
-
Ok, thank you ^^
At least, I can try to write a bug report on GitHub with all the details, even if others tried before. You never know…
-
@Atelier-Traduction said in Notepad++ doesn't find the accented characters with the "Find in files" function:
I can try to write a bug report on Github
I would certainly say you should ADD to the existing one I linked earlier…did you see that link?
-
Yeah. If you think this is better, ok, I will just do that.
-
https://github.com/notepad-plus-plus/notepad-plus-plus/issues/7668
Done, let’s pray now.
-
THIS seems to give some weight to my “Scintilla update” theory (or guess, actually).
-
Hi, @atelier-traduction, @alan-kilborn and All,
Aaaaah, yes :-(( @atelier-traduction, you’re quite right about it. I had never noticed that bug until now! Indeed, as I reply on this forum in English, all my text files are mainly written in English too, so without any accented characters!
Compared to you, with my local N++ configuration, the accented characters are not even displayed at all in the Search result panel!
This only happens under three conditions:
- A search has been successful and the file is currently seen in the Search result panel
- The file is presently not opened in a Notepad++ session, either in the main or in the secondary session
- The file is ANSI encoded (all the other Unicode encodings are OK!)
Note that, in addition, if the search itself contains accented characters, the Find result panel does not show the ANSI file at all!
I also confirm, as you said, that this bug has occurred since Notepad++ v7.7.
I have no time at present, but I’ll open an issue on GitHub very soon!
BR
guy038
-
-
I would just like to add that the “Search result” panel doesn’t find anything when accented characters are searched for.
https://community.notepad-plus-plus.org/assets/uploads/files/1592392282525-notepad_isssue.png
You can see these squares because I entered “em” as the search term. If I enter “rétablissement”, the “Search result” panel doesn’t find anything, so I don’t agree with the “the file is currently seen in the Search result panel” part. Or maybe I misinterpreted your phrasing.
-
Hi, All,
Done! See my comment on GitHub: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/7668#issuecomment-645602955
After further tests, the problem occurs with any character between \x{0080} and \x{00FF}.
BR
guy038
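For what it is worth, that range is exactly where a one-byte encoding and UTF-8 stop agreeing, which a short Python check can show. This is only a sketch of the encoding arithmetic, not a reproduction of the Notepad++ bug; it uses Latin-1 as a stand-in for the ANSI code page (an assumption, chosen because Python’s cp1252 codec leaves a few positions such as 0x81 undefined), and the helper name `is_valid_utf8` is invented for the example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


differs = []        # code points whose one-byte and UTF-8 encodings are different
invalid_alone = []  # code points whose single byte is not valid UTF-8 by itself

for code_point in range(0x80, 0x100):
    char = chr(code_point)
    one_byte = char.encode("latin-1")  # single byte, e.g. b"\xe9" for é
    utf8 = char.encode("utf-8")        # two bytes, e.g. b"\xc3\xa9" for é
    if one_byte != utf8:
        differs.append(code_point)
    if not is_valid_utf8(one_byte):
        invalid_alone.append(code_point)

# Below 0x80 (plain ASCII) both encodings produce identical bytes.
ascii_same = all(
    chr(c).encode("latin-1") == chr(c).encode("utf-8") for c in range(0x80)
)

print(len(differs), len(invalid_alone), ascii_same)
```

Every one of the 128 code points in the range is affected, while plain ASCII text is byte-for-byte identical in both encodings, which would explain why English-only files never show the problem.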
-
Fixed in https://github.com/notepad-plus-plus/notepad-plus-plus/commit/c5a0ed7c1aaac56dc96deabba8dd5e7cba261b2d
It will come with the next release.
-
Thank you @donho, @guy038 and @Alan-Kilborn ;)