Notepad++ doesn't find the accented characters with the "Find in files" function



  • Hello,
    I’m French, and there has been a really nasty issue for many versions now: it is no longer possible to use the “Find in Files” function properly to search for terms with accented characters. The last working version was 7.6.6. No accented character is recognized, so the search finds nothing unless the files are already open, which defeats the very purpose of this function.

    There isn’t really a category for bugs in the tool itself (only for plugins), so please tell me if I’m not posting my request in the right place.



  • Hello, @atelier-traduction and All,

    I’m French too, but I’ve never seen that behavior, even with the latest N++ version, 7.8.7!?

    Of course, my default ANSI code page is Windows-1252, where most of the accented characters are in the range [\xC0-\xFF].

    And a Find in Files search for, say, the regex [\xC0-\xFF]+ does match any run of consecutive accented characters!

    Best Regards,

    guy038
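
For readers who want to check the byte range mentioned above outside Notepad++, here is a minimal Python sketch. The sample string is made up for illustration, and Python's `re` module stands in for Notepad++'s regex engine, so it only shows which characters fall into [\xC0-\xFF], not how Find in Files itself behaves.

```python
import re

# Hypothetical sample text, chosen only for illustration.
sample = "créée à Noël"

# On a decoded string, [\xC0-\xFF] matches code points U+00C0..U+00FF.
runs = re.findall(r"[\xC0-\xFF]+", sample)

# On the Windows-1252 bytes, the same class matches the raw byte values.
ansi_bytes = sample.encode("cp1252")
byte_runs = re.findall(rb"[\xC0-\xFF]+", ansi_bytes)

# "œ" is a letter that Windows-1252 stores outside that range.
oe_byte = "œ".encode("cp1252")

print(runs)
print(byte_runs)
print(oe_byte)
```

The last value illustrates why the range covers only most accented characters: a few, such as œ, sit at lower byte values in Windows-1252.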




  • Well, I forgot to say that I’m using Windows 10 (the same issue occurs on Windows 7), and three of us get the same issue with the default settings, so I think this is a general issue. The exact problem can be seen here:

    notepad_isssue.png



  • @Atelier-Traduction said in Notepad++ doesn't find the accented characters with the "Find in files" function:

    We can see the exact problem here

    How does that show the problem?

    I think there are going to be people that will really put in an extreme amount of effort to help you, but you’re going to have to give them more specifics to go on.

    Oftentimes people experience an issue and feel that “everyone must see the same thing, it is so obvious” and that means (because it is so obvious) no details are needed. In truth, perhaps no one is seeing it besides the person (or 3) with the problem.



  • OK, I’m on the 32-bit version of Notepad++ (the 64-bit version has the same problem) on 64-bit Windows 10 (64-bit Windows 7 has the same problem), and I’m using the default settings. And yes, three users out of three in the same situation is pretty curious in my opinion, though maybe not in yours. I also mentioned the last working version (7.6.6). Everything in the folders is in its default state too, with the default plugins.

    Here’s how it shows the issue: if I search for “rétablissement” in “Find in Files” with any filter and, once again, the default settings, no “rétablissement” is found in any file that is not open in Notepad++, because the ‘é’ character itself is not recognized. The image above shows a □ character in place of ‘é’ when I search for just “em”, which proves that versions after 7.6.6 don’t recognize accented characters, or extended ASCII characters if you prefer.

    I also found another very interesting fact: the bug occurs only when Notepad++ searches unopened files that are encoded in ANSI. If it searches unopened UTF-8 (without BOM) files in which the accented characters are properly UTF-8 encoded (so real UTF-8, not ANSI), all the accented characters are recognized without any problem.

    So, as I see it, unopened files without a BOM are always treated as UTF-8 (without BOM) and never as ANSI, which is why Notepad++ can’t interpret the encoding of these characters.
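
The hypothesis above can be illustrated with a short Python sketch. It does not reproduce Notepad++'s actual code; it only assumes, as suggested in this post, that an ANSI (Windows-1252) file gets decoded as UTF-8, and shows what that would do to the search.

```python
# Bytes of an ANSI (Windows-1252) file containing the search term.
ansi_bytes = "rétablissement".encode("cp1252")

# Assumption: the unopened file is wrongly decoded as UTF-8.
# The lone 0xE9 byte is not valid UTF-8, so it becomes U+FFFD.
misread = ansi_bytes.decode("utf-8", errors="replace")

found_full_word = "rétablissement" in misread
found_em = "em" in misread

# The same text stored as real UTF-8 decodes cleanly.
utf8_roundtrip = "rétablissement".encode("utf-8").decode("utf-8")
found_in_utf8 = "rétablissement" in utf8_roundtrip

print(repr(misread), found_full_word, found_em, found_in_utf8)
```

The replacement character U+FFFD is typically drawn as a square or a question mark, which would be consistent with the □ in the screenshot: “em” still matches, but the full word does not.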

    Of course, if you need more details, I will be glad to provide them.

    Otherwise, I’m here to try to get in touch with the developer himself, because he doesn’t want to be bothered by mail for bug reports, and there is not an appropriate category here to send bug reports about the tool itself.



  • @Atelier-Traduction

    I don’t know about other readers, but the additional exposition definitely helped me understand more about it. Thank you.

    I think @guy038 might be in the best position to help you explore this. Hopefully he continues to contribute to this thread.

    Meanwhile, of interest might be THIS. Of course, you could be the author of that; I can’t tell from the usernames, but they are certainly dissimilar (means nothing, though).

    I’m here to try to get in touch with the developer himself, because he doesn’t want to be bothered by mail for bug reports

    Good luck with that. Many have tried. Many have failed. :-)

    there is not an appropriate category here to send bug reports about the tool itself.

    Correct. It wouldn’t be “here”. THIS describes the actual bug report process.



  • The last working version was the 7.6.6 one.

    I see a “big” version bump in Scintilla between N++ 7.6.6 (uses Scintilla 3.5.6) and N++ 7.7 (uses Scintilla 4.1.4). Maybe there is something in that which drives this behavior change? Of course, maybe not as well…



  • Ok, thank you ^^
    At least I can try to write a bug report on GitHub with all the details, even if others have tried before. You never know…



  • @Atelier-Traduction said in Notepad++ doesn't find the accented characters with the "Find in files" function:

    I can try to write a bug report on GitHub

    I would certainly say you should ADD to the existing one I linked earlier…did you see that link?



  • Yeah. If you think that’s better, OK, I’ll just do that.





  • @Atelier-Traduction

    THIS seems to give some weight to my “Scintilla update” theory (or guess, actually).



  • Hi, @atelier-traduction, @alan-kilborn and All,

    Aaaaah, yes :-(( @atelier-traduction, you’re quite right about it. I had never noticed that bug before! Indeed, since I reply in English on this forum, my text files are mostly written in English, and so contain no accented characters!

    Unlike in your case, with my local N++ configuration the accented characters are not even displayed at all in the Search result panel!


    This only happens when three conditions are met:

    • A search has been successful and the file is currently seen in the Search result panel

    • The file is not currently open in a Notepad++ session, in either the main or the secondary session

    • The file is ANSI encoded (all the Unicode encodings are OK!)

    Note that, in addition, if the search string itself contains accented characters, the Find result panel does not show the ANSI file at all!


    I also confirm, as you said, that this bug has been present since Notepad++ v7.7.

    I have no time at the moment, but I’ll open an issue on GitHub very soon!

    BR

    guy038



  • I would just like to add that the “Search result” panel doesn’t find anything if the search term contains accented characters.

    https://community.notepad-plus-plus.org/assets/uploads/files/1592392282525-notepad_isssue.png

    You can see these squares because I searched for the term “em”. If I search for “rétablissement”, the “Search result” panel doesn’t find anything, so I don’t agree with the “the file is currently seen in the Search result panel” part. Or perhaps I misinterpreted your phrasing.



  • Hi, All,

    Done! See my comment on GitHub:

    https://github.com/notepad-plus-plus/notepad-plus-plus/issues/7668#issuecomment-645602955

    After further tests, the problem occurs with any character between \x{0080} and \x{00FF}.

    BR

    guy038
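
Here is a quick way to see why the whole \x{0080}-\x{00FF} range would be affected, sketched in Python under the assumption (not confirmed in this thread) that the ANSI bytes end up being interpreted as UTF-8. Latin-1 is used here instead of Windows-1252 only because it maps every code point in that range to exactly one byte.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Code points U+0080..U+00FF, each stored as a single byte (Latin-1).
single_bytes = [chr(cp).encode("latin-1") for cp in range(0x80, 0x100)]

# Count how many of those lone bytes are valid UTF-8 on their own.
valid_count = sum(is_valid_utf8(b) for b in single_bytes)

# Plain ASCII bytes, by contrast, are always valid UTF-8.
ascii_valid = all(is_valid_utf8(bytes([b])) for b in range(0x00, 0x80))

print(valid_count, ascii_valid)
```

None of the single bytes from 0x80 to 0xFF is valid UTF-8 by itself, whereas every ASCII byte is, which matches the observation that unaccented English text is unaffected.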





  • Thank you @donho, @guy038 and @Alan-Kilborn ;)

