Find and Display *All* Duplicate Lines
-
@mkupper said in Find and Display *All* Duplicate Lines:
I’m still getting and puzzling over the random selections issue.
Is it only a “random selection” issue? That is, if you pretend the selection isn’t there, does your caret (at one end of the selection) end up on the correct line? If that’s the case, maybe we can simply cancel the selection.
It seems maybe there is something of a race going on here. The double-click makes Scintilla want to do something in the current (results) file – it wants to select a double-clicked word – but maybe that processing is somehow delayed until the source file is activated, and then bogus positions are used as Scintilla finishes making its selection? This goes against my understanding of how it should work but…
-
@Alan-Kilborn said in Find and Display *All* Duplicate Lines:
output file has a .sr extension. The sr stands for “search results”; I tried to emulate Notepad++'s Search results format in the output I created.
Another reason is that I have a UDL that I made for .sr files which colorizes the output
A bit off topic for this current discussion, but Notepad++ has a “lexer” for “Internal Search” called “searchResult” that has entries in both ‘langs.model.xml’ and ‘stylers.model.xml’. You can see it in:
The “lexer” is set here:
notepad-plus-plus/PowerEditor/src/ScintillaComponent/FindReplaceDlg.cpp:
    if (_scintView.execute(SCI_GETLEXER) == SCLEX_NULL)
    {
        _scintView.setLexer(L_SEARCHRESULT, LIST_NONE); // Restore searchResult lexer in case the lexer was changed to SCLEX_NULL in GotoFoundLine()
    }
which through setLexer() eventually calls:
notepad-plus-plus/PowerEditor/src/ScintillaComponent/ScintillaEditView.cpp:
    bool ScintillaEditView::setLexerFromLangID(int langID) // Internal lexer only
    {
        if (langID >= L_EXTERNAL)
            return false;

        const char* lexerNameID = _langNameInfoArray[langID]._lexerID;
        execute(SCI_SETILEXER, 0, reinterpret_cast<LPARAM>(CreateLexer(lexerNameID)));
        return true;
    }
That functionality can be “duplicated” in PythonScript and in fact I have a “hidder Lexer” script based on work from @PeterJones and others that can enable some of Lexilla’s lexers that N++ does not expose. I’m wondering why we can’t just:
    from ctypes import windll, addressof, create_unicode_buffer
    from ctypes.wintypes import HWND, UINT, WPARAM, LPARAM
    from Npp import editor, notepad

    SendMessage = windll.user32.SendMessageW
    SendMessage.argtypes = [HWND, UINT, WPARAM, LPARAM]
    SendMessage.restype = LPARAM
    NPPM_CREATELEXER = (1024 + 1000 + 110)

    _lexer = create_unicode_buffer('searchResult')
    ilexer_ptr = SendMessage(notepad.hwnd, NPPM_CREATELEXER, 0, addressof(_lexer))
    editor.setILexer(ilexer_ptr)
    editor.colourise(0, -1)
which (I hope) is the smallest self-contained example of what my hidder Lexer script does. I tried this but it does not lex the document as “searchResult” or “Internal Search”. I’m sure it’s a bit different in that I’m not trying to activate a Lexilla lexer, but this lexer name seems to be defined and used within N++, I wonder why PythonScript cannot activate it?
Cheers.
-
@Alan-Kilborn said in Find and Display *All* Duplicate Lines:
Is it only a “random selection” issue? That is, if you pretend the selection isn’t there, does your caret (at one end of the selection) end up on the correct line? If that’s the case, maybe we can simply cancel the selection.
When a selection happens I am taken to the middle of a page with the selection ending at the line I’m on. The intended target line is on the page: sometimes it’s the top visible line, and sometimes it’s at the bottom of the visible page.
This item is unrelated to the random selection issue. I don’t know if it’s by design but sometimes double clicking takes me to a page with the target line at the very top and other times the target is at the very bottom of the selected page. This slows me down because after double clicking I need to then visually find the newly selected line. The choice of top or bottom seems random, even when retesting a line such as 6208.
I now have two workarounds as I understand better what is happening with the random top/bottom combined with the intermittent appearance of a selection. When I get a selection I can go back to DupeLineResults.sr and double click again. The odds are it will work.
Another workaround is to be mindful of the line number I intend to go to when double clicking in DupeLineResults.sr. If I get a selection then I know the line I want is either the first or last line on the page and so can look for it and move there without needing to re-run the double-click.
The glitch is intermittent and can’t be solved by just cancelling the selection. However, a possible workaround in the code is to fetch the current line number. If you are not on the expected line then cancel the selection and re-run going to the target line.
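The "check the line, cancel the selection, go again" idea might look roughly like the sketch below. It is only an illustration: `goto_line_checked` is a hypothetical helper that is not part of Alan's script, and `editor` is assumed to be a PythonScript `Editor` object providing `gotoLine`, `getCurrentPos`, `lineFromPosition`, `getSelectionEmpty` and `setEmptySelection`. Since the cause of the glitch is still unknown, there is no guarantee that this cures it.

```python
def goto_line_checked(editor, target_line, max_attempts=2):
    """Jump to target_line (0-based), clearing any stray selection.

    Hypothetical helper, not part of the original script.
    Returns True once the caret sits on target_line, else False.
    """
    for _ in range(max_attempts):
        editor.gotoLine(target_line)
        caret = editor.getCurrentPos()
        if not editor.getSelectionEmpty():
            # Collapse the unexpected selection onto the caret.
            editor.setEmptySelection(caret)
        if editor.lineFromPosition(editor.getCurrentPos()) == target_line:
            return True
        # Caret is on the wrong line: loop around and try the jump again.
    return False
```

Inside Notepad++ it would be called with the real object, for example `goto_line_checked(editor, line_in_source_file)` after `from Npp import editor`.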
When I double click in DupeLineResults.sr I’m seeing that one of three things will happen:
- I am taken to the selected line which is positioned at the top of the page.
- I am taken to the selected line which is positioned at the bottom of the page. This seems to happen more often than being taken to the top of the page.
- I am taken to the middle of a page though trending towards the upper half and will have a selection running from the top of the page to my spot in the middle. The intended target line has always been either the first or last visible line on the page. The selection has always started far up the file and tends to be near the top.
I don’t know if it matters but I run Notepad++ (and all apps) in full screen mode and have a single monitor. It lets me focus on what I’m working on. My fingers are well versed in the keystrokes needed to navigate among tabs of those apps that have tabs and are well versed in Alt-Tabbing to other apps.
-
@mkupper said in Find and Display *All* Duplicate Lines:
sometimes it’s the top visible line, and sometimes it’s at the bottom of the visible page
The problem you are experiencing seems to go deeper than this, but I will say that there isn’t a guarantee as to where in the viewport a line that you are moving to will appear.
Scintilla documentation for .gotoLine for example, says, “…scrolls the view (if needed) to make it (the line) visible”.
Programmers can pull extra duty to ensure that a line appears in the viewport where they want it to; example HERE.
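As an illustration of that "extra duty", here is a minimal sketch that scrolls the target line to roughly the middle of the viewport. It is not the linked example; `show_line_centered` is a hypothetical name, and `editor` is assumed to be a PythonScript `Editor` object with `gotoLine`, `visibleFromDocLine`, `linesOnScreen` and `setFirstVisibleLine`.

```python
def show_line_centered(editor, doc_line):
    """Move the caret to doc_line (0-based) and scroll it to mid-viewport.

    Hypothetical helper; returns the display line chosen as first visible.
    """
    editor.gotoLine(doc_line)
    # Scrolling is expressed in display lines, which differ from document
    # lines when text is wrapped or folded.
    display_line = editor.visibleFromDocLine(doc_line)
    first_visible = max(0, display_line - editor.linesOnScreen() // 2)
    editor.setFirstVisibleLine(first_visible)
    return first_visible
```

Replacing the half-screen offset with another value would place the line elsewhere, for example a few lines below the top.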
-
@Alan-Kilborn said in Find and Display *All* Duplicate Lines:
The problem you are experiencing seems to go deeper than this, but I will say that there isn’t a guarantee as to where in the viewport a line that you are moving to will appear.
Scintilla documentation for .gotoLine for example, says, “…scrolls the view (if needed) to make it (the line) visible”.
I agree that the problem I’m experiencing is odd. I was thinking about what the FindAndDisplayAllDuplicateLines.py script could be racing with and so disabled TextFX, which I found I had finally weaned myself from using. That did not help the selection issue.
I tried to detect a pattern in whether the target lands at the top or bottom of the viewport and did not see one, while also still getting intermittent selections. My current installation of Notepad++ seems to be nearly bare-bones. I think the only non-default plugin is PythonScript.
The next step for me is to set up a fresh portable installation.
-
Moderator note: When typing names of files with two-letter extensions, some extensions map to known TopLevelDomains, which makes NodeBB linkify those filenames as URLs.
So the “search results” .sr suffix is trying to linkify to domains assigned to Suriname, and the .py suffix is trying to linkify to domains assigned to Paraguay. Whether or not those domains actually exist, it’s not a good idea to link to them: spam bots that are crawling the web see links to non-existent domains, and they may try to buy those domains and put nefarious websites behind those links just to get a few more victims.
I have red-texted the links I noticed in this discussion… but when you are previewing your post, if you see something in link color that you don’t expect, please go back and red-textify it, so that I don’t have to.
-
@PeterJones said in Find and Display *All* Duplicate Lines:
When typing names of files with two-letter extensions, some extensions map to known TopLevelDomains, which makes NodeBB linkify those filenames as URLs.
I always wondered why it did that – good to know.
In the past I’ve sometimes red-texted the name of scripts, but not always; didn’t realize there was possibly some harm to come from not doing it.
I’ll attempt to remember to do filenames in red-text from now on.
-
@Michael-Vincent said in Find and Display *All* Duplicate Lines:
this lexer name seems to be defined and used within N++, I wonder why PythonScript cannot activate it?
I agree with the sentiment, and a while ago I actually tried to get this going – but not to the level it seems you did. I was also unsuccessful. I had some offline chats with @Ekopalypse about it; I don’t think we got to a working solution. :-(
-
@Alan-Kilborn @Michael-Vincent
Afaik the problem is that the “search result” lexer uses an internal structure MarkingsStruct that contains the results to which it refers.
-
I used my lunch break productively :-)
Please note that this only works if the corresponding search has taken place, i.e. you cannot save the search result in a file and reapply the styling when loading.
    from ctypes import (cdll, windll, create_string_buffer, create_unicode_buffer,
                        addressof, pointer, WINFUNCTYPE)
    from ctypes.wintypes import BOOL, HWND, LPARAM, WPARAM, UINT
    from Npp import editor2, notepad

    SendMessage = windll.user32.SendMessageW
    SendMessage.argtypes = [HWND, UINT, WPARAM, LPARAM]
    SendMessage.restype = LPARAM
    NPPM_CREATELEXER = (1024 + 1000 + 110)

    WNDENUMPROC = WINFUNCTYPE(BOOL, HWND, LPARAM)
    FindWindowEx = windll.user32.FindWindowExW
    GetWindowText = windll.user32.GetWindowTextW
    GetWindowTextLength = windll.user32.GetWindowTextLengthW
    EnumChildWindows = windll.user32.EnumChildWindows
    GetClassName = windll.user32.GetClassNameW

    nppHandle = notepad.hwnd
    curr_class = create_unicode_buffer(256)
    WM_CLOSE = 0x010
    window_hwnds = {}
    SEARCH_WINDOW = 'Search results'

    def foreach_window(hwnd, lParam):
        if curr_class[:GetClassName(hwnd, curr_class, 256)] == '#32770':
            length = GetWindowTextLength(hwnd)
            if length > 0:
                buff = create_unicode_buffer(length + 1)
                GetWindowText(hwnd, buff, length + 1)
                if buff.value == SEARCH_WINDOW:
                    window_hwnds[buff.value] = hwnd
                    return False
        return True

    EnumChildWindows(nppHandle, WNDENUMPROC(foreach_window), 0)

    if SEARCH_WINDOW in window_hwnds:
        SCI_GETPROPERTY = 4008
        sci_hwnd = FindWindowEx(window_hwnds[SEARCH_WINDOW], None, 'Scintilla', None)
        mark_struct = create_string_buffer(b'@MarkingsStruct')
        mark_struct_ptr = addressof(mark_struct)
        length = SendMessage(sci_hwnd, SCI_GETPROPERTY, mark_struct_ptr, 0)
        buffer = create_string_buffer(length+1)
        SendMessage(sci_hwnd, SCI_GETPROPERTY, mark_struct_ptr, addressof(buffer))

        _lexer = create_unicode_buffer('searchResult')
        ilexer_ptr = SendMessage(notepad.hwnd, NPPM_CREATELEXER, 0, addressof(_lexer))
        editor2.setILexer(ilexer_ptr)
        editor2.setProperty('@MarkingsStruct', buffer.value)
        editor2.styleSetFore(1, (224, 108, 117))
        editor2.styleSetFore(2, (229, 192, 123))
        editor2.styleSetFore(3, (209, 154, 102))
        editor2.styleSetFore(4, (97, 175, 239))
        editor2.colourise(0, -1)
Note the use of editor2!
-
@Alan-Kilborn said earlier:
Another reason is that I have a UDL that I made for .sr files which colorizes the output somewhat like N++'s Search results,
Actually, I misspoke. When I first set it up, I was trying to do it with a UDL, but I later switched to using the EnhanceAnyLexer plugin. (I was confused because I didn’t delete my UDL when I went a different way)
EnhanceAnyLexer seems easier than trying to force N++ to artificially use the internal Search-result lexer, but it is definitely interesting to play around with something like that, so I enjoyed considering the code from @Michael-Vincent and @Ekopalypse earlier in this thread.
-
-
Hello, @yaron, @coises, @mkupper, @alan-kilborn and All,
From this post :
In order to always get the target line on top of the visible screen, @mkupper, simply add these 3 Python lines :
    curr_pos = editor.getCurrentPos()
    curr_line = editor.lineFromPosition(curr_pos)
    editor.setFirstVisibleLine(curr_line)
right after the line editor.gotoLine(line_in_source_file) in Alan’s FindAndDisplayAllDuplicateLines (FADADL) script
Just be sure that the parameter Enable scrolling beyond last line is checked, in the Preferences > Editing panel
Now, @mkupper and @alan-kilborn, to get rid of the random selection issue, I personally solve the problem by moving the DupeLineResults.sr file to the secondary view ! In that case, a possible previous selection in the DupeLineResults.sr file, made right before double-clicking to get another line, no longer causes any trouble ;-)
WOW…, Alan, everything is perfect with your script, by now !!
Best Regards,
guy038
-
@guy038 said in Find and Display *All* Duplicate Lines:
In order to always get the target line on top of the visible screen, @mkupper, simply add these 3 Python lines :
    curr_pos = editor.getCurrentPos()
    curr_line = editor.lineFromPosition(curr_pos)
    editor.setFirstVisibleLine(curr_line)
right after the line editor.gotoLine(line_in_source_file)
You shouldn’t have to calculate new values (your curr_pos and curr_line). editor.setFirstVisibleLine(line_in_source_file) should suffice.
to get rid of the random selection issue, I personally solve the problem by moving the DupeLineResults.sr file in the secondary view ! And , in that case, it does not bother anymore about a possible previous selection, in the DupeLineResults.sr file, right before double-clicking to get an other line
I can’t comment, as I can’t reproduce random selections happening.
Alan, everything is perfect with your script, by now
Well, I doubt this, given mkupper’s continuing strange issues with it.
EDIT: Ah…wait… I may have just had an inspiration on what could be happening for mkupper, even though I can’t repro it. I’ll do some more thinking on it, and if its logic is sound, I’ll post about it…
-
Hi, @yaron, @coises, @mkupper, @alan-kilborn and All,
So, as @alan-kilborn mentioned, to always get the target line on top of the visible screen, @mkupper, simply add this line :
editor.setFirstVisibleLine(line_in_source_file)
Right after the line
editor.gotoLine(line_in_source_file)
But I forgot to specify that you must cancel, as well, the Word wrap feature. That is IMPORTANT !!
BR
guy038
-
Hi, @alan-kilborn and All,
Another minor bug :
If you have a DupeLineResults.sr file in a tab and the corresponding source file is currently closed, any double-click on a line of the DupeLineResults.sr file will not open the source file, contrary to a double-click in the Search results panel !
BR
guy038
-
@guy038 said in Find and Display *All* Duplicate Lines:
If you do get a DupeLineResults.sr file in a tab and that the corresponding source file is presently closed, any double-click on a line of the DupeLineResults.sr file will not open the source file, contrary to a double-click in the Search results panel !
Yes, well a compromise here, because this is only a single source file situation – not multi-file like a potential find-in-files – is that you already have the source file open in another tab. :-)
We can fix it with…more code… The original intent, like most of my scripts, is just a demo of possible functionality, not all-encompassing behavior. To try to do that…scripts get too long and the main point is lost, with all the error-checking needed, and the full-featuredness ratcheting up the line count…
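For what it’s worth, the “more code” could be fairly small. The sketch below is an assumption about how it might be done, not something taken from the script: `activate_or_open` is a hypothetical helper, and `notepad` is assumed to be the PythonScript `Notepad` object, whose `getFiles()` returns `(filename, bufferID, index, view)` tuples and which also provides `activateFile()` and `open()`.

```python
import os


def activate_or_open(notepad, path):
    """Switch to path if it is already open in a tab, otherwise open it.

    Hypothetical helper. Returns True if the file was already open,
    False if it had to be opened first.
    """
    wanted = os.path.normcase(os.path.abspath(path))
    for filename, _buffer_id, _index, _view in notepad.getFiles():
        # Compare normalised paths so that case differences on Windows
        # do not cause a miss.
        if os.path.normcase(os.path.abspath(filename)) == wanted:
            notepad.activateFile(filename)
            return True
    notepad.open(path)
    return False
```

The double-click handler would call this with the source file’s full path before jumping to the line; that path would have to be recorded in (or alongside) the results file.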
-
@guy038 said in Find and Display *All* Duplicate Lines:
But I forgot to specify that you must cancel, as well, the Word wrap feature. That is IMPORTANT !!
Do we call this YOUR bug, since you introduced the “setFirstVisibleLine” code? :-)
I didn’t try it, but probably changing:
editor.setFirstVisibleLine(line_in_source_file)
to
editor.setFirstVisibleLine(editor.visibleFromDocLine(line_in_source_file))
will cure that.
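Equally untested inside Notepad++, the same idea as a small sketch that also reports whether the line really ended up at the top. `scroll_line_to_top` is a hypothetical name, and `editor` is assumed to be a PythonScript `Editor` object with `gotoLine`, `visibleFromDocLine`, `setFirstVisibleLine` and `getFirstVisibleLine`.

```python
def scroll_line_to_top(editor, doc_line):
    """Put doc_line (0-based) at the top of the view.

    Hypothetical helper; returns True if the view actually scrolled there.
    """
    editor.gotoLine(doc_line)
    # With Word wrap on, one document line can span several display lines,
    # so convert before scrolling.
    wanted = editor.visibleFromDocLine(doc_line)
    editor.setFirstVisibleLine(wanted)
    # Near the end of the file the view may not scroll that far unless
    # "Enable scrolling beyond last line" is checked.
    return editor.getFirstVisibleLine() == wanted
```

A False result would suggest the target is near the end of the file with scrolling beyond the last line disabled.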
-
Thank you @guy038 on the editor.setFirstVisibleLine(line_in_source_file) thing. That works perfectly and now I’m consistently taken to a view with the desired line at the top.
@Alan-Kilborn, as the results are now more consistent I spotted a clue related to the random selection. The end of the random selection is at or very near the mouse pointer, which is there because I was double clicking on the line in the DupeLineResults.sr file. The typing cursor is also at that spot.
When we double click on a word in npp that word becomes selected. When working with DupeLineResults.sr I double click on a line and am usually double clicking on the number part of Line 1234, though I could double click on the word Line. I’m now wondering if npp or Scintilla is still in the middle of painting that double-clicked word in the DupeLineResults.sr tab while the FindAndDisplayAllDuplicateLines.py script is running notepad.activateIndex(view, index).
I tried an experiment with starting a bunch of CPU bound processes to tie up the machine but was unable to hit the sweet spot of getting random selections to happen every time. I did discover that if I use start /high when starting a CPU bound thread, having all of my CPU cores running high priority threads makes Windows perform badly. I killed one of those threads to free up a CPU core and both Windows and Notepad++ worked very well.
-
Hi @guy038 a bit off topic but…
Congratulations on being the top 3 poster now!!!
Thanks for your great contributions to the Notepad++ community.
-
-
If you’ve used a script in this thread, you might want to double check your copy of it for a bug I’ve discovered.
Look to previous postings in this topic thread where the script has been changed – find the text moderator edit (2024-Jan-14).
There’s a link there that describes the bug in more detail, and shows what needs to be changed in an old copy (or you can simply grab a copy of the current version).