Marked text manipulation

Andriy Poznakhovskyy

Hey all,

I know this topic is quite old, anyway decided to share solution I’ve discovered (I’m not so technical, so Python script isn’t the option for me). So, long story short, I’ve extracted a long JSON response and needed to copy 95 URLs from it only and ignore everything else. Like in @Suncatcher’s case, everything was stored in a single line.

So, I did the following:

Search for https://site.com/project/(.*?) regexp and replace all matches with \r\nhttps://site.com/project/$1\r\n so URLs were moved to separate lines;
Afterward, switch to “Mark” tool, check “Bookmark line” option and mark all https://site.com/project/(.*?)
Finally, click “Search” menu => Bookmark => Remove unmarked lines

That’s it, list of necessary items only (URLs in my case) was created 🎉 My case is easier comparing to topic’s author, anyway hope this will be helpful for someone in the future, cheers!

P.S. @guy038 thanks for mentioning Bookmark feature, I’ve never used it before and it’s super helpful

Alan Kilborn

More old thread revival…

So I recently had a need for what is discussed in this thread, but I needed it embedded in a Pythonscript, and all I really needed was the logic conveyed by @guy038 with this solution:

So I figured out the Your regex to match part for my data; I’ll use Bob|Ted here for that for purposes of illustration, and of course some sample data:

Alice Carol Alan Bob Ted
Ted Bob
Bob Carol Ted
Ted Carol
Alice Carol
Alice Bob
Bob
Alan Carol
Alan Alice
Bob Carol Alan
Alice
Bob Ted Alan
Alice Ted
Alan Ted
Ted Alan Alice Carol
Bob Ted Alice Carol
Bob Alan Alice Carol
Alan
Bob Ted Alan Carol
Ted
Alan Bob
Alice Carol Ted
Alice Bob Alan
Alice Bob Carol
Bob Carol
Bob Ted Alan Alice
Alice Ted Alan
Carol
Alice Carol Alan
Alice Bob Ted
Carol Ted Alan

and I coded up the Pythonscript one-liner for it based on @guy038 's regex:

editor.rereplace(r'(?s)^.*?(Bob|Ted)|(?s).*\z', r'(?1\1\r\n)')

and I thought I would end up with a number of lines with either Bob or Ted on them. What actually happened was that I ended up with a single-line result of Alice! Clearly, INCORRECT! Or at least not what I needed.

Digging in and working on it a bit, I found a correct way to achieve it in a Pythonscript replacement, and that is:

editor.rereplace(r'(?s)(Bob|Ted)|(?:.+?(?=(?1)))|(?:.+\z)', r'?1\1\r\n')

which, for the sample data above, yields the expected:

Bob
Ted
Ted
Bob
Bob
Ted
Ted
Bob
Bob
Bob
Bob
Ted
Ted
Ted
Ted
Bob
Ted
Bob
Bob
Ted
Ted
Bob
Ted
Bob
Bob
Bob
Bob
Ted
Ted
Bob
Ted
Ted

So, long story LONG, but I wanted to share that if anyone tries this technique using a script, the search regex to use might need to be altered to:

SEARCH (?s)( Your regex to match)|(?:.+?(?=(?1)))|(?:.+\z)

The REPLACE part is unchanged from what @guy038 provided.

Note that I also tested it interactively in Notepad++'s Replace window and it works fine there as well, at least for my sample data.

guy038

Hello, @alan-kilborn and All,

I’m really sorry, because it’s just my fault and you wouldn’t have had to look for an alternative solution :-( Indeed, the regex S/R, that I gave in my post, below, does contains an error which is not important when using the Notepad++ Replace dialog, but which seems critical when you run a Python script, involving regexes !

https://community.notepad-plus-plus.org/topic/12710/marked-text-manipulation/8

I suppose that this fact should be related to this “small” point, located at the end of the description of the editor.rereplace helper method :

http://npppythonscript.sourceforge.net/docs/latest/scintilla.html?highlight=editor.rereplace#Editor.rereplace

An small point to note, is that the replacements are first searched, and then all replacements are made. This is done for performance and reliability reasons. Generally this will have no side effects, however there may be cases where it makes a difference. (Author’s note: If you have such a case, please post a note on the forums such that it can be added to the documentation, or corrected).

To understand the problem , let’s just use the beginning of your example text, pasted in a new N++ tab

Alice Carol Alan Bob Ted
Ted Bob
Bob Carol Ted
Ted Carol
Alice Carol
Alice Bob

If my generic regex S/R, below, with your regex choice (Bob|Ted) is used, against this text :

SEARCH (?s)^.*?(Bob|Ted)|(?s).*\z

REPLACE ?1\1\r\n

We get, after a click on the Replace All button or several clicks on the Replace button, and with the Wrap around option ticked, the following correct result :

Bob
Ted
Ted
Bob
Bob
Ted
Ted
Bob

Note that I use the ^ assertion which forces the regex engine to search a range of chars beginning a line. Of course, in case of replacement, no trouble at all ! Indeed, due to the \r\n syntax, any match \1 is rewritten with a line-break. So, the next search, with the (?s) mode, automatically matches right after that line-break, added by the replacement !

Now, let’s get back the initial text ( with Ctrl + Z ) and let’s suppose that we just want to trace the different matches of that regex S/R, using the Find Next button only. In that case, we get only 2 matches !!??

Obviously, the first match is :

Alice Carol Alan Bob

But the second and final match is :

 Ted
Ted Bob
Bob Carol Ted
Ted Carol
Alice Carol
Alice Bob

Why ? Well, after the first match, the caret location is right after the word Bob of the first line. So, it cannot match the string space + Ted because this string should begin the current line, due to, both, the ^ symbol and the grouping parentheses

As the first alternative (?s)^.*?(Bob|Ted) cannot match, at this location, the regex engine tries the other alternative (?s).*\z, which, of course, matches all the remaining characters of current file, beginning with space + Ted of the 1st line !!

BTW, I don’t understand, Alan why you got a match Alice. Indeed, when running :

editor.rereplace(r'(?s)^.*?(Bob|Ted)|(?s).*\z', r'?1\1\r\n')

I personally only get the forename Bob, which is the first word matched of the text !

Now, it’s easy to imagine the correct regex S/R to use : it should not contain any ^ assertion and be as below :

SEARCH (?s).*?(Bob|Ted)|(?s).*\z

REPLACE ?1\1\r\n ( or ?1\1\n for an Unix file )

This time, if you click, successively, on the Find Next button, you’ll be able to see the different matches of the search regex !

And, I did verify that the one-line script, below, without the ^ symbol, gives the expected text ;-))

editor.rereplace(r'(?s).*?(Bob|Ted)|(?s).*\z', r'?1\1\r\n')

Best Regards

guy038

Two more points :

Your new regex S/R :

editor.rereplace(r'(?s)(Bob|Ted)|(?:.+?(?=(?1)))|(?:.+\z)', r'?1\1\r\n')

works correctly because it does not contain any ^ assertion !

but, would you had added the ^ symbol, like below :

editor.rereplace(r'(?s)(Bob|Ted)|(?:^.+?(?=(?1)))|(?:.+\z)', r'?1\1\r\n')

it would had changed all your multi-lines example text as :

Bob

The lesson of that story is :

If you can properly visualize the different matches of a regex expression, as you expect to, when using the Find Next button, it’s likely that any replacement process, run from within a S/R script command, should work nicely, too ;-))

Alan Kilborn

@guy038

Thank you for the further analysis.

I’m really sorry, because it’s just my fault and you wouldn’t have had to look for an alternative solution

No worries at all! :-)

I don’t understand, Alan why you got a match (of only) “Alice”. I personally only get the forename “Bob”

Indeed! I guess “something happened” because if I re-run it now the same way I for sure get “Bob” as well! Sorry for that confusion.

Other comments:

I did not realize (obviously) that it was merely a case of a problem with the ^ in the original expression. :-(
I totally jumped in to an almost wholly different solution, based upon something related I was working on.

The lesson of that story is…

Nice to know!

guy038

Hello, @alan-kilborn and All,

So, as the result of our discussion :

This old post https://community.notepad-plus-plus.org/topic/12710/marked-text-manipulation/8

have been updated https://community.notepad-plus-plus.org/topic/19189/need-a-copy-marked-lines-feature/7

Cheers,

guy038

Alan Kilborn

Semi-related info:

Part of this current thread is about how to copy text that one has marked via Marking that requires use of a PythonScript.

In a POST in another thread, I provide a script that will select text given conditions specified by user input. That selected text can then be copied.

Similar results thru some different methods; enough to warrant a cross-link, I guess.

Laura Lucas Alday

Can anyone explain quickly how to use the python script? As in which steps I need to do in Notepad++ to run the script.

PeterJones

@Laura-Lucas-Alday

Install PythonScript using Plugins > Plugins Admin
Plugins > Python Script > New Script
Paste the contents and save under someName (whatever name you want)
Plugins > Python Script > Scripts > someName
To add a keyboard shortcut for the script:
Plugins > Python Script > Configuration,
select the someName script and Add to the menu
Restart Notepad++
someName will now be in the main Plugins > Python Script menu, not just in the Scripts submenu
Settings > Shortcut Mapper > Plugin Commands will see the script, and you can assign a keyboard shortcut

Alan Kilborn

@Laura-Lucas-Alday

Peter maybe only made it 99.9999% clear. Perhaps the newbie 0.0001% might still be left wondering…

Peter’s step 4 would be how you actually run the script.

Be advised that in this particular case, you need to have some red-marked text for the script to operate on (hopefully this part is clear).

I think there is an open issue that would make this script’s functionality a native part of Notepad++. I will try to find it…

Alan Kilborn

This may be the issue to which I was referring: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/6095

Alan Kilborn

At the time of the earlier discussions, there was no way to copy marked text natively in Notepad++, meaning without resorting to scripting. Now (7.9.1-ish) there is:

Just press the indicated button after you already have marked some text.

Note: I have added similar information in this somewhat-related other THREAD.

dinkumoil

@Alan-Kilborn

My preferred solution would have been to select all marked text because this provides more flexibility. This way the user could

copy
cut
delete
change

the marked search results.

To achieve that I wrote the following Lua script (intended to be used with, well, the LuaScript plugin).

-- =============================================================================
-- Add menu entry to select all items marked by find marks
-- =============================================================================

npp.AddShortcut("Select Marked", "", function()
  local indicatorIdx
  local pos, markStart, markEnd
  local hasMarkedItems

  -- Search find marks have indicator number 31
  SCE_UNIVERSAL_FOUND_STYLE = 31

  -- Remove selection but do not change cursor position
  editor.Anchor  = editor.CurrentPos
  hasMarkedItems = false

  -- Search find mark indicators and select marked text
  pos = 0
  
  while pos < editor.TextLength do
    if editor:IndicatorValueAt(SCE_UNIVERSAL_FOUND_STYLE, pos) == 1 then
      markStart = editor:IndicatorStart(SCE_UNIVERSAL_FOUND_STYLE, pos)
      markEnd   = editor:IndicatorEnd(SCE_UNIVERSAL_FOUND_STYLE, pos)

      if editor.SelectionEmpty or not editor.MultipleSelection then
        editor:SetSelection(markEnd, markStart)
      else
        editor:AddSelection(markEnd, markStart)
      end

      pos            = markEnd
      hasMarkedItems = true
    end
    
    pos = pos + 1
  end

  -- Scroll last selected element into view. This is only because
  -- Npp/Scintilla will do that anyway when moving the cursor next time
  if hasMarkedItems then
    editor:ScrollCaret()
  end
end)

After adding this code to the startup.lua file and restarting Notepad++, the plugin’s submenu provides a new entry Select Marked which can be assigned to a keyboard shortcut.

Alan Kilborn

@dinkumoil said in Marked text manipulation:

select all marked text because this provides more flexibility. This way the user could

copy
cut
delete
change

That’s a good idea as well, especially for delete and change (although those can be achieved, on a one-off basis, by replace-with-nothing and replace-with something ops).

Copy and cut of selected text would jam all of the text together if it were on a selection basis (probably not really useful that way).

I noticed this new Copy Marked Text command inserts a line-ending between the individual globs of marked text, which seems to make it useful, although to tell the truth I haven’t had a real application for it myself, yet.

dinkumoil

@Alan-Kilborn said in Marked text manipulation:

Copy and cut of selected text would jam all of the text together if it were on a selection basis (probably not really useful that way).

Yes, you are right. Up to now, I’ve used my script only for deleting and editing the selections. But especially being able to multi-edit all occurences of a search term or overwrite them by multi-paste vastly increases productivity.

chk1

@dinkumoil said in Marked text manipulation:

@Alan-Kilborn

My preferred solution would have been to select all marked text because this provides more flexibility. This way the user could

copy

cut

delete

change

the marked search results.

To achieve that I wrote the following Lua script (intended to be used with, well, the LuaScript plugin).
// snip
After adding this code to the startup.lua file and restarting Notepad++, the plugin’s submenu provides a new entry Select Marked which can be assigned to a keyboard shortcut.

Thank you so much for that code, I converted it to Python for fans of the PythonScript plugin:

# -*- coding: utf-8 -*-
from Npp import editor

# https://community.notepad-plus-plus.org/topic/12710/marked-text-manipulation/48?_=1756239096378
SCE_UNIVERSAL_FOUND_STYLE = 31

def mark_to_multi():
    has_marked_items = False
    pos = 0
    while pos < editor.getTextLength():
        if editor.indicatorValueAt(SCE_UNIVERSAL_FOUND_STYLE, pos) == 1:
            mark_start = editor.indicatorStart(SCE_UNIVERSAL_FOUND_STYLE, pos)
            mark_end   = editor.indicatorEnd(SCE_UNIVERSAL_FOUND_STYLE, pos)
            
            if editor.getSelectionEmpty() or not editor.getMultipleSelection():
                editor.setSelection(mark_end, mark_start)
            else:
                editor.addSelection(mark_end, mark_start)

            pos = mark_end
            has_marked_items = True
            editor.setIndicatorCurrent(SCE_UNIVERSAL_FOUND_STYLE)
            editor.indicatorClearRange(mark_start, mark_end-mark_start)

        pos += 1
    if has_marked_items:
        editor.scrollCaret()

mark_to_multi()

The editor I used previously had a handy shortcut for this, I could just hit Alt+Enter in the search bar and it would immediately make all search results into cursors. Is there a reasonably easy way to replicate this in Notepad++? I use this function a lot.

Terry R

@chk1 said in Marked text manipulation:

I could just hit Alt+Enter in the search bar and it would immediately make all search results into cursors.

Do you mean make the search results into “links” to the actual text in the file? Because if so, then you should understand that already happens.

The mouse cursor doesn’t indicate that they are links, but in the “Search results” window that appears, double clicking on any search result line will make the main window move lines until that search result shows.

Terry

chk1

@Terry-R Thanks for the quick response, but the Search results window is not what I mean. When I press Alt+Enter (or an alternative key combo) I want every search result to be selected via multi-editing in the editor window, and not in a separate window or sidebar.

Right now, with help of the script, I open the Marks search (Ctrl+M), enter my search term and press “Mark all”. Then I run the script to convert all marks to multi-edit selections. I want to reduce these extra steps if possible, i.e. in the best case: Ctrl+F, enter search word, hit Alt+Enter.

Coises

@chk1 said in Marked text manipulation:

Right now, with help of the script, I open the Marks search (Ctrl+M), enter my search term and press “Mark all”. Then I run the script to convert all marks to multi-edit selections. I want to reduce these extra steps if possible, i.e. in the best case: Ctrl+F, enter search word, hit Alt+Enter.

It might not be worth it for your case, but the Search dialog in the Columns++ plugin has Select All as an option on the dropdown menu on the Count button.

You can assign a shortcut to Columns++ | Search in Notepad++ shortcut mapper; but at present, there is no straightforward way to open the Count button menu using only the keyboard.