Select bookmarked lines

hu ma

Is there a way to do it? I want to do a series of steps, which is not possible with any plugin whatsoever, so if anyone could help I will be grateful.

The 1st one is to select the bookmarked lines and add a certain text at the start and end of each “selected line”.

The 2nd one is reversing the order of the selected text, for example: 12345 will be 54321, but for the Unicode character set, not only ASCII characters.
For instance, the sentence
(hello in Japanese is こんにちは), will be
(はちにんこ si esenapaJ ni olleh)

guy038

Hello, hu ma,

Concerning your second step, I suppose that a simple Lua or Python script would do it fine! I can’t help you (yet!) in that matter, but I guess that Claudia or Scott (two Python gurus!) will certainly be able to provide the appropriate script!
Concerning your first step, there is NO need to bookmark any text in order to add text in front of, or at the end of, each bookmarked line (unless you need the bookmarked lines for another purpose!)

Just do a search/replacement, in Regular expression search mode :

To add text, at beginning of all the lines, matching a specific regex :

SEARCH (?-s)^.*<Your regex|String|Character>

REPLACE Text to add at beginning$0

To add text, at end of all the lines, matching a specific regex :

SEARCH (?-s)<Your regex|String|Character>.*

REPLACE $0Text to add at end

NOTES :

The <Your regex|String|Character> expression is the regex, string or character that you would have used to bookmark the lines matching that regex, string or character!
The part (?-s) is a pattern modifier, which forces the regex engine to treat the dot meta-character ( . ) as matching any standard character, except for EOL characters such as \r or \n
The part ^.* represents any range of standard characters, even empty, between the beginning of a line and your expression to match
The part .* matches all the remaining standard characters, possibly none, between your expression to match and the end of the line, before the EOL characters

In replacement, the $0 syntax stands for the complete current search match
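
For anyone who wants to try the same idea outside Notepad++, here is a minimal sketch with Python's `re` module. It is only an approximation of the S/R above: `add_prefix` and `add_suffix` are hypothetical helper names, the sample text uses `\n` line endings, and Python's dot already excludes `\n` by default, so no `(?-s)` is needed. The whole match (`m.group(0)`) plays the role of `$0`.

```python
import re

def add_prefix(text, pattern, prefix):
    # ^.*<pattern> : from the start of a line up to the searched expression
    regex = re.compile(r'^.*(?:' + pattern + r')', re.MULTILINE)
    # a function replacement avoids backslash processing of the inserted text
    return regex.sub(lambda m: prefix + m.group(0), text)

def add_suffix(text, pattern, suffix):
    # <pattern>.* : from the searched expression to the end of the line
    regex = re.compile(r'(?:' + pattern + r').*')
    return regex.sub(lambda m: m.group(0) + suffix, text)

sample = "keep me\nmark this line\nkeep me too"
print(add_prefix(sample, "mark", ">> "))
print(add_suffix(sample, "mark", " <<"))
```

Only the line containing the searched expression is touched; the other lines pass through unchanged.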

Best regards,

guy038

Scott Sumner

Here’s a quick PythonScript one-liner that reverses the currently selected text:

editor.replaceSel(editor.getSelText()[::-1])

I presume it works for Unicode; I only tried it out quickly with ASCII.

@guy038, Your solution for the other part works as long as the original poster is doing a find (more accurately a “mark”-find); when I read the posting originally, my impression was that the set of bookmarked lines was marked manually. An approach in that case could be to number all lines with the column editor feature, then cut all bookmarked lines to another file, do a regex-based find-and-replace (that of course ignores the artificial line numbers), then copy the lines from this new file back to the original file, then sort that file and finally delete the line numbers…whew…a lot of work! :-)

abuali huma

@guy038

It doesn’t work when I put SEARCH (?-s)^.*<[\x{3000}-\x{9faf}]>
I don’t have a special character on each line that I can search for; the lines I bookmark only contain random characters in the range [\x{3000}-\x{9faf}].

Anyway, I managed to do this by recording a macro after bookmarking all the lines:
write the start-of-line text > press END > write the end-of-line text > Toggle Bookmark > F2 to move to the next bookmark
Playing that macro does the trick for me.

abuali huma

@Scott-Sumner

Unfortunately, it works only for ASCII. For Unicode it messes up the text.

Scott Sumner

This could possibly work, but if it doesn’t I am handing off to an encoding expert (@guy038 ?) because I’m pretty much exclusively an A-Z person. :-)

editor.replaceSel(editor.getSelText().decode('utf8')[::-1].encode('utf8'))
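
Why the extra decode/encode helps can be sketched in plain Python 3, outside Notepad++. The assumption here is that, in the Python 2 based PythonScript used in this thread, `getSelText()` hands back UTF-8 bytes; `reverse_bytes` and `reverse_text` are hypothetical helper names for illustration only.

```python
def reverse_bytes(data):
    # reverses raw bytes: multi-byte UTF-8 sequences end up back to front
    return data[::-1]

def reverse_text(data):
    # decode first so whole code points are reversed, then re-encode
    return data.decode('utf8')[::-1].encode('utf8')

raw = "hello in Japanese is こんにちは".encode('utf8')

print(reverse_text(raw).decode('utf8'))

try:
    reverse_bytes(raw).decode('utf8')
    byte_reversal_ok = True
except UnicodeDecodeError:
    # the reversed byte stream is no longer valid UTF-8
    byte_reversal_ok = False
print(byte_reversal_ok)
```

Note that reversing code points can still separate a base character from a combining mark that follows it, so text with combining marks may need more care.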

abuali huma

@Scott-Sumner

Thanks, it works really well now.
The only thing remaining is to select each of the bookmarked lines and flip them all with one click.

Claudia Frank

@abuali-huma

or you could try to adapt the script with what I’ve posted here.

Cheers
Claudia

abuali huma

@Claudia-Frank

Well, I don’t know how to adapt a script.
Let’s say all the lines that I’m bookmarking contain this text:

=txeT eniL<

Or, if it is better, this regex:

[\x{3000}-\x{9faf}]

Claudia Frank

@abuali-huma

So you want to reverse the whole line of every bookmarked line, correct?

Cheers
Claudia

Scott Sumner

@abuali-huma

This short script could do it:

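# marker number used for Notepad++ bookmarks in this script
npp_bookmark_marker_id_number = 24
npp_bookmark_marker_mask = 1 << npp_bookmark_marker_id_number
# first bookmarked line, or -1 if there is none
line_nbr = editor.markerNext(0, npp_bookmark_marker_mask)
while line_nbr != -1:
    # select the text of the line, without its EOL characters
    editor.setSelectionStart(editor.positionFromLine(line_nbr))
    editor.setSelectionEnd(editor.getLineEndPosition(line_nbr))
    # decode so that whole characters, not UTF-8 bytes, get reversed
    editor.replaceSel(editor.getSelText().decode('utf8')[::-1].encode('utf8'))
    line_nbr = editor.markerNext(line_nbr + 1, npp_bookmark_marker_mask)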
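
The loop logic of the script above can be tried without Notepad++ using a small simulation. This is only a sketch: `marker_next` is a hypothetical stand-in that mimics how `editor.markerNext()` is used here (a start line and a bit mask, with -1 meaning "no more lines"), and the per-line marker masks are supplied by hand as a plain list.

```python
BOOKMARK_MARKER = 24                 # marker number taken from the script above
BOOKMARK_MASK = 1 << BOOKMARK_MARKER

def marker_next(line_masks, start, mask):
    # stand-in: first line >= start whose markers intersect the mask, else -1
    for line_nbr in range(start, len(line_masks)):
        if line_masks[line_nbr] & mask:
            return line_nbr
    return -1

def reverse_bookmarked(lines, line_masks):
    result = list(lines)
    line_nbr = marker_next(line_masks, 0, BOOKMARK_MASK)
    while line_nbr != -1:
        # Python 3 strings are sequences of code points, so slicing is enough
        result[line_nbr] = result[line_nbr][::-1]
        line_nbr = marker_next(line_masks, line_nbr + 1, BOOKMARK_MASK)
    return result

lines = ["abc", "こんにちは", "12345"]
masks = [0, BOOKMARK_MASK, BOOKMARK_MASK]   # the last two lines are "bookmarked"
print(reverse_bookmarked(lines, masks))
```

Starting each search at `line_nbr + 1` is what keeps the loop from finding the same bookmarked line again.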

abuali huma

@Scott-Sumner

Thanks, that saved my day!

hu ma

What if I want to do this?

Original
Line#1 Aaaaaa;Ax#&;Dx#&
Line#2 Bbbbb

Result
Line#1 Bbbbb;Ax#&;Dx#&Aaaaaa

And again, with Unicode text as well

abuali huma

Sorry if the previous example isn’t clear.
A better example would be:

Original
Line#1 FGHI;Ax#&;Dx#&
Line#2 BCDE

Result
Line#1 BCDE;Ax#&;Dx#&FGHI

guy038

Hi, hu ma,

Ah! I see ! You’re using CJK ideographic characters !

The different Unicode CJK scripts are :

- CJK Radicals Supplement - Phonetics and Symbols  (  2E80 -  2EFF )
- CJK Kangxi Radicals                              (  2F00 -  2FDF )
- CJK Ideographic Description Characters           (  2FF0 -  2FFF )
- CJK Symbols and Punctuation                      (  3000 -  303F )
- CJK Strokes                                      (  31C0 -  31EF )
- CJK Enclosed Letters and Months                  (  3200 -  32FF )
- CJK Compatibility                                (  3300 -  33FF )
- CJK Unified Ideographs Extension A               (  3400 -  4DB5 )
- CJK Unified Ideographs                           (  4E00 -  9FD5 )
- CJK Compatibility Ideographs                     (  F900 -  FAFF )
- CJK Compatibility Forms                          (  FE30 -  FE4F )
- CJK Half-width Punctuation                       (  FF61 -  FF64 )
- CJK Unified Ideographs Extension B               ( 20000 - 2A6D6 )
- CJK Unified Ideographs Extension C               ( 2A700 - 2B734 )
- CJK Unified Ideographs Extension D               ( 2B740 - 2B91D )
- CJK Compatibility Ideographs Supplement          ( 2F800 - 2FA1F )

As I have Asian languages installed in my Win XP configuration, I chose the SimSun font to give you an example of my regex search/replacement, which does work perfectly well!

In this example, the regex looks, successively, for the three following characters, enclosed by two angle brackets <> :

The second character, of your range [\x{3000}-\x{9faf}], known in my SimSun font, ( 、 ), of Unicode code-point \x{3001}
A character, in the middle of your range [\x{3000}-\x{9faf}] , ( 栀 ), of Unicode code-point \x{6800}
The last character, of your range [\x{3000}-\x{9faf}], known in my SimSun font, ( 龥 ), of Unicode code-point \x{9fa5}

Moreover, I added some random values :

The three characters 一倀怀, of Unicode code-points \x{4e00}\x{5000}\x{6000}, before the searched string
The three characters 瀀耀退, of Unicode code-points \x{7000}\x{8000}\x{9000}, after the searched string

And, in replacement, I inserted the string Inserted灭Text, containing the ideographic character 灭, of Unicode code-point \x{706d}

So, starting with the original text, with an UTF-8 encoding :

一倀怀<、>瀀耀退
一倀怀<栀>瀀耀退
一倀怀<龥>瀀耀退

The first S/R :

SEARCH ^.*<[\x{3000}-\x{9faf}]>

REPLACE Inserted灭Text$0

gives the resulting text :

Inserted灭Text一倀怀<、>瀀耀退
Inserted灭Text一倀怀<栀>瀀耀退
Inserted灭Text一倀怀<龥>瀀耀退

And the second S/R :

SEARCH <[\x{3000}-\x{9faf}]>.*

REPLACE $0Inserted灭Text

gives the final text :

一倀怀<、>瀀耀退Inserted灭Text
一倀怀<栀>瀀耀退Inserted灭Text
一倀怀<龥>瀀耀退Inserted灭Text
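
The two S/R above can be reproduced with a short Python sketch, assuming Python's `re` module as a rough substitute for the Notepad++ regex engine. `\u3000-\u9faf` is the Python spelling of the same range, and both replacements are applied to the original text, as in the example.

```python
import re

CJK = r'[\u3000-\u9faf]'   # same range as [\x{3000}-\x{9faf}]

original = "一倀怀<、>瀀耀退\n一倀怀<栀>瀀耀退\n一倀怀<龥>瀀耀退"

# first S/R: insert text at the beginning of each matching line
first = re.sub(r'^.*<' + CJK + r'>',
               lambda m: "Inserted灭Text" + m.group(0),
               original, flags=re.MULTILINE)

# second S/R: insert text at the end of each matching line
second = re.sub(r'<' + CJK + r'>.*',
                lambda m: m.group(0) + "Inserted灭Text",
                original)

print(first)
print(second)
```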

Cheers,

guy038

P.S. :

I just saw your recent post. Give me a couple of minutes to think about it. I’ll be back soon!

hu ma

@guy038

I see a bunch of options here that I could use, thanks for the information.

guy038

Hello, abuali huma,

You didn’t say if the regex must keep the Line 2 unchanged or if you want to wipe out this line !

Anyway, here are the appropriate regexes for each case :

1) If line 2 is unchanged :

From the original text :

FGHI;Ax#&;Dx#&
BCDE

the S/R, below :

SEARCH (?-si)^(FGHI)(.*)(?=\R(BCDE))

REPLACE \3\2\1

would give the result :

BCDE;Ax#&;Dx#&FGHI
BCDE

NOTES :

The first part (?-si) forces the dot ( . ) to match standard characters only, and the regex engine to work in a case-sensitive way
The \R form represents any kind of EOL sequence: (\r\n ), (\n ) or (\r )
The (?=\R(BCDE)) syntax is called a positive look-ahead, that is to say a condition which must be true for the overall regex to match, but which is never part of the final match. So the condition is “Are there, at the end of line 1, some EOL character(s) followed by the string BCDE?”. The string BCDE is stored in group 3

2) If line 2 must be deleted, too :

From the original text :

FGHI;Ax#&;Dx#&
BCDE

The second S/R :

SEARCH (?-si)^(FGHI)(.*)\R(BCDE)

REPLACE \3\2\1

would give the final text :

BCDE;Ax#&;Dx#&FGHI
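
The difference between the two cases can be sketched in Python. This is an approximation, not the Notepad++ engine: Python's `re` has no `\R`, so the hypothetical `EOL` alternation below stands in for it, and the sample text uses `\n` line endings.

```python
import re

EOL = r'(?:\r\n|\n|\r)'   # rough stand-in for \R

text = "FGHI;Ax#&;Dx#&\nBCDE"

# case 1: look-ahead, line 2 is only inspected, so it stays in the text
keep_line2 = re.sub(r'^(FGHI)(.*)(?=' + EOL + r'(BCDE))', r'\3\2\1',
                    text, flags=re.MULTILINE)

# case 2: line 2 is part of the match, so it is consumed by the replacement
drop_line2 = re.sub(r'^(FGHI)(.*)' + EOL + r'(BCDE)', r'\3\2\1',
                    text, flags=re.MULTILINE)

print(keep_line2)
print(drop_line2)
```

In both cases group 3 is filled, even though in case 1 it sits inside the look-ahead and is never part of the match itself.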

Best Regards,

guy038

abuali huma

@guy038

What if it is different text?
Example one
Line#1 こんにちは;Ax#&;Dx#&
Line#2 行く
Result
Line#1 行く ;Ax#&;Dx#&こんにちは

Example two
Line#18 細かい ;Ax#&;Dx#&
Line#19 自分を愛する
Result
Line#18 自分を愛する ;Ax#&;Dx#& 細かい

And so on. And yes, the 2nd line is supposed to be deleted.

guy038

Hi, abuali huma,

Sorry, I was absent for a while, because of a virus scan on my laptop! But please don’t worry: there is absolutely NO connection with our NodeBB site or with our discussion :-))

For a general regex, no problem at all! I just assumed that :

Group 1, beginning the first line, is supposed to contain ideographic characters, ONLY
Group 2 represents any range, after the last ideographic character, in the first line, till the end of the line
Group 3, standing for the second line, is supposed, also, to contain ideographic characters, ONLY

Remark :

Some additional space characters, NOT present in the original text, seem included in the replacement text, of your two examples :

Before the first semicolon, in your first example
After the ampersand character, in your second example

From my hypotheses, any space character would be stored in group 2, of course! In addition, you may separate the three groups with a space character in the replacement, as well!

So, a general S/R would be :

SEARCH (?-si)^([\x{3000}-\x{9faf}]+)(.*)\R([\x{3000}-\x{9faf}]+)

REPLACE \3\2\1

Notes :

The part [\x{3000}-\x{9faf}]+ tries to match the longest non-empty run of ideographic characters, either from the beginning of a line ( ^ ), stored in group 1, or after the EOL character(s) ( \R ), stored in group 3
During replacement, these three groups, found on two consecutive lines, are simply rewritten on a single line, in a different order!
As said above, the replacement regex may be changed into \3 \2 \1
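
As a quick check of this general S/R, here is a minimal Python sketch run on the two examples from the previous post. It assumes `\n` line endings in place of `\R`, and `merge_pairs` is a hypothetical helper name.

```python
import re

# group 1: leading ideographs, group 2: rest of line 1, group 3: ideographs on line 2
PAIR = re.compile(r'^([\u3000-\u9faf]+)(.*)\n([\u3000-\u9faf]+)', re.MULTILINE)

def merge_pairs(text):
    # rewrite the three groups on a single line, in the order 3, 2, 1
    return PAIR.sub(r'\3\2\1', text)

print(merge_pairs("こんにちは;Ax#&;Dx#&\n行く"))
print(merge_pairs("細かい ;Ax#&;Dx#&\n自分を愛する"))
print(merge_pairs("FGHI;Ax#&;Dx#&\nBCDE"))
```

The space before the semicolon in the second example ends up in group 2, and the last call shows that a pair of lines that does not begin with a character from the range is left alone.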

Cheers,

guy038

hu ma

@guy038

Thanks mate, I will give it a try once I get back to my PC in a while…

But as I understand it, because of groups 1 and 3, if a line has a mixed character set, the regex will not apply to that line?