• 0 Votes
    9 Posts
    10k Views
    guy038G

    Hi sngbrdb and All,

    First of all, as MAPJe71 said and as you found out yourself, the regex [^,], for instance, matches absolutely any of the 128,172 characters of Unicode 9.0 (the latest version), EOL characters included, except for the comma itself! So, to prevent a match from extending over several lines, it’s good practice to systematically include the two characters \r\n in a negative character class. Therefore, our example should be rewritten as [^,\r\n]

    Just note that the shorthand syntax \R, which matches any kind of EOL character(s), cannot be used inside a character class!
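
    As a quick illustration outside Notepad++, here is a minimal sketch using Python's re module. This is an assumption for demonstration only: Python's engine is not the Boost engine used by N++, but a negated character class behaves the same way on this point. The sample string is made up.

```python
import re

# Hypothetical two-line sample, with a Windows EOL (\r\n) in the middle
text = "a,b\r\nc,d"

# [^,] also matches \r and \n, so one match runs across the line break
loose = re.findall(r"[^,]+", text)

# Adding \r\n to the negated class keeps every match on a single line
strict = re.findall(r"[^,\r\n]+", text)

print(loose)   # ['a', 'b\r\nc', 'd']
print(strict)  # ['a', 'b', 'c', 'd']
```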

    Now, I would like to clarify some points, relative to :

    The two modifiers (?s) and (?m) and their opposite form (?-s) and (?-m)

    The two assertions ^ and $

    The dot meta-character .

    As I presume that this post will be ( too ! ) long , just have a drink and… let’s go !

    In the first place, it’s VERY important to realize that the two modifiers (?m) and (?s) do NOT deal with the same things :

    The (?m) modifier, and its opposite form (?-m), change the meaning of the ^ and $ assertions

    The (?s) modifier, and its opposite form (?-s), change the meaning of the . dot meta-character

    By default, the regex engine of N++ considers any text as made of multiple lines. So :

    The ^ symbol is a zero length assertion, which represents the location between an EOL character OR the very beginning of the current file and the first standard character of a line

    The $ symbol is a zero length assertion, which represents the location between the last standard character of a line and an EOL character OR the very end of the current file

    Although it is not necessary, and especially if all parts of your regex follow that behaviour, you may include the (?m) modifier ( for multi-line ) at the beginning of the regex

    For instance, the regexes ^123 or (?m)^123 would match the 123 string of any line, which begins with the string 123 and the regexes 789$ or (?m)789$ would match the 789 string of any line, which ends with the string 789

    On the contrary, when your regex begins with the (?-m) modifier ( for no multi-line ), the regex engine considers all the contents of your current file as a single line. So, the meaning of the ^ and $ symbols is restricted :

    The ^ symbol becomes a zero length assertion, which represents the location before the very first character of the current file

    The $ symbol becomes a zero length assertion, which represents the location after the very last character of the current file

    For instance, the regex (?-m)^123 would match a 123 string, at the beginning of the very first line of the current file and the regex (?-m)789$ would match a 789 string, at the end of the very last line of the current file. Notice, this implies that no EOL character follows the string 789. Indeed, in that case, the string 789 would not really end the file !!

    You’ll probably agree, as I do, that the behaviour of the regex engine, when using a (?-m) modifier, seems rather uninteresting :-(( Indeed, the two regexes, above, could be, simply, re-written as \A123 and 789\z, with the zero-length assertions \A and \z

    VERY IMPORTANT :

    If your regex does NOT contain any ^ symbol, nor $ symbol, the modifiers (?m) and/or (?-m) are quite USELESS !!
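
    The effect of (?m) on ^ and $ can be tried out with a minimal Python sketch. Two assumptions to keep in mind: Python's default is the opposite of N++ (without (?m), Python treats the whole text as one line), and Python spells the end-of-text assertion \Z where N++ uses \z. The sample text is made up.

```python
import re

# Made-up three-line sample, with no EOL after the last line
text = "123 abc\n456 789\n123 789"

# With (?m), ^ and $ work line by line (the N++ default behaviour)
starts_m = [m.start() for m in re.finditer(r"(?m)^123", text)]
ends_m = [m.start() for m in re.finditer(r"(?m)789$", text)]

# Without (?m), Python sees the whole text as one line, like (?-m) in N++
starts_single = [m.start() for m in re.finditer(r"^123", text)]
ends_single = [m.start() for m in re.finditer(r"789$", text)]

# Same results with the explicit assertions \A and \Z
starts_a = [m.start() for m in re.finditer(r"\A123", text)]
ends_z = [m.start() for m in re.finditer(r"789\Z", text)]

print(starts_m, ends_m)            # [0, 16] [12, 20]
print(starts_single, ends_single)  # [0] [20]
print(starts_a, ends_z)            # [0] [20]
```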

    By default, if the “. matches new line” option is UNCHECKED, the regex engine of N++ considers that the dot meta-character matches a standard character only, and never an EOL character !

    Although it is not necessary, and especially if all parts of your regex follow that behaviour, you may include the (?-s) modifier ( for NO single line ) at the beginning of the regex

    Then, if we consider the simple text, below, with the two EOL characters \r\n, after digit 5

    12345
    67890

    The regexes .+ or (?-s).+ would match, successively, the strings 12345 and 67890

    On the contrary, when your regex begins with the (?s) modifier ( For single line ) AND/OR if the ". matches new line" option is CHECKED, the N++ regex engine considers that the dot meta-character can match, absolutely, any character ( standard and EOL ones ) !

    Therefore, on the sample text above, the regex (?s).+ would match the overall string 12345\r\n67890, in one go !
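
    A minimal Python sketch of the same experiment follows. One assumption differs from N++: in Python, the dot without (?s) excludes only \n (it would still match \r), so the sample below uses a bare \n as its EOL instead of \r\n.

```python
import re

# Simplified sample: a bare \n stands in for the EOL after the digit 5
text = "12345\n67890"

# Without (?s), the dot stops at the EOL: one match per line
per_line = re.findall(r".+", text)

# With (?s), the dot also matches the EOL: a single match for the whole text
whole = re.findall(r"(?s).+", text)

print(per_line)  # ['12345', '67890']
print(whole)     # ['12345\n67890']
```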

    Notes :

    The in-line modifiers (?s) and (?-s) have priority on the present state of the . matches new line option of the Find/Replace dialog. So :

    Even if that option is checked, the regex (?-s).+ would match any standard text, till an EOL character, excluded

    Even if that option is UNchecked, the regex (?s).+ would match all the subsequent text, till the end of the current file

    Keep in mind that the combined use of the (?s) and (?-s) in-line modifiers, in the same regex, may be very interesting. For instance, with the cursor at the very beginning of the file, the search regex (?s)(.+\R)(?-s)(.*123.*\R) and the replacement \2\1 would move the last line containing the string 123 to the very top of the file, ahead of all the text that preceded it, by swapping the two groups 1 and 2 !
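
    The group swap can be approximated in Python with a minimal sketch. Python's re module has no \R and, as far as I know, does not accept a bare (?-s) in the middle of a pattern, so the sketch assumes two substitutions: the scoped form (?s:...) around the first group, and \n as the EOL. The sample text is made up.

```python
import re

# Made-up sample: the third line is the last one containing 123
text = "line one\nline two\nxx 123 yy\nline four\n"

# (?s:...) lets the dot cross lines inside group 1 only;
# group 2 uses the default dot, which stays on a single line
pattern = r"(?s:(.+\n))(.*123.*\n)"

# Swap the two groups once, starting from the beginning of the text
result = re.sub(pattern, r"\2\1", text, count=1)

print(result)
```

    Here the line xx 123 yy ends up first, followed by line one, line two and line four.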

    VERY IMPORTANT :

    As above, for the (?m) modifier, if your regex does NOT contain any . dot symbol, the modifiers (?s) and/or (?-s) are quite USELESS !!

    Finally, let’s see the action of the two modifiers, m and s, used together. As a consequence of what I said just above, any regex containing these two modifiers should contain at least one dot meta-character and either a ^ or a $ symbol ! This will be the case here, as we’re going to use the two regexes .{100,350}$ and ^.{100,350}, each of them preceded by one of the four modifier forms below :

    (?s-m) ( in short : ^ and $ symbols match beginning and end of file / . symbol matches any character )

    (?-sm) ( in short : ^ and $ symbols match beginning and end of file / . symbol matches standard characters )

    (?m-s) ( in short : ^ and $ symbols match beginning and end of line / . symbol matches standard characters )

    (?sm) ( in short : ^ and $ symbols match beginning and end of line / . symbol matches any character )

    To clearly notice the differences, between all these cases, let’s use the test text, below, corresponding to some parts of the license.txt file, slightly changed :

    When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. ----- Test line, which contains 60 characters, ONLY ! ------ For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all

    Once this text, pasted in a new tab, and before applying the regexes, just verify that :

    The word When, beginning the first line of that text, is NOT preceded by any character

    The word all, ending the last line of that text, is NOT followed by any EOL character !

    And, preferably :

    Select the Word wrap behaviour, with the menu option View - Word wrap

    Select the Show all characters behaviour, with the menu option View - Show Symbols - Show All Characters

    Finally, in the Find dialog :

    UNCHECK the Wrap around option

    Select the Regular expression search mode

    One last piece of advice : before testing any of the regexes below, just move the cursor back to the very beginning of this sample text ( so, before the word When ), with the Ctrl + Home shortcut

    Thus :

    The regex (?s-m).{100,350}$ matches the last 350 characters, of the current file, spread out on one or several lines ( 1 occurrence )

    The regex (?-sm).{100,350}$ matches the maximum of the last characters, if between 100 and 350, of the very last line, of the current file ( 1 occurrence )

    The regex (?m-s).{100,350}$ matches the maximum of the last characters, if between 100 and 350, of any single line, of the current file ( 6 occurrences )

    The regex (?sm).{100,350}$ matches a maximum range of any character, if between 100 and 350, followed by an EOL character OR finishing the current file, in one or several lines, empty or not ( 5 occurrences )

    and :

    The regex (?s-m)^.{100,350} matches the first 350 characters, of the current file, spread out on one or several lines ( See, note below ! )

    The regex (?-sm)^.{100,350} matches the maximum of the first characters, if between 100 and 350, of the very first line, of the current file ( 1 occurrence )

    The regex (?m-s)^.{100,350} matches the maximum of the first characters, if between 100 and 350, of any single line, of the current file ( 6 occurrences )

    The regex (?sm)^.{100,350} matches a maximum range of any character, if between 100 and 350, preceded by an EOL character OR beginning the current file, in one or several lines, empty or not ( 4 occurrences )
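
    To get a feel for the four combinations on something smaller than the license text, here is a minimal Python sketch. It is only an approximation under stated assumptions: Python's re engine stands in for the N++ one, the flags re.DOTALL and re.MULTILINE stand in for (?s) and (?m), the quantifier is scaled down to {2,4}, and the three-line sample is made up. It does not reproduce the occurrence counts given above.

```python
import re

# Made-up sample: three lines, no EOL at the very end
text = "abcdef\nghijkl\nmn"

results = {
    # $ = end of file only, dot matches any character
    "(?s-m)": re.findall(r".{2,4}$", text, re.DOTALL),
    # $ = end of file only, dot matches standard characters
    "(?-sm)": re.findall(r".{2,4}$", text),
    # $ = end of each line, dot matches standard characters
    "(?m-s)": re.findall(r".{2,4}$", text, re.MULTILINE),
    # $ = end of each line, dot matches any character
    "(?sm)": re.findall(r".{2,4}$", text, re.DOTALL | re.MULTILINE),
}

for modifiers, matches in results.items():
    print(modifiers, matches)
```

    With (?s-m) the single match straddles the last EOL, whereas with (?sm) the last match starts with the EOL that precedes mn.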

    IMPORTANT :

    Due to an incorrect handling of backward assertions, the N++ regex engine may NOT produce, in some cases, the right matches ! This is precisely the case with the regex (?s-m)^.{100,350}, because of the backward assertion ^. The regex engine should find one match ONLY. However, it wrongly finds 5 occurrences :-((

    In fact, the regex engine seems, in that specific case, to use, instead, the regex (?s).{100,350}, which, simply, matches the longest string, till 350 characters, of any character, in one or several lines !

    I hope that this general overview will help you, in some cases !!

    Best Regards,

    guy038

  • want to open multiple instances without double-opening the same file

    Locked
    1
    0 Votes
    1 Posts
    2k Views
    No one has replied
  • Plugin Failures "Installation of --- Failed". Error

    Locked
    1
    0 Votes
    1 Posts
    2k Views
    No one has replied
  • 0 Votes
    2 Posts
    2k Views
    Vasile CarausV

    My backup works almost fine. I believe Notepad++ saves all files, even new unnamed ones, after a certain interval, something like 30 minutes; I don’t know exactly. I recently lost a file, at a time when I had 4 new unsaved files.

  • Skipping Blank Lines?

    Locked
    4
    0 Votes
    4 Posts
    3k Views
    Vasile CarausV

    Search:
    \s+(.*?)

    Replace with:
    (Leave one Space)

    Or another solution:

    Search:
    \R
    Replace by:
    (Leave one space)

  • New Search in Search Results Does Not Work

    Locked
    1
    0 Votes
    1 Posts
    1k Views
    No one has replied
  • Backus-Naur Form

    3
    0 Votes
    3 Posts
    3k Views
    Arjan WiskerkeA

    That is correct.
    I thought it ought to be on the standard syntax list. But that would be more of an argument about aesthetics. I don’t think it is going to be used very often.

    I appreciate your response. Sorry for my late reaction, I seldom come to this forum.

  • Margin Manipulation

    Locked
    10
    0 Votes
    10 Posts
    7k Views
    Claudia FrankC

    @Scott-Sumner

    Hi Scott,

    are you more interested in how to create your own marker symbol or how to show a marker
    in the margin when changes occur?

    A quick script for the latter would look like

    editor.setMarginTypeN(3,4)
    editor.setMarginWidthN(3,20)

    def callback_MODIFIED(args):
        if args['modificationType'] & 0x1:
            editor.marginSetText(editor.lineFromPosition(args['position']), '>')
        elif args['modificationType'] & 0x2:
            editor.marginSetText(editor.lineFromPosition(args['position']), '<')

    editor.clearCallbacks([SCINTILLANOTIFICATION.MODIFIED])
    editor.callback(callback_MODIFIED, [SCINTILLANOTIFICATION.MODIFIED])

    The script uses a text marker instead of a real symbol.
    ‘>’ should indicate that something was added, and … you know ;-)

    If it is about changing symbols, hmm …, I didn’t play with it yet. It would involve using

    SCI_MARKERDEFINEPIXMAP(int markerNumber, const char *xpm)
    SCI_MARKERDEFINERGBAIMAGE(int markerNumber, const char *pixels)

    and others to create the images first and then assign like @dail already showed.
    If there is something I can do, let me know - this would be much more interesting than
    writing boring documentation.

    Cheers
    Claudia

  • Line numbering starting at zero?

    Locked
    3
    0 Votes
    3 Posts
    5k Views
    dailD

    Since this got derailed I split the second half of the original discussion into a new topic here

  • function/code-block auto indentation

    Locked
    3
    0 Votes
    3 Posts
    2k Views
    Claudia FrankC

    @kczx3

    as MAPJe71 already said, not natively but with a scripting language plugin
    like python script or lua it should be possible to write a script which will do the job.

    Cheers
    Claudia

  • 0 Votes
    4 Posts
    3k Views
    Claudia FrankC

    @MAPJe71

    well, depends ;-) could be a goal or a gate or an anonymous web browser ;-)

    Cheers
    Claudia

  • UDL Autocomplete:keywords with space is shown in multiple lines

    Locked
    1
    0 Votes
    1 Posts
    1k Views
    No one has replied
  • Problem adding and copy-pasting UTF characters.

    Locked
    1
    0 Votes
    1 Posts
    3k Views
    No one has replied
  • Notepad++ default charset for newly opened files

    Locked
    1
    0 Votes
    1 Posts
    2k Views
    No one has replied
  • Wild card again

    Locked
    3
    0 Votes
    3 Posts
    2k Views
    Mark RomerM

    If you just want the file names, you can use a Regular expression find and replace. I’m not a regex expert, but this worked for me:

    Under Find what: ^.*\\(.+) n:.*$
    Under Replace with: $1

    What this does is it matches an entire line (^ means the start of the line and $ means the end of the line). Within that line, it creates a capture group (the bit in parentheses, .+, which is one or more of any character) that is preceded by the part from the beginning of the line to a slash, and followed by a space followed by “n:” followed by zero or more characters to the end of the line. The $1 in Replace with means to replace with the contents of the first capture group.

    This will take you from

    copy k:\apps\demex\system\ABBREV.DBF n:\apps\demex\system\ABBREV.DBF /Y
    copy k:\apps\demex\system\TEST.DBF n:\apps\demex\system\TEST.DBF /Y
    copy k:\apps\demex\system\"abc def ghi.DBF" n:\apps\demex\system\"abc def ghi.DBF" /Y
    copy k:\apps\demex\system\1234567890.DBF n:\apps\demex\system\1234567890.DBF /Y
    copy k:\apps\demex\system\Me.DBF n:\apps\demex\system\Me.DBF /Y

    to

    ABBREV.DBF
    TEST.DBF
    "abc def ghi.DBF"
    1234567890.DBF
    Me.DBF
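
    For anyone who wants to check the regex outside Notepad++, here is a minimal sketch in Python. It assumes Python's re module as a stand-in for the Notepad++ engine, with two adjustments: the first capture group is written \1 in the replacement, and re.MULTILINE is passed so that ^ and $ work line by line. Only two of the sample lines are reused.

```python
import re

# Two of the sample lines, joined into one multi-line string
lines = "\n".join([
    r"copy k:\apps\demex\system\ABBREV.DBF n:\apps\demex\system\ABBREV.DBF /Y",
    r'copy k:\apps\demex\system\"abc def ghi.DBF" n:\apps\demex\system\"abc def ghi.DBF" /Y',
])

# Keep only the capture group: the text between the last backslash
# before " n:" and the " n:" itself
names = re.sub(r"^.*\\(.+) n:.*$", r"\1", lines, flags=re.MULTILINE)

print(names)
```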
  • UDL: Colon(:) at end of line as a keyword to style the whole line

    Locked
    1
    0 Votes
    1 Posts
    1k Views
    No one has replied
  • Prefixes for numbers don't work with letters, when using own syntax

    4
    0 Votes
    4 Posts
    3k Views
    Сергей Чепцов68С

    @PeterCJ-AtWork Thanks a lot! It works - prefix 2 and extras 1. Really confusing…

  • Clean a file using Regex

    Locked
    3
    0 Votes
    3 Posts
    2k Views
    guy038G

    Hi all,

    In my previous post, I gave a general method to replace a specific character with another, everywhere in the lines of a delimited text, except for a range between columns c1 and c2. I now give you an extension of that method to SEVERAL fixed zones to exclude !

    I mean :

    ^---------- Zone 1 to exclude ------------ Zone 2 to exclude -------------------- Zone 3 to exclude ------------$

    So, let’s suppose the original text, below :

    abcd,04,,11111111, 22,ANYWORD ,ANYWORD,ANY,QZERTY,001,,5555,,AN,Y ANY,pqrst,00x,ANYWORD ,ANYWORD,9A9 ,Last Field,
    fghi,02,,22222222, 22,ANY ,ANY, WORDS, ANY,AZERTY,999,,6666,,ANY AN,Y,uvwxy,01y,ANY ,ANY, WORDS,,7Z3 ,Last Field,
    klmn,09,,33333333, 22,WORDS,ANY, WORDS,ANY,TEST-1,123,,7777,,ANY,,ANY,zabcd,02z,,ORDS,ANY, WORDS,3H5 ,Last Field,

    I defined 3 zones to exclude, where the comma character will NOT be changed during the S/R process :

    The zone 1, which starts at column 26 and ends at column 52 => S1 = 26 and E1 = 52
    The zone 2, which starts at column 65 and ends at column 72 => S2 = 65 and E2 = 72
    The zone 3, which starts at column 84 and ends at column 99 => S3 = 84 and E3 = 99

    As previously explained, we, temporarily, add the # or @ boundaries, in order to delimit these 3 zones, with the general S/R, below :

    ^(.{S1-1})(.{E1-S1+1})(.{S2-E1-1})(.{E2-S2+1})(.{S3-E2-1})(.{E3-S3+1})..............(.{Sn-E(n-1)-1})(.{En-Sn+1})

    With the given values of S1 through E3 above, we get the following S/R :

    SEARCH : ^(.{25})(.{27})(.{12})(.{8})(.{11})(.{16})

    REPLACE : \1#\2@\3#\4@\5#\6@

    which gives us the delimited text, below, with the boundaries :

    abcd,04,,11111111, 22,#ANYWORD ,ANYWORD,ANY,QZERTY@,001,,5555,,#AN,Y ANY@,pqrst,00x,#ANYWORD ,ANYWORD@,9A9 ,Last Field,
    fghi,02,,22222222, 22,#ANY ,ANY, WORDS, ANY,AZERTY@,999,,6666,,#ANY AN,Y@,uvwxy,01y,#ANY ,ANY, WORDS,@,7Z3 ,Last Field,
    klmn,09,,33333333, 22,#WORDS,ANY, WORDS,ANY,TEST-1@,123,,7777,,#ANY,,ANY@,zabcd,02z,#,ORDS,ANY, WORDS@,3H5 ,Last Field,
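
    The arithmetic behind the first S/R can be sketched in a few lines of Python. This is only an illustration: the helper name boundary_search_replace is hypothetical, and it assumes 1-based, inclusive columns, given as sorted, non-overlapping (start, end) pairs.

```python
def boundary_search_replace(zones):
    """Build the SEARCH and REPLACE strings from (start, end) column pairs.

    Columns are 1-based and inclusive; zones must be sorted and must not overlap.
    """
    search = "^"
    replace = ""
    previous_end = 0
    group = 1
    for start, end in zones:
        # One group for the text before the zone, one for the zone itself
        search += "(.{%d})(.{%d})" % (start - previous_end - 1, end - start + 1)
        # '#' opens the zone and '@' closes it
        replace += "\\%d#\\%d@" % (group, group + 1)
        previous_end = end
        group += 2
    return search, replace


search, replace = boundary_search_replace([(26, 52), (65, 72), (84, 99)])
print(search)   # ^(.{25})(.{27})(.{12})(.{8})(.{11})(.{16})
print(replace)  # \1#\2@\3#\4@\5#\6@
```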

    Then, running the second regex S/R, below :

    SEARCH : ,(?=[^@]*#)|,(?![^#]*@)|(#|@)

    REPLACE : (?1:_)

    we obtain the final text :

    abcd_04__11111111_ 22_ANYWORD ,ANYWORD,ANY,QZERTY_001__5555__AN,Y ANY_pqrst_00x_ANYWORD ,ANYWORD_9A9 _Last Field_
    fghi_02__22222222_ 22_ANY ,ANY, WORDS, ANY,AZERTY_999__6666__ANY AN,Y_uvwxy_01y_ANY ,ANY, WORDS,_7Z3 _Last Field_
    klmn_09__33333333_ 22_WORDS,ANY, WORDS,ANY,TEST-1_123__7777__ANY,,ANY_zabcd_02z_,ORDS,ANY, WORDS_3H5 _Last Field_

    As expected, all the commas, located from column 26 till column 52, from column 65 till column 72 and from column 84 till column 99, have NOT been changed into an underscore character !

    Notes :

    In comparison to the previous regexes, only the look-aheads of the second S/R, are slightly different :

    The positive look-ahead (?=[^@]*#) verifies that, from the cursor location, a # character, can be found further, on the current line scanned, without any @ character, between the cursor location and the # location

    The negative look-ahead (?![^#]*@) verifies that, from the cursor location, a @ character, cannot be found further, on the current line scanned, without any # character, between the cursor location and the @ location
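
    The second S/R can be tried on a short line with a minimal Python sketch. Python's re module has no conditional replacement such as (?1:_), so the sketch assumes a small replacement function (the name swap is hypothetical) plays that role. The marked line is made up.

```python
import re

# Hypothetical short line that already carries the # ... @ boundaries
marked = "a,b,#c,d@,e,#f,g@,h"

pattern = re.compile(r",(?=[^@]*#)|,(?![^#]*@)|(#|@)")


def swap(match):
    # Group 1 is set only when a boundary (# or @) matched: delete it.
    # Otherwise a comma located outside the zones matched: write '_'.
    return "" if match.group(1) else "_"


cleaned = pattern.sub(swap, marked)
print(cleaned)  # a_b_c,d_e_f,g_h
```

    The commas between each # and the following @ are left alone, while every other comma becomes an underscore and the boundaries disappear.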

    Cheers,

    guy038

  • New to notepad and html; work won't show up in browsers

    Locked
    3
    0 Votes
    3 Posts
    5k Views
    Ellen Brewer JohnE

    Thank you very much for your reply! I was able to sort out my issues.
