replace in selected area



  • Hi

    I have a file like this:

    *JSP - 060 N N
    1309,com.sapiens.mig.ri.v050m00.RI17483,70
    1309,com.sapiens.mig.ri.v050m00.RI17483,70

    *SQL - 812 N N #SQL01
    *DDC - 045 N N
    *JRN - 055 N N

    I am looking for the easiest way (macro?) to parse only these lines:

    1309,com.sapiens.mig.ri.v050m00.RI17483,70
    1309,com.sapiens.mig.ri.v050m00.RI17484,40

    and replace them with :

    1309 RI17483 70
    1309 RI17484 40

    In other words:

    • I want to select these lines

    • replace dots & commas in these lines with spaces

    • replace each selected line with the 1st word then space then 6th word then space then 7th word

    so that line :
    1309,com.sapiens.mig.ri.v050m00.RI17483,70
    will be replaced with
    1309 RI17483 70

    and all the other non-selected lines in the file will be left unchanged.

    Can this be done ???

    Many thanks



  • Hello , Gonen Shoham,

    Of course, it can -:))

    I noticed that, in the end, you would like to change ONLY the lines which begin with a digit, wouldn’t you ?

    If so, you can achieve all your changes, in one go, with a regex Search-Replacement !

    • Move back to the very beginning of your file ( Ctrl + Home )

    • Open the Replace dialog ( Ctrl + H )

    • In the Find what: zone, type (?-s)^(?=\d)(\w+).+\W+(\w+)\W+(\w+)$

    • In the Replace with: zone, type \1 \2 \3 , with a space character after digits 1 and 2

    • Select the Regular expression search mode

    • Click on the Replace All button

    Et voilà !!
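
    Outside Notepad++, the same S/R can be tried in Python. This is only a minimal sketch, assuming Python's standard re module rather than the N++ regex engine : the names PATTERN, SAMPLE and reduce_lines are illustrative, and the (?-s) modifier is dropped because Python's dot already excludes EOL characters by default.

```python
import re

# Python's `re` has no stand-alone (?-s) switch; the dot already skips
# newlines by default. re.MULTILINE makes ^ and $ anchor at every line.
PATTERN = re.compile(r"^(?=\d)(\w+).+\W+(\w+)\W+(\w+)$", re.MULTILINE)

# The example file from the question, without the blank lines.
SAMPLE = (
    "*JSP - 060 N N\n"
    "1309,com.sapiens.mig.ri.v050m00.RI17483,70\n"
    "1309,com.sapiens.mig.ri.v050m00.RI17483,70\n"
    "*SQL - 812 N N #SQL01\n"
    "*DDC - 045 N N\n"
    "*JRN - 055 N N\n"
)


def reduce_lines(text: str) -> str:
    """Rewrite digit-led lines as 'first penultimate last'; keep the rest."""
    return PATTERN.sub(r"\1 \2 \3", text)


if __name__ == "__main__":
    print(reduce_lines(SAMPLE), end="")
```

    Lines that do not begin with a digit never satisfy the look-ahead, so they pass through untouched.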

    NOTES :

    • As usual, the in-line modifier (?-s) ensures that the dot meta-character ( . ) matches standard characters ONLY, and not EOL characters

    • Then, the part ^(?=\d) is called a positive look-ahead, that is to say a condition which verifies that, at the beginning ( ^ ) of each line, a digit ( \d ) does occur ( Note that this condition must be true for the overall search regex to match. However, look-arounds do not consume any character, so the regex engine position is still at the very beginning of the line )

    • So, the part (\w+) represents the first non-null range of word characters, of each line, which is stored as group 1, due to the parentheses

    • Then, the part .+ stands for any non-null range of standard characters, till the regex engine reaches :

      • A non-null list of NON-word characters ( \W+ ), followed by :

      • A non-null range of word characters ( (\w+) ), stored as group 2, followed by :

      • Another non-null list of NON-word characters ( \W+ ), followed by :

      • Another non-null range of word characters ( (\w+) ), stored as group 3, finally followed by the end of the line ( $ )

    • In replacement, we just re-write the contents of the three groups 1 ( first word ), 2 ( penultimate word ) and 3 ( last word ), separated by one space character.
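
    To see these notes at work on a single line, here is a small sketch, again assuming Python's re module instead of the N++ engine ( LINE, lookahead and full are illustrative names ). It checks that the look-ahead matches without consuming anything and shows which text ends up in each group.

```python
import re

LINE = "1309,com.sapiens.mig.ri.v050m00.RI17483,70"

# The look-ahead on its own: it succeeds because a digit follows,
# but the match is zero-width, so nothing is consumed.
lookahead = re.match(r"^(?=\d)", LINE)

# The full pattern applied to one line (no multi-line flag needed here).
full = re.match(r"^(?=\d)(\w+).+\W+(\w+)\W+(\w+)$", LINE)

# A line starting with a non-digit fails the look-ahead immediately.
rejected = re.match(r"^(?=\d)(\w+).+\W+(\w+)\W+(\w+)$", "*JSP - 060 N N")

if __name__ == "__main__":
    print(lookahead.span())  # start and end of the zero-width match
    print(full.groups())     # first, penultimate and last word
    print(rejected)          # None when the line is left alone
```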

    REMARKS :

    • The class \w represents, in the common ASCII range of characters, the class [A-Za-z0-9_] ( upper-case letters, lower-case letters, digits and the low line character ). But, if your present file has a Unicode encoding ( UTF-8 or UCS-2 ), the range of word characters ( \w ), in N++, is extended to any character considered as a letter or digit in the Latin, Greek, Cyrillic, Hebrew and Arabic scripts, correctly displayed with the default Courier New N++ font !

    • And the inverse class, \W, stands for any NON-word character, that is to say, any character which does NOT belong to the word characters range
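
    The difference between an ASCII-only and a Unicode-aware \w can be illustrated with the sketch below. Note the assumption : it shows the behaviour of Python's re module ( where the re.ASCII flag restricts \w to [A-Za-z0-9_] ), not of the N++ engine, whose exact character coverage may differ. TEXT is an invented sample string.

```python
import re

# "caf\u00e9_1" contains an accented e, "\u0394x" starts with a Greek delta.
TEXT = "caf\u00e9_1 \u0394x-42"

# Default for str patterns in Python 3: \w is Unicode-aware,
# so accented and Greek letters count as word characters.
unicode_words = re.findall(r"\w+", TEXT)

# With re.ASCII, \w only covers [A-Za-z0-9_]; the accented e and
# the delta become NON-word characters and split the words.
ascii_words = re.findall(r"\w+", TEXT, flags=re.ASCII)

if __name__ == "__main__":
    print(unicode_words)
    print(ascii_words)
```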

    So, from your example :

    *JSP - 060 N N
    1309,com.sapiens.mig.ri.v050m00.RI17483,70
    1309,com.sapiens.mig.ri.v050m00.RI17483,70
    *SQL - 812 N N #SQL01
    *DDC - 045 N N
    *JRN - 055 N N
    

    we would obtain the modified text, below :

    *JSP - 060 N N
    1309 RI17483 70
    1309 RI17483 70
    *SQL - 812 N N #SQL01
    *DDC - 045 N N
    *JRN - 055 N N
    

    Even if the non-selected lines begin with a symbol other than the star ( anything except a digit ), AND/OR the different words are separated by several non-word characters, our S/R always behaves as expected :-) For instance, consider the text below, where I inserted numerous non-word characters, including some space and tabulation characters, in the two lines that have to be changed, and where I changed the first character of the lines which have to stay unchanged :

    +JSP - 060 N N
    1309;com---sapiens~mig&ri>>>>>v050m00    RI17483,@@@@@@@70
    1309:com"""sapiens===mig/ri!v050m00			RI17483%%%70
    *SQL - 812 N N #SQL01
    DDC - 045 N N
    JRN - 055 N N
    

    We still get the result below !

    +JSP - 060 N N
    1309 RI17483 70
    1309 RI17483 70
    *SQL - 812 N N #SQL01
    DDC - 045 N N
    JRN - 055 N N
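
    This robustness can also be checked with a line-by-line variant, sketched here under the assumption of Python's re module ( LINE_PATTERN, reduce_line, reduce_text and MESSY are illustrative names ). Handling each line separately with fullmatch replaces the ^ and $ anchors and guarantees that \W+, which can also match an EOL character, never joins two lines.

```python
import re

# Same pattern without ^ and $: fullmatch() already requires the
# whole line to match.
LINE_PATTERN = re.compile(r"(?=\d)(\w+).+\W+(\w+)\W+(\w+)")

# The modified example, with odd separators, spaces and tabulations.
MESSY = (
    "+JSP - 060 N N\n"
    "1309;com---sapiens~mig&ri>>>>>v050m00    RI17483,@@@@@@@70\n"
    '1309:com"""sapiens===mig/ri!v050m00\t\t\tRI17483%%%70\n'
    "*SQL - 812 N N #SQL01\n"
    "DDC - 045 N N\n"
    "JRN - 055 N N\n"
)


def reduce_line(line: str) -> str:
    """Return 'first penultimate last' for digit-led lines, else the line."""
    m = LINE_PATTERN.fullmatch(line)
    return f"{m[1]} {m[2]} {m[3]}" if m else line


def reduce_text(text: str) -> str:
    """Apply reduce_line to every line, preserving the line breaks."""
    return "\n".join(reduce_line(line) for line in text.split("\n"))


if __name__ == "__main__":
    print(reduce_text(MESSY), end="")
```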
    

    Best Regards,

    guy038

