
    help replacing

    • Perry Sticca

      guy038: It took about 10 minutes to go through the whole file, but it worked! Merci!

      I am most appreciative of you spending the time to figure that out and help me - I hope it did not take you very long.

      The first time I tried it, it did not replace anything. That is because in my “huge” file, the 6-digit numbers are not immediately followed by a comma - they have a few blanks before the comma. But in my simplified example that I provided (and you used), I edited it so all of the lines had a comma before and after each 6-digit account number.

      So, that’s how you wrote the " ,(\d{6})(?s)(?=,.*\1(,\d{7})) " expression. Once I did a find and replace on my huge file, and removed any blanks between the account number and the comma, your expression worked perfectly.

      Is it possible to modify your regex to accommodate a file where the account numbers are immediately followed by one or more blanks, and then a comma?

      • guy038

        Hi, @perry-sticca and All,

        No problem at all ! I can, even, give you two regexes :-))

        The two regexes below both detect any six-digit account number followed by zero or more consecutive space or tabulation characters before a comma.

        With the regex ,(\d{6})(?s)(?=\h*,.*\1(,\d{7})), the new seven-digit account number is written, but any blank characters after the old account number are kept in the file.

        And with the regex ,(\d{6})\h*(?s)(?=,.*\1(,\d{7})), the new seven-digit account number replaces the old six-digit account number, as well as any blank characters located after it!

        Notes :

        • The syntax \h matches a single horizontal blank character, such as the Space ( \x20 ), the Tabulation ( \x09 ) or the No-Break Space ( \xA0 )

        • The quantifier * stands for 0 or more occurrences of the preceding element, here the blank character \h
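For readers who want to check the difference between the two variants outside Notepad++, here is a rough Python sketch. It rests on several assumptions: the sample data and its layout (account lines first, old/new number pairs further down) are invented, and the replacement `\2` is assumed because this part of the thread does not show the Replace field. Python's `re` module has no `\h`, and recent versions reject `(?s)` in the middle of a pattern, so the sketch substitutes `[ \t\xa0]` and the scoped form `(?s:...)`.

```python
import re

# Assumed stand-in for \h: space, tab, no-break space.
BLANK = r"[ \t\xa0]"

# Variant 1: the blanks are only inspected inside the look-ahead, so they stay in the file.
KEEP_BLANKS = re.compile(r",(\d{6})(?=" + BLANK + r"*,(?s:.*)\1(,\d{7}))")

# Variant 2: the blanks belong to the match itself, so they are replaced as well.
DROP_BLANKS = re.compile(r",(\d{6})" + BLANK + r"*(?=,(?s:.*)\1(,\d{7}))")

# Invented sample: two data lines, then "old,new" account pairs.
sample = (
    "Smith,123456  ,100.00\n"
    "Jones,654321,250.00\n"
    "123456,1234567\n"
    "654321,7654321\n"
)

# Group 2 is the comma plus the seven-digit number found later in the text.
kept = KEEP_BLANKS.sub(r"\2", sample)
dropped = DROP_BLANKS.sub(r"\2", sample)

print(kept)     # first line becomes "Smith,1234567  ,100.00"
print(dropped)  # first line becomes "Smith,1234567,100.00"
```

In both cases the pair lines at the bottom are left alone, because there the six digits are followed by a seventh digit rather than by blanks and a comma.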

        Cheers,

        guy038

        • guy038

          Hi @adam-creason, @scott-sumner and All,

          While solving Perry Sticca's problem ( see above ), I realized, Adam and Scott, that the repeated Replace All actions can be avoided if we switch the location of Adam's text1 and text2 blocks :-))

          So let's suppose we have the text below :

          text2:
          
          name 0002-
          name 0001-
          name 1000-
          name 0003-
          
          text 1:
          
          name 0001-first value
          name 0002-second value
          name 0003-third value
          name 1000-one thousandth value
          

          Then, the regex :

          SEARCH (name \d{4}-)(?s)(?=\R.*\1((?-s).+))

          REPLACE $0\2

          would change it, after a SINGLE click on the Replace All button, into :

          text2:
          
          name 0002-second value
          name 0001-first value
          name 1000-one thousandth value
          name 0003-third value
          
          text 1:
          
          name 0001-first value
          name 0002-second value
          name 0003-third value
          name 1000-one thousandth value
          

          Magic, isn’t it !


          Notes :

          • The first part, (name \d{4}-), is the text to search for, stored as group 1

          • But it matches ONLY IF the positive look-ahead (?=\R.*\1((?-s).+)), preceded by the (?s) modifier, is true. That is to say, if group 1 is immediately followed by EOL character(s), then by any range of characters, possibly spanning several lines because of the (?s) modifier, up to another occurrence of group 1, and finally by the remainder of that line only, because of the (?-s) modifier, which is stored as group 2

          • In replacement, we rewrite the entire match $0, followed by group 2. Since the look-ahead consumes no characters, $0 is the same text as group 1, so the \1\2 replacement would give identical results !
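The same single-pass lookup can be tried in Python as a sanity check. This is only a sketch under assumptions: `\n` stands in for `\R` (so the text is assumed to use LF line endings), the scoped `(?s:...)` replaces the inline `(?s)` / `(?-s)` switches, and `\g<0>` is Python's spelling of `$0`.

```python
import re

# (name \d{4}-)  group 1: the key that needs completing
# (?=\n ...)     look-ahead: the key must be directly followed by a line break
# (?s:.*)\1      then anything, across lines, up to the same key again
# (.+)           group 2: the rest of that later line (no DOTALL here, so one line only)
PATTERN = re.compile(r"(name \d{4}-)(?=\n(?s:.*)\1(.+))")

text = (
    "text2:\n"
    "\n"
    "name 0002-\n"
    "name 0001-\n"
    "name 1000-\n"
    "name 0003-\n"
    "\n"
    "text 1:\n"
    "\n"
    "name 0001-first value\n"
    "name 0002-second value\n"
    "name 0003-third value\n"
    "name 1000-one thousandth value\n"
)

# Rewrite the whole match, then append the value captured inside the look-ahead.
result = PATTERN.sub(r"\g<0>\2", text)
print(result)
```

The lines of the second block are untouched because their key is followed by a value, not by a line break, so the look-ahead fails there.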

          Cheers,

          guy038

          • Thomas Daryl Phillips II

            Sorry, but I'm hugely out of my depth and it's hard to interpret the changes being made for the other two people's tasks… This is what I have:

                    <game name="Bomberman (USA)">
            			<description>Bomberman (USA)</description>
            			<rom crc="DB9DCF89" md5="0F9C8D3D3099C70368411F6DF9EF49C1" name="No_Intro_Sept_2016/No_Intro_N3.zip/Nintendo%20-%20Nintendo%20Entertainment%20System%2FBomberman%20%28USA%29.zip" sha1="D2BF7BD570430902114F1E3393F1FEB8B1C76E4D" size="16190" />
            			<title_clean>Bomberman</title_clean>
            			<plot>blah blah blah description</plot>
            			<releasedate>11/5/1987</releasedate>
            			<year>1985</year>
            			<genre>Action</genre>
            			<studio>Hudson Soft Company, Ltd.</studio>
            			<nplayers>1</nplayers>
            			<perspective>Top-Down</perspective>
            			<rating>3.4</rating>
            			<ESRB>E - Everyone</ESRB>
            			<videoid>CZ9Pu9Usk5o</videoid>
            			<thegamesdb_id>1040</thegamesdb_id>
            			<gamefaqs_url>http://www.gamefaqs.com/nes/563390-bomberman</gamefaqs_url>
            			<mobygames_url>http://www.mobygames.com/game/nes/bomberman-</mobygames_url>
            			<giantbomb_url>http://www.giantbomb.com/bomberman/3030-20589/</giantbomb_url>
            			<consolegrid_url>http://consolegrid.com/games/98</consolegrid_url>
            			<snapshot1>http://i.imgur.com/qBmuc17.jpg</snapshot1>
            			<fanart1>http://i.imgur.com/PL7xZJD.jpg</fanart1>
            		</game>
            		<game name="Bomber Man II (Japan)">
            			<description>Bomber Man II (Japan)</description>
            			<rom crc="0C401790" md5="E8DD578E17C4326D5E6E9C916B2328A1" name="No_Intro_Sept_2016/No_Intro_N3.zip/Nintendo%20-%20Nintendo%20Entertainment%20System%2FBomber%20Man%20II%20%28Japan%29.zip" sha1="CD665ACEA15A4542A9E4CF16A7CA2CE53C88726D" size="67022" />
            			<title_clean>Bomber Man II</title_clean>
            			<plot>blaaaaaaah</plot>
            			<releasedate>28/2/1993</releasedate>
            			<year>1991</year>
            			<genre>Action</genre>
            			<studio>Hudson Soft Company, Ltd., Hudson Soft USA, Inc.</studio>
            			<nplayers>1-3 VS</nplayers>
            			<perspective>Top-Down</perspective>
            			<rating>4.0</rating>
            			<videoid>7K6Ktv6G_j0</videoid>
            			<thegamesdb_id>1653</thegamesdb_id>
            			<gamefaqs_url>http://www.gamefaqs.com/nes/587150-bomberman-ii</gamefaqs_url>
            			<mobygames_url>http://www.mobygames.com/game/nes/bomberman-ii</mobygames_url>
            			<giantbomb_url>http://www.giantbomb.com/bomberman-ii/3030-5993/</giantbomb_url>
            			<consolegrid_url>http://consolegrid.com/games/7199</consolegrid_url>
            			<boxart1>http://i.imgur.com/iQH8lAk.jpg</boxart1>
            			<snapshot1>http://i.imgur.com/8lyzbhy.jpg</snapshot1>
            			<fanart1>http://i.imgur.com/wXBXYhu.jpg</fanart1>
            			<banner1>http://i.imgur.com/kv37dnC.png</banner1>
            		</game>
            

            I want to remove all non-US licensed games (thousands, spanning almost 30 lists).
            I've been folding everything, then batch-marking by searching for (Japan)">

            and then removing all bookmarked lines, assuming that would delete everything within the folded blocks, but it's not turning out that way at all.

            Many or all of them just lose the first (bookmarked) line and keep the rest of the data, which immediately breaks the launcher.

            Please help, and THANK YOU!

            • Scott Sumner @Thomas Daryl Phillips II

              @Thomas-Daryl-Phillips-II

              Your task isn’t really related to the earlier 2 tasks–those were trying to replace text somewhere in a document based upon some text somewhere else in the doc. You just want to find text and replace it (removal by replacement with nothing still qualifies).

              Marking/bookmarking text is problematic here because your text spans multiple lines, and as you found, only the first line of a match is bookmarked. Thus you can’t follow up the marking with a delete-bookmarked-lines command.

              So I think a search for the following could do what you want. It may not be the best way to do it, but it gets the job done:

              Find what zone: (?-s)^\s*<game(?=.*\(Japan\))(?s).*?</game>\R

              If this (or ANY posting on the Notepad++ Community site) is useful, don’t reply with a “thanks”, simply up-vote ( click the ^ in the ^ 0 v area on the right ).

              • Thomas Daryl Phillips II

                @Scott-Sumner said:

                (?-s)^\s*<game(?=.*\(Japan\))(?s).*?</game>\R

                I'm sorry to ask this. You already helped me so much… I've been banging my head into the wall for days over this!

                But could you please break down and explain how that selected exactly what I needed to delete?
                I wish to understand it so I can edit the command to fit similar filtering needs.

                I can see you used <game, (Japan) and </game> as the keywords.

                Could you break down the expressions used, step by step?

                • Scott Sumner @Thomas Daryl Phillips II

                  @Thomas-Daryl-Phillips-II

                  Sure. I guess that means it worked for you. Okay, step by step I’ll break down the regular expression:

                  (?-s): for whatever follows, when a . is used, only allow a match on the current line (a . is a “wildcard” for “any character”)

                  ^: from the start of a line

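                  \s*: match any amount of whitespace (spaces or tabs; note that \s can also match line-ending characters, so a blank line just before <game may be swallowed too)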

                  <game: match <game exactly

                  (?=.*\(Japan\)): keep the match going only if the exact text (Japan) occurs later on the same line (the “same line” part is due to the (?-s) from earlier)–note that this is just saying “keep the match alive”, it doesn’t include any “Japan” text in the actual match!

                  (?s): switch to saying that for whatever follows, a . is allowed to match any character across line boundaries

                  .*?: minimally match any number of characters until what comes next is satisfied–note this is what actually makes “Japan” part of the real match

                  </game>: match </game> exactly

                  \R: match a line-ending

                  I think I hit it all…and like I said earlier, I didn’t analyze the problem to death so I’m sure there are better ways to do it. But as this forum is about Notepad++ and how to get things done with it, and NOT about how to craft the best-ever regular expression, I can let it go… :-)
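The breakdown above can be replayed in Python on a cut-down sample. Treat it as a sketch under assumptions rather than the exact Notepad++ behaviour: the two-entry sample is a shortened, made-up version of the posted data, `re.MULTILINE` gives `^` its start-of-line meaning, `(?s:...)` replaces the inline `(?s)` switch, `\r?\n` stands in for `\R`, and `[ \t]*` is used instead of `\s*` so the match cannot start on a preceding blank line.

```python
import re

JAPAN_BLOCK = re.compile(
    r"^[ \t]*"             # leading indentation, from the start of a line
    r"<game"               # the literal opening of a game entry
    r"(?=.*\(Japan\))"     # only continue if (Japan) appears later on this same line
    r"(?s:.*?)"            # shortest possible run of anything, across lines ...
    r"</game>"             # ... up to the first closing tag
    r"\r?\n",              # plus the line ending, so no empty line is left behind
    re.MULTILINE,
)

# Shortened, hypothetical sample with one entry to keep and one to remove.
games = (
    '\t<game name="Bomberman (USA)">\n'
    '\t\t<description>Bomberman (USA)</description>\n'
    '\t</game>\n'
    '\t<game name="Bomber Man II (Japan)">\n'
    '\t\t<description>Bomber Man II (Japan)</description>\n'
    '\t</game>\n'
)

# Replacing each match with nothing deletes the whole multi-line entry.
cleaned = JAPAN_BLOCK.sub("", games)
print(cleaned)
```

Only the `<game ...>` line is tested for `(Japan)`; the lazy `.*?` then carries the match through to that entry's own `</game>`, which is why the whole block disappears rather than just its first line.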

                  • Thomas Daryl Phillips II

                    There's one thing I'm confused about, which is the backslashes used in \(Japan\).

                    • Scott Sumner @Thomas Daryl Phillips II

                      @Thomas-Daryl-Phillips-II

                      You can see that ( and ) are used in a few other places in the regular expression. This is a clue that these characters have special meaning. So…if you have literal ( and ) that you need to match in your text, you need to put them in as \( and \). The \ is an instruction to say “interpret the following symbol literally”.

                      I don’t know your exact needs–it could very well be that matching Japan and not the more restrictive (Japan) meets your need. If so, you could change to: (?-s)^\s*<game(?=.*Japan)(?s).*?</game>\R
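To see what the looser pattern changes in practice, here is a small Python comparison. It is a sketch with assumptions: the title "Made-Up Japan Tour (USA)" is invented to show a false positive, and the patterns are Python adaptations (`re.MULTILINE`, `(?s:...)`, `[ \t]*`, `\r?\n`) of the Notepad++ expressions rather than the originals.

```python
import re

TAIL = r"(?s:.*?)</game>\r?\n"
STRICT = re.compile(r"^[ \t]*<game(?=.*\(Japan\))" + TAIL, re.MULTILINE)
LOOSE = re.compile(r"^[ \t]*<game(?=.*Japan)" + TAIL, re.MULTILINE)

# Hypothetical entries: a US release with "Japan" in its title, and a Japanese release.
sample = (
    '<game name="Made-Up Japan Tour (USA)">\n'
    '</game>\n'
    '<game name="Bomber Man II (Japan)">\n'
    '</game>\n'
)

strict_result = STRICT.sub("", sample)  # keeps the US entry
loose_result = LOOSE.sub("", sample)    # removes both entries

# re.escape adds the backslashes for you when the text must be matched literally.
escaped = re.escape("(Japan)")

print(strict_result)
print(repr(loose_result))
print(escaped)
```

So the unescaped word is fine when "Japan" can only ever appear as the region tag, while the escaped form is the safer choice when titles themselves might contain it.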

                      • Thomas Daryl Phillips II

                        This is everything I need for a HUGE chunk of my project.

                        You saved my sanity and I appreciate it!!!
