Search and remove items within tags.

  • J
    John Thompson 0
    last edited by Mar 10, 2021, 1:25 AM

    Every week I publish a page that is growing longer and longer as popularity grows.

    I take a webpage, put it into notepad, and clean it up. Mostly I dump some columns within a table, then tighten it up from there to make a cleaned out short version.

    The question is this: right now I do a search for [img], which is BBCode indicating an image. Of course the whole line is

    [img]path-to-photo.jpg[/img]
    

    Right now I just search the doc for all [img] tags, then go one by one deleting the image calls through the whole page. Normally there are 75-125 hits per task.

    It gets to be very tiresome doing this all manually. What I’m hoping to find is a way to have Notepad++ delete everything from [img] to [/img]. But of course a regular search doesn’t work because each path is going to be different.

    is there a plugin or a way I could search dynamically and have notepad just delete anything that has the [img] tag?

    Sorry to ask such a simple question but it sure would save me some time!

    Thanks~

    • T
      Terry R
      last edited by Terry R Mar 10, 2021, 1:53 AM Mar 10, 2021, 1:50 AM

      @John-Thompson-0 said in Search and remove items within tags.:

      is there a plugin or a way I could search dynamically and have notepad just delete anything that has the [img] tag?

      Sure is. It’s called a regular expression or regex. It uses codes to identify characters in several ways.

      So you haven’t stated if ALL the [img]...[/img] tags will ALWAYS be on 1 line but a possible idea would be to look for a range of characters following the [img] up until (and including) the following [/img] tag.

      So using the “Replace” function we have:
      Find What: (?-s)\[img\].+?\[/img\]
      Replace With: empty field here
      This is a regex so the search mode must be regular expression. You can have the wrap around button ticked, or make sure the cursor is in the very first position of the file before starting.

      With this you can use the “Find” button initially and it will locate (and highlight) the first occurrence. At this point, if you want to delete it, click the “Replace” button. As the replacement field is empty, this effectively means delete the highlighted text. The next occurrence of the tag will then be highlighted, ready for you to press either Find or Replace. You would use the Find button if NOT wanting to delete the highlighted occurrence.

      So as a suggestion, use it with the Find/Replace buttons for a one time process. If you are happy that it correctly highlights every occurrence the next time you process a file you could use the “Replace All” button. This finds and Replaces (so deletes) all occurrences with the one click.

      As stated above this will ONLY find those occurrences within 1 line. If you have occurrences that occur over 2 (or more) lines then a change will be required. It’s as simple as changing the (?-s) to (?s).
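For anyone who wants to try the single-line versus multi-line behaviour outside the editor, here is a minimal sketch in Python. It is an illustration only: Python's re module is a different engine from the one inside Notepad++, the sketch uses the re.DOTALL flag in place of the (?s) prefix, and the function name and sample text are made up for the example.

```python
import re

# Same pattern as the Find What field, minus the (?-s)/(?s) prefix.
# The re.DOTALL flag below plays the role of (?s). (Assumption: Python's
# re module, not the Notepad++ engine.)
IMG_TAG = r"\[img\].+?\[/img\]"


def strip_img_tags(text: str, span_lines: bool = False) -> str:
    """Delete every [img]...[/img] run, like Replace All with an empty field."""
    flags = re.DOTALL if span_lines else 0
    return re.sub(IMG_TAG, "", text, flags=flags)


# Hypothetical sample: the second tag is broken across two lines.
sample = "intro [img]a.jpg[/img] middle [img]b\n.jpg[/img] end"

one_line = strip_img_tags(sample)                     # dot stops at end of line
multi_line = strip_img_tags(sample, span_lines=True)  # dot also crosses line ends
```

With the dot limited to one line, only the first tag is removed and the tag that wraps onto a second line is left alone. With span_lines=True both tags go.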

      Come back with your results. We may need to alter it if you find some are missed, or even some other text is highlighted when it shouldn’t be.

      We can also give you a bit of background into the codes used in this regex.

      Terry

      PS had to edit as forgot the markdown engine driving these posts ate some of my \ characters.

      • J
        John Thompson 0
        last edited by Mar 10, 2021, 6:09 AM

        Beautiful. This is going to save a huge bit of time each week!

        Sometimes the img tag goes on for two lines so I will need to use (?s) for sure.

        I will be running this shortly and will report my findings. Thank you SO much!

        • J
          John Thompson 0 @Terry R
          last edited by Mar 10, 2021, 3:23 PM

          @Terry-R

          I can tell you for sure that my first trial it worked great. I put 6 different img tags in the doc and ran the search/replace and it worked perfectly. I think this is going to do it.

          You have no idea how helpful this is going to be for me.

          Thank you!

          • A
            Alan Kilborn @John Thompson 0
            last edited by Mar 10, 2021, 3:46 PM

            @John-Thompson-0

            So what you’ve been given to solve your problem is one of the most basic and core substitutions of this type that you can do.
            You can see the power of it now, I’m sure.
            Do yourself a favor and acquaint yourself with other similar techniques buy having a read HERE.

            • A
              Alan Kilborn @Alan Kilborn
              last edited by Mar 10, 2021, 6:40 PM

              @Alan-Kilborn said:

              …buy having a read HERE…

              Contrary to what that says, nothing to “buy”, it’s all FREE. :-)

              (Correction: … by having a read…)

              • T
                Terry R
                last edited by Mar 10, 2021, 8:21 PM

                @John-Thompson-0 said in Search and remove items within tags.:

                I take a webpage, put it into notepad, and clean it up. Mostly I dump some columns within a table, then tighten it up from there to make a cleaned out short version.

                As @Alan-Kilborn said, look at the regex documentation and try and start the learning process. Since your above statement suggests you have other editing to do, maybe your “new found” knowledge could be put to good use in doing more of the manual edits you perform.

                Regex is awesome in performing lots of editing, so long as the edit can be explained logically. As examples:
                “I have this 3 character code and I need to delete it and the following text until I reach an end of line.”
                “I have this number, it can be 7-10 characters long, and I need it formatted with the first 3, followed by a ‘-’, and then the rest of the number.”
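The two examples above can be sketched roughly as follows. This is Python rather than the Notepad++ Replace dialog, and it rests on assumptions not in the post: the 3 character code is the made-up text XYZ, and the "number" is taken to mean a run of 7 to 10 digits.

```python
import re

# Example 1: delete a 3 character code and everything after it up to the
# end of the line. "XYZ" is a hypothetical code chosen for this sketch.
# By default the dot does not match a newline, so .* stops at end of line.
text1 = "keep XYZ drop this\nnext line"
cleaned = re.sub(r"XYZ.*", "", text1)

# Example 2: a 7-10 digit number, reformatted as the first 3 digits,
# a "-", then the rest. (Assumption: the number consists of digits only.)
# \b keeps the pattern from matching part of a longer run of digits.
text2 = "call 5551234567 or 1234567 today"
formatted = re.sub(r"\b(\d{3})(\d{4,7})\b", r"\1-\2", text2)
```

In the replacement, \1 and \2 refer back to the two parenthesised groups: the first 3 digits and the remaining 4 to 7 digits.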

                It could possibly also do your column editing for you (very likely, I’d say). You do need to be prepared to spend a bit of time learning. Attempt to see if you can get a working regex for the column editing. We are here to assist but do like to see some ideas that you have tried. If needing to present examples, do so in the same manner as your first post (inside the black box), as that prevents the posting engine from potentially mangling the data.

                And as a background to my supplied regex the description is as follows:
                (?-s) - as you found out, this refers to a single line. Actually it means the . (dot) character will not include the end of line (EOL) markers (carriage return and line feed). The (?s) means the . character will include EOL markers.
                \[img\] - note here that I included the \ character as the [ and ] are special. The \ tells the regex engine that it’s the actual character I want, not the special meaning.
                .+? - this is where the real fun begins. The . means a single character position. The + means one or more of them and is “greedy” by default, so it takes as many as allowed. The ? after it turns the greedy into “lazy”, so it takes as few as possible.
                \[/img\] - this is again looking for the actual text [/img].

                So for the regex to succeed it must complete the entire “formula”. This forces the lazy portion .+? to continue adding characters one by one until it finds the following portion, the [/img].
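The difference between greedy and lazy is easiest to see on a line that holds two tags. A minimal sketch, again in Python as a stand-in for the Notepad++ engine, with a made-up sample line:

```python
import re

# Hypothetical line with two image tags and text worth keeping between them.
line = "[img]one.jpg[/img] keep this [img]two.jpg[/img]"

# Greedy: .+ grabs as much as it can, so the match runs from the first
# [img] all the way to the LAST [/img] on the line.
greedy = re.findall(r"\[img\].+\[/img\]", line)

# Lazy: .+? grows one character at a time and stops at the FIRST [/img]
# it reaches, so each tag is matched on its own.
lazy = re.findall(r"\[img\].+?\[/img\]", line)
```

Here greedy holds a single match covering the whole line, including "keep this", while lazy holds the two tags separately. That is why the lazy form is the one to use for deleting.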

                Good luck
                Terry

                • P
                  PeterJones @Terry R
                  last edited by Mar 10, 2021, 8:29 PM

                  @Terry-R ,

                  Correcting a mistake caused by the forum:
                  [image: 059900b6-acc9-4493-8d8a-896f02672f01-image.png]

                  The “img” tags regex pieces should really be as shown earlier:
                  [image: b8e98364-8eed-4536-92de-2360b0a332a0-image.png]

                  (“wonderful” square-bracket escape “feature” in forum.)

                  • T
                    Terry R
                    last edited by Mar 10, 2021, 9:15 PM

                    @PeterJones said in Search and remove items within tags.:

                    Correcting a mistake caused by the forum:

                    Thanks @PeterJones . I was nearly caught out once (and edited that post). Stupid me forgot a second time, what made it worse was my descriptor with the example mentioned the \ so that should have alerted me.

                    That nasty markdown engine. Why can’t it leave well alone!

                    Cheers
                    Terry

                    • J
                      John Thompson 0
                      last edited by Mar 11, 2021, 10:42 PM

                      OK, thanks everyone. I’m working right now so I don’t have time to reply to the new info yet, but I am going to read up.

                      I’ll respond to the others this evening but wanted to respond to @Terry-R first. I used this today on a full length list and it worked like a charm out of the box.

                      I’ll be running it again Saturday morning so I’ll follow up there once I’ve had a chance to read the others’ responses.

                        • J
                          John Thompson 0 @Terry R
                          last edited by Apr 26, 2021, 11:20 PM

                          Just wanted to let you know, I haven’t forgotten this post and it’s really become helpful. I know it stays in my cache of searches but I saved it in a snippets file I use anyway.

                          (?-s)\[img\].+?\[/img\]
                          

                          As for explaining it, if I look long enough I get it. I always get confused by having to use escape characters for things, and that’s mostly what gets confusing here. But the beginning is just calling Perl to look for a pattern, correct? And what’s inside the parentheses (is that what they’re called?) defines what to look for, using the escape characters so I can use characters that might otherwise be part of the language itself?

                          I know I said most of that wrong, but I do get it. It’s just that I can figure out what some of that stuff means, but writing it is a whole nother story.
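On the escape characters specifically, a small experiment shows what the backslashes buy you. This is a sketch in Python, not the Notepad++ engine, and the sample text is made up:

```python
import re

text = "[img]photo.jpg[/img]"

# Unescaped, [img] is a character class: it matches any ONE of the
# letters i, m or g, wherever they appear in the text.
as_class = re.findall(r"[img]", text)

# Escaped, \[img\] is the literal five-character text "[img]".
as_literal = re.findall(r"\[img\]", text)
```

Without the backslashes the search hits seven single letters scattered through the text (including the g in "jpg"). With them it hits the opening tag exactly once.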

                          • J
                            John Thompson 0
                            last edited by Apr 26, 2021, 11:25 PM

                            @Alan-Kilborn said in Search and remove items within tags.:

                            @John-Thompson-0

                            So what you’ve been given to solve your problem is one of the most basic and core substitutions of this type that you can do.
                            You can see the power of it now, I’m sure.
                            Do yourself a favor and acquaint yourself with other similar techniques buy having a read HERE.

                            I do understand your point. My old business partner and best friend for 20+ years was a programmer and I just could never get my head around it. Like, I can decipher a lot of what’s in the line but to sit down and write it myself? no way. But I suppose if I just started doing it I could. I mean isn’t that how most people learned html and css? I know it’s how I did. back 25 years ago.

                            It’s been hard to get it. I’ve tried taking jquery courses and basic JavaScripting and it just never ‘took hold’ and got me interested. I suppose it’s because I never really ‘had’ to learn that like I did markup etc.
