
How to replace only one period or dot and skip the others ?

9 Posts 4 Posters 372 Views
  • D
    dr ramaanand
    last edited by PeterJones Dec 1, 2024, 6:49 PM Dec 1, 2024, 6:34 PM

    Block of text for testing:-

    <html lang="en">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <META name="viewport" content="width=device-width, initial-scale=1">
    <title>Homeopathic medicine, Homeopathic remedies, CARCINOSIN, Carc</title>
    <META name="description" content="CARCINOSIN Bangalore, Carc">
    <META name="keywords" content="Homeopathic medicine Bangalore, Homeopathic remedies Bangalore, CARCINOSIN Bangalore, Carc">
    <META name="robots" content="index, follow" />
    <link rel="canonical" href="https://cure.com/CARCINOSIN.html">
    <META name="google-site-verification" content="B5jrpKjfHEj--_J-rT51c3CG8zg1sY_ZRQAbqQ1oN5Q">
    <link href="css/style.css" rel="stylesheet" type="text/css" media="all">
    <link href="https://www.cure4incurables.in/css/bootstrap.min.css" rel="stylesheet">
    <link href="https://www.cure4incurables.in/css/style1.css" rel="stylesheet">
    <link href="css/style.css" rel="stylesheet" type="text/css" media="all">
    <link rel="stylesheet" type="text/css" href="engine1/style.css" media="screen">
    <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css">
    </head>
    <script src="js/bootstrap.min.js"></script>
    <script src="js/backtotop.js"></script>cure.com
    <p>cure. com</p>
    

    I want to replace only the . in the cure.com and skip all the other periods (dots)
    I tried this Regular expression to no avail:-

    (<link[\S\s]*?<\/head>)(*SKIP)(*F)|(<[\S\s]*?>)(*SKIP)(*F)|(\.\s*\w)(*SKIP)(*F)|(\.)
    

    —

    moderator added code markdown around text; please don’t forget to use the </> button to mark example text as “code” so that characters don’t get changed by the forum

    • T
      Terry R @dr ramaanand
      last edited by Terry R Dec 1, 2024, 8:09 PM Dec 1, 2024, 6:44 PM

      @dr-ramaanand
      It’s a bit confusing, as you say you want to change ONLY 1 DOT character, yet in your small example block there are 5 instances of cure.. So is it only one of those, or all 5 of those? Also, the last cure. instance has a following space:
      <p>cure. com</p>
      space is just after the DOT character.

      Terry

      EDIT: Sorry, it was actually only 3 instances in the example block. I had Find counting in regex mode, not literal mode, so the DOT character matched any character.
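
The miscount described in the EDIT above can be reproduced outside Notepad++. This is only a sketch in Python's standard `re` module (not the Boost engine Notepad++ uses), and `sample` is a trimmed copy of the lines from the test block that contain "cure": an unescaped `.` also matches the `4` in `cure4incurables`, which accounts for the two extra hits.

```python
import re

# Trimmed copy of the lines from the example block that contain "cure".
sample = "\n".join([
    '<link rel="canonical" href="https://cure.com/CARCINOSIN.html">',
    '<link href="https://www.cure4incurables.in/css/bootstrap.min.css" rel="stylesheet">',
    '<link href="https://www.cure4incurables.in/css/style1.css" rel="stylesheet">',
    '<script src="js/backtotop.js"></script>cure.com',
    '<p>cure. com</p>',
])

# Regex mode: the unescaped dot matches ANY character, so "cure4" counts too.
regex_mode_hits = len(re.findall(r"cure.", sample))

# Literal mode: re.escape turns the dot into a literal "\.".
literal_hits = len(re.findall(re.escape("cure."), sample))

print(regex_mode_hits, literal_hits)
```

Escaping the dot (`cure\.`) in regex mode gives the same count as a literal search.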

      • P
        PeterJones @dr ramaanand
        last edited by PeterJones Dec 1, 2024, 7:00 PM Dec 1, 2024, 6:56 PM

        @dr-ramaanand ,

        Remember to format your example data. It’s not like you don’t know how, because you remembered to format the regex in the same post.

        But HTML and the URLs used in your example data don’t come through the way you expect, and all your quote marks get changed by the forum into smart quotes when you choose to ignore formatting.

        If you were new to the forum, it would just warrant a reminder. But for someone who has been using the forum as their personal regex writing service for two years, you should at least go to the effort of formatting your post (and looking at the PREVIEW in order to verify it’s been formatted), even if you’ve ignored all warnings that this forum is not a regex writing service.

        By you “forgetting” to format, you are just making it harder for members of the Community to help you. I’d think that you’d want to do everything in your power to make it easier for Terry and Guy to help you, considering how much they bend over backwards to give answers to your years of regex questions. But instead, you repay their kindness and grace with a lack of effort.

        ----

        Useful References

        • Please Read Before Posting
        • Template for Search/Replace Questions
        • Formatting Forum Posts
        • Notepad++ Online User Manual: Searching/Regex
        • FAQ: Where to find other regular expressions (regex) documentation

        ----

        Please note: This Community Forum is not a data transformation service; you should not expect to be able to always say “I have data like X and want it to look like Y” and have us do all the work for you. If you are new to the Forum, and new to regular expressions, we will often give help on the first one or two data-transformation questions, especially if they are well-asked and you show a willingness to learn; and we will point you to the documentation where you can learn how to do the data transformations for yourself in the future. But if you repeatedly ask us to do your work for you, you will find that the patience of usually-helpful Community members wears thin. The best way to learn regular expressions is by experimenting with them yourself, and getting a feel for how they work; having us spoon-feed you the answers without you putting in the effort doesn’t help you in the long term and is uninteresting and annoying for us.

        • D
          dr ramaanand @Terry R
          last edited by dr ramaanand Dec 1, 2024, 7:45 PM Dec 1, 2024, 7:43 PM

          @Terry-R I want to skip the dot in the cure. com as well as all the other dots, including those between the < and > but find/match the dot in the cure.com

          • T
            Terry R @dr ramaanand
            last edited by Dec 1, 2024, 8:32 PM

            @dr-ramaanand

            Still not sure of which cure.com you are looking for but…

            Yes, as @PeterJones said, you do seem to lean a lot on the regulars here to cook up a regex for you. Whilst you have provided some regexes you have tried with your various questions, it still seems you stop trying too quickly and just want help. That would suggest minimal effort on your part, so that you can feel good about asking for help.

            Since your request seemed to be looking for text in a specific area of the file, did you not think to look at the FAQ post here and try plugging in the data from your example? I did, and it quickly found the 1 entry I think you were looking for. You were directed to this exact same FAQ post 2 years ago by @PeterJones.

            (?-si:</[^>]+>|(?!\A)\G)(?s-i:(?!<).)*?\K(?-si:cure.[a-z])

            Terry

            • D
              dr ramaanand @Terry R
              last edited by Dec 1, 2024, 8:52 PM

              @Terry-R Thanks a lot. Can you also provide me a link to read about the (*SKIP) (*F) method when 2 or more strings need to be skipped? I could not find anything online.

              • T
                Terry R @dr ramaanand
                last edited by Terry R Dec 1, 2024, 9:00 PM Dec 1, 2024, 9:00 PM

                @dr-ramaanand

                Well then you haven’t looked. Again it seems that you stop looking/trying too early. @guy038 brought these backtracking controls to this forum, use this forum’s search function. However I’d suggest looking at other documentation and for that start with yet another of the FAQ posts here.
                Another is rexegg.com .

                Terry

                • G
                  guy038
                  last edited by guy038 Dec 2, 2024, 1:01 PM Dec 2, 2024, 10:10 AM

                  Hello, @dr-ramaanand, @terry-r, @peterjones and All,

                  @dr-ramaanand, you said :

                  I want to replace only the . in the cure.com and skip all the other periods (dots)

                  OK, I understand, but, by which character or string do you want to replace this literal dot ? Do you need to delete the .com string ? I’m just curious ?


                  Here is a variant of the @terry-r solution :

                  SEARCH (?-i:</[a-z]+>|(?!\A)\G)(?s:(?!<).)*?cure\K\.(?=[a-z]+)
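
For readers who want to try the "zone" idea outside Notepad++, here is a rough sketch in Python. Python's standard `re` module has neither `\G` nor `\K`, so this is an emulation rather than the regex above: it first isolates each zone (a lower-case closing tag plus the text up to the next `<`), then replaces the dot only inside that zone. The function name `replace_dot_in_zones`, the shortened `sample`, and the `-` replacement are assumptions made for illustration.

```python
import re

# A zone is a closing tag followed by the text up to (not including) the next "<".
ZONE = re.compile(r"(</[a-z]+>)([^<]*)")

# Inside a zone, target only the dot that follows "cure" and precedes lower-case letters.
DOT_AFTER_CURE = re.compile(r"(?<=cure)\.(?=[a-z]+)")


def replace_dot_in_zones(text: str, replacement: str) -> str:
    """Replace the dot of cure.xxx only in text that follows a closing tag."""
    def fix_zone(m: re.Match) -> str:
        # Keep the closing tag as-is and rewrite only the text after it.
        return m.group(1) + DOT_AFTER_CURE.sub(lambda _: replacement, m.group(2))
    return ZONE.sub(fix_zone, text)


# Shortened, hypothetical version of the test block.
sample = (
    '<link href="https://cure.com/x.html">\n'
    '</head>\n'
    '<script src="a.js"></script>cure.com\n'
    '<p>cure. com</p>'
)
result = replace_dot_in_zones(sample, "-")
print(result)
```

The dot inside the `href` and the dot in `cure. com` are left alone. Only the `cure.com` right after `</script>` changes.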

                  Anyway, for most of your questions, the generic regex, exposed in this post below, seems to be your best friend !

                  https://community.notepad-plus-plus.org/topic/22690/generic-regex-replacing-in-a-specific-zone-of-text


                  For a nice explanation of the (*SKIP)(*F) syntax, follow the link below :

                  https://www.rexegg.com/backtracking-control-verbs.php#skipfail

                  A generic regex, corresponding to its behavior, could be :

                  What_I_don't_want(*SKIP)(*FAIL)|What_I_want    or    What_I_don't_want(*SKIP)(*F)|What_I_want


                  I finally was able to find a regex, using these two backtracking control verbs, which seems adapted to your present problem :

                  SEARCH <[^<>]+>(*SKIP)(*F)|(?-i)\.(?=[a-z]+)

                  So, as you said :

                  • Anything between a < and a > character would be ignored

                  • Because of the look-ahead, which expects some lower-case letters right after the literal ., it also ignores the cure. com string !
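
The same behavior can be approximated in Python, whose standard `re` module does not support `(*SKIP)(*F)`. The sketch below swaps in the alternation-plus-capture-group trick: the tag branch matches first and is put back unchanged, and only the captured dot is replaced. The function name `replace_dots_outside_tags`, the shortened `sample`, and the `[DOT]` replacement are assumptions made for illustration.

```python
import re

# Left branch: any <...> tag (consumed, then restored unchanged).
# Right branch: a literal dot followed by lower-case letters (captured in group 1).
PATTERN = re.compile(r"<[^<>]+>|(\.)(?=[a-z]+)")


def replace_dots_outside_tags(text: str, replacement: str) -> str:
    """Replace dots followed by lower-case letters, except inside tags."""
    def swap(m: re.Match) -> str:
        # group(1) is set only when the dot branch matched.
        return replacement if m.group(1) is not None else m.group(0)
    return PATTERN.sub(swap, text)


# Shortened version of the test block.
sample = (
    '<link rel="canonical" href="https://cure.com/CARCINOSIN.html">\n'
    '<script src="js/backtotop.js"></script>cure.com\n'
    '<p>cure. com</p>'
)
result = replace_dots_outside_tags(sample, "[DOT]")
print(result)
```

Because the tag branch consumes the whole `<...>` range, the dots inside attributes are never offered to the dot branch, which mirrors what `(*SKIP)(*F)` achieves in the Notepad++ regex.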


                  Now, you may be interested in the general regex below, which matches, in an HTML document, any text that is not composed ONLY of blank characters, even when spread over several lines, and which lies within any >............< range, whatever it is !

                  SEARCH / MARK >\s+<(*SKIP)(*F)|(?<=>)[^<>]+(?=<)

                  Notes :

                  • Contrary to the previous regex, with that new regex, we’re searching for text between a > and the nearest < character

                  • If the zone of chars, within the >.......< range, contains ONLY space characters, this zone is simply ignored, because of the (*SKIP)(*F) syntax
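
A rough Python equivalent, again without `(*SKIP)(*F)`: collect every run of text between a `>` and the nearest `<`, then drop the runs that contain only whitespace. The function name `text_zones` and the shortened `sample` are assumptions made for illustration.

```python
import re

# Text between a ">" and the nearest "<"; [^<>] also matches line breaks.
BETWEEN_TAGS = re.compile(r"(?<=>)[^<>]+(?=<)")


def text_zones(html: str) -> list[str]:
    """Return the >...< text zones, ignoring those made only of whitespace."""
    return [zone for zone in BETWEEN_TAGS.findall(html) if not zone.isspace()]


# Shortened version of the test block, with a title spread over two lines.
sample = (
    '<title>Homeopathic medicine,\n'
    ' Carc</title>\n'
    '<p>cure. com</p>'
)
zones = text_zones(sample)
print(zones)
```

Removing the `isspace()` filter gives every non-empty zone, including the ones that hold only line breaks or spaces.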

                  Test that regex against your initial text, pasted in a new tab ( I slightly changed it, in order to add some line-breaks ! )

                  <html lang="en">
                  <head>
                  <meta http-equiv="Content-Type" content="text/html;
                   charset=utf-8">
                  <meta http-equiv="X-UA-Compatible" content="IE=edge">
                  <META name="viewport" content="width=device-width, initial-scale=1">
                  <title>Homeopathic medicine,
                   Homeopathic
                   remedies, CARCINOSIN,
                   Carc</title>
                  <META name="description" content="CARCINOSIN Bangalore, Carc">
                  <META name="keywords" content="Homeopathic medicine Bangalore, Homeopathic remedies Bangalore, CARCINOSIN Bangalore, Carc">
                  <META name="robots" content="index, follow" />
                  <link rel="canonical"
                   href="https://cure.com/CARCINOSIN.html">
                  <META name="google-site-verification" content="B5jrpKjfHEj--_J-rT51c3CG8zg1sY_ZRQAbqQ1oN5Q">
                  <link href="css/style.css" rel="stylesheet" type="text/css" media="all">
                  <link href="https://www.cure4incurables.in/css/bootstrap.min.css" rel="stylesheet">
                  <link href="https://www.cure4incurables.in/css/style1.css" rel="stylesheet">
                  <link href="css/style.css" rel="stylesheet"
                   type="text/css" media="all">
                  <link rel="stylesheet" type="text/css" href="engine1/style.css" media="screen">
                  <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css">
                  </head>
                  <script src="js/bootstrap.min.js"></script>
                  <script src="js/backtotop.js"></script>cure.com
                  <p>cure. c
                  om</p>
                  

                  Interesting, isn’t it ! You may try this regex against some of your HTML documents, too !

                  Of course, the more general regex below matches absolutely any non-empty range of chars, even multi-line ones, located between a > and the nearest < character of an HTML document

                  SEARCH (?<=>)[^<>]+(?=<)

                  Best Regards,

                  guy038

                  • D
                    dr ramaanand @guy038
                    last edited by Dec 2, 2024, 5:44 PM

                    @guy038 I was trying to skip finding/matching the dots and periods (full stops) that I understood were necessary and find the rest.
                    I finally found just two or three unnecessary ones out of a folder containing 300+files using this Regular expression:-

                    (<link[\S\s]*?<\/h1>)(*SKIP)(*F)|(<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">[\S\s]*?<div class="left">)(*SKIP)(*F)|(<!--[\S\s]*?\-->)(*SKIP)(*F)|(<[\S\s]*?>)(*SKIP)(*F)|(Note:[\S\s]*?<\/span>)(*SKIP)(*F)|(\.\s*\w)(*SKIP)(*F)|(<ul[^<>]*+>[\S\s]*?</ul>)(*SKIP)(*F)|(<style[^<>]*+>[\S\s]*?</style>)(*SKIP)(*F)|(for\s*any\s*questions\/\s*treatment\s*\.)(*SKIP)(*F)|(Efficacy\s*studies\s*\.)(*SKIP)(*F)|(All\s*rights\s*reserved\s*\.)(*SKIP)(*F)|(a\.m\.)(*SKIP)(*F)|(p\.m\.)(*SKIP)(*F)|(\.\s*\[)(*SKIP)(*F)|(\.\s*\()(*SKIP)(*F)|(\]\.)(*SKIP)(*F)|(etc\.)(*SKIP)(*F)|(C\.V\.S\.)(*SKIP)(*F)|(C\.V\.A\.)(*SKIP)(*F)|(C\.N\.S\.)(*SKIP)(*F)|(G\.I\.T\.)(*SKIP)(*F)|(\.\])(*SKIP)(*F)|(\.\))(*SKIP)(*F)|(B\.P\.)(*SKIP)(*F)|(\.\s*&lt;)(*SKIP)(*F)|(\.\s*&gt;)(*SKIP)(*F)|(\.\s*')(*SKIP)(*F)|(\.\s*&quot;\w)(*SKIP)(*F)|(\.,\s*\w)(*SKIP)(*F)|(\.\s*")(*SKIP)(*F)|(M\.D\.)(*SKIP)(*F)|(R\.S\.)(*SKIP)(*F)|(identity\.)(*SKIP)(*F)|\.
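
A long chain of skipped alternatives like this one can be easier to maintain as a list. The sketch below is one possible way to do that in Python, not the method used in this thread: since the standard `re` module lacks `(*SKIP)(*F)`, each skip pattern becomes a plain alternative and only the final, named group counts as a hit. The helper names `build_skip_regex` and `find_wanted`, the short `skip_patterns` list, and the sample sentence are all assumptions made for illustration.

```python
import re


def build_skip_regex(skip_patterns: list[str], wanted: str) -> re.Pattern:
    """Combine 'what I don't want' alternatives with one 'what I want' group."""
    skipped = "|".join(f"(?:{p})" for p in skip_patterns)
    return re.compile(f"(?:{skipped})|(?P<wanted>{wanted})")


def find_wanted(text: str, skip_patterns: list[str], wanted: str) -> list[int]:
    """Return the start offsets of the wanted matches that were not skipped."""
    rx = build_skip_regex(skip_patterns, wanted)
    return [m.start("wanted") for m in rx.finditer(text)
            if m.group("wanted") is not None]


# A few of the skip cases from the regex above, one per line.
skip_patterns = [
    r"<[^<>]+>",   # anything inside a tag
    r"a\.m\.",     # abbreviation
    r"etc\.",      # abbreviation
]

# Hypothetical sentence with one stray dot at the very end.
text = 'Open at 9 a.m. daily, etc. <b class="x.y">Call us</b> now .'
hits = find_wanted(text, skip_patterns, r"\.")
print(hits)
```

Adding a new exception then means adding one line to `skip_patterns` instead of editing a single very long expression.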
                    

                    Thanks a lot guys! Please keep helping people who ask questions here, I don’t know who else will. Thanks again!
