Ignore some text in regex



  • Hi
    I have this regex that searches for a specific number of characters:

    (?-s).{65,80}(?=\x20)
    

    and this replacement:

    $0[Char]
    

    to add a new character at the end of each match.
    It works well when I use it on a whole line, for example a normal line like this:
    its just a normal line that i want to use above regex on it
    But if I have an XML file and want to apply it only to the text between tags, what can I do?
    For example, I have something like this:
    <Cell ss:StyleID="sReadWrite"><Data ss:Type="String">its just a normal line that i want to use above regex on it</Data></Cell>
    How can I ignore all the XML tags and apply the regex only to the text I want?
    Sorry if I haven't explained what I want clearly, and thanks.



  • @0YouKnow0 ,

    It’s great that you have a starting regex that works for the simple use case; thanks for sharing it, as it will make moving you forward to the end goal much easier.

    Recently, our super regex guru @guy038 published a post that gives a “generic” regex for searching and replacing for a particular expression, between START and END tags.

    Your search expression would be the FR in the generic, and your replacement expression would be the RR. Depending on how specific you want your START and END to be, I would probably pick a BSR = > and ESR = </, so that it will look for anything that matches your FR that’s between a > at the end of a tag and a </ at the beginning of the end tag. The generic expression was designed to disallow having the BSR or ESR inside the matched text, so I believe that will do it for you. But if you only want it between certain tags, like only inside of <Data...>...</Data>, then you could make more detailed BSR and ESR.

    Actually, when I tried plugging those in as specified with the simpler BSR and ESR, I found that it didn’t quite match correctly:

    <Cell ss:StyleID="sReadWrite"><Data ss:Type="String">its just a normal line that i want to use above regex on it</Data></Cell>
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                  starts at <Data instead of at its
    

    You could limit that by not allowing your FR to match any > characters… so instead of (?-s).{65,80}, maybe [^>\r\n]{65,80}. No, that doesn’t solve it. :-(

    Wait, your expression only allows 65-80 characters before the space. But your text its just a normal line that i want to use above regex on it is only 59 characters long, so it’s not long enough to match. If I change the data contents to its just a normal line that i want to use above regex on it which has at least 65 characters in it, which is long enough, then your .-based FR still gets it wrong, but my modified character-class-based FR will match:

    <Cell ss:StyleID="sReadWrite"><Data ss:Type="String">its just a normal line that i want to use above regex on it which has at least 65 characters in it</Data></Cell>
                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
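
If you want to double-check those counts outside Notepad++, here is a minimal Python sketch. It assumes Python's built-in `re` module, which is not the Boost engine Notepad++ uses, and it runs a plain search with no BSR or ESR, so the dot-based match here begins at `<Cell` rather than at `<Data`. The point is only that `.` runs straight through the tags, while the character class cannot cross a `>`. All variable names are mine, not from the generic expression.

```python
import re

short_text = "its just a normal line that i want to use above regex on it"
long_text = short_text + " which has at least 65 characters in it"
prefix = '<Cell ss:StyleID="sReadWrite"><Data ss:Type="String">'
suffix = "</Data></Cell>"

print(len(short_text), len(long_text))  # 59 98

# Python's "." already excludes newlines, so the leading (?-s) is dropped.
dot_fr = re.compile(r".{65,80}(?=\x20)")
class_fr = re.compile(r"[^>\r\n]{65,80}(?=\x20)")

short_line = prefix + short_text + suffix
long_line = prefix + long_text + suffix

# The dot-based FR happily starts inside the tags.
dot_match = dot_fr.search(long_line)
print(dot_match.start())

# The class-based FR finds nothing in the 59-character version...
print(class_fr.search(short_line))

# ...and starts at the data text in the longer version.
class_match = class_fr.search(long_line)
print(class_match.start() == len(prefix), repr(class_match.group(0)))
```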
    

    Even if I change to BSR = <Data.*?> and ESR = </Data>, it will incorrectly match with your (?-s).{65,80}(?=\x20), whereas FR = [^>\r\n]{65,80}(?=\x20) will grab just the first 65-80 characters before a space.

    Sometimes, even with the generic expressions, you have to do more exploring to get it to work.

    Hopefully, you can plug in the modified FR, and your choice of appropriate BSR and ESR, into the generic expression, and see how it works. Using Find to step through the various matches will help you debug issues before you go ahead and Replace. But if you need more help, just ask.
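
If you ever need the same effect in a script rather than in Notepad++, a rough Python sketch follows. This is a different technique from the generic expression, not a translation of it: it makes two passes, first isolating the text between a `>` and the next `</`, then applying the FR only to that text. The names `TEXT_NODE` and `mark_long_runs` and the `|` marker (standing in for your `[Char]`) are made up for illustration, and it assumes the text contains no `<` or `>` of its own.

```python
import re

# FR from the question; Python's "." already excludes newlines.
FR = re.compile(r".{65,80}(?=\x20)")

# Hypothetical stand-in for BSR/ESR: text right after a ">" and right before "</".
TEXT_NODE = re.compile(r"(?<=>)[^<>\r\n]+(?=</)")


def mark_long_runs(xml_line, marker="|"):
    """Append `marker` after each FR match, but only inside text between tags."""
    def fix_node(node):
        # Second pass: the FR only ever sees the text, never the tags.
        return FR.sub(lambda m: m.group(0) + marker, node.group(0))

    # First pass: hand each piece of between-tags text to fix_node.
    return TEXT_NODE.sub(fix_node, xml_line)


line = ('<Cell ss:StyleID="sReadWrite"><Data ss:Type="String">'
        "its just a normal line that i want to use above regex on it "
        "which has at least 65 characters in it</Data></Cell>")
print(mark_long_runs(line))
```

Because the tags never reach the FR, the original dot-based expression is safe in this arrangement; a line whose text is shorter than 65 characters comes back unchanged.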

    ----

    Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the </> toolbar button or manual Markdown syntax. To make regex in red (and so they keep their special characters like *), use backticks, like `^.*?blah.*?\z`. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.

