Find-in-FIles: Can’t Replace Multiple Instances of Word
-
@PeterJones I love it when a plan comes together!
Will let you know how testing goes “in the wild” (i.e., on actual files).
-
@Sylvester-Bullitt said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
(?:lyrics-text|\G).+?(?<!^)(?<!<p>)(?<!<p class="chorus">)\Kstar(?=(.+?</div>))
…
all 3 occurrences were replaced in one fell swoop.
I was surprised it was 3 occurrences, because the first occurrence was before lyrics-text. I was reminded that \G can actually match the start of the text under certain circumstances. This isn’t 100% clear in the User Manual, but in the Boost Regex documentation that it links to, it says (emphasis mine):
The sequence \G matches only at the end of the last match found, or at the start of the text being matched if no previous match was found.
To prevent \G from matching the start, you need to make sure the first alternative consumes the \A:
FIND = (?s)(\A.*?lyrics-text|\G).+?(?<!^)(?<!<p>)(?<!<p class="chorus">)\Kstar(?=(.+?</div>))
With that, it only finds and replaces 2 in your “Twinkle Twinkle” file, instead of 3.
-
@PeterJones said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
This isn’t 100% clear in the User Manual,
I have tweaked the UM to include the phrase from the boost manual, to make it more clear. It is doubtful anything I write, especially regarding regular expressions, can be 100% clear. ;-)
-
@PeterJones Peter, I’m unfamiliar with the syntax (?:lyrics-text|\G). It resembles a lookbehind, but all the lookbehinds I’ve seen look like (?<=a) (i.e., no colon). What exactly is this thing? Where is it documented?
-
@PeterJones said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
To prevent \G from matching the start, you need to make sure the first alternative consumes the \A: FIND = (?s)(\A.*?lyrics-text|\G).+?(?<!^)(?<!<p>)(?<!<p class="chorus">)\Kstar(?=(.+?</div>))
With that, it only finds and replaces 2 in your “Twinkle Twinkle” file, instead of 3.
It might not matter in @Sylvester-Bullitt’s application, but it should be noted that if the file does not contain the text lyrics-text at all, the expression given will replace every occurrence of star.
-
@Sylvester-Bullitt said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
@PeterJones I love it when a plan comes together!
Will let you know how testing goes “in the wild” (i.e., on actual files).
Here’s the problem file:
<!DOCTYPE HTML> <html lang="en-us"> <head> <meta charset="utf-8"> <title>Mary Had a Little Lamb</title> <meta name="description" content="Words: Sarah Hale, 1830. Music: None."> <meta name="keywords" content="Sarah Hale"> <link rel="stylesheet" href="../../css/hymn.css"> <script src="../../js/jquery.js"></script> <script src="../../js/base.js"></script> <script src="../../js/hymn.js"></script> <link rel="prev" href="../../htm/h/e/w/o/hewonsav.htm"> <link rel="next" href="../../htm/h/e/s/a/hesallwo.htm"> <link rel="up" href="../../ttl/ttl-h.htm"> </head> <body> <section id="preface"> <h1 class="screen-reader-only">Introduction</h1> <div class="preface-text"> <p><span class="lead">Words:</span> <a href="../../bio/h/a/l/e/hale_sjb.htm">Sarah J. Hale</a>, 1830.</p> <p><span class="lead">Music:</span> John Doe (<a href="../../mid/d/u/m/m/dummy.mid" title="Listen to music, MIDI format">🔊</a> <a href="../../pdf/en/d/u/m/m/Dummy.pdf" title="Download score, PDF format">pdf</a> <a href="../../nwc/d/u/m/m/Dummy.nwc" title="Download score, Noteworthy Composer format">nwc</a>).</p> </div> </section> <p>This page is used to test global search-and-replace using regular expressions. 
</p> <section class="lyrics"> <div class="stanzas"><div class="lyrics-text mc ll"> <p>Mary had a little lambkin,<br> Its fleece was white as snow.<br> And everywhere that Mary went,<br> The lamb was sure to go.<br> He followed her to school one day,<br> That was against the rule.<br> It made the children laugh and play<br> To see a lamb at school.</p> <p>And so the teacher turned him out,<br> But still he lingered near,<br> And waited patiently about<br> Till Mary did appear.<br> And then he ran to her, and laid<br> His head upon her arm,<br> As if he said <q>I’m not afraid,<br> You’ll keep me from all harm.</q></p> <p><q>What makes the lamb love Mary so?</q><br> The eager children cry.<br> <q>‘Oh, Mary loves the lamb, you know,</q><br> The teacher did reply.<br> <q>And you each gentle animal<br> In confidence may bind,<br> And make them follow at your call,<br> If you are always kind.</q> </p> </div></div> </section> </body> </html>
When I searched for lamb, it found the expected instances in the lyrics section (class = “lyrics-text”), but surprisingly, it also found Lamb in the <title>. But the regex, as I originally wrote it, said it should only find matches after the string lyrics-text.
Did adding \G change the behavior? I think you mentioned that it might make subsequent searches start at the beginning of the file.
-
@Coises Ships passing in the night.
I just made a post saying that when I searched for the word lamb, it also found the word Lamb in the <title>, not just after lyrics-text. The search text is the same “Mary Had a Little Lamb” file posted above.
It sounds like your comment addresses that. Am I correct?
-
@Sylvester-Bullitt said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
Ships passing in the night.
More than that, you aren’t noticing all the posts because of the rapid posting.
I already explained exactly what happened with \G in this post, which contains a fix for the \G issue.
@Coises’s follow-on showed that if any of your files don’t have lyrics-text at all, then my fix-for-\G will replace all instances of star or lamb or what have you – but I’m hoping, for your sake, that all the files that your Find in Files filter will match will contain lyrics-text somewhere.
-
@Sylvester-Bullitt said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
(?:lyrics-text|\G)
(?:...) is a non-capturing subgroup.
-
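For readers who want to see the difference between a capturing group, a non-capturing group, and a lookbehind, here is a minimal sketch. It uses Python's `re` module rather than the Boost engine inside Notepad++ (an assumption made only for illustration), and it leaves out \G and \K, which Python's `re` does not support. The sample text and variable names are hypothetical.

```python
import re

text = "lyrics-text star"

# Capturing group: the parenthesised alternation is stored as group 1.
capturing = re.search(r"(lyrics-text|title) (star)", text)

# Non-capturing group: (?:...) groups the alternation without storing it,
# so "star" becomes group 1 instead of group 2.
non_capturing = re.search(r"(?:lyrics-text|title) (star)", text)

# Lookbehind, for contrast: (?<=...) only asserts what precedes the match;
# the asserted text is not part of the match itself.
lookbehind = re.search(r"(?<=lyrics-text )star", text)

print(capturing.groups())      # ('lyrics-text', 'star')
print(non_capturing.groups())  # ('star',)
print(non_capturing.group(0))  # lyrics-text star
print(lookbehind.group(0))     # star
```

The non-capturing group still consumes "lyrics-text" as part of the overall match, whereas the lookbehind does not; that is the practical difference between the two constructs.
-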
@PeterJones said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
@Sylvester-Bullitt said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
Ships passing in the night.
More than that, you aren’t noticing all the posts because of the rapid posting.
I already explained exactly what happened with \G in this post, which contains a fix for the \G issue.
@Coises’s follow-on showed that if any of your files don’t have lyrics-text at all, then my fix-for-\G will replace all instances of star or lamb or what have you – but I’m hoping, for your sake, that all the files that your Find in Files filter will match will contain lyrics-text somewhere.
Took me this long to get it (so I’ll post here rather than editing my earlier comment), but I think:
(?s)(\A.*?(lyrics-text|\Z(*COMMIT)(*FAIL))|\G).+?(?<!^)(?<!<p>)(?<!<p class="chorus">)\Kstar(?=(.+?</div>))
fixes that problem.
-
@PeterJones Three things:
- Yes, all the files (assuming they’re generated properly from my template) have the string lyrics-text.
- I just tried Coises’ suggested modification to the regex:
(?s)(\A.*?lyrics-text|\G).+?(?<!^)(?<!<p>)(?<!<p class="chorus">)\Klamb(?=(.+?</div>))
As advertised, it no longer matches the Lamb in the <title> tag, which is the desired behavior, since we’re only changing lyrics.
- What is the construct that resembles a lookbehind, but has the asterisk & question mark? That is,
(\A.*?lyrics-text|\G)
Still testing, but things are looking more and more promising!
-
@PeterJones “I see,” said the blind carpenter, as he picked up his hammer and saw!
-
Suggested reading:
Perl Regular Expression Syntax
Boost-Extended Format String Syntax
Notepad++ uses the Boost regular expression library. The above links are to the documentation for the current version; I believe Notepad++ is a couple minor versions behind, but there should be little or no practical difference.
-
@Sylvester-Bullitt said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
I just tried Coises’ suggested modification to the regex:
That was @PeterJones, not me. I was in the process of writing a post explaining why it couldn’t be done when he posted showing how to do it.
-
@Sylvester-Bullitt Good news!
Testing the new-and-improved regex against 2 files on disk worked perfectly!
It even worked when I had to undo a mistake with the replacement string, changing it to the one I really meant (I just changed the regex and clicked Replace All again).
So for now (fingers tightly crossed), it looks like we can declare victory! Does anyone have any more pearls of wisdom to add to this adventure?
Thank you so much for your help and patience.
By the way, if you’d like to see the Web site where this will be used, click here!
Cheers!
-
@Sylvester-Bullitt said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
- What is the construct that resembles a lookbehind, but has the asterisk & question mark?
That was answered here
-
@PeterJones I spoke too soon. Sigh.
I just ran this regex against live Web site files (fortunately, just Find All, not replacing anything yet):
(?s)(\A.*?lyrics-text|\G).+?(?<!^)(?<!<p>)(?<!<p class="chorus">)\KSavior(?=(.+?</div>))
This regex had ignored the <title> element in my earlier tests, but it did not ignore the title in the file text below (i.e., it matched the word Savior in the title). Can anyone see why?
<!DOCTYPE HTML> <html lang="en-us"> <head> <meta charset="utf-8"> <title>Alas! and Did My Savior Bleed?</title> <meta name="alt-title" content="At the Cross"> <meta name="description" content="Words: Isaac Watts, 1709. Music: Hugh Wilson, 1800."> <meta name="keywords" content="Isaac Watts,Hugh Wilson,Ralph Hudson"> <link rel="stylesheet" href="../../../../../css/hymn.css"> <script src="../../../../../js/jquery.js"></script> <script src="../../../../../js/languages.js"></script> <script src="../../../../../js/base.js"></script> <script src="../../../../../js/hymn.js"></script> <link rel="prev" href="../../../i/r/f/airfille.htm"> <link rel="next" href="../../../b/n/a/abnature.htm"> <link rel="up" href="../../../../../ttl/ttl-a.htm"> <link rel="alternate" href="../../../../../non/es/e/n/l/a/enlacruz.htm" hreflang="es"> <link rel="alternate" href="../../../../../non/ml/a/l/a/s/alas_and_did_my_savior_bleed_ml.htm" hreflang="ml"> <link rel="alternate" href="../../../../../non/ml/a/l/a/s/alas_and_did_my_savior_bleed_2_ml.htm" hreflang="ml"> </head> <body> <section> <h1 class="screen-reader-only">Scripture Verse</h1> <div class="css-marquee" role="marquee"> <p><q>There is one God and one mediator between God and men, the man Christ Jesus, who gave Himself as a ransom for all men.</q> 1 Timothy 2:5–6</p> </div> </section> <section id="preface"> <h1 class="screen-reader-only">Introduction</h1> <figure><img alt="portrait" src="../../../../../img/w/a/t/t/watts_i.jpg" width="200" height="300"><figcaption>Isaac Watts<br>1674–1748</figcaption></figure> <div class="preface-text"> <p><span class="lead">Words:</span> <a href="../../../../../bio/w/a/t/t/watts_i.htm">Isaac Watts</a>, <cite class="book verbose">Hymns and Spiritual Songs</cite> 1707–09<span class="verbose">, Book 2, number 9. <q>Godly sorrow arising from the sufferings of Christ.</q> <a href="../../../../../bio/h/u/d/s/hudson_re.htm">Ralph E. 
Hudson</a> wrote the refrain in 1885</span>.</p> <p><span class="lead">Music:</span> <span class="music verbose">Martyrdom</span> <a href="../../../../../bio/w/i/l/s/o/n/h/wilson_h.htm">Hugh Wilson</a>, 1800 (<a href="../../../../../mid/m/a/r/t/martyrdom.mid" title="Listen to music, MIDI format">🔊</a> <a href="../../../../../pdf/en/m/a/r/t/Martyrdom.pdf" title="Download score, PDF format">pdf</a> <a href="../../../../../nwc/m/a/r/t/Martyrdom.nwc" title="Download score, Noteworthy Composer format">nwc</a>)<span class="verbose"> (does not use the refrain)</span>.</p> <div class="alt-tune"> <p>Alternate Tunes:</p> <ul> <li><span>Abney (Hull)</span> <a href="../../../../../bio/h/u/l/l/hull_a.htm">Asa Hull</a> (1828–1907) (<a href="../../../../../mid/a/b/n/e/abney_hull.mid" title="Listen to music, MIDI format">🔊</a> <a href="../../../../../pdf/en/a/b/n/e/Abney(Hull).pdf" title="Download score, PDF format">pdf</a> <a href="../../../../../nwc/a/b/n/e/Abney(Hull).nwc" title="Download score, Noteworthy Composer format">nwc</a>)</li> <li><span>Hudson</span> <a href="../../../../../bio/h/u/d/s/hudson_re.htm">Ralph E. Hudson</a>, <cite class="book">Songs of Peace, Love and Joy</cite> (<span class="map" onclick="show('Alliance,OH')">Alliance</span> Ohio: 1885) (<a href="../../../../../mid/h/u/d/s/hudson.mid" title="Listen to music, MIDI format">🔊</a> <a href="../../../../../pdf/en/a/t/t/h/AtTheCross.pdf" title="Download score, PDF format">pdf</a> <a href="../../../../../nwc/a/t/t/h/AtTheCross.nwc" title="Download score, Noteworthy Composer format">nwc</a>) (uses refrain below). 
It is with this tune that the hymn is known as <span class="hymn-title">At the Cross.</span></li> <li><span>Liberty Hall</span> in <cite class="book">Wyeth’s Repository of Sacred Music</cite>, by <a href="../../../../../bio/w/y/e/t/wyeth_j.htm">John Wyeth</a>, 1810 (<a href="../../../../../mid/l/i/b/e/liberty_hall.mid" title="Listen to music, MIDI format">🔊</a> <a href="../../../../../pdf/en/l/i/b/e/LibertyHall.pdf" title="Download score, PDF format">pdf</a> <a href="../../../../../nwc/l/i/b/e/LibertyHall.nwc" title="Download score, Noteworthy Composer format">nwc</a>)</li> </ul></div></div> <figure><img alt="illustration" src="../../../../../img/c/r/u/c/Crucifixion,SimonVouet.jpg" height="300" width="200"><figcaption>Crucifixion<br>Simon Vouet<br>1590–1649</figcaption></figure> </section> <section> <h1 class="screen-reader-only">Background</h1> <blockquote class="verbose mc"> <p>[In] the autumn of 1850…revival meetings were being held in the Thirtieth Street Methodist Church. Some of us went down every evening; and, on two occasions, I sought peace at the atlar [sic], but did not find the joy I craved, until one evening, November 20, 1850, it seemed to me that the light must indeed come then or never; and so I arose and went to the altar alone. After a prayer was offered, they began to sing the grand old consecration hymn,</p> <p lang="en-gb"><q>Alas, and did my Saviour bleed, and did my Sovereign die?</q></p> <p>And when they reached the third line of the fourth stanza,</p> <p><q>Here Lord, I give myself away,</q></p> <p>My very soul was flooded with a celestial light. I sprang to my feet, shouting <q>hallelujah,</q> and then for the first time I realized that I had been trying to hold the world in one hand and the Lord in the other.</p> <p><a href="../../../../../bib/c/crosby.htm">Crosby</a>, p. 
24</p> </blockquote> </section> <section class="lyrics"> <div class="audio"><audio class="primary" controls loop><source src="../../../../../ogg/m/a/r/t/martyrdom.ogg" type="audio/ogg"></audio></div> <h1 class="screen-reader-only">Lyrics</h1> <div class="stanzas"><div class="lyrics-text mc ll"> <p>Alas! and did my Savior bleed<br> And did my Sovereign die?<br> Would He devote that sacred head<br> For such a worm as I?</p> <p class="chorus">Refrain</p> <p class="chorus">At the cross, at the cross where I first saw the light,<br> And the burden of my heart rolled away,<br> It was there by faith I received my sight,<br> And now I am happy all the day!</p> <p>Thy body slain, sweet Jesus, Thine,<br> And bathed in its own blood,<br> While all exposed to wrath divine,<br> The glorious Sufferer stood!</p> <p class="chorus">Refrain</p> <p>Was it for crimes that I had done<br> He groaned upon the tree?<br> Amazing pity! grace unknown!<br> And love beyond degree!</p> <p class="chorus">Refrain</p> <p>Well might the sun in darkness hide<br> And shut his glories in,<br> When Christ, the mighty Maker died,<br> For man the creature’s sin.</p> <p class="chorus">Refrain</p> <p>Thus might I hide my blushing face<br> While His dear cross appears,<br> Dissolve my heart in thankfulness,<br> And melt my eyes to tears.</p> <p class="chorus">Refrain</p> <p>But drops of grief can ne’er repay<br> The debt of love I owe:<br> Here, Lord, I give my self away<br> ’Tis all that I can do.</p> <p class="chorus">Refrain</p> </div></div> </section> </body> </html>
-
@Sylvester-Bullitt said in Find-in-FIles: Can’t Replace Multiple Instances of Word:
Does anyone have any more pearls of wisdom to add to this adventure?
Be aware that these expressions match parts of words; e.g., the “star” in “starlight” or “restart” will be matched. I’ll leave it as an exercise for you to study a bit and attempt to find a fix for that, if it is a problem.
No regular expression thread is finished until @guy038 drops in to tell us that there’s a better way to do it.
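As a hint for that exercise, the partial-word behavior and one common remedy (the \b word-boundary assertion, which Boost's Perl syntax also provides) can be sketched as follows. This uses Python's `re` module purely for illustration; the sample sentence is made up.

```python
import re

line = "The star in starlight will restart; a Star is not a star."

# Without boundaries, "star" also matches inside "starlight" and "restart".
loose = re.findall(r"star", line)

# \b asserts a word boundary on each side, so only the standalone word matches.
whole = re.findall(r"\bstar\b", line)

print(len(loose))  # 4
print(len(whole))  # 2
```

In the expressions from this thread, the equivalent change would presumably be wrapping the search word as \bstar\b, though that has not been tested here against the actual files.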
-
You’ve found another glitch:
If there are no matches following lyrics-text, the expression we’ve suggested will match from the beginning of the file.
All three matches are in the head section of the document. There are no matches after lyrics-text, because the word is hyphenated in the lyrics text.
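One way to sidestep both edge cases (no lyrics-text marker at all, or no matches after it) is to do the job in two steps outside Notepad++: first isolate the lyrics region, then replace only inside it. The following is a minimal sketch in Python, not the approach used in this thread. The function name `replace_in_lyrics` and the sample HTML are hypothetical, and the sketch omits the lookbehinds that skip the first word of a line or paragraph.

```python
import re

def replace_in_lyrics(html: str, word: str, replacement: str) -> str:
    """Replace whole-word `word` only between 'lyrics-text' and the next '</div>'."""
    # Step 1: locate the lyrics region; if the marker is missing, change nothing.
    region = re.search(r"lyrics-text.*?</div>", html, flags=re.DOTALL)
    if region is None:
        return html
    start, end = region.span()
    # Step 2: substitute only inside the region, on word boundaries.
    # A function is used as the replacement so the text is inserted literally.
    pattern = rf"\b{re.escape(word)}\b"
    fixed = re.sub(pattern, lambda m: replacement, html[start:end])
    return html[:start] + fixed + html[end:]

# Hypothetical, shortened stand-in for one of the hymn files.
sample = (
    "<title>Mary Had a Little Lamb</title>"
    '<div class="lyrics-text mc ll"><p>The lamb was sure to go.</p></div>'
    "<p>lamb after the lyrics</p>"
)
result = replace_in_lyrics(sample, "lamb", "sheep")
print(result)
```

Because the region is found before any replacement happens, text in the head section is never touched, whether or not the word occurs in the lyrics.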