Regex: Delete empty lines inside an html tag, after .dot
-
In the example below, I have many empty lines between the first sentence and the second. I want to use a regex to join the two sentences after the period (the .dot).
<meta name="description" content=" I go home. But I cannot go to work "/>
My regex is almost right: it deletes the empty lines, but it does not join the sentences.
FIND:
(?-si:<meta name="description" content="|(?!\A)\G)(?s-i:(?!"/>).)*?\K^\s+
REPLACE BY: (leave empty)
Can anyone help me with a better solution?
-
@Neculai-I-Fantanaru
If you want to remove all newlines (not just empty lines) within the content attributes, this should work:
(?-i)(?:meta name="description" content="|(?!\A)\G)(?:(?!"/>).)*?\K\R+
The \R metacharacter matches all newlines, including \r, \n, and \r\n.
My understanding (I could be wrong) is that newlines in general are disallowed inside XML (and by extension, HTML) attribute names.
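For readers who want to try the same idea outside Notepad++, here is a minimal Python sketch. Python's re module has no \K, \G, or \R, so this swaps in a different technique: it matches the whole attribute and strips newlines from the value in a callback. The names META, NEWLINES, and strip_newlines_in_description are made up for illustration, and the newline pattern is only an approximation of \R.

```python
import re

# Matches one description tag; group 2 is the attribute value.
# (Unlike the Notepad++ regex above, this requires the closing "/> to be present.)
META = re.compile(r'(<meta name="description" content=")(.*?)("/>)', re.DOTALL)

# Approximation of \R: CRLF, LF, or CR. Boost's \R also covers some rarer line breaks.
NEWLINES = re.compile(r'(?:\r\n|\n|\r)+')

def strip_newlines_in_description(html: str) -> str:
    """Remove every newline inside each description content value."""
    def fix(match: re.Match) -> str:
        return match.group(1) + NEWLINES.sub('', match.group(2)) + match.group(3)
    return META.sub(fix, html)

sample = (
    '<meta name="description" content=" I go home.\r\n\r\n\r\n'
    'But I cannot go to work "/>\r\n<p>kept\r\n</p>'
)
print(repr(strip_newlines_in_description(sample)))
```

Newlines outside the attribute (here, around the p element) are left alone, which is the same scoping the \G construct provides in the Notepad++ version.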
-
After some more thought, I came up with one that eliminates only empty lines and the last newline before the close quote, if that’s really what you want. Replace
(?s-i)(?:meta name="description" content="|(?!\A)\G)(?:(?!"/>).)*?\K\R(?=$|[^"\r\n]*?")
with nothing. This will convert
<meta name="description" content=" I go home. should stay on own line. but this will collapse. But I cannot go to work "/> <meta name="description" content=" foo "/> <meta name="description" content=" I go home. But I cannot go to work "/>
into
<meta name="description" content=" I go home. should stay on own line. but this will collapse.But I cannot go to work "/> <meta name="description" content=" foo"/> <meta name="description" content=" I go home.But I cannot go to work "/>
-
@Mark-Olson said in Regex: Delete empty lines inside an html tag, after .dot:
(?s-i)(?:meta name="description" content="|(?!\A)\G)(?:(?!"/>).)*?\K\R(?=$|[^"\r\n]*?")
Not quite, because your regex doesn’t put everything on the same line. After the replacement, it must look like this:
<meta name="description" content=" I go home. should stay on own line. but this will collapse.But I cannot go to work "/>
-
Read my thread here: https://community.notepad-plus-plus.org/topic/24369/regex-help-with-reverse-line/7
Thanks to PeterJones. I think the second part of that topic can help you.
Also, you can delete all blank lines with Edit - Line operations - Remove Empty Lines, then apply PeterJones' regex to put all the text on a single line.
-
I found a better solution and updated my regex:
FIND:
(?-si:<meta name="description" content="|(?!\A)\G)(?s-i:(?!"/>).)*?\K\s+\s+
REPLACE BY:
\x20
So, the generic form will be:
(?-si:FIRST-PART|(?!\A)\G)(?s-i:(?!SECOND-PART).)*?\KREGEX-REPLACE
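The generic form says: apply REGEX-REPLACE only between FIRST-PART and SECOND-PART. As a rough Python sketch of the same scoping (not the same mechanism, since Python's re lacks \G and \K), the hypothetical helper replace_between below finds each delimited region and runs the inner substitution on it in a callback. It assumes, for simplicity, that both delimiters are literal strings rather than regexes.

```python
import re

def replace_between(text: str, first: str, second: str, pattern: str, repl: str) -> str:
    """Apply pattern -> repl only to text found between `first` and `second`.

    `first` and `second` are treated as literal strings (an assumption of this
    sketch); `pattern` is an ordinary regular expression.
    """
    region = re.compile(
        '(' + re.escape(first) + ')(.*?)(' + re.escape(second) + ')',
        re.DOTALL,
    )
    inner = re.compile(pattern)

    def fix(match: re.Match) -> str:
        # Keep both delimiters untouched; rewrite only what lies between them.
        return match.group(1) + inner.sub(repl, match.group(2)) + match.group(3)

    return region.sub(fix, text)

sample = (
    '<meta name="description" content=" I go home.\n\n\n'
    'But I cannot go to work "/>\n\n<p>x</p>'
)
result = replace_between(
    sample,
    '<meta name="description" content="',
    '"/>',
    r'\s+\s+',  # the REGEX-REPLACE part from the post above
    ' ',        # same as \x20
)
print(result)
```

One difference worth knowing: this sketch skips a region whose closing delimiter is missing, whereas the \G-based regex would keep matching to the end of the file.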
-
@Neculai-I-Fantanaru
Yes, my regex did that. If you looked at my data, you would see that your initial example was part of it and my regex did that.
I’m glad you found a solution that works. However, I would note that the \s+\s+ in your regex should be replaced with \s+ because the second \s+ contributes nothing.
-
@Mark-Olson said in Regex: Delete empty lines inside an html tag, after .dot:
@Neculai-I-Fantanaru
Yes, my regex did that. If you looked at my data, you would see that your initial example was part of it and my regex did that.
I’m glad you found a solution that works. However, I would note that the \s+\s+ in your regex should be replaced with \s+ because the second \s+ contributes nothing.

Thanks, it helps a lot.
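For anyone choosing between the two patterns, a small check may help. This sketch uses Python's re module rather than Notepad++'s Boost engine, on the assumption that \s and + behave the same way in both for this purpose. The helper name collapse is made up for illustration. On a run of two or more whitespace characters the patterns give the same result; they differ only on a single whitespace character, which \s+\s+ cannot match.

```python
import re

def collapse(pattern: str, text: str) -> str:
    """Replace every match of `pattern` in `text` with one space."""
    return re.sub(pattern, ' ', text)

run = 'home.\n\n\nBut'   # three newlines in a row
single = 'home.\nBut'    # exactly one newline

print(repr(collapse(r'\s+\s+', run)))     # 'home. But'
print(repr(collapse(r'\s+', run)))        # 'home. But'
print(repr(collapse(r'\s+\s+', single)))  # 'home.\nBut' (needs at least two characters)
print(repr(collapse(r'\s+', single)))     # 'home. But'
```

So with a single-space replacement, \s+ is the simpler choice and also joins sentences separated by just one line break.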