<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Line break before every UPPERCASE word]]></title><description><![CDATA[<p dir="auto">Hey!</p>
<p dir="auto">I have text files of scanned tables that were OCRed into a single line. The original table had essentially three columns: an UPPERCASE surname, a number (rating), and, after a dividing dash, a couple of sentences of text (I don’t even need that).<br />
Something like this:</p>
<pre><code>MÜLLER 6 - Blahblah. SMITH 5 - Asdds. Asdsd. DI CARLO 8,5 - And. Maybe even. Multiple. Sentences here.
</code></pre>
<p dir="auto">to</p>
<pre><code>MÜLLER 6 - Blahblah. 
SMITH 5 - Asdds. Asdsd. 
DI CARLO 8,5 - And. Maybe even. Multiple. Sentences here.
</code></pre>
<p dir="auto">Can you help me out with an expression to break the lines before every completely UPPERCASE word, but not at every Sentence?<br />
Also, is there an elegant way to replace the space between the name and the number without affecting the spaces in multipart names?</p>
<p dir="auto">Thank you!</p>
]]></description><link>https://community.notepad-plus-plus.org/topic/21792/line-break-before-every-uppercase-word</link><generator>RSS for Node</generator><lastBuildDate>Thu, 16 Apr 2026 07:13:38 GMT</lastBuildDate><atom:link href="https://community.notepad-plus-plus.org/topic/21792.rss" rel="self" type="application/rss+xml"/><pubDate>Tue, 07 Sep 2021 13:48:22 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Line break before every UPPERCASE word on Wed, 08 Sep 2021 19:56:31 GMT]]></title><description><![CDATA[<p dir="auto">Hello <a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/23057">@floyddebarber</a>, <a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/3841">@peterjones</a> and <strong>All</strong>,</p>
<p dir="auto">An <strong>alternative</strong> solution would be :</p>
<p dir="auto">SEARCH <strong><code>(?-i)(?&lt;=\.)\h*(?=\u\u)</code></strong></p>
<p dir="auto">REPLACE <strong><code>\r\n</code></strong></p>
<p dir="auto">So, for instance, from this <em>INPUT</em> text :</p>
<pre><code class="language-diff">MÜLLER 6 - Blahblah.         SMITH 5 - Asdds. Asdsd.DI CARLO 8,5 - And. Maybe even. Multiple. Sentences here.
</code></pre>
<p dir="auto">you would get the <em>OUTPUT</em> text :</p>
<pre><code class="language-diff">MÜLLER 6 - Blahblah.
SMITH 5 - Asdds. Asdsd.
DI CARLO 8,5 - And. Maybe even. Multiple. Sentences here.
</code></pre>
<hr />
<p dir="auto"><strong>Notes</strong> :</p>
<ul>
<li>
<p dir="auto">This regex searches for a run of <strong>horizontal blank</strong> chars ( <strong><code>\x20</code></strong>, <strong><code>\x09</code></strong> or <strong><code>\x85</code></strong> ), possibly <strong>empty</strong>, but <em>ONLY IF</em> :</p>
<ul>
<li>
<p dir="auto">It is <strong>preceded</strong> by a literal <strong>period</strong> (full stop), due to the positive <strong>look-behind</strong> <strong><code>(?&lt;=\.)</code></strong></p>
</li>
<li>
<p dir="auto">It is <strong>followed</strong> by <strong>two upper-case</strong> letters, accented or not, due to the positive <strong>look-ahead</strong> <strong><code>(?=\u\u)</code></strong></p>
</li>
</ul>
</li>
<li>
<p dir="auto">In the <strong>replacement</strong>, this range is simply replaced with a <strong>Windows</strong> line-break ( <strong><code>\r\n</code></strong> ) ( Use <strong><code>\n</code></strong> only if working on <strong>Unix</strong> files )</p>
</li>
</ul>
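<p dir="auto">For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of the look-behind / look-ahead approach described in the notes above. It is not a drop-in copy of the Notepad++ regex: Python's <code>re</code> module has no <code>\h</code> or <code>\u</code> classes, so the sketch assumes that "space or tab" and "ASCII plus Latin-1 upper-case letters" are close enough for this kind of data. The names <code>UPPER</code>, <code>PATTERN</code> and <code>split_rows</code> are made up for the illustration, and the sketch inserts a plain <code>\n</code> rather than <code>\r\n</code>.</p>

```python
import re

# Assumption: ASCII A-Z plus the Latin-1 upper-case letters stand in for
# the \u class used in Notepad++ (Python's re module has no \u class).
UPPER = "A-ZÀ-ÖØ-Þ"

# A possibly empty run of spaces/tabs that is preceded by a period
# and followed by two upper-case letters.
PATTERN = re.compile(rf"(?<=\.)[ \t]*(?=[{UPPER}]{{2}})")


def split_rows(text: str) -> str:
    """Replace each matching run of blanks with a line break."""
    return PATTERN.sub("\n", text)


if __name__ == "__main__":
    sample = (
        "MÜLLER 6 - Blahblah.         SMITH 5 - Asdds. Asdsd."
        "DI CARLO 8,5 - And. Maybe even. Multiple. Sentences here."
    )
    print(split_rows(sample))
```

<p dir="auto">Because the run of blanks may be empty, the break is also inserted where a name follows the period with no space at all, as in <code>Asdsd.DI CARLO</code>.</p>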
<p dir="auto">Best Regards,</p>
<p dir="auto">guy038</p>
]]></description><link>https://community.notepad-plus-plus.org/post/69576</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/69576</guid><dc:creator><![CDATA[guy038]]></dc:creator><pubDate>Wed, 08 Sep 2021 19:56:31 GMT</pubDate></item><item><title><![CDATA[Reply to Line break before every UPPERCASE word on Tue, 07 Sep 2021 15:39:15 GMT]]></title><description><![CDATA[<p dir="auto">Wow, many thanks for the fast and detailed reply!</p>
]]></description><link>https://community.notepad-plus-plus.org/post/69535</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/69535</guid><dc:creator><![CDATA[floyddebarber]]></dc:creator><pubDate>Tue, 07 Sep 2021 15:39:15 GMT</pubDate></item><item><title><![CDATA[Reply to Line break before every UPPERCASE word on Tue, 07 Sep 2021 14:05:28 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/23057">@floyddebarber</a> said in <a href="/post/69530">Line break before every UPPERCASE word</a>:</p>
<blockquote>
<p dir="auto">MÜLLER 6 - Blahblah. SMITH 5 - Asdds. Asdsd. DI CARLO 8,5 - And. Maybe even. Multiple. Sentences here.</p>
</blockquote>
<p dir="auto">FIND = <code>(?-i)\h+(\b\u{2}[\u\x20]+)</code><br />
REPLACE = <code>\r\n$1</code><br />
SEARCH MODE = regular expression</p>
<p dir="auto">important concepts:</p>
<ul>
<li><code>\h</code> and <code>\u</code> and <code>[...]</code> = character classes: <a href="https://npp-user-manual.org/docs/searching/#character-classes" rel="nofollow ugc">https://npp-user-manual.org/docs/searching/#character-classes</a></li>
<li><code>+</code> and <code>{2}</code> = multiplying operators: <a href="https://npp-user-manual.org/docs/searching/#multiplying-operators" rel="nofollow ugc">https://npp-user-manual.org/docs/searching/#multiplying-operators</a></li>
<li><code>\b</code> = anchors: <a href="https://npp-user-manual.org/docs/searching/#anchors" rel="nofollow ugc">https://npp-user-manual.org/docs/searching/#anchors</a></li>
<li><code>(?-i)</code> = search modifiers: <a href="https://npp-user-manual.org/docs/searching/#search-modifiers" rel="nofollow ugc">https://npp-user-manual.org/docs/searching/#search-modifiers</a></li>
<li><code>(...)</code> = capture groups: <a href="https://npp-user-manual.org/docs/searching/#capture-groups-and-backreferences" rel="nofollow ugc">https://npp-user-manual.org/docs/searching/#capture-groups-and-backreferences</a></li>
<li><code>\r\n</code> = control characters: <a href="https://npp-user-manual.org/docs/searching/#control-characters" rel="nofollow ugc">https://npp-user-manual.org/docs/searching/#control-characters</a></li>
<li><code>$1</code> = substitution escape sequences: <a href="https://npp-user-manual.org/docs/searching/#substitution-escape-sequences" rel="nofollow ugc">https://npp-user-manual.org/docs/searching/#substitution-escape-sequences</a></li>
</ul>
<p dir="auto">edit: the boundary <code>\b</code> isn’t necessary; I had that in there from an early version, but I had added the <code>\h+</code> before to prevent <code>MÜLLER</code> from getting an extra CRLF before it, so the boundary was no longer needed.</p>
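<p dir="auto">As a rough cross-check outside Notepad++, the capture-group approach above can be sketched in Python. This is an approximation under stated assumptions, not the same regex: Python's <code>re</code> module lacks <code>\h</code> and <code>\u</code>, so the sketch substitutes "space or tab" and a hand-written class of ASCII plus Latin-1 upper-case letters. The names <code>UPPER</code>, <code>PATTERN</code> and <code>break_before_names</code> are hypothetical, and the line break written is <code>\n</code> rather than <code>\r\n</code>.</p>

```python
import re

# Assumption: ASCII A-Z plus the Latin-1 upper-case letters approximate
# the \u class used in Notepad++ (Python's re module has no \u class).
UPPER = "A-ZÀ-ÖØ-Þ"

# One or more blanks, then a captured run that starts with two upper-case
# letters and continues with upper-case letters or spaces (multipart names).
PATTERN = re.compile(rf"[ \t]+([{UPPER}]{{2}}[{UPPER} ]+)")


def break_before_names(text: str) -> str:
    """Swap the blanks before each upper-case name for a line break."""
    # \1 re-inserts the captured name after the new line break.
    return PATTERN.sub(r"\n\1", text)


if __name__ == "__main__":
    sample = (
        "MÜLLER 6 - Blahblah. SMITH 5 - Asdds. Asdsd. "
        "DI CARLO 8,5 - And. Maybe even. Multiple. Sentences here."
    )
    print(break_before_names(sample))
```

<p dir="auto">Since the pattern requires at least one blank before the name, the first name on the line is left alone, and a name glued directly to the preceding period would not be split at the right place.</p>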
]]></description><link>https://community.notepad-plus-plus.org/post/69531</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/69531</guid><dc:creator><![CDATA[PeterJones]]></dc:creator><pubDate>Tue, 07 Sep 2021 14:05:28 GMT</pubDate></item></channel></rss>