<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Delete all rows of a text file except company names]]></title><description><![CDATA[<p dir="auto">This is a text file in HTML containing company information. I want to delete everything except the company names.<br />
Every row containing the company names has this code before the name. &lt;font color=‘#595f75’&gt;&lt;strong&gt;<br />
And this code after the company name. &lt;/strong&gt;</p>
<p dir="auto">I tried a regex find and replace that supposedly would do this, but it didn’t work, so I’m asking here for suggestions.</p>
<p dir="auto">Here is an example of one company’s HTML listing. There could be as many as 1000 companies in each list, so automating this would be a big help.</p>
<p dir="auto">&lt;table align=‘left’ cellspacing=‘0’ cellpadding=‘3’ width=‘500’&gt;&lt;tr&gt;&lt;td align=‘left’ width=‘60%’ valign=‘top’&gt;<br />
&lt;font color=‘#595f75’&gt;&lt;strong&gt;A &amp; L INDUSTRIAL SERVICES&lt;/strong&gt;&lt;br&gt;Misty Martinez&lt;br&gt;&lt;/font&gt;<br />
&lt;font color=‘#595f75’&gt;2910 East P Street&lt;br&gt;Deer Park, TX  77536&lt;/font&gt;   &lt;a href=‘<a href="http://maps.google.com/?q=2910+East+P+Street%2C+Deer+Park%2C+TX+77536" rel="nofollow ugc">http://maps.google.com/?q=2910+East+P+Street%2C+Deer+Park%2C+TX+77536</a>’ target=‘_blank’ style=“color: ##228dc1;”&gt;Map&lt;/a&gt;&lt;/td&gt;<br />
&lt;td width=‘40%’ align=‘right’ valign=‘top’&gt;281 470-9805&lt;br&gt;Fax: 281 470-9899&lt;br&gt;&lt;a href=‘<a href="http://www.anlindustrial.com" rel="nofollow ugc">http://www.anlindustrial.com</a>’ target=‘_blank’&gt;&lt;font color=‘#228dc1’&gt;<a href="http://www.anlindustrial.com" rel="nofollow ugc">www.anlindustrial.com</a>&lt;/font&gt;&lt;/a&gt;&lt;br&gt;&lt;a href=‘<a href="mailto:misty.martinez@anlindustrial.com" rel="nofollow ugc">mailto:misty.martinez@anlindustrial.com</a>’&gt;&lt;font color=‘#228dc1’&gt;Email&lt;/font&gt;&lt;/a&gt;&lt;/td&gt;<br />
&lt;/tr&gt;&lt;tr&gt;&lt;td align=‘left’ colspan=3&gt;&lt;span style=‘font-style: italic; font-weight: bold;’&gt;<br />
&lt;/span&gt;&lt;/td&gt;&lt;/tr&gt;<br />
&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;</p>
<p dir="auto">Thanks for any suggestions. It’s probably simple for you but not for me.</p>
]]></description><link>https://community.notepad-plus-plus.org/topic/16888/delete-all-rows-of-a-text-file-except-company-names</link><generator>RSS for Node</generator><lastBuildDate>Thu, 21 May 2026 02:01:20 GMT</lastBuildDate><atom:link href="https://community.notepad-plus-plus.org/topic/16888.rss" rel="self" type="application/rss+xml"/><pubDate>Mon, 07 Jan 2019 00:39:30 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Delete all rows of a text file except company names on Mon, 07 Jan 2019 10:42:31 GMT]]></title><description><![CDATA[<p dir="auto">Thanks to everyone who helped with this question. Each answer contributed to the solution. Special thanks to guy038 who gave me a better understanding of how the code works and his solution worked perfectly.</p>
<p dir="auto">Ray Fellers</p>
]]></description><link>https://community.notepad-plus-plus.org/post/38117</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/38117</guid><dc:creator><![CDATA[Raymond Lee Fellers]]></dc:creator><pubDate>Mon, 07 Jan 2019 10:42:31 GMT</pubDate></item><item><title><![CDATA[Reply to Delete all rows of a text file except company names on Mon, 07 Jan 2019 10:21:35 GMT]]></title><description><![CDATA[<p dir="auto">Hello, <a class="plugin-mentions-user plugin-mentions-a" href="/user/raymond-lee-fellers" aria-label="Profile: raymond-lee-fellers">@<bdi>raymond-lee-fellers</bdi></a>, <a class="plugin-mentions-user plugin-mentions-a" href="/user/terry-r" aria-label="Profile: terry-r">@<bdi>terry-r</bdi></a> and <strong>All</strong>,</p>
<p dir="auto">So, <strong>Raymond</strong>, you would like to delete <strong>everything</strong> except the <strong>company names</strong> which are located :</p>
<ul>
<li>
<p dir="auto"><strong>After</strong> the string <strong><code>&lt;font color='#595f75'&gt;&lt;strong&gt;</code></strong></p>
</li>
<li>
<p dir="auto"><strong>Before</strong> the string <strong><code>&lt;/strong&gt;</code></strong></p>
</li>
</ul>
<p dir="auto">No problem at all with <strong>regular</strong> expressions ;-))</p>
<hr />
<ul>
<li>
<p dir="auto">Copy / Paste your <strong>html</strong> file in a <strong>new</strong> Notepad++ tab</p>
</li>
<li>
<p dir="auto">Open the <strong>Replace</strong> dialog ( <strong><code>Ctrl + H</code></strong> )</p>
</li>
<li>
<p dir="auto">SEARCH <strong><code>(?s).+?&lt;font color='#595f75'&gt;&lt;strong&gt;((?-s).+?)&lt;/strong&gt;|.+</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>\1\r\n</code></strong>  ( or <strong><code>\1\n</code></strong> if you work with <strong>UNIX</strong> files )</p>
</li>
<li>
<p dir="auto">Tick the <strong><code>Wrap around</code></strong> option</p>
</li>
<li>
<p dir="auto">Select the <strong><code>Regular expression</code></strong> search mode</p>
</li>
<li>
<p dir="auto">Click on the <strong><code>Replace All</code></strong> button</p>
</li>
</ul>
<p dir="auto">Et voilà !</p>
<hr />
<p dir="auto"><strong>Notes</strong> :</p>
<ul>
<li>
<p dir="auto">First, the global <strong>modifier</strong> <strong><code>(?s)</code></strong> means that, by <strong>default</strong>,  the <strong>dot</strong> character will match <strong>any single</strong> char ( <strong>standard</strong> or <strong>EOL</strong> one )</p>
</li>
<li>
<p dir="auto">Then the part <strong><code>.+?&lt;font color='#595f75'&gt;&lt;strong&gt;</code></strong> looks, from <strong>cursor</strong> position, for the <strong>smallest</strong> range, even on <strong>multi</strong>-lines, of <strong>any</strong> char till the <strong>literal</strong> string <strong><code>&lt;font color='#595f75'&gt;&lt;strong&gt;</code></strong></p>
</li>
<li>
<p dir="auto">Now, the part <strong><code>((?-s).+?)&lt;/strong&gt;</code></strong> tries to match the <strong>smallest</strong> range of <strong>standard</strong> characters, in a <strong>single</strong> line due to the <strong><code>(?-s)</code></strong> <strong>modifier</strong>, till the <strong>literal</strong> string <strong><code>&lt;/strong&gt;</code></strong>. That range is stored as <strong>group <code>1</code></strong>, because of the <strong>parentheses</strong></p>
</li>
<li>
<p dir="auto">If <strong>no</strong> more range <strong><code>&lt;font color='#595f75'&gt;&lt;strong&gt;............&lt;/strong&gt;</code></strong> <strong>can</strong> be found, the regex tries the <strong>second</strong> alternative, after the <strong><code>|</code></strong> symbol ( <strong><code>.+</code></strong> ) which catches <strong>all</strong> the <strong>remaining</strong> chars till the <strong>very end</strong> of the file</p>
</li>
<li>
<p dir="auto">In <strong>replacement</strong>, any <strong>company name</strong> is rewritten, <strong><code>\1</code></strong>, followed with the <strong>EOL</strong> chars <strong><code>\r\n</code></strong> and <strong>remaining</strong> chars at <strong>end</strong> of the file are simply replaced with a single <strong>line-break</strong> as, in that <strong>second</strong> alternative, the <strong>group <code>1</code></strong> is <strong>not</strong> defined !</p>
</li>
</ul>
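<p dir="auto">As a rough cross-check outside Notepad++, the same search and replace can be sketched in Python. This is only an illustration under stated assumptions: the function name <code>extract_company_names</code> and the sample text are hypothetical, the markup is assumed to use straight single quotes, and because Python's <code>re</code> module does not accept a standalone <code>(?-s)</code> in mid-pattern, the scoped form <code>(?-s:...)</code> stands in for it.</p>

```python
import re

# Python port of the Notepad++ regex. The scoped group (?-s:...) replaces the
# standalone (?-s) modifier, which Python's re module rejects.
PATTERN = re.compile(
    r"(?s).+?<font color='#595f75'><strong>((?-s:.+?))</strong>|.+"
)


def extract_company_names(html: str) -> str:
    """Keep only the company names, one per line (hypothetical helper)."""
    # Group 1 is empty when the second alternative (.+) matches, so the
    # trailing markup collapses to a single line break.
    return PATTERN.sub(r"\1\n", html)


# Made-up sample, assuming straight quotes in the real file.
sample = (
    "<table>\n"
    "<font color='#595f75'><strong>A & L INDUSTRIAL SERVICES</strong><br>Misty Martinez<br></font>\n"
    "<font color='#595f75'>2910 East P Street</font>\n"
    "</table>\n"
    "<table>\n"
    "<font color='#595f75'><strong>EXAMPLE COMPANY</strong><br>\n"
    "</table>\n"
)

print(extract_company_names(sample), end="")
```

<p dir="auto">On this sample the sketch prints the two company names, each on its own line, followed by one empty line that comes from the trailing markup being replaced with a single line break.</p>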
<p dir="auto"><strong>Remark</strong> :</p>
<p dir="auto">If you do <strong>not</strong> tick the <strong><code>Wrap around</code></strong> option, in order to run the <strong>regex</strong> S/R from <strong>current</strong> location till the <strong>end</strong> of file, only, be sure that <strong>cursor</strong> is at the <strong>very beginning</strong> of the <strong>current</strong> line, <strong>before</strong> replacement !</p>
<p dir="auto">Best Regards,</p>
<p dir="auto">guy038</p>
]]></description><link>https://community.notepad-plus-plus.org/post/38106</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/38106</guid><dc:creator><![CDATA[guy038]]></dc:creator><pubDate>Mon, 07 Jan 2019 10:21:35 GMT</pubDate></item><item><title><![CDATA[Reply to Delete all rows of a text file except company names on Mon, 07 Jan 2019 01:56:37 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/raymond-lee-fellers" aria-label="Profile: Raymond-Lee-Fellers">@<bdi>Raymond-Lee-Fellers</bdi></a><br />
Sorry, slight mistake in previous post, I meant to say you could use the “cut bookmarked lines” and then paste in another tab in NPP. However the easiest option is to use “remove unmarked lines”, which will leave the lines you DO want.</p>
<p dir="auto">Terry</p>
]]></description><link>https://community.notepad-plus-plus.org/post/38105</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/38105</guid><dc:creator><![CDATA[Terry R]]></dc:creator><pubDate>Mon, 07 Jan 2019 01:56:37 GMT</pubDate></item><item><title><![CDATA[Reply to Delete all rows of a text file except company names on Mon, 07 Jan 2019 01:46:20 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/raymond-lee-fellers" aria-label="Profile: Raymond-Lee-Fellers">@<bdi>Raymond-Lee-Fellers</bdi></a></p>
<p dir="auto">Actually there is another way to delete the lines you don’t want. I’ll explain as it seems you may have some regex knowledge already.<br />
Under the Search menu there is a “mark” option. Take text that you know exists on the company lines (this MUST NOT occur on any lines you want to delete, only the ones to remain) and enter it into the Mark “find what” field. Click on the bookmark line option and then click on “mark all”. So this has now marked all the lines you want to keep. From here you can use the Search menu, Bookmark (near bottom) and select either “remove bookmarked lines” or “remove unmarked lines”. If the first option, then open another tab in NPP and paste them there.</p>
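<p dir="auto">For comparison, the effect of “mark all” with the bookmark option followed by “remove unmarked lines” can be sketched in Python. This is a minimal illustration, not what Notepad++ does internally: the helper name <code>keep_marked_lines</code> and the sample text are hypothetical, and the markup is assumed to use straight single quotes.</p>

```python
def keep_marked_lines(text: str, marker: str) -> str:
    """Keep only the lines that contain marker (hypothetical helper)."""
    # Equivalent in spirit to bookmarking every line that contains the
    # marker text and then removing all unmarked lines.
    kept = [line for line in text.splitlines() if marker in line]
    return "\n".join(kept)


# Made-up sample, assuming straight quotes in the real file.
sample = (
    "<table>\n"
    "<font color='#595f75'><strong>A & L INDUSTRIAL SERVICES</strong><br>Misty Martinez<br></font>\n"
    "<font color='#595f75'>2910 East P Street</font>\n"
    "</table>"
)

print(keep_marked_lines(sample, "<font color='#595f75'><strong>"))
```

<p dir="auto">Note that the kept lines still carry the surrounding tags and the contact name, so a further replace would be needed to leave only the company name.</p>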
<p dir="auto">I hope that helps.</p>
<p dir="auto">Terry</p>
]]></description><link>https://community.notepad-plus-plus.org/post/38104</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/38104</guid><dc:creator><![CDATA[Terry R]]></dc:creator><pubDate>Mon, 07 Jan 2019 01:46:20 GMT</pubDate></item><item><title><![CDATA[Reply to Delete all rows of a text file except company names on Mon, 07 Jan 2019 01:36:38 GMT]]></title><description><![CDATA[<p dir="auto">Firstly, I can see from your example that at least one line has probably wrapped and now appears as two or more lines. Accurate examples are very important: when we create a regex, we need to know how each line REALLY appears.</p>
<p dir="auto">Can I therefore suggest you read the FAQ, specifically the posting called<br />
“Request for Help without sufficient information to help you”.<br />
In there is how to represent the data (example) so that the markdown interpreter (which runs these posts) does NOT interfere with the formatting.</p>
<p dir="auto">Terry</p>
]]></description><link>https://community.notepad-plus-plus.org/post/38103</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/38103</guid><dc:creator><![CDATA[Terry R]]></dc:creator><pubDate>Mon, 07 Jan 2019 01:36:38 GMT</pubDate></item></channel></rss>