<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Replacing in specific columns (More difficult than you&#x27;d think)]]></title><description><![CDATA[<p dir="auto">I have some Twitter data in the form of TXT files that I’m using for academic purposes. For example, here are some tweets by Senator Chuck Schumer:<br />
<img src="/assets/uploads/files/1624631505126-chuckschumer.png" alt="Chuck Schumer" class=" img-fluid img-markdown" /><br />
As you can see, each line has a tweet code, date and time, time zone code, Twitter handle, and the content of the tweet. The sections are delimited by spaces, which is obviously an issue because the tweets themselves contain many spaces. I tried column select, but replace in selection is greyed out:<br />
<img src="/assets/uploads/files/1624631917137-chuckschumer2.png" alt="ChuckSchumer2.png" class=" img-fluid img-markdown" /><br />
Replace All works fine here, but I can’t restrict it to the specific columns (the option is not greyed out if a normal selection is used). I was thinking about using a regex for this, but I’m not sure how to get it to do what I want. Ideally I could simply replace the first five spaces in each line with commas, since tweet code lengths can vary a little as you go back, but I don’t know how to do that. Maybe a macro?</p>
]]></description><link>https://community.notepad-plus-plus.org/topic/21391/replacing-in-specific-columns-more-difficult-than-you-d-think</link><generator>RSS for Node</generator><lastBuildDate>Tue, 21 Apr 2026 20:23:01 GMT</lastBuildDate><atom:link href="https://community.notepad-plus-plus.org/topic/21391.rss" rel="self" type="application/rss+xml"/><pubDate>Fri, 25 Jun 2021 14:42:44 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Replacing in specific columns (More difficult than you&#x27;d think) on Fri, 25 Jun 2021 17:15:01 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/3841">@PeterJones</a> can’t believe I didn’t think of that. Thanks!</p>
]]></description><link>https://community.notepad-plus-plus.org/post/67394</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/67394</guid><dc:creator><![CDATA[LingEd]]></dc:creator><pubDate>Fri, 25 Jun 2021 17:15:01 GMT</pubDate></item><item><title><![CDATA[Reply to Replacing in specific columns (More difficult than you&#x27;d think) on Fri, 25 Jun 2021 15:47:24 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/22381">@LingEd</a> said in <a href="/post/67385">Replacing in specific columns (More difficult than you'd think)</a>:</p>
<blockquote>
<p dir="auto">I altered the regex code to replace the first 5 spaces instead of the first 4</p>
</blockquote>
<p dir="auto">Congratulations.  That means you understood what was going on.  Knowing that people learn from what I write, rather than just copy/pasting and moving on, is always a good feeling.</p>
<blockquote>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/3841">@PeterJones</a> OK, sorry to bother, but new problem. When I import the altered files into Excel with comma as delimiter it creates too many columns when the tweets contain commas themselves. I was thinking what I could do is replace the 6th+ instances of commas (…) in a line with a very uncommon character like “ɤ” and then search/replace that character in excel after the fact. The thing is, I don’t know how to write the Regex code to replace not just the 6th instance, but the 7th, 8th, 9th, etc instances. Thanks for the help!</p>
</blockquote>
<p dir="auto">That’s one good idea.  If I were to do it that way, step 1 would be to just replace all commas with <code>ɤ</code>.  Step 2 would be your 5-space-to-comma replacement from above.</p>
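<p dir="auto">As a rough sketch of those two steps outside Notepad++, this is how they might look in Python. The sample line and the handle <code>@example</code> are made up, and the five-field regex is only my assumption of what the altered 5-space version looks like:</p>

```python
import re

PLACEHOLDER = "ɤ"  # the rarely used character proposed in the question

# Hypothetical tweet line whose text contains commas.
line = "123456791 2021-06-25 11:00:00 -0400 @example Bread, milk, and eggs"

# Step 1: replace every existing comma with the placeholder.
step1 = line.replace(",", PLACEHOLDER)

# Step 2: turn the first five spaces into commas (assumed 5-space regex).
step2 = re.sub(r"^(\S+) (\S+) (\S+) (\S+) (\S+) ", r"\1,\2,\3,\4,\5,", step1)
print(step2)

# Later, after splitting on the five real delimiters, restore the commas
# in the tweet text (this stands in for the search/replace done in Excel).
fields = step2.split(",", 5)
fields[5] = fields[5].replace(PLACEHOLDER, ",")
print(fields[5])
```

<p dir="auto">The order matters: the placeholder swap has to happen first, while the only commas in the file are the ones inside the tweets.</p>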
<p dir="auto">However, since you’re trying to make valid CSV to open in Excel: CSV lets you put quotes around a field so that any commas inside are treated as part of the text, not as field separators.  That does mean any text that already contains quotes would get messed up, but you can get around that by escaping each quote, changing any <code>"</code> to <code>""</code>.   So here is my procedure for what I think you want with your data:</p>
<ol start="0">
<li>Search Mode = regular expression for all of this</li>
<li>FIND = <code>"</code><br />
REPLACE = <code>""</code> to escape the quotes</li>
<li>FIND = <code>^(\S+) (\S+) (\S+) (\S+) (\S+) (.*$)</code><br />
REPLACE = <code>$1,$2,$3,$4,$5,"$6"</code> to change spaces to commas and to put quotes around the text.</li>
</ol>
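<p dir="auto">If you ever want to check the result outside Notepad++, the same two replacements can be sketched in Python. The sample line and the handle <code>@example</code> are invented for illustration, and reading the row back with the standard <code>csv</code> module only approximates what Excel does on import:</p>

```python
import csv
import re

# Hypothetical tweet line containing both commas and double quotes.
line = '123456790 2021-06-25 10:00:00 -0400 @example He said "hi", then left'

# Step 1: escape the quotes by doubling them.
escaped = line.replace('"', '""')

# Step 2: five space-free fields become comma-separated,
# and the rest of the line becomes one quoted text field.
row = re.sub(r'^(\S+) (\S+) (\S+) (\S+) (\S+) (.*)$', r'\1,\2,\3,\4,\5,"\6"', escaped)
print(row)

# Parse the row back as CSV to confirm it splits into exactly six fields.
fields = next(csv.reader([row]))
print(len(fields), fields[5])
```

<p dir="auto">Python writes the groups as <code>\1</code> to <code>\6</code> where Notepad++ uses <code>$1</code> to <code>$6</code>; otherwise the regex is the same as in step 2.</p>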
]]></description><link>https://community.notepad-plus-plus.org/post/67388</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/67388</guid><dc:creator><![CDATA[PeterJones]]></dc:creator><pubDate>Fri, 25 Jun 2021 15:47:24 GMT</pubDate></item><item><title><![CDATA[Reply to Replacing in specific columns (More difficult than you&#x27;d think) on Fri, 25 Jun 2021 15:32:02 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/3841">@PeterJones</a> OK, sorry to bother, but new problem. When I import the altered files into Excel with comma as delimiter it creates too many columns when the tweets contain commas themselves. I was thinking what I could do is replace the 6th+ instances of commas (I altered the regex code to replace the first 5 spaces instead of the first 4) in a line with a very uncommon character like “ɤ” and then search/replace that character in excel after the fact. The thing is, I don’t know how to write the Regex code to replace not just the 6th instance, but the 7th, 8th, 9th, etc instances. Thanks for the help!</p>
]]></description><link>https://community.notepad-plus-plus.org/post/67385</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/67385</guid><dc:creator><![CDATA[LingEd]]></dc:creator><pubDate>Fri, 25 Jun 2021 15:32:02 GMT</pubDate></item><item><title><![CDATA[Reply to Replacing in specific columns (More difficult than you&#x27;d think) on Fri, 25 Jun 2021 15:17:28 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/3841">@PeterJones</a> Thanks! Works like a charm</p>
]]></description><link>https://community.notepad-plus-plus.org/post/67381</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/67381</guid><dc:creator><![CDATA[LingEd]]></dc:creator><pubDate>Fri, 25 Jun 2021 15:17:28 GMT</pubDate></item><item><title><![CDATA[Reply to Replacing in specific columns (More difficult than you&#x27;d think) on Fri, 25 Jun 2021 15:00:43 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/22381">@LingEd</a> ,</p>
<p dir="auto">data:</p>
<pre><code>123456786 1999-12-31 23:59:57 -0400 &lt;username&gt; Three! with more spaces
123456787 1999-12-31 23:59:58 -0400 &lt;username&gt; Two! with more spaces
123456788 1999-12-31 23:59:59 -0400 &lt;username&gt; One! with more spaces
123456789 1900-01-01 00:00:00 -0400 &lt;username&gt; Happy Y2K Bug!
</code></pre>
<ul>
<li>FIND = <code>^(\S+) (\S+) (\S+) (\S+)\x20</code><br />
(I used <code>\x20</code>, which is equivalent to a space character, at the end of the regex to make it obvious that there’s something there, so you will get it when you copy/paste; if typing the regex, you could just type a space after the last parenthesis)</li>
<li>REPLACE = <code>$1,$2,$3,$4,</code></li>
<li>Search Mode = Regular expression</li>
</ul>
<pre><code>123456786,1999-12-31,23:59:57,-0400,&lt;username&gt; Three! with more spaces
123456787,1999-12-31,23:59:58,-0400,&lt;username&gt; Two! with more spaces
123456788,1999-12-31,23:59:59,-0400,&lt;username&gt; One! with more spaces
123456789,1900-01-01,00:00:00,-0400,&lt;username&gt; Happy Y2K Bug!
</code></pre>
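<p dir="auto">For anyone who would rather script this than use the Replace dialog, here is a minimal Python sketch of the same replacement. The sample lines reuse the layout above, but the handle <code>@example</code> is made up for this example:</p>

```python
import re

# Hypothetical sample lines in the same layout as the data above.
lines = [
    "123456786 1999-12-31 23:59:57 -0400 @example Three! with more spaces",
    "123456789 1900-01-01 00:00:00 -0400 @example Happy Y2K Bug!",
]

# Same pattern as the FIND above: four space-free fields, each followed by one space.
pattern = re.compile(r"^(\S+) (\S+) (\S+) (\S+)\x20")

# Python spells the replacement groups \1..\4 where Notepad++ uses $1..$4.
converted = [pattern.sub(r"\1,\2,\3,\4,", line) for line in lines]

for converted_line in converted:
    print(converted_line)
```

<p dir="auto">Because the pattern is anchored with <code>^</code> and consumes only the first four spaces, any spaces later in the line are left alone.</p>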
]]></description><link>https://community.notepad-plus-plus.org/post/67379</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/67379</guid><dc:creator><![CDATA[PeterJones]]></dc:creator><pubDate>Fri, 25 Jun 2021 15:00:43 GMT</pubDate></item></channel></rss>