<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[&quot;Find in files&quot; special characters not working anymore]]></title><description><![CDATA[<p dir="auto">Hello.</p>
<p dir="auto">I’ve been using Search and Replace across a bunch of files for years, until one day it stopped working.</p>
<p dir="auto">I have TV show subtitles containing special characters that don’t show up on my TV, so I replace them with normal letters. There are a lot of them, so I need to do it in batch.</p>
<p dir="auto">The chars don’t show as they should in notepad++ either, though I’m on UTF-8, they are ș ț Ș Ț and I was replacing them like this:</p>
<p dir="auto">º -&gt; s<br />
þ -&gt; t<br />
ª -&gt; S<br />
Þ -&gt; T</p>
<p dir="auto">Find in current file works just fine on my characters. Find and replace in files works just fine on normal letters.</p>
<p dir="auto">For example, if I search for º in the open file, it finds it just fine. In Find in Files, however, I get 0 results. But if I look for any normal letter, it works as it should.</p>
<p dir="auto">I’ve reinstalled hoping some setting blew up but no luck, probably something on my part :(</p>
<p dir="auto">Notepad++ v7.7.1   (64-bit)<br />
Build time : Jun 16 2019 - 21:24:47<br />
Path : C:\Program Files\Notepad++\notepad++.exe<br />
Admin mode : OFF<br />
Local Conf mode : OFF<br />
OS : Windows 10 (64-bit)<br />
Plugins : AutoSave.dll BetterMultiSelection.dll Explorer.dll mimeTools.dll NppConverter.dll NppToolBucket.dll PreviewHTML.dll</p>
]]></description><link>https://community.notepad-plus-plus.org/topic/18452/find-in-files-special-characters-not-working-anymore</link><generator>RSS for Node</generator><lastBuildDate>Wed, 10 Jun 2026 03:02:06 GMT</lastBuildDate><atom:link href="https://community.notepad-plus-plus.org/topic/18452.rss" rel="self" type="application/rss+xml"/><pubDate>Fri, 01 Nov 2019 10:34:54 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to &quot;Find in files&quot; special characters not working anymore on Mon, 23 Dec 2019 22:53:21 GMT]]></title><description><![CDATA[<p dir="auto">Hello!<br />
I do not want to create a new topic because my problem is pretty much the same. But I would like to get a short answer, I’m not interested in the character coding stuff.</p>
<p dir="auto">So, I have a lot of .cpp files, all of them are ANSI and to my luck, each comment made in Korean language. For example, there is a comment: “ÇöŔç Ŕ§ÄˇżˇĽ­ »çżëÇŇ Ľö ľř˝Ŕ´Ď´Ů.”<br />
(This means “Not available at this location.” if I change the character encoding to Windows-949 but it is not important now.)</p>
<p dir="auto">A few Notepad++ releases ago I was able to search my source files for these specially encoded characters, but nowadays I can’t.</p>
<p dir="auto">So my question is, what can I do to fix the search?</p>
<p dir="auto">I do not want to install an older version of Notepad++ just because of this, but I think I must. What happened? Why doesn’t the search work correctly anymore? What can I do?</p>
<p dir="auto">Thank you in advance!</p>
]]></description><link>https://community.notepad-plus-plus.org/post/49422</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/49422</guid><dc:creator><![CDATA[Gregori]]></dc:creator><pubDate>Mon, 23 Dec 2019 22:53:21 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;Find in files&quot; special characters not working anymore on Mon, 04 Nov 2019 12:49:45 GMT]]></title><description><![CDATA[<p dir="auto">Thank you for your time and answers, gentlemen, wasn’t expecting such support on this forum.</p>
]]></description><link>https://community.notepad-plus-plus.org/post/48302</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/48302</guid><dc:creator><![CDATA[Pro Bg]]></dc:creator><pubDate>Mon, 04 Nov 2019 12:49:45 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;Find in files&quot; special characters not working anymore on Sun, 03 Nov 2019 07:04:51 GMT]]></title><description><![CDATA[<p dir="auto">Hello, <a class="plugin-mentions-user plugin-mentions-a" href="/user/pro-bg" aria-label="Profile: Pro-Bg">@<bdi>Pro-Bg</bdi></a>, <a class="plugin-mentions-user plugin-mentions-a" href="/user/peterjones" aria-label="Profile: peterjones">@<bdi>peterjones</bdi></a> and <strong>All</strong>,</p>
<p dir="auto">Thanks to <a class="plugin-mentions-user plugin-mentions-a" href="/user/peterjones" aria-label="Profile: peterjones">@<bdi>peterjones</bdi></a>, I understood that I simply forgot to act in the <strong>right</strong> order :-(( So, <a class="plugin-mentions-user plugin-mentions-a" href="/user/pro-bg" aria-label="Profile: Pro-Bg">@<bdi>Pro-Bg</bdi></a>, just forget the <strong>second</strong> part of my <strong>previous</strong> post, where I described the <strong>regex</strong> S/R, which is <strong>wrong</strong>  :-((</p>
<p dir="auto">So, I, first, <strong>downloaded</strong> your archive and <strong>extracted</strong> the <strong><code>Marco Polo S01E01 The Wayfarer 720p BluRay DTS x264-EbP.srt</code></strong> file</p>
<p dir="auto">When opening your file, in <strong>Notepad++</strong>, I get an <strong><code>ANSI</code></strong> encoded file. BTW, I also tried to <strong>untick</strong> the <strong><code>Settings &gt; Preferences &gt; MISC &gt; Autodetect character encoding</code></strong> option. Luckily, after re-opening Notepad++ and loading your file, its encoding had <strong>not</strong> changed and was still <strong><code>ANSI</code></strong> !</p>
<p dir="auto">I renamed your file with a <strong>shorter</strong> name and chose the <strong><code>.txt</code></strong> extension. So, from now on, your <strong>initial</strong> file will be named <strong><code>Test.txt</code></strong> !</p>
<p dir="auto">I’m about to show you <strong><code>3</code></strong> <strong>different</strong> methods to solve the <a class="plugin-mentions-user plugin-mentions-a" href="/user/pro-bg" aria-label="Profile: Pro-Bg">@<bdi>Pro-Bg</bdi></a>’s problem. Note that the <strong>first</strong> one is just <strong>Peter</strong>’s solution !</p>
<hr />
<p dir="auto"><strong>FIRST</strong> method :</p>
<ul>
<li>I used the <strong>iconv</strong> utility, as suggested by <strong>Peter</strong>, running the command, below, in a <strong>DOS</strong> console window :</li>
</ul>
<pre><code class="language-diff">iconv -f ISO-8859-2 -t UTF-8 Test.txt &gt; Test_ICONV.txt
</code></pre>
<p dir="auto">Indeed, the result is fine and the <strong><code>4</code></strong> characters <strong><code>ª</code></strong> , <strong><code>º</code></strong>, <strong><code>Þ</code></strong> and <strong><code>þ</code></strong> were <strong>correctly</strong> translated into the <strong><code>4</code></strong> chars <strong><code>Ş</code></strong> , <strong><code>ş</code></strong>, <strong><code>Ţ</code></strong> and <strong><code>ţ</code></strong> :-))</p>
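<p dir="auto">For readers without <strong>iconv</strong> at hand, the same re-decoding can be sketched in Python. This is only an illustration, not part of the thread's original recipe : the helper name <code>to_utf8</code> is made up here, and it assumes the source bytes really are <strong><code>ISO 8859-2</code></strong>.</p>
<pre><code class="language-python">def to_utf8(raw, source_encoding="iso8859_2"):
    """Decode bytes from a legacy single-byte encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# The four bytes that an ANSI (Windows-1252) view displays as ª º Þ þ
sample = bytes([0xAA, 0xBA, 0xDE, 0xFE])
print(to_utf8(sample).decode("utf-8"))  # prints ŞşŢţ
</code></pre>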
<p dir="auto"><strong>Remark</strong> :</p>
<p dir="auto">If we assume that your file was, <strong>initially</strong>, a <strong><code>Windows-1250</code></strong> encoded file and that we run the command, below :</p>
<pre><code class="language-diff">iconv -f WINDOWS-1250 -t UTF-8 Test.txt &gt; Test_2.txt
</code></pre>
<p dir="auto">One can easily verify that the <strong>two</strong> output files are quite <strong>identical</strong>. So, regarding this file, these <strong>two</strong> encodings are <strong>equivalent</strong>. Nice !</p>
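<p dir="auto">One way to check this kind of equivalence without comparing output files is to list which byte values of a file decode differently under the two encodings. The sketch below is an assumption-laden illustration ( the function name <code>differing_bytes</code> is hypothetical ) : an empty result means that, for that data, both encodings give the same text.</p>
<pre><code class="language-python">def differing_bytes(data, enc_a="iso8859_2", enc_b="cp1250"):
    """Return the sorted distinct byte values in data that decode differently under the two encodings."""
    result = []
    for value in sorted(set(data)):
        single = bytes([value])
        # errors="replace" keeps going on bytes that one encoding leaves undefined
        if single.decode(enc_a, errors="replace") != single.decode(enc_b, errors="replace"):
            result.append(value)
    return result


# The nine non-ASCII bytes found in Test.txt (see the THIRD method)
print(differing_bytes(bytes([0xAA, 0xBA, 0xC3, 0xCE, 0xDE, 0xE2, 0xE3, 0xEE, 0xFE])))  # prints []
</code></pre>
<p dir="auto">A byte such as <code>0xA5</code>, which the two code pages map to different letters, would show up in the list, so the equivalence holds for this file only.</p>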
<p dir="auto"><strong>Note</strong> :</p>
<p dir="auto">Be <strong>aware</strong>, however, that the <strong><code>4</code></strong> characters <strong><code>Ş</code></strong> , <strong><code>ş</code></strong>, <strong><code>Ţ</code></strong> and <strong><code>ţ</code></strong>, in the <strong>output</strong> file, are letters with a <strong>cedilla</strong> and <strong>not</strong> the <strong>Romanian</strong> letters with a <strong>comma</strong> below : <strong><code>Ș</code></strong> , <strong><code>ș</code></strong>, <strong><code>Ț</code></strong> and <strong><code>ț</code></strong> !</p>
<hr />
<p dir="auto"><strong>SECOND</strong> method :</p>
<ul>
<li>
<p dir="auto">Open a <strong>new</strong> file ( <strong><code>Ctrl + N</code></strong> )</p>
</li>
<li>
<p dir="auto">If your <strong>default</strong> encoding, for <strong>new</strong> files, is <strong>not</strong> <strong><code>ANSI</code></strong>, select the first option <strong><code>Encoding &gt; ANSI</code></strong> for this <strong>empty</strong> file. Note that, as your file is <strong>empty</strong>, you could also run the option <strong><code>Encoding &gt; Convert to ANSI</code></strong></p>
</li>
</ul>
<p dir="auto">=&gt; The <strong><code>ANSI</code></strong> encoding should be displayed in the <strong>status bar</strong></p>
<ul>
<li>
<p dir="auto">Now, <strong>copy / paste</strong> the contents of the <strong><code>Test.txt</code></strong> file, in this <strong>new</strong> file</p>
</li>
<li>
<p dir="auto">Then, run one of the <strong>two</strong> options :</p>
<ul>
<li>
<p dir="auto"><strong><code>Encoding &gt; Character Sets &gt; Central European &gt; Windows-1250</code></strong></p>
</li>
<li>
<p dir="auto"><strong><code>Encoding &gt; Character Sets &gt; Eastern European &gt; ISO 8859-2</code></strong></p>
</li>
</ul>
</li>
<li>
<p dir="auto">A <strong>small</strong> window, with title <strong>Lose Undo Ability Waning</strong> pops up : <em>You should save the current modification. All the saved modifications can not be undone. Continue ?</em></p>
</li>
<li>
<p dir="auto">Choose the <strong>default</strong> choice, clicking on the <strong><code>Yes</code></strong> button</p>
</li>
<li>
<p dir="auto">The <strong>Save as</strong> dialog then appears. So, save this <strong>new</strong> file as, let’s say, <strong><code>Test_NPP.txt</code></strong></p>
</li>
</ul>
<p dir="auto">=&gt; Note that the <strong><code>Windows-1250</code></strong> ( or <strong><code>ISO 8859-2</code></strong> ) encoding is shown in the <strong>status bar</strong></p>
<ul>
<li>Then select the <strong><code>Encoding &gt; Convert to UTF-8</code></strong> option  ( Do <strong>not</strong> choose the <strong><code>UTF-8</code></strong> only option ! )</li>
</ul>
<p dir="auto">=&gt; This time, the <strong><code>UTF-8</code></strong> encoding is displayed in the <strong>status bar</strong></p>
<ul>
<li><strong>Save</strong> the modifications ( <strong><code>Ctrl + S</code></strong> )</li>
</ul>
<p dir="auto">The <strong>nice</strong> thing is that the <strong><code>Test_NPP.txt</code></strong> file, built from within <strong>N++</strong> and the <strong><code>Test_ICONV.txt</code></strong> file, <strong>output</strong> of the <strong><code>iconv</code></strong> <strong>Dos</strong> command, are strictly <strong>identical</strong> !</p>
<hr />
<p dir="auto"><strong>THIRD</strong> method ( a bit <strong>longer</strong> ! ) :</p>
<ul>
<li>
<p dir="auto">Open a <strong>new</strong> file ( <strong><code>Ctrl + N</code></strong> )</p>
</li>
<li>
<p dir="auto">If your <strong>default</strong> encoding, for <strong>new</strong> files, is <strong>not</strong> <strong><code>ANSI</code></strong>, select the first option <strong><code>Encoding &gt; ANSI</code></strong> for this <strong>empty</strong> file</p>
</li>
</ul>
<p dir="auto">=&gt; The <strong><code>ANSI</code></strong> encoding should be displayed in the <strong>status bar</strong></p>
<ul>
<li>
<p dir="auto">Now, <strong>copy / paste</strong> the contents of <strong><code>Test.txt</code></strong>, in this <strong>new</strong> file</p>
</li>
<li>
<p dir="auto">First, we’ll try to get rid of <strong>standard</strong> characters, in order to <strong>identify</strong> which characters would have a <strong>different byte</strong> sequence, when migrated to <strong><code>UTF-8</code></strong>. This concerns, principally, characters with code-point <strong>above</strong> <strong><code>\x7F</code></strong>. So :</p>
</li>
<li>
<p dir="auto"><strong>Suppression</strong> of any <strong><code>ASCII</code></strong> character, with code in the <strong><code>[ 0 - 127 ]</code></strong> range :</p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>[\x00-\x7f]+</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>Leave EMPTY</code></strong></p>
</li>
</ul>
</li>
<li>
<p dir="auto">Leave only <strong>one</strong> character per line :</p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>.</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>$0\r\n</code></strong></p>
</li>
</ul>
</li>
<li>
<p dir="auto">Run the <strong><code>Edit &gt; Line Operations &gt; Sort Lines Lexicographically Ascending</code></strong> option</p>
</li>
<li>
<p dir="auto">Run the <strong><code>Edit &gt; Line Operations &gt; Remove Consecutive Duplicate Lines</code></strong> option</p>
</li>
</ul>
<p dir="auto">=&gt; You’re left with a <strong>tiny</strong> list of <strong><code>9</code></strong> characters <strong><code>ª    º    Ã    Î    Þ    â    ã    î    þ</code></strong></p>
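<p dir="auto">The same inventory of <strong>non-ASCII</strong> characters can be taken directly on the raw bytes of the file. The following is a minimal sketch, assuming a single-byte encoded file; <code>high_bytes</code> is a hypothetical helper name, not something from Notepad++.</p>
<pre><code class="language-python">def high_bytes(data):
    """Return the sorted distinct byte values in the 0x80-0xFF range found in data."""
    return sorted({b for b in data if b in range(0x80, 0x100)})


# In practice, data would come from something like Path("Test.txt").read_bytes()
sample = b"abc\xfe\xaa\xfe"
print([f"{b:02X}" for b in high_bytes(sample)])  # prints ['AA', 'FE']
</code></pre>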
<ul>
<li>
<p dir="auto">Run the <strong><code>Encoding &gt; Character Sets &gt; Central European &gt; Windows-1250</code></strong> option</p>
</li>
<li>
<p dir="auto">A small window, with title <strong>Lose Undo Ability Waning</strong> pops up : <em>You should save the current modification. All the saved modifications can not be undone. Continue ?</em></p>
</li>
<li>
<p dir="auto">Choose the <strong>default</strong> choice, clicking on the <strong><code>Yes</code></strong> button</p>
</li>
<li>
<p dir="auto">The <strong>Save as</strong> dialog appears. So, save this <strong>new</strong> file, anywhere, with a <strong>dummy</strong> name</p>
</li>
</ul>
<p dir="auto">=&gt; The <strong><code>Windows-1250</code></strong> encoding is shown in the <strong>status bar</strong></p>
<ul>
<li>The tiny list has been <strong>changed</strong> into the following <strong><code>9</code></strong> characters <strong><code>Ş    ş    Ă    Î    Ţ    â    ă    î    ţ</code></strong>, rewritten, below, with their <strong>codes</strong> :</li>
</ul>
<pre><code class="language-Z">Characters           Ş      ş      Ă      Î      Ţ      â      ă      î      ţ

In Windows-1250    00AA   00BA   00C3   00CE   00DE   00E2   00E3   00EE   00FE

( Unicode value    015E   015F   0102   00CE   0162   00E2   0103   00EE   0163 )
</code></pre>
<p dir="auto">Refer, to that purpose, to the link :</p>
<p dir="auto"><a href="https://en.wikipedia.org/wiki/Windows-1250" rel="nofollow ugc">https://en.wikipedia.org/wiki/Windows-1250</a></p>
<p dir="auto">After examination of the different <strong>Unicode</strong> values, we can <strong>eliminate</strong> the <strong><code>3</code></strong> characters <strong><code>Î</code></strong>, <strong><code>â</code></strong> and <strong><code>î</code></strong>, which are <strong>identical</strong> in the <strong>two</strong> encodings ( Note that they correspond to the characters with a <strong>Unicode</strong> value under <strong><code>\x0100</code></strong> )</p>
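<p dir="auto">The table above can be reproduced with a few lines of Python. This sketch assumes that the <strong><code>ANSI</code></strong> view is <strong><code>Windows-1252</code></strong> and that the real encoding is <strong><code>Windows-1250</code></strong>; the function name <code>reinterpret</code> is invented for the illustration.</p>
<pre><code class="language-python">def reinterpret(text, shown_as="cp1252", really="cp1250"):
    """Undo a wrong ANSI display: recover the original bytes, then decode them properly."""
    return text.encode(shown_as).decode(really)


for wrong in "ªºÃÎÞâãîþ":
    right = reinterpret(wrong)
    status = "unchanged" if right == wrong else "changed"
    # Columns: code point and glyph as displayed, code point and glyph once re-decoded
    print(f"{ord(wrong):04X} {wrong}  {ord(right):04X} {right}  {status}")
</code></pre>
<p dir="auto">Only the rows marked <code>changed</code> need an entry in the regex S/R that follows.</p>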
<ul>
<li>
<p dir="auto">Open a <strong>new</strong> file ( <strong><code>Ctrl + N</code></strong> )</p>
</li>
<li>
<p dir="auto">If your <strong>default</strong> encoding, for <strong>new</strong> files, is <strong>not</strong> <strong><code>ANSI</code></strong>, select the first option <strong><code>Encoding &gt; ANSI</code></strong> for this <strong>empty</strong> file</p>
</li>
</ul>
<p dir="auto">=&gt; The <strong><code>ANSI</code></strong> encoding should be displayed in the <strong>status bar</strong></p>
<ul>
<li>
<p dir="auto">Now, <strong>copy / paste</strong> the contents of <strong><code>Test.txt</code></strong>, in this <strong>new</strong> file</p>
</li>
<li>
<p dir="auto">Run the <strong><code>Encoding &gt; Convert to UTF-8</code></strong> option  ( Do <strong>not</strong> choose the <strong><code>UTF-8</code></strong> only option ! )</p>
</li>
</ul>
<p dir="auto">=&gt; The <strong><code>UTF-8</code></strong> encoding is, now, displayed in the <strong>status bar</strong></p>
<ul>
<li>
<p dir="auto">Perform the following <strong>regex</strong> S/R :</p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>(\x{00AA})|(\x{00BA})|(\x{00C3})|(\x{00DE})|(\x{00E3})|(\x{00FE})</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>(?1\x{015E})(?2\x{015F})(?3\x{0102})(?4\x{0162})(?5\x{0103})(?6\x{0163})</code></strong></p>
</li>
</ul>
</li>
</ul>
<p dir="auto">=&gt; <strong><code>733</code></strong> replacements done</p>
<ul>
<li>Save this <strong>new</strong> file and name it, let’s say, <strong><code>Test_REGEX.txt</code></strong></li>
</ul>
<p dir="auto">Again, the <strong>nice</strong> thing is that the <strong><code>Test_REGEX.txt</code></strong> file, built from within <strong>N++</strong>, with a <strong>regex</strong> S/R, and the <strong><code>Test_ICONV.txt</code></strong> file, <strong>output</strong> of the <strong><code>iconv</code></strong> <strong>Dos</strong> command, are strictly <strong>identical</strong>, too !</p>
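<p dir="auto">Outside of Notepad++, the same character-for-character substitution could be sketched in Python with a lookup table instead of numbered groups. The names <code>CEDILLA_FIX</code> and <code>fix_text</code> are hypothetical, and the sketch assumes the text was first read as <strong><code>Windows-1252</code></strong>, as in the steps above.</p>
<pre><code class="language-python">import re

# Wrongly displayed character mapped to the intended one (same pairs as the S/R above)
CEDILLA_FIX = {
    "\u00AA": "\u015E",  # ª becomes Ş
    "\u00BA": "\u015F",  # º becomes ş
    "\u00C3": "\u0102",  # Ã becomes Ă
    "\u00DE": "\u0162",  # Þ becomes Ţ
    "\u00E3": "\u0103",  # ã becomes ă
    "\u00FE": "\u0163",  # þ becomes ţ
}
PATTERN = re.compile("[" + "".join(CEDILLA_FIX) + "]")


def fix_text(text):
    """Return the corrected text and the number of replacements made."""
    return PATTERN.subn(lambda m: CEDILLA_FIX[m.group(0)], text)


fixed, count = fix_text("\u00baarpe \u00fei")
print(fixed, count)  # prints şarpe ţi 2
</code></pre>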
<p dir="auto">Best Regards,</p>
<p dir="auto">guy038</p>
<p dir="auto"><strong>P.S.</strong> :</p>
<p dir="auto">Now, <a class="plugin-mentions-user plugin-mentions-a" href="/user/pro-bg" aria-label="Profile: Pro-Bg">@<bdi>Pro-Bg</bdi></a>, if you <strong>really</strong> want to see the <strong>Romanian</strong> <strong><code>Ș</code></strong> , <strong><code>ș</code></strong>, <strong><code>Ț</code></strong> and <strong><code>ț</code></strong> letters, with <strong>comma</strong> below :</p>
<ul>
<li>
<p dir="auto">In <strong>N++</strong>, open any of the <strong><code>Test_ICONV.txt</code></strong>, <strong><code>Test_NPP.txt</code></strong> or <strong><code>Test_REGEX.txt</code></strong> <strong>output</strong> files ( <strong>identical</strong> <strong><code>UTF-8</code></strong> encoded files ! )</p>
</li>
<li>
<p dir="auto">Perform this last <strong>regex</strong> S/R :</p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>(\x{015E})|(\x{015F})|(\x{0162})|(\x{0163})</code></strong> ( Characters <strong><code>S</code></strong> and <strong><code>T</code></strong> with <strong>cedilla</strong> )</p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>(?1\x{0218})(?2\x{0219})(?3\x{021A})(?4\x{021B})</code></strong> ( <strong>Romanian</strong> Characters <strong><code>S</code></strong> and <strong><code>T</code></strong> with <strong>comma</strong> below )</p>
</li>
</ul>
</li>
<li>
<p dir="auto"><strong>Re-save</strong> your file</p>
</li>
</ul>
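<p dir="auto">For a script-based workflow, this last step is a plain one-to-one character mapping, which Python's <code>str.translate</code> handles directly. A small sketch, with the hypothetical names <code>TO_COMMA_BELOW</code> and <code>to_comma_below</code> :</p>
<pre><code class="language-python"># S and T with cedilla mapped to the Romanian letters with comma below
TO_COMMA_BELOW = str.maketrans({
    "\u015E": "\u0218",  # Ş becomes Ș
    "\u015F": "\u0219",  # ş becomes ș
    "\u0162": "\u021A",  # Ţ becomes Ț
    "\u0163": "\u021B",  # ţ becomes ț
})


def to_comma_below(text):
    """Replace the four cedilla letters with their comma-below counterparts."""
    return text.translate(TO_COMMA_BELOW)


print(to_comma_below("\u015Ei \u0163ar\u0103"))  # prints Și țară
</code></pre>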
<p dir="auto"><strong>P.P.S.</strong> :</p>
<p dir="auto">Those of you whose mother tongue is <strong>English</strong> are <strong>really</strong> lucky ! You have to worry, <strong>very little</strong>, about all these <strong>encoding</strong> problems ;-))</p>
]]></description><link>https://community.notepad-plus-plus.org/post/48280</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/48280</guid><dc:creator><![CDATA[guy038]]></dc:creator><pubDate>Sun, 03 Nov 2019 07:04:51 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;Find in files&quot; special characters not working anymore on Sat, 02 Nov 2019 15:33:47 GMT]]></title><description><![CDATA[<p dir="auto">I’m going to assume the solution <a class="plugin-mentions-user plugin-mentions-a" href="/user/guy038" aria-label="Profile: guy038">@<bdi>guy038</bdi></a> posted will work, because they usually are (or, at least, they are moving in the direction of working for whoever asked the question, because Guy doesn’t stop until they do work).</p>
<p dir="auto">However, before he posted, I had started down a non-regex road; I think it will be useful, so even after Guy’s post, I continued to write it up.</p>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/pro-bg" aria-label="Profile: Pro-Bg">@<bdi>Pro-Bg</bdi></a> said:</p>
<blockquote>
<p dir="auto">…noticed later that they’re ANSI. That how the subbers made them, …<br />
And yes, those are the Romanian letters,</p>
</blockquote>
<p dir="auto">When you said that, I took a look at the files.  When you open them with <a href="https://npp-user-manual.org/docs/preferences/#misc" rel="nofollow ugc"><strong>Settings &gt; Preferences &gt; MISC &gt; ☑ Autodetect character encoding</strong></a> enabled, they detect as “ANSI”, and those characters show up as you originally posted.  Since you said “Romanian”, I assumed maybe it was really a Central or Eastern European encoding used, rather than the default “ANSI” Western European encoding.</p>
<p dir="auto">So I went to <strong>Encoding &gt; Character Sets &gt; Eastern European</strong>: Choosing <strong>ISO 8859-2</strong> appeared to work.  But while writing this up, I realized that Romanian can be considered <a href="https://en.wikipedia.org/wiki/Central_Europe" rel="nofollow ugc">Central European</a> as well, so I tried choosing <strong>… &gt; Central European &gt; OEM 852</strong>, which made those characters box-drawing, so that was obviously wrong.  <strong>… &gt; Central European &gt; Windows 1250</strong> appeared to convert those to the right characters as well.</p>
<p dir="auto">I don’t know all the differences between <a href="https://en.wikipedia.org/wiki/ISO-8859-2" rel="nofollow ugc">ISO 8859-2</a> and <a href="https://en.wikipedia.org/wiki/Windows-1250" rel="nofollow ugc">Windows 1250</a> – ah, per Wikipedia, “Windows-1250 is similar to ISO-8859-2 and has all the printable characters it has and more. However a few of them are rearranged”.  You would have to know more about the files to determine which of those encodings they really are; though my guess, if they’re for subtitles, then they were done with the ISO 8859-2, not the Microsoft-centric Windows-1250.</p>
<p dir="auto">So really, in Notepad++, instead of doing a search-replace, all you need to do is to change the <strong>Encoding &gt; Character Set</strong> to the appropriate one (probably ISO 8859-2, but maybe Windows-1250).  After doing that, so it’s displayed properly, you should be able to read and edit the file to your heart’s content.  If you’re going to be editing the file multiple times, I would suggest <strong>Encoding &gt; Convert to UTF-8-BOM</strong>, so it will change the encoded single-byte Romanian characters to their UTF-8 multi-byte encoding, with the BOM character inserted at the beginning of the file.  Once you save after the conversion, then the next time you open the file with Notepad++, it will properly interpret it as UTF-8, and all the characters will be interpreted and displayed correctly.</p>
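<p dir="auto">If you end up scripting this instead of using the menu, Python’s <code>utf-8-sig</code> codec writes the same BOM-prefixed UTF-8. This is just a sketch under the assumption that the file really is ISO 8859-2; <code>to_utf8_bom</code> is a name I made up for the example.</p>
<pre><code class="language-python">import codecs


def to_utf8_bom(raw, source_encoding="iso8859_2"):
    """Decode legacy single-byte data and re-encode it as UTF-8 with a leading BOM."""
    return raw.decode(source_encoding).encode("utf-8-sig")


converted = to_utf8_bom(bytes([0xFE]))
print(converted.startswith(codecs.BOM_UTF8))  # prints True
</code></pre>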
<p dir="auto">As far as subtitles go: I’m guessing what prompted this is that the subtitles were showing up wrong in your video player of choice.  My guess is that it was because your player didn’t know / couldn’t guess the right encoding for the file, so used ANSI like Notepad++ did.  I don’t know whether your player handles UTF-8 better than a random encoding… but if it does, then maybe saving the file after converting to UTF-8-BOM will make it work right in your player.  You might be able to google for your player’s name and “encoding” or “utf-8” or “unicode”, to find out which encoding it assumes or prefers.</p>
<p dir="auto">However, if you have a lot of files, Notepad++ might not be the most efficient for batch-converting the encoding.<br />
The <a href="https://superuser.com/questions/27060/batch-convert-files-for-encoding/49147#49147" rel="nofollow ugc">superuser answer</a> that I referenced in <a href="https://community.notepad-plus-plus.org/post/47893">my post in another thread</a> links to <a href="http://gnuwin32.sourceforge.net/packages/libiconv.htm" rel="nofollow ugc">a version of <code>iconv</code> for Windows</a>, which should be able to automate the conversion from ISO 8859-2 to UTF-8.</p>
<pre><code>iconv -f ISO-8859-2 -t utf-8 sourcefile.srt &gt; outfile.srt
</code></pre>
<p dir="auto">To get that to do all files in a given directory, open a cmd.exe prompt in that directory, and run</p>
<pre><code>FOR %f in (*.srt) do @( iconv -f ISO-8859-2 -t utf-8 "%f" &gt; "%~nf.utf8%~xf" )
</code></pre>
<p dir="auto">When I ran that on the marco polo files you showed us for download, it did properly convert them to utf-8.</p>
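<p dir="auto">If you would rather not install <code>iconv</code>, the same batch loop can be sketched in Python. Treat it as a rough equivalent rather than a tested tool: <code>convert_folder</code> is a hypothetical name, it assumes every <code>.srt</code> in the folder is ISO 8859-2, and it mirrors the <code>name.utf8.srt</code> output naming of the command above.</p>
<pre><code class="language-python">from pathlib import Path


def convert_folder(folder, source_encoding="iso8859_2"):
    """Write a UTF-8 copy of every .srt file in folder, named like name.utf8.srt."""
    written = []
    for src in sorted(Path(folder).glob("*.srt")):
        if src.name.endswith(".utf8.srt"):
            continue  # skip copies produced by an earlier run
        dst = src.with_name(src.stem + ".utf8" + src.suffix)
        # Work on bytes so that line endings are left exactly as they were
        dst.write_bytes(src.read_bytes().decode(source_encoding).encode("utf-8"))
        written.append(dst.name)
    return written
</code></pre>
<p dir="auto">Calling <code>convert_folder(".")</code> from the subtitle directory returns the list of files it wrote; the originals are left untouched.</p>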
]]></description><link>https://community.notepad-plus-plus.org/post/48272</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/48272</guid><dc:creator><![CDATA[PeterJones]]></dc:creator><pubDate>Sat, 02 Nov 2019 15:33:47 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;Find in files&quot; special characters not working anymore on Sat, 02 Nov 2019 14:46:56 GMT]]></title><description><![CDATA[<p dir="auto">Hi, <a class="plugin-mentions-user plugin-mentions-a" href="/user/pro-bg" aria-label="Profile: Pro-Bg">@<bdi>Pro-Bg</bdi></a> and <strong>All</strong>,</p>
<p dir="auto">Firstly, when you begin to ask about characters <strong>representation</strong> and/or <strong>code</strong>, the best is to ask yourself : Does my operating system contain a <strong>font</strong> which can <strong>properly</strong> handle these characters and <strong>correctly</strong> display their <strong>glyphs</strong> ?</p>
<p dir="auto">Now, unfortunately, these <strong>Romanian <code>4</code></strong> characters <strong><code>Ș</code></strong>,<strong><code>ș</code></strong>, <strong><code>Ț</code></strong> and <strong><code>ț</code></strong>, of <strong>Unicode</strong> code-point <strong><code>0218</code></strong>, <strong><code>0219</code></strong>, <strong><code>021A</code></strong> and <strong><code>021B</code></strong>, are handled by <strong>very few</strong> proportional fonts and, AFAIK, by the monospaced font <strong><code>Consolas</code></strong> <strong>only</strong> !</p>
<p dir="auto">So I advise you to use the <strong><code>Consolas</code></strong> font, which should be part of your system… On <strong><code>Windows 7</code></strong>, its version is <strong><code>5.22</code></strong>  and from <strong><code>Windows 8</code></strong>, its <strong>version</strong> is <strong><code>5.32</code></strong> and contains <strong><code>2,735</code></strong> glyphs</p>
<p dir="auto">From within <strong>notepad++</strong> :</p>
<ul>
<li>
<p dir="auto">Select the <strong><code>Settings &gt; Style Configurator &gt;</code></strong> option</p>
</li>
<li>
<p dir="auto">Select <strong><code>Global styles</code></strong> in the <strong>Language</strong> drop-down list</p>
</li>
<li>
<p dir="auto">Select <strong><code>Default style</code></strong> in the <strong>Style</strong> drop-down list</p>
</li>
<li>
<p dir="auto">In the <strong>Font Style</strong> area, choose the <strong><code>Consolas</code></strong> font, from the drop-down list of <strong>fonts</strong></p>
</li>
<li>
<p dir="auto">Click on the <strong><code>Save &amp; Close</code></strong> button</p>
</li>
</ul>
<p dir="auto"><strong>Remark</strong> :</p>
<p dir="auto">In Notepad++, comparing the glyphs of these <strong><code>4</code></strong> <strong>Romanian</strong> characters ( with <strong>comma</strong> below ) with their <strong>equivalent</strong> chars ( with a <strong>cedilla</strong> ), with the <strong><code>Consolas</code></strong> font, I noticed, when maximum <strong>zoom</strong> is used, that :</p>
<ul>
<li>
<p dir="auto">Regarding the letter <strong><code>S</code></strong> and <strong><code>s</code></strong>, the <strong>cedilla</strong> seems <strong>closer</strong> to the <strong>bottom</strong> of character than the <strong>comma</strong>  : <strong><code>Ș  ș  Ş  ş</code></strong></p>
</li>
<li>
<p dir="auto">Regarding the letter <strong><code>T</code></strong> and <strong><code>t</code></strong>, the character appearance seems rather <strong>identical</strong> : <strong><code>Ț  ț  Ţ  ţ</code></strong></p>
</li>
</ul>
<hr />
<p dir="auto">Secondly, I don’t see any reason which could explain that the <strong>search/replacement</strong> would work when using the <strong><code>Replace</code></strong> dialog and <strong>NOT</strong> with the <strong><code>Find in Files</code></strong> dialog !</p>
<p dir="auto"><strong>Two</strong> solutions :</p>
<ul>
<li>
<p dir="auto">Open the <strong><code>Replace</code></strong> dialog</p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>(\x{0218})|(\x{0219})|(\x{021A})|(\x{021B})</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>(?1S)(?2s)(?3T)(?4t)</code></strong></p>
</li>
<li>
<p dir="auto">Untick the <strong><code>Match whole word only</code></strong>, if necessary</p>
</li>
<li>
<p dir="auto">Tick the <strong><code>Match case</code></strong> box option ( <strong>Important</strong> )</p>
</li>
<li>
<p dir="auto">Tick the <strong><code>Wrap around</code></strong> box option</p>
</li>
<li>
<p dir="auto">Select the <strong><code>Regular expression</code></strong> search mode</p>
</li>
<li>
<p dir="auto">Click on the <strong><code>Replace All</code></strong> button</p>
</li>
</ul>
</li>
<li>
<p dir="auto">Open the <strong><code>Find in Files</code></strong> dialog</p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>(\x{0218})|(\x{0219})|(\x{021A})|(\x{021B})</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>(?1S)(?2s)(?3T)(?4t)</code></strong></p>
</li>
<li>
<p dir="auto">Type in the <strong>correct</strong> file type in the <strong><code>Filters:</code></strong> zone</p>
</li>
<li>
<p dir="auto">Type in the correct <strong>absolute path</strong> name to your file, in the <strong><code>Directory:</code></strong> zone or click on the <strong><code>Follow current doc.</code></strong> box option</p>
</li>
<li>
<p dir="auto">Choose, optionally, the <strong><code>In all sub-folders</code></strong> box option, if you need to browse a file <strong>tree</strong></p>
</li>
<li>
<p dir="auto">Untick the <strong><code>Match whole word only</code></strong>, if necessary</p>
</li>
<li>
<p dir="auto">Tick the <strong><code>Match case</code></strong> box option ( <strong>Important</strong> )</p>
</li>
<li>
<p dir="auto">Select the <strong><code>Regular expression</code></strong> search mode</p>
</li>
<li>
<p dir="auto">Click on the <strong><code>Replace in Files</code></strong> button</p>
</li>
<li>
<p dir="auto">Confirm the <strong>Are you sure?</strong> dialog</p>
</li>
</ul>
</li>
</ul>
<p dir="auto"><strong>Notes</strong> :</p>
<ul>
<li>
<p dir="auto">In <strong>search</strong>, any of these <strong><code>4</code></strong> characters <strong><code>\x{####}</code></strong>  are stored in <strong>groups</strong>, from <strong><code>1</code></strong> to <strong><code>4</code></strong>, due to the <strong>embedded</strong> parentheses <strong><code>()</code></strong></p>
</li>
<li>
<p dir="auto">In <strong>replacement</strong>, due to the <strong>conditional</strong> replacement syntax <strong><code>(?#....)</code></strong>, where <strong><code>#</code></strong> is the <strong>number</strong> of the <strong>matched</strong> group, the appropriate <strong>standard</strong> replacement <strong>letter</strong>, <strong><code>S</code></strong>, <strong><code>s</code></strong>, <strong><code>T</code></strong> or <strong><code>t</code></strong>  is just rewritten !</p>
</li>
</ul>
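<p dir="auto">To illustrate how the <strong>group number</strong> drives the replacement, here is a rough Python analogue. Python has no <strong><code>(?#....)</code></strong> replacement syntax, so this sketch uses a callback and the match's <code>lastindex</code> instead; the names <code>REPLACEMENTS</code> and <code>strip_comma_below</code> are invented for the example.</p>
<pre><code class="language-python">import re

# Same four alternatives as the SEARCH field, each in its own group
PATTERN = re.compile("(\u0218)|(\u0219)|(\u021A)|(\u021B)")
# Replacement for group 1, 2, 3 and 4, in that order
REPLACEMENTS = ["S", "s", "T", "t"]


def strip_comma_below(text):
    """Replace each comma-below letter by the plain letter tied to its group number."""
    return PATTERN.sub(lambda m: REPLACEMENTS[m.lastindex - 1], text)


print(strip_comma_below("\u0218i \u021bar\u0103"))  # prints Si tară
</code></pre>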
<p dir="auto">Cheers,</p>
<p dir="auto">guy038</p>
]]></description><link>https://community.notepad-plus-plus.org/post/48270</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/48270</guid><dc:creator><![CDATA[guy038]]></dc:creator><pubDate>Sat, 02 Nov 2019 14:46:56 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;Find in files&quot; special characters not working anymore on Sat, 02 Nov 2019 10:51:00 GMT]]></title><description><![CDATA[<p dir="auto">Sorry, I was looking at another file and presumed all are UTF-8, but noticed later that they’re ANSI. That how the subbers made them, I have no idea.</p>
<p dir="auto">And yes, those are the Romanian letters, with comma below, not cedilla, but subbers in my country follow their own rules…</p>
<p dir="auto">I’ll upload two of the subtitles here <a href="https://gofile.io/?c=Hz4Uts" rel="nofollow ugc">https://gofile.io/?c=Hz4Uts</a> because I don’t have enough privileges to upload in this topic.</p>
<p dir="auto">I can do search and replace in the current open file and it works just fine, it’s just the function that finds in files that doesn’t seem to work…</p>
]]></description><link>https://community.notepad-plus-plus.org/post/48263</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/48263</guid><dc:creator><![CDATA[Pro Bg]]></dc:creator><pubDate>Sat, 02 Nov 2019 10:51:00 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;Find in files&quot; special characters not working anymore on Sat, 02 Nov 2019 13:47:52 GMT]]></title><description><![CDATA[<p dir="auto">Hello, <a class="plugin-mentions-user plugin-mentions-a" href="/user/pro-bg" aria-label="Profile: Pro-Bg">@<bdi>Pro-Bg</bdi></a> and <strong>All</strong>,</p>
<p dir="auto">You said, in your post :</p>
<blockquote>
<p dir="auto">The chars don’t show as they should in notepad++ either, though I’m on UTF-8, they are ș ț Ș Ț and I was …</p>
</blockquote>
<p dir="auto">So, seemingly, you refer to the <strong><code>4</code></strong> characters, below :</p>
<pre><code class="language-z">From the Latin Extended-B Unicode Script [ 0180 – 024F ] :

|  0218  |  Letter Ș  |  LATIN CAPITAL LETTER S WITH COMMA BELOW  |

|  0219  |  Letter ș  |  LATIN SMALL LETTER S WITH COMMA BELOW    |

|  021A  |  Letter Ț  |  LATIN CAPITAL LETTER T WITH COMMA BELOW  |

|  021B  |  Letter ț  |  LATIN SMALL LETTER T WITH COMMA BELOW    |
</code></pre>
<p dir="auto">See, to that purpose :</p>
<p dir="auto"><a href="http://www.unicode.org/charts/PDF/U0180.pdf" rel="nofollow ugc">http://www.unicode.org/charts/PDF/U0180.pdf</a></p>
<p dir="auto">However, after some searches and from the characters you see, <strong>effectively</strong>, in your file ( characters <strong><code> ª  º  Þ and þ</code></strong> ), due to an <strong>erroneous</strong> encoding, I suppose that you refer, <strong>instead</strong>, to these <strong><code>4</code></strong> characters :</p>
<pre><code class="language-z">From the Latin Extended-A Unicode Script [ 0100 - 017F ] :

|  015E  |  Letter Ş  |  LATIN CAPITAL LETTER S WITH CEDILLA  |

|  015F  |  Letter ş  |  LATIN SMALL LETTER S WITH CEDILLA    |

|  0162  |  Letter Ţ  |  LATIN CAPITAL LETTER T WITH CEDILLA  |

|  0163  |  Letter ţ  |  LATIN SMALL LETTER T WITH CEDILLA    |
</code></pre>
<p dir="auto">See, to that purpose :</p>
<p dir="auto"><a href="http://www.unicode.org/charts/PDF/U0100.pdf" rel="nofollow ugc">http://www.unicode.org/charts/PDF/U0100.pdf</a></p>
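<p dir="auto">If you are unsure which of the <strong>two</strong> sets a given character belongs to, its official <strong>Unicode</strong> name settles the question. A short sketch using Python's standard <code>unicodedata</code> module ( the helper name <code>describe</code> is hypothetical ) :</p>
<pre><code class="language-python">import unicodedata


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"{ord(ch):04X} {unicodedata.name(ch)}"


# First the four comma-below letters, then the four cedilla letters
for ch in "\u0218\u0219\u021A\u021B\u015E\u015F\u0162\u0163":
    print(describe(ch))
</code></pre>
<p dir="auto">Pasting a character copied from your own file in place of the escapes shows, at once, whether it is a <strong>comma below</strong> or a <strong>cedilla</strong> letter.</p>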
<p dir="auto">Note that if you intend to <strong>copy / paste</strong> some characters from these <strong><code>PDF</code></strong> files, of the <strong>Unicode Consortium</strong>, I advise you to <strong>download</strong> them, <strong>first</strong>, because, depending on your <strong>browser</strong>, some characters, although <strong>well</strong> displayed, may <strong>not</strong> be correctly pasted :-((</p>
<p dir="auto">So, <strong>before</strong> going any further, which <strong>kind</strong> of characters are you referring to ? Indeed, depending on the <strong>set</strong> of characters used, we may need a <strong>different</strong> font, which <strong>properly</strong> handles these characters and <strong>correctly</strong> displays their <strong>glyphs</strong> !</p>
<p dir="auto">See you later,</p>
<p dir="auto">Best Regards,</p>
<p dir="auto">guy038</p>
]]></description><link>https://community.notepad-plus-plus.org/post/48247</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/48247</guid><dc:creator><![CDATA[guy038]]></dc:creator><pubDate>Sat, 02 Nov 2019 13:47:52 GMT</pubDate></item></channel></rss>