    Issue with Polish letters

      Ekopalypse

      That means, select Windows-1250


      and if it looks OK, convert to UTF-8, save it, done.

        nightznero @Ekopalypse

        @Ekopalypse Thanks Eko a lot!

          Ekopalypse @nightznero

          @nightznero - my pleasure.

            Alan Kilborn

            I didn’t follow super-closely, but was there a reason not to convert to UTF-8 and then stay with that?

              Ekopalypse @Alan Kilborn

              @Alan-Kilborn

              We had to find the right (ANSI) encoding first; otherwise the conversion to UTF-8 would have produced incorrect text.
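A minimal Python sketch of why the source encoding has to be right before converting. It assumes Python's `cp1250` and `cp1252` codecs stand in for the Windows-1250 and Windows-1252 code pages, and it uses the word leży from this thread as sample data; the variable names are illustrative only.

```python
# Bytes of the Polish word "leży" as stored in a Windows-1250 (ANSI) file.
raw = "leży".encode("cp1250")          # b'le\xbfy'

# Wrong guess: reading the bytes as Windows-1252 turns 0xBF into "¿".
wrong_text = raw.decode("cp1252")      # 'le¿y'

# Converting to UTF-8 at this point bakes the wrong character in.
wrong_utf8 = wrong_text.encode("utf-8")

# Right guess: decode as Windows-1250 first, then convert to UTF-8.
right_text = raw.decode("cp1250")      # 'leży'
right_utf8 = right_text.encode("utf-8")
```

The two UTF-8 results differ, which is the "incorrect text" case: once the wrong guess is saved as UTF-8, the file no longer contains the original 0xBF byte.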

                guy038

                Hello, @nightznero, @alan-kilborn, @ekopalypse and All,

                Encoding notions are really difficult to handle and are usually a nightmare for most of us!

                Starting from @nightznero’s problem, I tried to build a method for guessing the right encoding of an ANSI-encoded file whose characters are displayed wrongly!


                • First, copy all of the text below to the clipboard:
                •--------•---------•---------•---------•---------•---------•---------•---------•
                |  Code  | Win1250 | Win1251 | Win1252 | Win1253 | Win1254 | Win1257 | Win1258 |
                •--------•---------•---------•---------•---------•---------•---------•---------•
                |   80   |    €    |    Ђ    |    €    |    €    |    €    |    €    |    €    |
                |   81   |    ◊    |    Ѓ    |    ◊    |    ◊    |    ◊    |    ◊    |    ◊    |
                |   82   |    ‚    |    ‚    |    ‚    |    ‚    |    ‚    |    ‚    |    ‚    |
                |   83   |    ◊    |    ѓ    |    ƒ    |    ƒ    |    ƒ    |    ◊    |    ƒ    |
                |   84   |    „    |    „    |    „    |    „    |    „    |    „    |    „    |
                |   85   |    …    |    …    |    …    |    …    |    …    |    …    |    …    |
                |   86   |    †    |    †    |    †    |    †    |    †    |    †    |    †    |
                |   87   |    ‡    |    ‡    |    ‡    |    ‡    |    ‡    |    ‡    |    ‡    |
                |   88   |    ◊    |    €    |    ˆ    |    ◊    |    ˆ    |    ◊    |    ˆ    |
                |   89   |    ‰    |    ‰    |    ‰    |    ‰    |    ‰    |    ‰    |    ‰    |
                |   8A   |    Š    |    Љ    |    Š    |    ◊    |    Š    |    ◊    |    ◊    |
                |   8B   |    ‹    |    ‹    |    ‹    |    ‹    |    ‹    |    ‹    |    ‹    |
                |   8C   |    Ś    |    Њ    |    Œ    |    ◊    |    Œ    |    ◊    |    Œ    |
                |   8D   |    Ť    |    Ќ    |    ◊    |    ◊    |    ◊    |    ¨    |    ◊    |
                |   8E   |    Ž    |    Ћ    |    Ž    |    ◊    |    ◊    |    ˇ    |    ◊    |
                |   8F   |    Ź    |    Џ    |    ◊    |    ◊    |    ◊    |    ¸    |    ◊    |
                |   90   |    ◊    |    ђ    |    ◊    |    ◊    |    ◊    |    ◊    |    ◊    |
                |   91   |    ‘    |    ‘    |    ‘    |    ‘    |    ‘    |    ‘    |    ‘    |
                |   92   |    ’    |    ’    |    ’    |    ’    |    ’    |    ’    |    ’    |
                |   93   |    “    |    “    |    “    |    “    |    “    |    “    |    “    |
                |   94   |    ”    |    ”    |    ”    |    ”    |    ”    |    ”    |    ”    |
                |   95   |    •    |    •    |    •    |    •    |    •    |    •    |    •    |
                |   96   |    –    |    –    |    –    |    –    |    –    |    –    |    –    |
                |   97   |    —    |    —    |    —    |    —    |    —    |    —    |    —    |
                |   98   |    ◊    |    ◊    |    ˜    |    ◊    |    ˜    |    ◊    |    ˜    |
                |   99   |    ™    |    ™    |    ™    |    ™    |    ™    |    ™    |    ™    |
                |   9A   |    š    |    љ    |    š    |    ◊    |    š    |    ◊    |    ◊    |
                |   9B   |    ›    |    ›    |    ›    |    ›    |    ›    |    ›    |    ›    |
                |   9C   |    ś    |    њ    |    œ    |    ◊    |    œ    |    ◊    |    œ    |
                |   9D   |    ť    |    ќ    |    ◊    |    ◊    |    ◊    |    ¯    |    ◊    |
                |   9E   |    ž    |    ћ    |    ž    |    ◊    |    ◊    |    ˛    |    ◊    |
                |   9F   |    ź    |    џ    |    Ÿ    |    ◊    |    Ÿ    |    ◊    |    Ÿ    |
                •--------•---------•---------•---------•---------•---------•---------•---------•
                |   A0   |         |         |         |         |         |         |         |
                |   A1   |    ˇ    |    Ў    |    ¡    |    ΅    |    ¡    |    ◊    |    ¡    |
                |   A2   |    ˘    |    ў    |    ¢    |    Ά    |    ¢    |    ¢    |    ¢    |
                |   A3   |    Ł    |    Ј    |    £    |    £    |    £    |    £    |    £    |
                |   A4   |    ¤    |    ¤    |    ¤    |    ¤    |    ¤    |    ¤    |    ¤    |
                |   A5   |    Ą    |    Ґ    |    ¥    |    ¥    |    ¥    |    ◊    |    ¥    |
                |   A6   |    ¦    |    ¦    |    ¦    |    ¦    |    ¦    |    ¦    |    ¦    |
                |   A7   |    §    |    §    |    §    |    §    |    §    |    §    |    §    |
                |   A8   |    ¨    |    Ё    |    ¨    |    ¨    |    ¨    |    Ø    |    ¨    |
                |   A9   |    ©    |    ©    |    ©    |    ©    |    ©    |    ©    |    ©    |
                |   AA   |    Ş    |    Є    |    ª    |    ◊    |    ª    |    Ŗ    |    ª    |
                |   AB   |    «    |    «    |    «    |    «    |    «    |    «    |    «    |
                |   AC   |    ¬    |    ¬    |    ¬    |    ¬    |    ¬    |    ¬    |    ¬    |
                |   AD   |    ­    |    ­    |    ­    |    ­    |    ­    |    ­    |    ­    |
                |   AE   |    ®    |    ®    |    ®    |    ®    |    ®    |    ®    |    ®    |
                |   AF   |    Ż    |    Ї    |    ¯    |    ―    |    ¯    |    Æ    |    ¯    |
                |   B0   |    °    |    °    |    °    |    °    |    °    |    °    |    °    |
                |   B1   |    ±    |    ±    |    ±    |    ±    |    ±    |    ±    |    ±    |
                |   B2   |    ˛    |    І    |    ²    |    ²    |    ²    |    ²    |    ²    |
                |   B3   |    ł    |    і    |    ³    |    ³    |    ³    |    ³    |    ³    |
                |   B4   |    ´    |    ґ    |    ´    |    ΄    |    ´    |    ´    |    ´    |
                |   B5   |    µ    |    µ    |    µ    |    µ    |    µ    |    µ    |    µ    |
                |   B6   |    ¶    |    ¶    |    ¶    |    ¶    |    ¶    |    ¶    |    ¶    |
                |   B7   |    ·    |    ·    |    ·    |    ·    |    ·    |    ·    |    ·    |
                |   B8   |    ¸    |    ё    |    ¸    |    Έ    |    ¸    |    ø    |    ¸    |
                |   B9   |    ą    |    №    |    ¹    |    Ή    |    ¹    |    ¹    |    ¹    |
                |   BA   |    ş    |    є    |    º    |    Ί    |    º    |    ŗ    |    º    |
                |   BB   |    »    |    »    |    »    |    »    |    »    |    »    |    »    |
                |   BC   |    Ľ    |    ј    |    ¼    |    Ό    |    ¼    |    ¼    |    ¼    |
                |   BD   |    ˝    |    Ѕ    |    ½    |    ½    |    ½    |    ½    |    ½    |
                |   BE   |    ľ    |    ѕ    |    ¾    |    Ύ    |    ¾    |    ¾    |    ¾    |
                |   BF   |    ż    |    ї    |    ¿    |    Ώ    |    ¿    |    æ    |    ¿    |
                •--------•---------•---------•---------•---------•---------•---------•---------•
                |   C0   |    Ŕ    |    А    |    À    |    ΐ    |    À    |    Ą    |    À    |
                |   C1   |    Á    |    Б    |    Á    |    Α    |    Á    |    Į    |    Á    |
                |   C2   |    Â    |    В    |    Â    |    Β    |    Â    |    Ā    |    Â    |
                |   C3   |    Ă    |    Г    |    Ã    |    Γ    |    Ã    |    Ć    |    Ă    |
                |   C4   |    Ä    |    Д    |    Ä    |    Δ    |    Ä    |    Ä    |    Ä    |
                |   C5   |    Ĺ    |    Е    |    Å    |    Ε    |    Å    |    Å    |    Å    |
                |   C6   |    Ć    |    Ж    |    Æ    |    Ζ    |    Æ    |    Ę    |    Æ    |
                |   C7   |    Ç    |    З    |    Ç    |    Η    |    Ç    |    Ē    |    Ç    |
                |   C8   |    Č    |    И    |    È    |    Θ    |    È    |    Č    |    È    |
                |   C9   |    É    |    Й    |    É    |    Ι    |    É    |    É    |    É    |
                |   CA   |    Ę    |    К    |    Ê    |    Κ    |    Ê    |    Ź    |    Ê    |
                |   CB   |    Ë    |    Л    |    Ë    |    Λ    |    Ë    |    Ė    |    Ë    |
                |   CC   |    Ě    |    М    |    Ì    |    Μ    |    Ì    |    Ģ    |    ̀     |
                |   CD   |    Í    |    Н    |    Í    |    Ν    |    Í    |    Ķ    |    Í    |
                |   CE   |    Î    |    О    |    Î    |    Ξ    |    Î    |    Ī    |    Î    |
                |   CF   |    Ď    |    П    |    Ï    |    Ο    |    Ï    |    Ļ    |    Ï    |
                |   D0   |    Đ    |    Р    |    Ð    |    Π    |    Ğ    |    Š    |    Đ    |
                |   D1   |    Ń    |    С    |    Ñ    |    Ρ    |    Ñ    |    Ń    |    Ñ    |
                |   D2   |    Ň    |    Т    |    Ò    |    ◊    |    Ò    |    Ņ    |    ̉     |
                |   D3   |    Ó    |    У    |    Ó    |    Σ    |    Ó    |    Ó    |    Ó    |
                |   D4   |    Ô    |    Ф    |    Ô    |    Τ    |    Ô    |    Ō    |    Ô    |
                |   D5   |    Ő    |    Х    |    Õ    |    Υ    |    Õ    |    Õ    |    Ơ    |
                |   D6   |    Ö    |    Ц    |    Ö    |    Φ    |    Ö    |    Ö    |    Ö    |
                |   D7   |    ×    |    Ч    |    ×    |    Χ    |    ×    |    ×    |    ×    |
                |   D8   |    Ř    |    Ш    |    Ø    |    Ψ    |    Ø    |    Ų    |    Ø    |
                |   D9   |    Ů    |    Щ    |    Ù    |    Ω    |    Ù    |    Ł    |    Ù    |
                |   DA   |    Ú    |    Ъ    |    Ú    |    Ϊ    |    Ú    |    Ś    |    Ú    |
                |   DB   |    Ű    |    Ы    |    Û    |    Ϋ    |    Û    |    Ū    |    Û    |
                |   DC   |    Ü    |    Ь    |    Ü    |    ά    |    Ü    |    Ü    |    Ü    |
                |   DD   |    Ý    |    Э    |    Ý    |    έ    |    İ    |    Ż    |    Ư    |
                |   DE   |    Ţ    |    Ю    |    Þ    |    ή    |    Ş    |    Ž    |    ̃     |
                |   DF   |    ß    |    Я    |    ß    |    ί    |    ß    |    ß    |    ß    |
                •--------•---------•---------•---------•---------•---------•---------•---------•
                |   E0   |    ŕ    |    а    |    à    |    ΰ    |    à    |    ą    |    à    |
                |   E1   |    á    |    б    |    á    |    α    |    á    |    į    |    á    |
                |   E2   |    â    |    в    |    â    |    β    |    â    |    ā    |    â    |
                |   E3   |    ă    |    г    |    ã    |    γ    |    ã    |    ć    |    ă    |
                |   E4   |    ä    |    д    |    ä    |    δ    |    ä    |    ä    |    ä    |
                |   E5   |    ĺ    |    е    |    å    |    ε    |    å    |    å    |    å    |
                |   E6   |    ć    |    ж    |    æ    |    ζ    |    æ    |    ę    |    æ    |
                |   E7   |    ç    |    з    |    ç    |    η    |    ç    |    ē    |    ç    |
                |   E8   |    č    |    и    |    è    |    θ    |    è    |    č    |    è    |
                |   E9   |    é    |    й    |    é    |    ι    |    é    |    é    |    é    |
                |   EA   |    ę    |    к    |    ê    |    κ    |    ê    |    ź    |    ê    |
                |   EB   |    ë    |    л    |    ë    |    λ    |    ë    |    ė    |    ë    |
                |   EC   |    ě    |    м    |    ì    |    μ    |    ì    |    ģ    |    ́     |
                |   ED   |    í    |    н    |    í    |    ν    |    í    |    ķ    |    í    |
                |   EE   |    î    |    о    |    î    |    ξ    |    î    |    ī    |    î    |
                |   EF   |    ď    |    п    |    ï    |    ο    |    ï    |    ļ    |    ï    |
                |   F0   |    đ    |    р    |    ð    |    π    |    ğ    |    š    |    đ    |
                |   F1   |    ń    |    с    |    ñ    |    ρ    |    ñ    |    ń    |    ñ    |
                |   F2   |    ň    |    т    |    ò    |    ς    |    ò    |    ņ    |    ̣     |
                |   F3   |    ó    |    у    |    ó    |    σ    |    ó    |    ó    |    ó    |
                |   F4   |    ô    |    ф    |    ô    |    τ    |    ô    |    ō    |    ô    |
                |   F5   |    ő    |    х    |    õ    |    υ    |    õ    |    õ    |    ơ    |
                |   F6   |    ö    |    ц    |    ö    |    φ    |    ö    |    ö    |    ö    |
                |   F7   |    ÷    |    ч    |    ÷    |    χ    |    ÷    |    ÷    |    ÷    |
                |   F8   |    ř    |    ш    |    ø    |    ψ    |    ø    |    ų    |    ø    |
                |   F9   |    ů    |    щ    |    ù    |    ω    |    ù    |    ł    |    ù    |
                |   FA   |    ú    |    ъ    |    ú    |    ϊ    |    ú    |    ś    |    ú    |
                |   FB   |    ű    |    ы    |    û    |    ϋ    |    û    |    ū    |    û    |
                |   FC   |    ü    |    ь    |    ü    |    ό    |    ü    |    ü    |    ü    |
                |   FD   |    ý    |    э    |    ý    |    ύ    |    ı    |    ż    |    ư    |
                |   FE   |    ţ    |    ю    |    þ    |    ώ    |    ş    |    ž    |    ₫    |
                |   FF   |    ˙    |    я    |    ÿ    |    ◊    |    ÿ    |    ˙    |    ÿ    |
                •--------•---------•---------•---------•---------•---------•---------•---------•
                

                Note that, in this table, the ◊ character means that no character is defined at that code in the corresponding encoding!


                • Open a new N++ tab ( Ctrl + N )

                • Run the command Encoding > Convert to UTF-8-BOM ( IMPORTANT )

                • Paste the clipboard contents in that new tab ( Ctrl + V )

                • Save this file as Windows_European_Encodings.txt

                • In your ANSI file, go to the first word that is not displayed correctly ( le¿y in @nightznero’s text ) and select the wrong character ( ¿ )

                • Open the Find dialog ( Ctrl + F )

                • Tick the Match case and the Wrap around options

                • Select the Normal search mode

                • Switch back to the Windows_European_Encodings.txt file, that we just created

                • Click on the Find Next button

                => The caret should be on the line :

                |   BF   |    ż    |    ї    |    ¿    |    Ώ    |    ¿    |    æ    |    ¿    |
                

                The correct character, the one that should appear instead of ¿, must be somewhere on that line!

                And @nightznero would easily have spotted that the right character was ż, forming the word leży! Now, as ż appears in the Windows-1250 column:

                • Select the command Encoding > Character Sets > Central European > Windows-1250

                => All the text should now be fully readable ;-))

                • So, convert this file to UTF-8 by running one of these two commands:

                  • Encoding > Convert to UTF-8

                  • Encoding > Convert to UTF-8-BOM

                • Save the changed contents ( Ctrl + S )
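The lookup in the steps above can also be sketched in Python. This is only an illustration, assuming Python's `cp125x` codecs match the table and that the text is currently displayed as Windows-1252; the names `CODE_PAGES` and `candidates` are made up for this example.

```python
# Windows code pages covered by the table above (Python codec names).
CODE_PAGES = ["cp1250", "cp1251", "cp1252", "cp1253",
              "cp1254", "cp1257", "cp1258"]

def candidates(wrong_char, shown_as="cp1252"):
    """Return {code page: character} for the byte behind wrong_char.

    shown_as is the code page the text is currently displayed in
    (an assumption; adjust it to your own ANSI code page).
    """
    byte = wrong_char.encode(shown_as)   # e.g. "¿" -> b'\xbf'
    # Bytes that a code page leaves undefined come back as U+FFFD.
    return {cp: byte.decode(cp, errors="replace") for cp in CODE_PAGES}

row = candidates("¿")   # same information as the BF line of the table
```

Here `row["cp1250"]` is ż, which points to Windows-1250 in the same way as the manual search does.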


                Note that we could have searched for any of the other characters listed below, which stand in for the accented characters of @nightznero’s text:

                    •--------•---------•      •---------•
                    |  Code  | Win1252 |      | Win1250 |
                    •--------•---------•      •---------•
                    |   8C   |    Œ    |      |    Ś    |
                    |   9C   |    œ    |      |    ś    |
                    |   9F   |    Ÿ    |      |    ź    |
                    |   A3   |    £    |      |    Ł    |
                    |   A5   |    ¥    |      |    Ą    |
                    |   AF   |    ¯    |      |    Ż    |
                    |   B3   |    ³    |  =>  |    ł    |
                    |   B9   |    ¹    |      |    ą    |
                    |   BF   |    ¿    |      |    ż    |
                    |   C6   |    Æ    |      |    Ć    |
                    |   E6   |    æ    |      |    ć    |
                    |   EA   |    ê    |      |    ę    |
                    |   F1   |    ñ    |      |    ń    |
                    •--------•---------•      •---------•
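The same pairs can be reproduced with a short Python sketch: encode each Polish letter as Windows-1250 and read the byte back as Windows-1252. It assumes Python's `cp1250` and `cp1252` codecs match the two columns; `POLISH` and `as_seen_in_1252` are hypothetical names.

```python
POLISH = "ŚśźŁĄŻłążĆćęń"   # the letters listed in the table above

def as_seen_in_1252(ch):
    """Return (hex code, Win1252 character) for a Windows-1250 character."""
    byte = ch.encode("cp1250")
    return byte.hex().upper(), byte.decode("cp1252")

# Maps each correct letter to its code and to what a wrong
# Windows-1252 interpretation of the file would display instead.
pairs = {ch: as_seen_in_1252(ch) for ch in POLISH}
```

For example, `pairs["ż"]` gives the code BF and the character ¿, matching the BF row.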
                

                BTW, I found a character that is different in every one of the Windows-125# encodings: the ANSI char \x{de}. To type it, simply hold down the Alt key and press, in turn, the keys 0, 2, 2 and 2 on the numeric keypad!

                •--------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•
                |  Code  |   Win-1250   |   Win-1251   |   Win-1252   |   Win-1253   |   Win-1254   |   Win-1257   |   Win-1258   |   Win-1255   |   Win-1256   |
                |  ALT   •--------------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•
                | + 0222 |  Centr. Eur. |   Cyrillic   |  West. Eur.  |    Greek     |   Turkish    |    Baltic    |  Vietnamese  |    Hebrew    |    Arabic    |
                •--------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•
                |   DE   |      Ţ       |      Ю       |      Þ       |      ή       |      Ş       |      Ž       |      ̃        |  Undefined   |      ق       |
                •--------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•--------------•
                

                So, for instance, if you type the \x{de} character in an ANSI-encoded file:

                • If the character displayed is ή, this means that your current ANSI codepage is probably Win-1253

                • If the character displayed is Ţ, this means that your current ANSI codepage must be Win-1250

                Just run the command ? > Debug Info... to verify !
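A small Python sketch of the same probe, assuming Python's `cp125x` codecs agree with the table above; the names `PROBE`, `NAMES`, `glyphs` and `guess_code_page` are illustrative and not part of Notepad++.

```python
PROBE = b"\xde"   # the byte typed with Alt+0222

NAMES = {
    "cp1250": "Central European", "cp1251": "Cyrillic",
    "cp1252": "Western European", "cp1253": "Greek",
    "cp1254": "Turkish", "cp1255": "Hebrew", "cp1256": "Arabic",
    "cp1257": "Baltic", "cp1258": "Vietnamese",
}

# Undefined bytes (0xDE in cp1255) decode to U+FFFD with errors="replace".
glyphs = {cp: PROBE.decode(cp, errors="replace") for cp in NAMES}

def guess_code_page(displayed):
    """Return the code pages in which byte 0xDE displays as `displayed`."""
    return [cp for cp, glyph in glyphs.items() if glyph == displayed]
```

Since every code page gives a different result for this byte, `guess_code_page("ή")` returns only `cp1253` and `guess_code_page("Ţ")` returns only `cp1250`.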


                Finally, the statistics at this link should convince you to always work with UTF-8 encoded files! ( ~ 96.7 % of all files served by websites! )

                You can also click the yearly list on the left side of that page, which clearly shows the growth of the UTF-8 encoding and the decline of all other encodings over the last ten years!

                Now, to get the contents of the Windows encodings as text files, see:    https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS

                Best Regards,

                guy038

                  ArkadiuszMichalski

                  For Polish he should use ISO 8859-2 (Eastern European), but nowadays I would rather recommend UTF-8.
                  @nightznero Is this your own work?

