Python - string encoding

6 Posts 2 Posters 5.2k Views
    DaveyD
    last edited by Jan 13, 2016, 9:36 PM

    Hi,
    I have a simple Python script that takes the selected text, reverses it, and replaces the selection with the reversed string.
    With standard English letters it works perfectly, but when I have Hebrew letters selected, some characters are not inserted correctly (I get hex byte markers, e.g. xA3, in a blue box).

    The document is encoded with OEM 862.
    Here is the script that I use:

    text = editor.getSelText()
    editor.replaceSel(text[::-1])
    

    If anyone can help with this, I would appreciate it.
    Thanks,
    Davey
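
A note on the symptom, offered as a hedged sketch rather than a confirmed diagnosis: an xA3 marker is what one would expect if the selection reaches the script as UTF-8 bytes and `text[::-1]` reverses those bytes one at a time. That the buffer holds UTF-8 is an assumption here. The three-letter word (kaf, samekh, final pe) is a made-up example, and `hex_bytes` is a helper introduced only for illustration.

```python
# -*- coding: utf-8 -*-
# Assumption: the selected text arrives as UTF-8 bytes, where each Hebrew
# letter occupies two bytes.

def hex_bytes(data):
    """Return the bytes of `data` as space-separated hex pairs (illustrative helper)."""
    return ' '.join('%02X' % b for b in bytearray(data))

# Hypothetical selection: kaf, samekh, final pe.
word = u'\u05db\u05e1\u05e3'
raw = word.encode('utf-8')

reversed_bytes = raw[::-1]                    # what text[::-1] does to a byte string
reversed_chars = word[::-1].encode('utf-8')   # reverse characters first, then encode

print(hex_bytes(raw))             # D7 9B D7 A1 D7 A3
print(hex_bytes(reversed_bytes))  # A3 D7 A1 D7 9B D7
print(hex_bytes(reversed_chars))  # D7 A3 D7 A1 D7 9B
```

In the byte-reversed result the leading A3 is a continuation byte with no lead byte in front of it, and the trailing D7 is a lead byte with nothing after it, so neither can be drawn as a letter. Plain English text is unaffected because each ASCII character is a single byte.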

      Claudia Frank @DaveyD
      last edited by Claudia Frank Jan 13, 2016, 11:37 PM

      Hello Davey,

      Did you read this?

      Cheers
      Claudia

        DaveyD
        last edited by Jan 14, 2016, 1:39 AM

        Hi Claudia - thanks for the link!
        I did read about that (not there, but on stackoverflow.com), and I tried and tried but couldn’t figure out what to do!
        Now that you pointed to it again, I put more time into it… and finally found a way to get it to work!
        However, I have no clue why it works!
        This is what I have:

        • A text file encoded in OEM 862 (‘cp862’ in Python)
        • A script with the default encoding (ANSI)

        So I would think that I have to do some kind of conversion between cp862 and utf-8 (since I assume Python works in utf-8).
        I tried:

        text.decode('cp862')
        text.encode('cp862')
        
        text.decode('cp862')
        text.encode('utf-8')
        
        text.decode('utf-8')
        text.encode('cp862') 
        

        None of the above worked!
        Now, I tried the exact example given in the help file you pointed to above:

        text.decode('utf-8')
        text.encode('utf-8')
        

        And this worked perfectly!
        This is great! My script is now working!
        But… why?! Why does this make sense?
        I am so confused with this encoding business…! :)

        Thanks,
        Davey
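
One possible explanation, stated as an assumption rather than something confirmed in this thread: the bytes the editor hands to the script may be UTF-8 even though the file on disk is OEM 862. In that case a decode/encode pair has to use the encoding of the buffer, not of the file. The sketch below shows the idea. `reverse_selection` and its `buffer_encoding` parameter are hypothetical names, and the sample bytes are invented.

```python
# -*- coding: utf-8 -*-

def reverse_selection(raw, buffer_encoding='utf-8'):
    """Reverse `raw` (a byte string) character by character.

    `buffer_encoding` is an assumption: the encoding of the bytes the editor
    passes to the script, which need not match the file's encoding on disk.
    """
    text = raw.decode(buffer_encoding)          # bytes -> characters
    return text[::-1].encode(buffer_encoding)   # reverse, then characters -> bytes

# Hypothetical selection: kaf, samekh, final pe as UTF-8 bytes.
selection = b'\xd7\x9b\xd7\xa1\xd7\xa3'

fixed = reverse_selection(selection)

# Decoding the same bytes as cp862 does not fail, but every byte is read as
# its own character, so three letters turn into six unrelated symbols.
as_cp862 = selection.decode('cp862')

# Going the other way, real cp862 bytes for these letters are one byte each,
# which is not what a UTF-8 buffer would expect to receive back.
cp862_bytes = u'\u05db\u05e1\u05e3'.encode('cp862')

print(len(selection), len(as_cp862), len(cp862_bytes))
```

Under the same assumption, the original two-line script would become `editor.replaceSel(reverse_selection(editor.getSelText()))`. Reversing character by character still separates vowel points from their base letters if the text contains any, so this sketch only covers unpointed text.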

          Claudia Frank @DaveyD
          last edited by Claudia Frank Jan 14, 2016, 11:18 PM

          Hello Davey,

          What should I say, it’s like regexes.
          Whenever you think you finally understand their behaviour,
          something happens that you didn’t expect.
          Tbh, I can’t give you any advice on this, as I’m in the same situation as you: why does it do what it does? ;-)

          But c’mon, be happy it works. There will be a tomorrow when it fails again,
          so take the chance to celebrate - hip hip hooray ;-)

          Cheers
          Claudia

            DaveyD
            last edited by Jan 15, 2016, 2:54 AM

            :) :) :)
            Thanks Claudia - you made me laugh!
            I just hope it doesn’t happen any time soon…! :)

            Cheers
            Davey

              Claudia Frank
              last edited by Jan 15, 2016, 3:02 AM

              you made me laugh!

              Well, then my work is done - time to party

              Cheers
              Claudia

              The Community of users of the Notepad++ text editor.