How to view Bengali characters instead of their HTML encoding

ms word, bengali, utf-8, html encoding
6 Posts 3 Posters 3.8k Views
    Nabodita Gangully
    last edited by Jun 2, 2017, 5:24 PM

    I am creating an ebook of content written in Bengali. The file is created in MS Word (using the on-screen keyboard) and saved as an .htm file (Save As Webpage - Filtered). However, when I open this .htm file in Notepad++, I see the html equivalents of the Bengali characters instead of the characters themselves.

    Notepad++ is allowing me to type in Bengali in the Editor so I suspect this may not be a font issue. Does anyone know what I have to do to display the Bengali characters? Thanks.

      Claudia Frank @Nabodita Gangully
      last edited by Jun 2, 2017, 6:06 PM

      @Nabodita-Gangully

      Notepad++ shows what Word actually saved. Because you chose to save the file
      as HTML, Word replaced the Bengali characters with their HTML-encoded equivalents. That is a good thing, because it means most browsers shouldn’t have any trouble
      displaying the page correctly.

      If you want to see the “real” glyphs, then you need to replace the HTML-encoded versions with
      the actual characters in the encoding you are using, but this could mean that a browser might have
      trouble displaying the page correctly.
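
To illustrate the point about the two forms being the same text, here is a minimal sketch. It assumes Word wrote decimal numeric character references such as `&#2476;`; the sample string is made up for illustration and is not taken from the poster's file.

```python
import html

# Hypothetical sample: the word "বাংলা" written as decimal numeric
# character references, one reference per Unicode code point.
encoded = "&#2476;&#2494;&#2434;&#2482;&#2494;"

# html.unescape turns each reference back into the character it names.
decoded = html.unescape(encoded)

# Print the code points rather than the glyphs, so the output does not
# depend on the console's font or encoding.
print([f"U+{ord(ch):04X}" for ch in decoded])
# → ['U+09AC', 'U+09BE', 'U+0982', 'U+09B2', 'U+09BE']
```

A browser performs the same decoding when it renders the page, which is why the encoded form displays correctly there but looks like markup in a text editor.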

      Cheers
      Claudia

        Nabodita Gangully
        last edited by Jun 2, 2017, 6:36 PM

        @Claudia-Frank
        Not very sure what would be the best solution under the circumstances. Should I write a script to convert the file? I believe there is also a plugin called HTML Tags which can do this for me…

        Thank you for responding so promptly!

          Claudia Frank @Nabodita Gangully
          last edited by Jun 2, 2017, 7:04 PM

          @Nabodita-Gangully

          I have to admit that I don’t have any experience in creating ebooks.
          Does the ebook format specify a certain encoding? UTF-8?
          Or is it basically html?

          If the HTML Tags plugin can do this, then why not use it?
          If it can’t, I assume it should be possible to write two Python scripts
          that do something like this:

          The first converts the HTML-encoded characters into the “real” UTF-8 characters
          so you can start writing. Once you’re finished, the second script
          converts them back to HTML-encoded strings.
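
The two-script idea could be sketched roughly as below. This is an illustration under assumptions, not a tested plugin script: it assumes the file contains numeric character references (decimal or hexadecimal), the function names `refs_to_chars` and `chars_to_refs` are made up here, and it is written as plain Python 3. Hooking it up to the editor buffer is left out, and the Python version bundled with the plugin may require adjustments.

```python
import re

# Matches decimal (&#2476;) and hexadecimal (&#x9AC;) numeric references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def refs_to_chars(text):
    """Replace numeric references to non-ASCII characters with the characters."""
    def repl(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave ASCII references (e.g. &#60; for "<"), surrogates, and
        # out-of-range values untouched so the markup stays valid.
        if code < 128 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return match.group(0)
        return chr(code)
    return _NUMERIC_REF.sub(repl, text)


def chars_to_refs(text):
    """Replace every non-ASCII character with a decimal numeric reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Hypothetical sample line; named entities such as &amp; are left alone.
sample = '<p class="x">&#2476;&#2494;&#2434;&#2482;&#2494; &amp; more</p>'
readable = refs_to_chars(sample)
restored = chars_to_refs(readable)
print(restored == sample)
```

Only numeric references are touched on purpose: decoding named entities like `&lt;` or `&amp;` as well would change the meaning of the markup.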

          Of course, the PythonScript plugin needs to be installed in this case.
          Let me know if you want to go this way.

          Cheers
          Claudia

            Nabodita Gangully
            last edited by Jun 2, 2017, 7:33 PM

            I read somewhere that an epub is a website in a box (can’t remember where I came across that phrase) and it is very apt. Yes, it’s basically XHTML 1.1, with CSS for styling.

            Will post back with test results; I am familiar with Python so that may be the way to go.

            Thanks again!

              Pouemes44
              last edited by Jun 5, 2017, 5:05 PM

              Hello Nabodita,

              Can you paste here some of the words that you have to understand?
