
    How to view Bengali characters instead of their HTML encoding

    Tags: ms word, bengali, utf-8, html encoding
    • Nabodita Gangully

      I am creating an ebook of content written in Bengali. The file was created in MS Word (using the on-screen keyboard) and saved as an .htm file (Save As Web Page, Filtered). However, when I open this .htm file in Notepad++, I see the HTML-encoded equivalents of the Bengali characters instead of the characters themselves.

      Notepad++ lets me type Bengali in the editor, so I suspect this is not a font issue. Does anyone know what I have to do to display the Bengali characters? Thanks.

      • Claudia Frank

        @Nabodita-Gangully

        Notepad++ shows exactly what Word saved. Because you chose to save the file as HTML,
        Word decided to replace the Bengali characters with their HTML-encoded versions. That is actually nice, as it means that most browsers shouldn’t have an issue
        displaying the page correctly.

        If you want to see the “real” glyphs, then you need to replace the HTML-encoded versions with
        the actual characters in the encoding you use, but this could mean that a browser might have
        an issue displaying the page correctly.
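
        To make this concrete, here is a small Python 3 sketch of what an “html encoded” character looks like. It assumes Word wrote decimal numeric character references (such as `&#2476;`), which is an assumption about the file, and the sample word is chosen only for illustration; it is not taken from the original document.

```python
import html

# Sample Bengali word ("Bangla"), written with escapes so the code points are explicit.
# This word is an illustrative assumption, not text from the original file.
word = "\u09ac\u09be\u0982\u09b2\u09be"

# Each character becomes "&#<decimal code point>;"
encoded = "".join(f"&#{ord(ch)};" for ch in word)
print(encoded)  # &#2476;&#2494;&#2434;&#2482;&#2494;

# The standard library's html.unescape turns the references back into characters.
decoded = html.unescape(encoded)
print(decoded == word)  # True
```

        A browser does the same decoding when it renders the page, which is why the encoded file displays correctly even though it looks unreadable in a text editor.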

        Cheers
        Claudia

        • Nabodita Gangully

          @Claudia-Frank
          I'm not sure what the best solution would be under the circumstances. Should I write a script to convert the file? I believe there is also a plugin called HTML Tags which can do this for me…

          Thank you for responding so promptly!

          • Claudia Frank

            @Nabodita-Gangully

            I have to admit that I don’t have any experience in creating ebooks.
            Does the ebook format specify a certain encoding? UTF-8?
            Or is it basically HTML?

            If the HTML Tags plugin can do this, then why not use it?
            If it can’t, I assume it should be possible to write two Python scripts
            which do something like this:

            The first converts the HTML-encoded characters into the “real” UTF-8 characters
            so that you can start writing. Once you’re finished, the second script
            converts them back into HTML-encoded strings.

            Of course, the Python Script plugin needs to be installed in this case.
            Let me know if you wanna go this way.
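
            For illustration, the two conversions described above could be sketched roughly like this. This is plain Python 3 meant to be run outside Notepad++, not a script for the Python Script plugin, and the function names `refs_to_chars` and `chars_to_refs` are hypothetical. It also assumes the file uses numeric character references and deliberately leaves ASCII references such as `&#60;` and named entities such as `&amp;` untouched, so the markup itself is not changed.

```python
import re

# Matches decimal (&#2476;) and hexadecimal (&#x9AC;) numeric character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def refs_to_chars(text):
    """Replace numeric references to non-ASCII characters with the characters."""
    def repl(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Keep ASCII references (e.g. &#60; for "<"), surrogates and
        # out-of-range values exactly as they were written.
        if code < 128 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return match.group(0)
        return chr(code)
    return _NUMERIC_REF.sub(repl, text)


def chars_to_refs(text):
    """Replace every non-ASCII character with a decimal numeric reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


# Illustrative sample, not taken from the original file.
sample = "<p>&#2476;&#2494;&#2434;&#2482;&#2494; &amp; &#60;tag&#62;</p>"
readable = refs_to_chars(sample)
print(readable)
print(chars_to_refs(readable) == sample)  # True
```

            The round trip reproduces the input exactly only when the original references were decimal, since `chars_to_refs` always writes decimal ones. If the readable version is saved as UTF-8, the charset declared in the file's header would presumably have to say UTF-8 as well.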

            Cheers
            Claudia

            • Nabodita Gangully

              I read somewhere that an epub is a website in a box (can’t remember where I came across that phrase) and it is very apt. Yes, it’s basically XHTML 1.1, with CSS for styling.

              I will post back with test results; I am familiar with Python, so that may be the way to go.

              Thanks again!

              • Pouemes44

                Hello Nabodita,

                Can you paste here some of the words you are having trouble with, so we can understand the problem?
