    Map language to extension

    9 Posts 5 Posters 594 Views
    • Michael Vincent

      I’m sure I’m missing something obvious - haven’t had morning coffee yet. Is there a way (API) to associate the language type and the file extension?

      I’m thinking in NppExec or PythonScript. I know the Notepad++ API provides:

      • NPPM_GETCURRENTLANGTYPE
      • NPPM_GETLANGUAGENAME
      • NPPM_GETLANGUAGEDESC

      But is there something like NPPM_GETLANGEXTENSIONS that returns “.txt” for “Normal Text” for example, that I’m completely blanking on?

      Cheers.

      • Ekopalypse @Michael Vincent

        @Michael-Vincent

        No, afaik there is no such api. May I ask what purpose it could have?
        My first thought was if notepad.getCurrentFilename().endswith('.py') ...
        to check or
        notepad.getCurrentFilename().rpartition('.')[2]
        to get the extension.
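One caveat with the rpartition approach: when the name has no dot at all (an unsaved "new 1" buffer, for instance), `rpartition('.')[2]` returns the whole name rather than an empty string. A minimal sketch of the difference, using plain strings in place of `notepad.getCurrentFilename()`; the helper names are made up for illustration:

```python
import os

def get_extension(filename):
    """Return the extension without the dot, or '' if there is none."""
    return os.path.splitext(filename)[1].lstrip('.')

def rpartition_extension(filename):
    """The rpartition approach from the post above, for comparison."""
    return filename.rpartition('.')[2]

# A saved file: both approaches agree.
print(get_extension('script.py'))
print(rpartition_extension('script.py'))

# An unsaved buffer without a dot: splitext yields '',
# while rpartition hands back the whole name.
print(repr(get_extension('new 1')))
print(repr(rpartition_extension('new 1')))
```

Whether this matters depends on how the result is used, but an empty string is usually easier to test for than a buffer name.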

        • Michael Vincent @Ekopalypse

          @Ekopalypse said in Map language to extension:

          May I ask what purpose it could have?

          I have a PythonScript callback that triggers anytime a new buffer is activated or the language of the current buffer is changed. It does some default settings stuff and I also have it change the tabs from 4 to 2 spaces for some UDLs that can’t be done via the normal Notepad++ Preferences setting for tabs/spaces.

          Currently, I search on the file extension, but that doesn’t work for a new-1 buffer where I just switch the language: there is no extension to match. I could add an "or" clause to the conditional to check the language type too, but I was wondering if I could somehow map the language and extension together. I’m trying to avoid adding the .yang extension to one list and forgetting to add the “User Defined language file - Yang” name to the other list, for example.

          Cheers.
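One way to avoid keeping two lists in sync, even without an API, is to store each language name and its extensions in the same record. The sketch below is only an illustration under assumptions: `PROFILES`, `tab_width_for`, and the sample strings are hypothetical, and the real callback would pass in whatever filename and language description it already obtains from PythonScript.

```python
import os

# Hypothetical profile table: each entry keeps the language descriptions and
# the extensions that share the same settings together in one place.
PROFILES = [
    {
        'langs': {'User Defined language file - Yang'},
        'exts': {'yang'},
        'tab_width': 2,
    },
]
DEFAULT_TAB_WIDTH = 4

def tab_width_for(filename, lang_desc, profiles=PROFILES):
    """Return the tab width for a buffer, matching on language OR extension."""
    ext = os.path.splitext(filename)[1].lstrip('.').lower()
    for profile in profiles:
        if lang_desc in profile['langs'] or (ext and ext in profile['exts']):
            return profile['tab_width']
    return DEFAULT_TAB_WIDTH

# A saved .yang file matches by extension, an unsaved buffer by language.
print(tab_width_for('model.yang', 'Normal text file'))
print(tab_width_for('new 1', 'User Defined language file - Yang'))
print(tab_width_for('new 1', 'Normal text file'))
```

This still requires typing both the name and the extension once, but they can no longer drift apart across two separate lists.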

          • Mark Olson

            Figured I’d save you the trouble of working out how to parse the language XML files:

            from Npp import *
            
            import os
            from xml.etree import ElementTree as ET
            
            APPDATA_DIR = os.path.join(os.getenv('APPDATA'))
            NPP_DIR =  os.path.join(APPDATA_DIR, 'Notepad++')
            LANG_XML = os.path.join(NPP_DIR, 'langs.xml')
            UDL_DIR = os.path.join(NPP_DIR, 'userDefineLangs')
            UDL_XML_LIST = [os.path.join(NPP_DIR, 'userDefineLang.xml')]
            UDL_XML_LIST.extend(os.path.join(UDL_DIR, x) for x in os.listdir(UDL_DIR))
            
            def parse_lang_extensions(lang_xml_element):
                # attrib is a dict, so index it rather than calling it
                return lang_xml_element.attrib.get('ext', '').split()
                
            def parse_lang_xml(ext_lang_dict, languages_list, lang_tag):
                '''languages_list: an XML element containing a list of
                XML elements describing a language
                lang_tag: the tag for the XML elements describing a language:
                    either 'UserLang' for UDLs or 'Language' for predefined langs
                '''
                for lang in languages_list.findall(lang_tag):
                    exts = lang.attrib['ext']
                    name = lang.attrib['name']
                    for ext in exts.split():
                        ext_lang_dict[ext] = name
            
            def get_ext_lang_dict():
                ext_lang_dict = {}
            
                lang_xml = ET.parse(LANG_XML)
                lang_xml_languages = lang_xml.getroot().find('Languages')
                parse_lang_xml(ext_lang_dict, lang_xml_languages, 'Language')
                
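                for udl_xml_fname in UDL_XML_LIST:
                    # skip a missing userDefineLang.xml and any non-XML files
                    if not (udl_xml_fname.lower().endswith('.xml') and os.path.isfile(udl_xml_fname)):
                        continue
                    udl_xml = ET.parse(udl_xml_fname)
                    udl_lang_list = udl_xml.getroot()
                    parse_lang_xml(ext_lang_dict, udl_lang_list, 'UserLang')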
                    
                return ext_lang_dict
                
            ext_lang_dict = get_ext_lang_dict()
            print(ext_lang_dict)
            
            • Michael Vincent @Mark Olson

              @Mark-Olson said in Map language to extension:

              My first thought on how to approach this would just be to parse langs.xml, userDefineLang.xml, and all the UDL XML files in the userDefineLangs folder at startup.

              Yes, that’s why I was hoping there was an API for this since Notepad++ does it already.

              Cheers.

              • Mark Olson

                Slight improvement to earlier idea:

                from Npp import *
                
                import os
                from xml.etree import ElementTree as ET
                
                APPDATA_DIR = os.path.join(os.getenv('APPDATA'))
                NPP_DIR =  os.path.join(APPDATA_DIR, 'Notepad++')
                LANG_XML = os.path.join(NPP_DIR, 'langs.xml')
                UDL_DIR = os.path.join(NPP_DIR, 'userDefineLangs')
                UDL_XML_LIST = [os.path.join(NPP_DIR, 'userDefineLang.xml')]
                UDL_XML_LIST.extend(os.path.join(UDL_DIR, x) for x in os.listdir(UDL_DIR))
                
                def parse_lang_extensions(lang_xml_element):
                    # attrib is a dict, so index it rather than calling it
                    return lang_xml_element.attrib.get('ext', '').split()
                    
                def parse_lang_xml(ext_lang_dict, languages_list, lang_tag):
                    '''languages_list: an XML element containing a list of
                    XML elements describing a language
                    lang_tag: the tag for the XML elements describing a language:
                        either 'UserLang' for UDLs or 'Language' for predefined langs
                    '''
                    for lang in languages_list.findall(lang_tag):
                        exts = lang.attrib['ext']
                        name = lang.attrib['name']
                        for ext in exts.split():
                            ext_lang_dict[ext] = name
                
                def map_exts_and_langs():
                    ext_lang_dict = {}
                
                    lang_xml = ET.parse(LANG_XML)
                    lang_xml_languages = lang_xml.getroot().find('Languages')
                    parse_lang_xml(ext_lang_dict, lang_xml_languages, 'Language')
                    
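                    for udl_xml_fname in UDL_XML_LIST:
                        # skip a missing userDefineLang.xml and any non-XML files
                        if not (udl_xml_fname.lower().endswith('.xml') and os.path.isfile(udl_xml_fname)):
                            continue
                        udl_xml = ET.parse(udl_xml_fname)
                        udl_lang_list = udl_xml.getroot()
                        parse_lang_xml(ext_lang_dict, udl_lang_list, 'UserLang')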
                        
                    lang_ext_dict = {}
                    for ext, lang in ext_lang_dict.items():
                        lang_ext_dict.setdefault(lang, []).append(ext)
                    
                    return ext_lang_dict, lang_ext_dict
                    
                ext_lang_dict, lang_ext_dict = map_exts_and_langs()
                print(f'{ext_lang_dict = }')
                print(f'{lang_ext_dict = }')
                

                It was actually quite fast when I ran it, so probably not a big deal to run on startup.
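To see the two-way mapping without touching a real configuration, the same idea can be run against inline XML. This is a sketch under assumptions: the sample XML below is made up to resemble the shape of langs.xml and a UDL file (a `Languages` element holding `Language` entries, and `UserLang` entries directly under the root), and `collect` is a hypothetical helper, not part of the script above.

```python
from xml.etree import ElementTree as ET

# Made-up sample data shaped like langs.xml and a UDL file.
SAMPLE_LANGS_XML = '''<NotepadPlus>
    <Languages>
        <Language name="python" ext="py pyw" />
        <Language name="normal" ext="txt" />
    </Languages>
</NotepadPlus>'''

SAMPLE_UDL_XML = '''<NotepadPlus>
    <UserLang name="Yang" ext="yang" udlVersion="2.1" />
</NotepadPlus>'''

def collect(ext_lang, parent, tag):
    """Record extension -> language name for each child element named tag."""
    for lang in parent.findall(tag):
        for ext in lang.attrib.get('ext', '').split():
            ext_lang[ext] = lang.attrib['name']

ext_lang = {}
collect(ext_lang, ET.fromstring(SAMPLE_LANGS_XML).find('Languages'), 'Language')
collect(ext_lang, ET.fromstring(SAMPLE_UDL_XML), 'UserLang')

# Invert the mapping: language name -> list of extensions.
lang_ext = {}
for ext, lang in sorted(ext_lang.items()):
    lang_ext.setdefault(lang, []).append(ext)

print(ext_lang)
print(lang_ext)
```

Note that if two languages claim the same extension, the later one silently wins in the extension-to-language direction; that may or may not match what Notepad++ itself does.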

                • Ekopalypse @Mark Olson

                  @Michael-Vincent
                  Excuse my ignorance, but I still don’t understand why you need to know the extension. What you described seems doable if you know the language currently used.
                  Otherwise, @Mark-Olson’s approach seems to be the way to go.

                  • PeterJones @Mark Olson

                    @Mark-Olson said in Map language to extension:

                    APPDATA_DIR = os.path.join(os.getenv('APPDATA'))
                    NPP_DIR =  os.path.join(APPDATA_DIR, 'Notepad++')
                    LANG_XML = os.path.join(NPP_DIR, 'langs.xml')
                    UDL_DIR = os.path.join(NPP_DIR, 'userDefineLangs')
                    UDL_XML_LIST = [os.path.join(NPP_DIR, 'userDefineLang.xml')]
                    UDL_XML_LIST.extend(os.path.join(UDL_DIR, x) for x in os.listdir(UDL_DIR))
                    

                    Future readers: Please note that this only works in a standard installation that uses AppData. In doLocalConf mode, in cloud mode, or with -settingsDir, it will fail. (I know that @Mark-Olson and @Michael-Vincent both realize this; this is for future readers, whom I cannot assume have gained that piece of knowledge yet.) For those cases, I like using pluginConfig = notepad.getPluginConfigDir() (NPPM_GETPLUGINSCONFIGDIR) and then working my way up two directories from pluginConfig, because that ensures you’ve found the active configuration directory, whichever config mode the user is in.

                    So, to update the example above:

                    PLUGIN_CONFIG_DIR = notepad.getPluginConfigDir()
                    PLUGIN_DIR = os.path.dirname(PLUGIN_CONFIG_DIR)
                    NPP_DIR = os.path.dirname(PLUGIN_DIR)
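The "up two directories" step can be checked outside Notepad++ with a sample path. A minimal sketch, assuming a hypothetical path of the shape `...\plugins\Config`; `config_dir_from_plugin_config` is an illustrative name, and `ntpath` is used only so the example applies Windows path rules on any OS:

```python
import ntpath  # Windows path rules, so the sketch behaves the same on any OS

def config_dir_from_plugin_config(plugin_config_dir):
    """Go up two levels: <config>\\plugins\\Config -> <config>\\plugins -> <config>."""
    # Strip any trailing separator first, otherwise dirname() only removes that.
    trimmed = plugin_config_dir.rstrip('\\/')
    plugins_dir = ntpath.dirname(trimmed)
    return ntpath.dirname(plugins_dir)

# Hypothetical value of the kind notepad.getPluginConfigDir() would return.
sample = r'C:\Users\me\AppData\Roaming\Notepad++\plugins\Config'
npp_dir = config_dir_from_plugin_config(sample)

print(npp_dir)
print(ntpath.join(npp_dir, 'langs.xml'))
```

Inside PythonScript itself, `os.path` already follows Windows rules, so the three lines above are sufficient there.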
                    
                    • Alan Kilborn

                      @Mark-Olson said in Map language to extension:

                      print(f'{ext_lang_dict = }')
                      print(f'{lang_ext_dict = }')

                      As long as PythonScript 2.x is what is installed via Plugins Admin, we probably shouldn’t publish PS3-only scripts.

                      Suggest changing these two lines to:

                      print('ext_lang_dict:', ext_lang_dict)
                      print('lang_ext_dict:', lang_ext_dict)
                      

                      and, of course, adding this line at the very top of the script file:

                      from __future__ import print_function
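If the `name = value` style of the f-string output is worth keeping, a small helper based on `str.format` gives the same text on both Python 2.7 and Python 3. This is only a sketch; `debug_line` is a hypothetical name, not something PythonScript provides:

```python
def debug_line(name, value):
    """Build 'name = repr(value)', similar to what f'{name = }' prints."""
    return '{} = {!r}'.format(name, value)

# Sample data standing in for the real dictionary.
lang_ext_dict = {'python': ['py', 'pyw']}

# A single-argument print call behaves the same with or without
# the print_function import.
print(debug_line('lang_ext_dict', lang_ext_dict))
```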
                      