    Ask Scintilla about the text scope such as enclosing brackets or quotes?

    • Vitalii Dovgan

      Dear Scintilla users and plugin developers!

      As Scintilla perfectly highlights the brackets themselves and the text inside quotes, its syntax engine is expected to be aware of the current scope of each particular part of the text: which exact part is enclosed in quotes and which exact part is enclosed in brackets.

      My question is: how can I retrieve this information for a given caret position? I.e. how can I understand whether a given position (Sci_Position) belongs to some quotes or brackets?

      Let’s consider the following example of the text within Scintilla:

      void func()
      {
        const auto s = "ab|c";
      }
      

      where ‘|’ is the position of the caret.
      For the given example, I’m interested in some Scintilla’s API messages that would tell me the following:

      • the caret position belongs to (is within) the quotes scope of “abc”, and this scope starts with an opening ‘"’ at the position 34 and ends with a closing ‘"’ at the position 38;
      • the outer scope around the quotes “abc” starts with an opening “{” at the position 14 and ends with a closing “}” at the position 42.

      Is it possible to retrieve such information from Scintilla?

      • rdipardo @Vitalii Dovgan

        @Vitalii-Dovgan said in Ask Scintilla about the text scope such as enclosing brackets or quotes?:

        As Scintilla perfectly highlights the brackets themselves and the text inside quotes, its syntax engine is expected to be aware of the current scope of each particular part of the text: which exact part is enclosed in quotes and which exact part is enclosed in brackets.

        Take a look at how brace matching is done. See anything in there that looks like context awareness?

        I’ve said this elsewhere, but it bears repeating: Scintilla is not an LSP or even a tree-sitter; a Scintilla “document” is a byte buffer and nothing else.

        • PeterJones @Vitalii Dovgan

          @Vitalii-Dovgan said in Ask Scintilla about the text scope such as enclosing brackets or quotes?:

          Is it possible to retrieve such information from Scintilla

          Maybe not exactly as you want it, as @rdipardo said. But depending on your needs, there might be a sufficient approximation.

          For example, most languages have their own styleID for quoted text, so the active style will start and end at the quotes – I am not an expert, but I don’t think there’s a way of asking “where does the current style start/end”. The best I can think of is to poll the styles at individual positions in the document using SCI_GETSTYLEINDEXAT.

          Braced regions don’t usually have their own style… but in languages where it matters which braces you are inside, there will probably be folding based on those braces. Again, I am not sure there’s something that does exactly what you want. My first thought was to use SCI_GETFOLDLEVEL and navigate forward/backward to find the ends of those regions; but while I was there, I saw SCI_GETLASTCHILD and SCI_GETFOLDPARENT, which might be close to what you want.
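
          The style-polling idea above could be sketched roughly as follows. This is only an illustration, not a Scintilla API: `StyleRunAround` and the `styleAt` callback are hypothetical names, and `styleAt` is assumed to wrap a message such as SCI_GETSTYLEINDEXAT, with the document length taken from SCI_GETLENGTH.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper (not part of Scintilla): returns the half-open range
// [first, second) of consecutive positions sharing the style found at `pos`.
// `styleAt` is assumed to wrap something like
// SendMessage(hSci, SCI_GETSTYLEINDEXAT, p, 0).
std::pair<long, long> StyleRunAround(long pos, long docLength,
                                     const std::function<int(long)>& styleAt)
{
    assert(pos >= 0 && pos < docLength);
    const int style = styleAt(pos);

    // Walk backward while the previous position has the same style.
    long start = pos;
    while (start > 0 && styleAt(start - 1) == style)
        --start;

    // Walk forward while the next position has the same style.
    long end = pos + 1;
    while (end < docLength && styleAt(end) == style)
        ++end;

    return {start, end};
}
```

          This polls one position at a time, so it is linear in the length of the run, and it presumably only gives meaningful results once the lexer has styled the region in question.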

          • Coises @PeterJones

            @Vitalii-Dovgan
            @PeterJones said in Ask Scintilla about the text scope such as enclosing brackets or quotes?:

            Braced regions don’t usually have their own style… but in languages where it matters which braces you are inside, there will probably be folding based on those braces.

            There is also usually a style applied to braces that is different from strings or comments, which can help distinguish “real” braces from ones in a quoted string or a comment. See this mess in my experimental ControlledAutoIndent if you wish; it isn’t pretty.

            Once you identify an enclosing brace, you can find the matching one with SCI_BRACEMATCH.

            A potential annoyance is that lexer styling is not yet applied when SCN_CHARADDED is sent; since brace matching uses styling information to find a correctly-matched brace, when a closing brace is typed, you can’t find the matching opening brace in an SCN_CHARADDED routine with SCI_BRACEMATCH, because the closing brace hasn’t been styled yet. (That is the main reason ControlledAutoIndent doesn’t process braces until the Enter key is typed — I wanted to rely on the lexer to identify which braces “count.” It seemed error-prone to me to replicate the lexer’s job.)
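
            As a rough sketch of the approach described above (not the code from ControlledAutoIndent), a backward scan can look for the nearest unmatched opening brace while skipping brace characters that the lexer did not style as operators. `FindEnclosingOpenBrace`, `charAt` and `styleAt` are hypothetical names; the callbacks are assumed to wrap SCI_GETCHARAT and SCI_GETSTYLEINDEXAT, and `braceStyle` is assumed to be the lexer's operator style.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper (not part of Scintilla): scans backward from `pos` for
// the nearest unmatched opening brace. Only characters whose style equals
// `braceStyle` are counted, so braces inside strings or comments are ignored.
// Simplification: brace kinds are not checked against each other.
// Returns the position of the opening brace, or -1 if none encloses `pos`.
long FindEnclosingOpenBrace(long pos,
                            const std::function<char(long)>& charAt,
                            const std::function<int(long)>& styleAt,
                            int braceStyle)
{
    assert(pos >= 0);
    int depth = 0; // closing braces seen that still need their opener
    for (long p = pos - 1; p >= 0; --p)
    {
        if (styleAt(p) != braceStyle)
            continue; // not styled as a "real" brace
        const char ch = charAt(p);
        if (ch == '}' || ch == ')' || ch == ']')
        {
            ++depth;
        }
        else if (ch == '{' || ch == '(' || ch == '[')
        {
            if (depth == 0)
                return p;
            --depth;
        }
    }
    return -1;
}
```

            Once an opening brace is found this way, its partner can be requested with SCI_BRACEMATCH, subject to the styling caveat mentioned above.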

            • Vitalii Dovgan

              Thank you for the inputs!
              I’ve been thinking about functions such as GoToMatchingBracket (for a bracket at the caret position) and GoToNearestBracket (for the nearest bracket around the caret position), as well as SelToMatchingBracket and SelToNearestBracket, in XBrackets Lite. You’ve given me good starting points.
              Such functions exist in XBrackets for AkelPad, so I’m considering them for XBrackets Lite for Notepad++. They will not be part of the already planned next release of XBrackets Lite, but may be part of the one after that.
              Speaking of XBrackets for AkelPad, it contains pretty complicated logic for GoToNearestBracket, including “ambiguity resolutions” for non-straightforward cases such as "abc[] {('')} xyz"| "". Since AkelPad provides an API to request a “quote range” and a “fold range”, maintaining some kind of syntax tree, XBrackets for AkelPad can rely on this information and find matching pairs over long distances with a very low chance of error. Since Scintilla does not provide this information, I may implement similar algorithms that do not work over long distances and simply stop in case of any ambiguity.

              • Vitalii Dovgan @PeterJones

                @PeterJones

                The SCI_GETSTYLEINDEXAT is useful indeed.
                By getting the styles:

                int nStyleBefore = sciMsgr.SendSciMsg(SCI_GETSTYLEINDEXAT, nCharPos - 1);
                int nStyle = sciMsgr.SendSciMsg(SCI_GETSTYLEINDEXAT, nCharPos);
                int nStyleAfter = sciMsgr.SendSciMsg(SCI_GETSTYLEINDEXAT, nCharPos + 1);
                

                it is possible to identify whether the character at nCharPos starts or ends a styled range:

                if ( nStyleAfter == nStyle && nStyle != nStyleBefore )
                {
                  // the range starts at nCharPos and continues to nCharPos + 1
                }
                else if ( nStyleBefore == nStyle && nStyle != nStyleAfter )
                {
                  // the range ends at nCharPos; it began at or before nCharPos - 1
                }
                
                • Coises @Vitalii Dovgan

                  @Vitalii-Dovgan

                  Be careful, though, what you mean by a “scope.”

                  Consider your own example from earlier in the thread:

                  void func()
                  {
                    const auto s = "abc";
                  }
                  

                  On the basis you describe, the word auto would appear to be a “scope” beginning with a and ending with o.
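
                  One way to avoid that false positive, sketched here under assumptions, is to accept a style run as a quote range only when its style is one the lexer uses for strings. `QuoteRangeAt` and `styleAt` are hypothetical names, and the caller must supply the string style IDs for the active lexer (for the C++ lexer these would presumably be SCE_C_STRING and SCE_C_CHARACTER).

```cpp
#include <cassert>
#include <functional>
#include <initializer_list>

// Result of the lookup; `found` is false when `pos` is not in a string style.
struct QuoteRange { long start; long end; bool found; };

// Hypothetical helper (not part of Scintilla): returns the half-open range
// [start, end) of the style run at `pos`, but only if that style is listed in
// `stringStyles`. `styleAt` is assumed to wrap SCI_GETSTYLEINDEXAT.
QuoteRange QuoteRangeAt(long pos, long docLength,
                        const std::function<int(long)>& styleAt,
                        std::initializer_list<int> stringStyles)
{
    assert(pos >= 0 && pos < docLength);
    const int style = styleAt(pos);

    bool isString = false;
    for (int s : stringStyles)
        if (s == style)
            isString = true;
    if (!isString)
        return {-1, -1, false}; // e.g. a keyword such as "auto"

    long start = pos;
    while (start > 0 && styleAt(start - 1) == style)
        --start;
    long end = pos + 1;
    while (end < docLength && styleAt(end) == style)
        ++end;
    return {start, end, true};
}
```

                  Two string literals with nothing between them would still be reported as a single range, so this remains an approximation rather than a real scope query.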
