    functionList for LaTeX: Trying to use classRange to have a hierarchical chapter > section document outline

    • Jason McGee

      I’m trying to use the npp function list to produce a document outline for me which shows something like the following:

      Chapter 1
       > Section 1.1
       > Section 1.2
      Chapter 2
       > Section 2.1
       > Section 2.2
      

      I can’t work out why this doesn’t work:

      <NotepadPlus>
        <functionList>
          <parser id="latex" displayName="LaTeX" commentExpr="(%.*?$)">
            <classRange mainExpr="\\chapter\*?\{(.*?)\}">
              <className>
                <nameExpr expr=".*"/>
              </className>
              <function mainExpr="\\section\*?\{(.*?)\}">
                <functionName>
                  <funcNameExpr expr=".*"/>
                </functionName>
              </function>
            </classRange>
          </parser>
        </functionList>
      </NotepadPlus>
      

      As far as I can tell from the manual, that should look for \chapter{} or \chapter*{} as the start of a class and with no openSymbole/closeSymbole it should treat everything after the first \chapter{} as a class range until the next \chapter{}.

      Instead I get an empty function list.

      I can get chapters and sections on one level ok using \\(chapter|section)... for the regex, but I can’t get classRange to work for the life of me.

      • PeterJones @Jason McGee

        @Jason-McGee,

        > I’m trying to use the npp function list to produce a document outline for me which shows something like the following:

        could you share some example LaTeX which would resolve to that FunctionList? It would make it easier for us to help you.

        > I can’t work out why this doesn’t work:
        > …

                     <classRange mainExpr="\\chapter\*?\{(.*?)\}">
        

        \*? says “0 or 1 literal asterisk characters”. Is that really what you’re trying to match? I don’t know enough about LaTeX to make an educated guess there; it just seems odd to me. But looking at the built-in latex.xml, I guess that uses the \*? after each of its “functions”, so maybe that is reasonable.

        > As far as I can tell from the manual, that should look for \chapter{} or \chapter*{} as the start of a class and with no openSymbole/closeSymbole it should treat everything after the first \chapter{} as a class range until the next \chapter{}.

        If you don’t have openSymbole/closeSymbole, then the mainExpr for the classRange needs to match the entire class, from beginning to end, and then it searches for any functions inside the results of that regex. (And it needs to be a multi-line match)

        > Instead I get an empty function list.

        Unfortunately, that’s a common occurrence when trying to debug FunctionList, especially with classes.

        > I can get chapters and sections on one level ok using \\(chapter|section)... for the regex, but I can’t get classRange to work for the life of me.

        (Did you start on your own, or did you start with the latex.xml that ships with Notepad++ since v8.7? Because that shows a bigger list of things all at the same level, which is better than nothing)

        I’ll see if I can come up with something, starting from the default latex.xml, that will get you started in the right direction.
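
        The point about matching the entire class can be sketched outside Notepad++. The following is a minimal Python illustration, not FunctionList code: it assumes Python's re module behaves closely enough to the Boost-based regex engine in Notepad++ for these simple patterns, and the names SAMPLE, HEADER_ONLY, WHOLE_RANGE, SECTION and outline are made up for the example. A class pattern that stops at the closing brace of \chapter{...} leaves no text in which to look for sections, while one that runs lazily up to the next \chapter (or the end of the text) does.

```python
import re

SAMPLE = r"""\chapter{One}
\section{1.1}
text
\section{1.2}
\chapter{Two}
\section{2.1}
"""

# Header-only class pattern, like the classRange mainExpr in the question.
HEADER_ONLY = re.compile(r"\\chapter\*?\{(.*?)\}")

# Whole-range class pattern: from a \chapter at line start up to (but not
# including) the next \chapter at line start, or the end of the text.
WHOLE_RANGE = re.compile(r"(?ms)^\\chapter\*?\{.*?\}.*?(?=^\\chapter\*?\{|\Z)")

SECTION = re.compile(r"\\section\*?\{(.*?)\}")


def outline(class_re, text):
    """Stage 1: find class ranges. Stage 2: search for sections inside each range only."""
    result = []
    for match in class_re.finditer(text):
        body = match.group(0)
        name = re.search(r"\{(.*?)\}", body).group(1)
        result.append((name, SECTION.findall(body)))
    return result


print(outline(HEADER_ONLY, SAMPLE))
print(outline(WHOLE_RANGE, SAMPLE))
```

        With the header-only pattern every chapter ends up with an empty section list; with the whole-range pattern the sections are nested under their chapters.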

        • PeterJones @Jason McGee

          @Jason-McGee,

          using the example text found here, which has chapters and sections, I am able to get something I think is reasonable.

          <?xml version="1.0" encoding="UTF-8" ?>
          <!-- ==========================================================================\
          |   To learn how to make your own language parser, please check the following
          |   link:    https://npp-user-manual.org/docs/function-list/
          \=========================================================================== -->
          <NotepadPlus>
          	<functionList>
          		<parser
          			displayName="LaTeX Syntax"
          			id         ="latex_class"
          			commentExpr="(?x)
          							(%.*?$)                                 # Comment
          						"
          
          		>
          			<function
          				mainExpr="(?x)                 # free-spacing (see `RegEx - Pattern Modifiers`)
          						  (?im-s)              # ignore case, ^ and $ match start/end of line, dot doesn't match newline
          						  \\(begin|
          						     part\*?|
          							 subsection\*?|
          							 subsubsection\*?|
          							 paragraph\*?|
          							 subparagraph\*?)
          							 {.*}"
          			>
          			</function>
          				<classRange
          					mainExpr    ="(?x)                                          # free-spacing (see `RegEx - Pattern Modifiers`)
          							(?m)                                                # ^ and $ match at line-breaks
          							(?'CLASS_START'
          								^                                               # NO leading white-space at start-of-line
          								\\(chapter\*?)
          							)
          							(?s:.*?)                                            # whatever,
          							(?=                                                 # ...up till
          								\s*                                             # ...optional leading white-space of
          								(?:
          									(?&amp;CLASS_START)                         # ...next header
          								|	\Z                                          # ...or end-of-text
          								)
          							)
          						"
          				>
          					<className>
          						<nameExpr expr="(?x)                                    # free-spacing (see `RegEx - Pattern Modifiers`)
          								\\(chapter\*?)                                  # prefix INCLUDED
          								{                                               # brace before name INCLUDED
          								.*?                                             # name
          								}                                               # brace after name INCLUDED
          							"
          						/>
          					</className>
          					<function
          						mainExpr="(?xm-s)                                         # free-spacing (see `RegEx - Pattern Modifiers`)
          								\\
          								(
          									section\*?
          									|subsection\*?
          									|subsubsection\*?
          									|paragraph\*?
          									|subparagraph\*?
          								)
          								{.*?}
          							"
          					>
          						<functionName>
          							<funcNameExpr expr="(?xm-s)                           # free-spacing (see `RegEx - Pattern Modifiers`)
          								\\
          								(
          									section\*?
          									|subsection\*?
          									|subsubsection\*?
          									|paragraph\*?
          									|subparagraph\*?
          								)
          								{.*?}
          								"
          							/>
          						</functionName>
          					</function>
          				</classRange>
          		</parser>
          	</functionList>
          </NotepadPlus>
          

          [Screenshot: resulting Function List panel]

          Or, if you want to hide the \XYZ{...} wrappers around everything:

          <?xml version="1.0" encoding="UTF-8" ?>
          <!-- ==========================================================================\
          |   To learn how to make your own language parser, please check the following
          |   link:    https://npp-user-manual.org/docs/function-list/
          \=========================================================================== -->
          <NotepadPlus>
          	<functionList>
          		<parser
          			displayName="LaTeX Syntax"
          			id         ="latex_class"
          			commentExpr="(?x)
          							(%.*?$)                                 # Comment
          						"
          
          		>
          			<function
          				mainExpr="(?x)                 # free-spacing (see `RegEx - Pattern Modifiers`)
          						  (?im-s)              # ignore case, ^ and $ match start/end of line, dot doesn't match newline
          						  \\(begin|
          						     part\*?|
          							 subsection\*?|
          							 subsubsection\*?|
          							 paragraph\*?|
          							 subparagraph\*?)
          							 {.*}"
          			>
          					<functionName>
          						<nameExpr expr="(?xm-s)                                 # free-spacing (see `RegEx - Pattern Modifiers`)
          								(?<={)
          								.*?
          								(?=})
          							"
          						/>
          					</functionName>
          			</function>
          				<classRange
          					mainExpr    ="(?x)                                          # free-spacing (see `RegEx - Pattern Modifiers`)
          							(?m)                                                # ^ and $ match at line-breaks
          							(?'CLASS_START'
          								^                                               # NO leading white-space at start-of-line
          								\\(chapter\*?)
          							)
          							(?s:.*?)                                            # whatever,
          							(?=                                                 # ...up till
          								\s*                                             # ...optional leading white-space of
          								(?:
          									(?&amp;CLASS_START)                         # ...next header
          								|	\Z                                          # ...or end-of-text
          								)
          							)
          						"
          				>
          					<className>
          						<nameExpr expr="(?x)                                    # free-spacing (see `RegEx - Pattern Modifiers`)
          								(?<={)                                          # brace before name
          								.*?                                             # name
          								(?=})                                           # brace after name
          							"
          						/>
          					</className>
          					<function
          						mainExpr="(?xm-s)                                         # free-spacing (see `RegEx - Pattern Modifiers`)
          								\\
          								(
          									section\*?
          									|subsection\*?
          									|subsubsection\*?
          									|paragraph\*?
          									|subparagraph\*?
          								)
          								{.*?}
          							"
          					>
          						<functionName>
          							<funcNameExpr expr="(?xm-s)                           # free-spacing (see `RegEx - Pattern Modifiers`)
          									(?<={)
          									.*?
          									(?=})
          								"
          							/>
          						</functionName>
          					</function>
          				</classRange>
          		</parser>
          	</functionList>
          </NotepadPlus>
          

          which will yield:
          [Screenshot: resulting Function List panel with the \XYZ{...} wrappers hidden]

          You can, of course, feel free to tweak it to match your desires.

          (Also, note that any class (ie, chapter) that doesn’t have a function (ie, section or similar) will not be listed in the FunctionList. That’s one of the quirks of the FunctionList.)
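
          The difference between the two name expressions used above (wrappers included versus wrappers hidden) can be tried in isolation. This is a small Python sketch under the assumption that lookbehind and lookahead behave the same way here as in the Notepad++ regex engine; HEADER, WITH_WRAPPER and NAME_ONLY are hypothetical names for the example.

```python
import re

HEADER = r"\chapter*{Unnumbered intro}"

# Variant 1: the name keeps the command and the braces.
WITH_WRAPPER = re.compile(r"\\chapter\*?\{.*?\}")

# Variant 2: lookbehind and lookahead assert the braces without consuming
# them, so only the text between the braces is returned.
NAME_ONLY = re.compile(r"(?<=\{).*?(?=\})")

print(WITH_WRAPPER.search(HEADER).group(0))
print(NAME_ONLY.search(HEADER).group(0))

# Both variants stop at the first closing brace, so nested braces truncate the name.
NESTED = r"\chapter{The \emph{big} idea}"
print(NAME_ONLY.search(NESTED).group(0))
```

          In this sketch a title containing nested braces is cut at the first closing brace, which is worth keeping in mind if chapter titles contain commands such as \emph{...}.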

          • Jason McGee

            > could you share some example LaTeX which would resolve to that FunctionList? It would make it easier for us to help you.

            Sorry, I didn’t see your replies until today. The example you found is fine. Here’s a minimal one for my purposes:

            \documentclass{scrreprt}
            
            \begin{document}
            
            \chapter{Chapter with sections}
            \section{Section 1.1}
            Lorem ipsum
            
            \section{Section 1.2}
            dolor sit amet
            
            \chapter{Chapter with no sections}
            consectetur adipiscing elit.
            
            \chapter{Chapter with unnumbered sections}
            \section*{Heading with no number}
            Phasellus mollis posuere ante vel tincidunt. 
            
            \section*{Second heading with no number}
            Donec faucibus tellus sapien, vitae fringilla nulla bibendum eget.
            
            \appendix
            \chapter{First appendix}
            \include{document}
            
            \chapter{Appendix with sections}
            \section{B.1}
            Nam mauris nisl, cursus at erat in, 
            
            \section{B.2}
            molestie luctus nulla.
            
            \end{document}
            

            > \*? says “0 or 1 literal asterisk characters”. Is that really what you’re trying to match? I don’t know enough about LaTeX to make an educated guess there; it just seems odd to me. But looking at the built-in latex.xml, I guess that uses the \*? after each of its “functions”, so maybe that is reasonable.

            Yep that’s normal LaTeX syntax. Many functions have starred and unstarred versions - see example above for the usage on sections/chapters if you’re curious.

            > If you don’t have openSymbole/closeSymbole, then the mainExpr for the classRange needs to match the entire class, from beginning to end, and then it searches for any functions inside the results of that regex. (And it needs to be a multi-line match)

            That explains where I was going wrong!

            > (Did you start on your own, or did you start with the latex.xml that ships with Notepad++ since v8.7? Because that shows a bigger list of things all at the same level, which is better than nothing)

            I started with the default but cut it right down as I only want it to match chapters and sections and not every instance of \begin{environment} or structure at subsection or below.

            > (Also, note that any class (ie, chapter) that doesn’t have a function (ie, section or similar) will not be listed in the FunctionList. That’s one of the quirks of the FunctionList.)

            Unfortunately, I do want it to pick up chapters that have no sections.

            As a workaround, could I have the <function mainExpr=x> just match the first thing in the chapter if it doesn’t find a \section in it? Meaning something like “IF \\section\*?{.*?} exists THEN match every instance of \\section\*?{.*?}, ELSE match the first instance of \S+.”

            Then there’d always be a “function” in every chapter and the “empty” ones would still show up.

            I don’t know how (or if it’s possible) to do that with regexes though.
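
            Written as ordinary procedural logic, the requested behaviour looks like the Python sketch below. It only restates the IF/ELSE above and does not show that a single FunctionList regex can express it. entries_for is a hypothetical helper, and its argument is assumed to be the text of one chapter after the \chapter{...} header.

```python
import re

SECTION = re.compile(r"\\section\*?\{.*?\}")
FIRST_TOKEN = re.compile(r"\S+")


def entries_for(chapter_body):
    """Return every section header in the chapter, else its first non-space token."""
    sections = SECTION.findall(chapter_body)
    if sections:
        return sections
    # No section found: fall back to the first run of non-whitespace characters.
    fallback = FIRST_TOKEN.search(chapter_body)
    return [fallback.group(0)] if fallback else []


print(entries_for("\\section{1.1}\nLorem\n\\section*{1.2}\n"))
print(entries_for("consectetur adipiscing elit.\n"))
```

            The fallback guarantees one entry per chapter whenever the chapter has any non-blank content, which is the property the workaround is after.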

            • PeterJones @Jason McGee

              @Jason-McGee said in functionList for LaTeX: Trying to use classRange to have a hierarchical chapter > section document outline:

              > could I have the <function mainExpr=x> just match the first thing in the chapter if it doesn’t find a \section in it? Meaning something like “IF \\section\*?{.*?} exists THEN match every instance of \\section\*?{.*?}, ELSE match the first instance of \S+.”

              With the way that the nesting works for FunctionLists (they don’t just do one regex; they do regex just on the results of previous regex, and it gets confusing), I am not certain how to accomplish that.

              I thought I could try to relax some of the rules, so that inside a \chapter, anything starting with \ would start a function (even the chapter!) – and because the chapter would match, then it would have contents. And that “worked” (as long as you don’t mind having each \chapter class repeated as a function in that class, too):
              [Screenshot: Function List with each \chapter repeated as a function inside its own class]

              But, unfortunately, as you can see, it doesn’t pick up the \appendix or the \include. If you needed it to, then this wouldn’t work for you. But if that’s an okay compromise, then here’s the XML:

              <?xml version="1.0" encoding="UTF-8" ?>
              <!-- ==========================================================================\
              |   To learn how to make your own language parser, please check the following
              |   link:    https://npp-user-manual.org/docs/function-list/
              \=========================================================================== -->
              <NotepadPlus>
              	<functionList>
              		<parser
              			displayName="LaTeX Syntax"
              			id         ="latex_class"
              			commentExpr="(?x)
              							(%.*?$)                                 # Comment
              						"
              
              		>
              			<function
              				mainExpr="(?x)                 # free-spacing (see `RegEx - Pattern Modifiers`)
              						  (?im-s)              # ignore case, ^ and $ match start/end of line, dot doesn't match newline
              						  \\(begin|
              						     part\*?|
              							 subsection\*?|
              							 subsubsection\*?|
              							 paragraph\*?|
              							 subparagraph\*?)
              							 {.*}"
              			>
              			</function>
              				<classRange
              					mainExpr    ="(?x)                                          # free-spacing (see `RegEx - Pattern Modifiers`)
              							(?m)                                                # ^ and $ match at line-breaks
              							(?'CLASS_START'
              								^                                               # NO leading white-space at start-of-line
              								\\(chapter\*?)
              							)
              							(?s:.*?)                                            # whatever,
              							(?=                                                 # ...up till
              								\s*                                             # ...optional leading white-space of
              								(?:
              									(?&amp;CLASS_START)                         # ...next header
              								|	\Z                                          # ...or end-of-text
              								)
              							)
              						"
              				>
              					<className>
              						<nameExpr expr="(?x)                                    # free-spacing (see `RegEx - Pattern Modifiers`)
              								\\(chapter\*?)                                  # prefix INCLUDED
              								{                                               # brace before name INCLUDED
              								.*?                                             # name
              								}                                               # brace after name INCLUDED
              							"
              						/>
              					</className>
              					<function
              						mainExpr="(?xm-s)                                         # free-spacing (see `RegEx - Pattern Modifiers`)
              								^ \\ \w .*
              							"
              					>
              						<functionName>
              							<funcNameExpr expr="(?xm-s)                           # free-spacing (see `RegEx - Pattern Modifiers`)
              								^ \\ \w .*
              								"
              							/>
              						</functionName>
              					</function>
              				</classRange>
              		</parser>
              	</functionList>
              </NotepadPlus>
              

              If it’s not an okay compromise, then one of the other regex+FunctionList experts will need to chime in, because I spent more than an hour experimenting with that this morning, and I’m out of ideas. (I’m specifically hoping @MAPJe71 stops by, as the resident FunctionList guru.)

              • Jason McGee @PeterJones

                Thanks @PeterJones, I really appreciate your time! I spent a chunk of time yesterday too trying to work out the regex if-then-else functionality before I gave up.

                I wouldn’t want to pick up everything that starts with \ because that will pull in a lot of commands that aren’t related to document structure (every command starts with \). For example, here’s my minimum working example with a numbered list in it, and that picks up the \begin{enumerate} and \items:

                \documentclass{scrreprt}
                
                \begin{document}
                
                \chapter{Chapter 1 with sections}
                \section{1.1}
                \subsection{1.1.1}
                Lorem 
                
                \subsection{1.1.2}
                ipsum
                
                \section{1.2}
                \begin{enumerate}
                \item dolor
                \item sit 
                \item amet
                \end{enumerate}
                
                \chapter{Chapter 2 with no sections}
                consectetur adipiscing elit.
                
                \chapter{Chapter 3 with unnumbered sections}
                \section*{Heading with no number}
                Phasellus mollis posuere ante vel tincidunt. 
                
                \section*{Second heading with no number}
                Donec faucibus tellus sapien, vitae fringilla nulla bibendum eget.
                
                \appendix
                	\chapter{Appendix A}
                	\include{document}
                
                	\chapter{Appendix B with sections}
                	\section{B.1}
                	Nam mauris nisl, cursus at erat in, 
                
                	\section{B.2}
                	molestie luctus nulla.
                
                \end{document}
                

                … but picking up the \chapter{} as a function along with the \section{}s is a great workaround!

                Here’s what I have now:

                <?xml version="1.0" encoding="UTF-8" ?>
                <!-- ==========================================================================\
                |   To learn how to make your own language parser, please check the following
                |   link:    https://npp-user-manual.org/docs/function-list/
                \=========================================================================== -->
                <NotepadPlus>
                	<functionList>
                		<parser
                			displayName="LaTeX Syntax"
                			id         ="latex_class"
                			commentExpr="(?x)
                							(%.*?$)                                 # Comment
                						"
                
                		>
                			<function
                				mainExpr="(?x)                         # free-spacing (see `RegEx - Pattern Modifiers`)
                						  (?im-s)                      # ignore case, ^ and $ match start/end of line, dot doesn't match newline
                						  \\begin{document}            # match start of document 
                						  "
                			>
                			</function>
                			<classRange
                				mainExpr    ="(?x)                     # free-spacing (see `RegEx - Pattern Modifiers`)
                						(?m)                           # ^ and $ match at line-breaks
                						(?'CLASS_START'
                							^\s*                       # optional leading white space before \chapter
                							\\(chapter\*?)
                						)
                						(?s:.*?)                       # whatever,
                						(?=                            # ...up till
                							\s*                        # ...optional leading white-space of
                							(?:
                								(?&amp;CLASS_START)    # ...next header
                							|	(\\end{document})      # ...or end of document
                							)
                						)
                					"
                			>
                				<className>
                					<nameExpr expr="(?x)               # free-spacing (see `RegEx - Pattern Modifiers`)
                								(?<={)                 # brace before name
                								.*?                    # name
                								(?=})                  # brace after name
                							"
                						/>
                				</className>
                				
                				<function
                					mainExpr="(?xm-s)                  # free-spacing (see `RegEx - Pattern Modifiers`)
                							\\(chapter|                # match chapter so that even \chapters with no \section appear
                							section|                   # match \section 
                							subsection                 # match \subsection (no trailing |, which would add an empty alternative that also matches \{...})
                							)\*?{.*}                   # match starred and unstarred commands
                						"
                				>
                					<functionName>
                						<funcNameExpr expr=".*"/>
                					</functionName>
                				</function>
                			</classRange>
                		</parser>
                	</functionList>
                </NotepadPlus>
                

                And the result on the sample file:
                [Screenshot: resulting Function List for the sample file]

                I modified the classRange mainExpr because I wanted to also match indented \chapter{}s (like I have for the appendices in the new sample file). After that change I found that the last \chapter{} wasn’t being matched with \Z so I changed the alternate search to look for \end{document} instead (which will always appear) and that worked.
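
                The combined effect of these changes can be checked with a rough Python equivalent. This is a sketch, not the FunctionList configuration itself: Python's re module has no (?&NAME) subroutine call, so the shared CLASS_START sub-pattern is reused by string concatenation instead, and SAMPLE, CLASS_RANGE, NAME, FUNCTION and outline are names invented for the example.

```python
import re

SAMPLE = r"""\begin{document}
\chapter{Main}
\section{1.1}
\chapter{Empty}
text
\appendix
    \chapter{Appendix A}
    \section{A.1}
\end{document}
"""

# Optional leading whitespace, then \chapter or \chapter*.
CLASS_START = r"^\s*\\chapter\*?"

# A class runs from one chapter start up to the next one, or to \end{document}.
CLASS_RANGE = re.compile(
    r"(?m)" + CLASS_START + r"(?s:.*?)(?=\s*(?:" + CLASS_START + r"|\\end\{document\}))"
)

NAME = re.compile(r"(?<=\{).*?(?=\})")

# Listing \chapter itself as a function means every class has at least one entry.
FUNCTION = re.compile(r"\\(?:chapter|section|subsection)\*?\{.*\}")


def outline(text):
    """Return (chapter name, entries found inside that chapter's range) pairs."""
    return [
        (NAME.search(m.group(0)).group(0), FUNCTION.findall(m.group(0)))
        for m in CLASS_RANGE.finditer(text)
    ]


for name, entries in outline(SAMPLE):
    print(name, entries)
```

                In this sketch the chapter without sections still appears, with its own \chapter{...} header as its only entry, and the indented appendix chapter is picked up as well.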

                Thanks for your help!
