search and extracting text from text file

  • T
    Tormod Skaret 0
    last edited by Feb 13, 2022, 6:37 PM

    I need to search a text file for the words procedure and function and return each matching line. In addition, a text box is added after each procedure/function as a comment on what the routine does; this should also be extracted. The result should be stored in a new file including line number of routine, or sent to memory (the clipboard) for retrieval with Ctrl+V. How can I do this in Notepad++?

    • A
      Alan Kilborn @Tormod Skaret 0
      last edited by Feb 13, 2022, 6:44 PM

      @tormod-skaret-0

      Maybe a specific example would help illustrate your problem better, and facilitate coming up with a solution? It’s all very vague.

      • T
        Terry R @Tormod Skaret 0
        last edited by Feb 13, 2022, 6:52 PM

        @tormod-skaret-0 said in search and extracting text from text file:

        The result should be stored in a new file including line number of routine

        You give very little information, no example and yet expect a response to help you. I’m AMAZED!

        Yet here I am responding. First a question.
        You say there is a text box that you also need to locate/return as a part of the result. Are you sure this is a TEXT based file?

        A text box would not appear in a TEXT file, unless it was crudely made using the minus (-) sign, vertical (|) bar and whatever else might help. If it was using these, they would appear on different lines to the data you are looking for.

        I do have an idea though: it would involve copying the file, then adding the line number to the front of every line. The contents would then be filtered so that only the data you are looking for remains and all other lines are removed. This gives you the required line number of each routine and the content of that line.
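
A minimal Python sketch of that number-then-filter idea, run outside Notepad++. It assumes each routine header starts at the beginning of a line with the word function or procedure; the helper name `number_and_filter` and the sample text are made up for illustration.

```python
import re

def number_and_filter(text, keywords=("function", "procedure")):
    """Prefix every line with its 1-based number, then keep only routine headers."""
    # Assumption: a header line begins with one of the keywords.
    pattern = re.compile(r"^(?:%s)\b" % "|".join(map(re.escape, keywords)))
    numbered = [(n, line) for n, line in enumerate(text.splitlines(), start=1)]
    return [f"{n}: {line}" for n, line in numbered if pattern.match(line)]

# Hypothetical sample, not taken from the poster's file.
sample = """function A(x : word) : word;
begin
end;

procedure B;
begin
end;
"""

for entry in number_and_filter(sample):
    print(entry)
```

Numbering before filtering is what keeps the original line numbers intact once the other lines are gone.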

        That’s about the lot of it, until as Alan says, you provide more information and an example.

        Terry

        • T
          Tormod Skaret 0
          last edited by Feb 13, 2022, 7:39 PM

          Sorry for the inconvenience, here is a picture:
          324fd07f-a448-4bcf-91bc-f906e4bc83ec-image.png
          What I need to do is make a list of routines (both procedures and functions) with a reference to each one's starting line. As you can see, every routine has a text attached explaining what the routine does; this text also needs to be extracted, but not the implementation. This is an example of how every routine is laid out: you see the end of the previous routine and the start of the next procedure. The simplest approach would be to put the result in a new file where it can be edited. The idea is to place all headers and associated text at the top of the same text file, so copying the result from memory is sufficient. Here is the expected layout:
          846c6232-5f48-4da1-a6f8-6ccac4663833-image.png

          • A
            Alan Kilborn @Tormod Skaret 0
            posted Feb 13, 2022, 7:48 PM, last edited by Alan Kilborn Feb 13, 2022, 7:49 PM

            @tormod-skaret-0

            Does this find all of your lines of interest?:

            Find: (?-s)^//\h+function .+
            Search mode: Regular expression
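
To try this pattern outside Notepad++, here is a rough Python equivalent. Python's `re` module has no `\h`, so `[ \t]` stands in for horizontal whitespace, and `(?-s)` is dropped because dot-matches-newline is already off by default. The sample text is hypothetical.

```python
import re

# Approximate translation of (?-s)^//\h+function .+ for Python's re module.
# [ \t] replaces \h; DOTALL is off by default, which corresponds to (?-s).
COMMENTED_FUNCTION = re.compile(r"^//[ \t]+function .+", re.MULTILINE)

# Hypothetical sample in the style of a commented routine header.
sample = (
    "function Foo(Source : ^byte) : word;\n"
    "// ======\n"
    "//    function Foo(Source : ^byte) : word;\n"
    "//  This routine returns ...\n"
)

matches = COMMENTED_FUNCTION.findall(sample)
print(matches)
```

Note that the pattern matches the commented copy of the header (the line starting with `//`), not the uncommented header line itself.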

            Also, you are asking people to help you manipulate text, so pasting screenshots is not as good as posting actual text that people could copy/paste to experiment on.

            • T
              Terry R @Tormod Skaret 0
              last edited by Feb 13, 2022, 7:51 PM

              @tormod-skaret-0 said in search and extracting text from text file:

              Sorry for the inconvenience, here is a picture:

              That makes it so much easier to understand. Although you have provided an example (in an image), could you do the same as actual text? To do so, you need to insert the text within a black box in your post. So select about 2-4 of the functions, preferably together, so we can see exactly what a contiguous portion of the file is like.

              There is a pinned post in the FAQ section called "FAQ Desk: Template for Search/Replace Questions"; please read this before making your next post. It explains how to insert examples as text.

              So my original idea of adding line numbers seems to be on the right track. Once that was done another regular expression would add that line number in the “box” section for each function. Then the last bit of the process would be to remove all code lines leaving just the function name, line number and the associated box explaining what the function does.

              Terry

              • M
                mpheath @Tormod Skaret 0
                last edited by Feb 14, 2022, 1:18 AM

                @tormod-skaret-0

                Source looks like Pascal.

                Regular expressions are not about line numbers; they are about matching patterns. If you want line numbers, you may need something like a script. If line numbers are not required, a regular expression or two might be suitable.

                Here is a regular expression:

                1. Press Ctrl+M to display the Mark dialog.
                2. Check the Regular expression radio button.
                3. Find what: ^(?:function|procedure).+?\R+(?://.*?\R)*
                4. Press the Mark All button.
                5. Press the Copy Marked Text button.
                6. Press Ctrl+N to open a new tab and press Ctrl+V to paste.

                The result may have ---- lines separating the captures in the pasted content. Empty lines may exist too. You may need to do a replacement to remove these from the pasted result.

                1. Press Ctrl+H to display the Replace dialog (or select the Replace tab in the previous dialog).
                2. Check the Regular expression radio button.
                3. Find what: ^\R|^-+\R
                4. Replace with: (leave empty)
                5. Press the Replace All button.
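
The cleanup step can also be tried outside Notepad++. This is a sketch in Python under a few assumptions: Python's `re` has no `\R`, so an explicit line-ending alternative replaces it, and the `pasted` text is a made-up stand-in for what the paste might look like.

```python
import re

# Python's re has no \R, so spell out the possible line endings.
EOL = r"(?:\r\n|\n|\r)"
# Same idea as ^\R|^-+\R: an empty line, or a line made only of dashes.
CLEANUP = re.compile(rf"^{EOL}|^-+{EOL}", re.MULTILINE)

def clean_pasted(text):
    """Drop empty lines and lines that consist only of dashes."""
    return CLEANUP.sub("", text)

# Hypothetical pasted result with an empty line and a ---- separator.
pasted = (
    "function A(x : word) : word;\n"
    "\n"
    "// ====\n"
    "//  This routine returns ...\n"
    "// ====\n"
    "----\n"
    "function B(x : word) : word;\n"
    "\n"
)

print(clean_pasted(pasted))
```

Comment lines such as `// ====` are left alone because they do not start with a dash.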

                Similar Source:

                function __SkrASMStrLen(Source : ^byte) : word;
                
                // ============================================
                //    function __SkrASMStrLen(Source : ^byte) : word;
                //
                //  This routine returns ...
                //  Note: ...
                //
                // Optimized: xxxx-xx-xx
                // ============================================
                
                begin
                end;
                
                function __SkrASMStrLen2(Source : ^byte) : word;
                
                // ============================================
                //    function __SkrASMStrLen2(Source : ^byte) : word;
                //
                //  This routine returns ...
                //  Note: ...
                //
                // Optimized: xxxx-xx-xx
                // ============================================
                
                begin
                end;
                
                function __SkrASMStrLen3(Source : ^byte) : word;
                
                begin
                end;
                
                function __SkrASMStrLen4(Source : ^byte) : word;
                
                begin
                end;
                

                Final Result:

                function __SkrASMStrLen(Source : ^byte) : word;
                // ============================================
                //    function __SkrASMStrLen(Source : ^byte) : word;
                //
                //  This routine returns ...
                //  Note: ...
                //
                // Optimized: xxxx-xx-xx
                // ============================================
                function __SkrASMStrLen2(Source : ^byte) : word;
                // ============================================
                //    function __SkrASMStrLen2(Source : ^byte) : word;
                //
                //  This routine returns ...
                //  Note: ...
                //
                // Optimized: xxxx-xx-xx
                // ============================================
                function __SkrASMStrLen3(Source : ^byte) : word;
                function __SkrASMStrLen4(Source : ^byte) : word;
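
If the line numbers are needed as well, a small standalone script is one option. The following Python sketch is only an illustration, not a Notepad++ feature: it assumes headers start at the beginning of a line with function or procedure and that the comment box is the run of `//` lines after the header. The name `extract_headers` and the shortened sample are hypothetical.

```python
import re

# Assumption: a routine header starts in column 0 with one of these keywords.
HEADER = re.compile(r"^(?:function|procedure)\b")

def extract_headers(text):
    """Return (line_number, header, comment_lines) for each routine.

    comment_lines is the run of '//' lines that follows the header,
    skipping blank lines in between; the implementation is ignored.
    """
    lines = text.splitlines()
    results = []
    i = 0
    while i < len(lines):
        if HEADER.match(lines[i]):
            line_no = i + 1  # 1-based, as an editor shows it
            j = i + 1
            while j < len(lines) and not lines[j].strip():
                j += 1  # skip blank lines between header and comment box
            comments = []
            while j < len(lines) and lines[j].startswith("//"):
                comments.append(lines[j])
                j += 1
            results.append((line_no, lines[i], comments))
            i = j
        else:
            i += 1
    return results

# Shortened, hypothetical variant of the sample source.
source = """function __SkrASMStrLen(Source : ^byte) : word;

// ====
//  This routine returns ...
// ====

begin
end;

function __SkrASMStrLen3(Source : ^byte) : word;

begin
end;
"""

for line_no, header, comments in extract_headers(source):
    print(f"{line_no}: {header}")
    for comment in comments:
        print(comment)
```

Each header is printed with the line number it has in the source text, followed by its comment box if there is one.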
                
                • T
                  Tormod Skaret 0
                  last edited by Feb 15, 2022, 7:36 PM

                  Thank you,
                  this recipe worked nicely. It makes it possible for me to make a header for my library files without showing the implementation. Since the routines are contained in a Pascal unit, a header needs to contain the routine layout; now both this and the text describing what each routine does are covered. The requirement for line numbers (which can be substituted with page numbers) is to easily find routines again in a printed version, since this library contains more than 80 routines. For those who wonder, the implementation is written in MIPS for PIC32 microcontrollers. Thank you again for the quick and good response.
                  Tormod

                  • A
                    Alan Kilborn @mpheath
                    posted Feb 15, 2022, 8:35 PM, last edited by Alan Kilborn Feb 15, 2022, 8:36 PM

                    @mpheath said in search and extracting text from text file:

                    Press button Copy Marked Text.

                    Press Ctrl+N to open a new tab and press Ctrl+V to paste.

                    The result may have ---- lines separating the captures in the pasted content.

                    This (the ---- lines) will happen when the marked text contains one or more line-endings.

                    If you search for something simple, e.g. foobar and it occurs many times, when you do Copy Marked Text each match will be separated by a line-ending, typically CRLF, and no ---- will appear.

                    However, if a search hit contains a line-ending, then after a Copy Marked Text the matches in the copied text are separated by ----CRLF, so that you can see what is actually part of the match and what is delimiter. In other words, the ---- lines are the delimiter.

                    Example text:

                    fubar
                    
                    foobar
                    
                    foo
                    bar
                    
                    fu
                    bar
                    

                    Use the Mark function and mark the regex (?-s)f[ou].*?bar and press Copy Marked Text, and then Paste elsewhere to obtain:

                    fubar
                    foobar
                    

                    Use the Mark function and mark the regex (?s)f[ou].*?bar and press Copy Marked Text, and then Paste elsewhere to obtain:

                    fubar
                    ----
                    foobar
                    ----
                    foo
                    bar
                    ----
                    fu
                    bar
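
The two results above can be imitated with a short Python sketch. This is an emulation of the behavior as described in this post, not Notepad++'s actual code. The rule that a single multi-line hit switches all separators to ---- is inferred from the example output, and the function name `copy_marked_text` is made up. Trailing line endings are not reproduced.

```python
import re

def copy_marked_text(text, pattern, eol="\r\n"):
    """Emulate the described clipboard result (assumption, not the real code)."""
    hits = [m.group(0) for m in re.finditer(pattern, text)]
    # Inferred rule: if any hit spans a line ending, use ---- as the delimiter.
    multiline = any("\n" in hit or "\r" in hit for hit in hits)
    separator = eol + "----" + eol if multiline else eol
    return separator.join(hits)

example = "fubar\n\nfoobar\n\nfoo\nbar\n\nfu\nbar\n"

# Dot does not match newlines by default in Python, like (?-s).
single_line = copy_marked_text(example, r"f[ou].*?bar", eol="\n")
# (?s) lets the dot cross line endings, so some hits span two lines.
multi_line = copy_marked_text(example, r"(?s)f[ou].*?bar", eol="\n")

print(single_line)
print()
print(multi_line)
```

The `eol="\n"` argument is used here only because the sample string uses `\n`; with a CRLF file the default would apply.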
                    

                    I don’t know that this is expressed in the user manual.

                    The Community of users of the Notepad++ text editor.