    search and extracting text from text file

    9 Posts 4 Posters 12.7k Views
    • Tormod Skaret 0

      I need to search for the text "procedure" and "function" and return the actual text line. In addition, a text box is added after each proc/func as a comment on what the routine does. This should also be extracted. The result should be stored in a new file, including the line number of the routine, or sent to the clipboard for retrieval with Ctrl+V. How can I do this in Notepad++?

      • Alan Kilborn @Tormod Skaret 0

        @tormod-skaret-0

        Maybe a specific example would help illustrate your problem better, and facilitate coming up with a solution? It’s all very vague.

        • Terry R @Tormod Skaret 0

          @tormod-skaret-0 said in search and extracting text from text file:

          The result should be stored in a new file including line number of routine

          You give very little information, no example and yet expect a response to help you. I’m AMAZED!

          Yet here I am responding. First a question.
          You say there is a text box that you also need to locate/return as a part of the result. Are you sure this is a TEXT based file?

          A text box would not appear in a TEXT file, unless it was crudely made using the minus (-) sign, vertical (|) bar and whatever else might help. If it was using these, they would appear on different lines to the data you are looking for.

          I do have an idea, though. It would involve copying the file, then adding the line number to the front of every line. The contents would then be filtered so that only the data you are looking for remains and all other lines are removed. That gives you the line number of each routine along with the content of that line.
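For readers who prefer a script, the number-then-filter idea can be sketched in Python. This is only an illustration under stated assumptions: the helper name `numbered_routine_lines` and the sample text are hypothetical, and the pattern assumes a routine header starts its line with the word "function" or "procedure".

```python
import re

# Assumption: a routine header begins its line with "function" or "procedure".
ROUTINE = re.compile(r"^(?:function|procedure)\b")


def numbered_routine_lines(text):
    """Number every line (1-based), then keep only the routine header lines."""
    numbered = list(enumerate(text.splitlines(), start=1))
    return [f"{n}: {line}" for n, line in numbered if ROUTINE.match(line)]


# Hypothetical sample input, not taken from the original poster's file.
sample = (
    "unit Demo;\n"
    "\n"
    "function Foo(a : byte) : word;\n"
    "begin\n"
    "end;\n"
    "\n"
    "procedure Bar;\n"
    "begin\n"
    "end;\n"
)

result = numbered_routine_lines(sample)
print("\n".join(result))
```

Numbering happens before filtering, so the numbers refer to positions in the original file rather than in the filtered list.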

          That’s about the lot of it, until as Alan says, you provide more information and an example.

          Terry

          • Tormod Skaret 0

            Sorry for the inconvenience, here is a picture:
            324fd07f-a448-4bcf-91bc-f906e4bc83ec-image.png
            What I need to do is make a list of routines (both procedures and functions) with a reference to each one's starting line. As you can see, every routine has text attached explaining what the routine does. This text also needs to be extracted, but not the implementation. The picture shows how every routine is laid out: you see the end of the previous routine and the start of the next procedure. The simplest approach would be to put the result in a new file where it can be edited. The idea is to place all headers and associated text at the top of the same text file, so copying the result from the clipboard is sufficient. Here is the expected layout:
            846c6232-5f48-4da1-a6f8-6ccac4663833-image.png

            • Alan Kilborn @Tormod Skaret 0

              @tormod-skaret-0

              Does this find all of your lines of interest?:

              Find: (?-s)^//\h+function .+
              Search mode: Regular expression
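To try the same pattern outside Notepad++, here is a rough Python translation. Python's `re` module has no `\h` and does not accept a global `(?-s)`, so the sketch writes horizontal whitespace as `[ \t]` and relies on `.` excluding newlines by default. The sample text is hypothetical, modeled on a comment box that repeats the function signature.

```python
import re

# Python rendering of: (?-s)^//\h+function .+
# [ \t] stands in for \h; "." does not match newlines unless re.DOTALL is set.
PATTERN = re.compile(r"^//[ \t]+function .+", re.MULTILINE)

# Hypothetical sample: a header line followed by a comment box.
sample = (
    "function Foo(a : byte) : word;\n"
    "\n"
    "// ====\n"
    "//    function Foo(a : byte) : word;\n"
    "//  This routine returns ...\n"
    "begin\n"
    "end;\n"
)

hits = PATTERN.findall(sample)
print(hits)
```

Only the commented copy of the signature is matched. The uncommented header on the first line does not start with `//`, so this pattern skips it.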

              Also, you are asking people to help you manipulate text, so pasting screenshots is not as good as posting actual text that people could copy/paste to experiment on.

              • Terry R @Tormod Skaret 0

                @tormod-skaret-0 said in search and extracting text from text file:

                Sorry for the inconvenience, here is a picture:

                That makes it so much easier to understand. Although you have provided an example (in an image), could you do the same as actual text? To do so, you need to insert the text within a black box in your post. Select about 2-4 of the functions, preferably adjacent ones, so we can see exactly what a contiguous portion of the file looks like.

                There is a pinned post in the FAQ section called "FAQ Desk: Template for Search/Replace Questions". Please read it before making your next post; it explains how to insert examples as text.

                So my original idea of adding line numbers seems to be on the right track. Once that was done another regular expression would add that line number in the “box” section for each function. Then the last bit of the process would be to remove all code lines leaving just the function name, line number and the associated box explaining what the function does.

                Terry

                • mpheath @Tormod Skaret 0

                  @tormod-skaret-0

                  Source looks like Pascal.

                  Regular expressions are not about line counts; they are about matching patterns. If you want line numbers, you may need something like a script. If line numbers are not required, a regular expression or two might be suitable.

                  Here is a regular expression:

                  1. Press Ctrl+M to display the Mark dialog.
                  2. Check the Regular expression radio button.
                  3. Find what: ^(?:function|procedure).+?\R+(?://.*?\R)*
                  4. Press button Mark All.
                  5. Press button Copy Marked Text.
                  6. Press Ctrl+N to open a new tab and press Ctrl+V to paste.

                  The result may have ---- lines separating the captures in the pasted content. Empty lines may exist too. You may need to do a replacement to remove these from the pasted result.

                  1. Press Ctrl+H to display the Replace dialog (or select the Replace tab in the previous dialog).
                  2. Check the Regular expression radio button.
                  3. Find what: ^\R|^-+\R
                  4. Replace with: (leave empty)
                  5. Press the Replace All button.
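The cleanup replacement can also be tried in Python. This is a sketch under assumptions: Python's `re` has no `\R`, so the line-ending alternatives are spelled out, and the `pasted` text is a hypothetical stand-in for what Copy Marked Text might produce.

```python
import re

# Python rendering of: ^\R|^-+\R
# \R is not available in Python's re module, so it is written out explicitly.
EOL = r"(?:\r\n|\n|\r)"
CLEANUP = re.compile(rf"^{EOL}|^-+{EOL}", re.MULTILINE)

# Hypothetical pasted text containing empty lines and a ---- delimiter line.
pasted = (
    "function A;\n"
    "\n"
    "// note\n"
    "----\n"
    "function B;\n"
    "\n"
)

cleaned = CLEANUP.sub("", pasted)
print(cleaned, end="")
```

The second alternative only removes lines made up entirely of dashes, so a comment line such as `// ----` is left alone.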

                  Similar Source:

                  function __SkrASMStrLen(Source : ^byte) : word;
                  
                  // ============================================
                  //    function __SkrASMStrLen(Source : ^byte) : word;
                  //
                  //  This routine returns ...
                  //  Note: ...
                  //
                  // Optimized: xxxx-xx-xx
                  // ============================================
                  
                  begin
                  end;
                  
                  function __SkrASMStrLen2(Source : ^byte) : word;
                  
                  // ============================================
                  //    function __SkrASMStrLen2(Source : ^byte) : word;
                  //
                  //  This routine returns ...
                  //  Note: ...
                  //
                  // Optimized: xxxx-xx-xx
                  // ============================================
                  
                  begin
                  end;
                  
                  function __SkrASMStrLen3(Source : ^byte) : word;
                  
                  begin
                  end;
                  
                  function __SkrASMStrLen4(Source : ^byte) : word;
                  
                  begin
                  end;
                  

                  Final Result:

                  function __SkrASMStrLen(Source : ^byte) : word;
                  // ============================================
                  //    function __SkrASMStrLen(Source : ^byte) : word;
                  //
                  //  This routine returns ...
                  //  Note: ...
                  //
                  // Optimized: xxxx-xx-xx
                  // ============================================
                  function __SkrASMStrLen2(Source : ^byte) : word;
                  // ============================================
                  //    function __SkrASMStrLen2(Source : ^byte) : word;
                  //
                  //  This routine returns ...
                  //  Note: ...
                  //
                  // Optimized: xxxx-xx-xx
                  // ============================================
                  function __SkrASMStrLen3(Source : ^byte) : word;
                  function __SkrASMStrLen4(Source : ^byte) : word;
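If line numbers are needed, a short script is one option. The Python sketch below is not part of the recipe above. It assumes the same layout as the similar source: a header line starting with "function" or "procedure", optional blank lines, then a run of `//` comment lines. The names `extract_headers` and `format_report` and the sample text are hypothetical.

```python
import re

# Assumption: headers start at column 0 with "function" or "procedure".
HEADER = re.compile(r"^(?:function|procedure)\b")


def extract_headers(text):
    """Return (line_number, header, comment_lines) for each routine found."""
    lines = text.splitlines()
    routines = []
    i = 0
    while i < len(lines):
        if HEADER.match(lines[i]):
            line_no, header = i + 1, lines[i]
            i += 1
            # Skip blank lines between the header and its comment box.
            while i < len(lines) and not lines[i].strip():
                i += 1
            # Collect the contiguous run of // comment lines, if any.
            comments = []
            while i < len(lines) and lines[i].startswith("//"):
                comments.append(lines[i])
                i += 1
            routines.append((line_no, header, comments))
        else:
            i += 1
    return routines


def format_report(routines):
    """Render each header prefixed with its line number, followed by its comments."""
    out = []
    for line_no, header, comments in routines:
        out.append(f"{line_no}: {header}")
        out.extend(comments)
    return "\n".join(out)


source = """function A(x : byte) : word;

// ====
//  Returns ...
// ====

begin
end;

procedure B;

begin
end;
"""

routines = extract_headers(source)
report = format_report(routines)
print(report)
```

The report string could then be written to a new file or pasted at the top of the unit.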
                  
                  • Tormod Skaret 0

                    Thank you, this recipe worked nicely. It makes it possible for me to create a header for my library files without showing the implementation. Since the routines are contained in a Pascal unit, a header needs to contain the routine layout; now both that and the text describing what each routine does are provided. The requirement for line numbers (which can be substituted with page numbers) is to make it easy to find routines in a printed version, since this library contains more than 80 routines. For those who wonder, the implementation is written in MIPS for PIC32 microcontrollers. Thank you again for the quick and good response.
                    Tormod

                    • Alan Kilborn @mpheath

                      @mpheath said in search and extracting text from text file:

                      Press button Copy Marked Text.

                      Press Ctrl+N to open a new tab and press Ctrl+V to paste.

                      The result may have ---- lines separating the captures in the pasted content.

                      This (the ---- lines) will happen when marked text contains a line-ending(s).

                      If you search for something simple, e.g. foobar and it occurs many times, when you do Copy Marked Text each match will be separated by a line-ending, typically CRLF, and no ---- will appear.

                      However, if a search hit contains a line-ending, then after Copy Marked Text the matches in the copied text are separated by ----CRLF, so that you can see what is actually part of a match and what is a delimiter. In other words, the ---- lines are the delimiters.

                      Example text:

                      fubar
                      
                      foobar
                      
                      foo
                      bar
                      
                      fu
                      bar
                      

                      Use the Mark function and mark the regex (?-s)f[ou].*?bar and press Copy Marked Text, and then Paste elsewhere to obtain:

                      fubar
                      foobar
                      

                      Use the Mark function and mark the regex (?s)f[ou].*?bar and press Copy Marked Text, and then Paste elsewhere to obtain:

                      fubar
                      ----
                      foobar
                      ----
                      foo
                      bar
                      ----
                      fu
                      bar
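The delimiter behaviour described here can be imitated in a few lines of Python. This is an emulation of the observed behaviour, not Notepad++'s actual implementation. The function name `copy_marked_text` is hypothetical, `\n` is used instead of CRLF, and the rule "use ---- if any match contains a line-ending" is an assumption inferred from the example output.

```python
import re


def copy_marked_text(text, pattern):
    """Join all matches; use a ---- delimiter line if any match spans lines."""
    matches = [m.group(0) for m in re.finditer(pattern, text)]
    spans_lines = any("\n" in m or "\r" in m for m in matches)
    separator = "\n----\n" if spans_lines else "\n"
    return separator.join(matches)


sample = "fubar\n\nfoobar\n\nfoo\nbar\n\nfu\nbar\n"

# "." excludes newlines by default in Python, which mirrors (?-s).
single_line = copy_marked_text(sample, r"f[ou].*?bar")
# (?s) lets "." cross line-endings, so two matches span two lines each.
multi_line = copy_marked_text(sample, r"(?s)f[ou].*?bar")

print(single_line)
print()
print(multi_line)
```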
                      

                      I'm not sure this is documented in the user manual.
