Extracting multiple lines at once from multiple files



  • Hi,

    I have a lot of files that I need to pull the below code from and extract to a text document. I have already written the regex to find the stuff I need (the text in parentheses). For a single file this is OK, because I just bookmark the matching lines and then remove the unbookmarked lines. But is there a way to do this for multiple files in a folder at a time? Below is an example of a file I’m working with.

    /NDTMRY+Helvetica-Narrow+T42+SQWZGL*1 
    2.97497 5.423 5.03197 0 
    ]xsh
    164 45.1998 mo
    (YES)
    [4.65033 4.65031 0 ]xsh
    17 46.4994 mo
    

    And this is my regex code to find what I need:

    ^\(.*?$

    This bookmarks the (YES) line, and then I just use Search > Bookmark > remove unbookmarked lines. But this only works for one file at a time.
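
    For anyone who would rather script the single-file step, here is a minimal Python sketch of what the bookmark-and-remove routine achieves. It is only an illustration, not something Notepad++ runs: the name keep_paren_lines and the SAMPLE string are hypothetical, and re.MULTILINE is assumed to be the closest equivalent of how ^ and $ behave in the editor.

```python
import re

# The regex from the post; re.MULTILINE lets ^ and $ anchor at every line.
PAREN_LINE = re.compile(r"^\(.*?$", re.MULTILINE)

# Sample content copied from the example file in the post.
SAMPLE = """/NDTMRY+Helvetica-Narrow+T42+SQWZGL*1
2.97497 5.423 5.03197 0
]xsh
164 45.1998 mo
(YES)
[4.65033 4.65031 0 ]xsh
17 46.4994 mo
"""


def keep_paren_lines(text):
    """Return only the lines that start with an opening parenthesis."""
    return PAREN_LINE.findall(text)


print(keep_paren_lines(SAMPLE))  # ['(YES)']
```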

    Thanks !



  • Hello, @acme1235 and All,

    Here is the solution : just find the appropriate regex S/R in order to delete any line which does not begin with an opening parenthesis ( and use the Find in Files panel ( Ctrl + Shift + F ) !


    So, the road map is :

    • Open the Find in Files dialog ( Ctrl + Shift + F )

    • SEARCH (?-s)^[^(].*\R?

    • REPLACE Leave EMPTY

    • Type in the subset of files to process ( *.txt, *.log or else… ), in the Filters zone

    • Type in the complete path of the needed directory, in the Directory zone or select it with the ... button

    • Untick all the box options and select the Regular expression search mode

    • Click on the Replace All button

    • Confirm the Are you sure? dialog

    Voila !
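
    The road map above can also be sketched outside Notepad++. The following Python is a rough equivalent under stated assumptions, not the editor's own behaviour: the names strip_non_paren_lines and process_folder are hypothetical, UTF-8 is assumed as the file encoding, and because Python's re module has no \R the pattern is written with a (?!\() lookahead and \n? instead. Like the Replace All button, it rewrites files in place, so dry_run defaults to True and nothing is written unless you switch it off.

```python
import re
from pathlib import Path

# Assumption: a Python stand-in for the S/R above. "." also matches a carriage
# return in Python, so "\n?" is enough to consume either CRLF or LF.
NOT_PAREN_LINE = re.compile(r"^(?!\().*\n?", re.MULTILINE)


def strip_non_paren_lines(text):
    """Delete every line that does not begin with an opening parenthesis."""
    return NOT_PAREN_LINE.sub("", text)


def process_folder(folder, pattern="*.txt", dry_run=True):
    """Filter every file in `folder` whose name matches `pattern`.

    Returns {file name: filtered text}. Files are only overwritten
    when dry_run is False.
    """
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        # newline="" keeps the original line endings untouched.
        with open(path, newline="", encoding="utf-8") as fh:
            original = fh.read()
        filtered = strip_non_paren_lines(original)
        results[path.name] = filtered
        if not dry_run:
            with open(path, "w", newline="", encoding="utf-8") as fh:
                fh.write(filtered)
    return results
```

    Calling process_folder on a copy of the directory first, with dry_run left at True, gives the same safety net as the single-file test described below.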


    Of course, for more security, I advise you to test my regex S/R against a single file first, by either :

    • Using the Replace dialog ( Ctrl + H ) and ticking the Wrap around option, to only process the current file

    • Typing in a complete file name in the Filters field of the Find in Files dialog

    Of course, you may make a complete backup of the directory to process, too !

    Best Regards

    guy038



  • @guy038 thank you for your help! I was looking at the problem backwards



  • @guy038 said in Extracting multiple lines at once from multiple files:

    Here is the solution : just find the appropriate regex S/R in order to delete any line which does not begin with an opening parenthesis ( and use the Find in Files panel ( Ctrl + Shift + F ) !

    I think there is another, possibly better or at least non-destructive, method. It uses Find in Files, but not to remove lines. I am referring to the Search results window, which is described as follows:
    From Find All in … searches, three types of sections are added to the Search results window. First is a line describing what was searched for, how many total matches (known as “hits”) occurred (this is also shown in the title bar for the window, for the most recently-occurring search), and how many files had matches. Second is a line that shows the filename with the matches and the count of matches for that file (this type will be repeated if the search found multiple files with matches). Last comes the details about the matches found, including line number and the line contents with the matched text emphasized. The default emphasis is red text on a yellow background, but this may be changed in the Style Configurator’s “Search result” Language area.
    The key points are that it provides the filename, the number of hits in that file and the line containing each hit. Of course, this information can then be copied and further edited. As the OP wants that information in one file, I think this process would expedite that, with the side benefit that the original files remain unedited.

    I’m not on a PC at this time to test this further, but I feel confident it is easily achievable.
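
    To illustrate the non-destructive idea, here is a hedged Python sketch that only reads the files and builds a report shaped roughly like the Search results window (file name, hit count, then one row per hit). The names find_in_files and format_report are hypothetical, UTF-8 is assumed, and the layout is an approximation rather than the exact Notepad++ output.

```python
import re
from pathlib import Path

# The OP's regex, applied to one line at a time.
PAREN_LINE = re.compile(r"^\(.*?$")


def find_in_files(folder, pattern="*.txt"):
    """Return {file name: [(line number, line text), ...]} for files with hits.

    The files are opened read-only, so nothing on disk is modified.
    """
    report = {}
    for path in sorted(Path(folder).glob(pattern)):
        hits = []
        with open(path, encoding="utf-8") as fh:
            for number, line in enumerate(fh, start=1):
                line = line.rstrip("\r\n")
                if PAREN_LINE.match(line):
                    hits.append((number, line))
        if hits:  # files without any hit are left out of the report
            report[path.name] = hits
    return report


def format_report(report):
    """Render the report as text: one file row, then one tab-indented row per hit."""
    rows = []
    for name, hits in report.items():
        plural = "s" if len(hits) != 1 else ""
        rows.append(f"{name} ({len(hits)} hit{plural})")
        for number, text in hits:
            rows.append(f"\tLine {number}: {text}")
    return "\n".join(rows)
```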

    Terry



  • @Acme1235 said in Extracting multiple lines at once from multiple files:

    I have a lot of files that I need to pull the below code from and extract to a text document.

    @Acme1235 @guy038 I’ve done some testing and it works as I expected.

    I created 4 files, 3 of which had the (xxx) code in them, with differing text inside each pair of parentheses. I ran the “Find in Files” option with the OP’s regex and the Search results window provided the following:

    Search "^\(.*?$" (3 hits in 3 files of 4 searched)
      C:\Users\terry\Documents\NPP tests\21809\n1.txt (1 hit)
    	Line 5: (YES)
      C:\Users\terry\Documents\NPP tests\21809\n2.txt (1 hit)
    	Line 5: (PICK)
      C:\Users\terry\Documents\NPP tests\21809\n3.txt (1 hit)
    	Line 5: (NZ)
    

    This is then selected and copied to another (empty) tab, and the following regex is run to remove the unwanted data. It’s still a bit rough: I haven’t tried to remove the EOLs as well. It’s just a test, and the process could be streamlined, possibly by incorporating it into a macro to make it a one-click process.
    Find What: (?-is)^.*?Line \d+:\h*(.+)|.*
    Replace With: \1

    and I get

    
    
    (YES)
    
    (PICK)
    
    (NZ)
    

    At this point there are still empty lines, which can easily be removed using Edit, Line Operations, Remove Empty Lines. The result is a single tab containing the data required, which can then be saved as a single file.
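
    The same clean-up can be sketched in Python, as a variation on the regex above rather than a literal port. Python's re module accepts neither a global (?-is) nor \h, so [ \t] is used in place of \h, and the rows are handled one at a time so that no empty lines are left to remove afterwards. The name extract_hits is hypothetical; RESULTS is the search result text shown earlier.

```python
import re

# Case-sensitive matching and "dot does not match newline" are already
# Python's defaults, so no flags are needed here.
HIT_LINE = re.compile(r"^.*?Line \d+:[ \t]*(.+)$")


def extract_hits(search_results):
    """Return the matched text of every 'Line N: ...' row, in order."""
    hits = []
    for row in search_results.splitlines():
        match = HIT_LINE.match(row)
        if match:  # header and file-name rows have no "Line N:" part
            hits.append(match.group(1))
    return hits


# The search results from the post, rebuilt row by row.
RESULTS = "\n".join([
    r'Search "^\(.*?$" (3 hits in 3 files of 4 searched)',
    r"  C:\Users\terry\Documents\NPP tests\21809\n1.txt (1 hit)",
    "\tLine 5: (YES)",
    r"  C:\Users\terry\Documents\NPP tests\21809\n2.txt (1 hit)",
    "\tLine 5: (PICK)",
    r"  C:\Users\terry\Documents\NPP tests\21809\n3.txt (1 hit)",
    "\tLine 5: (NZ)",
]) + "\n"

print("\n".join(extract_hits(RESULTS)))  # prints (YES), (PICK) and (NZ), one per line
```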

    Terry



  • Hi, @acme1235, @terry-r and All,

    Ah…, yes, clever method, Terry ;-)

    And from your search results panel, here are two other possible layouts :

    • With the regex S/R :

      • SEARCH (?-s)\(1\x20hit\)\R

      • REPLACE Leave EMPTY

    we would obtain :

    Search "^\(.*?$" (3 hits in 3 files of 4 searched)
      C:\Users\terry\Documents\NPP tests\21809\n1.txt 	Line 5: (YES)
      C:\Users\terry\Documents\NPP tests\21809\n2.txt 	Line 5: (PICK)
      C:\Users\terry\Documents\NPP tests\21809\n3.txt 	Line 5: (NZ)
    
    • with the regex S/R :

      • SEARCH (?-s)^Search.+\R|\x20\(1\x20hit\)\R

      • REPLACE Leave EMPTY

    we would obtain :

      C:\Users\terry\Documents\NPP tests\21809\n1.txt	Line 5: (YES)
      C:\Users\terry\Documents\NPP tests\21809\n2.txt	Line 5: (PICK)
      C:\Users\terry\Documents\NPP tests\21809\n3.txt	Line 5: (NZ)
    
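
    For comparison, the two layouts above might be reproduced in Python along these lines. This is only a sketch: the function names are hypothetical, \r?\n stands in for \R (which Python's re module does not have), and, like the regexes above, it only handles files reporting exactly "(1 hit)".

```python
import re

# The search results from Terry's post, rebuilt row by row.
RESULTS = "\n".join([
    r'Search "^\(.*?$" (3 hits in 3 files of 4 searched)',
    r"  C:\Users\terry\Documents\NPP tests\21809\n1.txt (1 hit)",
    "\tLine 5: (YES)",
    r"  C:\Users\terry\Documents\NPP tests\21809\n2.txt (1 hit)",
    "\tLine 5: (PICK)",
    r"  C:\Users\terry\Documents\NPP tests\21809\n3.txt (1 hit)",
    "\tLine 5: (NZ)",
]) + "\n"


def join_file_and_hit(results):
    """Layout 1: drop '(1 hit)' and the line break after it,
    so each file name shares a row with its hit."""
    return re.sub(r"\(1 hit\)\r?\n", "", results)


def join_without_header(results):
    """Layout 2: also drop the 'Search ...' header row and the
    space in front of '(1 hit)'."""
    return re.sub(r"^Search.+\r?\n| \(1 hit\)\r?\n", "", results,
                  flags=re.MULTILINE)


print(join_file_and_hit(RESULTS))
print(join_without_header(RESULTS))
```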

    Best Regards,

    guy038



  • I’ll point this out here, because maybe a lot of people don’t know it. It pulls the same data @Terry-R obtained, but makes some of the steps he described unnecessary.

    Start with the Search results window as Terry showed it:

    [Screenshot: the Search results window]

    Right click anywhere in the “Search” line, for example where I show the yellow dot here:

    [Screenshot: the “Search” line of the results, with a yellow dot marking where to right-click]

    The context menu will appear where you’ll want to choose “Copy Selected Line(s)”:

    [Screenshot: the Search results context menu]

    Side note: It really is a poor name, because nothing was selected before running the command. :(

    After running that command, you’ll have the following in the clipboard:

    (YES)
    (PICK)
    (NZ)
    

    In a nutshell what the “Copy Selected Line(s)” command does is copy only the data-lines of the search result – no search info, no filename info, no “Line xx:” prefix.

    Where the “selected” part comes in is if you do make a selection before running the command, by right-clicking somewhere on the selection, for example:

    [Screenshot: a selection made in the Search results window]

    Then your result (in the clipboard) would be just from those lines selected:

    (YES)
    (PICK)
    


  • Hello @acme1235, @terry-r, @alan-kilborn and All,

    My bad ! I just pasted @terry-r’s literal text to test my regexes, so I couldn’t get the Search results context menu options. Of course, Alan, your N++ solution is the simplest and the best one ;-))

    Anyway, I suppose that @acme1235 should be pleased with the many ways to display results !

    BR

    guy038