Extracting multiple lines at once from multiple files

  • A
    Acme1235
    last edited by Sep 10, 2021, 3:01 PM

    Hi,

    I have a lot of files from which I need to pull lines like the one below and extract them to a text document. I have already written the regex to find what I need (the text in parentheses). For a single file this is fine, because I just bookmark the matching lines and then remove the unbookmarked lines. But is there a way to do this for multiple files in a folder at once? Below is an example of a file I’m working with.

    /NDTMRY+Helvetica-Narrow+T42+SQWZGL*1 
    2.97497 5.423 5.03197 0 
    ]xsh
    164 45.1998 mo
    (YES)
    [4.65033 4.65031 0 ]xsh
    17 46.4994 mo
    

    And this is my regex code to find what I need:

    ^\(.*?$

    This bookmarks the (YES) line, and then I use Search > Bookmark to remove the unbookmarked lines. But this only works for one file at a time.

    Thanks !

    • G
      guy038
      last edited by guy038 Sep 10, 2021, 5:24 PM Sep 10, 2021, 4:27 PM

      Hello, @acme1235 and All,

      Here is the solution: find the appropriate regex S/R to delete any line which does not begin with an opening parenthesis (, and run it from the Find in Files panel ( Ctrl + Shift + F )!


      So, the road map is :

      • Open the Find in Files dialog ( Ctrl + Shift + F )

      • SEARCH (?-s)^[^(].*\R?

      • REPLACE Leave EMPTY

      • Type in the subset of files to process ( *.txt, *.log or similar ), in the Filters zone

      • Type in the complete path of the needed directory, in the Directory zone or select it with the ... button

      • Untick all the checkbox options

      • Click on the Replace All button

      • Confirm in the Are you sure? dialog

      Voila !


      Of course, to be safe, I advise you to first test my regex S/R against a single file, by either:

      • Using the Replace dialog ( Ctrl + H ) and ticking the Wrap around option, to process only the current file

      • Typing a complete file name in the Filters field of the Find in Files dialog

      Of course, you may make a complete backup of the directory to process, too !

      Best Regards

      guy038
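For readers who would rather leave the source files untouched, the same filter (keep only the lines that begin with an opening parenthesis) can also be sketched outside Notepad++. This is a minimal Python sketch, not part of the original answer: the function name `extract_paren_lines` and the default `*.txt` filter are assumptions chosen for illustration, and the pattern is the OP's own `^\(.*?$`.

```python
import re
from pathlib import Path

# The OP's pattern: a whole line that starts with "(".
LINE_RE = re.compile(r"^\(.*?$", re.MULTILINE)


def extract_paren_lines(folder, pattern="*.txt"):
    """Collect every line starting with "(" from the matching files in folder.

    The files are only read, never modified. Hypothetical helper for illustration.
    """
    hits = []
    for path in sorted(Path(folder).glob(pattern)):
        # errors="replace" keeps the sketch from failing on odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        hits.extend(LINE_RE.findall(text))
    return hits
```

Calling `extract_paren_lines` on the folder and writing `"\n".join(...)` of the result to a new file would give the single text document the OP asked for.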

      • A
        Acme1235 @guy038
        last edited by Sep 11, 2021, 8:04 PM

        @guy038 thank you for your help! I was looking at the problem backwards

        • T
          Terry R @guy038
          last edited by Sep 11, 2021, 10:25 PM

          @guy038 said in Extracting multiple lines at once from multiple files:

          Here is the solution: find the appropriate regex S/R to delete any line which does not begin with an opening parenthesis (, and run it from the Find in Files panel ( Ctrl + Shift + F )!

          I think there is another, possibly better, or at least non-destructive, method. It uses Find in Files, but not to remove lines. I refer to the Search results window, which is described as follows:

          From Find All in … searches, three types of sections are added to the Search results window. First is a line describing what was searched for, how many total matches (known as “hits”) occurred (this is also shown in the title bar for the window, for the most recently-occurring search), and how many files had matches. Second is a line that shows the filename with the matches and the count of matches for that file (this type will be repeated if the search found multiple files with matches). Last comes the details about the matches found, including line number and the line contents with the matched text emphasized. The default emphasis is red text on a yellow background, but this may be changed in the Style Configurator’s “Search result” Language area.

          The key points are that it provides the filename, the number of hits in that file, and the line containing each hit. Of course, this information can then be copied and edited further. As the OP wants that information in one file, I think this process would expedite that, with the side benefit that the original files remain unedited.

          I’m not at a PC right now to test this further, but I feel confident it is easily achievable.

          Terry

          • T
            Terry R @Acme1235
            last edited by Sep 12, 2021, 2:40 AM

            @Acme1235 said in Extracting multiple lines at once from multiple files:

            I have a lot of files that I need to pull the below code from and extract to a text document.

            @Acme1235 @guy038 I’ve done some testing and it works as I expected.

            I created 4 files, 3 of which had the (xxx) code in them, with different text inside each pair of parentheses. I ran the “Find in Files” option with the OP’s regex and the Search results window provided the following:

            Search "^\(.*?$" (3 hits in 3 files of 4 searched)
              C:\Users\terry\Documents\NPP tests\21809\n1.txt (1 hit)
            	Line 5: (YES)
              C:\Users\terry\Documents\NPP tests\21809\n2.txt (1 hit)
            	Line 5: (PICK)
              C:\Users\terry\Documents\NPP tests\21809\n3.txt (1 hit)
            	Line 5: (NZ)
            

            This is then selected and copied to another (empty) tab, and the following regex is run to remove the unwanted data. It’s still a bit rough: I haven’t tried to remove the EOLs as well. It’s just a test, and the process could be streamlined, possibly by incorporating it into a macro for a one-click process.
            Find What: (?-is)^.*?Line \d+:\h*(.+)|.*
            Replace With: \1

            and I get

            
            
            (YES)
            
            (PICK)
            
            (NZ)
            

            At this point there are still empty lines, which can easily be removed using Edit > Line Operations > Remove Empty Lines. The result is a single tab containing the required data, which can then be saved as a single file.

            Terry
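The clean-up step above (keep only what follows each "Line N:" prefix, drop everything else) can also be sketched in Python for anyone who pastes the Search results text into a script. This is a rough sketch under stated assumptions: the pasted text follows the layout shown in this post, and the name `hits_from_search_results` is hypothetical.

```python
import re

# A detail line such as "<tab>Line 5: (YES)"; group 1 is the line contents.
HIT_RE = re.compile(r"^\s*Line \d+:[ \t]*(.+)$")


def hits_from_search_results(results_text):
    """Return only the matched line contents from pasted Search results text."""
    hits = []
    for line in results_text.splitlines():
        m = HIT_RE.match(line)
        if m:  # the "Search ..." line and the file name lines are skipped
            hits.append(m.group(1))
    return hits


# Sample text in the layout shown above (indentation approximated with spaces).
SAMPLE = r'''Search "^\(.*?$" (3 hits in 3 files of 4 searched)
  C:\Users\terry\Documents\NPP tests\21809\n1.txt (1 hit)
    Line 5: (YES)
  C:\Users\terry\Documents\NPP tests\21809\n2.txt (1 hit)
    Line 5: (PICK)
  C:\Users\terry\Documents\NPP tests\21809\n3.txt (1 hit)
    Line 5: (NZ)
'''

print(hits_from_search_results(SAMPLE))  # ['(YES)', '(PICK)', '(NZ)']
```

Because unmatched lines are skipped rather than replaced with nothing, no empty lines are left behind to remove afterwards.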

            • G
              guy038
              last edited by guy038 Sep 12, 2021, 9:25 AM Sep 12, 2021, 9:21 AM

              Hi, @acme1235, @terry-r and All,

              Ah…, yes, clever method, Terry ;-)

              And from your search results panel, here are two other possible layouts :

              • With the regex S/R :

                • SEARCH (?-s)\(1\x20hit\)\R

                • REPLACE Leave EMPTY

              we would obtain :

              Search "^\(.*?$" (3 hits in 3 files of 4 searched)
                C:\Users\terry\Documents\NPP tests\21809\n1.txt 	Line 5: (YES)
                C:\Users\terry\Documents\NPP tests\21809\n2.txt 	Line 5: (PICK)
                C:\Users\terry\Documents\NPP tests\21809\n3.txt 	Line 5: (NZ)
              
              • with the regex S/R :

                • SEARCH (?-s)^Search.+\R|\x20\(1\x20hit\)\R

                • REPLACE Leave EMPTY

              we would obtain :

                C:\Users\terry\Documents\NPP tests\21809\n1.txt	Line 5: (YES)
                C:\Users\terry\Documents\NPP tests\21809\n2.txt	Line 5: (PICK)
                C:\Users\terry\Documents\NPP tests\21809\n3.txt	Line 5: (NZ)
              

              Best Regards,

              guy038
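Note that both regexes above match the literal text "(1 hit)", so a file with several hits would keep its own header line. As a hedged sketch of the second layout that copes with any hit count, the following Python pairs each file name with every one of its "Line N:" entries. The function name `one_row_per_hit` is hypothetical, and the input is assumed to be Search results text pasted in the layout shown in this thread.

```python
import re

# File header line, e.g. "  C:\dir\n1.txt (1 hit)" or "... (3 hits)".
FILE_RE = re.compile(r"^\s*(.+?) \(\d+ hits?\)$")
# Detail line, e.g. "<tab>Line 5: (YES)"; group 1 keeps the "Line N:" prefix.
HIT_RE = re.compile(r"^\s*(Line \d+:.*)$")


def one_row_per_hit(results_text):
    """Return rows of the form "<file path><tab>Line N: <contents>"."""
    rows = []
    current_file = None
    for line in results_text.splitlines():
        if line.startswith("Search "):
            continue  # summary line, not needed in this layout
        hit = HIT_RE.match(line)
        if hit and current_file is not None:
            rows.append(f"{current_file}\t{hit.group(1)}")
            continue
        header = FILE_RE.match(line)
        if header:
            current_file = header.group(1)
    return rows
```

Joining the returned rows with newlines gives one tab-separated row per hit, which is easy to paste into a spreadsheet.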

              • A
                Alan Kilborn @Terry R
                last edited by Alan Kilborn Sep 12, 2021, 1:18 PM Sep 12, 2021, 1:17 PM

                I’ll point this out here, because maybe a lot of people don’t know it. And it seems to pull the same data @Terry-R obtained, but makes some of the steps he described unnecessary.

                Start with the Search results window as Terry showed it:

                [Screenshot: the Search results window]

                Right click anywhere in the “Search” line, for example where I show the yellow dot here:

                [Screenshot: the Search results window with a yellow dot on the “Search” line]

                The context menu will appear where you’ll want to choose “Copy Selected Line(s)”:

                [Screenshot: the context menu showing “Copy Selected Line(s)”]

                Side note: It really is a poor name, because nothing was selected before running the command. :(

                After running that command, you’ll have the following in the clipboard:

                (YES)
                (PICK)
                (NZ)
                

                In a nutshell what the “Copy Selected Line(s)” command does is copy only the data-lines of the search result – no search info, no filename info, no “Line xx:” prefix.

                Where the “selected” part comes in is if you do make a selection before running the command, by right-clicking somewhere on the selection, for example:

                [Screenshot: the Search results window with some lines selected]

                Then your result (in the clipboard) would be just from those lines selected:

                (YES)
                (PICK)
                
                • G
                  guy038
                  last edited by Sep 12, 2021, 5:28 PM

                  Hello @acme1235, @terry-r, @alan-kilborn and All,

                  My bad! I just pasted @terry-r’s literal text to test my regexes, so I couldn’t get the Search results context options. Of course, Alan, your N++ solution is the simplest and the best one ;-))

                  Anyway, I suppose that @acme1235 should be pleased with the many ways to display results !

                  BR

                  guy038
