Removing Unwanted Columns



  • Hi all, I have a file with four columns of text such as;

    Server-name IP-address Date Time
    example-svr 10.10.1.0 2018-01-01 12:44:42.350

    I only want to retain column two with the IP address.
    How can I do that?



  • @Wynford-Junior-Thomas

    I would not use an editor. I would use a scripting language instead. Here is one example of how simple the task is using GNU’s version of AWK :

    gawk "{print $2}" somefilename > resultingfilename
    
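    For readers without gawk installed, the same idea can be sketched in Python. This is only an illustration, not part of the original suggestion: the function name second_column and the sample lines are made up for the example, and it assumes fields are separated by spaces or tabs, as in the question.

```python
def second_column(lines):
    """Yield the second whitespace-separated field of each line, like awk's $2."""
    for line in lines:
        fields = line.split()  # split on runs of whitespace, as awk does by default
        # awk prints an empty string when field 2 is missing; mirror that here
        yield fields[1] if len(fields) > 1 else ""


sample = [
    "example-svr 10.10.1.0 2018-01-01 12:44:42.350",
    "my_server 10.255.255.57 2018-01-01 12:44:42.350",
]
for value in second_column(sample):
    print(value)
```

    To run it against a real file, pass the open file object instead of the sample list.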

    You can do it in the editor by performing a search and replace using regular expressions, but I’m not proficient enough with them to provide a good example for you. I’m sure someone will in short order, though…



  • Hello, @wynford-junior-thomas, and All,

    Although gawk seems the right tool for basic operations like yours, you may, of course, use the N++ search/replace feature with the regular expression search mode. Just follow these simple steps :

    • Open your file in Notepad++

    • Open the Replace dialog ( Ctrl + H )

    • Check the two options Wrap around and Regular expression

    • Type in the regex (?-s)^.+?\x20([0-9.]+).+ , in the Find what: zone

    • Type in the regex \1 , in the Replace with: zone

    • Click on the Replace All button

    Et voilà !


    So, from an initial example, as below :

    example-svr 10.10.1.0 2018-01-01 12:44:42.350
    my_server 10.255.255.57 2018-01-01 12:44:42.350
    Super_server 201.150.1.0 2018-01-01 12:44:42.350
    SVR_007 190.168.1.12 2018-01-01 12:44:42.350
    

    You’ll get the modified text, below :

    10.10.1.0
    10.255.255.57
    201.150.1.0
    190.168.1.12
    

    Notes :

    • The (?-s) part means that the dot, . represents a single standard character, not an End of Line character

    • The ^ represents the beginning of line location

    • Then the part .+? is the smallest range of standard characters before a space character, \x20

    • Now, the ([0-9.]+) part represents an IPv4 address ( any digit or the dot, repeated one or more times ). As it is surrounded by parentheses, this IP address is stored as group 1

    • Finally, the .+ part stands for the remainder of the current line

    • In replacement, each matched line is replaced by the IPv4 address only ( \1 )
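    The search/replace described above can also be tried outside Notepad++. The following is a minimal Python sketch, under the assumption that Python's re module is an acceptable stand-in for the N++ regex engine here: the leading (?-s) is dropped because Python's dot already stops at line ends by default, and re.MULTILINE is added so that ^ matches at every line start. The name keep_ip_column is hypothetical.

```python
import re

# Same pattern as in the Find what: zone, minus the leading (?-s).
# In Python the dot does not match a newline unless re.DOTALL is set.
PATTERN = re.compile(r"^.+?\x20([0-9.]+).+", re.MULTILINE)


def keep_ip_column(text):
    """Replace each matching line with group 1, the run of digits and dots."""
    return PATTERN.sub(r"\1", text)


text = (
    "example-svr 10.10.1.0 2018-01-01 12:44:42.350\n"
    "my_server 10.255.255.57 2018-01-01 12:44:42.350\n"
    "Super_server 201.150.1.0 2018-01-01 12:44:42.350\n"
    "SVR_007 190.168.1.12 2018-01-01 12:44:42.350"
)
print(keep_ip_column(text))
```

    Lines that do not match, such as a header row where the second column is not made of digits and dots, are left unchanged.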

    Best Regards,

    guy038



  • Using the NppExec plugin to
    display only the second column from a data file:
    cmd /c for /f “tokens=2” %a in (L:\SomeDataFile.txt) do @echo %a



  • If anyone tries Gogo’s solution, take care when using copy and paste on the provided cmd /c ... line. This initially threw me as I would get this error:

    [Screenshot of the error message, hosted on Imgur]

    Removing the copied double-quotes and retyping them solved the problem–just an FYI.

    Also, changing the command line to this may be more useful than a hardcoded path:

    cmd /c for /f "tokens=2" %a in ($(FULL_CURRENT_PATH)) do @echo %a

    For those that don’t have the NppExec plugin, this works just as well from the Run menu’s -> Run… -> The Program to Run box, leaving the results in the clipboard for easy pasting:

    cmd /c "cmd /c for /f "tokens=2" %a in ($(FULL_CURRENT_PATH)) do @echo %a" | clip



  • Of course, all of that “breaks” if the pathname of the file it is run upon contains one or more space characters. I’m not going to try to fix it.

    I was once a skilled batch file expert (even had a few things published a LONG time ago in magazines–remember those?), but I’ve long since lost interest due to the huge number of rules and special cases that no one can rightly remember.

    If someone else wants to tackle fixing it, I gratefully pass the gauntlet… :-D



  • I was curious, so spent the 10 minutes experimenting and researching.

    For npp_exec:

    cmd /c for /f "usebackq tokens=2" %a in ("$(FULL_CURRENT_PATH)") do @echo %a
    

    for run-menu:

    cmd /c "cmd /c for /F "usebackq tokens=2" %a in ("$(FULL_CURRENT_PATH)") do @echo %a" | clip
    

    Normally, to get Windows to recognize a filename containing a space, you just have to put quotes around it: "$(FULL_CURRENT_PATH)". However, the FOR /F ... syntax uses the ("string here") notation to process the contents of the string, rather than the contents of the file. With the usebackq option, the string-quote character is changed from "" to '', so the double-quotes go back to their usual Windows purpose of allowing filenames with spaces.

    I figured this out from the output of help for on the cmd.exe command line, but it wasn’t very well explained. I then checked where I should have in the first place, and saw that ss64.com’s handy reference explicitly says in the usebackq section, “This option is useful when dealing with a filename that is a long filename containing spaces, it allows you to put double quotes around the filename”.
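    For comparison, here is a rough Python sketch of what the for /f "tokens=2" loop does, run against a file whose path contains spaces. It is an approximation, not a faithful reimplementation of cmd.exe: the function name for_f_tokens and the file name "Some Data File.txt" are invented for the example, and skipping lines that have fewer tokens than requested is an assumption of this sketch. It does follow the documented defaults of splitting on spaces and tabs, skipping blank lines, and skipping lines that start with a semicolon.

```python
import os
import re
import tempfile


def for_f_tokens(path, token=2):
    """Return the Nth space/tab-separated token of each usable line in the file."""
    results = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.rstrip("\r\n")
            stripped = line.strip(" \t")
            # Skip blank lines and lines starting with ';' (the default eol character)
            if not stripped or line.startswith(";"):
                continue
            tokens = re.split(r"[ \t]+", stripped)  # runs of delimiters count as one
            if len(tokens) >= token:  # assumption: too-short lines are skipped
                results.append(tokens[token - 1])
    return results


# A path with spaces needs no special quoting when passed as a Python string.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "Some Data File.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("; a comment line\n\n")
        handle.write("example-svr 10.10.1.0 2018-01-01 12:44:42.350\n")
        handle.write("my_server 10.255.255.57 2018-01-01 12:44:42.350\n")
    demo = for_f_tokens(path)

print(demo)
```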



  • I’d simply replace all the <sp> with “,”<sp>, then save it as a .csv file and import it into Excel. Delete the columns you don’t want, export it again as a .csv file, and rename it to .txt. Of course, if you don’t have Excel, you can’t do this…



  • @Chip-Cooper

    I call a FOUL on Mr. Chip! Nobody here wants an over-complicated Excel solution–Ugh! The best we tolerate is an over-complicated (within) Notepad++ (or its plugins) solution! And we’ve certainly got those in abundance! :-D



  • lol, yeah I noticed that… it’s all good! I’ve been doing this kind of stuff for years… I was thinking of a script and then this popped into my head. I was in the same mindset. :-) I can’t even think of all the times I did things in an over complicated way because I had been coding… One could even make a macro to do it…



  • Inspired by the “excel” suggestion (or otherwise “favorite open-source spreadsheet”), but wanting to replicate that inside NPP itself: use the suggestions from another recent thread to get them into aligned columns in NPP, then use the column-mode editing to cut out whichever columns aren’t wanted (or to re-order columns, or similar). :-)

    But since the OP is probably long gone by now… I probably shouldn’t spend any more brain-time on this one. :-)

