I want to split my text file into 1332 different files at specific points and save them in a single folder
-
I have a file containing 1332 cif molecules. I want to convert them into 1332 separate cif molecule files in a single folder.
-
You need a programming language and some coding skills in order to do that.
It is not something that Notepad++ can help you with.
Sorry and good luck. -
Hello, @abhishek-sharma, @alan-kilborn and All,
You said :
I have a file containing 1332 cif molecules. i want to convert them into 1332 different cif molecule files in a single folder.
Well, if each cif molecule corresponds to a single text line, here is a very easy solution :
- Rename your initial file as File_All.txt
- Create a new folder on your laptop
- Move to this new folder
- Copy the File_All.txt file into this folder
- Now, from this link https://code.google.com/archive/p/gnu-on-windows/downloads , download the gawk-4.1.0-bin.zip archive
- Double-click on this archive
- Extract the single file gawk.exe into this folder
- Open a CMD prompt window in this folder
- Type in the command gawk "BEGIN {n=0} {n++ ; print > \"File_\"n\".txt\"}" File_All.txt and press Enter
- After a while, when finished, you should get 1332 files ( from File_1.txt to File_1332.txt ), each containing one line !
Now, @abhishek-sharma, if each cif molecule is NOT a single line, just show us your file containing the 1332 cif molecules, or part of this file !
Of course, I'm going to get some flak, from @alan-kilborn, for showing something that does not concern Notepad++ ( so off-topic ), but hey, the solution is so simple !
Best Regards,
guy038
P.S. :
Anyone can easily test my solution ! For example, if File_All.txt contains 17 lines, after execution, you’ll get 17 new files, from File_1.txt to File_17.txt
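For readers who would rather not download gawk, the same one-line-per-file split can be sketched in Python. This is only a minimal sketch and not part of the original answer: the function name split_lines, the out_dir parameter and the UTF-8 encoding are assumptions made for illustration.

```python
from pathlib import Path


def split_lines(source: Path, out_dir: Path, prefix: str = "File_") -> int:
    """Write each line of `source` to its own numbered file in `out_dir`.

    Returns the number of files written. The names mirror the gawk
    one-liner: File_1.txt, File_2.txt, ...
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    # Assumption: the input is UTF-8 text; adjust the encoding if needed.
    with source.open(encoding="utf-8") as handle:
        for count, line in enumerate(handle, start=1):
            target = out_dir / f"{prefix}{count}.txt"
            # Normalise the line ending so every output file ends cleanly.
            target.write_text(line.rstrip("\r\n") + "\n", encoding="utf-8")
    return count


if __name__ == "__main__":
    src = Path("File_All.txt")
    if src.exists():
        print(split_lines(src, Path(".")), "files written")
```

Run from the folder that holds File_All.txt, this should leave the numbered files next to it, much like the gawk command does.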
-
-
@guy038 said :
Of course, I'm going to get some flak, from @alan-kilborn, for showing something that does not concern Notepad++, ( so off-topic )
Here’s your flak. As promised. :-)
We can’t devolve into solving everyone’s non-N++ problem.
If we do, it becomes uninteresting for those that want to see Notepad++ problems presented and solved here. -
@guy038 People come and ask here because they don’t know where else to ask, so please do help people like us even if it is unrelated to Notepad++ and you know the solution.
-
Please don’t answer cookie baking questions.
If it’s borderline about text editing, like the question above, then the best thing for the original poster is to be told that “Notepad++ cannot do this; a tool that might be able to is XYZ, but you will have to look elsewhere to find help for using that tool, because this is a Notepad++ forum, not an XYZ forum.”
If it’s possible to answer in 1-2 sentences about the other tool, then as long as it’s presented with a caveat, “but further questions should be directed to an XYZ forum”, it will usually be tolerated. And with @guy038’s thousands of helpful Notepad++-specific answers, we just politely point out to him when he’s pushing the boundaries (like this time).
But this is a Notepad++ forum, not a general computer-help forum, and not even a text-transformation forum. Please keep questions and answers on topic.
-
I was reading through @guy038’s reply and got excited when I read “very easy solution” and so continued to read until I saw …gnu…download the gawk-4.1.0-bin.zip…
I’ll be a bit more sneaky and do this nearly all in Notepad++.
Step 1 - Do a search/replace (in Regular expression mode) with:
Search: ^
Replace: >>"%XFILE%" echo.
Step 2 - Do another search/replace with:
Search: (?-i)^(>>"%XFILE%" echo.data_(.+))
Replace: set XFILE=data_\2.cif\r\necho Generating %XFILE%\r\n\1
Step 3 - Add one line at the top of the file with @echo off and then save the file as filename.bat where filename is something you pick.
Step 4 - Run the newly created batch file. It should generate 1332 separate files named data_something.cif where something is the data_... name.
Explanation - I made the assumption that each molecule starts with a data_... line, and I am using those lines to generate the file name for each molecule. The search/replace steps convert the file from .cif file format into a batch script that generates the .cif file content.
-
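The idea in @mkupper's steps (start a new file at every data_... line and name it after that line) can also be expressed directly as a short script. The following is a minimal Python sketch under the same assumption that every molecule begins with a data_... line; the function name split_cif, the UTF-8 encoding and the replacement of characters that Windows does not allow in file names are illustrative assumptions, not something stated in the thread.

```python
import re
from pathlib import Path

# Assumption: replace whitespace and characters Windows forbids in file names.
FORBIDDEN = re.compile(r'[\\/:*?"<>|\s]')


def split_cif(source: Path, out_dir: Path) -> list[str]:
    """Split `source` into one .cif file per data_... block.

    Lines before the first data_... line are skipped. Blocks that share a
    name are merged into one file, similar to the batch script's `>>` append.
    Returns the block names in the order they were first seen.
    """
    blocks: dict[str, list[str]] = {}
    current = None
    with source.open(encoding="utf-8") as handle:
        for line in handle:
            # Case-sensitive match, like the (?-i) in the regex above.
            if line.startswith("data_") and len(line.strip()) > len("data_"):
                current = FORBIDDEN.sub("_", line.strip())
                blocks.setdefault(current, [])
            if current is not None:
                blocks[current].append(line)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, lines in blocks.items():
        (out_dir / f"{name}.cif").write_text("".join(lines), encoding="utf-8")
    return list(blocks)
```

A call such as split_cif(Path("all_molecules.cif"), Path("split")) would then write data_something.cif files into the split folder; both names here are placeholders.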
@mkupper said:
I’ll be a bit more sneaky and do this nearly all in Notepad++.
And to me this is much more palatable of a solution.
Although it doesn’t entirely use Notepad++, it uses something external to Notepad++ that is present on every Windows PC; specifically CMD.exe, which is used to “run the newly created batch file”.
We don’t want an extensive discussion of batch/CMD here, however. -
Hi, @abhishek-sharma, @alan-kilborn, @peterjones, @dr-ramaanand, @mkupper and All,
@peterjones, you said :
…we just politely point out to him when he’s pushing the boundaries (like this time).
Peter, I do thank you for your phrasing !
Note that I could have said, in my first post :
- Hit the F5 key
- Paste cmd /c gawk "BEGIN {n=0} {n++ ; print > \"File_\"n\".txt\"}" File_All.txt in the input field
- Click on the Run button
On the other hand, I could have created a batch file, from within N++, called Split.bat :
@echo off
cmd /c gawk "BEGIN {n=0} {n++ ; print > \"File_\"n\".txt\"}" File_All.txt
and then used the Run > Run... option and pasted either cmd /c Split.bat or cmd /c $(FILE_NAME) !
Now, I’m going to tease you a little ! In these three posts, below, you’re actually using an approach similar to mine :
https://community.notepad-plus-plus.org/post/26531
https://community.notepad-plus-plus.org/post/46881
https://community.notepad-plus-plus.org/post/50945
I grant you that, in the first post, you began your post with :
This is not a coding help forum.
And, regarding the second post, I understand that you were testing if the output of a long_line_text file was correct or not on a printer device !
Now, here is a solution which only uses the cmd command within N++ and, thus, would keep its on-topic status !
- Let’s suppose this INPUT text, pasted in the File_All.txt file :
Here is a small text to test if my batch file works as expected
- Copy File_All.txt as File_For_Each_Line.bat
- Open the File_For_Each_Line.bat file in Notepad++
- Add an empty line at the very beginning ( IMPORTANT )
- Choose the Language > B > Batch menu option
- Choose the Encoding > Convert to ANSI menu option ( IMPORTANT )
- Open the Replace dialog ( Ctrl + H )
- Untick all the box options
- FIND (?-s)(.+)|\A\R
- REPLACE (?1set /a NUM+=1 & echo \1 > file_%NUM%.txt:@echo OFF\r\nchcp 1252 > NUL\r\nset NUM=1\r\n)
- Tick the Wrap around option
- Select the Regular expression search mode
- Click once on the Replace All button
- Save the modifications of the File_For_Each_Line.bat file
Now :
- Hit the F5 key
- Paste cmd /c File_For_Each_Line.bat in the input field
- Click on the Run button
or
- Open a DOS command prompt
- Type File_For_Each_Line.bat and hit the Enter key
=> You should get, in the current directory, eight new files, from file_1.txt to file_8.txt, each containing one line of the initial File_All.txt
Voila !
Best Regards,
guy038
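To make the effect of the FIND / REPLACE step above easier to see, here is a small Python sketch that builds roughly the same batch text from a list of input lines. It is an illustration only: build_batch is a hypothetical helper, it drops empty lines (the regex in the post leaves them in place), and it always ends the script with a line break.

```python
def build_batch(lines: list[str]) -> str:
    """Return batch-script text similar to what the Replace All step produces.

    The three header lines stand in for the empty first line; every
    non-empty input line becomes one `set /a ... & echo ...` command.
    """
    header = ["@echo OFF", "chcp 1252 > NUL", "set NUM=1"]
    body = [
        f"set /a NUM+=1 & echo {line} > file_%NUM%.txt"
        for line in lines
        if line  # assumption: empty lines are simply dropped in this sketch
    ]
    # Batch files conventionally use Windows (CRLF) line endings.
    return "\r\n".join(header + body) + "\r\n"


if __name__ == "__main__":
    print(build_batch(["first line", "second line"]), end="")
```

Since cmd expands %NUM% when it reads a line, before set /a runs on that line, the first generated command should write to file_1.txt even though NUM is incremented on the same line.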
-
-
@guy038 ,
Sure am glad I didn’t suggest a dBASE Plus solution for this problem and end up in that list. <g,d,r>