Hi, @daniel-b-0 and All,
Last UPDATED on 2024/05/22 : In the first version of this post, I exposed some real names from my personal photos. After reflection, I decided, for confidentiality, to change them and only show non-personal data !!
I understand that my method cannot be used safely with large files. So, I’m going to describe a second method which should work in all cases !
I tested this new method with real data : a USB key of mine, containing 8,186 photos, collected over a period from 2004 to 2023
( Don’t worry, these photos are also stored on two external hard drives. In all circumstances, we must imitate Mother Nature, which uses RNA to code proteins and NEVER DNA itself for this purpose !! )
The general organisation of my USB drive is :
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \01.jpg
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \02.jpg
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \03.jpg
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \03_ORG.jpg
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \04.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\01.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\02.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\03.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\04.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\05.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\06.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\07.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\08.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\09.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\10.jpg
G:\_PHOTOS\2005\03_22_SKI_xx xxxxxxx\01.jpg
G:\_PHOTOS\2005\03_22_SKI_xx xxxxxxx\02.jpg
G:\_PHOTOS\2005\03_22_SKI_xx xxxxxxx\03.jpg
G:\_PHOTOS\2005\08_22_xxxx xxxxxx\01.jpg
G:\_PHOTOS\2006\01_07_xxxxxxx xxxxxxxxxxx\01.jpg
G:\_PHOTOS\2023\10_08xxxxx xxxxx xxxxxxxxxxxx\01.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\01.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\02.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\03.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\04.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\05.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\06.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\07.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\08.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\09.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\10.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\11.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\12.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\13.jpg
G:\_PHOTOS\2023\12_26_xxxxx xxxxxxxxx xx xxxx xxxxxxx\01.jpg
G:\_PHOTOS\2023\12_26_xxxxx xxxxxxxxx xx xxxx xxxxxxx\02.jpg
G:\_PHOTOS\2023\12_26_xxxxx xxxxxxxxx xx xxxx xxxxxxx\03.jpg
G:\_PHOTOS\2023\12_31_xxxxxx - xxxxxxxx\01.jpg
So, the files are sorted by year, then by subject ( month_day[-day]_location_reason or, sometimes, month_day[-day]_reason_location ) and finally by photo number, sometimes followed by the initial of the person who took the photo ( -A for Annie, my sister, -X for unknown, etc. )
In order to mimic your download.txt file, I placed the \x02 delimiters right after the G:\_PHOTOS\ part and right before the \xx.jpg part, giving this format ( the STX characters are not printable, so they may not be visible below ) :
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \01.jpg
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \02.jpg
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \03.jpg
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \03_ORG.jpg
G:\_PHOTOS\2004\06_11-22_xxxxxxx - xxxxxxxxx - xxxxxxxxxxxxxx \04.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\01.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\02.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\03.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\04.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\05.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\06.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\07.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\08.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\09.jpg
G:\_PHOTOS\2005\01_24-29_SKI_xxxx xxxx xxxxx\10.jpg
G:\_PHOTOS\2005\03_22_SKI_xx xxxxxxx\01.jpg
G:\_PHOTOS\2005\03_22_SKI_xx xxxxxxx\02.jpg
G:\_PHOTOS\2005\03_22_SKI_xx xxxxxxx\03.jpg
G:\_PHOTOS\2005\08_22_xxxx xxxxxx\01.jpg
G:\_PHOTOS\2006\01_07_xxxxxxx xxxxxxxxxxx\01.jpg
G:\_PHOTOS\2023\10_08xxxxx xxxxx xxxxxxxxxxxx\01.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\01.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\02.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\03.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\04.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\05.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\06.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\07.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\08.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\09.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\10.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\11.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\12.jpg
G:\_PHOTOS\2023\12_15_xxxxxx xxxxxxx xxxxxxxx xxx\13.jpg
G:\_PHOTOS\2023\12_26_xxxxx xxxxxxxxx xx xxxx xxxxxxx\01.jpg
G:\_PHOTOS\2023\12_26_xxxxx xxxxxxxxx xx xxxx xxxxxxx\02.jpg
G:\_PHOTOS\2023\12_26_xxxxx xxxxxxxxx xx xxxx xxxxxxx\03.jpg
G:\_PHOTOS\2023\12_31_xxxxxx - xxxxxxxx\01.jpg
In this way, we are sure that the zones between delimiters are unique, like, for instance, 2005\03_22_SKI_xx xxxxxxx
Then, I randomized this file, using the N++ option :
Edit > Line Operations > Sort Lines Randomly
So my download.txt file looks like :
G:\_PHOTOS\2014\08_01_xxxxxxxx xxxxxxxxxxxx\009_G.jpg
G:\_PHOTOS\2014\02_21-22_xxxxxxxxxx_xxxxxxxxxx xxxxxx\07.jpg
G:\_PHOTOS\2012\08_07-22_xxxxxxxx xxxxxxxxx\034_X.jpg
G:\_PHOTOS\2010\05_29_xxxxxxxxx xxxxxxx_xxxxxxxx\14.jpg
G:\_PHOTOS\2017\08_10-28_xx xxxx\013.jpg
G:\_PHOTOS\2010\10_30-31_xxxxxx_xxxxxxxxxxxx xxxxx\076_X.jpg
Secondly, I created an exist.txt file, made of all the different zones between the STX delimiters. I obtained a file of 366 lines, from which I randomly deleted 45, giving a final exist.txt file with 321 lines. So, at the end of the new method, we should get a file of all the lines containing one of the 45 missing zones !
Important :
For a correct result, you must use the latest v8.6.5 version of Notepad++, which improves the multi-selection process !
In all the search/replacements, listed below :
The Wrap around option is checked
The Regular expression search mode is checked
All the other options are un-checked
Let’s go :
First, copy your download.txt file as mark.txt
Open the mark.txt file in N++
Open the Replace dialog ( Ctrl + H )
SEARCH (?-s)^.*\x02(.+)\x02.*
REPLACE \1
Click on the Replace All button
=> We just keep the zones between delimiters
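For those who would like to check this step outside N++, here is a minimal Python sketch of the same extraction. It assumes that each line contains exactly two \x02 ( STX ) delimiters; the name zone_of and the sample line are only illustrative. The (?-s) prefix is dropped because, in Python, the dot does not match line breaks by default.

```python
import re

# Same idea as the N++ search above: capture what sits between the two STX chars
ZONE_RE = re.compile(r"^.*\x02(.+)\x02.*$")

def zone_of(line):
    """Return the zone between the two STX delimiters, or None if there is none."""
    match = ZONE_RE.match(line)
    return match.group(1) if match else None

# Illustrative line, built like the download.txt lines of this post
sample = "G:\\_PHOTOS\\\x022005\\03_22_SKI_xx xxxxxxx\x02\\01.jpg"
print(zone_of(sample))  # prints 2005\03_22_SKI_xx xxxxxxx
```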
Now, use the menu option Edit > Line Operations > Sort Lines Lexicographically Ascending
Re-open the Replace dialog ( Ctrl + H )
SEARCH (?-s)^(.+\R)\K\1+
REPLACE Leave EMPTY
Click on the Replace All button
=> The duplicate lines are deleted and your mark.txt file should have shrunk drastically ! In my case, I got a mark.txt file with only 366 different lines
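The sort + regex combination above can be sketched in Python as follows. This is only an illustration, not what N++ does internally : the sort brings identical zones next to each other and one line per run is kept. The function name sort_and_dedupe is hypothetical, and the Python sort order may differ slightly from the N++ lexicographic one, which does not matter for removing duplicates.

```python
from itertools import groupby

def sort_and_dedupe(zones):
    """Sort the zones, then keep a single copy of each run of identical lines."""
    # groupby() groups consecutive equal items, which is why sorting comes first
    return [zone for zone, _run in groupby(sorted(zones))]

print(sort_and_dedupe(["b", "a", "b", "a", "c"]))  # ['a', 'b', 'c']
```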
Then, append your exist.txt at the end of the mark.txt file. In my case, the file contains 366 + 321 = 687 lines
Again, use the menu option Edit > Line Operations > Sort Lines Lexicographically Ascending
Re-open the Replace dialog ( Ctrl + H )
SEARCH (?-s)^(.+\R)\1
REPLACE Leave EMPTY
Click on the Replace All button
=> The mark.txt file should have shrunk and now contains only the zones which require downloading. In my case, it contains, as expected, 45 lines / zones !
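The logic of this step is that any zone present in both files appears twice after the sort, so the pair is removed, and only the zones seen once survive ( in my case, 687 - 2 × 321 = 45 ). Here is a small Python sketch of that reasoning, assuming that neither list contains duplicates of its own; the function name zones_to_download is hypothetical.

```python
from collections import Counter

def zones_to_download(mark_zones, exist_zones):
    """Keep the zones that appear only once in the combined list."""
    counts = Counter(list(mark_zones) + list(exist_zones))
    # A zone counted twice exists in both files, so it is already downloaded
    return sorted(zone for zone, seen in counts.items() if seen == 1)

print(zones_to_download(["a", "b", "c", "d"], ["b", "d"]))  # ['a', 'c']
```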
If the last line of the mark.txt file ends with an EOL, delete the EOL characters of this last line
Note :
If all or some lines contain sub-folders, you’ll have to replace any \ character with the literal \\ string
Now, on column 1, do a zero-length COLUMN selection of all the lines ( indication N × 0 in the status bar )
Type in a | pipe character
Hit the Home key
Hit the Backspace key
=> The file is changed into a one-line file
Hit the Home key, again
Delete the first | character
Finally, save the mark.txt file, now a single-line file
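The result of these keystrokes is a single alternation of all the zones. As a rough Python sketch, assuming the zones are already in a list ( the name build_mark_pattern is hypothetical ), and escaping only the \ characters, as in the note above :

```python
def build_mark_pattern(zones):
    """Join the zones with | after doubling every backslash."""
    # Only the \ character is escaped here, as described in the note above
    escaped = [zone.replace("\\", "\\\\") for zone in zones]
    return "|".join(escaped)

print(build_mark_pattern(["2005\\03_22_SKI", "2023\\12_31"]))
# prints 2005\\03_22_SKI|2023\\12_31
```

If your zones may contain other regex characters, like parentheses or dots, Python's re.escape() would be a safer choice than this simple replacement.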
Remark :
If the entire line contains more than 2,000 characters, split this long line in parts, right before a | char and delete any | remaining at beginning and/or end of the lines
For example :
Of course, in this case, you'll have to REPEAT the MARK operation, described below, for each CREATED line
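The splitting described in this remark can be sketched in Python like this. It is only an illustration, under the assumption that no zone contains a | character ( which Windows does not allow in file names, anyway ); the function name split_alternation is hypothetical and a single zone longer than the limit is simply left on its own line.

```python
def split_alternation(pattern, limit=2000):
    """Cut a zone1|zone2|... line into parts of at most `limit` characters."""
    parts, current = [], ""
    for zone in pattern.split("|"):
        candidate = zone if not current else current + "|" + zone
        if current and len(candidate) > limit:
            parts.append(current)   # close the current part, without trailing |
            current = zone          # start a new part, without leading |
        else:
            current = candidate
    if current:
        parts.append(current)
    return parts

print(split_alternation("aa|bb|cc|dd", limit=5))  # ['aa|bb', 'cc|dd']
```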
Now, copy your download.txt file as to_do.txt
Switch to the mark.txt tab, containing, most of the time, just a single line
Select all the text ( Ctrl + A )
Open the Mark dialog ( Ctrl + M )
=> The text should be automatically inserted in the dialog
Check the Bookmark line and Purge for each search options ( IMPORTANT )
Switch back to the to_do.txt tab
Click on the Mark All button
=> Message of the dialog Mark: xxx matches in entire file ( 876, in my case )
In the Bookmark margin, select, with the right-click button, the option Remove Unmarked Lines or use the menu option Search > Bookmark > Remove Unmarked Lines
Click on the Clear all marks button of the Mark dialog
Finally, save the to_do.txt file
=> You should get all the files that require downloading. In my test case, from the 45 zones to take into account, I got a list of 876 files / lines to “download” ;-))
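The Mark All + Remove Unmarked Lines sequence amounts to keeping every line where the alternation matches somewhere. Here is a minimal Python sketch of that idea, with a hypothetical function name ( lines_to_download ) and three illustrative lines only :

```python
import re

def lines_to_download(download_lines, mark_pattern):
    """Keep the lines in which one of the zones of the pattern is found."""
    marker = re.compile(mark_pattern)
    return [line for line in download_lines if marker.search(line)]

# Illustrative data : three lines with STX delimiters, two zones still missing
download = [
    "G:\\_PHOTOS\\\x022005\\03_22_SKI_xx xxxxxxx\x02\\01.jpg",
    "G:\\_PHOTOS\\\x022005\\08_22_xxxx xxxxxx\x02\\01.jpg",
    "G:\\_PHOTOS\\\x022023\\12_31_xxxxxx - xxxxxxxx\x02\\01.jpg",
]
pattern = r"2005\\03_22_SKI_xx xxxxxxx|2023\\12_31_xxxxxx - xxxxxxxx"
to_do = lines_to_download(download, pattern)
print(len(to_do))  # 2
```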
Best Regards,
P.S. :
Here’s a tip to sum a list of numbers :
Do a multi-column selection of all these numbers, located anywhere in your current file
Paste them in a new tab
Do a zero-length COLUMN selection of all these numbers
Hit the + sign
Hit the Home key
Hit the Backspace key
Hit the End key
Insert the = sign
Copy all contents of this single line ( Ctrl + C )
Open calc.exe
Paste the contents of the clipboard ( Ctrl + V )
=> Here you are : the Windows calculator should show you the total of your list of numbers ;-)) No possibility of errors and quick result !
You may even sum numbers in other bases !
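If you have Python at hand, the same total can be obtained with a few lines. This is just a sketch with a hypothetical function name ( total ), assuming the numbers are given as text, one per item :

```python
def total(numbers, base=10):
    """Sum a list of numbers written as text, in the given base."""
    return sum(int(text, base) for text in numbers)

print(total(["12", "30", "8"]))      # 50
print(total(["ff", "1"], base=16))   # 256
```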