Hi, @dipsi7772 and All,
You said :
Strange thin is, on some files. just <!doctype html> this is the result.
I would say al files are the same format , so Its mystious.
It’s not strange, and it’s not related to the file format at all! The reason is that, for large files, the regex may fail to work properly and delete all characters except the first line of your HTML files. Indeed, the regex is:
(?s).*?(?-si:pass(word)?.*?>(.+?)(?=</div>))|.+
The beginning, (?s).*?(?-si:pass(word)?, means that the regex engine selects all characters, even across several lines, from the current caret position up to the first occurrence of the word pass or password. In some files, this range of characters can be very large, and this could explain the unexpected results!
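To make the intended behaviour of this regex easier to see, here is a minimal Python sketch that applies the same pattern to a tiny, made-up HTML sample. The sample text and the helper name extract_passwords are assumptions for illustration only, and Python's re engine is not the Boost engine used by Notepad++, so this only shows what the regex is meant to do on a small input. It does not reproduce the failure seen on big files.

```python
import re

# Same pattern as in the post. Python's `re` accepts this syntax, but it is a
# different engine from the Boost one used by Notepad++.
PATTERN = re.compile(r'(?s).*?(?-si:pass(word)?.*?>(.+?)(?=</div>))|.+')


def extract_passwords(html: str, eol: str = "\n") -> str:
    """Mimic the replacement \\2 followed by a line ending (hypothetical helper).

    Group 2 is empty when the second alternative (.+) matched, so the
    trailing remainder of the file collapses to a bare line ending.
    """
    return PATTERN.sub(lambda m: (m.group(2) or "") + eol, html)


# Made-up sample, not taken from the real files.
sample = (
    '<!doctype html>\n'
    '<html>\n'
    '<div class="password">alpha</div>\n'
    '<div class="password">beta</div>\n'
    '</html>\n'
)

print(repr(extract_passwords(sample)))
```

Each match swallows everything from the current position up to the next pass or password, which is exactly the range that can become very large in a big file.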
If your HTML files are neither important nor confidential, simply e-mail me one of the files which produces errors. I’ll try to find another regex which works correctly in all cases ;-))
Next, you said :
Is it maybe possible to implement a rule that avoids duplicate results?
My question is: in the copied HTML files, which contain the passwords (1 per line), what is the maximum length of these files?
Depending on this length, a regex solution may be possible… However, if you don’t mind changing the initial order of these passwords, just use, for each copied HTML file, the two menu options below:
Edit > Line Operations > Sort Lines Lexicographically Ascending
Edit > Line Operations > Remove Consecutive Duplicate Lines
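For anyone who prefers a script, the two menu steps above can be sketched in Python roughly as follows. The function names are hypothetical, and Python's sorted() orders strings by code point, which may differ slightly from Notepad++'s lexicographic sort. The second function is an alternative that removes duplicates while keeping the initial order, in case the order matters.

```python
def sort_and_dedupe(lines):
    """Sort the lines, then drop consecutive duplicates (like the two menu options)."""
    result = []
    for line in sorted(lines):
        # After sorting, identical lines are adjacent, so comparing with the
        # previously kept line is enough.
        if not result or result[-1] != line:
            result.append(line)
    return result


def dedupe_keep_order(lines):
    """Drop duplicates but keep the first occurrence of each line in place."""
    # dict preserves insertion order, so its keys are the unique lines in order.
    return list(dict.fromkeys(lines))


passwords = ["beta", "alpha", "beta", "gamma", "alpha"]
print(sort_and_dedupe(passwords))
print(dedupe_keep_order(passwords))
```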
Finally, you said :
Another thing is, even I choose “Automatischer Zeienumbruch” each line is writte ine ONE line, not in a second one which woud avoid the vertical scrolling.
I’m sorry, but I cannot guess what you’re speaking of :-(( Depending on your line-ending characters, discussed previously, use the appropriate Replace regex:
\2\r\n for Windows files
OR
\2\n for Unix files
The View > Word Wrap option should work correctly !?
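If you are unsure which kind of line ending a file uses, a small Python sketch like the one below can count them. The function name detect_eol is hypothetical, and the rule of picking whichever ending is more frequent is only an assumption for files with mixed endings.

```python
def detect_eol(data: bytes) -> str:
    """Return "\\r\\n" if Windows line endings dominate, otherwise "\\n"."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract them to get the lone LFs.
    lone_lf = data.count(b"\n") - crlf
    return "\r\n" if crlf > lone_lf else "\n"


print(repr(detect_eol(b"first\r\nsecond\r\n")))
print(repr(detect_eol(b"first\nsecond\n")))
```

To check a real file, read it in binary mode, for example open(path, "rb").read(), so that Python does not translate the line endings before they are counted.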
Best Regards,
guy038