Feature request: huge file editing support



  • Hi, I am a programmer and use NP++ as my favorite editor.

    I often need to view, and sometimes transform, huge files (e.g. logs or auto-generated scripts with several million lines). I think it would be very useful if NP++ could be optimized to allow viewing, editing, and processing of files of any size.

    I would suggest a “streaming-like” loading mode for huge files. In such a mode, NP++ wouldn’t load the entire file content, but only the chunk of lines where the viewer (or the working processor) is currently located.

    Modifications could be saved in temporary chunked backup files, which are then merged onto the entire stream by a background worker thread.

    Since this would be a limited mode, features that do not scale (e.g. select all) may be disabled.

    Maybe some plugins won’t currently work within this mode (e.g. those that require the entire file contents loaded, such as document map), but I think many could be reimplemented to achieve the same functionality (e.g. compare, find/replace).

    Thank you very much, I hope my suggestion could be useful to you.
    Bye,
    Alberto
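
As a rough illustration of the chunk-at-a-time loading suggested in the post above, here is a minimal Python sketch (not NP++ code). The function name `read_chunk` and the 64 KiB default window are assumptions made for this example. It reads one window of whole lines starting at a byte offset, so memory use depends on the window size rather than the file size.

```python
def read_chunk(path, start_offset, max_bytes=64 * 1024):
    """Return (lines, next_offset) for the window starting at start_offset.

    Only whole lines are returned, so the next window begins on a line
    boundary. `read_chunk` and `max_bytes` are hypothetical names.
    """
    with open(path, "rb") as f:
        f.seek(start_offset)
        data = f.read(max_bytes)
        if not data:
            # Nothing left to read: signal end of file.
            return [], start_offset
        if len(data) == max_bytes:
            # The window may end in the middle of a line.
            cut = data.rfind(b"\n")
            if cut != -1:
                # Drop the partial last line; the next window picks it up.
                data = data[:cut + 1]
            else:
                # A single line is longer than the window: read to its end.
                data += f.readline()
    lines = data.splitlines(keepends=True)
    return lines, start_offset + len(data)
```

A viewer could call this repeatedly, passing the returned `next_offset` back in to page forward. Editing and merging changes back into the file, as the post proposes, would need considerably more machinery than this read-only sketch.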



  • There have been discussions about large file support, and a few people are working on a 64-bit version that should be able to handle large files.



  • A 64-bit version is not the complete solution to large file support.



  • @milipili I agree, it is just a nice first step. FYI Scintilla v3.6.0 has better 64-bit support



  • Alberto,

    I wonder how big your files are in bytes, and what software you use currently to edit your files.

    Doing what you suggest with files containing chunks of text wouldn’t even need 64 bits - I think that is a bit of a distraction.

    However I would have thought that the resultant editor would be so different from NP++ - both in functionality and internal code - that it might be simpler to start again and write a specialised C program. I would certainly take this approach if I had your problem.

    For example, such a program might keep an index of the location of every 5000th line (say), which would remain available at the end of the edit for re-use. This would let you jump to a specific line at high speed.

    Judging by all the powerful features of NP++, I imagine the software must pass through the memory image of the file quite regularly - e.g. for highlighting - and every one of those scans is going to be horribly slow on a multi-GB file.

    David
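
The line index described in the post above could look roughly like the following Python sketch. The names `build_line_index` and `read_line` are assumptions made for this example, and line numbers are 0-based. The index stores the byte offset of every `every`-th line, so jumping to a line means one seek plus at most `every - 1` skipped lines.

```python
def build_line_index(path, every=5000):
    """Return byte offsets of lines 0, every, 2*every, ... (0-based)."""
    offsets = []
    pos = 0
    with open(path, "rb") as f:
        for lineno, line in enumerate(f):
            if lineno % every == 0:
                offsets.append(pos)
            pos += len(line)
    return offsets


def read_line(path, index, lineno, every=5000):
    """Fetch one line by seeking to the nearest indexed line before it."""
    slot = lineno // every
    if slot >= len(index):
        raise IndexError("line number beyond end of file")
    with open(path, "rb") as f:
        f.seek(index[slot])
        # Skip forward from the indexed line to the requested one.
        for _ in range(lineno - slot * every):
            if not f.readline():
                raise IndexError("line number beyond end of file")
        line = f.readline()
        if not line:
            raise IndexError("line number beyond end of file")
        return line
```

The index is built in a single pass over the file. It would have to be updated or rebuilt whenever an edit changes line lengths or adds and removes lines.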



  • I agree that a 64-bit exe isn’t required, but it is probably the soonest-to-be-implemented option. There are programs (that I use frequently) that do exactly what was described: they just load chunks of the file at a time to search/edit. N++ would need a lot of underlying changes to operate this way (which I guess is technically possible), but you’d lose a lot of the functionality that makes N++ something more powerful than plain old notepad.



  • Indeed, loading chunks won’t fit n++ usage. N++ must remain an in-memory editor. Other scenarios are probably too specific (without a dedicated plugin).
    However it is true that we have to reduce the memory usage of the application. The size required to edit a file should match its size on disk, plus of course a nearly constant value, for metadata like folding, highlighting… Then a 64bits version would allow files with a size greater than ~1.5GiB.
    Actually a 64bits version of n++ is not an objective in the short term. It is not a real problem but would be quite a bad idea for now. It would divide the community: many heavily used 32bits plugins are no longer updated, and there are old platforms to consider… So a real replacement must be provided before such changes.
    However, as a first step we can make enhancements in n++. Using UTF-16 internally and heavily relying on Scintilla (+ some old n++ code of course) are the main issues and thus our main concerns (from my point of view, and DonHo’s too).
    We were talking about refactoring on GitHub (in anticipation of all those issues), and that will be the first target in the near future, to make something less memory hungry and more predictable.
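
To illustrate why the internal encoding matters for the memory concern raised in the post above, here is a small Python sketch that compares the in-memory size of the same text in UTF-8 and UTF-16. The helper name `buffer_sizes` and the sample log line are assumptions made for this example; it says nothing about how n++ or Scintilla actually store their buffers.

```python
def buffer_sizes(text):
    """Return the encoded size in bytes of `text` for two encodings."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" avoids counting a byte-order mark.
        "utf-16": len(text.encode("utf-16-le")),
    }


sample = "2015-11-02 ERROR disk full\n"  # hypothetical, ASCII-only log line
sizes = buffer_sizes(sample)
print(sizes["utf-16"] / sizes["utf-8"])
```

For ASCII-only text such as typical logs, every character takes one byte in UTF-8 and two in UTF-16, so the ratio printed here is 2.0. Text with many non-Latin characters can narrow or even reverse that gap.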



  • “Actually a 64bits version of n++ is not an objective in the short term”

    I don’t know why, because the current code compiles and works well on x64.

    “Using UTF-16 internally”

    Is this a requirement for a Windows-based GUI? Because, otherwise, UTF-8 is the way to go.



  • As explained, a 64bits version will break compatibility with the current plugins. N++ will be 64 bits later, when we drop support for 32bits platforms.



  • To confirm, the current 32-bit version will download and run well on a 64-bit Windows system, correct? Thanks.

