    Advice developing plugin that should process large files

    • Bas de Reuver

      I’m working on a plugin to handle CSV files, and I’m running into an issue with large files.

      When trying to parse a file with 10 MB or 100 MB of text, Notepad++ seems to freeze. Obviously, reading a large file will take some time, but Notepad++ stops responding, and even after waiting a long while the process never seems to finish.

      It is a C# plugin and to read and process the file I’m using something like this:

      var data = ScintillaStreams.StreamAllText();
      string line;

      // Read until ReadLine() returns null (end of the text)
      while ((line = data.ReadLine()) != null)
      {
        //etc.
      }
      

      Should this work? Is this a good way to read the file currently open in NPP into the plugin?
      Could this cause the plugin to crash, or is the problem maybe caused by something else?

      Does anyone know a better way to handle this, or have any advice?

      • Ekopalypse

        Disclaimer: I don’t know enough about C# to tell whether this is the right code to use.

        Q1: Is this C# code using the latest Scintilla version? Recent Npp releases ship a newer Scintilla version, which changed the notification structure.

        Q2: Are you doing your validation from an Npp or Scintilla callback?
        If so, make sure your code is as fast as possible, or use a thread for the validation part.

        I did a quick test with Python and a CSV downloaded from here.
        Validating these 1,500,000 lines took ~20 seconds on my old 2nd-gen i5, and I have to say that I used a very naive approach, e.g. GetText instead of GetCharacterPointer.
        I saw Npp’s memory usage increase from ~400 MB to 1.5 GB during validation, so make sure you have the resources available.
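
To make the GetText versus GetCharacterPointer distinction concrete on the C# side, here is a minimal sketch that copies the bytes behind a raw pointer into managed memory once and then reads them line by line. `BufferSnapshot` and `CopyToReader` are hypothetical names, not part of any plugin framework. It assumes the pointer and length come from the Scintilla messages SCI_GETCHARACTERPOINTER and SCI_GETLENGTH and that the document is UTF-8 encoded. The copy costs extra memory, roughly the size of the file.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

public static class BufferSnapshot
{
    // Copies `length` bytes starting at `buffer` into a managed array and
    // wraps the copy in a StreamReader.
    // Assumption: in a plugin, `buffer` and `length` would be obtained from
    // SCI_GETCHARACTERPOINTER and SCI_GETLENGTH, and the text is UTF-8.
    public static StreamReader CopyToReader(IntPtr buffer, int length)
    {
        var bytes = new byte[length];
        Marshal.Copy(buffer, bytes, 0, length);

        // The reader works on the managed copy only, so later edits to the
        // original buffer cannot affect it.
        return new StreamReader(new MemoryStream(bytes, writable: false), Encoding.UTF8);
    }
}
```

If the copy is taken on the main thread, the returned reader can then be handed to a worker thread, since it no longer touches the editor's buffer.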

        • joakim wennergren

          Since ScintillaStreams is something I wrote I can give you some tips.

          Reading 100 MB from N++ is going to take a few seconds, no matter how you do it. To avoid locking up the N++ interface, you have to do the processing in a different thread, e.g. something like

                   Task.Factory.StartNew(() =>
                      {
                          var data = ScintillaStreams.StreamAllText();
                          string line;

                          // Read until ReadLine() returns null (end of the text)
                          while ((line = data.ReadLine()) != null)
                          {
                              //etc.
                          }

                          // Only interact with N++ on the main thread
                          this.Invoke((Action)(() => { MessageBox.Show("The stuff is finished", "MyPlugin"); }));
                      });
          

          ScintillaStreams works by getting a pointer to the N++ text buffer and reading it as “raw” as possible. This is likely going to crash horribly if the user modifies the text while it’s reading, so giving control back to the user is kind of a two-edged sword. I still feel it’s worth it though; the user experience is much better. Also consider showing your progress somehow, so the user feels something is happening.
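
As a rough illustration of the progress idea, here is a minimal sketch that reads lines on a worker thread and reports a running count. `CsvLineCounter`, `CountLinesAsync` and `reportEvery` are hypothetical names. It takes any `TextReader`, so the reader returned by `ScintillaStreams.StreamAllText()` could presumably be passed in. It uses `Task.Run`, which needs .NET 4.5 or later.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class CsvLineCounter
{
    // Reads all lines from `reader` on a thread-pool thread and reports the
    // running line count every `reportEvery` lines. The task result is the
    // total number of lines read.
    public static Task<int> CountLinesAsync(TextReader reader, IProgress<int> progress, int reportEvery = 10000)
    {
        return Task.Run(() =>
        {
            int count = 0;
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                count++;
                // ... parse `line` here ...

                if (progress != null && count % reportEvery == 0)
                    progress.Report(count);
            }
            return count;
        });
    }
}
```

A `Progress<int>` created on the main thread delivers its callbacks through the synchronization context it captured at construction, so under the assumption that a WinForms context exists there, a callback such as `n => statusLabel.Text = n + " lines"` (with `statusLabel` being a hypothetical control) would run on the UI thread without an explicit `Invoke`.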

          I don’t know why “the process never seems to finish”. A 10 MB file should take a second or so to read.
