    [New Plugin] CSV Lint

    Notepad++ & Plugin Development
    81 Posts 25 Posters 72.0k Views
    • rdipardo @Bas de Reuver

      @bas-de-reuver,

      Right up until the line SetText(datanew.ToString()) the StringBuilder contains the correct “Bünder Voll” etc. values.

      That’s because the API detects the file encoding for you:

        /// <summary>
        /// Reads the whole document as a text stream, trying to use the right encoding
        /// </summary>
        public static StreamReader StreamAllText()
        {
          var doc = PluginBase.CurrentScintillaGateway;
          var codepage = doc.GetCodePage();
          var encoding = codepage == (int)SciMsg.SC_CP_UTF8 ? Encoding.UTF8 : Encoding.Default;
          return new StreamReader(StreamAllRawText(), encoding);
        }
      

      Problem is, a StringBuilder is just a simple utility with no encoding property that you can set, so the text returned by ToString() will be encoded in the system’s default (usually single-byte) code page.

      Creating a StreamWriter with the StreamWriter(Stream, Encoding) overload would be more useful. The second parameter could be set by calling scintillaGateway.GetCodePage() and choosing an appropriate System.Text.Encoding based on the return value (as in the API method shown above). Scintilla doesn’t declare unique constants for every possible encoding; SC_CP_UTF8 really stands for “Unicode,” i.e., any multi-byte encoding.

      If you want to keep the simplicity of StringBuilder, you could always reduce the reformatted text to bytes, encode each one, then recompose them into a string, like this:

                   StringBuilder datanew = new StringBuilder();
      
                   // ... do the reformat
      
                   // try to match the file encoding of the open buffer
                   // (see CsvQuery.PluginInfrastructure.ScintillaStreams.StreamAllText)
                   Encoding docEncoding =
                       scintillaGateway.GetCodePage() == (int)SciMsg.SC_CP_UTF8
                       ? Encoding.UTF8
                       : Encoding.Default;
      
                   // update text in editor: round-trip the characters through docEncoding
                   var charBuf = new char[datanew.Length];
                   datanew.CopyTo(0, charBuf, 0, datanew.Length);
                   var dataBytes = docEncoding.GetBytes(charBuf);
                   scintillaGateway.SetText(docEncoding.GetString(dataBytes));
      

      Note: The fallback choice of System.Text.Encoding.Default is just for illustration. It’s not recommended in practice on .NET Framework. Besides, every character in the ASCII code page fits inside the CLR Char type (which is always UTF-16).
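
      As a rough illustration of the StreamWriter(Stream, Encoding) suggestion above, the sketch below encodes the contents of a StringBuilder into bytes with an explicitly chosen encoding. EncodingSketch and EncodeText are hypothetical names, not part of the plugin or the Notepad++ API, and in a real plugin the encoding would be picked from GetCodePage() as shown earlier. Treat it as a minimal sketch under those assumptions, not a drop-in fix.

      ```csharp
      using System;
      using System.IO;
      using System.Text;

      static class EncodingSketch
      {
          // Hypothetical helper: encodes the text with the given encoding
          // through the StreamWriter(Stream, Encoding) overload.
          public static byte[] EncodeText(StringBuilder text, Encoding encoding)
          {
              using (var buffer = new MemoryStream())
              {
                  using (var writer = new StreamWriter(buffer, encoding))
                  {
                      writer.Write(text.ToString());
                  } // disposing the writer flushes it and closes the stream

                  // ToArray() is still valid on a closed MemoryStream
                  return buffer.ToArray();
              }
          }

          static void Main()
          {
              var datanew = new StringBuilder("B\u00FCnder Voll");

              // UTF8Encoding(false) means "UTF-8 without a byte order mark"
              byte[] utf8 = EncodeText(datanew, new UTF8Encoding(false));

              // 11 characters, but the u-umlaut takes two bytes in UTF-8
              Console.WriteLine(utf8.Length); // prints 12
          }
      }
      ```

      Passing new UTF8Encoding(false) instead of Encoding.UTF8 keeps StreamWriter from writing a byte order mark at the start of the stream.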

      • Alan Kilborn @Bas de Reuver

        @bas-de-reuver said in [New Plugin] CSV Lint:

        That’s why I’ve created a video to show how you can use this plug-in to validate data, reformat datetime values, split column functions

        I got around to watching the video. Very nice intro to the plugin!

        • Lycan Thrope @Bas de Reuver

          @bas-de-reuver ,

          I had watched your video earlier, but being involved elsewhere with my developing UDL and associated files needing to be done, I didn’t get to really appreciate what it was offering. However, now that the language is mostly done, for now, I started going back to a project of mine that has been “slow-rolling” and started working on it. One of the things that I was trying to do was break down what were fledgling attempts at a quick database that was huge. The data was all needed, I just didn’t take the time to break them into smaller usable entities while I was making a quick app for data entry, viewing, searching, etc.

          I needed to clean up, and I was able to separate in dBASE some of the table information, in this case, customers (actually shippers and receivers but am combining their information under just customers) and I needed to clean up and split a field. I could have probably done it in my environment, but decided to take the time to see if I could use your plug-in to do some of the work and simplify the cleanup. It worked beautifully, and although I could accomplish it by not converting it to CSV, it was so much simpler just to convert the data and split and clean it up via your plugin.

          I just wanted to thank you for developing this plugin, and making the video, that, although I didn’t understand all of the capabilities you were mentioning about it at the time, I figured it couldn’t hurt to play with it a little, and I’m very happy I did. Thanks for doing the plugin and video. Keep up the good work. :)

          • Bas de Reuver @Lycan Thrope

            @lycan-thrope said in [New Plugin] CSV Lint:

            It worked beautifully

            Cool, that is exactly the goal of this plugin: saving time by making it easier to inspect and clean data. So thanks 😀, it’s nice to hear you found it useful.

            • Tanquen

              I’m not able to get this plugin to work. I need to make sure it’s not adding any text/data. I just want to make the CSV data easier to read.
              My CSV has a large number of columns and different headers every few rows. It defaults to FixedLength but after changing to CSVDelimited it just adds the text “XML” at the top and nothing changes.

              Format=FixedLength
              ColNameHeader=False
              Col1=XML Text Width 9999

              Format=CSVDelimited
              ColNameHeader=False
              Col1=XML Text Width 9999

              • T Switzer

                any update planned to update CSV Lint to work with current version of notepad ++

                • Lycan Thrope @T Switzer

                  @t-switzer ,
                  There already is, but since you haven’t posted which version of NPP you’re using, the assumption is that it is the latest version, and yes, there is an update for it. At present, you’ll need to delete the current version in the plugin folders, or it won’t allow the new NPP to start. Then after you get it started, you can install the newest plugin via the Plugin manager, or go to this site and download it yourself for a self install: CSV Lint GitHub page

                  • Bas de Reuver @Tanquen

                    @Tanquen said in [New Plugin] CSV Lint:

                    My CSV has a large number of columns and different headers every few rows. It defaults to FixedLength but after changing to CSVDelimited it just adds the text “XML” at the top and nothing changes.

                    Thanks for mentioning your issue. It sounds like the plug-in can’t recognise this specific data file. I suspect the file includes many < or > characters as well as many , or ; characters or something like that. This can “confuse” the autodetect function so to speak, meaning it can’t determine which is the correct separator character, so it doesn’t interpret the data and columns correctly.

                    Is it possible to send the data file to my e-mail address (see About dialog)? If it contains privacy sensitive data or is too large, then maybe edit the file and just include a few lines of data to reproduce this issue?

                    Btw someone mentioned a similar issue, so in a future update I want to add an option where you can (optionally) manually specify the separator character.

                    In the meantime, you can manually construct the metadata, like below.

                    Format=CSVDelimited
                    ColNameHeader=False
                    Col1=Field1 Text Width 50
                    Col2=Field2 Text Width 50
                    Col3=Field3 Text Width 50
                    Col4=Field4 Text Width 50
                    etc.

                    Alternatively, first delete the rows that are causing trouble (if possible), keeping only a few rows of representative data. Then click Refresh from data and apply the resulting metadata to the complete file with all the rows.
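
                    To illustrate why several candidate characters in one file can confuse separator detection, here is a deliberately naive sketch. It is not CSV Lint's actual autodetect logic. SeparatorSketch and GuessSeparator are hypothetical names, and the rule used (the candidate must occur on every non-empty line, and the one whose per-line count varies least wins) is an assumption made for the example.

                    ```csharp
                    using System;
                    using System.Linq;

                    static class SeparatorSketch
                    {
                        // Returns the candidate that appears on every non-empty line and
                        // whose per-line count varies least, or null if none qualifies.
                        public static char? GuessSeparator(string[] lines, char[] candidates)
                        {
                            char? best = null;
                            int bestSpread = int.MaxValue;

                            foreach (char c in candidates)
                            {
                                int[] counts = lines
                                    .Where(l => l.Length > 0)
                                    .Select(l => l.Count(ch => ch == c))
                                    .ToArray();

                                // skip candidates that are missing from at least one line
                                if (counts.Length == 0 || counts.Min() == 0) continue;

                                int spread = counts.Max() - counts.Min();
                                if (spread < bestSpread)
                                {
                                    bestSpread = spread;
                                    best = c;
                                }
                            }
                            return best;
                        }

                        static void Main()
                        {
                            // a comma occurs in the data, but only on one line
                            var lines = new[] { "a;b;c", "1,5;2;3", "x;y;z" };
                            Console.WriteLine(GuessSeparator(lines, new[] { ',', ';', '\t', '|' })); // prints ;
                        }
                    }
                    ```

                    When two candidates both occur on every line with similar regularity, a rule like this has no clear winner, which is the kind of ambiguity described above.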

                    • Bas de Reuver @T Switzer

                      @T-Switzer said in [New Plugin] CSV Lint:

                      any update planned to update CSV Lint to work with current version of notepad ++

                      Like @Lycan-Thrope mentioned, there is a new CSV Lint v0.4.5.2 which you can manually download from the github page. That version will be included automatically in the Plugin Admin in the upcoming Notepad++ v8.4.3.

                      It looks like the compatibility issues with the new Lexer v5 are solved now 🤞 and I want to wait and see before continuing and adding too many other features to the plug-in.

                      • Eric Yang

                        Is there a way not to change the background color? Notepad++'s default theme has white background so CSV Lint looks OK after syntax coloring. But I normally use Solarized theme (dark bg) and CSV Lint changes all the text to white background.

                        • Bas de Reuver @Eric Yang

                          @Eric-Yang sorry for the late answer, I had missed this post.

                          Go to the menu Plugins -> CSV Lint -> Settings; there is a “Colors” button to select from 4 predefined color sets for the column syntax highlighting, see color preview here.

                          If you use a dark mode/dark background theme, then it’s best to select either Dark mode (pastel) or Dark mode (neon). Btw you need to close and restart Notepad++ before the new colors are visible.

                          • Bas de Reuver @Bas de Reuver

                            With the Notepad++ update to v8.4.7 yesterday, the new Plugin Admin now also contains an update for CSVLint plug-in from v0.4.5.4 to v0.4.6.2. I hope the plugin will save everyone some time when working with csv files, let me know what you think.

                            It now also has a sort function and improved compatibility with Windows 11 unicode UTF8 setting. Also the default syntax highlighting now has 12 colors instead of 8, with a bit more pleasing colors imho.

                            csvlint_sort.png

                            See below for complete list of plugin updates and bugfixes since the last Notepad++ version:

                            v0.4.6

                            • Improved compatibility with Windows unicode UTF8 setting
                            • Sort data, new option to sort on column
                            • Split column, add options pad character and search and replace
                            • Split column, remove options when contains and decode multiple value
                            • Default color sets now have 12 colors instead of 8 (less repeats) + optimal color contrast
                            • Settings dialog, color set preview icons
                            • Autodetect improved, skip empty lines + clear message when nothing detected
                            • Metadata for fixed width, also output absolute positions

                            v0.4.6.1

                            • Apply quotes bugfix, also values that contain CrLf character
                            • Sort data and split column, use quotes correctly
                            • Sort data and split column, also support fixed width

                            v0.4.6.2

                            • Detect fixed width, allow manual column positions
                            • Button to toggle syntax highlighting
                            • Allow user to change font in docked window textboxes
                            • datatraveller1 @Bas de Reuver

                              Hi @Bas-de-Reuver,
                              Thank you for the plugin. The plugin is nice, but sometimes I want to switch to the original Notepad++ view for a .csv file. Is there an option to turn off the CSV Lint view?

                              • datatraveller1 @datatraveller1

                                I have found out that switching the menu item “Language” - “CSVLint” to “Language” - “None (Normal Text)” is most probably the solution.

                                • Bas de Reuver @datatraveller1

                                  @datatraveller1 Yes, you’re right, it’s the menu Language > None (Normal Text) that clears the syntax highlighting colors from a csv file.

                                  Btw in the latest version of the plug-in v0.4.6.2 there is also a button on the docked windows to toggle between CSV Lint colors or no syntax highlighting. It does the same thing as the Language menu items though.

                                  • Fruchtzwerg94

                                    The latest version includes a PR which I’ve created exactly to target this issue:
                                    PR: Added button to enable or disable language #42
                                    GIF
                                    Should be exactly what you are looking for in a very simple way.

                                    • le dinhyen

                                      Hello guys, I’m a newbie here :( I have CSV data as below. Is there any way to align it into columns like Excel does?

                                      "@ABC_INFORMATION"
                                      "ID1","ID2","ID3"
                                      "ip1","ip2",""
                                      
                                      • datatraveller1

                                        @le-dinhyen There is another plugin (CSVQuery) that displays the columns aligned, but this plugin is not suitable for editing. I would recommend a CSV editor for editing CSV files. (I could tell you good CSV editors, but they are not freeware).

                                        BTW, your CSV file is not a valid CSV file according to RFC 4180 (https://www.rfc-editor.org/rfc/rfc4180).
                                        Each line should have the same number of fields, but your first line contains one field while the other lines contain three.
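
                                        The field-count mismatch can be checked with a small sketch like the one below, which uses the three sample lines posted above. FieldCountSketch and CountFields are hypothetical names, and the quote handling is a simplification that assumes quotes are balanced within each line (no line breaks inside quoted values).

                                        ```csharp
                                        using System;

                                        static class FieldCountSketch
                                        {
                                            // Counts separators that fall outside double quotes. A doubled
                                            // quote ("") toggles the flag twice, so it leaves the state as is.
                                            public static int CountFields(string line, char separator = ',')
                                            {
                                                int fields = 1;
                                                bool inQuotes = false;
                                                foreach (char ch in line)
                                                {
                                                    if (ch == '"') inQuotes = !inQuotes;
                                                    else if (ch == separator && !inQuotes) fields++;
                                                }
                                                return fields;
                                            }

                                            static void Main()
                                            {
                                                string[] lines =
                                                {
                                                    "\"@ABC_INFORMATION\"",
                                                    "\"ID1\",\"ID2\",\"ID3\"",
                                                    "\"ip1\",\"ip2\",\"\"",
                                                };

                                                // prints 1, then 3, then 3
                                                foreach (string line in lines)
                                                    Console.WriteLine(CountFields(line));
                                            }
                                        }
                                        ```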

                                        • le dinhyen @datatraveller1

                                          @datatraveller1 Yes. My CSV file format is specific to a customer system. We used to open it with Rainbow CSV in Visual Studio. I’m just wondering if there is any function that can display the columns aligned and allow editing in real time. Thank you so much for the reply.

                                          • Pierre de la Verre

                                            Hi
                                            I just started with this plugin and have a problem …

                                            A CSV with 45 columns (650 characters long), semicolon (;) as separator, 16 lines. The display in different colours is OK, but I want to “convert” it to a “space separated” file, like

                                            Col1        Col2    Col3
                                            1.2         4.5    John
                                            

                                            So I select “Reformat / Column Separator: Fixed Width”. The result is that it removes the defined header and creates a file with no space between the columns, like

                                            1.24.5John
                                            

                                            What is the problem here, and how do I do it right?

                                            Thanks
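
                                            For reference, the layout being asked for (each column padded to the width of its longest value) can be sketched as follows. This is not how CSV Lint's Reformat is implemented and it does not explain the behaviour reported above. FixedWidthSketch and ToFixedWidth are hypothetical names, and the plain Split call assumes no quoted values that contain the separator.

                                            ```csharp
                                            using System;
                                            using System.Linq;

                                            static class FixedWidthSketch
                                            {
                                                // Pads every value to the width of the longest value in its
                                                // column and joins the columns with a single space.
                                                public static string[] ToFixedWidth(string[] lines, char separator)
                                                {
                                                    string[][] rows = lines.Select(l => l.Split(separator)).ToArray();
                                                    int columns = rows.Max(r => r.Length);

                                                    int[] widths = Enumerable.Range(0, columns)
                                                        .Select(i => rows.Max(r => i < r.Length ? r[i].Length : 0))
                                                        .ToArray();

                                                    return rows
                                                        .Select(r => string.Join(" ", r.Select((v, i) => v.PadRight(widths[i]))).TrimEnd())
                                                        .ToArray();
                                                }

                                                static void Main()
                                                {
                                                    var csv = new[] { "Col1;Col2;Col3", "1.2;4.5;John" };
                                                    foreach (string line in ToFixedWidth(csv, ';'))
                                                        Console.WriteLine(line);
                                                }
                                            }
                                            ```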
