
    Number of lines NP++ and Excel shows won't match

    12 Posts 5 Posters 1.7k Views
    • glossar

      Hi all,
      I have two files (“IATE_de.txt” and “IATE_en.txt”) whose line counts don’t match between NP++ and Excel. Since yesterday I’ve been trying to align the lines of these two files based on the number at the very beginning of each line in NP++ (that is, the number before the first tab), which is the first column in Excel. These numbers are unique within each file. After going back and forth between NP++ and Excel, and after numerous operations to delete the lines whose number appears in only one file and keep the lines whose number also appears in the other file, I believe I have come very close to aligning the files on these numbers. The problem now is that the number of lines NP++ shows and the number Excel shows don’t match! See the screenshots. Even when I repeat the operations for finding, deleting, and keeping unique and/or duplicate numbers (and hence their lines), the line counts still don’t match. I believe the counts Excel shows are correct, and they are the same for both files, which is what I expected and tried to achieve, but NP++ disagrees and shows different line counts.

      Screenshots for the file “IATE_de.txt”: [two images, not preserved]

      Screenshots for the file “IATE_en.txt”: [two images, not preserved]

      Download:
      IATE_en.txt: https://easyupload.io/uzkda6
      IATE_en.txt: https://easyupload.io/z0tayl
      Could it be a bug in NP++ or could there be a problem with my clipboard when copying and pasting between NP++ and Excel?

      Could someone please have a look at the files and confirm whether the line counts that NP++ and Excel show match for each file? I use NP++ ver. 7.9.3 64-bit and Excel 2016.

      Thank you so much in advance!
      glossar.

      • glossar @glossar

        Addendum:

        1- I’ve sorted the lines in both files in NP++ (Ascending ignoring case) as well as with Excel’s sorting function. The result is the same in both cases: the number of lines doesn’t match between NP++ and Excel for the respective file.

        2- Excel can’t find:

        • any more duplicate cells in column A, which corresponds to the numbers before the first tab in NP++
        • any unique cells in column A when I treat the whole content of each file as a table in Excel (one file in columns A, B and C, the other in columns D, E and F) and have Excel compare the two tables on columns A and D, which hold the numbers

        which suggests that both files have reached the state I was trying to achieve: the lines of both files aligned on these numbers. But the fact that NP++ shows a different number of lines for each file than Excel does confuses me.
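
For readers attempting the same alignment, the "keep only the numbers present in both files" step described above can be sketched in Python. This is a minimal sketch, not what glossar actually ran: the names `leading_id` and `keep_common` and the sample lines are made up for illustration, and it assumes that every record occupies exactly one physical line, which is worth verifying first.

```python
def leading_id(line: str) -> str:
    """Return the text before the first tab, i.e. the number in column A."""
    return line.split("\t", 1)[0]


def keep_common(lines_a, lines_b):
    """Keep only the lines whose leading number occurs in both inputs.

    Both results are sorted by that number (compared as text), so line N of
    one result carries the same number as line N of the other.
    Assumes the numbers are unique within each input.
    """
    common = {leading_id(l) for l in lines_a} & {leading_id(l) for l in lines_b}
    kept_a = sorted((l for l in lines_a if leading_id(l) in common), key=leading_id)
    kept_b = sorted((l for l in lines_b if leading_id(l) in common), key=leading_id)
    return kept_a, kept_b


# Hypothetical sample data, not taken from the real IATE files.
de = ["101\tLAW\tVertrag", "102\tLAW\tGesetz", "104\tLAW\tUrteil"]
en = ["102\tLAW\tlaw", "103\tLAW\tcourt", "104\tLAW\tjudgment"]

de_kept, en_kept = keep_common(de, en)
for d, e in zip(de_kept, en_kept):
    print(leading_id(d), d.split("\t")[2], e.split("\t")[2], sep=" | ")
```

If a cell contains a line break, splitting the text into physical lines breaks the one-record-per-line assumption, and the two outputs can end up with different line counts even though the set of numbers is identical.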

        • Alan Kilborn

          I really think it is on you to do what you’ve asked someone else to do here.

          After you do your work, if you do decide that there’s a bug in Notepad++ behavior, by all means return here for further discussion of it.

          • glossar @Alan Kilborn

            @Alan-Kilborn

            You seem not to have read what I wrote above. I repeated over and over again and am still repeating the process/operations. I wish I could screen-record it. Just now, I’ve re-produced it - the number of lines Excel shows/has and the number of lines NP++ shows after I simply copy the whole content from Excel and paste it to NP++ won’t match.

            • guy038

              Hi, @glossar and All,

              I correctly downloaded your files, which contain 402,132 lines for IATE_en.txt and 417,213 lines for IATE_de.txt

              Unfortunately, with my old XP SP3 machine, my Excel version cannot open tables with more than 65,536 lines :-((

              However, I can confirm that line 65,536, the last one my Excel can show, is strictly identical in Notepad++ and in Excel for both files:

              • Line 65,536 : 1109427 medical science safety factor, in IATE_en.txt

              • Line 65,536 : 1173731 AGRICULTURE, FORESTRY AND FISHERIES Holzwirtschaft, in IATE_de.txt

              Best Regards,

              guy038

              • Alan Kilborn @glossar

                @glossar said in Number of lines NP++ and Excel shows won't match:

                You seem not to have read what I wrote above. I repeated over and over again and am still repeating the process/operations. I wish I could screen-record it. Just now, I’ve re-produced it - the number of lines Excel shows/has and the number of lines NP++ shows after I simply copy the whole content from Excel and paste it to NP++ won’t match.

                I stand by my previous statement:

                “After you do your work, if you do decide that there’s a bug in Notepad++ behavior, by all means return here for further discussion of it.”

                • glossar @guy038

                  @guy038 said in Number of lines NP++ and Excel shows won't match:

                  Hi, @glossar and All,

                  I correctly downloaded your files, which contain 402,132 lines for IATE_en.txt and 417,213 lines for IATE_de.txt

                  Unfortunately, with my old XP SP3 machine, my Excel version cannot open tables with more than 65,536 lines :-((

                  However, I can confirm that line 65,536, the last one my Excel can show, is strictly identical in Notepad++ and in Excel for both files:

                  • Line 65,536 : 1109427 medical science safety factor, in IATE_en.txt

                  • Line 65,536 : 1173731 AGRICULTURE, FORESTRY AND FISHERIES Holzwirtschaft, in IATE_de.txt

                  Best Regards,

                  guy038

                  Hello Guy,
                  Thank you so much for jumping in!

                  The fact that you’ve seen 402,132 lines for “IATE_en.txt” and 417,213 lines for “IATE_de.txt” supports my suspicion that there might be something wrong with Notepad++, because, again, my Excel shows the same number of lines for both files (which, again, is expected). Also, the first 65,536 lines out of 400K or so may not be a large enough sample to claim otherwise, i.e. that everything works as expected in both NP++ and Excel.

                  Greetings,
                  glossar

                  • guy038

                    Hello, @glossar and All,

                    Seemingly, your English and German Excel files each contain exactly 402,128 records / rows / lines! And when they are opened in N++, you get more lines. This means that a single row in Excel is sometimes displayed as two or more consecutive lines in Notepad++.


                    Thus, just slice your initial files into smaller parts and check, each time, whether the number of lines differs between Excel and Notepad++.

                    And, little by little, shrink the selection until you get a file with, let’s say, only 10 records that still has a different number of lines in the two applications. Then it shouldn’t be very difficult to see which characters force the N++ text onto several lines instead of the single row in Excel!
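
Instead of slicing the files by hand, the same search can be sketched with Python's standard `csv` module, which counts physical lines and records separately. This is only a sketch under assumptions: that the text is tab-separated and that cells containing line breaks are wrapped in double quotes (an assumption about how the text left Excel). The name `multiline_records` and the sample text are invented for illustration.

```python
import csv
import io


def multiline_records(text, delimiter="\t"):
    """Yield (record_no, first_line_no, row) for records spanning several lines."""
    # newline="" keeps \r\n intact so the csv module sees the raw line endings.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    previous_line = 0
    for record_no, row in enumerate(reader, start=1):
        # reader.line_num is the number of physical lines consumed so far.
        if reader.line_num - previous_line > 1:
            yield record_no, previous_line + 1, row
        previous_line = reader.line_num


# Hypothetical sample: record 2 has a quoted cell containing a line break.
sample = '1\tfoo\tbar\r\n2\t"two\r\nlines"\tbaz\r\n3\tx\ty\r\n'
for record_no, line_no, row in multiline_records(sample):
    print(f"record {record_no} starts at line {line_no}: {row!r}")
```

For a real file, the text could be read with `open(path, newline="", encoding="utf-8").read()`, where the encoding is a guess. Stray double quotes in the data (for example inch marks) may confuse the parser, so the reported records are candidates to inspect rather than a final verdict.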

                    BR

                    guy038

                    • Michael Vincent @guy038

                      @guy038 said in Number of lines NP++ and Excel shows won't match:

                      Seemingly, your English and German Excel files each contain exactly 402,128 records / rows / lines! And when they are opened in N++, you get more lines. This means that a single row in Excel is sometimes displayed as two or more consecutive lines in Notepad++.

                      @glossar

                      @guy038 sounds right. My guess is that some cells contain a carriage return / line feed, probably inside quotes. Excel respects the quotes and keeps the text in a single cell, but Notepad++, as a text editor, starts a new line at each one, thus increasing the line count in N++.

                      Cheers.

                      • glossar @Michael Vincent

                        @guy038 @Michael-Vincent

                        Thank you both for the solution!

                        Michael - I’ve just removed all quotes in both files and the numbers of lines now match in NP and Excel!

                        Cheers,
                        glossar

                        • PeterJones @glossar

                          @glossar said in Number of lines NP++ and Excel shows won't match:

                          number of lines of which won’t match in NP++ and Excel respectively

                          Because the lines of a plain text file are not equivalent to the rows of an Excel spreadsheet. A row in a spreadsheet can contain one or more newline sequences, whereas a line in a text document in a text editor ends with a newline sequence by definition.

                          If you don’t understand the difference between a text file and a spreadsheet, I suggest you start studying these subtle differences.

                          I’ve just removed all quotes in both files and the numbers of lines now match in NP and Excel!

                          Congratulations. You just fixed your file by breaking the data. I hope this is not critical data that you are breaking.

                          "a1 line1
                          a1 line2","b1 line 1
                          b1 line 2","c1","d1 line 1
                          d1 line2
                          d1 line3"
                          

                          That is a CSV file that represents exactly one row in the spreadsheet, but is obviously five lines of text.

                          If you were to just remove the quotes, then open the CSV in a spreadsheet, it would fill in a1, a2, b2, a3, b3, c3, a4, a5 – which is a completely different data structure, with way too many cells being populated, in the wrong rows and columns.
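
The effect described above can be checked with Python's standard `csv` module. This is a small sketch that parses the sample once as written and once with every double quote deleted; the variable names are illustrative, and the parser is Python's, not Excel's, so it shows the principle rather than Excel's exact behavior.

```python
import csv
import io

# The five-line CSV sample from this post.
QUOTED = (
    '"a1 line1\n'
    'a1 line2","b1 line 1\n'
    'b1 line 2","c1","d1 line 1\n'
    'd1 line2\n'
    'd1 line3"\n'
)


def parse(text):
    """Parse CSV text into a list of rows (each row is a list of cells)."""
    return list(csv.reader(io.StringIO(text, newline="")))


rows_intact = parse(QUOTED)
rows_broken = parse(QUOTED.replace('"', ""))

print(len(rows_intact), [len(r) for r in rows_intact])  # 1 [4]
print(len(rows_broken), [len(r) for r in rows_broken])  # 5 [1, 2, 3, 1, 1]
```

With the quotes in place the parser sees one row of four cells. Without them it sees five rows holding 1, 2, 3, 1 and 1 cells, which corresponds to a1, a2, b2, a3, b3, c3, a4 and a5.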

                          If this data is anything other than a personal hobby you are doing for yourself with no outside implications, then please reconsider just blindly deleting the quote marks without understanding the consequences – because it could cost you or someone else their job, their money, or worse! PLEASE UNDERSTAND THIS!

                          If you do not understand the differences between a spreadsheet and a text file, then please just use a spreadsheet for manipulating spreadsheet data until such time as you have understood the sometimes subtle differences.

                          OTHER READERS: Please do not follow the example of blindly deleting quotes in a CSV to get the number of rows in a spreadsheet to match with the number of lines shown in a text editor.

                          • Michael Vincent @PeterJones

                            @PeterJones said in Number of lines NP++ and Excel shows won't match:

                            OTHER READERS: Please do not follow the example of blindly deleting quotes in a CSV to get the number of rows in a spreadsheet to match with the number of lines shown in a text editor

                            YES, what he said. My “advice” above was more of a diagnosis than a course of treatment. The problem was probably quotes to capture newlines. I never meant that the fix was to remove the quotes! As @PeterJones says, this CHANGES your data!
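
If the goal really is one physical line per record, one possible alternative to deleting quotes is to parse the records properly and replace the line breaks inside each cell. This is a hedged sketch, not something proposed in this thread: it assumes tab-separated text with double-quoted multi-line cells, the name `flatten_cell_newlines` is made up, and it still changes the data (the line breaks inside cells are lost), so it only fits when those breaks carry no meaning.

```python
import csv
import io
import re


def flatten_cell_newlines(text, delimiter="\t", replacement=" "):
    """Rewrite delimited text so that every record fits on one physical line."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\r\n")
    for row in reader:
        # Replace each run of CR/LF inside a cell; cell boundaries stay intact.
        writer.writerow([re.sub(r"[\r\n]+", replacement, cell) for cell in row])
    return out.getvalue()


# Hypothetical sample: the second cell of record 1 contains a line break.
sample = '1\t"two\r\nlines"\tbaz\r\n2\tx\ty\r\n'
flat = flatten_cell_newlines(sample)
print(repr(flat))
```

Unlike a blanket quote removal, this keeps every cell in its original row and column; the writer only adds quotes back where a cell still needs them, for example when it contains a tab.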

                            Cheers.
