    Deleting numbers from LIST 1, that also appear in LIST 2

    • guy038

      Hello, @m-p, @peterjones, @alan-kilborn, @troshindv and All,

      Continuation of my previous post !

      • Then, we use the Edit > Line Operations > Sort Lines Lexicographically Ascending menu option, without any selection

      => The example text becomes :

      0. This License applies to any                                                                               11
      1. You may copy and distribute                                                                               13
      1. You may copy and distribute                                                                               52
      10. If you wish to incorporate                                                                               38
      10. If you wish to incorporate                                                                               72
      11. BECAUSE THE PROGRAM IS                                                                                   40
      12. IN NO EVENT UNLESS REQUIRED                                                                              41
      12. IN NO EVENT UNLESS REQUIRED                                                                              74
      2. You may modify your copy or                                                                               15
      2. You may modify your copy or                                                                               54
      3. You may copy and distribute                                                                               22
      4. You may not copy, modify,                                                                                 28
      4. You may not copy, modify,                                                                                 64
      5. You are not required to                                                                                   29
      5. You are not required to                                                                                   65
      6. Each time you redistribute                                                                                30
      6. Each time you redistribute                                                                                66
      7. If, as a consequence of a                                                                                 31
      8. If the distribution and/or                                                                                35
      8. If the distribution and/or                                                                                70
      9. The Free Software Foundation                                                                              36
      9. The Free Software Foundation                                                                              71
      Activities other copying,                                                                                    12
      Activities other copying,                                                                                    51
      Also, for each author's protect                                                                              07
      Also, for each author's protect                                                                              47
      END OF TERMS AND CONDITIONS                                                                                  42
      END OF TERMS AND CONDITIONS                                                                                  75
      Each version is given a                                                                                      37
      Finally, any free program is                                                                                 08
      Finally, any free program is                                                                                 48
      For example, if you distribute                                                                               05
      If any portion of this section                                                                               32
      If any portion of this section                                                                               67
      If distribution of executable                                                                                27
      If distribution of executable                                                                                63
      In addition, mere aggregation                                                                                21
      In addition, mere aggregation                                                                                59
      It is not the purpose of this                                                                                33
      It is not the purpose of this                                                                                68
      NO WARRANTY                                                                                                  39
      NO WARRANTY                                                                                                  73
      Preamble                                                                                                     01
      TERMS AND CONDITIONS FOR COPYING                                                                             10
      TERMS AND CONDITIONS FOR COPYING                                                                             50
      The licenses for most software                                                                               02
      The licenses for most software                                                                               43
      The precise terms and condition                                                                              09
      The precise terms and condition                                                                              49
      The source code for a work mean                                                                              26
      The source code for a work mean                                                                              62
      These requirements apply to the                                                                              19
      These requirements apply to the                                                                              57
      This section is intended to make                                                                             34
      This section is intended to make                                                                             69
      Thus, it is not the intent of                                                                                20
      Thus, it is not the intent of                                                                                58
      To protect your rights, we need                                                                              04
      To protect your rights, we need                                                                              45
      We protect your rights with two                                                                              06
      We protect your rights with two                                                                              46
      When we speak of free software,                                                                              03
      When we speak of free software,                                                                              44
      You may charge a fee for the                                                                                 14
      You may charge a fee for the                                                                                 53
      a) Accompany it with the                                                                                     23
      a) You must cause the modified                                                                               16
      b) Accompany it with a written                                                                               24
      b) Accompany it with a written                                                                               60
      b) You must cause any work that                                                                              17
      b) You must cause any work that                                                                              55
      c) Accompany it with the                                                                                     25
      c) Accompany it with the                                                                                     61
      c) If the modified program                                                                                   18
      c) If the modified program                                                                                   56
      
      • Now, we open the Replace dialog ( Ctrl + H )

        • SEARCH ^(.+)(\x20+\d+\R)(\1(?2))+

        • REPLACE Leave EMPTY

        • Tick the Wrap around option

        • Select the Regular expression search mode

        • Click on the Replace All button

      => You should get the status message 33 occurrences were replaced, leading to this text :

      0. This License applies to any                                                                               11
      11. BECAUSE THE PROGRAM IS                                                                                   40
      3. You may copy and distribute                                                                               22
      7. If, as a consequence of a                                                                                 31
      Each version is given a                                                                                      37
      For example, if you distribute                                                                               05
      Preamble                                                                                                     01
      a) Accompany it with the                                                                                     23
      a) You must cause the modified                                                                               16
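
For readers who want to try the first S/R outside Notepad++, here is a minimal Python sketch of the same idea. It is not the Boost regex used by Notepad++: Python's `re` module has no `(?2)` subroutine call and no `\R`, so the second group is written out a second time and `\n` is assumed as the line break. The four sample records and the name `drop_duplicate_runs` are illustrative only.

```python
import re

# Sorted "text + padding + line number" records, shortened from the example.
sorted_text = (
    "0. This License applies to any      11\n"
    "1. You may copy and distribute      13\n"
    "1. You may copy and distribute      52\n"
    "3. You may copy and distribute      22\n"
)

# Equivalent of ^(.+)(\x20+\d+\R)(\1(?2))+ with the subroutine call expanded
# and \n assumed as the line ending.
DUPLICATE_RUN = re.compile(r"^(.+)(\x20+\d+\n)(?:\1\x20+\d+\n)+", re.MULTILINE)


def drop_duplicate_runs(text: str) -> str:
    """Delete every run of consecutive lines that share the same text part."""
    return DUPLICATE_RUN.sub("", text)


result = drop_duplicate_runs(sorted_text)
print(result, end="")
```

Only the lines whose text part occurs once survive, which is why the text must be sorted first: the pattern can only see duplicates that sit on consecutive lines.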
      
      • Although it would be possible to use a column-mode selection to sort the lines by the number at the end of each line, I’m not sure it would work properly with a huge list. So, I prefer a safer method and perform another regex S/R :

        • SEARCH ^(.+?)\x20{2,}(\d+)

        • REPLACE \2\t\t\1

      And we end with :

      11		0. This License applies to any
      40		11. BECAUSE THE PROGRAM IS
      22		3. You may copy and distribute
      31		7. If, as a consequence of a
      37		Each version is given a
      05		For example, if you distribute
      01		Preamble
      23		a) Accompany it with the
      16		a) You must cause the modified
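
The same swap can be sketched in Python, where the search and replacement strings carry over unchanged. The function name `move_number_first` and the two sample lines are assumptions made for illustration.

```python
import re

# Lazy text part, at least two padding spaces, then the temporary line number.
MOVE_NUMBER_FIRST = re.compile(r"^(.+?)\x20{2,}(\d+)", re.MULTILINE)


def move_number_first(text: str) -> str:
    """Rewrite 'text<padding>NN' as 'NN<TAB><TAB>text' on every line."""
    return MOVE_NUMBER_FIRST.sub(r"\2\t\t\1", text)


sample = (
    "0. This License applies to any      11\n"
    "11. BECAUSE THE PROGRAM IS          40\n"
)
swapped = move_number_first(sample)
print(swapped, end="")
```

The lazy `(.+?)` stops at the first run of two or more spaces that is followed by digits, so single spaces inside the text and a leading number such as `11.` are left untouched.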
      
      • Again, we use the Edit > Line Operations > Sort Lines Lexicographically Ascending menu option, to restore the initial file order, giving :
      01		Preamble
      05		For example, if you distribute
      11		0. This License applies to any
      16		a) You must cause the modified
      22		3. You may copy and distribute
      23		a) Accompany it with the
      31		7. If, as a consequence of a
      37		Each version is given a
      40		11. BECAUSE THE PROGRAM IS
      

      And, finally, we perform a last regex S/R, below, to get rid of the temporary numbering !

      • SEARCH ^\d+\t+

      • REPLACE Leave EMPTY

      => Our expected text, with the 9 unique lines :

      Preamble
      For example, if you distribute
      0. This License applies to any
      a) You must cause the modified
      3. You may copy and distribute
      a) Accompany it with the
      7. If, as a consequence of a
      Each version is given a
      11. BECAUSE THE PROGRAM IS
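
The last two steps (sort, then strip the temporary numbering) can be sketched as follows. The sketch assumes, as in the example, that every number is zero-padded to the same width, which is what makes a plain lexicographic sort agree with the numeric order. The function name `restore_order_and_strip` is hypothetical.

```python
import re


def restore_order_and_strip(lines):
    """Sort 'NN<TAB><TAB>text' lines as strings, then drop the 'NN<TAB>' prefix."""
    ordered = sorted(lines)  # plain string comparison, like a lexicographic sort
    return [re.sub(r"^\d+\t+", "", line) for line in ordered]


numbered = [
    "11\t\t0. This License applies to any",
    "05\t\tFor example, if you distribute",
    "01\t\tPreamble",
]
restored = restore_order_and_strip(numbered)
print("\n".join(restored))
```

Without the zero padding, a line numbered `5` would sort after `11`, so the width of the numbering should match the largest line number.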
      

      Best Regards,

      guy038

      • Alan Kilborn @guy038

        @guy038

        Perhaps that regex-intensive solution becomes the de facto standard way of solving this problem.

        But it might be nice to see, in Notepad++ itself, a command to “Remove lines from primary view tab that occur in secondary view tab”, or some such less wordy verbiage.

        • TroshinDV @Alan Kilborn

          @Alan-Kilborn said in Deleting numbers from LIST 1, that also appear in LIST 2:

          Perhaps that regex-intensive solution becomes the de facto standard way of solving this problem.

          It will be, for small volumes.
          The volume of data dictates its own terms.
          PS. It is better to wrap all actions in a macro.

          • Alan Kilborn @TroshinDV

            @TroshinDV said in Deleting numbers from LIST 1, that also appear in LIST 2:

            It will be, for small volumes.
            The volume of data dictates its own terms.

            That doesn’t make sense as the solution crafted by @guy038 was specifically considering large “volumes”.

            PS. It is better to wrap all actions in a macro.

            I don’t believe @guy038’s solution can be made into a macro; can you explain how you think it can be?

            • M P

              Well, there were definitely some problems. LIST 1 has 7.4 million numbers, whilst LIST 2 has 1.2 million numbers. I believe this is definitely too much to deal with. I’ll try @guy038’s method now, even though I’m not sure I understood it all. Let’s try it, at least.

              • guy038

                Hello, @m-p, @peterjones, @alan-kilborn, @troshindv and All,

                Oh…! Indeed, dealing with two files of 7,400,000 and 1,200,000 lines is not an easy task ! So you will have to work with an 8,600,000-line file : good luck !

                Do not hesitate to ask me for more information if you encounter difficulties in implementing my method !

                • First, I would advise you to reproduce my own tiny example, to get the general idea

                • Regarding your real example, I would say that :

                  • The N++ sort feature is very quick, in all cases

                  • I suppose that the numbering operation, with the column editor, should not take very long either !

                  • The first of the three regex S/R will probably take some time. Just be patient : it should work in the end !
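
If the regex step proves too slow at this size, one possible fallback outside Notepad++ is a set lookup in Python. This is a different technique from the sort-and-regex method above, not a port of it: it assumes both lists fit in memory as lists of strings, and unlike the regex it keeps values that are repeated inside LIST 1 as long as they do not occur in LIST 2. The function name `remove_listed` and the sample values are hypothetical.

```python
def remove_listed(list1, list2):
    """Return the lines of list1 that do not occur in list2, in original order."""
    unwanted = set(list2)  # hashing makes each membership test cheap
    return [line for line in list1 if line not in unwanted]


list1 = ["3", "7", "3", "9", "12"]
list2 = ["7", "12", "40"]
kept = remove_listed(list1, list2)
print(kept)
```

Because the original order of LIST 1 is preserved, no temporary numbering or re-sorting is needed with this approach.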

                Best Regards

                guy038
