Navigation

    Community

    • Login
    • Search
    • Recent
    • Tags
    • Popular
    • Users
    • Groups
    • Search

    regex to conditionally delete unnecessary lines in .csv files?

    Help wanted · · · – – – · · ·
    3
    5
    1126
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • Brandon Tibbitts
      Brandon Tibbitts last edited by

      Can I use regex expression and replace function to delete lines that have 0 in the fourth data field of a .csv file?

      Before
      20180829014727,40.393018,-117.974365,14.54421,163.1
      20180829014740,40.590137,-117.952637,12.82776,162.3
      20180829014754,40.187256,-117.505029,13.20208,162.5
      20180829014832,40.372559,-117.538574,41.76501,162.5
      20180829014836,40.584375,-117.960938,0,162.8
      20180829014840,40.152100,-117.909180,0,162.7
      20180829014845,40.346338,-117.683301,0,162.7

      After
      20180829014727,40.393018,-117.974365,14.54421,163.1
      20180829014740,40.590137,-117.952637,12.82776,162.3
      20180829014754,40.187256,-117.505029,13.20208,162.5
      20180829014832,40.372559,-117.538574,41.76501,162.5

      1 Reply Last reply Reply Quote 0
      • Terry R
        Terry R last edited by

        @Brandon-Tibbitts
        Certainly and quite simply. The following regex will filter out those lines.
        Find What:^([^,]+,){3}0.+\R*
        Replace With:empty field <—nothing in this field.

        Have search mode as “regular expression” and wraparound ticked.
        As a brief explanation:
        [^,] means any character that is NOT a comma
        ([^,]+,) means any number of non-commas followed by a comma.
        {3} means three sets of the expression preceding (inside brackets).
        0.+\R* means a zero followed by as many characters as possible followed by the carriage return line if it exists (this caters for the last line).

        As the replace with field is empty is just removes that line. When complete check the very last line as it will likely be blank, so you may need to remove that.

        Terry

        1 Reply Last reply Reply Quote 0
        • Terry R
          Terry R last edited by

          @Brandon-Tibbitts
          I just realised that although my regex works well on your example there is an instance where it won’t. If the 4th field contains something like 0.76 then my regex will still pick it. The correct regex is:
          ^([^,]+,){3}0,.+\R*
          Note the addition of a comma after the zero.

          Sorry about that.

          Terry

          1 Reply Last reply Reply Quote 3
          • Brandon Tibbitts
            Brandon Tibbitts last edited by

            Thank you Terry,

            That helps me out a ton.

            Is there also a way to do arithmetic on that same field(after the zero values/lines removed), so for example if I wanted to add to or multiply the fourth field value in every line across a file

            I suspect the first part of the regex would be the same ^([^,]+,){3}0,. but rather with a filter to recognize any value in that field instead of the 0, then a replace function similar but to replace it with a modified value

            Or should I post a new topic to cover that one separately?

            Brandon

            Scott Sumner 1 Reply Last reply Reply Quote 0
            • Scott Sumner
              Scott Sumner @Brandon Tibbitts last edited by

              @Brandon-Tibbitts

              You can’t do arithmetic in this (regex) manner. You’re going to need a programming language for that, or perhaps, since you appear to have .csv files, pulling the data into Excel to do the math would work. Good luck.

              1 Reply Last reply Reply Quote 3
              • First post
                Last post
              Copyright © 2014 NodeBB Forums | Contributors