I am having Zero issues!



  • OK, I have a log file with weather data in it. I’m converting the data to be used for a different weather program.

    What I’m running into is I have some fields that are .0 and need to be only 0.

    To make it a bit more tricky, there are some fields that are 0.0000.

    So I’m only wanting to change the 2 digit .0’s into 0’s and not affect the fields with 0.0000.

    What kind of regex will solve this conundrum?

    Here is an example line from the logfile so you can see what I mean.

    08 02 2017 21 10 -8.3 .0 -13.3 30.460 5.0 6.0 248 0 0.000 0.000 0.000 -8.3

    Only that seventh number over should be altered from .0 to 0.

    Thanks!



  • @Dale-Zastoupil said:

    I have some fields that are .0 and need to be only 0.

    Hello Dale, welcome to the Notepad++ community.
    As you say, at first glance it looks tricky, but the regex “lookahead” and “lookbehind” features make it fairly simple. My regex looks for a . that is NOT immediately preceded by a digit and that is immediately followed by a 0, which in turn is NOT followed by a digit.
    Use the “Replace” function under “Search” with the field values below. You can copy and paste them directly if typing them is a problem.
    Find What: (?<!\d)\.0(?!\d)
    Replace With: 0
    The search mode MUST be “Regular expression”, and “Wrap around” should be selected. Then click “Replace All” and all changes will be made within the open text file.

    To give some background on the expression:
    (?<!\d) is a negative lookbehind. Because it sits before the \.0, it requires that no digit appears immediately before the match.
    \.0 looks for a literal . followed by a 0. The . is a reserved character in regex, so it must be escaped with \.
    (?!\d) is a negative lookahead immediately after the match, so no digit may follow the 0.

    I’ve tested it with the .0 at the start of a line, in the middle and at the end, specifically on the last line. The regex spotted ALL those occurrences.
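
    For anyone who wants to try the same idea outside Notepad++, here is a minimal sketch in Python. It assumes Python's `re` module, which accepts the same lookbehind/lookahead syntax; the function name `fix_dot_zero` is made up for illustration.

```python
import re

# Same expression as the "Find What" field above.
# (?<!\d) : no digit immediately before the dot
# \.0     : a literal dot followed by 0
# (?!\d)  : no digit immediately after the 0
STANDALONE_DOT_ZERO = re.compile(r"(?<!\d)\.0(?!\d)")


def fix_dot_zero(line):
    """Replace a bare '.0' field with '0'; values like 5.0 or 0.000 are left alone."""
    return STANDALONE_DOT_ZERO.sub("0", line)


# Example line taken from the first post in this thread.
sample = "08 02 2017 21 10 -8.3 .0 -13.3 30.460 5.0 6.0 248 0 0.000 0.000 0.000 -8.3"
print(fix_dot_zero(sample))
```

    Only the seventh field changes here, because every other ".0" in the line has a digit directly in front of it.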

    As always, I suggest you work on a copy of your original data first and do some spot checks, as you may have data that doesn’t conform to what your example shows. In that case an error could occur and data you did NOT want changed could be altered.

    Terry



  • Hi, @dale-zastoupil,

    Your post title is really surprising! Nice pun ;-)))

    BR

    guy038



  • That regex worked perfectly. However I went onto my next logfile and realized the extra .0’s I created! I made a macro that converted one of the fields from a temperature with a .0 at the end to a regular number. But at some point, it must have clipped some data and made an extra field. So I’ll need some more help. Here is one of the lines.

    01 02 2017 00 05 13.4 78.0 7.8 30.260 7.0 11.0 325 0 0.000 0.000 0.000 13.4

    If you look at the 7th field, it shows 78.0. That number should be just 78 with no decimal. The problem is this is temp data. So it could have a leading single digit, negative digit, or triple digit. What I’m looking to do is just drop the .0.

    Thanks again guys.



  • @Dale-Zastoupil

    a try:

    find what: ^(?>\d+\.*\d*\h){6}-{0,1}\K(\d+)\.0
    replace with: \1

    Because of the \K usage you need to press Replace All. You cannot step through the matches one at a time.



  • @Dale-Zastoupil said:

    So it could have a leading single digit, negative digit, or triple digit.

    Are you referring to JUST the 7th field as possibly being a negative number, or maybe all fields? It makes a huge difference to the regex needed to help you. Also, is it always the 7th field, and no other field, that you might need to clip the .0 from?

    Good on @Ekopalypse for giving you a possible answer but I’d prefer to know the full range of possible data before committing.

    Terry



  • That is correct. Only modify the seventh set of numbers. That seventh number is the current temperature. I need it with no decimal point. Since I live in North Dakota, that number could be below zero. And it could be 100 degrees. So we have a wide variance.

    The regex posted above didn’t quite work. It missed some numbers. I’ll post a section that did convert followed by a chunk that didn’t. I’m guessing that the negative 6th position threw things off.

    02 02 2017 19 44 0.7 79 -4.3 30.620 2.0 4.0 238 0 0.000 0.000 0.000 0.7
    02 02 2017 19 45 0.7 79 -4.3 30.620 2.0 4.0 238 0 0.000 0.000 0.000 0.7
    02 02 2017 19 46 0.1 79 -4.8 30.620 3.0 4.0 250 0 0.000 0.000 0.000 0.1
    02 02 2017 19 51 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5
    02 02 2017 19 52 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5
    02 02 2017 19 53 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5
    02 02 2017 19 54 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5
    02 02 2017 19 56 -0.7 80.0 -5.4 30.620 4.0 4.0 243 0 0.000 0.000 0.000 -0.7
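
    A quick way to check that guess is to port the earlier expression to Python. This is only an approximation: Python's `re` has no \K or \h, so the sketch assumes [ \t] in place of \h, captures the prefix and writes it back in place of \K, and uses a plain non-capturing group instead of the atomic group. The names are made up for illustration.

```python
import re

# Approximate port of: ^(?>\d+\.*\d*\h){6}-{0,1}\K(\d+)\.0
# Group 1 stands in for everything before \K, group 2 is the original \1.
FIRST_ATTEMPT = re.compile(r"^((?:\d+\.*\d*[ \t]){6}-?)(\d+)\.0")


def trim_first_attempt(line):
    """Drop the .0 from the 7th field, as the first attempt would."""
    return FIRST_ATTEMPT.sub(r"\1\2", line)


# Shortened lines based on the samples in this thread.
positive_sixth = "01 02 2017 00 05 13.4 78.0 7.8 30.260"
negative_sixth = "02 02 2017 19 51 -0.5 79.0 -5.4 30.620"

print(trim_first_attempt(positive_sixth))  # 7th field is trimmed
print(trim_first_attempt(negative_sixth))  # line comes back unchanged
```

    Each of the six leading fields has to start with a digit, so a minus sign in the 6th field stops the match before it ever reaches the 7th.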



  • @Dale-Zastoupil

    You are right, this is an issue. As @Terry-R already mentioned, when using regex
    you need a complete understanding of what the data looks like.
    From your postings I gather that the data is as follows:
    two digit DAY
    two digit MONTH
    four digit YEAR
    two digit HOUR (?)
    two digit MINUTE (?)
    always separated by a space. This is fixed across all the data, right? That is, it cannot happen
    that a day is represented as one digit or a year is represented with only two digits, correct?
    The next field (6th field) can be any decimal number, with a minus sign to represent a negative number. The decimal separator is a dot. Then we come to the 7th field.

    If all of the above is true, then a regex like ^.{17}\-{0,1}\d+\.\d+\h-{0,1}\K(\d+)\.0 with \1 as the replacement should do the job.

    If it isn’t true, then you need to provide more info on what each field can look like.
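
    As a rough cross-check outside Notepad++, the same expression can be sketched in Python. The port is an assumption, not a one-to-one copy: [ \t] stands in for \h, and the part before \K is captured and written back because Python's `re` has no \K. The function name is made up.

```python
import re

# Approximate port of: ^.{17}\-{0,1}\d+\.\d+\h-{0,1}\K(\d+)\.0
# .{17} skips the fixed-width "DD MM YYYY HH MM " prefix,
# -?\d+\.\d+ is the 6th field, and group 2 is the integer part of the 7th.
SEVENTH_FIELD = re.compile(r"^(.{17}-?\d+\.\d+[ \t]-?)(\d+)\.0")


def trim_seventh_field(line):
    """Drop the .0 from the 7th field, keeping everything else as is."""
    return SEVENTH_FIELD.sub(r"\1\2", line)


# Lines taken from the samples posted above.
needs_fix = "02 02 2017 19 51 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5"
already_ok = "02 02 2017 19 46 0.1 79 -4.8 30.620 3.0 4.0 250 0 0.000 0.000 0.000 0.1"

print(trim_seventh_field(needs_fix))
print(trim_seventh_field(already_ok))
```

    Note that the pattern relies on the 6th field always containing a decimal point; a line where that field is a plain integer would be left untouched.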



  • You are correct, the first fields are day/month/year/hour/minute. I’m not sure what the next number is, but temp comes after it. A couple of fields later is one I recognize as barometric pressure in inches. The 238 field is a wind direction out of 360. After that, I’m really not sure.

    But I tried your regex on 2 files, a total of 44k lines of logs. It converted both of those perfectly. I think we’ve cracked it. I’ll try a few more log files later today when I have more time. But I really appreciate your assistance.



  • OK, ran into another issue. I had some holes in my data that I had to fill in with a different piece of software. So instead of showing 79.0 with one decimal place, I now have 3 decimal places, i.e. 79.000. Which part of the regex needs to be altered to trim 3 decimals?



  • @Dale-Zastoupil

    Let’s split the regex ^.{17}\-{0,1}\d+\.\d+\h-{0,1}\K(\d+)\.0

    ^ = anchors the match at the start of the line
    . = wildcard for any char except end-of-line chars
    {17} = {} is a quantifier, like * and +
    * means zero or more times
    + means one or more times
    {} with a single number like 17 means exactly 17 times
    {0,1} means either zero or one time
    \ = escapes the following char, because that char might otherwise have a special meaning in the regex language
    \-{0,1} = look for a literal minus either zero or one time
    \d = regex expression meaning any digit
    \d+ = any digit one or more times
    \. = different from . as this one means a literal dot; remember, it has been escaped
    \d+ = as before
    \h = regex expression meaning a horizontal whitespace char, like a space, a tab, and …
    -{0,1} = literal minus either zero or one time - oops - can you guess what I was thinking?
    \K = regex expression that basically says “forget everything matched so far”. The preceding part of the
    regex still has to match, but it is treated as a condition and is not part of the text that gets replaced
    now the final part
    (\d+)\.0
    () = creates a match group which can then be referenced as \1, \2, etc.
    (\d+) = creates a match group containing one or more digits
    \. = literal dot
    0 = literal 0

    At this point you want an expression that finds either a single 0,
    one or more 0s, zero or more 0s, or …

    I leave it to you to decide which quantifier needs to be added to the 0, but
    I hope my explanation was good enough for you to understand which one can/should be used.
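
    To see how the quantifiers described above behave on the trailing zeros, here is a small Python sketch. The test strings and the helper name `full_matches` are made up for illustration; it only reports which strings each variant accepts in full.

```python
import re

# Hypothetical 7th-field values: no zero, one zero, three zeros.
candidates = ["79.", "79.0", "79.000"]

# The same tail of the expression with different quantifiers on the 0.
patterns = {
    "0": r"\d+\.0",        # exactly one 0
    "0*": r"\d+\.0*",      # zero or more 0s
    "0+": r"\d+\.0+",      # one or more 0s
    "0{3}": r"\d+\.0{3}",  # exactly three 0s
}


def full_matches(pattern):
    """Return the candidates that the pattern matches from start to end."""
    return [text for text in candidates if re.fullmatch(pattern, text)]


for label, pattern in patterns.items():
    print(label, "->", full_matches(pattern))
```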



  • @Dale-Zastoupil

    @PeterJones style :-)

    Solution as a base64-encoded string

    Xi57MTd9XC17MCwxfVxkK1wuXGQrXGgtezAsMX1cSyhcZCspXC4wKw

    npp can decode such strings but I hope you figured it out yourself.
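
    If you would rather decode the string outside Notepad++, a minimal Python sketch follows. The posted string has no trailing "=" padding, so the sketch adds it back before decoding. The second half is an assumed Python port of the decoded expression ([ \t] for \h, a captured prefix instead of \K), and the sample line is a hypothetical one built from the 79.000 case mentioned above.

```python
import base64
import re

encoded = "Xi57MTd9XC17MCwxfVxkK1wuXGQrXGgtezAsMX1cSyhcZCspXC4wKw"

# Restore the '=' padding so the length is a multiple of 4, then decode.
padded = encoded + "=" * (-len(encoded) % 4)
decoded = base64.b64decode(padded).decode("ascii")
print(decoded)

# Assumed Python equivalent of the decoded Notepad++ expression.
PORTED = re.compile(r"^(.{17}-?\d+\.\d+[ \t]-?)(\d+)\.0+")

# Hypothetical line with three decimal places in the 7th field.
line = "02 02 2017 19 51 -0.5 79.000 -5.4 30.620"
trimmed = PORTED.sub(r"\1\2", line)
print(trimmed)
```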



  • Got it. Thanks. On to crunch my last few years of data.

    Thanks again for your help. Much appreciated. I would have never gotten anywhere close to this.

