I am having Zero issues!
-
That regex worked perfectly. However I went onto my next logfile and realized the extra .0’s I created! I made a macro that converted one of the fields from a temperature with a .0 at the end to a regular number. But at some point, it must have clipped some data and made an extra field. So I’ll need some more help. Here is one of the lines.
01 02 2017 00 05 13.4 78.0 7.8 30.260 7.0 11.0 325 0 0.000 0.000 0.000 13.4
If you look at the 7th field, it shows 78.0. That number should be just 78 with no decimal. The problem is this is temp data. So it could have a leading single digit, negative digit, or triple digit. What I’m looking to do is just drop the .0.
Thanks again guys.
-
Give this a try:
Find what:
^(?>\d+\.*\d*\h){6}-{0,1}\K(\d+)\.0
Replace with:
\1
Because of the \K usage you need to press Replace All. You cannot step through the matches one at a time.
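For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of that pattern. It is an approximation, not the regex itself: Python's re module has no \K or \h, so the sketch assumes [ \t] is close enough to \h, uses a plain (?:...) group in place of (?>...), and captures the prefix in a group and writes it back instead of using \K. The name drop_decimal is made up for illustration.

```python
import re

# Approximation of the Notepad++ pattern for Python's re module (assumptions):
#   (?>...) -> (?:...)   plain non-capturing group
#   \h      -> [ \t]     space or tab only
#   \K      -> the prefix is captured in group 1 and written back
PATTERN = re.compile(r"^((?:\d+\.*\d*[ \t]){6}-?)(\d+)\.0", re.MULTILINE)


def drop_decimal(text: str) -> str:
    """Strip the .0 from the 7th field of every line that matches."""
    return PATTERN.sub(r"\1\2", text)


line = "01 02 2017 00 05 13.4 78.0 7.8 30.260 7.0 11.0 325 0 0.000 0.000 0.000 13.4"
print(drop_decimal(line))
```

Because the prefix is put back by the replacement, there is no stepping problem here; sub() simply handles every matching line in one pass.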
-
@Dale-Zastoupil said:
So it could have a leading single digit, negative digit, or triple digit.
Are you referring to JUST the 7th field as possibly being a negative number, or maybe all fields? It does make a huge difference in the regex to help you. Also, is it always the 7th field that you might need to clip the .0 from, and no other field?
Good on @Ekopalypse for giving you a possible answer, but I’d prefer to know the full range of possible data before committing.
Terry
-
That is correct. Only modify the seventh set of numbers. That seventh number is the current temperature. I need it with no decimal point. Since I live in North Dakota, that number could be below zero. And it could be 100 degrees. So we have a wide variance.
The regex posted above didn’t quite work. It missed some numbers. I’ll post a section that did convert followed by a chunk that didn’t. I’m guessing the negative number in the 6th position threw things off.
02 02 2017 19 44 0.7 79 -4.3 30.620 2.0 4.0 238 0 0.000 0.000 0.000 0.7
02 02 2017 19 45 0.7 79 -4.3 30.620 2.0 4.0 238 0 0.000 0.000 0.000 0.7
02 02 2017 19 46 0.1 79 -4.8 30.620 3.0 4.0 250 0 0.000 0.000 0.000 0.1
02 02 2017 19 51 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5
02 02 2017 19 52 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5
02 02 2017 19 53 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5
02 02 2017 19 54 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5
02 02 2017 19 56 -0.7 80.0 -5.4 30.620 4.0 4.0 243 0 0.000 0.000 0.000 -0.7
-
You are right, this is an issue. As @Terry-R already mentioned, when using regex
you need a complete understanding of what the data looks like.
From your postings I gather that the data is as follows:
two digit DAY
two digit MONTH
four digit YEAR
two digit HOUR (?)
two digit MINUTE (?)
always separated by a space. This is fixed across all the data, right? That is, it cannot happen
that a day is represented as one digit or a year is represented with only two digits, correct?
The next field (6th field) can be any decimal number, with a minus sign to represent a negative number. The decimal separator is a dot. Then we come to the 7th field.
If all of the above is true, then a regex like
^.{17}\-{0,1}\d+\.\d+\h-{0,1}\K(\d+)\.0
and replace with
\1
should do the job.
If it isn’t true, then you need to provide more info on what each field can look like.
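As a rough cross-check of the fixed-width idea, the same pattern can be sketched in Python on a few lines from this thread. Again this is an approximation under stated assumptions: [ \t] stands in for \h, and a capture group plus write-back replaces \K, since Python's re module supports neither. The helper name trim_seventh_field is hypothetical.

```python
import re

# Sketch of the fixed-width pattern (assumptions: [ \t] for \h, group 1 for \K):
#   .{17}        the five date/time fields plus their trailing spaces
#   -?\d+\.\d+   the 6th field, an optionally negative decimal
#   [ \t]-?      the separator and an optional minus sign of the 7th field
#   (\d+)\.0     the 7th field's integer part, followed by the .0 to drop
PATTERN = re.compile(r"^(.{17}-?\d+\.\d+[ \t]-?)(\d+)\.0", re.MULTILINE)


def trim_seventh_field(text: str) -> str:
    """Remove the .0 from the 7th field; lines that do not match stay as they are."""
    return PATTERN.sub(r"\1\2", text)


lines = [
    "02 02 2017 19 46 0.1 79 -4.8 30.620 3.0 4.0 250 0 0.000 0.000 0.000 0.1",
    "02 02 2017 19 51 -0.5 79.0 -5.4 30.620 3.0 3.0 244 0 0.000 0.000 0.000 -0.5",
    "02 02 2017 19 56 -0.7 80.0 -5.4 30.620 4.0 4.0 243 0 0.000 0.000 0.000 -0.7",
]
for original in lines:
    print(trim_seventh_field(original))
```

The first line already has a bare 79 and is left untouched; the other two have a negative 6th field, which this pattern allows for explicitly.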
-
You are correct, the first fields are day/month/year/hour/minute. Not sure what the next number is, but temp is next. A couple of fields later I recognize barometric pressure in inches. The 238 field is a wind direction out of 360. After that, I’m really not sure.
But I tried your REGEX on 2 files, a total of 44k lines of logs. It converted both of those perfectly. I think it looks cracked. I’ll try a few more log files later today when I have more time. But I really appreciate your assistance.
-
OK, ran into another issue. Had some holes in my data that I had to fill in with a different piece of software. So now instead of showing 79.0 with one decimal place, I now have 3 decimal places, i.e. 79.000. Which part of the regex needs to be altered to trim 3 decimals?
-
Let’s split the regex
^.{17}\-{0,1}\d+\.\d+\h-{0,1}\K(\d+)\.0
^ = anchors the match at the start of the line
. = wildcard for any char except end of line chars
{17} = {} is quantifier like the * and +
* means 0 or more times
+ means one or more time
{} with a single number like 17 means 17 times
{0,1} means either 0 or 1 times
\ = escapes the following char, because that char might otherwise have a special meaning in the regex language itself
\-{0,1} = look for a literal minus either 0 or 1 times
\d = a regex expression meaning any digit
\d+ = any digit one or more times
\. = differs from . in that it means a literal dot; remember, it has been escaped
\d+ = as before
\h = a regex expression meaning a horizontal space char like a space and a tab and …
-{0,1} = literal minus either zero or one time - oopps - can you guess what I was thinking
\K = regex expression basically saying forget about all already found. It basically treats the previous
regex as a condition without creating a match group
now the final part
(\d+)\.0
() = create a match group which then can be referenced as \1 and \2 etc…
(\d+) = create a match group if it is any digit one or more times
\. = literal dot
0 = literal 0
At this point you want to have an expression which finds either a single 0,
or one or more 0s, or zero or more 0s, or …
I leave it to you to decide which quantifier needs to be added to the 0 but
I hope my explanation was good enough to understand which one can/should be used.
-
@PeterJones style :-)
Solution as a base64-encoded string
Xi57MTd9XC17MCwxfVxkK1wuXGQrXGgtezAsMX1cSyhcZCspXC4wKw
npp can decode such strings but I hope you figured it out yourself.
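If you would rather check the string outside the editor, a small Python sketch along these lines should decode it. One assumption to note: the string above is posted without its trailing = padding, so the sketch adds it back before decoding.

```python
import base64

encoded = "Xi57MTd9XC17MCwxfVxkK1wuXGQrXGgtezAsMX1cSyhcZCspXC4wKw"

# The posted string has no '=' padding; b64decode expects a length that is
# a multiple of 4, so pad it up to the next multiple first.
padded = encoded + "=" * (-len(encoded) % 4)

decoded = base64.b64decode(padded).decode("ascii")
print(decoded)
```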
-
Got it. Thanks. On to crunch my last few years of data.
Thanks again for your help. Much appreciated. I would have never gotten anywhere close to this.