Removing duplicated lines from a log file
-
Hello,
I want to clean up a log file by removing the duplicated lines between two errors.
Because the log file is time-based, the lines are not exactly the same.
And if the error reappears later, it has to be logged again.
Maybe a small example:
2018-10-01 09:35:14.101 -04:00 [Debug] Button pressed
2018-10-01 09:35:14.120 -04:00 [Debug] Button pressed
2018-10-01 09:35:15.345 -04:00 [Debug] Button pressed
2018-10-01 09:35:15.824 -04:00 [Debug] Button pressed
2018-10-01 09:35:16.826 -04:00 [Debug] Button pressed
2018-10-01 09:36:16.253 -04:00 [Debug] Button pressed
2018-10-01 09:39:27.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:39:28.954 -04:00 [Debug] Current state changed: "MainState: Initial, PreviousMainState: Initial, AggregatedState: Operational, HardwareState: ControllerState: Undefined,AirflowState: Undefined,Airpressuretate: Undefined,VacuumState: Unknown,OutletState: Operational,FrontDoorState: Locked,BackDoorState: Locked,HandlingGateDoorState: Open,ExhaustState: On,IsVacuumApplied: False,VacuumFlow: 0.0979033783078194,AreAllDoorsClosed: False,IsSupplyDoorClosed: True,IsSupplyStationAlarm: False,IsTankInkLevelLow: False,IsEmergencyActive: False,LeftSensorTriggered: False,RightSensorTriggered: False,AirFlow1Active: False,
2018-10-01 09:39:29.954 -04:00 [Debug] Current state changed: "MainState: Initial, PreviousMainState: Initial, AggregatedState: Operational, HardwareState: ControllerState: Undefined,AirflowState: Undefined,Airpressuretate: Undefined,VacuumState: Unknown,OutletState: Operational,FrontDoorState: Locked,BackDoorState: Locked,HandlingGateDoorState: Open,ExhaustState: On,IsVacuumApplied: False,VacuumFlow: 0.0979033783078194,AreAllDoorsClosed: False,IsSupplyDoorClosed: True,IsSupplyStationAlarm: False,IsTankInkLevelLow: False,IsEmergencyActive: False,LeftSensorTriggered: False,RightSensorTriggered: False,AirFlow1Active: False,
2018-10-01 09:39:30.014 -04:00 [Debug] Stack light to Red
2018-10-01 09:40:15.824 -04:00 [Debug] Button pressed
2018-10-01 09:41:13.824 -04:00 [Debug] Button pressed
2018-10-01 09:42:15.924 -04:00 [Debug] Button pressed
2018-10-01 09:43:11.254 -04:00 [Debug] Button pressed
2018-10-01 09:43:27.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:44:27.789 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:44:28.105 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:44:31.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:45:11.254 -04:00 [Debug] Button pressed
needs to become
2018-10-01 09:35:14.101 -04:00 [Debug] Button pressed
2018-10-01 09:39:27.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:39:28.954 -04:00 [Debug] Current state changed: "MainState: Initial, PreviousMainState: Initial, AggregatedState: Operational, HardwareState: ControllerState: Undefined,AirflowState: Undefined,Airpressuretate: Undefined,VacuumState: Unknown,OutletState: Operational,FrontDoorState: Locked,BackDoorState: Locked,HandlingGateDoorState: Open,ExhaustState: On,IsVacuumApplied: False,VacuumFlow: 0.0979033783078194,AreAllDoorsClosed: False,IsSupplyDoorClosed: True,IsSupplyStationAlarm: False,IsTankInkLevelLow: False,IsEmergencyActive: False,LeftSensorTriggered: False,RightSensorTriggered: False,AirFlow1Active: False,
2018-10-01 09:39:30.014 -04:00 [Debug] Stack light to Red
2018-10-01 09:40:15.824 -04:00 [Debug] Button pressed
2018-10-01 09:43:27.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:45:11.254 -04:00 [Debug] Button pressed
Thanks in advance
Danny -
Try this:
Invoke the Replace dialog (default key: Ctrl+H)
Find what zone: (?-s)^(.{39}(.+)\R)(.{39}\2\R)+
Replace with zone: \1
Wrap around checkbox: ticked
Search mode selection: Regular expression
Action: Press the Replace All button
Here’s the details of how it works:
THE FIND EXPRESSION:
(?-s)^(.{39}(.+)\R)(.{39}\2\R)+
- (?-s) : [Use these options for the whole regular expression][1]
  - - : [(hyphen inverts the meaning of the letters that follow)][1]
  - s : [Dot doesn’t match line breaks][1]
- ^ : [Assert position at the beginning of a line (at beginning of the string or after a line break character) (carriage return and line feed, form feed)][2]
- (.{39}(.+)\R) : [Match the regex below and capture its match into backreference number 1][3]
  - .{39} : [Match any single character that is NOT a line break character (line feed, carriage return, form feed)][4]
    - {39} : [Exactly 39 times][5]
  - (.+) : [Match the regex below and capture its match into backreference number 2][3]
    - .+ : [Match any single character that is NOT a line break character (line feed, carriage return, form feed)][4]
      - + : [Between one and unlimited times, as many times as possible, giving back as needed (greedy)][6]
  - \R : [Match a line break (carriage return and line feed pair, sole line feed, sole carriage return, vertical tab, form feed)][7]
- (.{39}\2\R)+ : [Match the regex below and capture its match into backreference number 3][3]
  - + : [Between one and unlimited times, as many times as possible, giving back as needed (greedy)][6]
  - + : [You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations.][8] [Or, if you don’t want to capture anything, replace the capturing group with a non-capturing group to make your regex more efficient.][8]
  - .{39} : [Match any single character that is NOT a line break character (line feed, carriage return, form feed)][4]
    - {39} : [Exactly 39 times][5]
  - \2 : [Match the same text that was most recently matched by capturing group number 2 (case sensitive; fail if the group did not participate in the match so far)][9]
  - \R : [Match a line break (carriage return and line feed pair, sole line feed, sole carriage return, vertical tab, form feed)][7]
THE REPLACE EXPRESSION:
\1
- \1 : [Insert the text that was last matched by capturing group number 1][10]
Created with RegexBuddy
[1]: https://www.regular-expressions.info/modifiers.html
[2]: https://www.regular-expressions.info/anchors.html
[3]: https://www.regular-expressions.info/brackets.html
[4]: https://www.regular-expressions.info/dot.html
[5]: https://www.regular-expressions.info/repeat.html#limit
[6]: https://www.regular-expressions.info/repeat.html
[7]: https://www.regular-expressions.info/nonprint.html
[8]: https://www.regular-expressions.info/captureall.html
[9]: https://www.regular-expressions.info/backref.html
[10]: https://www.regular-expressions.info/replacebackref.html
RegexBuddy settings to emulate N++ regex engine: Application=boost::regex 1.54-1.57 / flavor=Default flavor / replacement flavor=All flavor / ^$ match at line breaks / Numbered capture / Allow zero-length matches
This will turn the original text:
2018-10-01 09:35:14.101 -04:00 [Debug] Button pressed
2018-10-01 09:35:14.120 -04:00 [Debug] Button pressed
2018-10-01 09:35:15.345 -04:00 [Debug] Button pressed
2018-10-01 09:35:15.824 -04:00 [Debug] Button pressed
2018-10-01 09:35:16.826 -04:00 [Debug] Button pressed
2018-10-01 09:36:16.253 -04:00 [Debug] Button pressed
2018-10-01 09:39:27.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:39:28.954 -04:00 [Debug] Current state changed: "MainState: Initial, PreviousMainState: Initial, AggregatedState: Operational, HardwareState: ControllerState: Undefined,AirflowState: Undefined,Airpressuretate: Undefined,VacuumState: Unknown,OutletState: Operational,FrontDoorState: Locked,BackDoorState: Locked,HandlingGateDoorState: Open,ExhaustState: On,IsVacuumApplied: False,VacuumFlow: 0.0979033783078194,AreAllDoorsClosed: False,IsSupplyDoorClosed: True,IsSupplyStationAlarm: False,IsTankInkLevelLow: False,IsEmergencyActive: False,LeftSensorTriggered: False,RightSensorTriggered: False,AirFlow1Active: False,
2018-10-01 09:39:29.954 -04:00 [Debug] Current state changed: "MainState: Initial, PreviousMainState: Initial, AggregatedState: Operational, HardwareState: ControllerState: Undefined,AirflowState: Undefined,Airpressuretate: Undefined,VacuumState: Unknown,OutletState: Operational,FrontDoorState: Locked,BackDoorState: Locked,HandlingGateDoorState: Open,ExhaustState: On,IsVacuumApplied: False,VacuumFlow: 0.0979033783078194,AreAllDoorsClosed: False,IsSupplyDoorClosed: True,IsSupplyStationAlarm: False,IsTankInkLevelLow: False,IsEmergencyActive: False,LeftSensorTriggered: False,RightSensorTriggered: False,AirFlow1Active: False,
2018-10-01 09:39:30.014 -04:00 [Debug] Stack light to Red
2018-10-01 09:40:15.824 -04:00 [Debug] Button pressed
2018-10-01 09:41:13.824 -04:00 [Debug] Button pressed
2018-10-01 09:42:15.924 -04:00 [Debug] Button pressed
2018-10-01 09:43:11.254 -04:00 [Debug] Button pressed
2018-10-01 09:43:27.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:44:27.789 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:44:28.105 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:44:31.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:45:11.254 -04:00 [Debug] Button pressed
Into the desired text:
2018-10-01 09:35:14.101 -04:00 [Debug] Button pressed
2018-10-01 09:39:27.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:39:28.954 -04:00 [Debug] Current state changed: "MainState: Initial, PreviousMainState: Initial, AggregatedState: Operational, HardwareState: ControllerState: Undefined,AirflowState: Undefined,Airpressuretate: Undefined,VacuumState: Unknown,OutletState: Operational,FrontDoorState: Locked,BackDoorState: Locked,HandlingGateDoorState: Open,ExhaustState: On,IsVacuumApplied: False,VacuumFlow: 0.0979033783078194,AreAllDoorsClosed: False,IsSupplyDoorClosed: True,IsSupplyStationAlarm: False,IsTankInkLevelLow: False,IsEmergencyActive: False,LeftSensorTriggered: False,RightSensorTriggered: False,AirFlow1Active: False,
2018-10-01 09:39:30.014 -04:00 [Debug] Stack light to Red
2018-10-01 09:40:15.824 -04:00 [Debug] Button pressed
2018-10-01 09:43:27.014 -04:00 [Debug] Stack light to Yellow
2018-10-01 09:45:11.254 -04:00 [Debug] Button pressed
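For anyone who would rather script this than use the Replace dialog, here is a minimal Python sketch of the same find/replace idea. It is not part of the original answer: the names `DUPLICATE_RUN`, `collapse_runs` and `sample` are made up for illustration, and because Python's `re` module has no `\R`, the sketch assumes plain `\n` line endings. The 39 is the length of the timestamp, UTC offset and `[Debug] ` prefix in the sample log.

```python
import re

# Assumption: "\n" line endings (Python's re has no \R).
# Group 1 is the whole first line of a run, group 2 is its message
# (everything after the 39-character timestamp/offset/"[Debug] " prefix).
# The non-capturing group swallows following lines with the same message.
DUPLICATE_RUN = re.compile(r"^(.{39}(.+)\n)(?:.{39}\2\n)+", re.MULTILINE)


def collapse_runs(log_text: str) -> str:
    """Keep only the first line of each run of lines sharing a message."""
    return DUPLICATE_RUN.sub(r"\1", log_text)


sample = (
    "2018-10-01 09:35:14.101 -04:00 [Debug] Button pressed\n"
    "2018-10-01 09:35:14.120 -04:00 [Debug] Button pressed\n"
    "2018-10-01 09:39:27.014 -04:00 [Debug] Stack light to Yellow\n"
    "2018-10-01 09:39:30.014 -04:00 [Debug] Stack light to Red\n"
    "2018-10-01 09:40:15.824 -04:00 [Debug] Button pressed\n"
    "2018-10-01 09:41:13.824 -04:00 [Debug] Button pressed\n"
)

print(collapse_runs(sample), end="")
```

As with the Notepad++ version, a final line that lacks a trailing line break will not be treated as part of a run.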
-
Thank you for your quick response and also for the clear explanation and links to a manual.
This expression does exactly what I meant. -
@D said:
Thank you for your quick response and also for the clear explanation and links to a manual.
THAT is what we helpers like to hear for a response! :-)
Note that the search could be made more restrictive, if necessary. For instance, one could verify that the lines start with a date followed by a time followed by the string [Debug] …but I did not include this in my solution because it was more work. :-)
- 2 months later
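The more restrictive search mentioned above (a date, then a time, then the string [Debug]) could be sketched in Python roughly as follows. This is only an illustration under assumptions: the prefix layout is inferred from the sample log, the names `PREFIX`, `STRICT_RUN` and `collapse_strict` are hypothetical, and `\n` stands in for `\R`, which Python's `re` module does not support.

```python
import re

# Assumed prefix layout, taken from the sample log: date, time with
# milliseconds, UTC offset, then the literal "[Debug] " tag.
PREFIX = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{2}:\d{2} \[Debug\] "

# Same shape as the .{39} expression, but the prefix must look like a
# real timestamp instead of being any 39 characters.
STRICT_RUN = re.compile(rf"^({PREFIX}(.+)\n)(?:{PREFIX}\2\n)+", re.MULTILINE)


def collapse_strict(log_text: str) -> str:
    """Collapse runs of same-message log lines; leave other lines alone."""
    return STRICT_RUN.sub(r"\1", log_text)


sample = (
    "2018-10-01 09:44:27.789 -04:00 [Debug] Stack light to Yellow\n"
    "2018-10-01 09:44:28.105 -04:00 [Debug] Stack light to Yellow\n"
    "free-form note number 1, not a log entry: same tail\n"
    "free-form note number 2, not a log entry: same tail\n"
)

print(collapse_strict(sample), end="")
```

Lines that do not begin with the assumed prefix, such as the two free-form notes in the sample, are never part of a match and so stay untouched.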
-
Some lines in my file were deleted by the formula above even though they differ by a few words. What if I only want to delete lines that match completely?
And what if I want to remove both the repeated line and the line it repeats? I mean delete both of them. I hope to get your answer. -
@Sarah-Duong , welcome to the forum.
If I understand correctly, you have a data file – I’ll assume it looks similar to D’s original logfile – where there’s a possibility that two rows might be exactly alike, as in this example data:
2018-10-01 09:35:14.101 -04:00 [Debug] Button pressed
2018-10-01 09:35:14.120 -04:00 [Debug] Button pressed EXACTLY THE SAME
2018-10-01 09:35:14.120 -04:00 [Debug] Button pressed EXACTLY THE SAME
2018-10-01 09:35:14.120 -04:00 [Debug] Button pressed EXACTLY THE SAME
2018-10-01 09:35:15.345 -04:00 [Debug] Button pressed
2018-10-01 09:35:15.824 -04:00 [Debug] Button pressed
2018-10-01 09:35:16.826 -04:00 [Debug] Button pressed EXACTLY THE SAME
2018-10-01 09:35:16.826 -04:00 [Debug] Button pressed EXACTLY THE SAME
2018-10-01 09:36:16.253 -04:00 [Debug] Button pressed
And you want to delete both of the matching lines, so that the log above would be filtered down to
2018-10-01 09:35:14.101 -04:00 [Debug] Button pressed
2018-10-01 09:35:15.345 -04:00 [Debug] Button pressed
2018-10-01 09:35:15.824 -04:00 [Debug] Button pressed
2018-10-01 09:36:16.253 -04:00 [Debug] Button pressed
If that’s the case, then
- Find What = (?-s)^(.+\R)(\1)+
- Replace With = (leave this field empty)
- Search Mode = Regular Expression
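Outside Notepad++, the settings above can be approximated with a short Python sketch. The names `EXACT_RUN`, `drop_repeated_lines` and `sample` are illustrative only, and `\n` is assumed as the line ending because Python's `re` module has no `\R`.

```python
import re

# Group 1 is one complete line including its line break; \1+ requires
# one or more immediately following copies of that exact line.
EXACT_RUN = re.compile(r"^(.+\n)\1+", re.MULTILINE)


def drop_repeated_lines(text: str) -> str:
    """Delete every copy of a line that is repeated consecutively."""
    return EXACT_RUN.sub("", text)


sample = (
    "2018-10-01 09:35:14.101 -04:00 [Debug] Button pressed\n"
    "2018-10-01 09:35:14.120 -04:00 [Debug] Button pressed\n"
    "2018-10-01 09:35:14.120 -04:00 [Debug] Button pressed\n"
    "2018-10-01 09:35:14.120 -04:00 [Debug] Button pressed\n"
    "2018-10-01 09:35:15.345 -04:00 [Debug] Button pressed\n"
)

print(drop_repeated_lines(sample), end="")
```

Only adjacent identical lines are removed; identical lines separated by other lines are left alone, and a last line without a trailing line break cannot take part in a match.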
If that’s not the case, then you’ll need to provide more information – like example data, including both the BEFORE and the AFTER data, so we know exactly how you want it to change. In order to help us help you, please markup your post in a way that the data comes through exactly as you entered it: make use of your PREVIEW pane, and the formatting/Markdown-in-this-forum instructions given in the “FYI” section I am quoting below.
-----
FYI: This forum is formatted using Markdown, with a help link buried on the little grey ? in the COMPOSE window/pane when writing your post. For more about how to use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end. It is very important that you use these formatting tips – using single backtick marks around small snippets, and using code-quoting for pasting multiple lines from your example data files – because otherwise, the forum will change normal quotes ("") to curly “smart” quotes (“”), will change hyphens to dashes, will sometimes hide asterisks (or if your text is c:\folder\*.txt, it will show up as c:\folder*.txt, missing the backslash). If you want to clearly communicate your text data to us, you need to properly format it.
If you have further search-and-replace (“matching”, “marking”, “bookmarking”, regular expression, “regex”) needs, study this FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting, see the paragraph above.
Please note that for all regex and related queries, it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.
- about a year later
-
I have created a new topic. But can you tell me how to post a short excerpt from my text file here so that people can easily see it? I have seen some people do this, but I really do not know how to do it. It is similar to taking a screenshot and posting it in your topic.
-
@PeterJones Is there a post with instructions for sending questions here? That would help you understand the question and support it correctly. Perhaps I just have not found that post yet.