multiple lines in one line
-
I am having trouble coming up with a working regular expression. I have two questions, but at the very least I would like to understand the mechanics:
I need to put the “keepalive seq number” behind the date/timestamp, on the same line.
I have no idea how to filter this. Could someone please help me with this?
Example:
[2023-03-30 08:05:38,061] [>RECEIVED] [/100.211.10.31:55218] [2] [OK] []
<keepalive seq=“163”/>
[2023-03-30 08:05:38,061] [SENT>] [/100.211.10.31:55218] [2] [OK] []
<keepaliveResponse seq=“163” />
[2023-03-30 08:05:41,095] [>RECEIVED] [/100.211.10.34:52285] [1] [OK] []
<keepalive seq=“166”/>
It should look like this:
[2023-03-30 08:05:38,061] [>RECEIVED] [/10.211.1.31:55218] [2] [OK] []<keepalive seq=“163”/>
[2023-03-30 08:05:38,061] [SENT>] [/10.211.1.31:55218] [2] [OK] []<keepaliveResponse seq=“163” />
[2023-03-30 08:05:41,095] [>RECEIVED] [/10.211.1.34:52285] [1] [OK] []<keepalive seq=“166”/>
-
Hello, @Michael and All,
The solution is to replace any line break with nothing, IF it is followed by the <keepalive string, leading to the regex S/R :-
SEARCH
\R(?=<keepalive)
-
REPLACE
Leave EMPTY
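For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of it. It is only an illustration under a few assumptions: the sample uses straight quotes, the variable names are made up, and `\r?\n` stands in for `\R`, which Python's `re` module does not support.

```python
import re

# Shortened sample from the question (straight quotes assumed).
log = (
    '[2023-03-30 08:05:38,061] [>RECEIVED] [/100.211.10.31:55218] [2] [OK] []\n'
    '<keepalive seq="163"/>\n'
    '[2023-03-30 08:05:38,061] [SENT>] [/100.211.10.31:55218] [2] [OK] []\n'
    '<keepaliveResponse seq="163" />\n'
)

# Remove a line break only when the next line starts with "<keepalive".
# The lookahead (?=...) checks for that text without consuming it,
# so the tag itself stays in the output.
joined = re.sub(r'\r?\n(?=<keepalive)', '', log)
print(joined)
```

Because the lookahead only tests for the literal text `<keepalive`, it also catches the `<keepaliveResponse` lines.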
However, there is something weird in your data. For instance :
The INPUT text contains :
[2023-03-30 08:05:38,061] [>RECEIVED] [/100.211.10.31:55218] [2] [OK] []
And the OUTPUT text begins with :
[2023-03-30 08:05:38,061] [>RECEIVED] [/10.211.1.31:55218] [2] [OK] []…
Is it intentional ?
Best Regards,
guy038
-
-
@guy038
Sweet, it works :-)
I have to get more familiar with this; I only use a small set of regular expressions!
One last question: how can I grab a line that appears much later (line 22, <status>off</status>) and put it behind the date/time stamp?
22941826.001 |13:03:06.531 |AppInfo |SIPTcp - wait_SdlSPISignal: Outgoing SIP TCP message to 100.70.5…100 on port 2154 index 4405859
[1457776866,NET]
REFER sip:++4988870058640@100.70.5…100:2154;transport=TLS SIP/2.0
Via: SIP/2.0/TLS 100.64.4.41:5061;branch=z9hG4bK37b567b60ff3d2d
From: sip:++4988870058640@100.64.4.41;tag=773571913
To: sip:++4988870058640@100.70.5..100
Call-ID: 1d973e80-1ee18abc-17543e4-2940a40a@100.64.4.41
CSeq: 101 REFER
Max-Forwards: 70
Contact: sip:++4988870058640@100.64.4.41:5061;transport=tls
User-Agent: Cisco-CUCM14.0
Require: norefersub
Expires: 0
Refer-To: cid:1234567890@100.64.4.41
Content-Id: 1234567890@100.64.4.41
Content-Type: application/x-cisco-remotecc-request+xml
Referred-By: sip:++4988870058640@100.64.4.41
Content-Length: 116
<x-cisco-remotecc-request>
<hlogupdate>
<status>off</status>
</hlogupdate>
</x-cisco-remotecc-request>
-
Hello, @Michael ,
I advise you to use the </> icon when writing a post and replace the selection (code Text) with your own text. It makes your needs much easier to visualize !
And the best would be to include two sets of code texts :
- Your present data ( INPUT )
- Your expected data ( OUTPUT )
See you later
Best Regards,
guy038
-
-
Could you please help me with the following search-and-replace problem I am having?
How can I grab a line that appears much later (line = <status>off</status>) and put it behind the line with the date/time stamp?
Here is the data I currently have (“before” data):
22941826.001 |13:03:06.531 |AppInfo |SIPTcp - wait_SdlSPISignal: Outgoing SIP TCP message to 100.70.5…100 on port 2154 index 4405859
[1457776866,NET]
REFER sip:++4988870058640@100.70.5…100:2154;transport=TLS SIP/2.0
Via: SIP/2.0/TLS 100.64.4.41:5061;branch=z9hG4bK37b567b60ff3d2d
From: sip:++4988870058640@100.64.4.41;tag=773571913
To: sip:++4988870058640@100.70.5…100
Call-ID: 1d973e80-1ee18abc-17543e4-2940a40a@100.64.4.41
CSeq: 101 REFER
Max-Forwards: 70
Contact: sip:++4988870058640@100.64.4.41:5061;transport=tls
User-Agent: Cisco-CUCM14.0
Require: norefersub
Expires: 0
Refer-To: cid:1234567890@100.64.4.41
Content-Id: 1234567890@100.64.4.41
Content-Type: application/x-cisco-remotecc-request+xml
Referred-By: sip:++4988870058640@100.64.4.41
Content-Length: 116
<x-cisco-remotecc-request>
<hlogupdate>
<status>off</status>
</hlogupdate>
</x-cisco-remotecc-request>
Here is how I would like that data to look (“after” data):
22941826.001 |13:03:06.531 |AppInfo |SIPTcp - wait_SdlSPISignal: Outgoing SIP TCP message to 100.70.5…100 on port 2154 index 4405859 <status>off</status>
-
@Michael ,
please use the </> button in the forum post interface to highlight your example text as plain text (aka “code”) so that we can be more sure that the forum hasn’t modified your text.
- FIND = (?s)^(\d+\.\d+.*?$).*(<status>.*?</status>).*?</x-cisco-remotecc-request>
- REPLACE = $1 $2
- SEARCH MODE = Regular Expression
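As a rough illustration of what that FIND/REPLACE pair does, here is a Python sketch applied to a shortened version of the sample record. The shortened record and the variable names are assumptions made for the example; `re.MULTILINE` is added because, unlike Notepad++, Python only treats `^` and `$` as line anchors when that flag is set.

```python
import re

# Shortened version of the sample record; most header lines are left out.
record = (
    "22941826.001 |13:03:06.531 |AppInfo |SIPTcp - Outgoing SIP TCP message\n"
    "[1457776866,NET]\n"
    "CSeq: 101 REFER\n"
    "<x-cisco-remotecc-request>\n"
    "<hlogupdate>\n"
    "<status>off</status>\n"
    "</hlogupdate>\n"
    "</x-cisco-remotecc-request>\n"
)

# The (?s) flag is part of the pattern itself, exactly as in the FIND box.
pattern = re.compile(
    r"(?s)^(\d+\.\d+.*?$).*(<status>.*?</status>).*?</x-cisco-remotecc-request>",
    re.MULTILINE,
)

# \1 and \2 are Python's spelling of the $1 and $2 used in Notepad++.
result = pattern.sub(r"\1 \2", record)
print(result)
```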
My regex made assumptions: that your “keep” line always starts with digits followed by a dot followed by digits, and that there will always be a closing </x-cisco-remotecc-request> after the <status>...</status> line. If those assumptions are wrong, it’s because you didn’t show enough examples (usually, one example is not enough to get a successful regex).
----
Useful References
- Please Read Before Posting
- Template for Search/Replace Questions
- Formatting Forum Posts
- Notepad++ Online User Manual: Searching/Regex
- FAQ: Where to find other regular expressions (regex) documentation
----
Please note: This Community Forum is not a data transformation service; you should not expect to be able to always say “I have data like X and want it to look like Y” and have us do all the work for you. If you are new to the Forum, and new to regular expressions, we will often give help on the first one or two data-transformation questions, especially if they are well-asked and you show a willingness to learn; and we will point you to the documentation where you can learn how to do the data transformations for yourself in the future. But if you repeatedly ask us to do your work for you, you will find that the patience of usually-helpful Community members wears thin. The best way to learn regular expressions is by experimenting with them yourself, and getting a feel for how they work; having us spoon-feed you the answers without you putting in the effort doesn’t help you in the long term and is uninteresting and annoying for us.
-
Sorry, I am new to this and had not figured it out right away :-? Here it is again:
How can I grab a line that appears much later (line = <status>off</status>) and put it behind the line with the date/time stamp?
Here is the data I currently have (“before” data):
22941826.001 |13:03:06.531 |AppInfo |SIPTcp - wait_SdlSPISignal: Outgoing SIP TCP message to 100.70.5…100 on port 2154 index 4405859
[1457776866,NET]
REFER sip:++4988870058640@100.70.5…100:2154;transport=TLS SIP/2.0
Via: SIP/2.0/TLS 100.64.4.41:5061;branch=z9hG4bK37b567b60ff3d2d
From: sip:++4988870058640@100.64.4.41;tag=773571913
To: sip:++4988870058640@100.70.5…100
Call-ID: 1d973e80-1ee18abc-17543e4-2940a40a@100.64.4.41
CSeq: 101 REFER
Max-Forwards: 70
Contact: sip:++4988870058640@100.64.4.41:5061;transport=tls
User-Agent: Cisco-CUCM14.0
Require: norefersub
Expires: 0
Refer-To: cid:1234567890@100.64.4.41
Content-Id: 1234567890@100.64.4.41
Content-Type: application/x-cisco-remotecc-request+xml
Referred-By: sip:++4988870058640@100.64.4.41
Content-Length: 116
<x-cisco-remotecc-request>
<hlogupdate>
<status>off</status>
</hlogupdate>
</x-cisco-remotecc-request>
Here is how I would like that data to look (“after” data):
22941826.001 |13:03:06.531 |AppInfo |SIPTcp - wait_SdlSPISignal: Outgoing SIP TCP message to 100.70.5…100 on port 2154 index 4405859 <status>off</status>
-
@Michael ,
Did you read the rest of my reply, where I also gave you the FIND WHAT and REPLACE WITH already?
Because with the updated data, my regex still gives the results you wanted.
-
Hi, @michael, @peterjones and All,
Starting with this INPUT text :
22941826.001 |13:03:06.531 |AppInfo |SIPTcp - wait_SdlSPISignal: Outgoing SIP TCP message to 100.70.5…100 on port 2154 index 4405859
[1457776866,NET]
REFER sip:++4988870058640@100.70.5…100:2154;transport=TLS SIP/2.0
Via: SIP/2.0/TLS 100.64.4.41:5061;branch=z9hG4bK37b567b60ff3d2d
From: sip:++4988870058640@100.64.4.41;tag=773571913
To: sip:++4988870058640@100.70.5…100
Call-ID: 1d973e80-1ee18abc-17543e4-2940a40a@100.64.4.41
CSeq: 101 REFER
Max-Forwards: 70
Contact: sip:++4988870058640@100.64.4.41:5061;transport=tls
User-Agent: Cisco-CUCM14.0
Require: norefersub
Expires: 0
Refer-To: cid:1234567890@100.64.4.41
Content-Id: 1234567890@100.64.4.41
Content-Type: application/x-cisco-remotecc-request+xml
Referred-By: sip:++4988870058640@100.64.4.41
Content-Length: 116
<x-cisco-remotecc-request>
<hlogupdate>
<status>off</status>
</hlogupdate>
</x-cisco-remotecc-request>
Strictly speaking, the following regex S/R, almost identical to @peterjones’s one :
-
SEARCH
(?-s)(^\d+\.\d+.+)(?s:.+?)(<status>off</status>)
-
REPLACE
$1 $2
Gives this OUTPUT text :
22941826.001 |13:03:06.531 |AppInfo |SIPTcp - wait_SdlSPISignal: Outgoing SIP TCP message to 100.70.5…100 on port 2154 index 4405859 <status>off</status>
</hlogupdate>
</x-cisco-remotecc-request>
I suppose that it’s not exactly what you want ! So here is a second regex S/R which seems more coherent :
-
SEARCH
(?-s)(^\d+\.\d+.+)(?s).+?</x-cisco-remotecc-request>
-
REPLACE
$1 <status>off</status>
This time, it gives the expected OUTPUT text :
22941826.001 |13:03:06.531 |AppInfo |SIPTcp - wait_SdlSPISignal: Outgoing SIP TCP message to 100.70.5…100 on port 2154 index 4405859 <status>off</status>
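The difference between the two S/R above can be reproduced with a small Python sketch. This is an approximation, not the exact Notepad++ syntax: the input is a shortened stand-in for the INPUT text, and the flags are written in the scoped form `(?s:...)` with the dot otherwise left in its default mode, which should behave the same way for this data.

```python
import re

# Shortened stand-in for the INPUT text, one item per line.
text = (
    "22941826.001 |13:03:06.531 |AppInfo |SIPTcp - Outgoing SIP TCP message\n"
    "Content-Length: 116\n"
    "<x-cisco-remotecc-request>\n"
    "<hlogupdate>\n"
    "<status>off</status>\n"
    "</hlogupdate>\n"
    "</x-cisco-remotecc-request>\n"
)

# First S/R: the match stops right after <status>off</status>,
# so the two closing tags that follow are left in the text.
first = re.sub(
    r"^(\d+\.\d+.+)(?s:.+?)(<status>off</status>)",
    r"\1 \2",
    text,
    flags=re.MULTILINE,
)

# Second S/R: the match runs up to the closing tag,
# so the whole block is consumed and only the first line survives.
second = re.sub(
    r"^(\d+\.\d+.+)(?s:.+?)</x-cisco-remotecc-request>",
    r"\1 <status>off</status>",
    text,
    flags=re.MULTILINE,
)

print(first)
print(second)
```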
Best Regards,
guy038
-
-
Okay, nicely done!!
(1) I did not know that I can use several values ($1 $2) and combine them; I have only done simpler stuff.
(2) I do not understand how you go down all the rows until you find the value mentioned by <status>. What is the key operator for this?
The status could also be on:
<status>on</status>
My idea is to go through thousands of log files, where I would run this command, and it should grab the current status.
Could you help me with point 2?
-
@Michael ,
I will briefly explain my regex, but really, you should look at the documentation I linked you to.
FULL:
(?s)^(\d+\.\d+.*?$).*(<status>.*?</status>).*?</x-cisco-remotecc-request>
- (?s) = turns on “. matches newline”
- ^ = starts the match at the beginning of a line
- (\d+\.\d+.*?$) = looks for one or more digits, then a decimal point, then one or more digits, followed by everything to the end of the line, and because of the parentheses, puts it all into group #1 (referenced later as $1)
- .* = matches 0 or more characters, including the newline; hence, this allows it to match multiple lines at once
- (<status>.*?</status>) = matches <status> and </status> plus the text they surround, and puts that unit into group #2 (referenced later as $2)
- .*?</x-cisco-remotecc-request> = finds all the characters up to and including the next </x-cisco-remotecc-request> closing tag
Replacement:
- $1 $2 = replace everything matched above (from the beginning of the first line in the match all the way to </x-cisco-remotecc-request>) with just the contents of group 1, followed by a space, followed by the contents of group 2.
“i do not understand how you go down all rows, till you find the value mentioned by <status>. what is the key operator for this?”
The .* , when “. matches newline” or (?s) is enabled, will match 0 or more characters including newlines, hence it can capture multiple lines in between.
“Status could be even on”
My regular expression handled that, because it accepted anything between the <status> and </status>, and put that into a group; Guy’s regex ignored your actual status portion, and just always replaced it with <status>off</status> (because your example only showed one value, and he assumed that you always wanted to convert the status to off because of the way you phrased things).