Remove text in multiple lines starting with specific text and ending in specific text



  • Hello,

    I have a large text file of about 70k lines, and I would like to delete, on every line, everything between one specific string and another.

    For example:

    May 11 04:34:03 175.19.61.11 245.44.1.88 2019-05-11T04:33:57.556+0300 sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 4087 D MDnsDS : Going to poll with pollCount 1
    May 11 05:07:02 175.19.61.11 245.44.1.88 2019-05-11T05:06:56.705+0300 sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 4087 I EventLogService: Aggregate from 1557538437642 (log), 1557538437642 (data)
    May 11 05:07:02 175.19.61.11 245.44.1.88 2019-05-11T05:06:56.706+0300 sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 4088 D NtpTrustedTime: currentTimeMillis() cache hit
    May 11 05:07:02 175.19.61.11 245.44.1.88 2019-05-11T05:06:56.706+0300 sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 D ConnectivityService: setProvNotificationVisibleIntent null visible=false networkType=MOBILE

    In this case, I want to remove the part of each line that starts with “175.19.61.11” at the 17th character and runs until “SyslogWorker -”.
    I would like this content deleted from all lines.

    I tried using a regular expression but could not get it to work. Can anyone assist?



  • @Ioakeim-Fragkoulis, welcome to the Notepad++ Community.

    You said:

    I tried using a regular expression but could not get it to work. Can anyone assist?

    Regex should do it. What did you try?

    Given the data

    May 11 04:34:03 175.19.61.11 245.44.1.88 2019-05-11T04:33:57.556+0300 sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 4087 D MDnsDS : Going to poll with pollCount 1
    May 11 05:07:02 175.19.61.11 245.44.1.88 2019-05-11T05:06:56.705+0300 sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 4087 I EventLogService: Aggregate from 1557538437642 (log), 1557538437642 (data)
    May 11 05:07:02 175.19.61.11 245.44.1.88 2019-05-11T05:06:56.706+0300 sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 4088 D NtpTrustedTime: currentTimeMillis() cache hit
    May 11 05:07:02 175.19.61.11 245.44.1.88 2019-05-11T05:06:56.706+0300 sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 D ConnectivityService: setProvNotificationVisibleIntent null visible=false networkType=MOBILE
    

    The right regex depends on what you mean by “until”.

    Assuming “until” means “up to but not including”:

    • Find = (?-s)^.{16}\K175\.19\.61\.11.*(?=SyslogWorker -)
      • (?-s) keeps the dot from matching EOL; ^.{16} requires exactly 16 characters before the 175, and \K keeps them out of the match; 175\.19\.61\.11 finds the address, with the periods escaped so they are literal; .* matches anything after it (greedily, so up to the last SyslogWorker - on the line); the lookahead (?=SyslogWorker -) requires SyslogWorker - to come after that anything, but doesn’t include it in the match (so it won’t be deleted)
    • Replace = (empty)
      • throw away everything that matched
    • mode = regular expression

    which gives:

    May 11 04:34:03 SyslogWorker - 4008 4087 D MDnsDS : Going to poll with pollCount 1
    May 11 05:07:02 SyslogWorker - 4008 4087 I EventLogService: Aggregate from 1557538437642 (log), 1557538437642 (data)
    May 11 05:07:02 SyslogWorker - 4008 4088 D NtpTrustedTime: currentTimeMillis() cache hit
    May 11 05:07:02 SyslogWorker - 4008 D ConnectivityService: setProvNotificationVisibleIntent null visible=false networkType=MOBILE
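
For anyone who would rather script the same edit outside Notepad++, here is a minimal Python sketch of the “up to but not including” version. It is only an illustration under a few assumptions: Python’s re module has no \K, so the 16 leading characters are captured and written back with \1 instead; (?-s) is dropped because the dot does not match newlines in Python by default; and the function name strip_before_marker is hypothetical.

```python
import re

# Python translation of the Notepad++ Find expression.
# Group 1 holds the 16 leading characters that \K would have kept.
# The lookahead leaves "SyslogWorker -" in place.
PATTERN = re.compile(
    r"^(.{16})175\.19\.61\.11.*(?=SyslogWorker -)",
    re.MULTILINE,  # let ^ match at the start of every line
)


def strip_before_marker(text: str) -> str:
    """Delete from the IP at column 17 up to (not including) 'SyslogWorker -'."""
    return PATTERN.sub(r"\1", text)


sample = (
    "May 11 04:34:03 175.19.61.11 245.44.1.88 2019-05-11T04:33:57.556+0300 "
    "sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 4087 D MDnsDS : "
    "Going to poll with pollCount 1"
)

if __name__ == "__main__":
    print(strip_before_marker(sample))
```

Lines where 175.19.61.11 does not start at the 17th character are returned unchanged, which mirrors the position check in the Find expression.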
    

    Alternatively, if “until” means “up to and including”, change just the Find expression:

    • Find = (?-s)^.{16}\K175\.19\.61\.11.*SyslogWorker -
      • same as before, except SyslogWorker - will be part of the match, and thus deleted as well

    which gives:

    May 11 04:34:03  4008 4087 D MDnsDS : Going to poll with pollCount 1
    May 11 05:07:02  4008 4087 I EventLogService: Aggregate from 1557538437642 (log), 1557538437642 (data)
    May 11 05:07:02  4008 4088 D NtpTrustedTime: currentTimeMillis() cache hit
    May 11 05:07:02  4008 D ConnectivityService: setProvNotificationVisibleIntent null visible=false networkType=MOBILE
    

    If the extra space left behind by the second version bothers you, just add a space after the - in the Find regex.
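
The difference between the two “up to and including” variants (with and without the trailing space) can be checked with a short Python sketch. As before, this is a translation rather than the Notepad++ expression itself: \K is replaced by a capture group and \1, and the names INCLUDING and INCLUDING_SPACE are made up for the example.

```python
import re

# Variant that deletes "SyslogWorker -" as well (leaves a double space).
INCLUDING = re.compile(r"^(.{16})175\.19\.61\.11.*SyslogWorker -", re.MULTILINE)

# Same, with a space added after the hyphen so no extra space is left behind.
INCLUDING_SPACE = re.compile(r"^(.{16})175\.19\.61\.11.*SyslogWorker - ", re.MULTILINE)

line = (
    "May 11 05:07:02 175.19.61.11 245.44.1.88 2019-05-11T05:06:56.706+0300 "
    "sml5041-60FF83174204087-MJ01AR - 4008 SyslogWorker - 4008 D "
    "ConnectivityService: setProvNotificationVisibleIntent null "
    "visible=false networkType=MOBILE"
)

with_gap = INCLUDING.sub(r"\1", line)       # two spaces after the timestamp
no_gap = INCLUDING_SPACE.sub(r"\1", line)   # one space after the timestamp

if __name__ == "__main__":
    print(repr(with_gap))
    print(repr(no_gap))
```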

    If this doesn’t work for you, please see the advice below, which will help you to help us to help you.

    -----
    FYI: I often add this to my response in regex threads, unless I am sure the original poster has seen it before. Here is some helpful information for finding out more about regular expressions, and for formatting posts in this forum (especially quoting data) so that we can fully understand what you’re trying to ask:

    This forum is formatted using Markdown, with a help link buried on the little grey ? in the COMPOSE window/pane when writing your post. For more about how to use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end. It is very important that you use these formatting tips – using single backtick marks around small snippets, and using code-quoting for pasting multiple lines from your example data files – because otherwise, the forum will change normal quotes ("") to curly “smart” quotes (“”), will change hyphens to dashes, will sometimes hide asterisks (or if your text is c:\folder\*.txt, it will show up as c:\folder*.txt, missing the backslash). If you want to clearly communicate your text data to us, you need to properly format it.

    If you have further search-and-replace (“matching”, “marking”, “bookmarking”, regular expression, “regex”) needs, study this FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting, see the paragraph above.

    Please note that for all regex and related queries, it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.



  • Hello @PeterJones ,

    Thank you for your welcome and your prompt reply. Your solution worked perfectly.

    I wanted to remove all text starting from 175.19.61.11 up to and including SyslogWorker -, but only when 175.19.61.11 starts at the 17th character, not at the beginning of the line or anywhere else in the text. I didn’t know how to include the position within the line in my Find.

    What I used was 175.19.61.11.*?SyslogWorker, which did not check the character position on the line. With your Find it worked fine.
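
To illustrate why the position check matters, here is a small Python sketch comparing the unanchored attempt with an anchored one. The two input lines are invented for the demonstration (they are not from the real log), and the anchored pattern is a Python translation that uses a capture group in place of \K.

```python
import re

# The first attempt: no line anchor, and the dots are unescaped,
# so each "." matches any character rather than a literal period.
UNANCHORED = re.compile(r"175.19.61.11.*?SyslogWorker")

# Anchored translation: exactly 16 characters first, literal periods.
ANCHORED = re.compile(r"^(.{16})175\.19\.61\.11.*SyslogWorker -", re.MULTILINE)

# Hypothetical lines, made up for this example.
at_line_start = "175.19.61.11 relay SyslogWorker - 4008 D test"
lookalike = "May 11 04:34:03 175x19y61z11 SyslogWorker - 4008 D test"

# The unanchored pattern matches both lines; the anchored one matches neither.
unanchored_hits = [bool(UNANCHORED.search(s)) for s in (at_line_start, lookalike)]
anchored_hits = [bool(ANCHORED.search(s)) for s in (at_line_start, lookalike)]

if __name__ == "__main__":
    print(unanchored_hits, anchored_hits)
```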

    P.S. This is my first post on this forum, so I didn’t notice the use of Markdown. I will follow these tips in future posts :)

