How to find and delete multiple lines?



  • I have a text file with several multiple choice questions and solutions under separate headings. I want to delete lines containing the solutions without affecting the headings and questions.

    Before Find & Replace:

    1. Out of the total solar insolation that reaches the earth’s surface, most is used by plants for
      a)Respiration
      b)Photosynthesis
      c)Storage
      d)Movement of minerals and fluids
      Solution: A
      Justification: Only a very small fraction (0.1 per cent) is fixed in photosynthesis. More than half is used for
      plant respiration and the remaining part is temporarily stored or is shifted to other portions of the plant. This
      energy captured from Sun further reduces when we pass to higher trophic levels.

    ATMOSPHERIC CIRCULATION AND WEATHER SYSTEMS

    1. Trade winds blowing counter-clockwise in the Northern Hemisphere and blowing clockwise in the Southern Hemisphere is mainly due to
      a) Presence of more number of cyclones in the Northern Hemisphere
      b) Pronounced land effect or continentality in the Northern Hemisphere
      c) Rotation of the earth on its own axis
      d) Thickness of the atmosphere over the equator that produces a reverse jet stream in both the hemispheres
      Solution: C
      Justification: The opposite direction of circulation is due to the Coriolis Effect. The surface air that flows from
      these subtropical high-pressure belts toward the Equator is deflected toward the west in both hemispheres
      by the Coriolis Effect. These winds blow predominantly from the northeast in the Northern Hemisphere and
      from the southeast in the Southern Hemisphere.
      Option A: Tropical cyclones typically form over large bodies of relatively warm water, so they are equally
      predominant in both the hemispheres wherever there are conducive conditions.

    2. Horse latitudes are characterized by

    1. Calm winds
    2. Cloudy skies throughout the year
    3. Precipitation equivalent to the equatorial regions
      Select the correct answer using the codes below.
      a) 1 and 2 only
      b) 2 and 3 only
      c) 1 only
      d) 1 and 3 only
      Solution: C
      Justification: The horse latitudes are located at about 30 degrees north and south of the equator. It is
      common in this region of the subtropics for winds to diverge and either flow toward the poles (known as the
      prevailing westerlies) or toward the equator (known as the trade winds). These diverging winds are the result
      of an area of high pressure, which is characterized by calm winds, sunny skies, and little or no precipitation.
      According to legend, the term comes from ships sailing to the New World that would often become stalled for
      days or even weeks when they encountered areas of high pressure and calm winds. Many of these ships
      carried horses to the Americas as part of their cargo. Unable to sail and resupply due to lack of wind, crews
      often ran out of drinking water.

    Desired Result After Find & Replace:

    1. Out of the total solar insolation that reaches the earth’s surface, most is used by plants for
      a)Respiration
      b)Photosynthesis
      c)Storage
      d)Movement of minerals and fluids

    ATMOSPHERIC CIRCULATION AND WEATHER SYSTEMS

    1. Trade winds blowing counter-clockwise in the Northern Hemisphere and blowing clockwise in the Southern Hemisphere is mainly due to
      a) Presence of more number of cyclones in the Northern Hemisphere
      b) Pronounced land effect or continentality in the Northern Hemisphere
      c) Rotation of the earth on its own axis
      d) Thickness of the atmosphere over the equator that produces a reverse jet stream in both the hemispheres

    2. Horse latitudes are characterized by

    1. Calm winds
    2. Cloudy skies throughout the year
    3. Precipitation equivalent to the equatorial regions
      Select the correct answer using the codes below.
      a) 1 and 2 only
      b) 2 and 3 only
      c) 1 only
      d) 1 and 3 only

    How can I achieve this? Total noob here.



  • @lagey-raho :

    Your data is probably mangled by this web site because you didn’t wrap it in a code block so that it would be unmangled.
    Thus it is hard to tell what you have for sure, but this is a reasonable guess for a transform.

    Search for Solution: (?-s).+\RJustification:(?s).*?(\R)\R
    Replace with ${1}
    Search mode: Regular expression
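
    For anyone who wants to try this search outside Notepad++, here is a rough Python sketch of the same idea. Python’s re module has no \R, so the NL fragment below is a simplified stand-in (an assumption) that only covers \r\n, \n and \r. The function name and the sample strings are made up for illustration.

```python
import re

# Simplified stand-in for \R (assumption: only \r\n, \n and \r occur).
NL = r"(?:\r\n|\n|\r)"

# "Solution: X", a line break, "Justification:" at column 0, then the shortest
# run of text up to an empty line. Group 1 keeps one line break.
PATTERN = re.compile(
    r"Solution: [^\r\n]+" + NL + r"Justification:[\s\S]*?(" + NL + r")" + NL
)


def strip_solution_blocks(text):
    """Replace each matched block with its captured line break (like ${1})."""
    return PATTERN.sub(lambda m: m.group(1), text)


# Hypothetical samples: one flush-left, one with leading spaces.
flush_left = "d) Movement\nSolution: A\nJustification: one\ntwo\n\nHEADING\n"
indented = "  d) Movement\n  Solution: A\n  Justification: one\n\nHEADING\n"

print(repr(strip_solution_blocks(flush_left)))
print(strip_solution_blocks(indented) == indented)
```

    The second sample is left untouched, because this pattern expects Justification: to start in the first column.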



  • Hello, @lagey-raho, @alan-kilborn and All,

    Alan, I think we should care about possible leading spaces, in any line !

    So, @lagey-raho, the following regex S/R should work in any case :

    • Open the Replace dialog ( Ctrl + H )

    • SEARCH (?-is)^\h*Solution:.+\R\h*Justification:(?:.+\R?)+

    • REPLACE Leave EMPTY

    • Tick the Wrap around option

    • Select the Regular expression search mode

    • Click on the Replace All button


    Note that I assume that an empty line always separates two blocks !

    Best Regards,

    guy038



  • @guy038

    Yep, but we’ve probably both done too much assuming.
    We’ll let the OP tell us.



  • @guy038 said in How to find and delete multiple lines?:

    (?-s)^\h*Solution:.+\R\h*Justification:(?:.+\R?)+

    @guy038 hello. I don’t understand what this part of your code does: (?:.+\R?)+



  • Hi, @lagey-raho, @alan-kilborn, @robin-cruise and All,

    @Robin-cruise :

    When comparing the before and after text of @lagey-raho’s post, it seems that he wants to delete all lines from the line containing the string Solution: <Letter> to the next empty line !

    And, when pasting in a new tab, the Justification part seems to be a bunch of lines and NOT a single line ! So, using the free-spacing mode for readability, this bunch of lines could be initially described with the regex :

    (?x-is) ^ \h* (?# LEADING spaces)  Justification:  .+ \R (?# REST of CURRENT line)  ( .+ \R )* (?# POSSIBLE other lines as * means {0,x} QUANTIFIER)
    

    Of course, the (?# ••••••••••) are just in-line comments !

    But we must also consider the case where the last line ends the file without any line-break and, as the line containing the word Justification may be the only line of its block, we need to modify the regex as below :

    (?x-is) ^ \h* (?# LEADING spaces)  Justification:  .+ \R? (?# REST of CURRENT line with OPTIONAL line-break)  ( .+ \R? )* (?# POSSIBLE other lines, with OPTIONAL line-break, because * means {0,x} QUANTIFIER)
    

    Note that the ? quantifier, meaning {0,1}, is a greedy quantifier. So the regex will always match the line-break of any line if present !


    Now, you may have noticed, in this regex, the consecutive parts .+ \R? and ( .+ \R? )* which can be simplified as ( .+ \R? )+. And, as we do not need the value of group 1, we’ll use a non-capturing group. Thus, the regex becomes :

    (?x-is) ^ \h* (?# LEADING spaces)  Justification:  (?: .+ \R? )+ (?# END of CURRENT line + POSSIBLE other lines, with OPTIONAL line-break for the LAST line)
    

    Note also that, because of the + quantifier in .+, meaning {1,x}, inside the non-capturing group, each repetition needs at least one character. So the regex stops matching as soon as it meets a true empty line !


    Now, the complete regex is, then :

    (?x-is)
    ^               #  BEGINNING of line
    \h*             #  POSSIBLE LEADING spaces
    Solution:       #  with this CASE
    .+\R            #  REST of CURRENT line
    \h*             #  POSSIBLE LEADING spaces
    Justification:  #  with this CASE
    (?:             #  BEGINNING of NON-CAPTURING group
       .+ \R?       #    A NON-EMPTY line with OPTIONAL line-break
    )+              #  REPEATED from 1 to x. So standing for the END of CURRENT line + POSSIBLE other lines, with OPTIONAL line-break in LAST line
    

    And, without the free-spacing mode, this search regex condenses to the version, below, that I gave in my previous post :

    SEARCH (?-is)^\h*Solution:.+\R\h*Justification:(?:.+\R?)+
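
    As a cross-check outside Notepad++, the same search can be approximated in Python. This is only a sketch under assumptions: Python’s re module lacks \h and \R, so [ \t] and an explicit line-break alternation stand in for them, and [^\r\n] replaces the dot because Python’s dot would also match \r. The names and the sample text are hypothetical.

```python
import re

# Assumed stand-ins: [ \t] for \h, NL for \R, [^\r\n] for "." in (?-s) mode.
NL = r"(?:\r\n|\n|\r)"
SOLUTION_BLOCK = re.compile(
    r"^[ \t]*Solution:[^\r\n]+" + NL
    + r"[ \t]*Justification:(?:[^\r\n]+" + NL + r"?)+",
    re.MULTILINE,
)


def remove_blocks(text):
    """Delete every Solution/Justification block (empty replacement)."""
    return SOLUTION_BLOCK.sub("", text)


# Hypothetical sample: an indented block that ends at an empty line.
sample = (
    "    d) Movement\n"
    "    Solution: A\n"
    "    Justification: first\n"
    "    second line\n"
    "\n"
    "    HEADING\n"
)
# Hypothetical sample: a block that ends the text without a final line break.
at_end = "d) x\nSolution: B\nJustification: only line"

print(repr(remove_blocks(sample)))
print(repr(remove_blocks(at_end)))
```

    The empty line survives because each repetition of the group needs at least one character, and the block at the end of the text is still removed thanks to the optional line break.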

    BR

    guy038



  • @guy038 @Alan-Kilborn

    Hey there, thank you for responding. Sorry couldn’t get back early. Had classes.

    I tried Alan’s solution. Didn’t work. Apologies I didn’t wrap it in a code block. Will do so below for any further inputs.

    Guy, your solution worked perfectly well so long as there was an empty line between the blocks. File is a garbled mess.

    So then, here are a few things that appear with consistency in the file.

    1. Justifications do not necessarily follow for all questions. But Solutions do.
    2. Questions always begin with a number followed by period.
    3. All Headings are in ALL-CAPS.

    Would it be possible to delete everything from Solution onwards, up to the next line that either contains ALL-CAPS letters only or begins with a number (1., 2., 3.)?

    Here’s code block version of a different section of the file:

    CLIMATE
    68. All changes in the weather are ultimately caused by the
    a) Rotation of Earth
    b) Energy of the Sun
    c) Hydrological cycle on earth
    d) Primordial heat inside the earth
    Solution: B
    
    Justification: Option A Rotation of earth does cause wind movements; change of day and night; distribution
    of heat on earth etc. However, it does not explain several other phenomena such as seasons on earth;
    extreme heat and cold in Poles etc.
    Option B Sun’s energy causes all these, and is ultimately responsible for life and activity on earth, which also
    subsumes weather phenomena.
    Option C Earth’s hydrological cycle is only partly responsible for the weather. For e.g. rainfall pattern,
    movements of ocean water etc.
    Option D Primordial heat inside the earth manifests itself in form of moving molten magma inside the earth.
    This however is responsible for tectonic and geo-morphological processes, not weather phenomenon.
    Learning: sun is the primary source of energy that causes changes in the weather. Energy absorbed and
    reflected by the earth’s surface, oceans and the atmosphere play important roles in determining the weather
    at any place. Heat from Sun causes changes in temperature; pressure; evaporation; biological activity etc. All
    these in turn determine the weather at a place.
    COMPOSITION AND STRUCTURE OF ATMOSPHERE
    69. Earth’s atmosphere consists of several layers. Consider the following statements with reference to it.
    Assertion (A): Troposphere is the hottest layer of the atmosphere.
    Reason (R): It gets more heat radiation from below, earth’s surface, as compared to other atmospheric layers.
    In the context of the above, which of these is correct?
    a)A is correct, and R is an appropriate explanation of A.
    b)A is correct, but R is not an appropriate explanation of A.
    c)A is correct, but R is incorrect.
    d)Both A and R are incorrect.
    
    
    Solution: A
    Justification: The troposphere is the lowest layer of Earth's atmosphere. The troposphere is heated from
    below. Sunlight warms the ground or ocean, which in turn radiates the heat into the air right above it. This
    warm air tends to rise. That keeps the air in the troposphere "stirred up". Air is warmest at the bottom of the
    troposphere near ground level. Higher up it gets colder. Nearly all of the water vapour and dust particles in
    the atmosphere are in the troposphere. That is why most clouds are found in this lowest layer, too. The
    thickness of the troposphere varies around the planet.
    70. Troposphere is thickest at
    a) Poles
    b) Equator
    c) Sub-tropics
    d) Temperate regions
    Solution: B
    Justification: The troposphere is thicker at the equator than at the poles because the equator is warmer. The
    convection currents of air expand the thickness of the troposphere (atmosphere) at poles. Thus the simple
    reason is thermal expansion of the atmosphere at the equator and thermal contraction near the poles. Also,
    the rotation of the earth causes centrifugal force which is strongest near the equator and pushes the
    atmosphere to greater heights. The thickness of the troposphere also varies with season. The troposphere is
    thicker in the summer and thinner in the winter all around the planet. At the poles in winter, the atmosphere
    is uniformly very cold and the troposphere cannot be distinguished from other layers.
    

    Regards.



  • @lagey-raho said in How to find and delete multiple lines?:

    I tried Alan’s solution. Didn’t work.

    Justifications do not necessarily follow for all questions.

    The original data did not support that statement, which is why my suggestion didn’t work.

    But, @guy038’s suggestion also demands Justification be present, so I’m not sure…



  • Hello, @lagey-raho, @alan-kilborn, @robin-cruise and All,

    Finally, with your raw text, in reverse video ( thanks for this input ), we now know that none of your lines contain any leading blank characters !

    So, basically, from your last post, you want to delete any range of lines :

    • Beginning with the line Solution:•••••, with this exact case

    AND

    • Ending right before a line containing upper-case letters and space characters, ONLY

    OR

    • Ending right before a line beginning with a number, immediately followed with a dot char

    OR

    • Ending at the very end of current file !
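
    Read as a procedure, these rules amount to a small line-by-line filter. The Python sketch below is only an illustration of that reading, not a Notepad++ feature: the function name and sample lines are invented, and it assumes lines without leading blanks and with their line breaks already stripped.

```python
import re

# Assumption: a heading is upper-case words (last one 2+ letters) and spaces only.
HEADING = re.compile(r"(?:[A-Z]+ +)*[A-Z]{2,}")
# Assumption: a question starts with digits followed by a dot, e.g. "69."
QUESTION = re.compile(r"\d+\.")


def drop_solution_blocks(lines):
    """Return the lines that lie outside any Solution block."""
    kept = []
    skipping = False
    for line in lines:
        if line.startswith("Solution:"):
            skipping = True       # the block begins with this line
        elif skipping and (HEADING.fullmatch(line) or QUESTION.match(line)):
            skipping = False      # the block ends right before this line
        if not skipping:
            kept.append(line)
    return kept                   # the end of the input closes an open block


# Hypothetical sample lines.
lines = [
    "CLIMATE", "68. Q", "a) x", "Solution: B", "", "Justification: foo",
    "MORE TEXT HERE", "69. Q2", "Solution: A", "70. Q3", "a) y",
    "Solution: B", "Justification: end",
]
print(drop_solution_blocks(lines))
```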

    I assume that it’s better to replace all these deleted blocks with a single empty line to get some kind of separation !

    If so, this following regex S/R, expressed with the free-spacing mode (?x), should work :

    SEARCH  :   (?xs-i) ^  \h*  Solution:  .+?  (?=  ^  (?:  \u+  \x20+  )*  \u{2,}  $  |  ^  \d+  \.  |  \z  )
    
    REPLACE :   \r\n  ( or '\n' only, if your file is a Unix one )
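
    For testing on a copy of the data outside Notepad++, a rough Python rendering of this S/R might look like the sketch below. Several things are assumptions: [A-Z] stands in for \u, [ \t] for \h, \Z for \z, and the text is taken to use \n line endings only (as Python’s default text-mode reading produces). The function name and the shortened sample are hypothetical.

```python
import re

SOLUTION_TO_NEXT_ITEM = re.compile(
    r"""
    ^ [ \t]* Solution:                   # start of a block, case-sensitive
    .+?                                  # shortest run, line breaks included
    (?=
        ^ (?: [A-Z]+ [ ]+ )* [A-Z]{2,} $ # an ALL-CAPS heading line
      | ^ \d+ \.                         # or a question number such as "69."
      | \Z                               # or the very end of the text
    )
    """,
    re.VERBOSE | re.DOTALL | re.MULTILINE,
)


def remove_solution_blocks(text):
    """Replace each block with one empty line (assumes \\n line endings)."""
    return SOLUTION_TO_NEXT_ITEM.sub("\n", text)


# Shortened, hypothetical version of the sample posted above.
sample = (
    "CLIMATE\n"
    "68. All changes in the weather are ultimately caused by the\n"
    "a) Rotation of Earth\n"
    "b) Energy of the Sun\n"
    "Solution: B\n"
    "\n"
    "Justification: Option A Rotation of earth does cause wind movements.\n"
    "Option B Sun's energy causes all these.\n"
    "COMPOSITION AND STRUCTURE OF ATMOSPHERE\n"
    "69. Earth's atmosphere consists of several layers.\n"
    "a) A is correct\n"
    "Solution: A\n"
    "Justification: The troposphere is the lowest layer.\n"
    "70. Troposphere is thickest at\n"
    "a) Poles\n"
    "b) Equator\n"
    "Solution: B\n"
    "Justification: The troposphere is thicker at the equator.\n"
)

print(remove_solution_blocks(sample))
```

    A justification line that itself begins with a number and a dot, or consists only of upper-case words, would end the match early, so it is worth checking the result on a copy first.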
    

    Best Regards,

    guy038

