Hello, I need help with a regex: between a start word + line break + end word, remove the CRLF but not any character or line.



  • Hi,
    I have a text file that contains several paragraphs.
    Example:
    Paragraph 1 CRLF
    Line 1iiiiiiii CRLF
    Line 2nknv, jhsdvfu kjbfi;ii CRLF
    CRLF
    Paragraph 2 CRLF
    Line 1hfvuh’fpv CRLF
    CRLF
    Linwrfyvyg CRLF…

    Now I need a Find and Replace that produces this output:
    Paragraph 1 CRLF
    Line 1iiiiiiii
    Line 2nknv, jhsdvfu kjbfi;ii
    CRLF
    and the same for Paragraph 2.

    Find: ^paragraph1+linebreaks+paragraph2$. Replace only the line breaks between the lines, keeping the start and the end, without deleting any characters.

    Thanks in advance for your help, friends.



  • @Ohm-Dios

    Sorry, your description is rather confusing.

    Could you repost your before and after data using the formatting tools / Markdown, as described below?

    If you take this advice to heart, you’ll get better answers.

    ----

    Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as plain text using the </> toolbar button or manual Markdown syntax. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get… Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.



  • OK,
    Please find the attached image (ohm dios.png).



  • @PeterJones
    Hi,
    I have updated my query. Please have a look and do the needful. Thanks.



  • @Ohm-Dios ,

    Even with those screenshots, the only thing I can tell about the CRLF that you want to keep are the ones before the word Paragraph and at the end of the line containing the word Paragraph, but I am doubtful that I have correctly guessed your intended meaning (in that, I doubt the word Paragraph is actually anywhere in your text). Other than that, I have no way of determining which CRLF you want to keep and which CRLF you want to remove. What is the rule that decides whether a line should be joined or not?

    If you don’t have a hard-and-fast rule, then regex won’t be able to help you.

    Either way, I might suggest just highlighting the lines that you want to join manually, and use Ctrl+J (Edit > Line Operations > Join) to join the lines.



  • @PeterJones
    Yes sir, you got my point: Ctrl+J does what I want, but I don't want to go through each and every paragraph to join it manually. I need a regex that does it all at once, based on a given start word and end word. It has to select all the lines between the two strings and join them, keeping only the final CRLF.
    To answer your question, which CRLF to keep --> Paragraph 1 CRLF (all lines wrapped) CRLF … next paragraph
    To remove --> inside [ Startword.CRLF (all lines).CRLF.endword ]
    I hope that is clearer now.
    Thanks.



  • @Ohm-Dios ,

    Sorry, your “clarification” was no help, especially since you didn’t follow my advice.

    The best I can do at this point is guess. I’ll give you that one free guess.

    I would accomplish my guess as a three-step process, with Search Mode = Regular expression for all steps:

    1. FIND = (?-s)(Paragraph.*?$)
      REPLACE = ¶${1}¶
      Replace All
    2. FIND = \R
      REPLACE = empty/nothing
      Replace All
    3. FIND = ¶
      REPLACE = \r\n
      Replace All
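
    As a rough illustration of the three steps outside Notepad++, here is a minimal Python sketch. It assumes that step 3 searches for the ¶ marker inserted in step 1, that ¶ never occurs in the real text, and that headings contain the literal word Paragraph. The function name `three_step_join` is hypothetical.

```python
import re

MARK = "\u00b6"  # the pilcrow character; assumed not to occur in the real text


def three_step_join(text: str) -> str:
    # Step 1: wrap everything from "Paragraph" to the end of its line in markers.
    text = re.sub(r"(Paragraph[^\r\n]*)", MARK + r"\1" + MARK, text)
    # Step 2: remove every line break (CRLF, lone CR, or lone LF).
    text = re.sub(r"\r\n|\r|\n", "", text)
    # Step 3: turn each marker back into a CRLF.
    return text.replace(MARK, "\r\n")
```

    Like the search/replace it mirrors, this joins the body lines with no separator and leaves a line break in front of the first heading.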

    My guess is probably wrong, but I cannot do any more for you.

    If you come back and clarify, actually following my advice for how to format text using the </> button, and giving better examples, someone else might be able to understand you better.



  • @PeterJones

    Thanks sir, it worked. So kind of you. This is what I need, but with some modifications. Suppose my page has multiple paragraph titles, and I want to convert only the "paragraph 1" sections. That title is repeated many times in the page, along with other paragraph titles. So the search needs to select everything from paragraph 1 up to the next paragraph title, and do so everywhere paragraph 1 is repeated on that page, all at once. Thanks.



  • @Ohm-Dios
    Update: what I need in Find is a from-to string selection: paragraph 1 ---- Paragraph 2 (whatever is in between should be selected and formatted).



  • @Ohm-Dios ,

    Your examples are still exceedingly unclear. If you want good help, you have to present your data in a way that can be understood. You have not taken any of the advice given to you. When you refuse to take our advice on formatting the data so that it’s understandable, and giving before and after data, you are saying to us that you don’t want to make it easy for us to help you; and with an attitude like that, you’re not likely to get much help.

    For example, the following says what I think you are asking for, but it’s really hard to tell:

    Start with data:

    This 
    is 
    text
    PARAGRAPH MATCHING
    this 
    section 
    should 
    be 
    joined
    PARAGRAPH NO MATCH
    this
    section
    should
    remain
    PARAGRAPH MATCHING
    this
    section
    also
    joined
    PARAGRAPH OTHER
    blah
    

    Then, after the transformation, I think you want:

    This 
    is 
    text
    PARAGRAPH MATCHING
    this section should be joined
    PARAGRAPH NO MATCH
    this
    section
    should
    remain
    PARAGRAPH MATCHING
    this section also joined
    PARAGRAPH OTHER
    blah
    

    (Notice how I highlighted that example text, then clicked the </> button, so the forum marks it as text. This makes your example data obvious, and it makes sure we know the forum didn’t convert your data without you noticing.)

    Unfortunately, that’s a complicated setup – trying to replace newlines only between certain marker pairs, but leave them alone elsewhere – and I am not sure how to accomplish that. It’s also made more complicated because you want to leave the newlines after the START condition (PARAGRAPH MATCHING, in my example).

    If this is what you want, let us know. If this is not what you want, then show a full before-and-after example like I did. If the preview window doesn’t show it as black, then you haven’t clicked the </> button correctly. If you refuse to take this advice, you are likely to not get any more replies.

    Once we know from you what you want the transformation to be, maybe @Terry-R or @guy038 or another regex guru can chime in with a working solution, but I’ve given as much help as I can, at least for now.
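
    To pin down the rule that the before/after example above implies, here is a minimal Python sketch that works line by line rather than with a Notepad++ regex. The function name `join_sections` and its parameters are hypothetical, and it assumes every section heading starts with the word PARAGRAPH and that the target sections contain no blank lines.

```python
def join_sections(text: str,
                  target: str = "PARAGRAPH MATCHING",
                  marker: str = "PARAGRAPH") -> str:
    out = []
    buffer = None  # None means "not inside a target section"
    for line in text.splitlines():
        if line.startswith(marker):
            if buffer:  # flush the joined body of the section that just ended
                out.append(" ".join(buffer))
            out.append(line)  # the heading keeps its own line
            buffer = [] if line.strip() == target else None
        elif buffer is not None:
            buffer.append(line.strip())  # collect body lines of a target section
        else:
            out.append(line)  # everything else is left alone
    if buffer:  # the file may end inside a target section
        out.append(" ".join(buffer))
    # Line endings are normalised to "\n" in this sketch.
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")
```

    Lines outside the target sections are copied through untouched, which is the part that makes a single search/replace awkward.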



  • Hello, @ohm-dios, @peterjones and All,

    Based on @peterjones’s interpretation, here is my method. I assume that :

    • All your sections begin with the string PARAGRAPH, whatever its case, followed by an identifier ( for instance Paragraph 1, PARAGRAPH A, paragraph abcd_123 )

    • You want to join all the lines of a specific paragraph, which is, generally, repeated in different locations of your file


    So, starting with this sample text, where some trailing blank characters have been added to the first two paragraphs, for testing :

    This
    is
    text
    PARAGRAPH A
    this      
    section			
    should
    be
    joined  		   
    PARAGRAPH B
    this
    section    
    should
    remain		
    PARAGRAPH A
    this
    section
    should
    be
    joined
    PARAGRAPH A
    this section should be joined
    PARAGRAPH C
    this
    section
    should
    remain
    PARAGRAPH B
    this
    section
    should
    remain
    PARAGRAPH D
    this section should
    remain
    as is
    PARAGRAPH A
    this
    section 
    should
    be
    joined
    PARAGRAPH A
    this
    section 
    should
    be
    joined
    PARAGRAPH OTHER
    blah
    bla blah
    

    Let’s suppose that the specific paragraph searched is named PARAGRAPH A. Then :

    • Open your file in N++

    • Insert an empty line at its very beginning

    • Open the Replace dialog ( Ctrl + H )

    • SEARCH (?-si)(PARAGRAPH A\R|\G).+?\K\h*\R(?!PARAGRAPH)|\h+$

    • REPLACE ?1\x20

    • Tick the Wrap around option

    • Select the Regular expression search mode

    • Click on the Replace All button, exclusively ( Do not use the Replace button ! )

    You should get the expected text :

    This
    is
    text
    PARAGRAPH A
    this section should be joined
    PARAGRAPH B
    this
    section
    should
    remain
    PARAGRAPH A
    this section should be joined
    PARAGRAPH A
    this section should be joined
    PARAGRAPH C
    this
    section
    should
    remain
    PARAGRAPH B
    this
    section
    should
    remain
    PARAGRAPH D
    this section should
    remain
    as is
    PARAGRAPH A
    this section should be joined
    PARAGRAPH A
    this section should be joined
    PARAGRAPH OTHER
    blah
    bla blah
    

    IMPORTANT: Before running the regex S/R, you must move the cursor to an empty line, to avoid some side-effects caused by the \G assertion !

    Notes :

    • For any line after the PARAGRAPH A line that is not followed by another PARAGRAPH line, the regex replaces any trailing blank characters and the following line-break with a single space character

    • It also deletes any trailing run of blank characters on every other line of the file
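
    For readers who want to reproduce the effect described in these notes in a script, here is a hedged Python sketch. Python's standard `re` module does not support \K or \G, so this is not a port of the regex above: it swaps in a different technique, matching each whole target section and joining its lines in a callback. The function name `join_paragraph` is hypothetical, headings are matched case-sensitively, and line endings are normalised to "\n".

```python
import re


def join_paragraph(text: str, name: str = "PARAGRAPH A") -> str:
    text = text.replace("\r\n", "\n")
    # Same side effect as the second alternative: drop trailing blanks everywhere.
    text = re.sub(r"[ \t]+$", "", text, flags=re.M)
    # A section is the heading line plus everything up to the next PARAGRAPH line
    # (or the end of the text).
    section = re.compile(
        rf"^({re.escape(name)}\n)(.*?)(?=^PARAGRAPH|\Z)", re.M | re.S
    )

    def _join(match: re.Match) -> str:
        body = match.group(2)
        joined = " ".join(body.splitlines())
        # Keep the line break that precedes the next heading.
        return match.group(1) + joined + ("\n" if body.endswith("\n") else "")

    return section.sub(_join, text)
```

    As with the regex, blank lines inside a target section are not handled specially here.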

    Best Regards,

    guy038



  • @guy038 Sir, awesome! That is 100% exactly what I expected. Thanks a lot. Need a little more help: see Example 2. It has more than one line break between lines and after the paragraph title. Thanks.

    EXAMPLE-1
    PARAGRAPH A
    this
    section 
    should
    be
    joined
    PARAGRAPH
    bla blah
    EXAMPLE-2
    PARAGRAPH A
    
    
    this
    section 
    should
    be
    
    
    
    joined
    
    
    
    PARAGRAPH OTHER
    blah
    bla blah
    
    
    Example 1 works 100%.
    But for Example 2 I need help.


  • @PeterJones Hi sir,
    Sorry, I am not ignoring your advice. I was just unable to express my requirement in detail. Sorry for that, and thanks for your recommendations. It is now resolved.



  • @Ohm-Dios said:

    Need little more help

    I don’t know that this is “help”.
    I think this is “doing it for you”.
    This is not considered good behavior here.



  • @Alan-Kilborn Hi sir, sorry for my English. I don't understand what I said wrong. I just conveyed my heartfelt thanks and asked a few more questions. My intention is to resolve my issue with the help of the Notepad++ community. Please correct me if I am wrong, and tell me what I wrote that may not be considered good behavior. Actually, I felt very happy after seeing that code, and it works great. Please accept my thanks to the Notepad++ community.



  • Hi, @ohm-dios, @peterjones, @alan-kilborn and All,

    Ah… OK. So, if the searched PARAGRAPH A contains true empty lines or blank lines, you would like these lines to be deleted, as well !

    Then, assuming this initial sample text, where I inserted some empty and blank lines, in the first PARAGRAPH A and PARAGRAPH B, for tests :

    This
    is
    text
    PARAGRAPH A
    this      
    
    
    			
    
    
    
    section			
                
    
    
    
    
    should
    be
    joined  		   
    PARAGRAPH B
    this
    section    
    			
         
    
    should
    
    
    
    remain		
    PARAGRAPH A
    this
    section
    should
    be
    joined
    PARAGRAPH A
    this section should be joined
    PARAGRAPH C
    this
    section
    should
    remain
    PARAGRAPH B
    this
    section
    should
    remain
    PARAGRAPH D
    this section should
    remain
    as is
    PARAGRAPH A
    this
    section 
    should
    be
    joined
    PARAGRAPH A
    this
    section 
    should
    be
    joined
    PARAGRAPH OTHER
    blah
    bla blah
    

    We’ll use a first regex S/R, which deletes any empty or blank line, in the PARAGRAPH A section, only :

    • First, add a dummy NON-empty line at the very beginning of your file

    • SEARCH (?is)(PARAGRAPH A\R|\G)((?!PARAGRAPH).)*?\K(^\h*\R)

    • REPLACE Leave EMPTY

    • Tick the Wrap around option

    • Select the Regular expression search mode

    • Click on the Replace All button, exclusively ( Due to the \K syntax, do not use the Replace button ! )

    You should get the output :

    This
    is
    text
    PARAGRAPH A
    this      
    section			
    should
    be
    joined  		   
    PARAGRAPH B
    this
    section    
    			
         
    
    should
    
    
    
    remain		
    PARAGRAPH A
    this
    section
    should
    be
    joined
    PARAGRAPH A
    this section should be joined
    PARAGRAPH C
    this
    section
    should
    remain
    PARAGRAPH B
    this
    section
    should
    remain
    PARAGRAPH D
    this section should
    remain
    as is
    PARAGRAPH A
    this
    section 
    should
    be
    joined
    PARAGRAPH A
    this
    section 
    should
    be
    joined
    PARAGRAPH OTHER
    blah
    bla blah
    

    Now, let’s run this second regex, which is simply the regex S/R provided in my previous post :

    • Insert an empty line at the very beginning of your file

    • Open the Replace dialog ( Ctrl + H )

    • SEARCH (?i-s)(PARAGRAPH A\R|\G).+?\K\h*\R(?!PARAGRAPH)|\h+$

    • REPLACE ?1\x20

    • Tick the Wrap around option

    • Select the Regular expression search mode

    • Click on the Replace All button, exclusively ( Due to the \K syntax, do not use the Replace button ! )

    Here we are ! This time, everything should be OK :

    This
    is
    text
    PARAGRAPH A
    this section should be joined
    PARAGRAPH B
    this
    section
    
    
    
    should
    
    
    
    remain
    PARAGRAPH A
    this section should be joined
    PARAGRAPH A
    this section should be joined
    PARAGRAPH C
    this
    section
    should
    remain
    PARAGRAPH B
    this
    section
    should
    remain
    PARAGRAPH D
    this section should
    remain
    as is
    PARAGRAPH A
    this section should be joined
    PARAGRAPH A
    this section should be joined
    PARAGRAPH OTHER
    blah
    bla blah
    

    Note that :

    • All the pure empty or blank lines of the first PARAGRAPH A section have been deleted

    • All the pure empty lines of the first PARAGRAPH B section remain unchanged, as expected, and the possible blank lines are simply changed into true empty lines !
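
    The first pass (deleting empty or blank lines inside the searched section only) can also be sketched in Python for anyone scripting this outside N++. This is a line-based approximation under stated assumptions, not a translation of the regex: the function name `drop_blank_lines_in_section` is hypothetical, and headings are compared case-insensitively.

```python
def drop_blank_lines_in_section(text: str,
                                target: str = "PARAGRAPH A",
                                marker: str = "PARAGRAPH") -> str:
    out = []
    in_target = False
    # keepends=True preserves the original line endings and trailing blanks.
    for line in text.splitlines(keepends=True):
        if line.upper().startswith(marker.upper()):
            # Every heading switches the flag on or off.
            in_target = line.strip().upper() == target.upper()
        elif in_target and not line.strip():
            continue  # empty or blank line inside the target section: drop it
        out.append(line)
    return "".join(out)
```

    Blank lines in other sections pass through unchanged, and trailing blanks on kept lines are left for the second pass.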

    Cheers,

    guy038



  • @guy038 Dear Sir,
    Really great! Thanks a lot, it works 100%. You saved me a lot of time. Notepad++ is great. You’re the one who truly understood my requirements. Please accept my love and thanks. I am forever grateful for your support!

