Search line without ending tag

  • A
    Alex Mesch
    last edited by Feb 17, 2020, 3:23 PM

    Hello.

    I’m very bad at regex. Can someone tell me how I can find lines with the following problem?
    I have an XML document with a lot of info for a tax system. Some files were generated with errors: the closing tag went to the next line, and I need to find these cases.
    One more complication: there can be spaces or tabs before the opening tag.
    I need to find cases like <TypePost>Main, where the closing tag goes to the next line:

          <Post>Manager</Post>
          <subdivision><Marketing</subdivision>
          <TypePost>Main
          </TypePost>
    
    • E
      Ekopalypse @Alex Mesch
      last edited by Ekopalypse Feb 17, 2020, 3:38 PM Feb 17, 2020, 3:37 PM

      @Alex-Mesch

      \R matches any line ending, and \h matches horizontal whitespace, i.e. spaces or tabs.
      With this in mind, you might consider
      find what: <TypePost>Main\R\h+
      replace with: <TypePost>Main
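For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch. It is an approximation, not the Notepad++ engine: Python's re module does not support \R or \h, so the names LINE_BREAK and HSPACE below are stand-ins introduced only for this example, and the sample text is adapted from the first post.

```python
import re

# Python's re module does not know \R or \h, so both are spelled out.
LINE_BREAK = r"(?:\r\n|\n|\r)"  # rough stand-in for \R
HSPACE = r"[ \t]"               # rough stand-in for \h (space or tab only)

# Same shape as the Notepad++ search: <TypePost>Main\R\h+
pattern = re.compile("<TypePost>Main" + LINE_BREAK + HSPACE + "+")

sample = (
    "      <Post>Manager</Post>\n"
    "      <TypePost>Main\n"
    "      </TypePost>\n"
)

# Replacing the match with the bare opening tag pulls </TypePost> up a line.
fixed = pattern.sub("<TypePost>Main", sample)
print(fixed)
```

Like the original search, this needs at least one space or tab before the closing tag because of the + quantifier.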

      • A
        Alex Mesch @Ekopalypse
        last edited by Feb 17, 2020, 3:45 PM

        @Ekopalypse TY, it works.
        But I’m sorry, I forgot one thing.
        What if there can be any tag and any text?
        Something like this, where I need to find the UsersC and UUID lines, and lines with other tag names:

        <UsersC>21
        </UsersC>
        <UUID>be9a1528-9a/6/0/-4917-8857-12896a7693de</UUID>
        <Date>2020-01-20</Date>
        <UUID>7f8e38ab-ceba-45c5-ab34-834b61bad840
        </UUID>
        
        • E
          Ekopalypse @Alex Mesch
          last edited by Feb 17, 2020, 4:02 PM

          @Alex-Mesch

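          if your data is consistent, then something like this
          find what: <(\w+>)(.*)\R\h*(</\1)
          replace with: <\1\2\3
          might do it. The literal < in the replacement restores the opening bracket, which the match consumes outside group 1.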

          So we are looking for

          • a tag <(\w+>) (a less-than sign, followed by one or more word characters, followed by a greater-than sign)
          • followed by any text (.*)
          • followed by an end-of-line sequence \R
          • followed by optional horizontal whitespace \h*
          • followed by the start of a closing tag </ and then what was captured from the opening tag \1 -> (</\1)
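The steps listed above can be sketched in Python for scripted checks over many files. This is an illustration under assumptions, not the Notepad++ behaviour itself: \R and \h are spelled out because Python's re lacks them, the capture group holds only the tag name, and the names SPLIT_TAG, find_split_tags and join_split_tags are hypothetical helpers made up for this example.

```python
import re

# [^\r\n]* replaces .* so a trailing \r is never swallowed into the text;
# (?:\r\n|\n|\r) and [ \t] stand in for \R and \h, which re lacks.
SPLIT_TAG = re.compile(r"<(\w+)>([^\r\n]*)(?:\r\n|\n|\r)[ \t]*</\1>")

def find_split_tags(text):
    """Return (line_number, tag_name) for each opening tag whose closing
    tag sits on the next line. Line numbers assume LF or CRLF endings."""
    return [
        (text.count("\n", 0, m.start()) + 1, m.group(1))
        for m in SPLIT_TAG.finditer(text)
    ]

def join_split_tags(text):
    """Pull each stray closing tag back onto the line of its opening tag."""
    return SPLIT_TAG.sub(r"<\1>\2</\1>", text)

sample = (
    "<UsersC>21\n"
    "</UsersC>\n"
    "<Date>2020-01-20</Date>\n"
    "  <UUID>7f8e38ab-ceba-45c5-ab34-834b61bad840\n"
    "  </UUID>\n"
)

print(find_split_tags(sample))
print(join_split_tags(sample))
```

The backreference \1 is what keeps the match from pairing an opening tag with some unrelated closing tag on the following line.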
          • A
            Alex Mesch @Ekopalypse
            last edited by Feb 17, 2020, 4:13 PM

            @Ekopalypse amazing, it’s working very well))
            Thank you very much and thank u for description of the process.

            • G
              guy038
              last edited by guy038 Feb 17, 2020, 6:58 PM Feb 17, 2020, 6:48 PM

              Hello, @alex-mesch, @ekopalypse and All,

              A second possibility, derived from @ekopalypse’s solution, would be :

              • Open the Replace dialog ( Ctrl + H )

              • SEARCH <(\w+)>.*\K\R\h*(?=</\1>)

              • REPLACE Leave EMPTY

              • Now, choose one of the following :

                • Tick the Wrap around option if you want to run the S/R on the whole file, from beginning to end

                • Untick the Wrap around option to run the S/R from the current location to the end of the file

                • Do a normal selection of text first and then tick the In selection option

              • Select the Regular expression search mode

              • Click exclusively on the Replace All button, whatever your choice !

              Notes :

              • Due to the \K syntax, inside this regex, the search process works correctly, but the “step by step” replacement, with the Replace button, is not functional :-(

              • The search regex looks for a line-break, possibly followed by some blank characters ( tabulation and/or space ), ONLY IF :

                • It is preceded by <, then a tag name \w+, stored as group 1 because it is embedded in parentheses, then > and any subsequent character(s) .*, even 0, till the line-break

                • It is followed by the matching ending tag </...>, due to the positive look-ahead structure (?=</\1>) and the \1 syntax, which represents the tag name

              • As the replacement zone is empty, the EOL, and the possible blank chars, are simply deleted !
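As a rough illustration of the same look-ahead idea outside Notepad++, here is a Python sketch. Python's re module has no \K (nor \R or \h), so this version is an assumption-laden equivalent: it captures the text before the line-break in group 1 and writes it back, instead of discarding it from the match with \K. The names LOOKAHEAD_FIX and remove_break_before_closing_tag are hypothetical.

```python
import re

# Group 1: opening tag plus the rest of its line (re-emitted, since re has no \K).
# Group 2: the tag name, reused by the look-ahead.
# (?:\r\n|\n|\r) and [ \t] are stand-ins for \R and \h.
LOOKAHEAD_FIX = re.compile(
    r"(<(\w+)>[^\r\n]*)(?:\r\n|\n|\r)[ \t]*(?=</\2>)"
)

def remove_break_before_closing_tag(text):
    """Delete the line-break and indentation that separate an opening tag's
    line from its own closing tag; the closing tag itself is never consumed."""
    return LOOKAHEAD_FIX.sub(r"\1", text)

sample = "<TypePost>Main\r\n\t</TypePost>\r\n<A>x\n</B>\n"
print(repr(remove_break_before_closing_tag(sample)))
```

Because the closing tag sits inside a look-ahead, only the line-break and the blank characters are removed, which mirrors the empty replacement zone described above.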

              Best Regards,

              guy038

              • A
                Alex Mesch @guy038
                last edited by Feb 17, 2020, 7:32 PM

                @guy038 thx)
                Tomorrow I will study how it works) Very hard for my brain)

                • G
                  guy038
                  last edited by Feb 17, 2020, 11:13 PM

                  Hi, @alex-mesch,

                  To begin with, click on the link, below :

                  https://community.notepad-plus-plus.org/topic/15765/faq-desk-where-to-find-regex-documentation

                  Cheers,

                  guy038

                  The Community of users of the Notepad++ text editor.