Q: How to sort groups of lines alphabetically according to the first word after the beginning of each group

  • Z
    Ziad Aborami
    last edited by Feb 22, 2019, 6:47 PM

    I am looking for a function in Notepad++ that lets me sort groups of lines in this way:

    • Each group of lines sits between two fixed markers, for example <article> and </article>.
      Right after <article> there is a variable word, for example human.
      In the next group there is another word, for example animal.
      I want to sort the groups of lines alphabetically according to the word after <article>.

    Example:
    <article>human
    Content
    Of
    Many
    Lines
    </article>
    <article>animal
    Content
    Of
    Another
    Many
    Lines
    </article>
    <article>thing
    Content
    Of
    Other
    Many
    Lines
    </article>

    I want to sort them alphabetically in this way:
    <article>animal
    Content
    Of
    Another
    Many
    Lines
    </article>
    <article>human
    Content
    Of
    Many
    Lines
    </article>
    <article>thing
    Content
    Of
    Other
    Many
    Lines
    </article>
    I hope you can help me.
    Thanks in advance.

    • P
      PeterJones
      last edited by Feb 22, 2019, 8:01 PM

      Step 0: Assume there are no smiley faces (☺ U+263A) in the data set and that the file uses Windows newlines (CRLF = \r\n).
      Step 1: In regular expression mode, find \R(?!<article>) (any newline not followed by the next starting <article>) and replace with \x{263a} (smiley face).
      Step 2: Edit > Line Operations > Sort Lines Lexicographically Ascending
      Step 3: In regular expression mode, find \x{263a} and replace with \r\n.
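
For anyone who would rather script this outside Notepad++, the same join, sort, split idea can be sketched in Python. This is only an illustration under the same assumptions as Step 0: the function name `sort_groups` and its parameters are made up for this example, and the smiley face is assumed not to occur in the data. Unlike the manual steps, it sets a trailing newline aside before sorting and puts it back afterwards.

```python
import re

SEPARATOR = "\u263a"  # smiley face, assumed absent from the data (Step 0)


def sort_groups(text, start_tag="<article>", newline="\r\n"):
    """Sort multi-line groups by the text that follows start_tag.

    Hypothetical helper mirroring Steps 1-3: join each group into one
    line, sort the lines, then split the groups back into lines.
    """
    if SEPARATOR in text:
        raise ValueError("separator occurs in the data; pick another one")

    # Work with bare "\n" internally so CRLF and LF input behave the same.
    normalized = text.replace("\r\n", "\n")
    had_trailing_newline = normalized.endswith("\n")
    body = normalized.rstrip("\n")

    # Step 1: replace every newline that is NOT followed by the start tag.
    pattern = r"\n(?!" + re.escape(start_tag) + ")"
    joined = re.sub(pattern, SEPARATOR, body)

    # Step 2: each group is now a single line, so a plain sort works.
    records = sorted(joined.split("\n"))

    # Step 3: turn the separators back into newlines.
    result = "\n".join(records).replace(SEPARATOR, "\n")
    if had_trailing_newline:
        result += "\n"
    return result.replace("\n", newline)
```

Python's `sorted` compares strings by code point, so uppercase letters sort before lowercase ones; pass a `key` to `sorted` if a case-insensitive order is wanted.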

      • P
        PeterJones
        last edited by Feb 22, 2019, 8:42 PM

        By way of explanation: I chose a smiley face as a “record separator” because it was unlikely to occur in your data. If it does occur, pick any other Unicode character you are sure isn’t in your data. I was originally going to replace with a space. Your example doesn’t show a space on any of the lines inside an <article>...</article>, but given how artificial the data looked, I assumed you may have oversimplified the example, and I didn’t want to risk you coming back and saying “but I had a space”. Assumptions like these are a good reason to provide enough data to show what you really want, and which situations are allowed and which are not.
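
If you are not sure whether a given separator is safe, you can check the data before replacing anything. A minimal Python sketch of that check follows; the function name `pick_separator` and the list of candidate characters are made up for this illustration and are not part of Notepad++.

```python
def pick_separator(text, candidates=("\u263a", "\u2603", "\u00a7", "\u2020")):
    """Return the first candidate character that does not occur in text.

    The default candidates (smiley, snowman, section sign, dagger) are
    arbitrary examples; supply any characters you consider unlikely.
    """
    for ch in candidates:
        if ch not in text:
            return ch
    raise ValueError("every candidate separator occurs in the data")
```

In Notepad++ itself, the equivalent check is to search for the character first and confirm that there are no matches.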

        -----
        FYI:

        This forum is formatted using Markdown, with a help link buried on the little grey ? in the COMPOSE window/pane when writing your post. For more about how to use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end. It is very important that you use these formatting tips – using single backtick marks around small snippets, and using code-quoting for pasting multiple lines from your example data files – because otherwise, the forum will change normal quotes ("") to curly “smart” quotes (“”), will change hyphens to dashes, will sometimes hide asterisks (or if your text is c:\folder\*.txt, it will show up as c:\folder*.txt, missing the backslash). If you want to clearly communicate your text data to us, you need to properly format it.
        If you have further search-and-replace (“matching”, “marking”, “bookmarking”, regular expression, “regex”) needs, study this FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting, see the paragraph above.
        Please note that for all regex and related queries, it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.

        • Z
          Ziad Aborami
          last edited by Feb 23, 2019, 1:56 AM

          Thanks very much.
          It worked fine.
          I used
          \R(?!(<article>(.+?)</article>))
          and then followed the same steps.

          Best wishes
