    Regular expression (remove everything but leave certain code/word)

    • Handa Flocka

      Hi everyone,
      I exported a large batch of messages from a chat as JSON,
      and I want to filter out the entries that have:
      “type”: “message”,

      {
       "id": 184160,
       "type": "message",
       "date": "2021-08-23T21:51:20",
       "from": "fifi mark",
       "from_id": "user1917774101",
       "text": "hello where are you from"
      },
      {
       "id": 184162,
       "type": "Quote",
       "date": "2021-08-23T21:51:24",
       "from": "Tommy Montana",
       "from_id": "user1911184795",
       "reply_to_message_id": 184151,
       "text": "In order to write about life first you must live it."
      },
      

      Does anyone know how I can remove the code from “}” to “{”
      if the “type” is not “Quote”
      but instead
      “type”: “message”?

      Is this possible using a regular expression?
      Thanks

      • Handa Flocka

        To be clearer: remove from “}” to “{” if “type”: “message”, and leave it alone if “type”: “Quote”.

        • PeterJones @Handa Flocka

          @Handa-Flocka ,

          Filtering JSON (or other such data-description languages) would be much easier in a purpose-built tool whose job is to process and filter that kind of data. Regex might be able to handle it, but it would likely depend greatly on the exact content of the data, and if you tried to use that same regex on similar data, there is no guarantee that it would work the next time.
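As a rough illustration of the purpose-built-tool route, here is a minimal Python sketch using the standard `json` module. It assumes the export is a JSON array of objects that each carry a "type" key; the function name `keep_quotes` and the shortened sample data are hypothetical, made up for this example.

```python
import json


def keep_quotes(entries, keep_type="Quote"):
    """Return only the entries whose "type" equals keep_type."""
    return [entry for entry in entries if entry.get("type") == keep_type]


# Hypothetical sample modelled on the two entries in the question.
sample = json.loads("""
[
  {"id": 184160, "type": "message", "text": "hello where are you from"},
  {"id": 184162, "type": "Quote", "reply_to_message_id": 184151,
   "text": "In order to write about life first you must live it."}
]
""")

filtered = keep_quotes(sample)
print(json.dumps(filtered, indent=1))
```

Because the data is parsed rather than pattern-matched, this approach does not depend on whitespace, key order, or whether an entry contains nested objects.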

          I am sure one of the regex gurus here could probably come up with something. However, your problem statement still lacks clarity. It’s often a good idea to present your data in

          what I have:

          blah blah blah
          

          what I want it to be:

          blah blahdy blah blech
          

          … in addition to your description of what you want.

          Because when I read “remove from } to {”, I get confused because } is the closing brace and { is the next opening brace, by my reading… and the only thing between those is the comma; do you really want to just delete },[CRLF]{ (where the [CRLF] is a newline sequence)? Or something else?

          Even better would be if your data gave examples of JSON entries that get edited (as above) and JSON entries that don’t get edited (to help us understand what circumstances you want it to change and what circumstances you want it to stay the same).

          Maybe someone else already understands what you want. But I wouldn’t be able to try to solve it without the additional information requested.

          Also see the generic advice below; you already followed some of it, but you’ll get better answers if your search/replace questions follow all the advice.

          Good luck

          ----

          Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the </> toolbar button or manual Markdown syntax. To make regex in red (and so they keep their special characters like *), use backticks, like `^.*?blah.*?\z`. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.

          • Handa Flocka

            @PeterJones Thanks for taking the time to answer.

            What I meant exactly is: if “type” is “message” and not “Quote”, then remove the whole block from its beginning to its end.
            And yes, I got it wrong at first: it is from “{” to “}”.

            If “type”: “Quote”, the block is left untouched.
            My earlier example:

            {
             "id": 184160,
             "type": "message",
             "date": "2021-08-23T21:51:20",
             "from": "fifi mark",
             "from_id": "user1917774101",
             "text": "hello where are you from"
            },
            {
             "id": 184162,
             "type": "Quote",
             "date": "2021-08-23T21:51:24",
             "from": "Tommy Montana",
             "from_id": "user1911184795",
             "reply_to_message_id": 184151,
             "text": "In order to write about life first you must live it."
            },
            

            Result will be:

            {
             "id": 184162,
             "type": "Quote",
             "date": "2021-08-23T21:51:24",
             "from": "Tommy Montana",
             "from_id": "user1911184795",
             "reply_to_message_id": 184151,
             "text": "In order to write about life first you must live it."
            },
            

            The content here is a whole chat, so there will be many such blocks,
            not just a single occurrence.
            Can a regular expression handle that in bulk?

            • PeterJones @Handa Flocka

              @Handa-Flocka ,

              As long as your blocks never contain nested {} (so no { "id": ####, "nested": { ... }, ... }), then the following will likely work for you:

              • FIND = (?s){(?:(?!"type"\s*:\s*"Quote")[^}])+}\s*,?
              • REPLACE = empty
              • SEARCH MODE = regular expression
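To see what the FIND pattern above does outside of Notepad++, here is a small Python sketch that applies it with the standard `re` module. Two things are assumptions rather than part of the original answer: the literal braces are escaped (`\{` and `\}`) to be explicit in Python's regex flavor, and the sample text is a shortened, hypothetical version of the data in the question.

```python
import re

# The FIND pattern from the answer, with literal braces escaped for Python.
# The negative lookahead stops the match from walking over `"type": "Quote"`,
# so a block containing that pair can never be consumed up to its `}`.
PATTERN = re.compile(r'(?s)\{(?:(?!"type"\s*:\s*"Quote")[^}])+\}\s*,?')

# Shortened, hypothetical sample: one "message" block and one "Quote" block.
sample = '''{
 "id": 184160,
 "type": "message",
 "text": "hello where are you from"
},
{
 "id": 184162,
 "type": "Quote",
 "text": "In order to write about life first you must live it."
},
'''

# REPLACE = empty
result = PATTERN.sub("", sample)
print(result)
```

Like the original, this relies on blocks never containing nested `{}`; with nested objects the `[^}]` part would stop at the inner closing brace.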
              • Handa Flocka

                Thanks @PeterJones
                The instructions are very clear and it works.
                Much appreciated.

                  The Community of users of the Notepad++ text editor.