
    a newbie question about search

    General Discussion
    16 Posts 6 Posters 556 Views
    • Mark Olson @Roni Segoly

      @Roni-Segoly
      You say you have a “huge js file”. If by js you mean JSON, the JsonTools plugin can help. If by js you mean JavaScript, I may not be able to help; JsonTools can handle some JavaScript objects even if they don’t comply with the original JSON specification, but it can’t handle all the complexities of JavaScript syntax.

      If you provide a small example of what you are trying to do, I may be able to suggest how to solve your problem with JsonTools or some other plugin like PythonScript.

      • Roni Segoly @Mark Olson

        @Mark-Olson The fact that it’s JS is less relevant; I can save it and treat it as txt.
        See the example below.
        I need the lines starting with “created_at” and “full_text” one after the other, or maybe with line numbers so that I can then sort by line number.

        I can send the whole file if needed.

        },
        “display_text_range” : [
        “0”,
        “24”
        ],
        “favorite_count” : “1”,
        “in_reply_to_status_id_str” : “1846199292232339627”,
        “id_str” : “1846219638109086002”,
        “in_reply_to_user_id” : “4774540948”,
        “truncated” : false,
        “retweet_count” : “0”,
        “id” : “1846219638109086002”,
        “in_reply_to_status_id” : “1846199292232339627”,
        “created_at” : “Tue Oct 15 16:00:37 +0000 2024”,
        “favorited” : false,
        “full_text” : “@zamir_shatz יש גם רמב"פ”,
        “lang” : “iw”,
        “in_reply_to_screen_name” : “zamir_shatz”,
        “in_reply_to_user_id_str” : “4774540948”

        • Alan Kilborn @Roni Segoly

          @Roni-Segoly

          Overall, your posting is vague. There’s a FAQ about properly posting such questions.

          Likely your data is actually:

          },
          "display_text_range" : [
          "0",
          "24"
          ],
          "favorite_count" : "1",
          "in_reply_to_status_id_str" : "1846199292232339627",
          "id_str" : "1846219638109086002",
          "in_reply_to_user_id" : "4774540948",
          "truncated" : false,
          "retweet_count" : "0",
          "id" : "1846219638109086002",
          "in_reply_to_status_id" : "1846199292232339627",
          "created_at" : "Tue Oct 15 16:00:37 +0000 2024",
          "favorited" : false,
          "full_text" : "@zamir_shatz יש גם רמב"פ",
          "lang" : "iw",
          "in_reply_to_screen_name" : "zamir_shatz",
          "in_reply_to_user_id_str" : "4774540948"
          

          If I were doing your task, I might start this way:

          • Invoke Mark with Ctrl+m
          • In Find what put "created_at"|"full_text"
          • Checkmark: Bookmark line, Match case, Wrap around and Regular expression
          • Press Mark all
          • On the Search menu, choose Bookmark, then select Copy Bookmarked Lines
          • Create a new document with File > New (or simply press Ctrl+n)
          • Do Ctrl+v (paste)

          See what that gets you for a start.
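
For readers who would rather script it, the Mark-and-copy steps above amount to keeping every line that matches the search expression. Here is a minimal Python sketch of that idea, assuming the data has been saved as plain text. The function name `copy_matching_lines` and the shortened sample text are illustrative only and are not part of Notepad++.

```python
import re

# Same expression as in the "Find what" box; Python matching is
# case-sensitive by default, like the "Match case" option.
WANTED = re.compile(r'"created_at"|"full_text"')

def copy_matching_lines(text):
    """Return the lines that would be bookmarked, in their original order."""
    return [line for line in text.splitlines() if WANTED.search(line)]

# Shortened, made-up sample in the same shape as the posted data.
sample = '''"favorited" : false,
"created_at" : "Tue Oct 15 16:00:37 +0000 2024",
"lang" : "iw",
"full_text" : "@zamir_shatz example text",
'''

for line in copy_matching_lines(sample):
    print(line)
```

Like the manual steps, this only looks at each line as text. It does not know which object a line belongs to.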

          • guy038

            Hello, @roni-segoly, @mark-olson, @alan-kilborn and All,

            @roni-segoly, you did not provide enough text for us to guess the right way to help you!

            Do you mean that, from this INPUT text :

            },
            "display_text_range" : [
            "0",
            "24"
            ],
            "favorite_count" : "1",
            "in_reply_to_status_id_str" : "1846199292232339627",
            "id_str" : "1846219638109086002",
            "in_reply_to_user_id" : "4774540948",
            "truncated" : false,
            "retweet_count" : "0",
            "id" : "1846219638109086002",
            "in_reply_to_status_id" : "1846199292232339627",
            "created_at" : "Tue Oct 15 16:00:37 +0000 2024",
            "favorited" : false,
            "full_text" : "@zamir_shatz יש גם רמב"פ",
            "lang" : "iw",
            "in_reply_to_screen_name" : "zamir_shatz",
            "in_reply_to_user_id_str" : "4774540948"
            

            Are you expecting this OUTPUT text, with the two lines beginning with created_at or full_text moved after the other ones?

            },
            "display_text_range" : [
            "0",
            "24"
            ],
            "favorite_count" : "1",
            "in_reply_to_status_id_str" : "1846199292232339627",
            "id_str" : "1846219638109086002",
            "in_reply_to_user_id" : "4774540948",
            "truncated" : false,
            "retweet_count" : "0",
            "id" : "1846219638109086002",
            "in_reply_to_status_id" : "1846199292232339627",
            "favorited" : false,
            "lang" : "iw",
            "in_reply_to_screen_name" : "zamir_shatz",
            "in_reply_to_user_id_str" : "4774540948"
            "created_at" : "Tue Oct 15 16:00:37 +0000 2024",
            "full_text" : "@zamir_shatz יש גם רמב"פ",
            

            Best Regards,

            guy038

            • Roni Segoly @Alan Kilborn

              @Alan-Kilborn Managed, cheers

              • Roni Segoly @Alan Kilborn

                @Alan-Kilborn I took a screenshot of a section of the file, as not everyone has Hebrew characters.
                If possible, I need the output without the labels, with the date and text on one line, separated by a comma.
                Like:
                “Tue Oct 15 16:00:37 +0000 2024”, “@zamir_shatz יש גם רמב"פ”

                [image: screenshot of a section of the file]
                I cannot post the link to the file yet; I need two reputation points.

                • Mark Olson @Roni Segoly

                  @Roni-Segoly
                  In the future, you should refer to JSON as JSON or json, not js. Calling it js is confusing to programmers like me, because js is generally used to refer to JavaScript, not JSON.

                  JsonTools makes it easy to extract a few fields (like full_text and created_at) from each object in an array of objects, which is what your tweets appear to be.

                  1. Open the JsonTools tree view for your file.
                  2. In the text box in the upper left-hand corner of the tree view, enter the query @[:][created_at, full_text]. This RemesPath query will iterate through the array of objects and extract the created_at and full_text fields from each object.
                  3. Click the Submit query button.
                  4. You can now look at the tree view and notice that the tree displays only the full_text and created_at fields in each object.
                  5. Click the Save query result button.
                  6. The fields you wanted will now be in a new buffer, which you can save to a new file if desired.
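
Outside Notepad++, a similar projection can be sketched in plain Python. This is not how JsonTools or RemesPath is implemented. It only approximates the result described in the steps above, and it assumes the file is a JSON array of objects that carry `created_at` and `full_text` at the top level. The helper name `project_fields` and the sample data are hypothetical.

```python
import json

def project_fields(tweets, fields=("created_at", "full_text")):
    """Keep only the requested keys from each object in a list of objects."""
    return [{f: t[f] for f in fields if f in t} for t in tweets]

# Hypothetical, heavily shortened stand-in for the real file contents.
raw = '''[
  {"id": "1", "created_at": "Tue Oct 15 16:00:37 +0000 2024",
   "full_text": "example text", "lang": "iw"}
]'''

selected = project_fields(json.loads(raw))
print(json.dumps(selected, ensure_ascii=False, indent=2))
```

Because the text is parsed as JSON first, the selection works on objects and keys rather than on lines.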

                  JsonTools has a lot of other features, like a sort form that can sort JSON arrays in a variety of different ways. I recommend reading the documentation; I put a lot of work into making it readable and thorough.

                  EDIT: Don’t post a link to the file. If it’s really large, it will waste the resources of this forum. I know what tweet JSON looks like; I have a bunch of it on my own computer that I use as examples to test JsonTools.

                  EDIT2: If you don’t know what I mean by “array” and “object”, you should read this introduction to JSON. It is a bad idea to work with JSON without understanding it.

                  • Alan Kilborn @Roni Segoly

                    @Roni-Segoly:

                    not everyone has Hebrew characters

                    They don’t?

                    • guy038

                      Hi, @roni-segoly, @mark-olson, @alan-kilborn and All,

                      Very easy with regexes !

                      So :

                      • Move to your file tab, first

                      • Open the Replace dialog ( Ctrl + H )

                      • Untick all box options

                      • SEARCH (?-is)^"created_at"\x20:\x20|\R"full_text"\x20:(.+),$

                      • REPLACE ?1\1

                      • Check the Wrap around option

                      • Select the Regular expression search mode

                      • Click, once only, on the Replace All button

                      Voila !
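
For anyone who wants to try the same substitution outside Notepad++, here is a rough Python sketch. Two things differ and are assumptions of this sketch: Python's `re` module has no `\R` and no `?1` conditional replacement, so `\r?\n` and a small replacement function stand in for them. It also assumes the input already contains only the `created_at` and `full_text` lines, one pair after another, with no leading whitespace. The name `join_date_and_text` and the sample text are illustrative.

```python
import re

# First alternative: the "created_at" label at the start of a line (removed).
# Second alternative: the line break plus the "full_text" label; group 1
# captures the value so it can be appended to the previous line.
PATTERN = re.compile(
    r'^"created_at"\x20:\x20|\r?\n"full_text"\x20:(.+),$',
    re.MULTILINE,
)

def join_date_and_text(extracted):
    """Drop both labels and put each date and text on one line."""
    return PATTERN.sub(lambda m: m.group(1) or "", extracted)

# Shortened, made-up sample standing in for the extracted lines.
sample = (
    '"created_at" : "Tue Oct 15 16:00:37 +0000 2024",\n'
    '"full_text" : "@zamir_shatz example text",\n'
)

print(join_date_and_text(sample), end="")
```

If other lines sit between the two fields, the second alternative joins the text onto whatever line precedes it, so the extraction step has to come first.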

                      BR

                      guy038

                      • Mark Olson @guy038

                        @guy038 said in a newbie question about search:

                        Very easy with regexes !

                        Not in general.

                        • Alan Kilborn @Mark Olson

                          @Mark-Olson said in a newbie question about search:

                          Not in general.

                          Well, I guess it depends.
                          Totally generally, then I agree with you.

                          If the data is simple and well formed, it’s doable.
                          That was the presumption I was proceeding upon with my first answer to OP.

                          But, everyone encouraged OP to say more…to little avail.
                          So, if the data was NOT simple and well formed, OP likely did not get what he wanted, at least from my method.

                          But OP seemed satisfied, so, let’s move on…

                          • Mark Olson @Alan Kilborn

                            @Alan-Kilborn said in a newbie question about search:

                            If the data is simple and well formed, it’s doable.
                            That was the presumption I was proceeding upon with my first answer to OP.

                            Tweet JSON is extremely complex and deeply nested, with some fields appearing at different nesting depths. The created_at field, for example, appears in the root object, the retweeted_status child of the root object, and the user child of the root object. If the JSON file is printed out with no depth-based indentation, your regex has no way of differentiating between these created_at fields. The full_text field could also appear at different nesting depths.

                            To expand, tweet JSON can have a structure that looks a little bit like this (only much, much worse):

                            [
                              {
                                "Root1": {
                                  "bar": false,
                                  "quz": 1
                                },
                                "rOOt2": {
                                  "quz": 2,
                                  "bar": false
                                },
                                "ROOT3": [
                                  {
                                    "id": 1,
                                    "id_str": 2
                                  }
                                ],
                                "ROot4": [
                                  {
                                    "id": -37,
                                    "id_str": 75
                                  }
                                ],
                                "roOT5": "blah"
                              }
                            ]
                            

                            If you write a regex that searches for the bar field, you most likely won’t be able to tell whether its parent is Root1 or rOOt2. A similar issue happens with the id_str and id keys.
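
The point can be illustrated with a short Python sketch on a cut-down version of the structure above. The helper `find_key_paths` is hypothetical. It walks the parsed data and reports where each `bar` key lives, whereas a regex over the unindented text sees two matches that look identical.

```python
import json
import re

# Cut-down version of the example structure.
doc = json.loads('''
[{"Root1": {"bar": false, "quz": 1},
  "rOOt2": {"quz": 2, "bar": false}}]
''')

def find_key_paths(node, key, path=()):
    """Yield the path to every occurrence of `key` in nested JSON data."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield path + (k,)
            yield from find_key_paths(v, key, path + (k,))
    elif isinstance(node, list):
        for i, v in enumerate(node):
            yield from find_key_paths(v, key, path + (i,))

# A parser knows the parent of each "bar" key.
paths = list(find_key_paths(doc, "bar"))
print(paths)

# The same data printed with no depth-based indentation, searched by regex:
flat = json.dumps(doc, indent=0)
regex_hits = re.findall(r'"bar": \w+', flat)
print(regex_hits)
```

The path tuples distinguish the two parents. The regex hits carry no such information.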

                            • Alan Kilborn @Mark Olson

                              @Mark-Olson said:

                              Tweet JSON is extremely complex and deeply nested, with some fields appearing at different nesting depths. The created_at field, for example, appears in the root object, the retweeted_status child of the root object, and the user child of the root object. If the JSON file is printed out with no depth-based indentation, your regex has no way of differentiating between these created_at fields. The full_text field could also appear at different nesting depths.

                              and blah blah blah…

                              I hope that’s for the benefit of the OP and not me, because I don’t care an iota about Tweet JSON or WTF the data is. I sold my solution as simple-minded, it’s up to OP to decide if it works for him, or to keep pursuing some other solution.

                              Know thy data…understand how you’re manipulating it – this is OP’s responsibility. As is asking a full and complete question, with representative data fully shown.

                              • Alen Mark @Roni Segoly

                                @Roni-Segoly
                                Yes, it’s definitely possible to extract specific lines from your large JS file in Notepad++, and there are a couple of ways you can approach this:

                                Using Regular Expressions (Regex): Notepad++ has a powerful “Find” feature that supports regular expressions, which can help you search for patterns in your file. If you know the structure of the lines containing the specific strings you want to extract, you can use a regex search to locate them together.

                                Here’s how to use Regex in Notepad++:

                                Press Ctrl + F to open the Find dialog.
                                Go to the Find tab and select Regular expression in the search mode.
                                Use a regex pattern to find the string you need along with its corresponding line. For example:
                                markdown
                                Copy code
                                (YourFirstString.*\n.*YourSecondString)
                                This will match the first line with YourFirstString and the line immediately after it with YourSecondString.

                                Using a Script Plugin: If you’re dealing with more complex extractions or specific logic, you might want to install the PythonScript or NppExec plugin, which allows you to write and execute small scripts directly in Notepad++. You can write a script that reads the file line by line, checks for the matching strings, and extracts the corresponding lines as needed.

                                These methods should help you extract the corresponding lines from your large JS file. If you have more details on the structure, I could help refine the search process further!

                                • PeterJones @Alen Mark

                                  @Alen-Mark ,

                                  Please note: this is the second time you’ve come to the forum and posted ultra-generic content that sounds vaguely on-topic: it is highly reminiscent of AI-generated phraseology.

                                  Please understand that posting AI-Generated content is expressly forbidden in this forum. And if your posts continue to appear as if they are – whether or not they are – you are likely to get banned. If you wish to avoid looking like (and getting banned as) AI, then I suggest you tailor your replies to the individual posts, rather than providing overly-generic responses that don’t take into account the context of the entire conversation.
