    How to mark lines with under "x" characters after : in a line.

    21 Posts 8 Posters 8.8k Views
    • EkopalypseE
      Ekopalypse @Francisco
      last edited by

      @Francisco

      From your given example, I would say:
      Find what: ^ADDRESS.*$ and leave Replace with empty.
      BUT this assumes that the word ADDRESS really is at the very start of the line,
      meaning there is no space, tab, or other character in front of ADDRESS.

      MAKE A BACKUP first and then run it; a slight modification to the expression might erase your whole file.

      • PeterJonesP
        PeterJones
        last edited by

        @Ekopalypse said:

        find what:^ADDRESS.*$ and leave replace with empty.

        That leaves a blank line for each deleted line. If @Francisco wants the whole line, including its newline, to be deleted, then the Find what: could be ^ADDRESS.*(\R|\Z)
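
The difference between the two expressions can be tried outside Notepad++ with a small Python sketch. It is only an approximation: Python's re module has no \R, so \r?\n stands in for it, and the three-line sample text is made up for illustration.

```python
import re

# Hypothetical sample text; only the ADDRESS line should go away.
text = "NAME: x\nADDRESS: y\nPHONE: z\n"

# First suggestion: the text of the line is removed, but its newline stays behind.
blank_left = re.sub(r"^ADDRESS.*$", "", text, flags=re.MULTILINE)

# Follow-up suggestion: also consume the line ending (or the end of the input).
line_gone = re.sub(r"^ADDRESS.*(?:\r?\n|\Z)", "", text, flags=re.MULTILINE)

print(repr(blank_left))  # 'NAME: x\n\nPHONE: z\n'
print(repr(line_gone))   # 'NAME: x\nPHONE: z\n'
```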

        • FranciscoF
          Francisco
          last edited by

          @PeterJones said:

          ^ADDRESS.*(\R|\Z)

          I have a problem: with ^ADDRESS.*(\R|\Z), the selection starts on the first line that begins with ADDRESS and ends on the last line of the file, so it also selects the other lines that do not start with ADDRESS.

          • Alan KilbornA
            Alan Kilborn @Francisco
            last edited by

            @Francisco

            If you want to stay on the same line, lead off the search expression with (?-s). This tells the searcher to not allow a . used later to match a line ending character(s). Thus the .* part won’t spill over onto multiple lines.
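
The effect can be reproduced in Python as a rough illustration. The assumption here is that Python's re.DOTALL flag plays the role of a dot that matches line endings, and that leaving the flag off corresponds to (?-s); the sample text is invented.

```python
import re

# Hypothetical sample text with lines after the ADDRESS line.
text = "NAME: x\nADDRESS: y\nPHONE: z\nFAX: w\n"
pattern = r"^ADDRESS.*(?:\r?\n|\Z)"

# Dot may match newlines: the greedy .* runs on to the end of the text.
spill = re.search(pattern, text, flags=re.MULTILINE | re.DOTALL)

# Dot stops at line endings (the (?-s) behaviour): the match stays on one line.
single = re.search(pattern, text, flags=re.MULTILINE)

print(repr(spill.group(0)))   # 'ADDRESS: y\nPHONE: z\nFAX: w\n'
print(repr(single.group(0)))  # 'ADDRESS: y\n'
```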

            • FranciscoF
              Francisco
              last edited by Francisco

              @Alan Kilborn, thanks, and good morning everyone. I was successful using ADDRESS with (?-s): it selects only the lines that start with ADDRESS. To delete them, I mark all matches and then remove the marked lines.
              Is it possible to perform this operation on all open files? I can only find in them; I cannot mark them.

              • Alan KilbornA
                Alan Kilborn @Francisco
                last edited by

                @Francisco

                You are correct; you cannot bookmark more than one file per marking operation.

                It isn’t clear to me what your real goal is exactly, but it appears to be a deletion operation. I think it is likely that this can be done entirely with a regular expression replacement, rather than a combo of regex marking followed by manipulation of the bookmarked lines.

                • FranciscoF
                  Francisco @Alan Kilborn
                  last edited by

                  @Alan-Kilborn thanks…
                  What I need:
                  I have 100 text files, with the same format, each with several lines.
                  6 of these lines, are present in all files and start like this:
                  ADDRESS:
                  ADDRESS-CITY: Christmas
                  ADDRESS-STATE-PROVINCE: RN
                  ADDRESS-POSTALCODE: 59054550
                  ADDRESS-COUNTRY: BRAZIL
                  EMAIL: mjnhx@globo.com
                  I need to easily delete the 6 lines above.

                  • PeterJonesP
                    PeterJones
                    last edited by PeterJones

                    @Francisco said:

                    I have 100 text files, with the same format, each with several lines.
                    6 of these lines, are present in all files and start like this:

                    The problem is, you’ve already rejected our solutions (or, at least, you keep on asking, so we have to assume your problem isn’t solved), but have shown nothing that indicates why what we’ve given doesn’t work for you. One reason for this is explained in my boilerplate below (after the dashed line).

                    That said, maybe you’re just unsure how to combine @Alan-Kilborn’s fix to my regex, and then have it actually do the deletion, rather than just highlighting. If that’s the case, then it’s simple. I’ll also tweak my portion, because you have now indicated that it should also delete EMAIL, which wasn’t anywhere in your original problem statement.

                    • Find What: (?-s)^(?:ADDRESS(-.*?)*|EMAIL):.*?(?:\R|\Z)
                      • (?-s): don’t have . match newline
                      • ^: match starts at beginning of line
                      • (?:...): make a group, but don’t give it a number
                      • ADDRESS(-.*?)*: match the word “ADDRESS”, optionally followed by one or more groups that each consist of a hyphen and further characters (such as -CITY or -STATE-PROVINCE)
                      • |: the OR operator – will match what is before or what is after
                      • EMAIL: the word EMAIL
                      • :: that group of ADDRESS or EMAIL must be immediately followed by a colon to match
                      • .*?: match the remaining characters on the line
                      • (?:\R|\Z): another unnumbered group, this time containing a NEWLINE sequence (\R = CR, LF, or CRLF) or end-of-file (\Z).
                    • Replace With: empty
                      • this will delete the whole line matched above, including the newline
                    • Mode = regular expression

                    I recommend getting the expression working with one file; once that works, then you can move on to using the Find in Files for all your files.

                    With those settings, this block of text:

                    ADDRESS:
                    ADDRESS-CITY: Christmas
                    ADDRESS-STATE-PROVINCE: RN
                    ADDRESS-POSTALCODE: 59054550
                    ADDRESS-COUNTRY: BRAZIL
                    EMAIL: mjnhx@globo.com
                    You tell us nothing about the remainder of the file, so I don't know whether
                    the following lines match your pattern, or whether they don't:
                    SOMETHING-ELSE: value
                    MORE-COLONED-LINES: here
                    For now, I'll assume you want to keep everything except lines that 
                    start with "ADDRESS...:" or "EMAIL:"
                    

                    would be edited to:

                    You tell us nothing about the remainder of the file, so I don't know whether
                    the following lines match your pattern, or whether they don't:
                    SOMETHING-ELSE: value
                    MORE-COLONED-LINES: here
                    For now, I'll assume you want to keep everything except lines that 
                    start with "ADDRESS...:" or "EMAIL:"
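
For anyone who prefers to script this instead of using Find in Files, the same deletion can be sketched in Python. This is an approximation under stated assumptions, not the Notepad++ behaviour itself: \R is replaced by \r?\n, and the helper names, the *.txt glob and the UTF-8 encoding are hypothetical choices.

```python
import re
from pathlib import Path

# Approximate translation of the Find What expression.
# re.MULTILINE makes ^ match at every line start; without re.DOTALL the dot
# does not match \n, which mirrors (?-s).
PATTERN = re.compile(r"^(?:ADDRESS(?:-.*?)*|EMAIL):.*?(?:\r?\n|\Z)", re.MULTILINE)

def strip_address_lines(text: str) -> str:
    """Delete every line that starts with ADDRESS...: or EMAIL:."""
    return PATTERN.sub("", text)

def process_folder(folder: str, glob: str = "*.txt", encoding: str = "utf-8") -> int:
    """Rewrite matching files in place and return how many files changed.

    The glob and encoding defaults are assumptions; back up the folder first.
    """
    changed = 0
    for path in Path(folder).glob(glob):
        # newline="" keeps the original line endings untouched.
        with open(path, encoding=encoding, newline="") as fh:
            original = fh.read()
        cleaned = strip_address_lines(original)
        if cleaned != original:
            with open(path, "w", encoding=encoding, newline="") as fh:
                fh.write(cleaned)
            changed += 1
    return changed

sample = (
    "ADDRESS:\n"
    "ADDRESS-CITY: Christmas\n"
    "ADDRESS-STATE-PROVINCE: RN\n"
    "ADDRESS-POSTALCODE: 59054550\n"
    "ADDRESS-COUNTRY: BRAZIL\n"
    "EMAIL: mjnhx@globo.com\n"
    "SOMETHING-ELSE: value\n"
    "MORE-COLONED-LINES: here\n"
)
print(strip_address_lines(sample))
```

As with the Notepad++ version, it is worth checking the result on one file before calling process_folder on a whole directory.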
                    

                    Of course, this is still making lots of assumptions. Other possible interpretations are that you want the first six lines of any file to be deleted, whatever the text. And it might be that the “SOMETHING-ELSE:” I indicated in the example text might also be “ADDRESS:”, in which case we’d have to tweak my regex to limit those matches to the first lines of a file, because mine assumes that any lines starting with “ADDRESS…:” or “EMAIL:” will be deleted.

                    It would be easier to help you if you’d give all the information we need at once, rather than doling it out piecemeal. As explained below, a good example would have examples of lines to match and lines not to match, and would show us both the before and after. A good example will also be properly formatted using Markdown (like my example was) – links to Markdown help and regex help are in the boilerplate below.

                    -----
                    FYI: I often add this to my response in regex threads, unless I am sure the original poster has seen it before. Here is some helpful information for finding out more about regular expressions, and for formatting posts in this forum (especially quoting data) so that we can fully understand what you’re trying to ask:

                    This forum is formatted using Markdown, with a help link buried on the little grey ? in the COMPOSE window/pane when writing your post. For more about how to use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end. It is very important that you use these formatting tips – using single backtick marks around small snippets, and using code-quoting for pasting multiple lines from your example data files – because otherwise, the forum will change normal quotes ("") to curly “smart” quotes (“”), will change hyphens to dashes, will sometimes hide asterisks (or if your text is c:\folder\*.txt, it will show up as c:\folder*.txt, missing the backslash). If you want to clearly communicate your text data to us, you need to properly format it.

                    If you have further search-and-replace (“matching”, “marking”, “bookmarking”, regular expression, “regex”) needs, study this FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting, see the paragraph above.

                    Please note that for all regex and related queries, it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.

                    • FranciscoF
                      Francisco @PeterJones
                      last edited by Francisco

                      @PeterJones said:

                      (?-s)^(?:ADDRESS(-.*?)*|EMAIL):.*?(?:\R|\Z)

                      This command worked perfectly on all files in a given folder. All lines starting with ADDRESS and EMAIL were automatically deleted, as desired.
                      I am very pleased and grateful for this important help.
                      Only three files did not have their email deleted, because in them the email line does not have the word EMAIL at the beginning of the line.
                      P.S. I do not know if it would be possible to extend this command to also match any line that contains an @.

                      • Nicholas WetzelN
                        Nicholas Wetzel @PeterJones
                        last edited by

                        @PeterJones said:

                        @Nicholas-Wetzel: Welcome to the Notepad++ Community.

                        Example of lines I want to keep:
                        Example of lines I want to delete:

                        Thank you for clearly specifying both. That helps us help you.

                        Using the regex ^.*:.{1,7}(\R+|\z) to find, with replace being empty, should delete those lines
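
A rough Python equivalent of that expression, for experimenting with the "under x characters after the colon" idea. It assumes \n line endings (Python has no \R) and uses invented sample data; the function name and the max_len parameter are hypothetical.

```python
import re

def delete_short_values(text: str, max_len: int = 7) -> str:
    """Delete lines that have between 1 and max_len characters after a colon."""
    # Like \R+ in the original, \n+ also swallows blank lines that follow a match.
    pattern = rf"^.*:.{{1,{max_len}}}(?:\n+|\Z)"
    return re.sub(pattern, "", text, flags=re.MULTILINE)

# Hypothetical data: the values have 5, 15, 7 and 8 characters.
data = "user1:short\nuser2:muchlongervalue\nuser3:1234567\nuser4:12345678\n"
print(delete_short_values(data))
```

Only the lines whose value is longer than seven characters (user2 and user4) remain.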

                        Mind checking my new thread here please?

                        https://notepad-plus-plus.org/community/topic/18149/sorting-login-information

                        • Hoang NgocH
                          Hoang Ngoc @PeterJones
                          last edited by

                          @PeterJones
                          Hello sir
                          I need help in notepad++, really appreciated
                          List:
                          kkkkk:123456
                          kkkkk:aaaaaa
                          kkkkk:a123456
                          kkkk:123456a
                          Examples of lines I want to delete:
                          kkkkk:123456
                          kkkkk:aaaaaa
                          Delete every line where the text after “:” consists only of digits or only of letters.

                          • PeterJonesP
                            PeterJones @Hoang Ngoc
                            last edited by

                            @Hoang-Ngoc

                            With data:

                            kkkkk:123456
                            kkkkk:aaaaaa
                            kkkkk:a123456
                            kkkk:123456a
                            kkkkk:zzzzz
                            

                            FIND = (?-s)^.*:([[:alpha:]]+|[[:digit:]]+)(\R|\z)
                            REPLACE = empty
                            SEARCH MODE = regular expression
                            yields

                            kkkkk:a123456
                            kkkk:123456a
                            

                            The logic I used: you wanted to delete the whole line, so I had to start with “from the start of the line, any character”; you said it came after a colon, so “followed by a colon”; then “followed by either a group of all letters or a group of all numbers”, then “followed by the end of the line (or end of the file)”. I then translated those into regex tokens.
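
The same logic can be sketched in Python for readers who want to test it outside Notepad++. This is an approximation: Python's re module does not support the POSIX classes [[:alpha:]] and [[:digit:]], so ASCII ranges are assumed instead, and \r?\n replaces \R.

```python
import re

# Start of line, anything up to a colon, then only letters or only digits,
# then the line ending (or the end of the input).
ONLY_LETTERS_OR_DIGITS = re.compile(
    r"^.*:(?:[A-Za-z]+|[0-9]+)(?:\r?\n|\Z)", re.MULTILINE
)

data = (
    "kkkkk:123456\n"
    "kkkkk:aaaaaa\n"
    "kkkkk:a123456\n"
    "kkkk:123456a\n"
    "kkkkk:zzzzz\n"
)
result = ONLY_LETTERS_OR_DIGITS.sub("", data)
print(result)
```

The two mixed lines (kkkkk:a123456 and kkkk:123456a) are the only ones left.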

                            ----

                            Please note: This Community Forum is not a data transformation service; you should not expect to be able to always say “I have data like X and want it to look like Y” and have us do all the work for you. If you are new to the Forum, and new to regular expressions, we will often give help on the first one or two data-transformation questions, especially if they are well-asked and you show a willingness to learn; and we will point you to the documentation where you can learn how to do the data transformations for yourself in the future. But if you repeatedly ask us to do your work for you, you will find that the patience of usually-helpful Community members wears thin. The best way to learn regular expressions is by experimenting with them yourself, and getting a feel for how they work; having us spoon-feed you the answers without you putting in the effort doesn’t help you in the long term and is uninteresting and annoying for us.

                            ----

                            Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the </> toolbar button or manual Markdown syntax. To make regex in red (and so they keep their special characters like *), use backticks, like `^.*?blah.*?\z`. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.

                            • Hoang NgocH
                              Hoang Ngoc
                              last edited by

                              This post is deleted!
                              • Hoang NgocH
                                Hoang Ngoc @PeterJones
                                last edited by

                                @PeterJones

                                What about before “:”? My website requires: “Username can only contain the allowed characters: uppercase letters, lowercase letters, numbers (a-z, A-Z, 0-9), underscores, dashes and periods. Username must begin or end with a letter or number and must contain at least one letter.” and “Account name must have 6-15 characters”
                                I want to delete the lines that do not follow the rule

                                • PeterJonesP
                                  PeterJones @Hoang Ngoc
                                  last edited by

                                  @Hoang-Ngoc said in How to mark lines with under "x" characters after : in a line.:

                                  What about before “:”? My website requires: “Username can only contain the allowed characters: uppercase letters, lowercase letters, numbers (a-z, A-Z, 0-9), underscores, dashes and periods. Username must begin or end with a letter or number and must contain at least one letter.” and “Account name must have 6-15 characters”

                                  and then you deleted that and wrote

                                  I want to delete the lines that do not follow the rule

                                  Well, that changes things. Thanks for wasting my time while I was writing up deleting everything that didn’t follow that rule. I’ll edit what I was in the middle of…

                                  -----

                                  The least you can do is ask complete questions and at least attempt to make your posts make sense (for example, the preview window should have shown you that it was rendering your new text as if it were part of my quoted message, before you deleted it).

                                  As I said earlier, this forum is not a data transformation service. So you’ll get one more freebie from me. But you’ve got to try to put more effort in if you’re going to be asking people for help. If you want to do many search-and-replace, you’re going to have to read the official Notepad++ regular expression docs, which I already linked for you before, and have now linked again.

                                  To allow uppercase, lowercase, numbers, underscores, dashes, and periods, you can use the character class [a-zA-Z0-9_.-]. To indicate a specific quantity, you can use {N,M}, where N and M are the minimum and maximum counts you want to allow. For the more restrictive letter-or-number-only rule for the first and last character, use [a-zA-Z0-9] without the other characters. Put that all together: since you want one restrictive character, followed by N-M less restrictive characters, followed by one restrictive character, the N-M range needs to be two less than the actually-allowed 6-15, so 4-13. Thus, [a-zA-Z0-9][a-zA-Z0-9_.-]{4,13}[a-zA-Z0-9]. And, as before, you need a start-of-line anchor, and want to have the colon after. But this is what’s allowed, and you want to delete what’s not allowed. Since you now want to delete any that match the rules, that’s slightly easier.

                                  FIND = (?-s)^[a-zA-Z0-9][a-zA-Z0-9_.-]{4,13}[a-zA-Z0-9]:.*(\R|\z)

                                  Actually, that almost did it.

                                  short:blah123
                                  thisIs2good:blah123
                                  toooverlylongouidiot:blah123
                                  bad'character:blah123
                                  ok-char:blah123
                                  1234-6789:blah123
                                  -badStart:blah123
                                  badEnd_:blah123
                                  1ok_again2:blah123
                                  

                                  becomes

                                  short:blah123
                                  toooverlylongouidiot:blah123
                                  bad'character:blah123
                                  -badStart:blah123
                                  badEnd_:blah123
                                  

                                  You’ll notice that username=1234-6789 line was deleted, even though it didn’t contain at least one letter. That’s because getting the “at least one letter” is hard. So I want to handle that separately.

                                  Before doing the regex shown above, do a FIND = ^[0-9_.-]{6,15}:.*$ and REPLACE=!KEEPME!$0, which will give an intermediate:

                                  short:blah123
                                  thisIs2good:blah123
                                  toooverlylongouidiot:blah123
                                  bad'character:blah123
                                  ok-char:blah123
                                  !KEEPME!1234-6789:blah123
                                  -badStart:blah123
                                  badEnd_:blah123
                                  1ok_again2:blah123
                                  

                                  Now do the one I showed earlier: (?-s)^[a-zA-Z0-9][a-zA-Z0-9_.-]{4,13}[a-zA-Z0-9]:.*(\R|\z) =>

                                  short:blah123
                                  toooverlylongouidiot:blah123
                                  bad'character:blah123
                                  !KEEPME!1234-6789:blah123
                                  -badStart:blah123
                                  badEnd_:blah123
                                  

                                  Now do FIND = ^!KEEPME! and REPLACE = empty to get rid of that indicator.

                                  short:blah123
                                  toooverlylongouidiot:blah123
                                  bad'character:blah123
                                  1234-6789:blah123
                                  -badStart:blah123
                                  badEnd_:blah123
                                  

                                  Now you only show the usernames that violate your rules.
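
The three passes above can also be chained in a short Python sketch, which may be handy for checking the logic on a copy of the data. It is an approximation of the Notepad++ replacements: \R is replaced by \r?\n, $0 becomes \g<0>, and the function name is hypothetical.

```python
import re

M = re.MULTILINE

def keep_only_invalid(text: str) -> str:
    """Delete lines whose username follows the rules; keep the violators."""
    # Pass 1: tag lines whose username contains no letter, so pass 2 spares them.
    text = re.sub(r"^[0-9_.-]{6,15}:.*$", r"!KEEPME!\g<0>", text, flags=M)
    # Pass 2: delete lines whose username has 6-15 allowed characters and
    # starts and ends with a letter or digit.
    text = re.sub(
        r"^[a-zA-Z0-9][a-zA-Z0-9_.-]{4,13}[a-zA-Z0-9]:.*(?:\r?\n|\Z)",
        "",
        text,
        flags=M,
    )
    # Pass 3: remove the temporary tag again.
    return re.sub(r"^!KEEPME!", "", text, flags=M)

data = (
    "short:blah123\n"
    "thisIs2good:blah123\n"
    "toooverlylongouidiot:blah123\n"
    "bad'character:blah123\n"
    "ok-char:blah123\n"
    "1234-6789:blah123\n"
    "-badStart:blah123\n"
    "badEnd_:blah123\n"
    "1ok_again2:blah123\n"
)
print(keep_only_invalid(data))
```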

                                  • Hoang NgocH
                                    Hoang Ngoc @PeterJones
                                    last edited by

                                    @PeterJones

                                    Thank you so much, i really appreciate what you are doing for this community, keep it up

                                    • guy038G
                                      guy038
                                      last edited by

                                      Hello, @hoang-ngoc, @peterjones and All,

                                      The following single search regex could be used and, with an empty replace field, would delete any line with a valid user-name :

                                      SEARCH (?i-s)(?=^[a-z0-9])(?=.*[a-z0-9]:)(?=.*[a-z].*:)^[a-z0-9_.-]{6,15}:.*\R?

                                      REPLACE Leave EMPTY


                                      Notes :

                                      • The (?i-s) modifier makes the search case-insensitive and makes the regex dot . stand for a single standard character (not a line-break character)

                                      • Then the main part is ^[a-z0-9_.-]{6,15}:.*\R? which searches, from the beginning of a line, for 6 to 15 characters, each of which can be a letter, a digit, an underscore, a period or a dash, followed by a colon, the remainder of the current line and an optional line-break

                                      • This part will be valid ONLY IF, in addition, these three lookaheads are TRUE, at beginning of current line :

                                        • A letter or digit begins the user-name ( part (?=^[a-z0-9]) )

                                        • A letter or digit ends the user-name ( part (?=.*[a-z0-9]:) )

                                        • The user-name contains, at least, ONE letter ( part (?=.*[a-z].*:) )


                                      So, given this INPUT text :

                                      short:••••••••                      #  < 6 chars
                                      ThisIs2good:••••••••                #  OK
                                      Looong_user-name:••••••••           #  > 15 chars
                                      us@er'NA=ME:••••••••                #  NON-VALID chars
                                      ok-chr:••••••••                     #  OK  (  6 chars and ALL chars ALLOWED )
                                      ABCD-FGHI_12.34:••••••••            #  OK  ( 15 chars and ALL chars ALLOWED )
                                      1234-6789:••••••••                  #  NO letter
                                      .User-Name:••••••••                 #  NON-VALID char at START
                                      USER.NAME_:••••••••                 #  NON-VALID char at END
                                      1ok_again2:••••••••                 #  OK
                                      

                                      After the replacement, the following OUTPUT text would remain:

                                      short:••••••••                      #  < 6 chars
                                      Looong_user-name:••••••••           #  > 15 chars
                                      us@er'NA=ME:••••••••                #  NON-VALID chars
                                      1234-6789:••••••••                  #  NO letter
                                      .User-Name:••••••••                 #  NON-VALID char at START
                                      USER.NAME_:••••••••                 #  NON-VALID char at END
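
The lookahead approach carries over to Python with small changes, shown here as a sketch rather than an exact port: Python does not accept (?i-s) as a global inline modifier, so flags are passed instead, and \R? becomes (?:\r?\n)?. The passwords in the sample are shortened to a placeholder and the trailing comments are left out.

```python
import re

VALID_USER_LINE = re.compile(
    r"(?=^[a-z0-9])"        # the user-name starts with a letter or digit
    r"(?=.*[a-z0-9]:)"      # a letter or digit sits right before a colon
    r"(?=.*[a-z].*:)"       # at least one letter occurs before a colon
    r"^[a-z0-9_.-]{6,15}:"  # 6 to 15 allowed characters, then the colon
    r".*(?:\r?\n)?",        # rest of the line and an optional line break
    re.IGNORECASE | re.MULTILINE,
)

lines = [
    "short:pw",
    "ThisIs2good:pw",
    "Looong_user-name:pw",
    "us@er'NA=ME:pw",
    "ok-chr:pw",
    "ABCD-FGHI_12.34:pw",
    "1234-6789:pw",
    ".User-Name:pw",
    "USER.NAME_:pw",
    "1ok_again2:pw",
]
data = "\n".join(lines) + "\n"
remaining = VALID_USER_LINE.sub("", data)
print(remaining)
```

One caveat of this form, in both tools: the lookaheads only look for "a colon" somewhere later on the line, so a password that itself contains a colon could satisfy them unexpectedly.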
                                      

                                      Best regards

                                      guy038
