How to mark lines with under "x" characters after : in a line.



  • I am working on sorting a list of my login information.

    My website requires passwords of at least 8 characters, so I want to remove the lines with 7 or fewer characters after the : divider between the email and password.

    I am just unsure how to do this; any help is appreciated. :)

    Example of lines I want to keep:

    mystictoffee11@yahoo.com:smartguy
    maniactor2313@yahoo.com:maurermartin

    Examples of lines I want to delete:

    philipsedf3@yahoo.com:plakes9
    genciyev13@mail.ru:241241



  • @Nicholas-Wetzel: Welcome to the Notepad++ Community.

    Example of lines I want to keep:
    Example of lines I want to delete:

    Thank you for clearly specifying both. That helps us help you.

    Using the regex ^.*:.{1,7}(\R+|\z) to find, with replace being empty, should delete those lines
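
    For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch. It assumes LF or CRLF line endings; the names SHORT_PASSWORD_LINE and drop_short_passwords and the sample addresses are made up for illustration. Python's re module has no \R, so the line ending is spelled out.

```python
import re

# Rough Python analogue of ^.*:.{1,7}(\R+|\z).
# [^\r\n] stands in for "." so a match can never cross a line ending.
SHORT_PASSWORD_LINE = re.compile(
    r"^[^\r\n]*:[^\r\n]{1,7}(?:(?:\r?\n)+|\Z)",
    re.MULTILINE,
)

def drop_short_passwords(text: str) -> str:
    """Remove every line whose part after the last colon has 1 to 7 characters."""
    return SHORT_PASSWORD_LINE.sub("", text)

# Placeholder data, not real accounts.
sample = (
    "user1@example.com:smartguy\n"
    "user2@example.com:maurermartin\n"
    "user3@example.com:plakes9\n"
    "user4@example.com:241241\n"
)
print(drop_short_passwords(sample), end="")
```

    Like the original expression, this does not touch a line with nothing after the colon, because {1,7} needs at least one character.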



  • @Nicholas-Wetzel

    I hope your example data does not contain real email addresses and real passwords, as you have just published them to the public. 😉



  • Please, how do I delete two lines that start with the same word, in multiple files?
    Example:
    ADDRESS: aaaaaaaaaaaaaa
    ADDRESS-CITY: bbbbbbbbb



  • @Francisco

    from your given example I would say
    find what:^ADDRESS.*$ and leave replace with empty.
    BUT this assumes that the word ADDRESS REALLY is at the start of the line,
    meaning there is no space, tab, or other special character in front of ADDRESS.

    MAKE YOUR BACKUP and then run it; a slight modification might erase your whole file.



  • @Ekopalypse said:

    find what:^ADDRESS.*$ and leave replace with empty.

    That leaves a blank line for each deleted line. If @Francisco wants the whole line, including newline, to be deleted, then the Find what: could be ^ADDRESS.*(\R|\Z)



  • @PeterJones said:

    ^ADDRESS.*(\R|\Z)

    I have a problem: the selection starts on the first line that matches ^ADDRESS.*(\R|\Z) and ends on the last line of the file, so it also selects the other lines that do not start with ADDRESS.



  • @Francisco

    If you want to stay on the same line, lead off the search expression with (?-s). This tells the searcher to not allow a . used later to match a line ending character(s). Thus the .* part won’t spillover match onto multiple lines.
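
    The effect can be reproduced in Python as a rough illustration. This is only an analogy: Python's re.DOTALL flag plays the role of (?s) / ". matches newline", leaving it off plays the role of (?-s), and the sample text is made up.

```python
import re

text = "keep me\nADDRESS: aaa\nADDRESS-CITY: bbb\nkeep me too\n"
pattern = r"^ADDRESS.*(?:\r?\n|\Z)"

# With DOTALL, "." also matches newlines, so .* runs to the end of the text.
spilled = re.search(pattern, text, re.MULTILINE | re.DOTALL)

# Without DOTALL, .* stops at the end of the current line.
one_line = re.search(pattern, text, re.MULTILINE)

print(repr(spilled.group(0)))
print(repr(one_line.group(0)))
```

    The first match extends from the first ADDRESS line to the end of the text, which is the symptom described above; the second covers only the one line.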



  • @Alan-Kilborn, thanks, and good morning everyone. I was successful using (?-s) with ADDRESS, selecting only the lines that start with ADDRESS. To remove them, I mark all matches and then delete the marked lines.
    Is it possible to perform this operation on all open files? I can find in all of them, but I cannot mark them.



  • @Francisco

    You are correct; you cannot bookmark more than one file per marking operation.

    It isn’t clear to me what your real goal is exactly, but it appears to be a deletion operation. I think it is likely that this can be done entirely with a regular expression replacement, rather than a combination of regex marking followed by bookmarked-line manipulation.



  • @Alan-Kilborn thanks…
    What I need:
    I have 100 text files, with the same format, each with several lines.
    6 of these lines, are present in all files and start like this:
    ADDRESS:
    ADDRESS-CITY: Christmas
    ADDRESS-STATE-PROVINCE: RN
    ADDRESS-POSTALCODE: 59054550
    ADDRESS-COUNTRY: BRAZIL
    EMAIL: mjnhx@globo.com
    I need to easily delete the 6 lines above.



  • @Francisco said:

    I have 100 text files, with the same format, each with several lines.
    6 of these lines, are present in all files and start like this:

    The problem is, you’ve already rejected our solutions (or, at least, you keep on asking, so we have to assume your problem isn’t solved), but have shown nothing that indicates why what we’ve given doesn’t work for you. One reason for this is explained in my boilerplate below (after the dashed line).

    That said, maybe you’re just unsure how to combine @Alan-Kilborn’s fix to my regex, and then have it actually do the deletion, rather than just highlighting. If that’s the case, then it’s simple. I’ll also tweak my portion, because you have now indicated that it should also delete EMAIL, which wasn’t anywhere in your original problem statement.

    • Find What: (?-s)^(?:ADDRESS(-.*?)*|EMAIL):.*?(?:\R|\Z)
      • (?-s): don’t have . match newline
      • ^: match starts at beginning of line
      • (?:...): make a group, but don’t give it a number
      • ADDRESS(-.*?)*: match the word “ADDRESS”, optionally followed by one or more hyphen-led segments (a hyphen, then other characters), such as -CITY or -STATE-PROVINCE
      • |: the OR operator – will match what is before or what is after
      • EMAIL: the word EMAIL
      • :: that group of ADDRESS or EMAIL must be immediately followed by a colon to match
      • .*?: match the remaining characters on the line
      • (?:\R|\Z): another unnumbered group, this time containing a NEWLINE sequence (\R = CR, LF, or CRLF) or end-of-file (\Z).
    • Replace With: empty
      • this will delete the whole line matched above, including the newline
    • Mode = regular expression
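
    The settings above can also be sketched in Python for experimenting. This assumes LF or CRLF line endings; the names ADDRESS_OR_EMAIL_LINE and strip_address_lines and the sample text are hypothetical. Python's re module has no \R and no unscoped (?-s), but its "." excludes "\n" unless re.DOTALL is set.

```python
import re

# Rough Python analogue of (?-s)^(?:ADDRESS(-.*?)*|EMAIL):.*?(?:\R|\Z)
ADDRESS_OR_EMAIL_LINE = re.compile(
    r"^(?:ADDRESS(?:-.*?)*|EMAIL):.*?(?:\r?\n|\Z)",
    re.MULTILINE,
)

def strip_address_lines(text: str) -> str:
    """Delete each line that starts with ADDRESS...: or EMAIL:, newline included."""
    return ADDRESS_OR_EMAIL_LINE.sub("", text)

sample = (
    "ADDRESS:\n"
    "ADDRESS-CITY: Christmas\n"
    "ADDRESS-STATE-PROVINCE: RN\n"
    "EMAIL: someone@example.com\n"
    "SOMETHING-ELSE: value\n"
)
print(strip_address_lines(sample), end="")
```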

    I recommend getting the expression working with one file; once that works, then you can move on to using the Find in Files for all your files.
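
    If scripting is an option, the "all your files" step could look roughly like the Python sketch below. It is not what Find in Files does internally; it assumes UTF-8 text files with a .txt extension, and clean_folder is a hypothetical helper. It rewrites files in place, so back up first. The demo runs on a temporary folder only.

```python
import re
import tempfile
from pathlib import Path

LINE = re.compile(r"^(?:ADDRESS(?:-.*?)*|EMAIL):.*?(?:\r?\n|\Z)", re.MULTILINE)

def clean_folder(folder: Path, pattern: str = "*.txt") -> int:
    """Rewrite every matching file in place; return how many files changed."""
    changed_files = 0
    for path in sorted(folder.glob(pattern)):
        # newline="" keeps the file's own line endings untouched.
        with path.open("r", encoding="utf-8", newline="") as fh:
            original = fh.read()
        cleaned = LINE.sub("", original)
        if cleaned != original:
            with path.open("w", encoding="utf-8", newline="") as fh:
                fh.write(cleaned)
            changed_files += 1
    return changed_files

# Demo on a throwaway folder so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.txt").write_text(
        "ADDRESS: x\nNAME: y\nEMAIL: z@example.com\n", encoding="utf-8"
    )
    (folder / "b.txt").write_text("NAME: only\n", encoding="utf-8")
    changed = clean_folder(folder)
    remaining = (folder / "a.txt").read_text(encoding="utf-8")

print(changed, repr(remaining))
```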

    With those settings, this block of text:

    ADDRESS:
    ADDRESS-CITY: Christmas
    ADDRESS-STATE-PROVINCE: RN
    ADDRESS-POSTALCODE: 59054550
    ADDRESS-COUNTRY: BRAZIL
    EMAIL: mjnhx@globo.com
    You tell us nothing about the remainder of the file, so I don't know whether
    the following lines match your pattern, or whether they don't:
    SOMETHING-ELSE: value
    MORE-COLONED-LINES: here
    For now, I'll assume you want to keep everything except lines that 
    start with "ADDRESS...:" or "EMAIL:"
    

    would be edited to:

    You tell us nothing about the remainder of the file, so I don't know whether
    the following lines match your pattern, or whether they don't:
    SOMETHING-ELSE: value
    MORE-COLONED-LINES: here
    For now, I'll assume you want to keep everything except lines that 
    start with "ADDRESS...:" or "EMAIL:"
    

    Of course, this is still making lots of assumptions. Other possible interpretations are that you want the first six lines of any file to be deleted, whatever the text. And it might be that the “SOMETHING-ELSE:” I indicated in the example text might also be “ADDRESS:”, in which case we’d have to tweak my regex to limit those matches to the first lines of a file, because mine assumes that any lines starting with “ADDRESS…:” or “EMAIL:” will be deleted.

    It would be easier to help you if you’d give all the information we need at once, rather than doling it out piecemeal. As explained below, a good example would have examples of lines to match and lines not to match, and would show us both the before and after. A good example will also be properly formatted using Markdown (like my example was) – links to Markdown help and regex help are in the boilerplate below.

    -----
    FYI: I often add this to my response in regex threads, unless I am sure the original poster has seen it before. Here is some helpful information for finding out more about regular expressions, and for formatting posts in this forum (especially quoting data) so that we can fully understand what you’re trying to ask:

    This forum is formatted using Markdown, with a help link buried on the little grey ? in the COMPOSE window/pane when writing your post. For more about how to use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end. It is very important that you use these formatting tips – using single backtick marks around small snippets, and using code-quoting for pasting multiple lines from your example data files – because otherwise, the forum will change normal quotes ("") to curly “smart” quotes (“”), will change hyphens to dashes, will sometimes hide asterisks (or if your text is c:\folder\*.txt, it will show up as c:\folder*.txt, missing the backslash). If you want to clearly communicate your text data to us, you need to properly format it.

    If you have further search-and-replace (“matching”, “marking”, “bookmarking”, regular expression, “regex”) needs, study this FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting, see the paragraph above.

    Please note that for all regex and related queries, it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.



  • @PeterJones said:

    (?-s)^(?:ADDRESS(-.*?)*|EMAIL):.*?(?:\R|\Z)

    This command worked perfectly on all files in a given folder. All lines starting with ADDRESS and EMAIL were automatically deleted, as desired.
    I am very pleased and grateful for this important help.
    Only three files did not have their email deleted, because the email line in those files does not have the word EMAIL at the beginning of the line.
    P.S. Would it be possible to extend this command to also match any line that contains the @ character?



  • @PeterJones said:

    @Nicholas-Wetzel: Welcome to the Notepad++ Community.

    Example of lines I want to keep:
    Example of lines I want to delete:

    Thank you for clearly specifying both. That helps us help you.

    Using the regex ^.*:.{1,7}(\R+|\z) to find, with replace being empty, should delete those lines

    Mind checking my new thread here please?

    https://notepad-plus-plus.org/community/topic/18149/sorting-login-information



  • @PeterJones
    Hello sir,
    I need help with Notepad++; it would be really appreciated.
    List:
    kkkkk:123456
    kkkkk:aaaaaa
    kkkkk:a123456
    kkkk:123456a
    Examples of lines I want to delete:
    kkkkk:123456
    kkkkk:aaaaaa
    I want to delete every line where the part after “:” contains only numbers or only letters.



  • @Hoang-Ngoc

    With data:

    kkkkk:123456
    kkkkk:aaaaaa
    kkkkk:a123456
    kkkk:123456a
    kkkkk:zzzzz
    

    FIND = (?-s)^.*:([[:alpha:]]+|[[:digit:]]+)(\R|\z)
    REPLACE = empty
    SEARCH MODE = regular expression
    yields

    kkkkk:a123456
    kkkk:123456a
    

    The logic I used: you wanted to delete the whole line, so I had to start with “from the start of the line, any character”; you said it came after a colon, so “followed by a colon”; then “followed by either a group of all letters or a group of all numbers”, then “followed by the end of the line (or end of the file)”. I then translated those into regex tokens.
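
    The same token-by-token translation can be tried in Python as a rough sketch. Python's re module has no [[:alpha:]] or [[:digit:]], so ASCII ranges stand in for them here (an assumption: non-ASCII letters are not considered). The names below are hypothetical.

```python
import re

# start of line, any characters, a colon, then all letters OR all digits,
# then the end of the line (or the end of the text)
ONLY_LETTERS_OR_ONLY_DIGITS = re.compile(
    r"^.*:(?:[A-Za-z]+|[0-9]+)(?:\r?\n|\Z)",
    re.MULTILINE,
)

def drop_single_class_lines(text: str) -> str:
    """Delete lines whose part after the colon is only letters or only digits."""
    return ONLY_LETTERS_OR_ONLY_DIGITS.sub("", text)

data = (
    "kkkkk:123456\n"
    "kkkkk:aaaaaa\n"
    "kkkkk:a123456\n"
    "kkkk:123456a\n"
    "kkkkk:zzzzz\n"
)
print(drop_single_class_lines(data), end="")
```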

    ----

    Please note: This Community Forum is not a data transformation service; you should not expect to be able to always say “I have data like X and want it to look like Y” and have us do all the work for you. If you are new to the Forum, and new to regular expressions, we will often give help on the first one or two data-transformation questions, especially if they are well-asked and you show a willingness to learn; and we will point you to the documentation where you can learn how to do the data transformations for yourself in the future. But if you repeatedly ask us to do your work for you, you will find that the patience of usually-helpful Community members wears thin. The best way to learn regular expressions is by experimenting with them yourself, and getting a feel for how they work; having us spoon-feed you the answers without you putting in the effort doesn’t help you in the long term and is uninteresting and annoying for us.

    ----

    Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the </> toolbar button or manual Markdown syntax. To make regex in red (and so they keep their special characters like *), use backticks, like `^.*?blah.*?\z`. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.



  • @PeterJones

    What about before “:”, my website request “Username can only contain the allowed characters: uppercase letters, lowercase letters, numbers (a-z, A-Z, 0-9), underscores, dashes and periods. Username must begin or end with a letter or number and must contain at least one letter.” and “Account name must have 6-15 characters”
    I wanna delete line not follow the rule



  • @Hoang-Ngoc said in How to mark lines with under "x" characters after : in a line.:

    What about before “:”, my website request “Username can only contain the allowed characters: uppercase letters, lowercase letters, numbers (a-z, A-Z, 0-9), underscores, dashes and periods. Username must begin or end with a letter or number and must contain at least one letter.” and “Account name must have 6-15 characters”

    and then you deleted that and wrote

    I wanna delete line not follow the rule

    Well, that changes things. Thanks for wasting my time while I was writing up deleting everything that didn’t follow that rule. I’ll edit what I was in the middle of…

    -----

    The least you can do is ask complete questions and at least attempt to make your posts make sense (for example, the preview window should have shown you that it was rendering your new text as if it were part of my quoted message, before you deleted it).

    As I said earlier, this forum is not a data transformation service. So you’ll get one more freebie from me. But you’ve got to try to put more effort in if you’re going to be asking people for help. If you want to do many search-and-replace, you’re going to have to read the official Notepad++ regular expression docs, which I already linked for you before, and have now linked again.

    To allow uppercase, lowercase, numbers, underscores, dashes, and periods, you can use the character class [a-zA-Z0-9_.-]. To indicate a specific quantity, you can use {N,M}, where N and M are the range you want to allow. For the more restrictive letter-or-number only for the first and last character, use [a-zA-Z0-9] without the other characters. Put that all together: since you want a restrictive followed by N-M less restrictive, followed by a restrictive, the N-M will need to be a range that is two less than the actually-allowed range, so 4-13. Thus, [a-zA-Z0-9][a-zA-Z0-9_.-]{4,13}[a-zA-Z0-9]. And, as before, you need a start-of-line anchor, and want to have the colon after. But this is what’s allowed, and you want to delete what’s not allowed. Since you now want to delete any that match the rules, that’s slightly easier.

    FIND = (?-s)^[a-zA-Z0-9][a-zA-Z0-9_.-]{4,13}[a-zA-Z0-9]:.*(\R|\z)

    Actually, that almost did it.

    short:blah123
    thisIs2good:blah123
    toooverlylongouidiot:blah123
    bad'character:blah123
    ok-char:blah123
    1234-6789:blah123
    -badStart:blah123
    badEnd_:blah123
    1ok_again2:blah123
    

    becomes

    short:blah123
    toooverlylongouidiot:blah123
    bad'character:blah123
    -badStart:blah123
    badEnd_:blah123
    

    You’ll notice that the username=1234-6789 line was deleted, even though it didn’t contain at least one letter. That’s because getting the “at least one letter” is hard. So I want to handle that separately.

    Before doing the regex shown above, do a FIND = ^[0-9_.-]{6,15}:.*$ and REPLACE = !KEEPME!$0, which will give an intermediate:

    short:blah123
    thisIs2good:blah123
    toooverlylongouidiot:blah123
    bad'character:blah123
    ok-char:blah123
    !KEEPME!1234-6789:blah123
    -badStart:blah123
    badEnd_:blah123
    1ok_again2:blah123
    

    Now do the one I showed earlier: (?-s)^[a-zA-Z0-9][a-zA-Z0-9_.-]{4,13}[a-zA-Z0-9]:.*(\R|\z) =>

    short:blah123
    toooverlylongouidiot:blah123
    bad'character:blah123
    !KEEPME!1234-6789:blah123
    -badStart:blah123
    badEnd_:blah123
    

    Now do FIND = ^!KEEPME! and REPLACE = empty to get rid of that indicator.

    short:blah123
    toooverlylongouidiot:blah123
    bad'character:blah123
    1234-6789:blah123
    -badStart:blah123
    badEnd_:blah123
    

    Now you only show the usernames that violate your rules.
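
    For comparison, here is a rough Python sketch of the same rules as a single check per line instead of three passes. It is an alternative route, not the Notepad++ procedure itself; it reads the rule as "begin and end with a letter or number", as the regex above does, and is_valid_username and violating_lines are hypothetical names.

```python
import re

# 6-15 characters: a letter/digit at each end, and 4-13 characters from
# letters, digits, underscore, period, dash in between.
SHAPE = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9_.-]{4,13}[a-zA-Z0-9]")

def is_valid_username(name: str) -> bool:
    """Shape rule plus the separate "at least one letter" rule."""
    return SHAPE.fullmatch(name) is not None and any(c.isalpha() for c in name)

def violating_lines(lines):
    """Keep only the lines whose username (the part before the first colon) breaks a rule."""
    return [line for line in lines if not is_valid_username(line.split(":", 1)[0])]

data = [
    "short:blah123",
    "thisIs2good:blah123",
    "thisUsernameIsTooLong1:blah123",
    "bad'character:blah123",
    "ok-char:blah123",
    "1234-6789:blah123",
    "-badStart:blah123",
    "badEnd_:blah123",
    "1ok_again2:blah123",
]
for line in violating_lines(data):
    print(line)
```

    Checking "at least one letter" in ordinary code avoids the temporary !KEEPME! marker, at the cost of leaving the editor.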



  • @PeterJones

    Thank you so much. I really appreciate what you are doing for this community; keep it up.

