Very simple regex question



  • I need to find and insert hundreds of missing periods in a book. Probably the best way is to look for a lowercase character followed by a space and then an uppercase character. How would the search be coded? The second part of the question is whether I can use Replace to insert the period while keeping the characters that were found, or whether I would have to insert each period manually.
    Periods are also missing from the ends of paragraphs. This one I could probably figure out, but I would appreciate help here too.
    I said it was simple! Thank you.



  • As you are obviously aware, regexes can only do so much when it comes to actual human-language based text (e.g. “Can I go?” would get an erroneous period and become “Can. I go?”). That being said, they can certainly help.

    look for a lowercase character followed by a space and then an uppercase

    The regex would look like this (make sure Match Case is checked):

    ([a-z]) ([A-Z])
    

    And for the replacement string you would use:

    \1. \2
    

    A bit of explanation.

    • [a-z] matches a single character between a and z
    • [A-Z] matches a single character between A and Z
    • Wrapping something in () saves whatever text that part matched. This is called a group.
    • In the replacement string, \1 refers to the text saved by the first group and \2 to the text saved by the second group.
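
    For anyone who wants to try the same search and replace outside the editor, here is a minimal sketch in Python. It uses the pattern and replacement from this post unchanged; the function name insert_periods and the sample sentences are made up for illustration. Python's re module is case-sensitive by default, which plays the role of Match Case here.

```python
import re

# Pattern and replacement taken from the post: a lowercase letter, one space,
# then an uppercase letter. Both letters are captured so they can be put back.
PATTERN = re.compile(r"([a-z]) ([A-Z])")


def insert_periods(text: str) -> str:
    """Insert a period after a lowercase letter that precedes ' X' (X uppercase)."""
    # \1 and \2 re-insert the two captured letters around the new ". "
    return PATTERN.sub(r"\1. \2", text)


print(insert_periods("the rain stopped We went outside"))
# -> the rain stopped. We went outside

# The false positive mentioned above: the pronoun "I" also matches.
print(insert_periods("Can I go?"))
# -> Can. I go?
```

    The second call shows why the results still need a read-through: proper nouns and the pronoun “I” in mid-sentence match just as well as a real sentence boundary.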

    Periods are also missing from ends of paragraphs

    I guess it depends on how “paragraphs” are defined. If a paragraph ends with 2 newlines in a row, then it may help to know that \R matches a single newline (line break).
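
    As a rough sketch of the paragraph case, assuming paragraphs are separated by exactly one blank line (two line breaks in a row): capture the last lowercase letter and the two line breaks, then put a period between them. Python's re module has no \R, so the sketch spells a line break as \r?\n instead. In an editor that supports \R, the analogous search would presumably be ([a-z])(\R\R) with the replacement \1.\2, but I have not tested that. The function name and sample text are made up for illustration.

```python
import re

# Assumption: a paragraph ends where a lowercase letter is followed by two
# line breaks. \r?\n covers both Unix (\n) and Windows (\r\n) line endings.
PARAGRAPH_END = re.compile(r"([a-z])((?:\r?\n){2})")


def add_paragraph_periods(text: str) -> str:
    """Add a period to paragraphs that end in a lowercase letter."""
    # \1 is the last letter, \2 is the untouched pair of line breaks.
    return PARAGRAPH_END.sub(r"\1.\2", text)


sample = "first paragraph\n\nSecond paragraph.\n\nlast paragraph"
print(add_paragraph_periods(sample))
```

    Only the first paragraph changes in this sample. The second already ends with a period, and the last one is not followed by a blank line, so the final paragraph of a file would need a separate check.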



  • Thank you! I appreciate your reply as it also teaches. This will save me many hours of work.
    I am not sure whether this would be a good idea, but how would I eliminate “I” from ([A-Z])?
    “I” of course is often used within a sentence and does not need a period. However, it often also begins a sentence, so I may simply have to handle each occurrence as it comes up. As you say, “regexes can only do so much when it comes to actual human-language based text”.



  • how would I eliminate “I” from ([A-Z])

    [A-HJ-Z]
    

    This is probably the most straightforward way of doing it: the class covers A through H and J through Z, so the only uppercase letter it skips is I.
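
    Here is a small Python sketch of that character class in use, plus one possible variation that is not from this thread: a negative lookahead (?!I\b) that skips only the standalone word “I” rather than every word starting with I. Both function names and the sample sentences are made up for illustration, and the lookahead version is an assumption about what you might want, not a recommendation.

```python
import re

# From the post: any uppercase letter except I.
NO_CAPITAL_I = re.compile(r"([a-z]) ([A-HJ-Z])")

# Variation (not from the thread): skip only the standalone word "I".
# (?!I\b) fails when the next word is exactly "I", but allows "It", "In", ...
NO_PRONOUN_I = re.compile(r"([a-z]) (?!I\b)([A-Z])")


def skip_capital_i(text: str) -> str:
    """Insert periods, ignoring every word that starts with a capital I."""
    return NO_CAPITAL_I.sub(r"\1. \2", text)


def skip_pronoun_i(text: str) -> str:
    """Insert periods, ignoring only the standalone pronoun 'I'."""
    return NO_PRONOUN_I.sub(r"\1. \2", text)


print(skip_capital_i("Can I go now She said yes"))
# -> Can I go now. She said yes

# [A-HJ-Z] also skips sentences that begin with "It", "In", "If", ...
print(skip_capital_i("it rained It stopped"))
# -> it rained It stopped

print(skip_pronoun_i("it rained It stopped"))
# -> it rained. It stopped
```

    Either way, a sentence that really does begin with “I” is left without its period, so those occurrences still have to be handled by hand.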

