Very simple regex question
-
I need to search for and insert hundreds of missing periods in a book. Probably the best way is to look for a lowercase character followed by a space and then an uppercase character. How would the search be coded? A second part of the question is whether I can use Replace to replace the characters that were found, or would I have to insert the period manually?
Periods are also missing from ends of paragraphs – this one I could probably figure out but would appreciate help here too.
I said it was simple! Thank you. -
As you are obviously aware, regexs can only do so much when it comes to actual human-language based text (e.g. “Can I go?” would get an erroneous period). That being said, they can certainly help.
look for a lowercase character followed by a space and then an uppercase
The regex would look like this (make sure Match Case is check marked):
([a-z]) ([A-Z])
And for the replacement string you would use:
\1. \2
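If you want to try the search and replacement outside the editor first, here is a rough sketch using Python's `re` module as a stand-in. Python is my assumption here, not something from the question, and `insert_periods` is just a made-up helper name. Python's matching is case-sensitive by default, which corresponds to having Match Case checked.

```python
import re

# Same search pattern as above: a lowercase letter, one space, an uppercase letter.
PATTERN = re.compile(r"([a-z]) ([A-Z])")

def insert_periods(text):
    # \1 and \2 put the two captured letters back, with ". " between them.
    return PATTERN.sub(r"\1. \2", text)

fixed = insert_periods("He left early She stayed")
# The caveat mentioned above: "Can I go?" also matches and gets a wrong period.
false_positive = insert_periods("Can I go?")

print(fixed)
print(false_positive)
```

The second call is there on purpose, to show the kind of false hit you would still have to review by hand.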
A bit of explanation.
- [a-z] matches a single character between a and z
- [A-Z] matches a single character between A and Z
- Wrapping something in () saves whatever is inside it. This is called a group. \1 refers to the first group and \2 refers to the second group.
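To see what the two groups actually hold, here is a small sketch, again assuming Python's `re` module rather than the editor's own dialog. The sample sentence is made up for illustration.

```python
import re

# Find the first place where a lowercase letter, a space and an uppercase letter meet.
match = re.search(r"([a-z]) ([A-Z])", "the end The start")

whole_match = match.group(0)   # everything the pattern matched
first_group = match.group(1)   # what ([a-z]) saved, i.e. \1
second_group = match.group(2)  # what ([A-Z]) saved, i.e. \2

print(repr(whole_match), repr(first_group), repr(second_group))
```

Only the space between the two letters gets replaced in effect, because the replacement writes both saved letters back.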
Periods are also missing from ends of paragraphs
I guess it depends how “paragraphs” are defined. If it is 2 newlines in a row then knowing that \R means a single newline might be helpful. -
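As one possible sketch of the paragraph case, assuming a paragraph ends at 2 newlines in a row (or at the very end of the text) and that line endings are plain `\n`: Python's `re` module does not support `\R`, so `\n` stands in for it here. The helper name `end_paragraphs` is made up.

```python
import re

# A lowercase letter that is followed by a blank line, or by the end of the text
# (optionally after one final newline). The lookahead checks without consuming,
# so the newlines themselves are left untouched.
PARAGRAPH_END = re.compile(r"([a-z])(?=\n\n|\n?\Z)")

def end_paragraphs(text):
    return PARAGRAPH_END.sub(r"\1.", text)

sample = "first paragraph\n\nSecond one ends here.\n\nline one\nline two\n\nlast"
result = end_paragraphs(sample)
print(result)
```

A paragraph that already ends in a period is skipped because the period is not a lowercase letter, and a single newline inside a paragraph does not count as a paragraph end.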
Thank you! I appreciate your reply as it also teaches. This will save me many hours of work.
I am not sure whether this would be a good idea, but how would I eliminate “I” from ([A-Z])?
“I” of course is often used within a sentence and does not need a period – however it often also begins a sentence so I may simply have to handle each occurrence as it comes up. … As you say “regexs can only do so much when it comes to actual human-language based text” -
how would I eliminate “I” from ([A-Z])
[A-HJ-Z]
This is probably the most straightforward way of doing it.
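A quick sketch of what that change does, assuming Python's `re` module as a test bed (the sample sentences and the helper name are made up). The range [A-HJ-Z] covers A to H and J to Z, so a standalone “I” no longer triggers a period, but a sentence that really does begin with “I” is skipped as well.

```python
import re

# Same idea as before, with "I" left out of the uppercase range.
PATTERN_NO_I = re.compile(r"([a-z]) ([A-HJ-Z])")

def insert_periods_skip_i(text):
    return PATTERN_NO_I.sub(r"\1. \2", text)

kept = insert_periods_skip_i("Can I go now He left")
missed = insert_periods_skip_i("it rained I stayed home")

print(kept)
print(missed)
```

So the sentences that start with “I” would still have to be handled one at a time.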