Help coding text errors
Declan Conner last edited by
As a seventy-year-old author, try as I may, working out regular expression codes just doesn’t sink in. I’d really appreciate some help if it’s possible to code these problem errors to find them in Notepad++.
Here are examples of errors I am trying to find in a text document. I’ve put the errors in brackets.
“Look at the time(.)” I said. (should be a comma and not a period.)
“Yes, I can see we’re late( )” I replied. (missing comma.)
“We should order a taxi(,)” (should be a period.)
“It’s cheaper by bus(.)” I said(,) “Better we take the bus.” (Should be a comma.) (should be a period.)
He rubbed his hand together( ) A cold draught blew through the gap in the door. (Narrative, missing a period where a new sentence follows with a capital letter.)
PeterJones last edited by
Don’t worry. It’s not just you. Regex engines – or at least, the ones I know of, like the one included in Notepad++ or in the programming languages Perl or Python – don’t have the level of AI required to accomplish your goal.
For example, being able to distinguish:
"Look at the time." George said.
"Look at the time," George said this while whisking out the door.
and knowing that you really want a comma in the first and a period in the second, is not a simple task. For that exact example, yes, I could come up with a regex that would work. But for all the variations that the English language would accept – you would really need machine learning / artificial intelligence, or a huge program with all the rules from your high school grammar text or your favorite style guide encoded into specific computer-understandable rules. That’s not going to fit in a 20-character or even 200-character regex.
If your goal were to find one of those types of problems in a limited circumstance, you might be able to accomplish it with a complicated regex – but for any of the errors you gave, if I came up with a regex that found it, I would likely be able to find at least one correct version in English that the regex would incorrectly mark as matching the bad pattern.
The general purpose grammar checker that you are describing is not going to be a simple search-and-replace regex in Notepad++.
Some word processors, like Microsoft Word, have grammar checkers. But I just put in a handful of those mistakes, and the only two things that Word pointed out were not liking “I said.” as a complete sentence (they actually said “spoke” would be a better word), and “I grabs the ball” told me to check whether I was staying consistent between singular and plural.
It didn’t catch any of the mistakes you mentioned… and that’s with the power of Microsoft’s programming team behind it.
I see ads for Grammarly all the time; something like that might work better (though it will probably cost you).
But there are no grammar plugins (just spell check plugins) for Notepad++ that I know of. And regex won’t truly accomplish your goal.
About the best you could do is search for a letter, a period, or a comma before a quote, then a space, and have Find Next cycle through all of those. Then you apply the amazing power of the 70 years of lingual training that your neural network has received during your lifetime to each of those instances, deciding for yourself whether they are right or not. Searching for
[[:alpha:],\.][”"]( |$)
with regular expression mode will find any alphabetical character or a comma or a period, followed by an end curly quote or normal ASCII quote, followed by a space or the end of the line.
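For anyone who would rather list the candidates outside Notepad++, here is a minimal Python sketch of the same idea. It is an approximation, not the Notepad++ search itself: Python's `re` module has no `[[:alpha:]]` class, so `[A-Za-z]` stands in for it (ASCII letters only), and the function name `find_candidates` and the sample lines are made up for illustration. It only lists places to review; deciding whether each one is right is still up to the reader.

```python
import re

# Approximation of [[:alpha:],\.][”"]( |$) from the post above.
# Assumption: [A-Za-z] replaces [[:alpha:]], which Python's re does not support.
CANDIDATE = re.compile(r'[A-Za-z,.][”"](?: |$)')

def find_candidates(text):
    """Return (line_number, matched_text) for every closing quote worth reviewing."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in CANDIDATE.finditer(line):
            hits.append((line_no, match.group()))
    return hits

# Hypothetical sample text based on the examples in the first post.
sample = "\n".join([
    '“Look at the time.” I said.',
    '“We should order a taxi,”',
    'He rubbed his hands together.',
])

for line_no, matched in find_candidates(sample):
    print(line_no, repr(matched))
```

The third sample line has no closing quote, so it is not reported. Both correct and incorrect punctuation are reported alike, which is the point made above: the pattern narrows down where to look, nothing more.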
Also, as much of a fan of Notepad++ as I am, I have to say: as a serious author, I am not sure that a text editor is the right tool for the trade; I would think a Word Processor at the minimum, and maybe even fancier publishing-end software would be better for authoring text (books, magazine articles, etc). But if you prefer to stick with pure text, that’s your choice.
Declan Conner last edited by
Unfortunately, Grammarly isn’t much good either, nor are many others.
I used this below to find the following errors for quotes that Word and Grammarly and many others can’t find, so I was hoping it could do the same for period and comma errors. I’m not looking to replace, as I would have my Word doc open and simply work between Notepad++ and Word to put them right.
These are all the errors that notepad++ found using the above
“Rosa, we have to get out of this gulley.( ) (missing end quote)
( )That rain the other day is taking its toll,” I said, and handed the leaf to Rosa. (missing start quote)
(‘)Best we head back to the bunker.” (single quote with double end quote)
(” )Will they attack us?” (Quote the wrong way around due to space)
“Jet, let go,” I said.(”) (quote after said is in error)
“Good, boy, let’s go.(’) (single end quote)
Giant nodded, running his finger down the list.(”) (Narrative shouldn’t have a quote)
carypt last edited by
this is my poor attempt to help . it might contain error possibilities : so be careful . i haven't considered line-breaks (end of line) in my try . please correct me if you know . i would also suggest to use the search / mark function and then manually correct the mistakes step by step .
i see one problem-case always has quotes at start and end , and sometimes ongoing sentence , also upper and lower case .
search phrases example 1 (period into comma) :
("\u.*)(\.)(")( \u.*)(\.)
replace :
$1,$3$4$5
(the brackets define arguments and $number define the numbered backreferences for the former arguments , dot is any character , * means repetition , \. (escaped dot) means the real dot-character , \u means upper case character , \l lower case character , \s is space character)
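To see what example 1 does, and why the "be careful" warning matters, here is a rough Python translation. It is a sketch under assumptions, not the Notepad++ behaviour itself: Python's `re` has no `\u` class, so `[A-Z]` stands in for it, backreferences are written `\1` instead of `$1`, and the function name `period_to_comma` and the test sentences are made up for illustration.

```python
import re

# carypt's example 1, translated for Python's re module.
# Assumption: [A-Z] stands in for \u (upper case character).
SEARCH = re.compile(r'("[A-Z].*)(\.)(")( [A-Z].*)(\.)')
# Python writes backreferences as \1 where Notepad++ uses $1.
REPLACE = r'\1,\3\4\5'

def period_to_comma(line):
    """Swap the period before a closing ASCII quote for a comma."""
    return SEARCH.sub(REPLACE, line)

# The intended fix: the period inside the quote becomes a comma.
print(period_to_comma('"Look at the time." I said.'))

# A sentence that was already correct gets changed as well.
print(period_to_comma('"Look at the time." George whisked out the door.'))

# Curly quotes are not matched at all, so this line comes back unchanged.
print(period_to_comma('“Look at the time.” I said.'))
```

The second call shows the kind of false positive PeterJones described: the pattern cannot tell a dialogue tag from a new sentence. The third shows that the search as written only sees straight ASCII quotes, so text with curly quotes would need the quote characters added to the pattern. Using Find Next or Mark and fixing by hand, as suggested above, avoids both problems.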
example 2 (missing comma) : s:
example 3 (comma into period) s:
example 4 s:
example 5 (finding missing period) s :