Regex: What is the difference between Normal Replace, Extended Replace and . Matches Newline?
-
Hi, can anyone tell me what the difference is, in regex, between Normal Replace and . Matches Newline? Also, when is Extended Replace used?
-
The documentation describes extended search mode and regular expression search mode.
But in brief: “Normal” search mode just searches for literal characters, and cannot look for special characters or for fancy “patterns”. “Extended” search mode can search for normal characters and a brief list of special characters “escapes” that allow for searching for special characters like newlines and tabs. “Regular Expression” mode has a much larger selection of escapes and special characters, and can also search for things like “the beginning of the line” or “zero or more copies of the previous character” or other fancy things like that.
“. Matches Newline” only affects “Regular Expression” (regex) searches. In regex mode, the `.` character normally matches any character except for the newline characters; if that checkbox is checked, then `.` also matches the newline characters.
-
Hello, @hellena-crainicu and All,
First of all, Hellena, you are asking about two different things: the Search mode, on the one hand, and an option of the Regular expression search mode, on the other!
- The Search mode, which can be:
  - Normal: All the characters in the Find what: zone are treated as literal characters, without any interpretation. However, this statement is not totally exact: it depends on the status of the Match case option! For example, if you’re searching for the word `License`:
    - It will match only the exact string `License` if the Match case option is checked
    - It will match any form, like `LICENSE`, `License`, `license`, but also strings such as `liCENSe`, `liCENSE`, `LiCeNsE`, … if the Match case option is unchecked
  - Extended: In this mode, almost all the characters are treated as literal characters, without any interpretation. However, 5 special characters can be found with a specific syntax, in the Find what: and/or the Replace with: zone:
    - The Null character (`\x00`) with the `\0` syntax
    - The Tabulation character (`\x09`) with the `\t` syntax
    - The New Line character (`\x0A`) with the `\n` syntax
    - The Carriage Return character (`\x0D`) with the `\r` syntax
    - The Backslash character (`\x5C`) with the `\\` syntax
  - In addition, in the Extended mode, any ANSI character can be matched by its character code, in base 10, 8, 2 or 16:
    - in DECIMAL, from `\d000` to `\d255` (3 digits, between 0 and 9)
    - in OCTAL, from `\o000` to `\o377` (3 digits, between 0 and 7)
    - in BINARY, from `\b00000000` to `\b11111111` (8 digits, between 0 and 1)
    - in HEXADECIMAL, from `\x00` to `\xFF` (2 hexadecimal chars, between 0 and 9 and/or between A and F)
  - Note that the remark about the Match case option, in Normal search mode, is also valid in the Extended and Regular expression modes!
  - Regular Expression: As you know, in this search mode, a lot of constructs have a special meaning. For people not acquainted with these notions, first consult some tutorials from this FAQ post: https://community.notepad-plus-plus.org/topic/15765/faq-desk-where-to-find-regular-expressions-regex-documentation/2
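To make the Extended escapes listed above concrete, here is a minimal Python sketch that expands them into the characters they stand for. It only models the syntax as described in this post; it is not how Notepad++ itself is implemented. The function name `expand_extended` is made up for the illustration, and leaving malformed or out-of-range escapes untouched is an assumption of the sketch, not documented behaviour.

```python
import re

# The 5 simple escapes listed above for Extended mode.
SIMPLE = {"0": "\x00", "t": "\t", "n": "\n", "r": "\r", "\\": "\\"}

# Numeric notations: named group -> numeric base.
BASES = {"dec": 10, "oct": 8, "bin": 2, "hex": 16}

# One alternative per notation, with the digit counts given in the post.
_ESCAPE = re.compile(
    r"\\(?:(?P<simple>[0tnr\\])"
    r"|d(?P<dec>[0-9]{3})"
    r"|o(?P<oct>[0-7]{3})"
    r"|b(?P<bin>[01]{8})"
    r"|x(?P<hex>[0-9A-F]{2}))"
)


def _replace(match: re.Match) -> str:
    simple = match.group("simple")
    if simple is not None:
        return SIMPLE[simple]
    for name, base in BASES.items():
        digits = match.group(name)
        if digits is not None:
            value = int(digits, base)
            # Assumption: codes above 255 are left as typed.
            return chr(value) if value <= 255 else match.group(0)
    return match.group(0)


def expand_extended(text: str) -> str:
    """Expand Extended-style escapes (hypothetical helper for illustration)."""
    return _ESCAPE.sub(_replace, text)


# Four different notations for the same kind of lookup: Z, then A three times.
print(expand_extended(r"\d090\o101\b01000001\x41"))
```

The four numeric notations are just different spellings of the same character code, which is why `\o101`, `\b01000001` and `\x41` all give an `A`.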
- Now, the . matches newline option is a functional option for the Regular expression search mode only!
  - If the . matches newline option is unchecked, the dot regex symbol (`.`) matches a single standard character. That is, any character from the BMP, from `\x{0000}` to `\x{FFFD}`, EXCEPT the six chars `\x{000A}` (New Line), `\x{000C}` (Form Feed), `\x{000D}` (Carriage Return), `\x{0085}` (Next Line), `\x{2028}` (Line Separator) and `\x{2029}` (Paragraph Separator)
  - If the . matches newline option is checked, the dot regex symbol (`.`) matches absolutely any character from the Basic Multilingual Plane (BMP), from `\x{0000}` to `\x{FFFD}`, with no exception!
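The effect of that checkbox can be tried outside Notepad++ too. The sketch below uses Python's `re` module as an analogy only: its `re.DOTALL` flag plays the role of . matches newline. Note that it is a different regex engine, and by default Python's `.` excludes only `\n`, not the six characters listed above.

```python
import re

text = "first line\nsecond line"

# Without DOTALL, "." stops at the newline, so ".+" covers only the first line.
without_flag = re.match(r".+", text).group(0)

# With DOTALL (the analogue of ticking ". matches newline"), "." crosses the newline.
with_flag = re.match(r".+", text, flags=re.DOTALL).group(0)

print(repr(without_flag))
print(repr(with_flag))
```

So a pattern like `.+` is confined to one line in the first case, and can swallow the whole text in the second.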
Last recommendation: in Extended search mode, it’s best to uncheck the Match whole word only option, to avoid unpredictable results!
Best Regards,
guy038
P.S.:
From the above, we can deduce that the Extended search mode can be replaced, in most cases, by the Regular expression search mode!
Only two specific notations of the Extended search mode have no equivalent in the Regular expression search mode:
- The `\b########` notation, where each `#` represents a 0 or 1 binary digit (for instance, `\b01000001` matches an `A` letter)
- The `\d###` notation, where each `#` represents a digit from 0 to 9 (for instance, `\d090` matches a `Z` letter)
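Although these two notations cannot be typed as-is in regex mode (there, `\d` means a digit and `\b` a word boundary), the character they designate can still be written in hexadecimal. Here is a small Python sketch, with a made-up helper name `to_regex_escape`, that rewrites such a notation into the `\x{..}` form used earlier in this post. It is only a convenience for doing the base conversion, and assumes the code stays within 0 to 255.

```python
import re


def to_regex_escape(extended: str) -> str:
    """Rewrite an Extended \\d### or \\b######## notation as \\x{HH} (hypothetical helper)."""
    match = re.fullmatch(r"\\d([0-9]{3})|\\b([01]{8})", extended)
    if match is None:
        raise ValueError(f"not a \\d### or \\b######## notation: {extended!r}")
    if match.group(1) is not None:
        value = int(match.group(1), 10)   # decimal notation
    else:
        value = int(match.group(2), 2)    # binary notation
    if value > 255:
        raise ValueError(f"character code out of range: {value}")
    return f"\\x{{{value:02X}}}"


print(to_regex_escape(r"\d090"))       # the Z letter from the example above
print(to_regex_escape(r"\b01000001"))  # the A letter from the example above
```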
-
thank you @guy038