Grab Mobile Phone Number from wrap data

TCAST Mobile

Hi,
I have data like this:

healthcare
Pratik Mandiri dr.
socmed
balkisbalxxxx@gmail.com
socmed
+6208129311111
healthcare
Praktek Umum 
socmed
yulianaindixxxx@gmail.com
socmed
+6208154077777

How can I extract the name and mobile phone number from each record, like this:
Pratik Mandiri dr. | 6208129311111
Praktek Umum | 6208154077777

Or, as an alternative result, only the mobile phone numbers:
6208129311111
6208154077777

Moderator here, I’ve surrounded the example data in a code box. Please refer to the FAQ post here for the correct method of showing examples.

Mark Olson

Based only on the information you provided in your initial post, if you wanted to extract

Pratik Mandiri dr. | 6208129311111
Praktek Umum | 6208154077777

you would go to the Find/Replace form (Ctrl+H with normal keybindings), check Wrap Around, set Search Mode to Regular expression, and:
Find what: (?-si)healthcare\R(.*)\Rsocmed\R.*\Rsocmed\R\+(\d+)
Replace with: ${1} | ${2}

To understand the regular expression I wrote here, you can read the Notepad++ documentation on regular expressions.

In brief (italicized words or phrases are important concepts from regular expressions):

(?-si) turns off two options: with i off, all ASCII letters are matched case-sensitively (the default is to ignore case), and with s off, the . metacharacter matches only non-newline characters
The \R escape sequence matches any newline (this could be \r\n for Windows, \r for old Mac files, or \n for Linux)
.* matches an entire line (including an empty line)
The \d escape sequence matches any one of the characters 0123456789
The * metacharacter indicates that the preceding pattern should be matched 0 or more times
The + metacharacter indicates that the preceding pattern should be matched 1 or more times
The parentheses around (.*) and (\d+) create two capture groups, which are referenced in the Replace With as ${1} and ${2} respectively
The Replace With has its own special syntax and set of special sequences that is different from the syntax for the Find what. For instance, in the Replace With syntax, * and + and . have no special meaning; they’re just treated as normal characters.
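
For readers who want to try the same pattern outside Notepad++, here is a minimal Python sketch of it. It is not part of the original answer. Python's re module has no \R escape, so the NL fragment below is an assumed stand-in for it. The names NL, PATTERN, SAMPLE and extract_pairs are made up for illustration.

```python
import re

# Assumption: Python's re has no \R, so this alternation stands in for it.
NL = r"(?:\r\n|\n|\r)"

# Same structure as the Find what pattern. Python's re is case-sensitive and
# "." does not match \n by default, so the (?-si) prefix is not needed here.
PATTERN = re.compile(rf"healthcare{NL}(.*){NL}socmed{NL}.*{NL}socmed{NL}\+(\d+)")

SAMPLE = """healthcare
Pratik Mandiri dr.
socmed
balkisbalxxxx@gmail.com
socmed
+6208129311111
healthcare
Praktek Umum
socmed
yulianaindixxxx@gmail.com
socmed
+6208154077777
"""

def extract_pairs(text):
    """Return (name, number) tuples; strip() drops stray spaces or a trailing \\r."""
    return [(m.group(1).strip(), m.group(2)) for m in PATTERN.finditer(text)]

pairs = extract_pairs(SAMPLE)
for name, number in pairs:
    print(f"{name} | {number}")

# The "numbers only" result from the question just keeps the second group.
numbers_only = [number for _, number in pairs]
```

In the Notepad++ dialog, the numbers-only result should similarly come from using just ${2} in the Replace with field.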

Hopefully that’s enough information for you to tweak this regular expression as needed, in case what I gave you is not general enough to meet your needs.

guy038

Hello, @tcast-mobile, @mark-olson and All,

@tcast-mobile, just a variant which does not care about the socmed string but simply searches for the nearest multi-digit phone number, preceded by a + sign, at the beginning of a line:

FIND (?-is)healthcare\R(.+)\R(?:.+\R)+?\+(\d+\R)

REPLACE $1 | $2

Notes:

Most of the explanations of the regex syntax have already been given by Mark. Two other points:

The (?:......) syntax is a non-capturing group, as we do not need this information in replacement
The \+ represents a literal +
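
As a rough illustration of how the lazy non-capturing group skips a variable number of lines, here is a small Python sketch. It is an adaptation, not the exact regex above: Python's re has no \R, so NL is an assumed substitute, and the second group captures only the digits rather than the digits plus the line break. The sample records and the names NL, PATTERN and SAMPLE are hypothetical.

```python
import re

# Assumption: stand-in for \R, which Python's re does not support.
NL = r"(?:\r\n|\n|\r)"

# (?:.+NL)+? lazily consumes whole non-empty lines, one at a time, and stops
# as soon as the next line starts with a literal "+".
PATTERN = re.compile(rf"healthcare{NL}(.+){NL}(?:.+{NL})+?\+(\d+)")

# Hypothetical records with a different number of lines before the number.
SAMPLE = """healthcare
Clinic A
socmed
a@example.com
website
example.com
socmed
+620000000001
healthcare
Clinic B
socmed
+620000000002
"""

results = [(m.group(1).strip(), m.group(2)) for m in PATTERN.finditer(SAMPLE)]
for name, number in results:
    print(f"{name} | {number}")
```

One thing to watch for: if a record has no line starting with +, the lazy group keeps consuming lines into the following record, so the name would be paired with the next record's number.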

Best Regards,

guy038