Remove everything but specific chains
-
Good morning.
I have a large text file and would like to extract only occurrences of “amount xx”, where xx can be either letters or digits
(note that it could be up to 4 characters, but is always followed by a space).
Each extracted text should appear on a separate line. (I said extracted, but deleting everything else would do.)
A big thank you for your help.
-
@David-L said:
extract only occurrences of “amount xx” where xx can be either letters or digits (Note that it could be up to 4 characters but always followed by a space). Each extracted text should appear on a separate line.
Press Ctrl+m to bring up the Mark window.
Enter data into this window as follows:
Text form of Find what value:
(?-si)amount [A-Za-z0-9]{1,4}(?=\x20)
for easy copy+paste.
Press the Mark All button.
Press the Copy Marked Text button.
Your desired data is now in the clipboard.
For more information on the technique used to do this data extraction, see HERE.
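For anyone who would rather script the extraction than use the Mark dialog, here is a minimal Python sketch of the same pattern. It is only an illustration and not the technique linked above: the function name `extract_amounts` and the sample text are made up, and the leading `(?-si)` is left out because Python's `re` module is case-sensitive by default and the pattern contains no dot.

```python
import re

# Same pattern as the Find what value, minus the (?-si) prefix.
# (?=\x20) is a lookahead: a space must follow, but it is not part of the match.
PATTERN = re.compile(r"amount [A-Za-z0-9]{1,4}(?=\x20)")


def extract_amounts(text):
    """Return every 'amount xx' occurrence, in order of appearance."""
    return PATTERN.findall(text)


# Made-up sample covering a few edge cases.
sample = (
    "paid amount 12 today, amount AB1 later, "
    "Amount 99 skipped, amount 1234. end, amount 12345 no"
)

# One extracted occurrence per line, as requested in the question.
print("\n".join(extract_amounts(sample)))
```

Only "amount 12" and "amount AB1" are printed. The capitalised "Amount 99", the "amount 1234." followed by a period, and the five-character "amount 12345" are all skipped, because each of them breaks one part of the pattern.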
-
Hello.
Thank you for your answer.
It’s working well.
I have a few related questions (trying to understand).
If “amount” starts with a capital letter (A), it won’t be marked.
If “amount 1234” is followed by a period, it won’t be marked either.
In the example, we have 4 characters, but what would I do if I want to find something following this pattern: “Amount A1A1”
Or even “Amount A1A1 and B2B2”?
How long can the expression be?
-
If “amount” starts with a capital letter (A), it won’t be marked.
Because you didn’t say initially that case was not significant.
If “amount 1234” is followed by a period, it won’t be marked either
Because you said: “it could be up to 4 characters but always followed by a space”
-
Hello.
I know.
Sorry. I thought that with a simple answer I could figure out how to do a more complex one :)
(For the case, I totally missed it since I thought the checkbox would do.)
I can have several situations:
Amount 1234 (space)
amount 1234 (space)
amount 1234. (Period)
amounts 1234 and 4567
amount A1A1
I guess I can do 5 simple expressions and, once done, copy the marked text, but is it possible to have a single, more inclusive expression?
-
Maybe
(?i-s)amounts? [a-z0-9]{1,4}(?=[ .])
?
-
Hello, @david-l,
But, regarding the two last cases:
amounts 1234 and 4567
amount A1A1
do they need to be followed with a space or dot character as well?
Remember: regular expressions are a school of precision ! Each character and its position has its importance !
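To make the question concrete, here is a small Python sketch that runs the five sample lines through a translation of the regex suggested above. The names `STRICT`, `LOOSE`, `CASES` and `report` are made up for this illustration, and `re.IGNORECASE` stands in for the inline `(?i)` flag. `LOOSE` is only a hypothetical variant, on the assumption that the match may also end at the end of a line: it replaces the "space or dot must follow" lookahead with "no further letter or digit may follow".

```python
import re

# Translation of (?i-s)amounts? [a-z0-9]{1,4}(?=[ .])
STRICT = re.compile(r"amounts? [a-z0-9]{1,4}(?=[ .])", re.IGNORECASE)

# Hypothetical variant (an assumption, not from the thread):
# accept anything after the value except another letter or digit.
LOOSE = re.compile(r"amounts? [a-z0-9]{1,4}(?![a-z0-9])", re.IGNORECASE)

# The five situations listed earlier in the thread.
CASES = [
    "Amount 1234 ",
    "amount 1234 ",
    "amount 1234.",
    "amounts 1234 and 4567",
    "amount A1A1",
]


def report(pattern):
    """Map each sample line to the list of matches found in it."""
    return {line: pattern.findall(line) for line in CASES}


for line, hits in report(STRICT).items():
    print(repr(line), "->", hits)
```

With `STRICT`, the last line gives no match at all, because nothing follows "A1A1", and in the fourth line only "amounts 1234" is found. "4567" is never captured by either pattern, since it is not preceded by "amount" itself. With `LOOSE`, "amount A1A1" is matched at the end of the line.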
Best Regards,
guy038