Using RegEx to format text for import
-
I have this text
With lines like these
820 Review Conputer Process Annually No 2016-06-02 2019-05-31
1064 Review customer documentation Annually No 2016-07-25 2019-07-25
1282 Monitor and Clean up the Mailbox Weekly No 2015-09-21 2020-05-12
1644 Generic paragraph for monthly report Monthly Yes 2015-10-19 2017-04-13I want to keep the text how ever I need to set a delimiter between each field.
I created a regex to Identify the fields. However I cannot seem to figure out how to use it to “replace” the correct spaces to ; as field seperator.Sorry for the portato code…
(\d{3,4}).(?<=)(.) (?=(Weekly|Monthly|Semi-annually|Quarterly|Annually|Bi-monthly))(.)(No|Yes).(\d{4}-\d{2}-\d{2}).(\d{4}-\d{2}-\d{2})I need to get the above text to look like this
820;Review Conputer Process;Annually;No;2016-06-02;2019-05-31
1064;Review customer documentation;Annually;No;2016-07-25;2019-07-25
1282;Monitor and Clean up the Mailbox;Weekly;No;2015-09-21;2020-05-12
1644;Generic paragraph for monthly report;Monthly;Yes;2015-10-19;2017-04-13the first segment can be either 3 og 4 digits. There is no lenght on the second, the third can be Weekly,Monthly,Semi-annually,Quarterly,Annually or Bi-monthly and the forth one can be either Yes or No and the lat two is just date.
I have to do it one segment of a time, that I ok. Doesn’t need to be fancy.
Hope you can help.
-
from the given data I would think this should do it
find what:^(\d{3,4})\h(.+?)\h(Weekly|Monthly|Semi-annually|Quarterly|Annually|Bi-monthly)\h(No|Yes)\h(\d{4}-\d{2}-\d{2})\h(\d{4}-\d{2}-\d{2})
repalce with:$1;$2;$3;$4;$5;$6 -
Hello, @dennis-andersen, and All,
Ah ! @ekopalypse beat me to it ;-))
This other regex S/R should also work :
SEARCH
(?-si)^\d{3,4}\K\h+|\h+(?!.+(Weekly|Monthly|Semi-annually|Quarterly|Annually|Bi-monthly))REPLACE
;Notes :
-
As usual, the in-line modifiers
(?-si)ensure that :-
Any dot regex symbol
.represents a single standard character ( NOT like-breaks ) -
The search is processed in a non-insensitive way
-
-
Then the regex contains two alternatives, separated by the
|alternation symbol. So the regex engine searches, from current location, for, either :-
Some horizontal blank characters, after a
3or4-digitsnumber, at beginning of lines and, due to the\Ksyntax, only matches these blank char(s)^\d{3,4}\K\h+ -
Some horizontal blank characters, ONLY IF NOT followed by a non-null range of standard chars before the periodicity field
\h+(?!.+(Weekly|Monthly|Semi-annually|Quarterly|Annually|Bi-monthly))
-
-
Due the
\Ksyntax, you cannot use the step by step replacement with theReplacebutton. You must use theReplace Allbutton, exclusively
Best Regards,
guy038
BTW, @dennis-andersen, you very well described your goal and data which greatly helped us to figure out regex solutions ;-))
-
-
You guys are the best!!!
One day I hope to be half as smart as you!!
I got help and I learned something. Best day of the year so far…
Thanks and have a great day.
Dennis -
Hello
-
List itemIdentify the common text string you wish to remove, in this example, “Accepts Google Pay”
-
List itemClick the Column Options dropdown arrow and select Set regular expression. The Set regular expression dialog box appears.
-
List itemIn the Match box, enter “Accepts Google Pay\s+(.+)” to identify the Accepts Google Pay string and then capture the remainder of the string in the first capture capture group.
-
List itemIn the Replace box, enter $1 to print out the first capture group. The prefix disappears from the column data, leaving the remainder of the string intact.
I hope this information will be usefull for you.
Thank you -
Hello! It looks like you're interested in this conversation, but you don't have an account yet.
Getting fed up of having to scroll through the same posts each visit? When you register for an account, you'll always come back to exactly where you were before, and choose to be notified of new replies (either via email, or push notification). You'll also be able to save bookmarks and upvote posts to show your appreciation to other community members.
With your input, this post could be even better 💗
Register Login