Help with the regexes
-
Hi guys, I need a regex or any suggestion for this:
I must
- select all the words that appear between the terms "plural of" and "English" throughout the file,
- select the lines beginning with those same words,
- copy the definition from each of those lines next to the corresponding (plural of … English) entry.
Example:
finisher n.\n1 A person who finishes or completes something
finishers n.\n(plural of finisher English)
finishes n.\n(plural of finish English)
finishes off vb.\n(en-third-person singular of: finish off)
finishest vb.\n(en-archaic second-person singular of: finish)
finisheth vb.\n(en-archaic third-person singular of: finish)
finishing n.\n1 the act of completing something
finish.\nvb.\n(present participle of finish English)
finishing line n.\n(alternative form of finish line English)
finishing lines n.\n(plural of finishing line English)
finishing move n.\n(context video games English) In media such as…
finishing moves n.\n(plural of finishing move English)
finishing off vb.\n(present participle of finish off English)
Later the text must become:
finisher n.\n1 A person who finishes or completes something
finishers n.\n(plural of finisher English) n.\n1 A person who finishes or completes something
finishes n.\n(plural of finish English)
finishes off vb.\n(en-third-person singular of: finish off)
finishest vb.\n(en-archaic second-person singular of: finish)
finisheth vb.\n(en-archaic third-person singular of: finish)
finishing n.\n1 the act of completing something
finish.\nvb.\n(present participle of finish English)
finishing line n.\n(alternative form of finish line English)
finishing lines n.\n(plural of finishing line English) n.\n(alternative form of finish line English)
finishing move n.\n(context video games English) In media such as…
finishing moves n.\n(plural of finishing move English)n.\n(context video games English) In media such as…
finishing off vb.\n(present participle of finish off English)
-
Hi,
The first is relatively easy:
plural of.*English
I will search for the rest a bit later.
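For anyone who wants to try this first step outside the editor, here is a minimal Python sketch. It assumes the entries look like the sample lines in the question; the names PLURAL_OF and singular_words are made up for the illustration. It also swaps the greedy .* for a non-greedy capture group, so that only the word between "plural of" and the next "English" is returned.

```python
import re

# Non-greedy (.+?) stops at the first " English" that follows "plural of ".
PLURAL_OF = re.compile(r"plural of (.+?) English")

def singular_words(text):
    """Return every word (or group of words) found between 'plural of' and 'English'."""
    return PLURAL_OF.findall(text)

# "\\n" is the literal backslash-n sequence that appears inside the definitions.
sample = (
    "finishers n.\\n(plural of finisher English)\n"
    "finishing lines n.\\n(plural of finishing line English)\n"
    "finishing off vb.\\n(present participle of finish off English)\n"
)
print(singular_words(sample))
```

The third sample line is ignored because it contains "present participle of", not "plural of".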
-
@giuseppe-pulitanò, and All,
Before giving you a regex solution, I noticed a particularity in your text! Examining some of your lines, for example:
finisher n.\n1
finishing line n.\n
finishing off vb.\n
we deduce that a word (like finisher) OR a group of words (like finishing off) is always followed by a space character… except for the line beginning with finish.\nvb.\n......, where a dot immediately follows the word finish!
Is this syntax common in your definitions? Or should that particular line be written finish .\nvb.\n...... or even finish ???.\nvb.\n......, where ??? represents an abbreviation?
See you later,
Best Regards,
guy038
-
It is a TAB-file dictionary; each entry word is followed by a TAB character:
wordTABdefinition
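To make the layout concrete, here is a small Python sketch that reads such a file into a word-to-definition mapping. It is only an illustration: parse_tab_dictionary is a hypothetical helper, and it assumes one entry per line with the first TAB as the separator. If a headword appears twice, the later line silently wins.

```python
def parse_tab_dictionary(text):
    """Map each headword to its definition, splitting on the first TAB only."""
    entries = {}
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip lines that do not follow word<TAB>definition
        word, definition = line.split("\t", 1)
        entries[word] = definition
    return entries

# "\\n" is the literal backslash-n text stored inside a definition.
sample = (
    "finisher\tn.\\n1 A person who finishes or completes something\n"
    "finishers\tn.\\n(plural of finisher English)\n"
)
entries = parse_tab_dictionary(sample)
print(entries["finishers"])
```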
So it is:
finisherTABn.\n1 A person who finishes or completes something
finishersTABn.\n(plural of finisher English)
finishesTABn.\n(plural of finish English)
finishes offTABvb.\n(en-third-person singular of: finish off)
finishestTABvb.\n(en-archaic second-person singular of: finish)
finishethTABvb.\n(en-archaic third-person singular of: finish)
finishingTABn.\n1 the act of completing something
finishTAB.\nvb.\n(present participle of finish English)
finishing lineTABn.\n(alternative form of finish line English)
finishing linesTABn.\n(plural of finishing line English)
finishing moveTABn.\n(context video games English) In media such as…
finishing movesTABn.\n(plural of finishing move English)
finishing offTABvb.\n(present participle of finish off English)
Later the text must become:
finisherTABn.\n1 A person who finishes or completes something
finishersTABn.\n(plural of finisher English) n.\n1 A person who finishes or completes something
finishesTABn.\n(plural of finish English)
finishes offTABvb.\n(en-third-person singular of: finish off)
finishestTABvb.\n(en-archaic second-person singular of: finish)
finishethTABvb.\n(en-archaic third-person singular of: finish)
finishingTABn.\n1 the act of completing something
finishTAB.\nvb.\n(present participle of finish English)
finishing lineTABn.\n(alternative form of finish line English)
finishing linesTABn.\n(plural of finishing line English) n.\n(alternative form of finish line English)
finishing moveTABn.\n(context video games English) In media such as…
finishing movesTABn.\n(plural of finishing move English)n.\n(context video games English) In media such as…
finishing offTABvb.\n(present participle of finish off English)
-
Hi, @giuseppe-pulitanò, and All,
Ah, perfect! It is even easier to create the correct regex, as the \t tabulation char separates, without any ambiguity, each header word from its definition :-)). So:
- Open the Replace dialog (Ctrl + H)
- SEARCH (?-is)^(.+)\t(.+)\R.+plural of\x20\1\x20English\)
- REPLACE $0\x20\2
- Select the Regular expression search mode
- Tick, preferably, the Wrap around option
- Click once on the Replace All button, or several times on the Replace button
Et voilà!
Notes:
- The search part tries to grab two consecutive lines where the header word of the first line, with its exact case, is embedded in the expression plural ...... English at the end of the second line, with its exact case, too.
- At the beginning, the part (?-is) means that:
  - The search is carried out in a case-sensitive way, (?-i)
  - The regex engine considers the dot as matching any single standard character only (not an EOL one), (?-s)
- Then, the (.+)\t(.+)\R part catches the first line with its line-break and stores, as groups 1 and 2, the text located, respectively, before and after the tabulation separator \t
- And the final part .+plural of\x20\1\x20English\) grabs the second line's contents, up to the closing parenthesis, on the condition that the header word, \1, is located between the expressions plural of and English, with this exact case
- In replacement, we first rewrite these two lines, untouched, $0, followed by a space char, \x20, and the definition part of the first line, \2
Best Regards,
guy038
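For readers who prefer to test the idea in a script, here is a rough Python port of the search and replacement above. It is a sketch under a few assumptions: Python's re module has no \R, so the text is assumed to have been read in text mode with \n line endings; re.MULTILINE stands in for the editor's line-by-line ^; and the names PAIR and append_singular_definition are invented for the example.

```python
import re

# Group 1 = header word, group 2 = its definition, then the next line must
# contain "plural of <header word> English)". The dot never matches \n here,
# which mirrors (?-s); matching is case-sensitive by default, like (?-i).
PAIR = re.compile(r"^(.+)\t(.+)\n.+plural of \1 English\)", re.MULTILINE)

def append_singular_definition(text):
    """Equivalent of the replacement $0\\x20\\2: whole match, a space, group 2."""
    return PAIR.sub(lambda m: m.group(0) + " " + m.group(2), text)

# "\\n" is the literal backslash-n text stored inside a definition.
sample = (
    "finisher\tn.\\n1 A person who finishes or completes something\n"
    "finishers\tn.\\n(plural of finisher English)\n"
    "finishes\tn.\\n(plural of finish English)\n"
)
print(append_singular_definition(sample))
```

As with the editor regex, only a plural entry that sits directly below its singular entry is changed, so the finishes line is left alone in this sample.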
-
-
I tried it but it doesn’t work…
PS: if you find the regex, could you also give me one for two general terms? For example, item1 and item2 instead of (plural of … English).
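One possible way to generalise the posted search to two arbitrary terms, sketched in Python and not tested against the real file. It assumes item1 and item2 are plain literal strings, that the header word sits between them with a single space on each side, and that the text uses \n line endings. The names build_pair_pattern and append_definitions are hypothetical.

```python
import re

def build_pair_pattern(item1, item2):
    """Same two-line search as the posted regex, with literal item1/item2 around \\1."""
    return re.compile(
        r"^(.+)\t(.+)\n.+" + re.escape(item1) + r" \1 " + re.escape(item2),
        re.MULTILINE,
    )

def append_definitions(text, item1, item2):
    """Append the first line's definition (group 2) after each two-line match."""
    pattern = build_pair_pattern(item1, item2)
    return pattern.sub(lambda m: m.group(0) + " " + m.group(2), text)

# "\\n" is the literal backslash-n text stored inside a definition.
sample = (
    "finish line\tn.\\n1 the line\n"
    "finishing line\tn.\\n(alternative form of finish line English)\n"
)
print(append_definitions(sample, "alternative form of", "English)"))
```

re.escape keeps characters such as the closing parenthesis in item2 from being read as regex syntax.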
-
-
Hello, @giuseppe-pulitanò, and All,
Before extending the regex to some general cases, it would be better to solve the present problem !
As for me, I re-verified my regex, against your sample text, and it’s working fine !
Note that regexes are extremely sensitive to real text ! I mean that a simple additional space, somewhere, may cause the regular expression to fail ! So, if your text is neither personal nor confidential, could you send me, by e-mail, part of your text, in order to do additional tests ?
Thanks
See you later,
guy038
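While waiting for the file, a quick way to see where the real text departs from the sample is to list every plural entry whose previous line does not start with exactly the expected header word. The Python sketch below is only a rough approximation of what the two-line regex requires; explain_misses is a hypothetical helper and it assumes one word<TAB>definition entry per line. repr() is used so that a stray space becomes visible.

```python
import re

PLURAL = re.compile(r"plural of (.+?) English\)")

def explain_misses(text):
    """List (line number, expected word, repr of previous header) for skipped plurals."""
    report = []
    previous_word = None
    for number, line in enumerate(text.splitlines(), start=1):
        word, tab, definition = line.partition("\t")
        found = PLURAL.search(definition)
        if found and found.group(1) != previous_word:
            report.append((number, found.group(1), repr(previous_word)))
        # Only lines that really contain a TAB provide a header word.
        previous_word = word if tab else None
    return report

# Line 1 has an extra space before the TAB, so line 2 is reported.
sample = (
    "finisher \tn.\\n1 A person\n"
    "finishers\tn.\\n(plural of finisher English)\n"
    "finishes\tn.\\n(plural of finish English)\n"
)
for row in explain_misses(sample):
    print(row)
```

Entries such as finishes, whose singular is simply not on the previous line, are reported too; those are expected to stay unchanged.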
-
Hi guy038, I sent you the email with the file to the address:
Many thanks