Cannot find the fitting regular expression for counting the right class of words...
-
Hello there!
My name’s Wolf and I’m working on my PhD thesis right now. It’s about communication in online games, and I’m using Notepad++ for counting words and word groups (I’m from Germany, so please forgive my bad English).
Right now, I’m trying to count a special type of posting. It has the following structure:
“Gilde [X] sucht” (in English: “guild [X] is looking for”), where X = any number of any characters. I tried “Gilde [[:alnum:]]+ sucht”, which seems to work partially, BUT I think this expression only covers the base characters. For my search, I need the base characters as well as every character that differs from them, e.g. A, a, À, Á, Â, Ã, Ä, Å, A, a, à, á, â, ã, ä and å (it has to work for any other character as well).
Sadly, I have no idea how to put that into a regular expression. Could you please help me?
Thank you so much!
Kind greetings
Wolf
-
Oh, and there is an additional issue: guild names are often framed by special characters like *, -, /, , <, > etc. The [X] part should include those as well… I hope someone has an idea. :)
-
Hello Quangelosaurus Rex,
Contrary to what you think, the POSIX classes [[:alpha:]] and [[:alnum:]] also match all the accented Unicode Latin, Greek, Cyrillic, Hebrew and Arabic letters!
So, taking into account the specific characters of your second post, including the Space character, we finally get the two regexes below:
Gilde [[:alpha:]/<>* -]+ sucht
matches the string Gilde, followed by a space, then any non-empty run of Letter, Slash, Less-Than sign, Greater-Than sign, Asterisk, Space, or Hyphen-Minus sign, followed by a Space and the string sucht
Gilde [[:alnum:]/<>* -]+ sucht
matches the string Gilde, followed by a space, then any non-empty run of Number, Letter, Slash, Less-Than sign, Greater-Than sign, Asterisk, Space, or Hyphen-Minus sign, followed by a Space and the string sucht
Best Regards,
guy038
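For anyone who wants to double-check the counts outside Notepad++, here is a minimal Python sketch of the same two searches. It is only an approximation, not the Boost engine: Python's built-in `re` module does not support POSIX bracket classes, so `[^\W\d_]` stands in for `[[:alpha:]]` and `[^\W_]` for `[[:alnum:]]`. The sample postings and the helper name `count_postings` are made up for illustration.

```python
import re

# Assumption: Python's `re` has no [[:alpha:]] / [[:alnum:]], so they are
# approximated with Unicode-aware negated classes:
#   [^\W\d_]  ~ any Unicode letter          (stand-in for [[:alpha:]])
#   [^\W_]    ~ any Unicode letter or digit (stand-in for [[:alnum:]])
# The extra characters / < > * space and - come from the regexes above.
LETTERS_ONLY = re.compile(r"Gilde (?:[^\W\d_]|[/<>* -])+ sucht")
LETTERS_AND_DIGITS = re.compile(r"Gilde (?:[^\W_]|[/<>* -])+ sucht")


def count_postings(pattern, text):
    """Count non-overlapping matches of `pattern` in `text` (hypothetical helper)."""
    return sum(1 for _ in pattern.finditer(text))


# Invented sample postings, one per line.
sample = "\n".join([
    "Gilde Wölfe sucht neue Mitglieder",   # accented letter in the name
    "Gilde <Àrgent Dawn> sucht Heiler",    # name framed by < and >
    "Gilde *Team 42* sucht Tanks",         # name contains digits
    "Gilde sucht noch Leute",              # no guild name at all
])

print(count_postings(LETTERS_ONLY, sample))
print(count_postings(LETTERS_AND_DIGITS, sample))
```

The letters-only pattern skips the posting whose guild name contains digits, while the letters-and-digits pattern counts it. Neither pattern counts the line without a guild name. Keep in mind that the `+` is greedy and the class includes the Space character, so two postings on the same line may be counted as a single match, both here and in Notepad++.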
P.S.:
You’ll find good documentation about the new Boost C++ Regex library, v1.55.0 (similar to the PERL Regular Common Expressions, v1.48.0), used by Notepad++ since its 6.0 version, at the TWO addresses below:
http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html
http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html
- The FIRST link explains the syntax of regular expressions in the SEARCH part
- The SECOND link explains the syntax of regular expressions in the REPLACEMENT part
You may also find valuable information on the sites below:
http://www.regular-expressions.info
http://perldoc.perl.org/perlre.html
To end with, feel free to ask the N++ Community for info on any tricky regex that you come across OR for help building a tricky regex for a particular purpose :-))
-
Sorry for the late reply. Thank you VERY much for the awesome help! You really saved me here! =) Thank you a thousand times! =)