Wild Card Searching and Replacing

  • I am working with a large file in which dashes, apostrophes, and quotes were all replaced with question marks when the data was downloaded from EndNote. I need to replace, for example, 1988?2015 with 1988-2015. I can find these instances using: \d+\d+\d+\d+?
    However, when I go to replace (I want to keep the four digits and replace the question mark with a dash), I have tried \d+\d+\d+\d+-
    in the Replace window, and the result in the text is just that (i.e., the literal text \d+\d+\d+\d+-).
    What do I include so that everything is retained except the question mark (which is changed)? I need to search for patterns, as some dates are in the form 1988?94 and possessives are of the form ?s. Any help or pointing me in the right direction would be appreciated. I’ve been looking at help pages and tutorials on regex all afternoon without much success; I just can’t get the replacement to work. Thanks

  • You may want to take a look at the Boost documentation.

    For your task you may use “(\d{4})\?(\d{4})” for searching and “\1-\2” for replacing (without the quotes, of course). This will search for a group of four digits, followed by a “?”, followed by another group of four digits. Note the “\” in front of the “?”: the question mark has a special meaning in regexes and thus needs to be escaped to be recognized as a simple character. Each “\d{4}” is enclosed in parentheses to mark it as a group that can be reused in the replacement. The replacement then consists of the first group, followed by a “-”, followed by the second group.

    For the second group you may also use “\d{2,4}” which will select a group of two to four digits.

    For the possessives you may search for (\w+)\?s and replace with “\1’s”.
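
    The two substitutions above can be tried outside the editor as well. The following is only a minimal sketch in Python, whose re module accepts the same group and backreference syntax for these particular patterns; it is not the editor's own Boost engine. The sample text and the repair() helper are made up for illustration, and the trailing \b is an addition not in the patterns above, included so that a “?s” at the start of a longer word is left alone.

```python
import re

# Hypothetical sample text; each "?" between digits or before "s"
# stands in for a dash or apostrophe lost during the download.
sample = "Smith?s survey (1988?2015) extends the 1988?94 pilot. Was it useful?"

# Four digits, a literal "?", then two to four digits.
date_range = re.compile(r"(\d{4})\?(\d{2,4})")

# A word, a literal "?", then "s" ending at a word boundary.
# The \b is an assumption added for this sketch, not part of the original pattern.
possessive = re.compile(r"(\w+)\?s\b")


def repair(text: str) -> str:
    """Restore dashes in year ranges, then apostrophes in possessives."""
    text = date_range.sub(r"\1-\2", text)   # keep both groups, swap "?" for "-"
    return possessive.sub(r"\1's", text)    # keep the word, swap "?" for "'"


print(repair(sample))
# Smith's survey (1988-2015) extends the 1988-94 pilot. Was it useful?
```

    The genuine question mark at the end of the sample is untouched, because neither pattern matches a “?” that is not followed by digits or by an “s”.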

  • Thank you so much for the help! Worked great and I learned something. Thank you also for the link to the boost documentation…that presented a clear explanation.
