Author |
Topic: Regex and substring (Read 1684 times) |
|
Barukh
Uberpuzzler
Gender:
Posts: 2276
|
|
Regex and substring
« on: Oct 18th, 2015, 8:53am » |
Quote Modify
|
A friend of mine asked me the following intriguing question: Given a regular expression R and a string S. Devise an algorithm to determine whether there exists another string S' s.t. 1. S is a substring of S'. 2. S' matches R.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Regex and substring
« Reply #1 on: Oct 18th, 2015, 1:11pm » |
Quote Modify
|
Build the finite state automaton corresponding to R Try each state as starting state to check if S can be a substring of any string matching R Work backwards/forwards from start/end of S to find a fitting S'.
|
« Last Edit: Oct 18th, 2015, 1:11pm by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Barukh
Uberpuzzler
Gender:
Posts: 2276
|
|
Re: Regex and substring
« Reply #2 on: Oct 19th, 2015, 9:12am » |
Quote Modify
|
Nice! But... So, it all boils down to efficiently building an FA for the regex? Can it be efficient always?
|
|
IP Logged |
|
|
|
playful
Junior Member
meet me at the second fork in the road
Gender:
Posts: 100
|
|
Re: Regex and substring
« Reply #3 on: Apr 24th, 2016, 3:03am » |
Quote Modify
|
Here is how I would get started, taking an example. Bracing myself for a semantic discussion on the meaning of "regular expression". The title of the puzzle does say "regex". Example 1. regex pattern R: \d+@hello 2. string S: hello Idea for Algorithm For each position p in the pattern R ...Rp = subpattern from p to end of R ...Temp regex Rt = '' ...For each token t in Rp ......Rt += t ......If Rt matches S .........Return success Return failure Tracing this example: - Position p=0 -- Try to match "hello" with \d+ => Fail -- Try to match "hello" with \d+@ => Fail -- Try to match "hello" with \d+@h => Fail ... -- Try to match "hello" with \d+@hello => Fail - Position p=1 -- Try to match "hello" with @ => Fail -- Try to match "hello" with @h => Fail ... -- Try to match "hello" with @hello => Fail - Position p=2 -- Try to match "hello" with h => Fail -- Try to match "hello" with he => Fail ... -- Try to match "hello" with hello => Succeed Example string S': 123@hello Where this breaks down regex R: ^(?!.*hello)\w+ string S: hello The algorithm returns a success because \w+ matches hello, but R will never match a string containing S.
|
|
IP Logged |
my favorite puzzles
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Regex and substring
« Reply #4 on: Apr 25th, 2016, 6:40am » |
Quote Modify
|
We could just check whether S matches R+".+" or ".+"+R. The only problem I see is with ^ and $ anchors. the ".+" should be inserted after an initial "^" and before a final "$". PS: oops, possibly correct answer to the wrong question.
|
« Last Edit: Apr 26th, 2016, 7:23am by Grimbal » |
IP Logged |
|
|
|
playful
Junior Member
meet me at the second fork in the road
Gender:
Posts: 100
|
|
Re: Regex and substring
« Reply #5 on: Apr 25th, 2016, 9:45am » |
Quote Modify
|
on Apr 25th, 2016, 6:40am, Grimbal wrote:We could just check whether S matches R+".+" or ".+"+R. |
| I think the problem statement is the other way around. S is a substring of S', not the opposite. In my example above, with S as "hello", manipulating R as you suggest will never match S.
|
|
IP Logged |
my favorite puzzles
|
|
|
|