|
||
Title: Regex and substring Post by Barukh on Oct 18th, 2015, 8:53am A friend of mine asked me the following intriguing question: Given a regular expression R and a string S. Devise an algorithm to determine whether there exists another string S' s.t. 1. S is a substring of S'. 2. S' matches R. |
||
Title: Re: Regex and substring Post by towr on Oct 18th, 2015, 1:11pm [hide]Build the finite state automaton corresponding to R Try each state as starting state to check if S can be a substring of any string matching R Work backwards/forwards from start/end of S to find a fitting S'.[/hide] |
||
Title: Re: Regex and substring Post by Barukh on Oct 19th, 2015, 9:12am Nice! But... So, it all [hide]boils down to efficiently building an FA for the regex[/hide]? Can it be efficient always? |
||
Title: Re: Regex and substring Post by playful on Apr 24th, 2016, 3:03am Here is how I would get started, taking an example. Bracing myself for a semantic discussion on the meaning of "regular expression". :) The title of the puzzle does say "regex". Example 1. regex pattern R: \d+@hello 2. string S: hello Idea for Algorithm For each position p in the pattern R ...Rp = subpattern from p to end of R ...Temp regex Rt = '' ...For each token t in Rp ......Rt += t ......If Rt matches S .........Return success Return failure Tracing this example: - Position p=0 -- Try to match "hello" with \d+ => Fail -- Try to match "hello" with \d+@ => Fail -- Try to match "hello" with \d+@h => Fail ... -- Try to match "hello" with \d+@hello => Fail - Position p=1 -- Try to match "hello" with @ => Fail -- Try to match "hello" with @h => Fail ... -- Try to match "hello" with @hello => Fail - Position p=2 -- Try to match "hello" with h => Fail -- Try to match "hello" with he => Fail ... -- Try to match "hello" with hello => Succeed Example string S': 123@hello Where this breaks down regex R: ^(?!.*hello)\w+ string S: hello The algorithm returns a success because \w+ matches hello, but R will never match a string containing S. |
||
Title: Re: Regex and substring Post by Grimbal on Apr 25th, 2016, 6:40am The only problem I see is with ^ and $ anchors. the ".+" should be inserted after an initial "^" and before a final "$". PS: oops, possibly correct answer to the wrong question. |
||
Title: Re: Regex and substring Post by playful on Apr 25th, 2016, 9:45am on 04/25/16 at 06:40:54, Grimbal wrote:
I think the problem statement is the other way around. S is a substring of S', not the opposite. In my example above, with S as "hello", manipulating R as you suggest will never match S. |
||
Powered by YaBB 1 Gold - SP 1.4! Forum software copyright © 2000-2004 Yet another Bulletin Board |