wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> cs >> Unrecognizable Language
(Message started by: william wu on May 21st, 2003, 3:42pm)

Title: Unrecognizable Language
Post by william wu on May 21st, 2003, 3:42pm
Consider the following language L:

L = { <M> | for every input string w, M will halt within 1000|w|2 steps }


Show that this language is not recognizable. (Reduce from ~ATM.)



Notes:

1. <M> denotes an encoding of a Turing machine M. So L consists of all Turing machines that satisfy the property described above.

2. |w| denotes the length of the string w.

3. ~ATM = { <M,w> | M is a Turing Machine that does NOT accept the string w }. So it could reject or loop forever. (A stands for accept.)

4. A language L is recognizable if there exists a Turing Machine M such that for all strings w in L, running M on w will ACCEPT. When M is run on a string not in L, it will either REJECT or loop forever. Contrast this to a decidable language L', in which there exists a Turing Machine that will ACCEPT for strings in L', and REJECT for strings not in L'. In summary, deciders must halt, and recognizers only need to halt for strings in the language.



Source: UC Berkeley CS172 (Computablity and Complexity) Final Exam, Spring 2003



Title: Re: Unrecognizable Language
Post by Rezyk on Oct 19th, 2003, 11:45pm
What's generally known about ~ATM?  Recognizable/unrecognizable?  Decideable/undecideable?

Title: Re: Unrecognizable Language
Post by william wu on Oct 20th, 2003, 12:19am
~ATM is the canonical example of a Turing-unrecognizable language. This means there does not exist a Turing Machine which will accept the set of all machine-string pairs <M,w> such that M does NOT halt when run on w. The proof of this is very short:

Lemma: ATM is Turing-recognizable.

Proof by construction: Make a Turing Machine U which, when given the input <M,w>, simulates M on w, and simply mimics the accept/reject behavior of M. In other words, if M ever enters its accept state, U should accept; and if M enters the reject state, U should reject. Thus U recognizes ATM. Note that if M loops forever, U also loops forever, so U only recognizes ATM, and does not decide it.

Now we are ready for:

Proposition: ~ATM is Turing-unrecognizable.

Proof by contradiction: We know that ATM is Turing-recognizable. Suppose ~ATM was also Turing-recognizable. Then ATM would be decidable, because we could just make a machine that runs the recognizers for ATM and ~ATM in parallel, and accept if ATM's recognizer accepts, and rejects if ~ATM's accepts. However, we know that ATM (the famous halting problem) is undecidable, so we have a contradiction. Thus, ~ATM is Turing-unrecognizable.

Title: Re: Unrecognizable Language
Post by william wu on Oct 20th, 2003, 12:44am
Some preliminary information in case anyone is wondering:


0. Turing Machines

A Turing machine is a model of computation. From an introductory perspective, one might think of it as simply a computer program with access to infinite hard disk space.

As you may have heard before, the Turing Machine is just a read-write head moving along a tape, writing and rewriting symbols according to instructions dictated by a finite state machine. Turing's inspiration for such a model actually came from fairly philosophical observations on how the human mind works. An excerpt in his own words:

"Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child's arithmetic book. In elementary arithmetic the two-dimensional nature of the paper is sometimes used. But such a use is always avoidable, and I think it will be agreed that the two-dimensional character of the paper is no essential of computation. I assume that the computation is carried out on one-dimensional paper, i.e., on a tape divided into squares. I shall also suppose that the number of symbols which may be printed is finite. If we were to allow an infinity of symbols, then there would be symbols differing to an arbitrarily small extent ..." - Alan Turing

Interesting stuff. You can read more about Turing's motivations here: http://www.ocf.berkeley.edu/~wwu/riddles/church-turing-thesis.ps


1. Language

A language is just some set of strings. For example, the set of all palindromes over the alphabet {0,1}.

In this world of computational complexity theory, we like to formulate our problems as a Decision Problem -- which is the problem of whether a string belongs in a certain set. For example, the SAT problem is to determine whether a boolean formula has a satisfying truth assignment. This can be formulated as the following decision problem:

"Determine if a string w belongs to the language { <[phi]> | <[phi]> is an encoding of a boolean formula which has a satisfying truth assignment }."

In the framework of decision problems, Turing Machines thus only need to output "yes" [it belongs in the language] or "no" [it doesn't belong in the language] when given an input string w. The terminology for "yes" and "no" is "accept" and "reject", respectively.

These issues of decidability and recognizability address different classes of problems that Turing Machines can solve to varying degrees of satisfaction.



2. Decidable language

A language L for which there exists a Turing Machine that will accept when given input w[in]L, and reject when given input w[nin]L.



3. Recognizable language

A language L for which there exists a Turing Machine that will accept when given input w[in]L.


Observe that the requirement for a recognizable language is less stringent than the decidability requirements. Thus a recognizer is allowed to loop forever on an input w[nin]L -- it needs not return any "yes" or "no" answer in a finite amount of time.

With this knowledge, hopefully it will be intuitive to see why ATM is a recognizable language, using the construction given in my previous post. The same intuition also helps see why ~ATM is unrecognizable.

Title: Re: Unrecognizable Language
Post by Rezyk on Oct 20th, 2003, 3:43am
Can we assume |w| is never 0?  Otherwise L [supseteq] { <M> | for an empty input, M will halt within 0 steps }, which seems trivial..

Here's my tentative answer (apologies for any improper notation):
[hide]

If L is recognizable, let R be a turing machine that recognizes it.

Now we can recognize <M, w> [in] ~ATM with a turing machine that does: {
 Construct turing machine V(<s, z, x>) that does: {
   If x<99, ACCEPT.
   Simulate M(w) for up to lg(x) steps and see if it ever gets to state s with tape-state z, if not then ACCEPT.
   Simulate M from state s with tape-state z for one step -- if it ACCEPTS, loop forever.
   Otherwise ACCEPT.
 }
 Return R(V)
}

However, ~ATM is unrecognizable, so the presumption that L is recognizeable must be false.

Quick explanation:
Iff M(w) never accepts, then for any input V never hits the loop forever clause (and thus always accepts in a timely manner), and R(V) must accept.
[/hide]

Title: Re: Unrecognizable Language
Post by william wu on Nov 8th, 2003, 1:06am
I think we can just ignore the 0 steps case. Running a machine on an input for 0 steps is like not running it at all, and I think we can argue that the question of whether something halts or not only makes sense after you've started running the machine.

Still thinking about your solution. Frankly after rereading it many times, it still doesn't make sense to me, although that doesn't necessarily mean it's wrong. To make matters worse, I can't even remember how this problem is solved   >:(

Title: Re: Unrecognizable Language
Post by william wu on Nov 8th, 2003, 1:26am
Well, I don't get it, so maybe you can walk me through it very slowly :)

1. Where did "99" come from?
2. Where did the log come from?
3. Where exactly is 1000|v|^2 in your solution?

Title: Re: Unrecognizable Language
Post by Rezyk on Nov 8th, 2003, 1:26pm
Sure, my pleasure.  You probably would have gotten it if I had known/used better notation.

Btw, I thought this was a pretty neat problem.


on 11/08/03 at 01:26:24, william wu wrote:
1. Where did "99" come from?

The exact value 99 was rather arbitrary.  I'm just skipping small-sized inputs to avoid cases where V's overhead processing might cause it to exceed 1000|<s,z,x>|2 steps.


Quote:
2. Where did the log come from?

A normal encoding for <s,z,x> would specify x in binary, so lg(x) [le] |<s,z,x>|.


Quote:
3. Where exactly is 1000|v|^2 in your solution?

Assuming a decent implementation, V was designed to always halt within 1000|<s,z,x>|2 steps (unless we explicitly choose to loop forever).


This may help to clarify..a proof of correctness for the case in which M(w) ACCEPTs:
[hide]
There must be some state sA and tape-state zA reached in nA steps from which M(w) will ACCEPT in the next step.  Consider the execution of V(<sA, zA, max(99, [lceil]2^nA[rceil])>).

The first line tests if max(99, [lceil]2^nA[rceil])<99, which is obviously false.  The second line simulates M for at least lg(max(99, [lceil]2^nA[rceil])) > nA steps, so it will find it reaching state sA with tape-state zA, and therefore V does not accept at this point.  The third line simulates M from sA zA for one step, which we know must be an ACCEPTing step.  Thus V enters the loop-forever clause here.

Since V loops forever for at least one input, it is definitely not in L.  Thus R(V) will not accept, and so our overall TM will not accept, which is the correct response.[/hide]



Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board