wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> hard >> New Question:  infinite data compression (M?)
(Message started by: George Wright on Aug 9th, 2002, 5:23pm)

Title: New Question:  infinite data compression (M?)
Post by George Wright on Aug 9th, 2002, 5:23pm
We all know from basic cominatorics, that that if I
have n symbols, I can make only n^m different
strings of lenght m.

So if I allow myself 26 letters and a space, with
100 characters I can only represent 27^90 different
things.  Yet I claim that I can uniquely represent all positive
integers (an infinitite number) with just 90 characters!!!


The representation system I'll use is the english language.

So for example the string "nine" will represent the number 9.
The string "The number of stars in the galaxy", can be used to  represent that very large number.  Some strings such as
"lksdfkjeojk" won't represent any number, but that's fine.

Now to prove that this will represent all positive integers.

Proof by contradiction.

Let A be the set of positive integers that can not be
represented in 90 characters.

Since A is a discrete ordered set it must contain a least
element, but the string

"the least positive integer that can not be represented in less than ninety characters"

will then represent this integer, and is less than 90 characters
in lenght.  Hence there is a contradiction, and so
no such integers must exist.

Can anyone find a flaw?

Title: Re: New Question:  infinite data compression (M?)
Post by Eric Yeh on Aug 9th, 2002, 6:28pm
This is just another self-referential contradiction, a la "This statement is false" and the pop quiz "paradox".  The concept of this string/number is contradictory in itself because if its value is n, then it is not n, so therefore it cannot correctly represent any number.  Thus, the statement that the string "will then represent this integer" is false.

BTW, to apply WOP you need to first claim the set is nonempty.  ;)

Best,
Eric

Title: Re: New Question:  infinite data compression (M?)
Post by George Wright on Aug 16th, 2002, 7:38pm
Eric,

   Although admittedly the problem is a bit of a cheap shot,
I think your dismissing it a little too lightly.

Its clear that  the statement

"the least positive integer that can not be represented in less than ninety characters"

leads to a contradiction and so can't represent a number, I
also realize that in order to apply WOP I need to have a
non-empty set.  In fact, I need this to be as you have
stated it for my proof to work.

Perhaps if I re-wrote the proof it might make things clearer.


Proof:

Let f be the mapping from the strings of lenghth 90 or to the
positive integers, where mapping f, is that induced by the English Language.

Let S be the set of all positive integers that are not in the
Image of f.

2 cases.  Either
1)  S is the empty set
   or
2) S is non-empty

in case 1) then it must be the case that the set of all
positive integers is in the image of f, and so I have my
infinite data compression.

in case 2)  S is non-empty, so there exists a least element.
But this element is
   
  f(the least positive integer that can not be represented in less than ninety characters)

so it is in the image of f.  Therefor it is not an element
of S.  This is a contradiction,  so case 2 must be false.

Therefor case (1) is true.


Let me know what you think,

George

Title: Re: New Question:  infinite data compression (M?)
Post by Eric Yeh on Aug 16th, 2002, 7:46pm
George,

Don't worry, your proof was actually clear the first time -- my WOP comment was just a silly pot shot at a minor carelessness.  :)

But as far as the answer, I still feel it is sufficient to say that there is no such number f("the least positive integer that can not be represented in less than ninety characters"), due to the fact that it would be contradictory.  Do you have a different explanation?

Best,
Eric

Title: Re: New Question:  infinite data compression (M?)
Post by George Wright on Aug 17th, 2002, 10:22pm
Eric,
  Somehow we are having a failure to communicate.
That this statement is contradictory is crucial to my
proof.  If case 2 didn't lead to a contradiction,
I wouldn't be able to make the claim that case 1 was
true.

   Think of this like the following proof of the uncountability of
the subsets of the integers (which given the sophistication
of your previous posts you are probably already aware of)

    Assume that we have a 1 to 1 correspondence between
integers, and the subsets of integers.  Color those
integers whose corresponding sets do not contain themselves
red.  Then consider the set of all red numbers. If its corresponding integer is red, then it should be included in this
set, but then it would not be painted red...

  The above paragraph is a well accepted proof of the uncountability of the subsets of the integers.  Saying its
wrong just because the set contructed leads to a contradiction would be missing the point of the proof.

 Similarly saying my proof is wrong just because I constructed
an element of S that leads to a contradiction would also be missing the point.

.
.
.
WARNING Spoiler ahead
.
.    
.  

I think the problem with the infinite compression proof
is that the mapping f is not well defined.  I start out as
though f is a fixed coding system that you can with
check correspondences with complete rigor.

Then, in the second part of the proof I allow f to be
flexable, and change its description in a self-referential
way.

Cheers,

George

Title: Re: New Question:  infinite data compression (M?)
Post by tim on Aug 17th, 2002, 11:07pm
George: Yes, I had exactly the same argument with someone in the rec.arts.sf.science newsgroup. (Except it was talking about sentences of arbitrary finite length, and reals)

Let sequence S be "the least positive integer that can not be represented in less than ninety characters". Let f be any mapping from a subset A of sequences to integers such that f(x) matches the English-language meaning of x for all x in A. We say that x "is a representation of" f(x).

Suppose S is in A. Then by definition of f, f(S) is the least positive integer that can not be represented in less than ninety characters. However, S is a representation of f(S) that is only 83 characters long. This is a contradiction, hence S is not in A.

This is not surprising: we already know that not all character sequences are in the domain of f.

The flaw in the original proof is the unjustified assumption that the English language actually defines a unique mapping from sequences of characters to integers. It does not.

Title: Re: New Question:  infinite data compression (M?)
Post by Eric Yeh on Aug 18th, 2002, 8:49am
George and Tim (if your msg was not in support of me -- I couldn't quite tell at  the end if you were just bridging the difference or also could not see my explanation),

My statement is exactly equivalent to what the two of you are saying, except not spelling out the details in the same language.  To put it in your language, the function f is not well-defined at <Tim's> S precisely because there is a contradiction due to the self-referential implicit in the mapping.

The difference between this "pf" and the subset of integers pf is precisely that there each individual image of the mapping is defined independently of the rest of the mapping; here the mapping f relies on the entirety of the mapping itself (i.e. is self-referential).  That leads to a contradiction  in the mapping itself, before the application of looking at another set, as per the subsets pf.

George,

When you say that the contradiction of the statement is crucial to your proof, believe me I understand!  The problem is that the "hypothesis" that it negates is not the hypothesis that S is not empty, but that f is well-formed at all, or else that <Tim's string> S has an image at all, as we understand it in English.  (If you take f to be well-defined, the answer must be that S has no image f(S), and this still precludes your end contradiction, because that [extant] least element truly isn't represented by any such S, which does not actually have an image at all!)  That was the explanation I was trying to deliver in my original post.

In summary:  the two explanations are the same, and I happily accept your explanation; I hope I have sufficiently bridged the difference so that you accept mine.

Best,
Eric

Title: Re: New Question:  infinite data compression (M?)
Post by George Wright on Aug 18th, 2002, 1:59pm
Wow, an argument in a forum, in which the issue is resolved
and both parties end up agreeing with eachother.  Alert the
Guiness book of world records this may be a first.

Needless to say, I agree with your assessment Eric.
I was just miunderstanding what you meant when you
simply said the problem was that the statement was
contradictory.

Best Regards,

George

Title: Re: New Question:  infinite data compression (M?)
Post by Eric Yeh on Aug 18th, 2002, 2:03pm
Excellent -- sorry for being too terse before, then!  It is a nice problem overall.

Best,
Eric



Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board