wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
(Message started by: amichail on May 25th, 2007, 12:55pm)

Title: reCAPTCHA + why is CS slow on human computation?
Post by amichail on May 25th, 2007, 12:55pm
Very cool:

http://bmaurer.blogspot.com/2007/05/recaptcha-new-way-to-fight-spam.html

IMO, human computation is a field with enormous potential. But strangely, there are few publications on the topic. It's not taken seriously in computer science yet.

Or maybe it requires a different sort of thinking than what most computer scientists are accustomed to?

Or perhaps most computer scientists feel that human computation is cheating? For them, maybe there is no reward in this sort of research. They are more interested in something technically difficult, even if the results are much worse.

Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by Grimbal on May 26th, 2007, 4:35am
I am sure the method is used already.

By whom?  Spammers of course.

They set up a web site A offering, say, games for teenagers, and ask you to register and type the funny letters.  In the background they request a registration on another web site B, pass B's test image to the registration screen of A, collect your answer, send it back to B and complete the registration.  This way they can automate the registration process.

Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by SMQ on May 26th, 2007, 5:50am
Very close, Grimbal, but in the "real world" website A almost always features "adult content", and requires a captcha for every login, not just for registration, so that many registrations to site B can be processed for each user of site A...

--SMQ

Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by amichail on May 26th, 2007, 5:54am
I somehow doubt that computer scientists are avoiding this line of research because it might help spammers.


Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by Icarus on May 26th, 2007, 9:04am
This may not be why CS people don't study "human computation", but unfortunately, I think they are right that spammers can abuse it, and are abusing it, for their own ends. If this idea spreads, I suspect it will be the death knell for CAPTCHA verification. After all, why use CAPTCHA when spammers have a way around it?


I would guess that the main reason "human computation" (as used here) isn't studied much is that it is difficult to accomplish effectively. It requires a large body of people who are willing to cooperate with it. This means that you need some kind of incentive. For the reCAPTCHA idea, I don't see one. Altruism will lead some to cooperate, but many more won't bother, particularly after they've done it a number of times. Instead of translating the 2nd word correctly, they will put in x's or the wrong word. In some applications, it would become a game for people to mistranslate. Error checking (having multiple people translate the same image) will help with this, but with low cooperation, each image needs so many repeated translations to get a reliable output that it may not be worthwhile.
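
The error-checking idea above can be sketched as a simple vote over the answers collected for one image. This is only an illustration: the function name `consensus` and the thresholds `min_votes` and `min_share` are assumptions made for the example, not anything reCAPTCHA is known to use.

```python
from collections import Counter

def consensus(transcriptions, min_votes=3, min_share=0.6):
    """Return the agreed transcription, or None if agreement is too weak.

    min_votes and min_share are hypothetical thresholds for this sketch.
    """
    # Normalise so "Morning " and "morning" count as the same answer,
    # and drop empty submissions.
    cleaned = [t.strip().lower() for t in transcriptions if t.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough answers yet to trust any of them
    word, count = Counter(cleaned).most_common(1)[0]
    # Accept the most common answer only if enough people agree on it.
    return word if count / len(cleaned) >= min_share else None

# One uncooperative user out of four does not change the result.
print(consensus(["morning", "Morning ", "xxxx", "morning"]))
# With half of the users typing junk, no answer is accepted.
print(consensus(["xxxx", "morning", "asdf", "morning"]))
```

The lower the share of cooperative users, the more answers per image are needed before any word clears the threshold, which is the cost described above.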

Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by amichail on May 26th, 2007, 9:12am

on 05/26/07 at 09:04:15, Icarus wrote:
This may not be why CS people don't study "human computation", but unfortunately, I think they are right that spammers can abuse it, and are abusing it, for their own ends. If this idea spreads, I suspect it will be the death knell for CAPTCHA verification. After all, why use CAPTCHA when spammers have a way around it?


I would guess that the main reason "human computation" (as used here) isn't studied much is that it is difficult to accomplish effectively. It requires a large body of people who are willing to cooperate with it. This means that you need some kind of incentive. For the reCAPTCHA idea, I don't see one. Altruism will lead some to cooperate, but many more won't bother, particularly after they've done it a number of times. Instead of translating the 2nd word correctly, they will put in x's or the wrong word. In some applications, it would become a game for people to mistranslate. Error checking (having multiple people translate the same image) will help with this, but with low cooperation, each image needs so many repeated translations to get a reliable output that it may not be worthwhile.

Even though there are ways to get around CAPTCHAs using human computation games, they are still very effective because they significantly reduce the rate at which spammers can get fake accounts.

As for altruism with respect to reCAPTCHAs, how would you know which word is which?  Presumably, the idea would be to make it really hard to know that.  Moreover, most users are clueless and would not even know what's going on anyway.

Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by Icarus on May 26th, 2007, 1:33pm
I see. Then I would consider a reCAPTCHA to be an abuse of your visitors - forcing them to do something that is of no benefit for them and is unrelated to their purpose in visiting. This sort of abusive behavior is not something I would like to see spread even farther than it already has. And for commercial operations, it is harmful in the long term. People learn when they've been taken advantage of, and it leaves a very negative impression.

Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by amichail on May 26th, 2007, 1:44pm

on 05/26/07 at 13:33:54, Icarus wrote:
I see. Then I would consider a reCAPTCHA to be an abuse of your visitors - forcing them to do something that is of no benefit for them and is unrelated to their purpose in visiting. This sort of abusive behavior is not something I would like to see spread even farther than it already has. And for commercial operations, it is harmful in the long term. People learn when they've been taken advantage of, and it leaves a very negative impression.

You could say something like "please enter the two words above to prove that you are human and help digitize old texts".

But still, they would not know which word is which.
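
A rough sketch of why the user cannot tell which word is which, assuming the two-word scheme discussed in this thread: one "control" word whose answer is already known and one word that still needs reading, shown in random order. The names `make_challenge`, `grade`, and `votes` are made up for this example and do not come from reCAPTCHA itself.

```python
import random

def make_challenge(control_word, unknown_image_id, rng=random):
    """Pair a word with a known answer with one that still needs reading."""
    slots = [("control", control_word), ("unknown", unknown_image_id)]
    rng.shuffle(slots)  # random order, so the user cannot tell the slots apart
    return slots

def grade(slots, answers, votes):
    """Pass the user if the control word matches; keep the other answer as a vote."""
    passed = False
    guess = None
    for (kind, value), answer in zip(slots, answers):
        answer = answer.strip().lower()
        if kind == "control":
            passed = (answer == value.lower())
        else:
            guess = (value, answer)
    # Only trust the reading of the unknown word if the control word was right.
    if passed and guess is not None:
        votes.setdefault(guess[0], []).append(guess[1])
    return passed

slots = make_challenge("upon", "img-17", random.Random(0))
answers = ["upon" if kind == "control" else "morning" for kind, _ in slots]
votes = {}
print(grade(slots, answers, votes), votes)
```

Someone who types x's for both words fails the control check, so the junk answer is never recorded as a vote.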

Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by amichail on May 26th, 2007, 2:01pm
On a related note, try comparing the ESP Game with Google Image Labeler:

http://espgame.org/

http://images.google.com/imagelabeler/

Although Google Image Labeler is based on the ESP Game, its presentation is lacking.

The ESP Game looks like a game while Google Image Labeler looks like work.

I think Google tried to downplay the game aspect to avoid misleading its users.

Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by ThudanBlunder on May 28th, 2007, 6:11am
http://www.techdirt.com/articles/20070524/174116.shtml
has an interesting link about the ESP Game (http://www.jimschrempp.com/features/computer/googleimagelabeler.htm).

Title: Re: reCAPTCHA + why is CS slow on human computation?
Post by Icarus on Jun 3rd, 2007, 1:19pm
Given what he says about not finding these false keywords when searching for actual images, I'd guess that Google is fully aware of the problem, and isn't putting labels on pictures unless they are agreed on by more than two people.
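
To make that guess concrete, here is a minimal sketch of a rule that keeps a label only when more than two distinct players have entered it for the same image. This is my guess at such a filter, not Google's actual method; `accepted_labels` and `min_players` are hypothetical names.

```python
from collections import defaultdict

def accepted_labels(submissions, min_players=3):
    """submissions: iterable of (image_id, player_id, label) tuples.

    Returns {image_id: set of labels} backed by at least min_players
    distinct players ("more than two" means min_players=3).
    """
    supporters = defaultdict(set)
    for image_id, player_id, label in submissions:
        # A set ignores repeat submissions from the same player.
        supporters[(image_id, label.strip().lower())].add(player_id)
    accepted = defaultdict(set)
    for (image_id, label), players in supporters.items():
        if len(players) >= min_players:
            accepted[image_id].add(label)
    return dict(accepted)

subs = [
    ("cat.jpg", "p1", "cat"), ("cat.jpg", "p2", "Cat"), ("cat.jpg", "p3", "cat"),
    ("cat.jpg", "bot1", "accident"), ("cat.jpg", "bot2", "accident"),
]
print(accepted_labels(subs))
```

A pair of bots agreeing with each other earns points in the game, but under this rule their label never reaches the search index.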

His Tit-for-Tat comparison is telling. While these bots are cooperating with each other to get artificially high scores, they are dissenting from the larger project. In large-population interactions like this, there are ways for real cooperators to overcome the effect of cooperating dissenters, provided the dissenters are a small enough group.

Google could help by looking for those responding repetitiously and banning them.
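
One rough way to look for repetitious responders is to flag any player whose single most common label makes up most of their answers across many images. The function name and both thresholds below are assumptions for illustration; a real filter would need tuning so that honest players who often type common words are not banned.

```python
from collections import Counter, defaultdict

def repetitious_players(submissions, min_answers=20, max_share=0.5):
    """submissions: iterable of (image_id, player_id, label) tuples.

    Flags players with at least min_answers answers whose most common
    label accounts for more than max_share of them (hypothetical thresholds).
    """
    labels_by_player = defaultdict(list)
    for _image_id, player_id, label in submissions:
        labels_by_player[player_id].append(label.strip().lower())
    flagged = set()
    for player, labels in labels_by_player.items():
        if len(labels) < min_answers:
            continue  # too little data to judge this player
        top_count = Counter(labels).most_common(1)[0][1]
        if top_count / len(labels) > max_share:
            flagged.add(player)
    return flagged

bot = [(f"img{i}.jpg", "bot", "accident") for i in range(30)]
human = [(f"img{i}.jpg", "human", f"label{i}") for i in range(30)]
print(repetitious_players(bot + human))
```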


