wu :: forums (http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi)
riddles >> cs >> Ranking streaming items
(Message started by: TenaliRaman on Nov 1st, 2017, 9:52am)

Title: Ranking streaming items
Post by TenaliRaman on Nov 1st, 2017, 9:52am
N videos are streamed to you. Your goal is to rank the N videos. A simple way to do that is just give some score to each video and finally sort the videos by score to get the ranking.

Now there are constraints:
1. The videos are streamed to you one-by-one and it is just single pass
2. You have to provide positive integer scores
3. Once you have given score to an item, it cannot be changed

Note that, you know N upfront.

Given the constraints, how would you score the items so that you are able to score each video appropriately and the final ranking obtained is the one you wished for after going through the videos. Also, if r_i is the score given to ith ranked video. We would also like to minimize sum_i |r_i - r_(i+1)|.

---
I came up with this problem myself (based on some real world problem). By the way, I tried to make the problem as concise as possible above, I hope there is no confusion. I have some ideas on how to proceed to solve this, but thought it might be interesting to the group here.

Title: Re: Ranking streaming items
Post by towr on Nov 1st, 2017, 1:57pm
Is the goal a perfect ranking, or a "good enough" ranking?
Because to get a perfect ranking I think you need [hide]2N-1 possible scores, just so you can always insert a score between two existing ones.[/hide] But if "good enough" is good enough, you could use a probabilistic approach to reduce that significantly.

Title: Re: Ranking streaming items
Post by TenaliRaman on Nov 5th, 2017, 6:51am
That's pretty much what I had as well. A probabilistic approach would be interesting to explore too!

Title: Re: Ranking streaming items
Post by towr on Nov 5th, 2017, 7:48am
I think the biggest issue with a probabilistic approach is probably how you want to weigh the range of scores against the number of ranking mistakes (like duplicate scores, but not necessarily limited to those)

You could just pick a small fixed range of scores (say 100, or proportional to N), and try to insert videos as best you can (with possibly lots of duplicate scores). Most sites that rank things have a 5 star system, possibly with a decimal (so 41 scores if you start at 1 and end at 5 with increments of .1)

Whether you can estimate relative goodness of videos will also have a big effect. Say you've already seen a video with a score of 1 and one with a score of 100, then if you see a video that falls in between, can you tell (approximately) <i>where</i> in between? Is it more like a 30 or 70?
In this case an approach like library-sort might work: reserve some constant factor * N as space and insert videos as best you can, the extra space leaves you some wiggle room to insert videos that unexpectedly fall in between two previous videos. But there's still a risk of duplicate scores once there's no room left between two videos.

The quality of videos will probably follow a normal distribution, and even if we're limited to integers, you can use a logistic function to map one to the other (i.e. use more scores to represent the bulge). The more videos you see the better you'll be able to characterize the distribution and insert videos at the correct score.

Title: Re: Ranking streaming items
Post by Grimbal on Nov 6th, 2017, 3:45am
You could rank the videos relative to all the videos you have seen in the past, yielding a percentile.  That should give a score with a good distribution.  That is assuming the new videos follow the same distribution as what you used to see.  And that you have seen your share.

If you still get two equal scores, they are probably indistinguishable in quality.  

Title: Re: Ranking streaming items
Post by Grimbal on Feb 14th, 2018, 10:05am

on 11/01/17 at 09:52:40, TenaliRaman wrote:
Also, if r_i is the score given to ith ranked video. We would also like to minimize sum_i |r_i - r_(i+1)|.

That means minimize (r_1 - r_N).  Doesn't it?

Title: Re: Ranking streaming items
Post by towr on Feb 14th, 2018, 10:52am
Yeah, considering you sorted on r.

If you want to penalize large differences between adjacent items, we could change it to sum of squares, instead of sum of absolute difference.



Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board