Author |
Topic: Sum-of-pair score of an array faster than O(n^2) (Read 4575 times) |
|
Conga
Newbie
Posts: 2
|
|
Sum-of-pair score of an array faster than O(n^2)
« on: Feb 29th, 2012, 5:51am » |
Quote Modify
|
Assume you have an array (I1, I2, ...., In) of characters (C1, C2,...., Cr) and a scoring function between such characters s(Ix, Iy). There are n*(n-1)/2 possible comparisons and a simple all-against-all sum-of-pair algorithm could easily be done in O(n^2). Now....assuming you have the counted numbers of each character within the array (#C1, #C2,...., #Cr). Is it possible to get the sum-of-pair score faster than O(n^2), maybe even O(n)? Thanks a lot for the help!
|
|
IP Logged |
|
|
|
william wu
wu::riddles Administrator
Gender:
Posts: 1291
|
|
Re: Sum-of-pair score of an array faster than O(n^
« Reply #1 on: Mar 9th, 2012, 6:21pm » |
Quote Modify
|
I am assuming that the scoring function is symmetric, which is implied by the statement that there are C(N,2) possible comparisons. Here is a start, although I'm guessing it is not smart enough to meet what is desired. The runtime is O(r^2); more precisely, r*(r-1)/2 + r log r, where the latter summand is due to sorting the histogram. Code: (* provided data *) s[x_, y_] := x^2 + y^2; (* example symmetric scoring function *) SeedRandom[0]; data = RandomInteger[{0, 9}, {30}]; hist = SortBy[Tally@data, Last]; (* sorting the histogram *) (* accumulate sum of all pairs, moving from one bin to the next *) sum = 0; len = Length@hist; For[k = 1, k <= len - 1, k++, sum += Binomial[Last@hist[[k]], 2]*s[First@hist[[k]], First@hist[[k]]]; (* case: s[x,x] *) subsum = 0; (* case: s[x,y] where x != y *) For[j = k + 1, j <= len, j++, subsum += Last@hist[[j]] *s[First@hist[[k]], First@hist[[j]]] ]; sum += Last@hist[[k]]*subsum; ] sum // N |
| Sample output: data={7, 0, 8, 2, 1, 5, 8, 0, 6, 7, 2, 1, 0, 6, 1, 2, 8, 6, 5, 5, 8, 4, 5, 9, 0, 6, 0, 0, 9, 3} hist={{3, 1}, {4, 1}, {7, 2}, {9, 2}, {1, 3}, {2, 3}, {5, 4}, {6, 4}, {8, 4}, {0, 6}} sum//N = 23200 which matches the brute-force all pairs number: Code: (* compare with sum of all pairs *) Plus @@ (s[First@#, Last@#] & /@ Subsets[data, {2}]) // N = 23200. |
|
|
« Last Edit: Mar 9th, 2012, 6:47pm by william wu » |
IP Logged |
[ wu ] : http://wuriddles.com / http://forums.wuriddles.com
|
|
|
Conga
Newbie
Posts: 2
|
|
Re: Sum-of-pair score of an array faster than O(n^
« Reply #2 on: Mar 14th, 2012, 9:25am » |
Quote Modify
|
Hey William, thanks a lot for the reply! You are correctly assuming that the scoring function is symmetric. I was suspecting that it can be done in O(r^2) using binomials, but so far I was not able wrap my head around it. Not sure if it is possible in O(n), but your approach is already something that will help me a lot! Thanks a lot! Conga
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Sum-of-pair score of an array faster than O(n^
« Reply #3 on: Mar 14th, 2012, 9:48am » |
Quote Modify
|
You can't do better than O(r^2) in the worst case, because the score for every distinct pair of characters may be distinct, and so you need to use all r*(r-1)/2 of them in the worst case.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
|