Author |
Topic: Efficient Datastructure (Read 1258 times) |
|
techforn10
Newbie
Gender:
Posts: 2
|
|
Efficient Datastructure
« on: Jul 7th, 2007, 10:00am » |
Quote Modify
|
Company A serves millions of requests per day. Give a efficient method to return two character string country code given IP address as input.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Efficient Datastructure
« Reply #1 on: Jul 7th, 2007, 11:16am » |
Quote Modify
|
Well, presumedly the input and output is known in advance, so you can spend quite some power to find the most efficient mapping. I'd be hardpressed to say a priori what that mapping will be.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
aki_scorpion
Newbie
Gender:
Posts: 13
|
|
Re: Efficient Datastructure
« Reply #2 on: Jul 7th, 2007, 11:28am » |
Quote Modify
|
hmmm If the input and output are known is advance and if they are limited why can we use 2 hash functions for every ip so that query will take only o(1) time.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Efficient Datastructure
« Reply #3 on: Jul 7th, 2007, 11:50am » |
Quote Modify
|
The range of ip numbers is limited, that much is a given. But O(1) doesn't say much in itself; if each request takes a second it's too long, even if it's a constant amount of time. We have some 2^32 IPv4 addresses, and some 200 countries. Hmm, I wonder if there's a list somewhere, so we can actually do it, rather than only try to figure out an approach.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Efficient Datastructure
« Reply #4 on: Jul 7th, 2007, 4:00pm » |
Quote Modify
|
I think IP addresses are attributed to countries in ranges. Storing each IP address would be overkill. Maybe a Btree would be appropriate?
|
|
IP Logged |
|
|
|
techforn10
Newbie
Gender:
Posts: 2
|
|
Re: Efficient Datastructure
« Reply #5 on: Jul 8th, 2007, 9:16am » |
Quote Modify
|
How about using Trie?? Vague idea: I was thinking of building a Trie of network part of IP addresses separated by 2 letter country code. So on query just traverse through the Trie matching network part of octets then just return the data in leaf node which contains country code.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Efficient Datastructure
« Reply #6 on: Jul 8th, 2007, 10:22am » |
Quote Modify
|
The (free) geolite database from maxmind contains just under 100000 ip-range - country pairs. And boasts 98% accuracy. (Their commercial database has over 99% accuracy) So, that's about 500 ranges per country. A tree of some sort would seem an obvious solution. But I wonder if it can't be done with fewer costly hops through memory.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
mad
Junior Member
Posts: 118
|
|
Re: Efficient Datastructure
« Reply #7 on: Mar 9th, 2008, 3:44am » |
Quote Modify
|
I am not very sure of this, but if it follows a classful addressing, can we use a mask to find the class the given address belongs to. Then, if it is a class A address, then search the class A tree only , if class B, search in its tree and so on. Not very sure of this approach. But, using a trie to store it as digits from 0-9 rather than binary tree might be efficient. Or can we store a tree of intervals and check if a given IP address falls in a given range/interval and output the corresponding country.
|
« Last Edit: Mar 9th, 2008, 4:33am by mad » |
IP Logged |
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Efficient Datastructure
« Reply #8 on: Mar 12th, 2008, 10:47am » |
Quote Modify
|
on Mar 9th, 2008, 3:44am, mad wrote:But, using a trie to store it as digits from 0-9 rather than binary tree might be efficient. |
| IP addresses being 4 bytes, you should rather go 256-way at each level, or maybe 16-way.
|
|
IP Logged |
|
|
|
Hippo
Uberpuzzler
Gender:
Posts: 919
|
|
Re: Efficient Datastructure
« Reply #9 on: Mar 12th, 2008, 1:18pm » |
Quote Modify
|
on Jul 7th, 2007, 11:28am, aki_scorpion wrote:so that query will take only o(1) time. |
| I should comment this ... there is a big difference between O(1) and o(1) ... There is no algorithm working in o(1) time. (At least you cannot call such algorithm.)
|
|
IP Logged |
|
|
|
|