wu :: forums
« wu :: forums - Efficient Datastructure »

Welcome, Guest. Please Login or Register.
May 17th, 2024, 12:39am

RIDDLES SITE WRITE MATH! Home Home Help Help Search Search Members Members Login Login Register Register
   wu :: forums
   riddles
   cs
(Moderators: ThudnBlunder, Grimbal, Icarus, towr, SMQ, william wu, Eigenray)
   Efficient Datastructure
« Previous topic | Next topic »
Pages: 1  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print
   Author  Topic: Efficient Datastructure  (Read 1258 times)
techforn10
Newbie
*





   


Gender: male
Posts: 2
Efficient Datastructure  
« on: Jul 7th, 2007, 10:00am »
Quote Quote Modify Modify

Company A serves millions of requests per day. Give a efficient method to return two character string country code given IP address as input.
IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: Efficient Datastructure  
« Reply #1 on: Jul 7th, 2007, 11:16am »
Quote Quote Modify Modify

Well, presumedly the input and output is known in advance, so you can spend quite some power to find the most efficient mapping.
I'd be hardpressed to say a priori what that mapping will be.
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
aki_scorpion
Newbie
*





   


Gender: male
Posts: 13
Re: Efficient Datastructure  
« Reply #2 on: Jul 7th, 2007, 11:28am »
Quote Quote Modify Modify

hmmm
 
If the input and output are known is advance and if they are limited why can we use 2 hash functions for every ip so that query will take only o(1) time.
IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: Efficient Datastructure  
« Reply #3 on: Jul 7th, 2007, 11:50am »
Quote Quote Modify Modify

The range of ip numbers is limited, that much is a given. But O(1) doesn't say much in itself; if each request takes a second it's too long, even if it's a constant amount of time.
 
We have some 2^32 IPv4 addresses, and some 200 countries.  
Hmm, I wonder if there's a list somewhere, so we can actually do it, rather than only try to figure out an approach.
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: Efficient Datastructure  
« Reply #4 on: Jul 7th, 2007, 4:00pm »
Quote Quote Modify Modify

I think IP addresses are attributed to countries in ranges.  Storing each IP address would be overkill.  Maybe a Btree would be appropriate?
IP Logged
techforn10
Newbie
*





   


Gender: male
Posts: 2
Re: Efficient Datastructure  
« Reply #5 on: Jul 8th, 2007, 9:16am »
Quote Quote Modify Modify

How about using Trie??
 
Vague idea:  
I was thinking of building a Trie of network part of IP addresses separated by 2 letter country code. So on query just traverse through the Trie matching network part of octets then just return the data in leaf node which contains country code.
IP Logged
towr
wu::riddles Moderator
Uberpuzzler
*****



Some people are average, some are just mean.

   


Gender: male
Posts: 13730
Re: Efficient Datastructure  
« Reply #6 on: Jul 8th, 2007, 10:22am »
Quote Quote Modify Modify

The (free) geolite database from maxmind contains just under 100000  ip-range - country pairs. And boasts 98% accuracy. (Their commercial database has over 99% accuracy)
So, that's about 500 ranges per country.
 
A tree of some sort would seem an obvious solution. But I wonder if it can't be done with fewer costly hops through memory.
IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB
mad
Junior Member
**





   


Posts: 118
Re: Efficient Datastructure  
« Reply #7 on: Mar 9th, 2008, 3:44am »
Quote Quote Modify Modify

I am not very sure of this, but if it follows a classful addressing, can we use a mask to find the class the given address belongs to. Then, if it is a class A address, then search the class A tree only , if class B, search in its tree and so on. Not very sure of this approach.
 
But, using a trie to store it as digits from 0-9 rather than binary tree might be efficient.
 
Or can we store a tree of intervals and check if a given IP address falls in a given range/interval and output the corresponding country.
« Last Edit: Mar 9th, 2008, 4:33am by mad » IP Logged
Grimbal
wu::riddles Moderator
Uberpuzzler
*****






   


Gender: male
Posts: 7527
Re: Efficient Datastructure  
« Reply #8 on: Mar 12th, 2008, 10:47am »
Quote Quote Modify Modify

on Mar 9th, 2008, 3:44am, mad wrote:
But, using a trie to store it as digits from 0-9 rather than binary tree might be efficient.

IP addresses being 4 bytes, you should rather go 256-way at each level, or maybe 16-way.
IP Logged
Hippo
Uberpuzzler
*****





   


Gender: male
Posts: 919
Re: Efficient Datastructure  
« Reply #9 on: Mar 12th, 2008, 1:18pm »
Quote Quote Modify Modify

on Jul 7th, 2007, 11:28am, aki_scorpion wrote:
so that query will take only o(1) time.

 
I should comment this ... there is a big difference between O(1) and o(1) ... There is no algorithm working in o(1) time. (At least you cannot call such algorithm.)
IP Logged
Pages: 1  Reply Reply Notify of replies Notify of replies Send Topic Send Topic Print Print

« Previous topic | Next topic »

Powered by YaBB 1 Gold - SP 1.4!
Forum software copyright © 2000-2004 Yet another Bulletin Board