

Title: Worm Propagation Post by william wu on Feb 3^{rd}, 2003, 12:04am WORM PROPAGATION A hacker is attacking a computer network. Each computer on the network is connected to various other computers. The hacker releases a worm on a source computer. When a worm infects a computer, that computer can propagate a copy of the worm to a connected computer at a rate of once per second. Suppose the hacker knows the layout of the network; i.e., he knows which computers are connected to which. How should the worm proceed to infect the whole network as quickly as possible? Also, if possible, describe an algorithm for computing the minimum time till total network infection. (Admittedly a computer science background will help here, but it is not absolutely necessary.) Note: Writing credits to Yosen Lin. Inspiration was the Sapphire worm that hit the Net last week. 

Title: Re: Worm Propagation Post by towr on Feb 3^{rd}, 2003, 1:30am ... [hide]Order all connected computers as a tree, with the source computer as the root, and make sure the tree is as shallow as possible. Model it so an infected computer 'reinfects' itself, and another computer, making a binary tree, where each level is one second further in time. And of course al connection in the tree have to be there 'irl' between the computers.[/hide]... 

Title: Re: Worm Propagation Post by Chronos on Feb 3^{rd}, 2003, 5:58pm towr: Where you have [hide]and make sure the tree is as shallow as possible.[/hide], isn't that sort of begging the question? Does there exist a standard algorithm for doing that, which any computer scientist would know already? Or did you just intend it as a rewording of the problem? 

Title: Re: Worm Propagation Post by towr on Feb 4^{th}, 2003, 12:02am on 02/03/03 at 17:58:02, Chronos wrote:
There is however the way of making every possible tree, and choosing the one that befits the criterium.. Bruteforce is usually my first solution, smart solutions follow when it's apparant bruteforce would take too long. There has to be a better way, but I'm not really sure yet what it is.. 

Title: Re: Worm Propagation Post by william wu on Feb 4^{th}, 2003, 2:27am Breadthfirst search (BFS) can give you the shallowest possible tree. This basically organizes the vertices according to how many steps away they are from the source vertex. Yes, there's a cheaper way to do this than expand all possibilities ... Somewhere you need to describe how to choose the next vertex the worm should infect in order to minimize total infection time. This next vertex should satisfy some metric. Drawing some simple example networks should help. 

Title: Re: Worm Propagation Post by wolfgang on Feb 4^{th}, 2003, 8:21am Coming from a very amateur programmer: [hide]Create the tree by starting with the host computer and assigning an ID to every computer it's attached to. Move to those computers and assign an ID to each one they're attached to that doesn't already have an ID. Make it a 2 part number such as 2.3 or 2'3 to identify both parent and child. Continue till all computers have an ID. (towards the end I suppose it would speed things up to start with the computers that haven't been IDed and work backwards). Now to infect. Instruct each computer to search for the longest ID that starts with its own ID, to reach the most distant computers quickly. Continue until all computers it's been assigned have been infected. Then move to other computers it wasn't assigned, on the assumption that some computers will be offline or virus protected and they'll break the chain. If you make each infected computer start by sending a message to its parent to let it know the infection was successful, then the next step would be to either retry those that didn't respond or start infecting their children. Those that reach the end of the tree should start by trying to infect their cousins or uncles. If they ever find one that wasn't infected, then move up to its parent and grandparent... The total time to total infection will be one second for each step to the longest branch, or a couple seconds longer if there are computers offline or protected.[/hide] 

Title: Re: Worm Propagation Post by wolfgang on Feb 4^{th}, 2003, 8:32am Obviously, I made one very silly mistake. I really should register so I can undo those mistakes before anybody else sees them, but I keep thinking this will be my very last mistake. Anyway [hide] Please ignore the suggestion that a computer infect its grandchildren in the event the computer it tried to infect doesn't respond. Obviously it has no direct connection to its grandchildren or they wouldn't be grandchildren. Maybe the tree should have doubleredundancy or it should start sending out lists of the computers it missed so other computers can see whether they can make the bypass, [/hide] 

Title: Re: Worm Propagation Post by wolfgang on Feb 4^{th}, 2003, 8:53am Wrong again, but at least I'm registered, so this will be my last post. [hide]The total time will be the maximum sum of all the address parts for one of the computers at the top of the tree. If a tree is very broad, the worm will spread sideways much more slowly than it spreads up. So the fastest algorithm would be to weigh each branch. Start with the branches that will take the longest, not just the tallest branch. And those branches will have to be sorted in reverse order, sorting the top of the tree first and working your way down. [/hide] I should have known this would be more complicated than it looked. It's too hard for me so I'll shut up after one more comment. [hide]My treecreating algorithm is no where near optimal. As soon as A infects B, A and B should be equally regarded to decide which will infect C if it's connected to both. And that decision will be based both on equalizing A's and B's workloads (If A connects 100 computers and B connects 10, then you don't want A wasting time infecting computers that B could handle), as well as the importance or weight of C's vertex. But that means the whole tree's shape may change, so a topdown weight sorting algorithm won't work as the way we figure out C's importance. [/hide] 

Title: Re: Worm Propagation Post by william wu on Feb 25^{th}, 2003, 12:31am on 02/04/03 at 08:53:39, wolfgang wrote:
You're not alone; it turns out our solution was flawed. We're stuck too. 

Title: Re: Worm Propagation Post by william wu on Mar 9^{th}, 2003, 2:13am I eventually resorted to googling. We have rediscovered what is typically known as the gossip problem. However, my searches were unsuccessful toward finding a general algorithm for minimal gossip times on a graph. One paper I saw found minimal gossip times only for particular graph structures, such as cliques, trees, and lines; however, I only gave it a cursory survey. Anyways, this problem seems far harder than I expected, and is probably NPhard because of the tradeoff between depth and bushiness. 

Title: Re: Worm Propagation Post by william wu on May 12^{th}, 2003, 12:21am An interesting email I got from an Edward Rustin regarding this problem. Intuitively it seems right to head for the center of mass, but of course that's not a proof. Hi, This isn't exactly a conventional method for solving the network propagation problem but it does work. Unfortunatly I doubt that an algorithm or a strictly formal proof can be derived from this. All you need is a lot of string and a bit of time. Since our hacker knows the layout of the network already then he can do this. Bascily we create a network diagram using string. Every computer is represented by a knot in our string. Whenever two parts of the network come togeather (say at a router or a gateway machine) we tie two pieces of string togeather. Make sure that the knots that represent hubs or other 'transparent' pieces of hardware are easy to tell apart from the knots that represent computers. The only tricky part is to ensure that the spaceing between knots is consistent. Once we have our 'string map' of the network its easy to find the 'center' of the network. pick up the string at any of its loose ends (the loose ends being computers on the edge of the network) and let the string hang naturally. The lowest point is the computer 'furthest' away from the computer that you are holding in your fingers. If you then hold that lowest knot and pull the two so that the string becomes tight then we've just mapped out the shortest path between those two computers. Its a trivial exercise to count the number of computers between these two points (and, should we wish, map it onto our network diagram). From this we can work out which computer is centered between those two points. We now need to perform a sort of 'string recursion' algorithm. Still holding our bottom knot let go of the top one. This will now show us the point furthest away from our previous bottom. Take hold of this point and drop the top one. This, once again, gives us the furthest point away from that previous point. Repeat untill we oscillate between two points, each point being the furthest point away from each other. This is the longest path on our network. An easier way to visualise this is as follows: Pick a point on the network  P1 Let the network hang from that point, lowest point is P2 While (Pn != Pn+2) AND (Pn+1 != Pn+3) Hang network from Pn Lowpoint is Pn+1 Repeat with Pn+1 Pn, Pn+1 is now the longest path. This makes it trivial to determine where the network's 'center' is, being the computer that is in the middle of these two points. This is therefore the best computer to infect first. The propagation time is thus the ceiling of Distance(P1, P2)/2. This makes it trivial to determin where the network's 'center' is, being the computer that is in the middle of these two points. This is therefore the best computer to infect first. The propagation time is thus the ceiling of Distance(P1, P2)/2. The best propagation tactic for the worm is for each computer to propagate first to the computers on this line between the two points and then on subsequent ticks to propagate to any other uninffected computer that it is connected to. I think that's so at least. Even if that isn't true, my string computer should go someway to solving this problem. Edward Rustin 

Title: Re: Worm Propagation Post by towr on May 12^{th}, 2003, 1:06am I'm not sure it will work, because the 'distance' (time) between two computers depends on the previous steps taken by the worm. From one computer the worm can only infect one other computer per second, so all computers connected to that computer are at a different distance which you can't determine before you determine the order in which you infect them. 

Title: Re: Worm Propagation Post by TenaliRaman on Nov 2^{nd}, 2003, 8:57am ::[hide] Why not simply apply Djikstra Algorithm (from source node to every node) and then construct a tree whose root node is the source node. [/hide]:: 

Title: Re: Worm Propagation Post by towr on Nov 2^{nd}, 2003, 9:48am Could you elaborate on how that would work? 

Title: Re: Worm Propagation Post by TenaliRaman on Nov 2^{nd}, 2003, 11:50am Pretty simple (i think!) Construct the graph of the network. Assign each edge of the graph with weight 1 (we simply are interested in minimising the hopcounts here). Now apply Djikstra algorithm from source node to every other node and build a tree. Consider this example as an illustration, 

Title: Re: Worm Propagation Post by towr on Nov 2^{nd}, 2003, 12:19pm Ah.. So basicly it won't work here, imo.. The distance between nodes depends on the order childnodes are chosen.. (because the worm can only be send to one node every second) 

Title: Re: Worm Propagation Post by TenaliRaman on Nov 2^{nd}, 2003, 12:32pm In the simple graph i proposed above, the time taken for worm propagation is not much significant. However if my idea is to produce a nlevel tree with m childnodes for each node on an average then it can achieve significant time minimisation. 

Title: Re: Worm Propagation Post by mindcrime on Jul 28^{th}, 2004, 10:28am Well guys, I am new to this field but let me exercise my feeble mind over this topic. The hacker could create a minimal spanning tree from the network and start the attack from the root. The minimal spanning tree could be the traditional one as we all know, but it should also be "minimal" in another sense wherever possible. The other sense is that as far as possible , the weights of the edges should be nondecreasing. Clearly, by choosing the appropriate vertex which satisfies this property in any spanning tree as the root, this is possible. So the solution can be very crisply stated as follows... "Find the minimal spanning tree and start from the vertex from which every edge is ordered to the leaves in nondecreasing order of their weights". ::) 

Title: Re: Worm Propagation Post by John_Gaughan on Sep 2^{nd}, 2004, 10:49am Minimal spanning tree may be a good start but will not, by itself, yield a solution. Keep in mind that time is a factor, not just location on the network. If a node has ten children, it will infect them, one a second, until it is done. This introduces complexity to the algorithm that, as William said, probably makes it NPhard. That and the fact that we do not know the network layout ahead of time, so we have to generalize it enough to work in all cases. All of my CS education pertaining to trees makes assumptions that go against this program. I believe the answer lies not in traditional tree/graph theory but in some new angle, a new way of approaching the problem that flies in the face of conventional thought. One parameter not described in the original problem is whether a copy of the worm has knowledge about which nodes are already infected. Does a worm infect its children until done, then quit? Does it know about nodes not connected to itself? E.g. if the root infects a node, which infects another node, does that node know about its "cousin" nodes? Does the algorithm take that into account? Is infection random? How much a priori knowledge do we have? 

Title: Re: Worm Propagation Post by towr on Sep 2^{nd}, 2004, 11:40am on 09/02/04 at 10:49:55, John_Gaughan wrote:


Title: Re: Worm Propagation Post by rmsgrey on Oct 24^{th}, 2004, 10:03am Looking back over some of the old problems... I'm curious as to whether this problem can even be shown to be in NP  I'm a little rusty on the theory, but don't NP problems have to be able to evaluate prospective solutions in polynomial time? How would you go about recognising a genuine minimum time infection route? Certainly a naive brute force method requires comparison of all possible attacks to determine the optimum one. 

Title: Re: Worm Propagation Post by towr on Oct 24^{th}, 2004, 1:47pm I'm pretty sure it has to be NP. Because as it isn't polynomial, it's nonpolynomial.. There's only two classes. You might be thinking of NPcomplete (a subclass of NPproblem) 

Title: Re: Worm Propagation Post by rmsgrey on Oct 25^{th}, 2004, 6:21am I was taught that NP was Nondeterministic Polynomial  in other words solvable by a nondeterministic device in polynomial time, unlike P which are solvable by deterministic devices in polynomial time, L which are solvable using logarithmic space, PSPACE, NPSPACE which are solvable using polynomial space by deterministic and nondeterministic devices respectively. The simplest type of NP algorithm is one which nondeterministically generates a possible solution, and then tests it in polynomial time  the basic idea of a nondeterministic machine is that, whenever it hits a "choose" instruction, it makes all possible choices simultaneously, and then evaluates the outcome of each choice in parallel. The machine fails if all choices fail, and succeeds if any parallel version succeeds. I guess you could create a machine that, for a given infection time, checks the nondeterministically generated (in time O(n^{2})) infection scheme in time O(n) to see if it takes the appropriate amount of time to infect the network. Having that machine, you could then create another machine that called it repeatedly with times 1 to n until a success gets returned. Assuming I haven't made any egregious errors, the total running time would be O(n^{3}), putting the problem in NP. I guess I've managed to answer my own question then... 

Title: Re: Worm Propagation Post by towr on Oct 25^{th}, 2004, 8:23am on 10/25/04 at 06:21:20, rmsgrey wrote:
I must be more affected by my cold than I reallize.. At least I got the 'non' and 'polynomial' parts right ;) 

Title: Re: Worm Propagation Post by Aryabhatta on Oct 25^{th}, 2004, 6:22pm This problem seems to be the same as the Minimum Broadcast Time problem, which is claimed to be NPComplete (no proof, but a reference) in the first few pages of the following paper (http://www.crab.rutgers.edu/~guyk/pub/broad/nabs.ps). Now that we 'know' that it is NPComplete, it would be interesting to come up with a proof of that. 

Title: Re: Worm Propagation Post by Aryabhatta on Oct 26^{th}, 2004, 7:56pm Supposedly, 3 dimensional matching is reducible in a straight forward way to the minimum broadcast problem. 3 Dimensional Matching: Given a set S of 3n elements and a family F of subsets of S, such that each element of F is of cardinality 3, does there exist a subset H of F of cardinality n such that the union of elements of H is S. (basically given some m > n, triples from S, can you find n triples which cover S. Since S is of size 3n, those n triples must be mutually disjoint, it is also known as the Exact Three Cover.) 3 Dimensional Matching is known to be NPComplete. 

Title: Re: Worm Propagation Post by Sjoerd Job Postmus on Mar 10^{th}, 2005, 2:56pm I'm an infected computer, I will take these steps: 1. Enumerate the computers I can infect. 2. Choose uninfected computer that only I can infect, with the largest amount of links. 3. If that fails  all computers are somehow attached to other infects  choose one with as many links as possible, with the least amount of infected links, with the largest total of infect busyness. 4. If that fails, I can no longer infect anything, so I'm done :(! 

Title: Re: Worm Propagation Post by DewiMorgan on Jun 2^{nd}, 2006, 7:38pm Although it's NPhard, the fun comes from constructing an effective, nearoptimal algorithm for a blind virus to spread most economically. Longestpath and centerofmass and mostgrandchildren solutions all provide good general answers, but fail in some situations. Consider a situation where you are attacking "target", which lies between a hub, connected in parallel to X machines, none of which are interconnected, and to a chain of Y machines. X=HubtargetABCDE...Y Clearly, if X>Y, then it's better to infect the hub machine first, so it can start infecting its children as soon as possible. This is despite the fact that it's not the "center" of the network, and it's not the longest path. If X<Y, then it's best to infect the chain first, even though the first link only has a single outside connection, because that link leads to the majority of the network, which we want to infect. It rapidly becomes more complicated when you start linking the leaves together, of course. Any solution should be recursive (that is, each infected child should be able to run it too). And unless the infected machines broadcast which others they infect, and the machines can deliberate amongst eachother about who should infect whom, I's assume all machines are working blind. As much as possible, though, each machine needs to mark alreadyinfected machines in its internal map, remove the links between them, remove any subsequently isolated machines, and repeat the process each second (after it had connected to all its own children, it would consider itself an isolated node, and remove itself from the network, stopping its own action). But guessing which others (other than its parent and the ones it infected itself) were infected, could be tricky. This is because I think randomness may need to come into play. The chance of it selecting a target node could be divided by the number of available nodes, and multiplied by the suspected number of noninfected links from that node, for instance. Also want to include a factor to compensate for the probability of another known infected node selecting that item. Possibly subtract 1/(number of nodes available to infected node) from the probability of selecting that item, for each infected node that shares it, capping at just over zero, but not marking it as actually infected unless the probability was 1/1. Then scale all probabilities so the total probability of selecting SOME node is 1. This algorithm would mean that nodes right in the center of a mass of infected machines might not get infected until one of the infected machines had exhausted all other probabilities, but that's probably not a bad thing. I think the above is pretty close to optimal, though it would be suboptimal for the first node in the Y>X state above. It would also fail sometimes for: xxinfectedy ...since it would sometimes choose y first. Probably the choices should only be random where there are conflicts with other infected hosts, and should be deterministic for unshared nodes. which brings us back to how to best choose deterministically. THe weighting should include a measure of the netsork distance from other infected notes (other than passing through the current host), too. And probably a measure of the size of the tree behind it divided by the number of knowninfected hosts on that tree. 

Title: Re: Worm Propagation Post by Shiladie on Nov 26^{th}, 2006, 6:10pm Very simple, most probably flawed logic while looking at this after confusing ever myself with bad math here is an example for 7 seconds with infinit computers available infecting the maximum computers 0: x 1: x, 1x1 2: x, 2x1,1x2 3: x, 3x1,3x2,1x3 4: x, 4x1 6x2,4x3,1x4 5: x, 5x1,10x2,10x3,5x4,1x5 6: x, 6x1,15x2,20x3,15x4,6x5,1x6 7: x, 7x1,21x2,35x3,35x4,21x5,7x6,1x7 after 7 seconds you infect 127 computers and now its so obvious, the equation for what i am now sure is probably the completely wrong problem is: s = seconds assumeing you start at 0 x = number of computers infected 2^s1 = x this is assumeing perfect network layout and that the virus is smart enough to not infect the same computer more then once. which bwe have to assume to get any sort of real answer for this riddle without more information about how the network is layed out the rest is theorycraft about 'weighting the larger branches of the network more heavily' and such and now im ready for the flame about how i'm wrong and missed some big note, which is probably true because this just is too simply to have so many people thinking... 

Title: Re: Worm Propagation Post by balakrishnan on Jan 10^{th}, 2007, 8:04pm Is this solved yet? 

Title: Re: Worm Propagation Post by towr on Jan 11^{th}, 2007, 1:51am on 01/10/07 at 20:04:42, balakrishnan wrote:
However a near optimal solution may be better in practice (especially if you don't know the entire layout beforehand). And there's nearly always room for improvement at that end. 

Title: Re: Worm Propagation Post by CowsRUs on Feb 12^{th}, 2007, 9:19am I would infect other computers then infect more then infect more than start spamming emails and hope some idiot clicks on the spam with the virus 

Title: Re: Worm Propagation Post by bosssssss3 on Mar 21^{st}, 2007, 10:02am can sumone please add more smileys plz :/ 

Title: Re: Worm Propagation Post by Hippo on Apr 30^{th}, 2007, 4:26am Sorry I had loooked for on all pages missing the title one ... this belongs here... I will show that minimal gosip problem (worm propagation) is NP hard. I will show that deciding whether mingosip=k providing you know mingosip in k,k+1 is NP complete. I will show a reduction from the SAT CNF problem. You have m clauses using n different variables alltogether. Let k be chosen later. I will construct graph with mingosip in k,k+1 having mingosip = k iff given formula is satisfable: Construction will be started from starting vertex 0 for the gosip problem. Let vertex 0 and newly created vertices x_{i},neg x_{i} creates triangle. Let vertex i is connected to both x_{i} and neg x_{i}. And let us create path of length ki2 edges starting in vertex i. Last vertex of the path being blotted before time k+1 forces that vertex i is blotted at time i+1. In that case either x_{i} or neg x_{i} is blotted in time i. Furthermore for ith literal vertex (x_{i} or neg x_{i} vertex) we create hypercube sceletton connected to it such that under each literal there is at least m leaves in the last level. This is the only bound for k (and it is lower bound). And the last level is blottable in time k2 iff the literal vertex is selected and in time k iff it is not selected and the last vertex of path under vertex i is reachable in time k. For lth clause create and connect vertices c_{l},d_{l} and for each literal y connect c_{l} to lth leave of the y tree. Vertex c_{l} must be blotable in time k1 to be able to blot d_{l} before time k. ... WQD 

Powered by YaBB 1 Gold  SP 1.4! Forum software copyright © 20002004 Yet another Bulletin Board 