I've been implementing a network protocol, and I require packets to have unique identifiers. So far, I've just been generating random 32-bit integers, and assuming that it is astronomically unlikely that there will be a collision during the lifespan of a program/connection. Is this generally considered an acceptable practice in production code, or should one devise a more complex system to prevent collisions?
Software Engineering Stack Exchange is a question and answer site for professionals, academics, and students working within the systems development life cycle who care about creating, delivering, and maintaining software responsibly. Join them; it only takes a minute:
Sign up
Sign up
Here's how it works:
- Anybody can ask a question
- Anybody can answer
- The best answers are voted up and rise to the top
|
|
Beware the birthday paradox. Suppose you are generating a sequence of random values (uniformly, independently) from a set of size N (N = 2^32 in your case). Then, the rule of thumb for the birthday paradox states that once you have generated about sqrt(N) values, there is at least a 50% chance that a collision has occurred, that is, that there are at least two identical values in the generated sequence. For N = 2^32, sqrt(N) = 2^16 = 65536. So after you have generated about 65k identifiers, it is more likely that two of them collide than not! If you generate an identifier per second, this would happen in less than a day; needless to say, many network protocols operate way faster than that. |
|||
|
|