Software Engineering Stack Exchange is a question and answer site for professionals, academics, and students working within the systems development life cycle who care about creating, delivering, and maintaining software responsibly. Join them; it only takes a minute:

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I've been implementing a network protocol, and I require packets to have unique identifiers. So far, I've just been generating random 32-bit integers, and assuming that it is astronomically unlikely that there will be a collision during the lifespan of a program/connection. Is this generally considered an acceptable practice in production code, or should one devise a more complex system to prevent collisions?

share|improve this question
2  
Why is using a sequential integer not going to cut it? – whatsisname 4 hours ago
    
Why don't you just use an incrementing int? GUIDs, which are designed to have the uniqueness properties you describe, are 128 bits in size, not 32. – Robert Harvey 4 hours ago
    
Because multiple computers will need to create IDs that are unique among the set of IDs created by any connected computer – Phoenix 4 hours ago
    
Can you use GUIDs? – Robert Harvey 4 hours ago
2  
Alternatively, assign a channel number to each connected computer, and use an incrementing sequence id. The two numbers combined (with the channel number taking up the high-order bits) becomes your new unique id. – Robert Harvey 4 hours ago

Beware the birthday paradox.

Suppose you are generating a sequence of random values (uniformly, independently) from a set of size N (N = 2^32 in your case).

Then, the rule of thumb for the birthday paradox states that once you have generated about sqrt(N) values, there is at least a 50% chance that a collision has occurred, that is, that there are at least two identical values in the generated sequence.

For N = 2^32, sqrt(N) = 2^16 = 65536. So after you have generated about 65k identifiers, it is more likely that two of them collide than not! If you generate an identifier per second, this would happen in less than a day; needless to say, many network protocols operate way faster than that.

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.