In this 2003 article by Stephen Figgins on linuxdevcenter.com, Bram Cohen's BitTorrent is described as using the "Fix Everything" design pattern.

A less common approach, one that makes BitTorrent harder to grasp but worth studying, is Cohen's use of idempotence. A process is idempotent when applying it more than once causes no further changes. Cohen says he uses a design pattern he calls "Fix Everything": a function that can react to a number of changes without really noting what all it might change. He explains, "you note the event that happened, then call the fix everything function which is written in this very idempotent manner, and just cleans up whatever might be going on and recalculates it all from scratch." While idempotence makes some difficult calculations easier, it also makes things a little convoluted: it's not always clear what a call is going to change, if anything. But you don't need to know in advance; you are free to call the function just to be on the safe side.
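To make the description concrete, here is a minimal sketch of what I understand the pattern to look like (all names are hypothetical; this is not Cohen's actual code): handlers record what happened, then a single idempotent function recomputes the derived state from scratch.

```python
# Hypothetical sketch of the "Fix Everything" pattern: event handlers
# note the event, then one idempotent function recomputes all derived
# state from scratch instead of applying targeted incremental updates.

class Downloader:
    def __init__(self):
        self.peers = set()     # primary state: peers we know about
        self.active = set()    # derived state: peers we should talk to
        self.max_active = 4

    def on_peer_joined(self, peer):
        self.peers.add(peer)   # note the event that happened...
        self.fix_everything()  # ...then call the fix everything function

    def on_peer_left(self, peer):
        self.peers.discard(peer)
        self.fix_everything()

    def fix_everything(self):
        # Idempotent: recomputes the active set from scratch every time,
        # so calling it twice in a row changes nothing.
        self.active = set(sorted(self.peers)[: self.max_active])

d = Downloader()
d.on_peer_joined("a")
d.on_peer_joined("b")
d.fix_everything()  # safe to call again "just to be on the safe side"
```

Note that no handler needs to know which derived state its event affects; everything funnels through the one recomputation.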

This sounds quite nice on the face of it.

However, it seems to me that calling an idempotent "fix everything" function would improve the robustness of the system at the cost of efficiency, and could disrupt a containing system that prefers processes which carefully plan and execute.

I can't say that I've used it before, though. I also cannot find the source for his application online (though I did find this one, which claims to be based on it). Nor can I find any reference to the pattern outside of this article (and I consider my google-fu to be pretty good), but I did find an entry for "Idempotent Capability" on SOApatterns.org.

Is this idea better known by another name?

What is the "Fix Everything" design pattern? What are its pros and cons?

I'm curious why this "Stephen Figgins" is using the Indian English idiom "what all"; seems suspicious to me. Or is it common someplace else? – BoundaryImposition 10 hours ago
Seems common to me in the USA. – Aaron Hall 10 hours ago
I see. Well, carry on then. – BoundaryImposition 10 hours ago
@BoundaryImposition "suspicious"? More like raving paranoid, and completely irrelevant. And just because you have heard an Indian English speaker say it doesn't make it "the Indian English idiom". – Jim Balter 3 hours ago

Let's say you have an HTML page that is fairly complicated: if you pick something in one dropdown, another control might appear, or the values in a third control might change. There are two ways you could approach this:

  1. Write a separate handler for each and every control that responds to events on that control and updates the other controls as needed.

  2. Write a single handler that looks at the state of all the controls on the page and just fixes everything.

The second handler is "idempotent" because you can call it over and over again and the controls will always end up arranged properly. The first approach, in contrast, may have issues if an event is lost or repeated, e.g. if one of the handlers performs a toggle.

The logic for the second call would be a bit more obscure, but you only have to write one handler.

And you can always use both solutions, calling the "fix everything" function as needed "just to be on the safe side."
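Here is a sketch of the second approach, using plain Python dicts to stand in for the page's controls (the country/state example and all names are my own invention, not from the question): one handler derives the state of every dependent control from the primary inputs, so repeated calls are harmless.

```python
# Sketch of approach 2: a single idempotent handler that recomputes the
# state of every dependent control from the current form state.
# Plain dicts stand in for the DOM; the data is illustrative only.

STATES = {"US": ["CA", "NY"], "CA": ["ON", "QC"]}  # toy lookup table

form = {"country": "US", "state_visible": True, "states": ["CA", "NY"]}

def fix_everything(form):
    # Recompute every dependent control from the primary input.
    country = form["country"]
    form["state_visible"] = country in STATES
    form["states"] = STATES.get(country, [])

def on_country_changed(form, value):
    form["country"] = value  # record the change...
    fix_everything(form)     # ...then fix everything

on_country_changed(form, "CA")
fix_everything(form)  # calling again changes nothing: idempotent
```

There is no per-control bookkeeping: adding a fourth dependent control means adding one more line to `fix_everything`, not wiring up new event plumbing.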

The second approach is especially nice when state can come from different sources, e.g. from user input versus rendered from the server. In ASP.NET, the technique plays very well with the concept of postback because you just run the fix everything function whenever you render the page.

Now that I've mentioned events being lost or repeated, and getting state from different sources, I'm thinking it is obvious how this approach maps well to a problem space like BitTorrent's.

Cons? Well, the obvious one is the performance hit: it is less efficient to go over everything all the time. But a solution like BitTorrent is optimized to scale out, not up, so it's well suited to that trade-off. Depending on the problem you are trying to solve, it might not be suitable for you.


I think the article is a bit dated, because as I read it, this isn't really an unorthodox or new idea at all. It is presented as a separate pattern when it really is just a simple Observer implementation.

Thinking back to what I was doing at the time, I remember working on the logic behind a somewhat complex interface with a number of panels whose data were interdependent. The user could change values and/or run an optimization routine, and based on those actions, events were generated that the UI would listen to and update as needed. There were a number of issues during development where certain panels wouldn't update when they should have. The fix (staying within the design) was to generate events from other events. Ultimately, by the time everything was working right, almost every change caused all the panels to refresh. All the complexity of trying to isolate when a given panel needed to refresh was for naught, and it didn't matter anyway. It was effectively a premature optimization.

There are innumerable systems designed this way. Think of all the CRUD interfaces that add/update a row and then requery the DB. This isn't an exotic approach, it's just the obvious non-clever solution.

A simple example is calculating a mean. The simple solution is to sum the numbers and divide by the count of values; if you add or modify a number, you just do it again, from the beginning. Alternatively, you could keep track of the sum and the count, and when someone adds a number, increase the count and add it to the sum; now you aren't re-adding all the numbers each time. If you've ever worked in Excel with a formula that references a range and modified a single value in that range, you have an example of the 'fix everything' pattern: any formula that references that range will recalculate regardless of whether the changed value was relevant (e.g. using something like SUMIF()).
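The "fix everything" version of the mean can be written in a few lines: every change just triggers a full recomputation.

```python
# "Fix everything" version of the mean: on any change, recompute the
# result from scratch rather than maintaining a running sum and count.

def mean(values):
    return sum(values) / len(values) if values else 0.0

def add(values, x):
    values.append(x)
    return mean(values)  # just do it again, from the beginning

def modify(values, i, x):
    values[i] = x
    return mean(values)  # no delta bookkeeping needed
```

Note that `modify` never needs to know the old value; that only matters for the incremental alternative.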

This isn't to say this isn't a smart choice in a given context. In the mean example, let's say we now need to support updates. Now I need to know the old value somehow and change the sum only by the delta. None of this is really that challenging until you consider doing it in a distributed or concurrent environment. You then have to handle all kinds of thorny timing issues, and you'll likely end up creating a major bottleneck which slows things down far more than recalculating would.
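For contrast, here is a sketch of the incremental alternative. It is correct in a single-threaded setting, but notice the extra state it has to carry: the old value must be retrievable, and under concurrency every read-modify-write of `total` becomes a hazard that the recompute-from-scratch version simply doesn't have.

```python
# Incremental mean: track sum and count, and on update adjust by the
# delta only. More bookkeeping than recomputing, and the read-modify-
# write of `total` is not safe under concurrent access without locking.

class RunningMean:
    def __init__(self):
        self.total = 0.0
        self.count = 0
        self.values = []  # kept only so updates can find the old value

    def add(self, x):
        self.values.append(x)
        self.total += x
        self.count += 1

    def update(self, i, x):
        old = self.values[i]   # must know the old value somehow
        self.values[i] = x
        self.total += x - old  # adjust the sum by the delta only

    def mean(self):
        return self.total / self.count if self.count else 0.0
```

Three pieces of mutable state instead of one list, all of which must stay consistent with each other.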

I'm pretty sure Excel intends to only recalculate cells that are dependent on changes, which is why there's a way to trigger all cells to recalculate: superuser.com/questions/448376/… (which I suppose would be a "fix everything") – Aaron Hall 16 hours ago
@AaronHall If it does, it's a really bad implementation. I regularly watch it consume 100% of 7 CPUs for 15-30 minutes to calculate e.g. 60,000 cells. The calculations aren't complicated. I've often written Python programs that can do everything in the sheet in a few seconds including starting Python. This was my best guess at how it could take so long. It could be something else I suppose. There are also a number of really old bugs in Excel which might be the reason for that feature. – JimmyJames 16 hours ago
@AaronHall it's also possible with that user that auto-calculate was disabled on the sheet. I often do this on large workbooks because I don't have 15 minutes to spare every time I hit enter. – JimmyJames 16 hours ago
@AaronHall I thought a little more and you have a point. My assumptions were likely overly broad. I updated the answer to me more focused on something I am more confident in. – JimmyJames 13 hours ago
Refine it as much as you can, I expect this Q&A to be a classic - I'm going to hold out accepting as long as possible. – Aaron Hall 12 hours ago

I'm not sure it's a "design pattern", but I would classify that type of behavior as end state configuration or desired state configuration, in the vein of Puppet, Chef, or PowerShell DSC.

Those solutions typically operate at the systems-management level, not at the business-logic level the question describes, but it's effectively the same paradigm; and although such tools are usually declarative in nature, the same principles can be applied in procedural code or scripting.
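A toy reconciliation loop in that spirit (the resource names and states are illustrative, not real Puppet/Chef/DSC syntax): compare desired state against actual state and correct only what differs, so running it repeatedly converges and is idempotent.

```python
# Toy desired-state reconciliation: bring `actual` (a dict mapping
# resource name -> state) in line with `desired`. Running it twice in a
# row is a no-op, which is the same property as "fix everything".

def reconcile(actual, desired):
    for resource, state in desired.items():
        if actual.get(resource) != state:
            actual[resource] = state  # create or correct the resource
    for resource in list(actual):
        if resource not in desired:
            del actual[resource]      # remove what shouldn't exist

actual = {"nginx": "stopped", "old-cron": "present"}
desired = {"nginx": "running", "firewall": "enabled"}
reconcile(actual, desired)
reconcile(actual, desired)  # second run changes nothing
```

The caller only ever declares the end state; it never has to reason about which events led to the current one.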

