Why You Should Care About Statistics in Warcraft
Last week I mentioned that the statistics of raid health are going to be important in Warlords of Draenor because of the changes to smart heals (now dumb heals) in the expansion. I suspect a lot of people will be sceptical of this claim, so today I’m going to show you why I said it. Hopefully, you will walk away convinced that we should care about the way health is distributed throughout the raid in Warlords of Draenor raiding.
I’m going to run through a couple of interesting demonstrations of statistics applied to questions that we might reasonably want to ask. They are questions which have intuitive answers, but which so far haven’t been proven in any real sense as far as I can tell.
Furthermore, although I will be making some assumptions, the assumptions that I make in these examples do not restrict how the results can be applied. I think that this is the most important thing to note, because it means that these results reflect real situations which we will see.
Sidenote: Throughout the post you’ll see me say things like “it can be proved that*” – the asterisk denotes that it’s not very simple to prove this, so I put the proof at the end. This makes it easier to read for non-experts while allowing people interested in the mathematics itself to see the methodology.
Mastery as a Statistical Quantity
Ever since Vixsin first published her results on Mastery vs. Crit all the way back in Cataclysm, we have always talked about the fact that “the value of Mastery depends on the health of the raid”. This is a reasonable result, and I don’t want to argue against it. However, that statement doesn’t really capture the whole picture.

Does the value of Mastery depend upon the average health of the raid, or is it more complicated than that?
Intuition might tell you that it is indeed the average raid health which matters, but you can’t just go off and do analysis on these things without first justifying your assumptions. So, let’s find out whether this is true.
Mastery’s effect is usually written out something like this;
Now, for our purposes we don’t care about a lot of parts of the equation. First, we’re going to tidy up the bracket which includes the Mastery Rating term. Then, we’re going to make things easier on ourselves by simplifying the bracket with the %HP term. The second bit is more interesting, because we’re going to go from talking about the current health of the target to the health deficit, D, of the target. D is a number between 0 and 1 which represents how much health the target has missing. If D is 1, the player has zero health (and is dead). If D is 0, the player has full health.
So now the total heal, H, is a constant A multiplied by a constant added to D times another constant, C. C just controls how much Mastery you have, proportionally, and isn’t interesting right now. Likewise, A is just the normal heal (including things like Haste and Crit considerations) and isn’t interesting to us either. D, the health deficit, is the interesting quantity.
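Written out in symbols, the description above reads roughly as follows. This is a reconstruction from the prose: it assumes the additive constant inside the bracket normalises to 1, which is not stated explicitly.

```latex
H = A\,(1 + C\,D), \qquad 0 \le D \le 1
```

Here A is the ordinary heal, C sets how much Mastery you have, and D is the health deficit of the target.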
So what is the mean value of H? We can find this by assuming that the value D is a statistical quantity. By that, I mean that D has many possible values each with an associated probability.
It’s possible to prove* that (see Proof 1, later);
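In the notation used so far, and taking the heal to have the assumed form H = A(1 + C D), the result that Proof 1 arrives at can be written as:

```latex
\langle H \rangle = A\,\bigl(1 + C\,\langle D \rangle\bigr)
```

The angle brackets denote a mean, so the mean heal depends on the deficit only through its mean.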
So quite quickly, I’ve shown you that you can indeed back up the statement that the value of Mastery depends only on the average health deficit of your targets, and not on any other detail of how that health is distributed.
While this isn’t an unexpected result on its own, it does go to show that simple analyses agree with our intuitions; this is reassuring. More importantly, it opens the way to more sophisticated analyses.
Why So Important?
What I want to reinforce here is that this analysis is independent of the “kind” of average which you take and even the specific distribution which you choose. It doesn’t matter whether you take that average over time (i.e. for one player over a series of heals) or over players (i.e. over lots of different players at the same time). It doesn’t even strictly matter here whether you have lots of players at low health and one at high health, or an even spread! The average health is what matters. Over the course of a fight, the average health of your targets is probably well modelled as being a random variable. Future work will have to determine how “random” you can assume that is.
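A quick numerical check of the distribution-independence claim is sketched below. It assumes the heal has the form H = A(1 + C D); the values of A and C and both example distributions are made up purely for illustration. The two raids have very different health spreads but the same mean deficit, so the mean heal should come out the same.

```python
# Assumed model (a reconstruction from the prose, not a quoted formula):
#   H = A * (1 + C * D)
A = 10000.0  # hypothetical base heal size
C = 0.5      # hypothetical Mastery coefficient


def heal(deficit):
    """Heal size for a target missing `deficit` (0..1) of its health."""
    return A * (1.0 + C * deficit)


def mean(dist, f=lambda d: d):
    """Exact expectation of f(D) for a discrete distribution {deficit: probability}."""
    return sum(p * f(d) for d, p in dist.items())


# Two made-up raids, both with a mean deficit of 0.3.
even_spread = {0.1: 0.25, 0.2: 0.25, 0.4: 0.25, 0.5: 0.25}
one_spiky = {0.0: 0.7, 1.0: 0.3}  # mostly full health, sometimes nearly dead

for name, dist in (("even spread", even_spread), ("spiky", one_spiky)):
    mean_d = mean(dist)
    print(f"{name}: <D> = {mean_d:.3f}, <H> = {mean(dist, heal):.1f}, "
          f"A(1 + C<D>) = {heal(mean_d):.1f}")
```

The expectations are computed exactly from the probability tables rather than by random sampling, so there is no noise to argue about: any difference between the two raids would be a real one.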
To prove that this isn’t a unique result for just one kind of heal, let’s look at something a bit more difficult and somewhat less obvious.
Chained Heals
Now, let’s turn our attention to Chain Heal. Since in the new expansion, Chain Heal’s power will drop by 15% each jump, we can model the heals it makes like this;
Which is all well and grand. When we introduce Mastery coefficients as we did previously, we have to take into account the health deficit of each target;
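In symbols, the two steps just described look roughly like this. It is a sketch that assumes four targets (Proof 2 speaks of four integrals), a multiplier of 0.85 per jump, and the single-heal form A(1 + C D); the subscripts on D label the targets in jump order.

```latex
% Chain Heal without Mastery: each jump heals for 85% of the previous one
H = A\,\bigl(1 + 0.85 + 0.85^{2} + 0.85^{3}\bigr)

% With Mastery: each target contributes its own health deficit D_k
H = A\,\bigl[(1 + C D_1) + 0.85\,(1 + C D_2) + 0.85^{2}\,(1 + C D_3) + 0.85^{3}\,(1 + C D_4)\bigr]
```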
So the question is then; how much can we simplify this? Well, after some poking around with the maths it’s possible to show* that (see Proof 2, later);
Post-edit note: as Dayani correctly points out in a post on her blog, the coefficient below is in fact 3.687 due to the Draenor Perk.
In words; the average total heal that Chain Heal makes works exactly the same as any regular heal, with a simple constant multiplier. OK, not surprising either, but the interesting thing is that when we make this equation we assume that all the players have the same probability distributions for different health values. That’s not likely to be correct, but it opens up the intriguing possibility that Chain Heal’s average heal does not scale quite so straightforwardly as you would like.
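To make the "constant multiplier" concrete, here is a minimal sketch of the mean Chain Heal under the same assumed model: four targets, 0.85 per jump, and A(1 + C D) per heal. The values of A and C and the example deficits are hypothetical. It compares the equal-mean case with one where the first target is more damaged but the raid-wide average is unchanged.

```python
JUMP_FALLOFF = 0.85  # from the post: each jump heals for 15% less
N_TARGETS = 4        # from Proof 2: the sum has four terms
A = 10000.0          # hypothetical base heal size
C = 0.5              # hypothetical Mastery coefficient


def chain_heal_mean(mean_deficits):
    """Mean total Chain Heal, given each target's mean deficit in jump order."""
    return sum(
        A * JUMP_FALLOFF**k * (1.0 + C * d)
        for k, d in enumerate(mean_deficits)
    )


# Sum of the jump multipliers: 1 + 0.85 + 0.85^2 + 0.85^3
coefficient = sum(JUMP_FALLOFF**k for k in range(N_TARGETS))

# Case 1: every target has the same mean deficit (the simplifying assumption).
same = chain_heal_mean([0.3] * N_TARGETS)

# Case 2: same raid-wide average (0.3), but the first target is more damaged.
first_heavy = chain_heal_mean([0.6, 0.2, 0.2, 0.2])

print(f"coefficient          = {coefficient:.3f}")
print(f"equal mean deficits  = {same:.1f}")
print(f"damaged first target = {first_heavy:.1f}")
```

With equal mean deficits the total is the coefficient (about 3.187 for this four-jump model) times a single heal at the average deficit. Putting the extra deficit on the first, unreduced heal raises the total even though the raid average is the same. The 3.687 figure from the post-edit note includes the Draenor Perk, which is not modelled here.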
To demonstrate, let me emphasise one point; the average health of the initial target of your heals is unlikely to be the same as the average health of the raid as a whole. Why? Because healers tend to heal the most damaged player first! This opens up an interesting theoretical challenge: how do I model the role of player choice in this?
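One crude way to put a number on that selection effect is sketched below. It assumes every raid member's deficit is drawn independently from the same distribution and that the healer always opens on the most damaged of n players. Both are simplifying assumptions, and the example distribution is made up.

```python
def expected_max(dist, n):
    """Expected largest of n independent draws from {value: probability}."""
    total = 0.0
    cdf_prev = 0.0
    for value in sorted(dist):
        cdf = cdf_prev + dist[value]
        # P(max == value) = F(value)^n - F(previous value)^n
        total += value * (cdf**n - cdf_prev**n)
        cdf_prev = cdf
    return total


# Hypothetical raid: half at full health, some lightly hurt, a few badly hurt.
raid = {0.0: 0.5, 0.2: 0.3, 0.6: 0.2}
raid_mean = sum(v * p for v, p in raid.items())

for n in (1, 5, 10, 20):
    print(f"n = {n:2d}: mean deficit of chosen target = {expected_max(raid, n):.3f} "
          f"(raid mean = {raid_mean:.3f})")
```

With n = 1 there is no choice to make and the chosen target simply has the raid mean. As n grows, the chosen target's mean deficit climbs towards the worst value in the distribution, which is why the first Chain Heal target should not be assumed to share the raid's average.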
Where Next?
I have some ideas, and some of them are very exciting; for one, I’ve started work on CHsim – a simulation of player positions and its interaction with Chain Heal. The goal is to spend time looking at how different raid situations affect the statistics of Chain Heal.
In addition, it should allow me to extract some really cool statistics relating to the spread of Chain Heal’s healing. Ultimately, I expect the spread to be dictated by the range of health values in the raid, but there is something to be said about the randomness of Chain Heal’s targeting and also the spatial spread of the raid.
Finally, I want to know some things about how Chain Heal jumps between clusters of players. Every experienced Shaman knows that getting Chain Heal to jump where you want is a right challenge at the best of times, but I have a hunch that there’s a mathematical description of this. That requires knowledge of something called Percolation Theory, which I haven’t a clue about. If there’s anyone out there who does and feels like collaborating, hit me up.
Proof 1
Starting with the equation;
We will have to write down the average heal <H> in terms of the probability distribution for H, by which we really mean;
So what we have to do is to integrate the joint probability like this;
Now substitute for H and p(H,D);
Having found this complicated expression, we can note that two things are true;
Also;
Now, combining the “by definition” bits with the long complicated equation above, we find the result;
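The steps of this proof can be sketched as follows. This is a reconstruction under stated assumptions: the heal is taken to be H = A(1 + C D), the joint probability is factored as p(H,D) = p(H|D) p(D), and because H is fixed completely by D, the conditional p(H|D) is taken to be a delta function. The two "by definition" facts are that p(D) is normalised and that its first moment is the mean deficit.

```latex
\begin{align*}
\langle H \rangle
  &= \iint H \, p(H, D) \, dH \, dD \\
  &= \iint H \, \delta\bigl(H - A(1 + C D)\bigr) \, p(D) \, dH \, dD \\
  &= \int_0^1 A\,(1 + C D) \, p(D) \, dD \\
  &= A \int_0^1 p(D) \, dD \;+\; A C \int_0^1 D \, p(D) \, dD \\
  &= A\,\bigl(1 + C \langle D \rangle\bigr),
\end{align*}
% using, by definition,
%   \int_0^1 p(D)\,dD = 1   and   \int_0^1 D\,p(D)\,dD = \langle D \rangle .
```

Nothing in these steps depends on the shape of p(D), which is why only the mean deficit survives.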
QED. Huzzah!
Proof 2
We’ll start by contracting the long equation for Chain Heal into a more compact series notation;
Now, it makes it nice and clear when we write down the integral for the mean heal <H> as we did before. The integral splits up into the sum of four different integrals; it essentially says that the total heal is the sum of the means of each jump;
Where we have already expanded p(H,D) into p(H|D)p(D) as before. Now it’s fairly easy to spot that each element in the summation has a similar result to what we got for Proof 1, so;
Now, here’s the important part; to simplify this any further we have to make an assumption about whether D1 is the same as D2, and so on. As I discuss in the main body of the post, that’s probably not true, but for the moment let’s say that <D1>=<D2> and so on, so we simplify the summation to;
Which is the result I give in the main body of the post.
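The chain of steps in this proof can be sketched compactly as below. It is a reconstruction that assumes four targets, a factor of 0.85 per jump, and the single-heal form A(1 + C D_k) for the k-th target.

```latex
\begin{align*}
H &= A \sum_{k=1}^{4} 0.85^{\,k-1} \, (1 + C D_k) \\
\langle H \rangle
  &= A \sum_{k=1}^{4} 0.85^{\,k-1} \, \bigl(1 + C \langle D_k \rangle\bigr)
  && \text{(Proof 1 applied to each jump)} \\
  &= A\,\bigl(1 + C \langle D \rangle\bigr) \sum_{k=1}^{4} 0.85^{\,k-1}
  && \text{(assuming } \langle D_1 \rangle = \dots = \langle D_4 \rangle = \langle D \rangle \text{)} \\
  &\approx 3.187 \, A\,\bigl(1 + C \langle D \rangle\bigr)
\end{align*}
```

The 3.187 is simply 1 + 0.85 + 0.85² + 0.85³. The post-edit note in the main body puts the coefficient at 3.687 once the Draenor Perk is included, which this sketch does not model.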
That’s all for now, thanks for reading! 🙂