The last few weeks I have experienced a streak of bad luck, the kind that just keeps getting worse when you think it can't. I started keeping records to try to understand whether my perception was a cognitive distortion or whether it really was an objective string of unlikely events. After convincing myself of the latter, I decided to do a little exploration of poker variance. You might be interested in this for poker reasons, or you might just read it to feel good that you are not as nerdy as I am. Either way, I thought this might be useful here. I found it very helpful to visualize how this kind of randomness works. Basically, the question I am asking here is: how many hands do you have to look at before you can reliably find the "expected" outcome?
In other words, when we see poker odds -- for example, the little numbers that come up on ESPN when someone is all in, telling us that AA is an 87% favorite over 72 -- those are calculated over thousands and thousands of hands. Obviously, the smaller the sample, the more likely you are to see a result that deviates from the expected winning percentage. But how large does your sample need to get before samples of that size consistently approach the expected outcome?
I wrote some code that will take any two poker hands and simulate outcomes over many, many iterations. The data I will show here come from 10,000 iterations. Once we have 10,000 outcomes -- that's 10,000 wins or losses -- we can then look at various little subsamples, which I will call epochs. The code will take an epoch size (say we look over 10 hands) and slide that window forward one hand at a time to see what winning percentage we get with each 10-hand window. We then do the same with 20-hand epochs, or 100-hand epochs, etc. That way we can see how winning percentage varies with the size of the epoch that we sample.
The first matchup we will look at is the extreme AhAd versus 7c2s. Over 10,000 iterations AA won 87.4% of the time. The longest losing streak was 5 in a row and the longest winning streak was 51 in a row. Imagine losing with your aces to 72 five times in a row! The following image shows the results for epochs of different sizes from 10 to 500.
You can see that when you look at the small, 10-hand samples (blue) you get many cases where AA loses 60% or 50% of the time. By the time we reach the 200-hand samples (yellow), things have settled down quite a bit and the winning percentage rarely deviates more than 5% or so from the expected outcome. Here's another one, this time AhKd versus TcTs.
This one is really all over the place with 10- and 20-hand samples. Plenty of times with 10-hand samples it went 100% one way or the other (longest streak 15 for TT). Again, though, things do settle down with several hundred hands. In other words, it's quite typical to get 10- or 20-hand segments that deviate far from expectation, but it's very rare to find a 100-hand segment that deviates that far. I know the principle here is obvious to anyone with an understanding of statistics, but the specifics of how the variance relates to sample size are interesting to see.
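For anyone who wants to play with the sliding-window idea, here is a minimal Python sketch. It is not the code behind the plots above. It makes a simplifying assumption: instead of dealing cards, it draws each hand as an independent win or loss using the 87.4% win rate reported for AA versus 72, and it ignores ties. All the function names (simulate_outcomes, sliding_win_rates, longest_streak, standard_error) are made up for illustration.

```python
import math
import random


def simulate_outcomes(p_win, n_hands, seed=0):
    """Draw n_hands results (1 = win, 0 = loss) with a fixed win probability.

    Assumption: hands are independent coin flips; no cards are dealt.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < p_win else 0 for _ in range(n_hands)]


def sliding_win_rates(outcomes, epoch):
    """Win rate of every window of `epoch` consecutive hands, moving one hand at a time."""
    if epoch <= 0 or epoch > len(outcomes):
        raise ValueError("epoch must be between 1 and len(outcomes)")
    wins = sum(outcomes[:epoch])
    rates = [wins / epoch]
    for i in range(epoch, len(outcomes)):
        # Add the newest hand and drop the oldest one from the window.
        wins += outcomes[i] - outcomes[i - epoch]
        rates.append(wins / epoch)
    return rates


def longest_streak(outcomes, value):
    """Length of the longest run of `value` (1 for wins, 0 for losses)."""
    best = run = 0
    for o in outcomes:
        run = run + 1 if o == value else 0
        best = max(best, run)
    return best


def standard_error(p_win, epoch):
    """Binomial standard deviation of the win rate over `epoch` independent hands."""
    return math.sqrt(p_win * (1 - p_win) / epoch)


if __name__ == "__main__":
    p = 0.874  # win rate reported above for AA vs 72, used as an assumed input
    outcomes = simulate_outcomes(p, 10_000)
    print("longest losing streak:", longest_streak(outcomes, 0))
    print("longest winning streak:", longest_streak(outcomes, 1))
    for epoch in (10, 20, 50, 100, 200, 500):
        rates = sliding_win_rates(outcomes, epoch)
        print(epoch, round(min(rates), 3), round(max(rates), 3),
              round(standard_error(p, epoch), 3))
```

The last column is the textbook standard deviation of a win rate over n independent hands, sqrt(p(1-p)/n). With p = 0.874 it comes out to roughly 0.105 for 10-hand epochs and roughly 0.023 for 200-hand epochs, so two standard deviations at 200 hands is a little under 5%, which lines up with what the plots show. Overlapping windows are not independent of each other, so treat this as a rule of thumb for any single window rather than an exact description of the whole sliding series.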