WW Games: Magic Math


Friday, 13 August 2021

Predict: What to name

Last night on Jarvis Yu’s stream (check it out at twitch.tv/jarvisyu , it’s a good watch) a situation arose where he was priced into playing a blind Predict. He named a four-of without much thought as to which four-of to pick, and then some discussion ensued as to what the best play was.

TLDR: in a situation like that, where you're not likely to see a very large percentage of your deck by the time the game ends, the difference is really tiny. But let's look at some concrete maths, because I like this kind of little math puzzle.

First of all, let's simplify the situation to assume that there aren't particular graveyard effects (e.g. graveyard eldrazi shuffle triggers, flashback, etc.); let's also assume that we can narrow down the options to one card that's "good" and one that's "bad" (presumably the best and worst 4-ofs left in the deck). The first thing I want to do is show that, very concretely, there are some situations where naming the good card is optimal, and some situations where naming the bad card is optimal.

Scenario A: Your opponent is empty-handed. You've just untapped with a billion mana and only the predict in hand. A Sulfuric Vortex trigger just went on the stack which will kill you when it resolves. The only cards left in your deck are 2 Lightning Bolts and 2 Shocks. Your opponent is at 3 life.

If you name

Lightning bolt:

You have a 50% chance of guessing right, in which case you win on the spot (any 2 spells are lethal)

You have a 50% chance of guessing wrong. In this case a shock gets milled, and you draw 1 card out of a deck of Bolt/Bolt/Shock. This gives you a 2/3 chance of winning.

Overall, you win 5/6 of the time.

Shock:

You have a 50% chance of guessing right, in which case you win on the spot.

You have a 50% chance of guessing wrong, in which case you mill a Bolt, and draw 1 out of Bolt/Shock/Shock. This gives you a 1/3 chance of winning.

Overall you win 4/6=2/3 of the time.

So we can see in this scenario, it's better to name the better card (Lightning Bolt), which increases your win rate in this case by 1/6.

Scenario B: Exactly the same as Scenario A, except now your opponent is at 6 life instead of 3.

If you name

Lightning bolt:

You have a 50% chance of guessing right, in which case you draw 2 from shock/shock/bolt, which can't win. You lose.

You have a 50% chance of guessing wrong, in which case you draw 1 card, can't burn them out, and lose for sure.

Overall, you win none of the time.

Shock:

You have a 50% chance of guessing right, in which case you draw 2 from Bolt/Bolt/Shock. You need the shock to be on bottom to win, which only happens 1/3 of the time, so you win 1/3 of these games.

You have a 50% chance of guessing wrong, in which case you draw 1 card, can't burn them out, and lose for sure.

Overall you win 1/6 of the time.

So in this version, it's better to name the worse card, which increases your win rate by... well the same 1/6 as it happens.

So we've established at least that scenarios can exist where it matters significantly, and that sometimes you want to name a better card, and sometimes you want to name a worse card. Let's quickly check a couple more cases similar to these; if instead the opponent had 4 life, you just need to draw 2 spells, any 2 spells, so it doesn't matter - you need to predict right to win, it's 50/50 either way. If they're on 5 life, you need to draw either bolt/bolt or bolt/shock; you want to name Shock here, which gives you a 50% chance to win (whereas naming Bolt would give you only a 1/3 chance).
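The numbers for Scenarios A and B, and for the 4-life and 5-life variants, can be checked by brute force. What follows is a rough Python sketch rather than anything rigorous: it assumes the only things that matter are which card Predict mills and the total damage of the cards you then draw (unlimited mana, no other effects), and the names `DAMAGE` and `win_chance` are made up for the example.

```python
from fractions import Fraction
from itertools import permutations

# Damage dealt by each burn spell in the toy scenarios.
DAMAGE = {"Bolt": 3, "Shock": 2}


def win_chance(named, life, deck=("Bolt", "Bolt", "Shock", "Shock")):
    """Chance of burning the opponent out after a blind Predict naming `named`.

    Assumes every ordering of the remaining deck is equally likely and that
    you can cast everything you draw this turn.
    """
    orders = list(permutations(deck))  # copies are treated as distinct cards
    wins = 0
    for order in orders:
        milled, rest = order[0], order[1:]
        draws = 2 if milled == named else 1  # Predict: hit draws 2, miss draws 1
        if sum(DAMAGE[card] for card in rest[:draws]) >= life:
            wins += 1
    return Fraction(wins, len(orders))


if __name__ == "__main__":
    for life in (3, 4, 5, 6):
        print(life, win_chance("Bolt", life), win_chance("Shock", life))
```

Run for life totals 3 through 6, this agrees with the hand calculations: 5/6 vs 2/3 at 3 life, 1/2 either way at 4, 1/3 vs 1/2 at 5, and 0 vs 1/6 at 6 (Bolt first, Shock second).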

But these scenarios are pretty artificial. What about other scenarios?

Scenario C: You're digging for one specific card which will win you the game, and you need to find it right now. Let's say you're down to 8 cards again, 4 Winners and 4 Losers.

If you name

Winner:

Guessing right gets you a 5/7 chance to win, and guessing wrong gets you a 4/7 chance to win, for an overall win rate of 9/14

Loser:

Guessing right gets you a 6/7 chance to win, and guessing wrong gets you a 3/7 chance to win, for an overall win rate of 9/14

So here it doesn't matter, which matches Jarvis's original intuition

Scenario D: Same as C, but we add 32 other 1-of blank pieces of cardboard to the deck. You still need to find one of the 4 copies of Winner to win the game. Now in this case, 32/40 of the time, a card besides one of your options is on top, and it didn't matter at all which of the two cards you named. So we only need to look at the 20% of the time where it actually matters.

If you name

Winner:

Guessing right gets you a 15.0% chance to win, and guessing wrong (with Loser on top) gets you a 10.3% chance to win, for a win rate of 12.6% in the 20% of cases it matters

Loser:

Guessing right gets you a 19.7% chance to win, and guessing wrong gets you a 7.7% chance to win, for a win rate of 13.7% in the 20% of cases it matters.

Given that these scenarios are only 20% of games overall, the difference comes to just over 0.2% of the overall win rate. But it's better in this case to name a bad card than the one you're looking for, since getting two looks at the better deck has a bigger impact than the one look you get on the flip side, even though the chances of guessing right are very small.
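Scenarios C and D follow the same pattern, so they can be folded into one small calculation. The sketch below is only an illustration under the same simplifying assumptions as the text (you win if and only if you draw at least one Winner off the Predict); `p_hit` and `predict_win_rate` are hypothetical helper names.

```python
from fractions import Fraction
from math import comb


def p_hit(deck_size, winners, draws):
    """Chance that `draws` cards off the top include at least one Winner."""
    return 1 - Fraction(comb(deck_size - winners, draws), comb(deck_size, draws))


def predict_win_rate(named, winners=4, losers=4, blanks=0):
    """Overall chance to find a Winner when Predict names 'Winner' or 'Loser'."""
    total = winners + losers + blanks
    rest = total - 1  # cards left after the top card is milled

    # Winner milled: one fewer Winner left; you draw 2 only if you named it.
    winner_top = p_hit(rest, winners - 1, 2 if named == "Winner" else 1)
    # Loser milled: all Winners left; you draw 2 only if you named Loser.
    loser_top = p_hit(rest, winners, 2 if named == "Loser" else 1)
    # Blank milled: the named card never matches, so you always draw 1.
    blank_top = p_hit(rest, winners, 1)

    return (Fraction(winners, total) * winner_top
            + Fraction(losers, total) * loser_top
            + Fraction(blanks, total) * blank_top)


if __name__ == "__main__":
    print(predict_win_rate("Winner"), predict_win_rate("Loser"))  # Scenario C
    gap = predict_win_rate("Loser", blanks=32) - predict_win_rate("Winner", blanks=32)
    print(float(gap))  # Scenario D
```

With no blanks both names come out to 9/14, and with 32 blanks naming Loser is ahead by 8/3705, or roughly 0.22% of games, which is the "just over 0.2%" above.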

Other scenarios get more complicated, and I don't want to go in to that kind of depth. The long and short of it seems to be:

The better the position you're in, the more likely you are to want to name a good card, so that you're improving how bad the worst-case scenario is
The worse the position you're in, the more likely you are to want to name a bad card, to maximize your best case scenario
The bigger your deck tends to be, especially relative to the number of cards you'd be happy to draw, the more you want to probably name the bad card
Naming a good card is likely to be best mostly in cases where something like half your deck (or, especially, more than this) is a good draw.
Mostly the differences are going to be tiny unless your deck is tiny, which... shouldn't come up very much. The most likely case for this is probably if you're playing against some kind of mill deck(?). A farfetched situation that might actually arise is some kind of doomsday deck where you need to fetch a land out of the doomsday pile. I would love to see that happen, but I'm not holding my breath

Tuesday, 23 April 2019

WAR Limited Analysis: Part II

Unfortunately, I've been sick most of the past week, so this analysis may not be quite as detailed as I'd hoped. Nevertheless, we move forward!

Also, a special shout-out to Scryfall (https://scryfall.com/), which has been immensely helpful in putting this analysis together.

Final numbers for Amass, +1/+1 counters, and Proliferate

Per draft of WAR, you can expect 32.5 Amass cards to be opened; 11.7 of these are mono blue, 9.3 mono black, 8.0 mono red, and the rest multicolored. So if you have an opponent in two of these colors, you should expect them to likely have several Amass cards. But if they're in only one of those colors, they'll probably only have a few. It's also good to note that almost all of these cards are Amass 1 or Amass 2.

In terms of non-amass cards that produce +1/+1 counters, you can expect 12.8 of those to be opened per draft in White, 8.4 in Blue, 0.9 in Black, 6.0 in Red, and 12.1 in Green. So in the proliferate colors, there's lots of cards that are going to enable proliferation (note that blue also gets to count the Amass cards as noted above, so they're actually in first here). On the other hand, this is not so many cards that you can expect to have just tons of different permanents to proliferate onto at any point - one target will happen, two will be common, and you'll be quite happy to manage getting three.

Speaking of Proliferate, it ends up on 5.7 cards per draft in both White and Blue, and 8.4 in Green. So, don't expect to be able to build a deck around proliferating over and over again - unless you get one of the cards which singlehandedly pumps trigger after trigger out, you're more likely to get one, or maybe two over the course of a game, even in these colors.

Is spellslinger a real strategy?
As is so often the case in these sets, Blue/Red's theme seems to be "spellslinger", i.e. it wants you to play lots of instants and sorceries. Slightly confusingly, in this set in particular, some of the cards in this direction point you towards those particular types, but some care about noncreature spells more generally. And I expect most decks in this format to have a few noncreatures which aren't in these types (mainly planeswalkers, though there's some playable enchantment-based removal as well).

So what are the numbers? Well, on the non-creature side, there's 5.1 monored, 4.2 monoblue, and 0.9 hybrid Izzet cards per draft. And on the Instant/Sorcery side, we're at 4.4 blue, 1.5 red, and 1.3 Izzet gold cards. Overall, if you're completely alone in your lane, you might be able to scrape together a deck based around these... but I wouldn't really bank on it. I think the biggest way to get into this deck is to open a good rare that's on theme and then pick up another couple early - but don't be trying to get payoffs later, it's just not likely enough to happen.

Of note, the red cards here also work with the red-white (even less supported) subtheme of pumping your own stuff - I don't think both of those decks can exist at the same table, though obviously you can build decks in these colors that don't exactly follow those themes.

Mana Fixing, or lack thereof
There are 10.4 pieces of mana fixing per draft which are colorless (i.e., lands or artifacts); you get access to an additional 6.6 if you are base green. This is actually a reasonable amount of fixing... but it's a LOT less than we've seen in the last couple of sets set on Ravnica (or actually, any of the Ravnica sets). So five color decks will be nigh impossible to make... three color decks are even going to be very ambitious. Especially if you aren't green, you'd need basically all the fixing at the table. Plus, since most of the fixing isn't in lands, you would end up with like half your spells just being dedicated to fixing, and I just don't see the payoff being worth it. (Sorry, Niv-Mizzet).

Having said that, splashing seems very plausible. It's definitely not to the point where you would say that splashing some spells from a third color is free, by any stretch - you still have to work a bit to get your mana to get there, like normal - but if you have a reason, you should be able to find something to get you there most of the time (provided it's not like, halfway through pack three already or something).

Creature Sizing
How big are the creatures in the format? Obviously it's a little bit tough to tell just by looking at a list of cards, since there are questions of playability, plus a lot of +1/+1 counters running around and affecting the sizing.

But if we look at everything, just on the base stats, then in terms of power, there's a massive hump at 2 power. There are nearly as many creatures with exactly 2 power (58.1 per draft) as there are with greater than 2 power. Moving from 3 power (28.2 creatures per draft) to 4 (21.1 creatures per draft), there's not nearly as big of a drop-off. Per normal, not many creatures get to the 5+ power range (12.4), so don't be super surprised if your opponent has one of those, but it won't be often they have multiples.

On the Toughness side of the equation, things are more spread out. 31.4 creatures per draft have 1 toughness, so most of your opponents will have a target for your ping effect to hit (although in many of these cases, you would need to time it precisely, as a few of these creatures grow from an ETB counter, and some others are unplayable... so be ready to sideboard around this situation one way or another, which is a fairly common problem if we're honest). 50.2 creatures opened have 2 toughness, and 43.4 have 3. This is the point where the biggest drop-off is, with only 20.4 creatures per draft having 4 toughness, with an additional 16.5 at 5 toughness, and 7.9 at 6.

In the hopes of finding a "magic toughness" or sizing in general, it's also important to look at the toughness-based removal the set provides (damage or -N toughness). 6.4 such cards per draft punish 1 toughness, 8.2 on 2 toughness, 4.6 on 3 toughness, 5.1 take care of creatures with 4 toughness, and 2.3 (1 common) deal with creatures having 5 toughness or less.

Based on looking at this, I doubt that there really will be any "magic size" for creatures in the set, though I guess that most things with 5 toughness will be fairly hard to take out using a single card, especially if that card isn't one of the few premium removal spells in white or black that don't care about size at all (or the fight-like spells in green which are pretty close).

Final thoughts

Overall, the biggest thing about this set is that it looks much closer to a 'normal set' to me than we've had in a while. Well, except for having a couple of planeswalkers per deck, which is, I guess, a pretty significant difference. But the fixing numbers, the creature sizing, and for the most part the lack of a cohesive on-rails plan for each color combination make things mostly more block-and-tackle. Or more, uh... I feel like there should be a better metaphor which doesn't draw a parallel to a sport which is virtually exclusively played in a single country. Anyway, I digress.

Take good cards, probably don't splash, realize your opponent will have a couple planeswalkers, but also realize they won't be the be-all and end-all. Try to have board presence. And most of all, have fun! It's a new set, that's what they're for.

Hopefully I'll have time to get a moxiously early pick-order list generated before the end of the week, with some notes about specific cards, but we'll see...

Tuesday, 18 April 2017

Amonkhet Draft Quantitative Analysis

After some time, I'm back again to break down some of the numbers relating to a new Magic: the Gathering limited format. Per normal, I'll be dishing out the numbers of certain classes of cards (on a per-draft basis) to try to help everyone get a better picture of what archetypes are supported, against which ones are not. (Big reason this can be useful is that some of the archetypes are really constructed plants - and I don't mean Sylvan Caryatid - in terms of being loaded at high rarity).

This time, I'd like to make a special shout out to the fine folks at https://scryfall.com/ , which made putting this together FAR easier than it has been in the past.

In terms of the numbers themselves, it's a pretty normal "big" set. 101 commons, 80 uncommons, 53 rares, and 15 mythics. This leads us to .099, .0375, .0165, and .0083 of any particular card of that rarity, respectively, per pack. This gets multiplied by 24 packs to get a per-draft average. If you want to know about a sealed pool (6 packs), you'd divide that by 4. (I'll note that due to the way print runs happen, I think there's one common with a slightly different incidence rate, but there's little way at this point to know which that is; I'm also ignoring foils here, since I'm not sure how that replacement works, so that would slightly increase non-commons and decrease commons; these are all very small differences, but I wanted to mention them in the interest of full disclosure).
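For anyone who wants to redo these per-draft counts, here is a small Python sketch of the arithmetic. The pack makeup is an assumption on my part that happens to reproduce the rates above: 10 commons and 3 uncommons per pack, plus one rare-slot card where each rare is twice as likely as each mythic. The function names are invented for the example, and foils and print-run quirks are ignored just as in the text.

```python
def per_pack_rates(commons=101, uncommons=80, rares=53, mythics=15):
    """Expected copies of one specific card per pack, by rarity.

    Assumed pack makeup: 10 commons, 3 uncommons, and 1 rare-slot card,
    with each rare weighted twice as heavily as each mythic in that slot.
    """
    rare_slot_weight = 2 * rares + mythics
    return {
        "common": 10 / commons,
        "uncommon": 3 / uncommons,
        "rare": 2 / rare_slot_weight,
        "mythic": 1 / rare_slot_weight,
    }


def expected_per_draft(card_counts, packs=24):
    """Expected number of cards opened from a group of distinct cards.

    `card_counts` maps rarity -> how many different cards are in the group,
    e.g. {"common": 20, "uncommon": 10, "rare": 8}.
    """
    rates = per_pack_rates()
    return packs * sum(rates[rarity] * n for rarity, n in card_counts.items())


if __name__ == "__main__":
    print({rarity: round(rate, 4) for rarity, rate in per_pack_rates().items()})
    print(round(expected_per_draft({"common": 20, "uncommon": 10, "rare": 8}), 1))
```

As a sanity check, a group of 20 commons, 10 uncommons, and 8 rares comes out to about 59.7 cards per draft, which lines up with the cycling count given further down; passing `packs=6` gives the sealed figure.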

Fixing

One of the first things I always want to look at in any format is how much mana-fixing there is. This helps us figure out how many colors we can be playing, how much you'd have to work for extra colors, how easy it is to splash, how much contempt you should have for picking multicolor cards early, etc.

Amonkhet has 4 common mana fixers, 1 uncommon, and 9 rares (I'm not including Vizier of the Menagerie, which only fixes for creatures). This leads us to a total of an average of 14 pieces of fixing per draft. Typically you want something like 4-8 pieces of fixing to play a third color, which means you'd need roughly half (or maybe a little under) of the fixing in the draft - seems possible, but you'd have to work for it. But let's drill a bit deeper. Painted Bluffs is a common fixer that could go in any deck, but not one you'd want to. Cascading Cataracts and Pyramid of the Pantheon are similar, but at rare. The cycling lands are probably going to be quite hard to pick up if you don't open them, and in any case will only fix your mana if you just happen to be the right colors. This leaves us with Evolving Wilds as the only good, reliable fixer for any colors, which is a place we've been pretty often before. Additionally in this set, though, we're back to having noticeably more fixing in Green exactly - Oashra Cultivator and Gift of Paradise at common, Spring of Spring//Mind at Uncommon, and a couple different rares all add up to make Green the color of fixing again. It's worth noting that these are generally a bit overpriced from what we'd expect (3 mana Rampant Growth seems to be the norm here), but will get the job done in a pinch. And importantly, splashing multiple colors seems only marginally harder than splashing one, and actually easier than trying to be fully 3 colors.

Cycling

Sure, cycling is a theme of the set. But just how present is it? EVERYWHERE. There are fully 20 Commons, 10 Uncommons, and 8 Rares with the popular returning mechanic, leading to an average of 59.7 cards per draft! This means even the average player will end up with 7-8 of these cards in their pool. And some of those won't be in the right colors, and some will be unplayable (though the option to cycle means very few will be embarrassingly bad). But even if your normal half-the-cards you draft end up in your deck, you're still looking at about 4 per player. Which means if you crack open a Drake Haven, and you actually prioritize these cards a bit, you should really be able to have plenty of enablers to turn that card on. I'll also note here that most of these cards that care about cycling also trigger off of other forms of discard, of which there are 14 in the set - bringing you to an even healthier number of enablers. So you shouldn't really have problems in 'getting there' with those kinds of cards.

How many such rewards are there? Well, if you also include cards like Shadow of the Grave and Sacred Excavation, which don't trigger off cycling per se, but definitely care about the mechanic, you end up with 3 commons, 6 uncommons, and 5 rares, for a total of 14.5 per draft. So not all that many. When you factor in that a lot of these are at higher rarity, and several of the commons give mediocre bonuses, I don't think this is an archetype you should expect to see in every draft pod. But it is something you can go with if you get the right card(s) early. And worth noting that this is centred in blue and black particularly, also with some presence in red.

Lastly, because cycling is something that happens from the hand, at instant speed, and is on lots of cards, if your opponent has something like Hekma Sentinels or Pitiless Vizier, keep in mind that they basically have threat-of-activation on activated abilities - since most any card in hand could be a combat trick with card advantage. So value that accordingly in the draft, and play round or bluff it accordingly in gameplay.

Embalm

Embalm appears on 5 commons, 4 uncommons, 5 rares, and a mythic. It is centred mostly in white, with strong representation in blue as well, and the smallest sprinkles in Red and Green. In total, you can expect 17.7 Embalm creatures to show up on average in a draft. Because of the color imbalance, you can expect white and blue drafters to probably have a few each (WU drafters a bit more than that even), but not at all a strongly themed deck.

Zombies

This leads us right to Zombies, which seem to be the tribe du jour on Amonkhet. Apart from the Embalm cards (all of which make white zombie tokens when embalmed), there are 28.75 other zombies per draft in the set, (including cards which make multiple zombie tokens, like Liliana or her Mastery, once each for their rarity). Altogether, that makes a total of 46.4 - definitely less than cycling, but more than about anything else you're going to find. Especially important is that these other zombies are all white and/or black, so that when you combine the embalm in, you get the most Zombies in white, followed by black and blue, and very few in red or green.

But the bigger story here is the pay-offs for zombies. There appear to be quite a few in the set. But the problem is that, like with the cycling bonuses, they're focused at higher rarities. 2 commons, 4 uncommons, and 2 rares leaves you with only 9.1 zombie bonus cards per draft (I didn't count the Liliana ultimate here, full disclosure). So this is somewhat like the BW Lifegain theme from Oath of the Gatewatch - sometimes it will come up, but you can have decks even in those colors where it doesn't really.

"Heckbent"

Something that people have been noticing throughout the spoilers is that there seems to be a subtheme of cards, mostly in black and red, which care about having few cards in hand - specifically, many of them are improved when you get to having 0-1 cards in hand. People have dubbed this "Heckbent" as a lite version of the Hellbent (no cards in hand) keyword from Dissension. But this is really a constructed-slanted mechanic - 1 common, 2 rares, and a mythic have that text, plus an extra uncommon that's huge but shrunken for each card in hand. Don't count on this in limited.

-1/-1 Counters

Instead of the near-ubiquitous run of +1/+1 counter mechanics we've had over the last few years, this block returns us for the first time since Scars block to -1/-1 counters. These are fairly prevalent in the set, with 26.7 cards per draft that give them out. These are primarily in black and green, with a bit in red. And it's especially worth noting that many of these cards actually have you putting the counters on your own creatures, at least at first (many of those in turn have ways for you to take them off later).

How many cards care about these kinds of counters is, as often, the bigger question. The answer in this case is 12.4 per draft (this follows some logical progression on what counts as "caring about" - I'm not counting here Exemplar of Strength, but of course I am counting Nest of Scarabs). This is definitely the kind of thing which again, doesn't look terribly supported, but again, is something you probably will see from time to time.

Aftermath

These cards are known perhaps more descriptively as Split Flashback cards. And while for constructed, the thing to look out for is that they mostly look priced for limited, the thing to know from a drafter's perspective is that these are all at high rarity. They only exist in 3 cycles - enemy-colored split uncommons, allied-color split rares, and same-color split rares. This leads to only 8.5 per draft, and especially spread throughout the colors - don't expect to see an aftermath deck in any way shape or form across the lifetime of the format. In other words, just evaluate these cards at face value.

Exert

Exert is a mechanic that allows you to choose at the time one of your creatures attacks to have it not untap in your next untap step. In return, you get some sort of bonus right now. These cards obviously promote attacking, and in general, racing. There are 23.4 such cards per draft. The bonuses for exerting come off of a couple uncommon red cards which pay you out whenever you exert any creature, as well as a couple of cards which give you some bonus for having tapped creatures. Again though, these are really small potatoes - the cards should be evaluated really on their faces far more than for synergies.

Friday, 20 January 2017

Why Copycat Won't Ruin Standard

The bane of Standard players everywhere.... or is it? If you're paying any attention to the discussion of the format, you'll know that this two-card combo has dominated all the talk. And if you aren't, let me explain - Saheeli's -2 copies Felidar Guardian, which blinks Saheeli on ETB, making it a fresh permanent that hasn't used a loyalty ability yet this turn, which lets it make another hasty Cat Beast, ad infinitum.

I, however, don't think this combo will be the format-destroying scourge most others seem to. Let's dig into why.

1. Math

So, the first thing people said was, "This is a turn 4 format now!". First of all, I suggest that good aggro decks can kill on turn 4 in most formats without interaction, but I digress. This combo will yes, sometimes be able to kill you on turn 4, but it won't be consistent at all.

The chance of naturally drawing both combo pieces by turn 4, given that you have at least 4 lands which make all the right mana you need them to, and assuming you're running 4 of each combo piece, is only 12.1% on the play, 16% on the draw. Throw in that you actually need Saheeli on turn 3, and we're down to about 11.1% and 14.7%, respectively. Not all that hot. I should, of course, point out that Saheeli's plus gives a scry, which helps find the cat, but this adds only about 3.6% (a tiny bit extra on the draw compared to on the play). Still looking at a pretty low percentage in any case.

But, people have also noted the whole thing can be played on turn 6, which also reduces the opportunities for the opponent to interact. The problem here, of course, is that for that to work, you need to have 6 lands. And so even if we're assuming that you have enough lands every time, those lands take up slots in your hand. So the chances of having everything by turn 6 aren't much better - 12.9% on the play, 17% on the draw.

But this has all been assuming you just have the lands you need, which is by no means a given. Even still ignoring the color requirements, or that the last 1-2 need to ETB untapped (both of which are going to be very dependent on the precise build you use - mana for the deck can be pretty good, but getting the last land ETB untapped might be a bit tough), just having enough is a serious concern. If we look at a typical Standard land count of 25 lands, then having enough by turn 4 is only a 67.5% proposition on the play, 76.6% on the draw. Multiply those by the chances we had above, and we're taking off a few more percentage points.

For the turn 6 scenario, it's even worse (as you might expect): only a 36.8% chance of having 6 lands on turn 6 on the play, and 47.5% on the draw. Of course, you're less likely to be color-screwed by the time you're at 6 lands, but still more unlikely to have that 6th land ETB untapped.
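The land numbers are a plain hypergeometric calculation, and can be reproduced with a few lines of Python. This is a sketch under stated assumptions: a 60-card deck with 25 lands, a 7-card opening hand with no mulligans, and no draw on turn 1 when on the play. `cards_seen` and `p_at_least` are names I made up for the example.

```python
from math import comb


def cards_seen(turn, on_play=True):
    """Cards seen by a given turn: 7-card hand plus one draw per turn,
    skipping the first draw when on the play (no mulligans or cantrips)."""
    return 7 + turn - (1 if on_play else 0)


def p_at_least(needed, draws, deck=60, hits=25):
    """Hypergeometric chance of at least `needed` hits among `draws` cards."""
    ways = sum(
        comb(hits, k) * comb(deck - hits, draws - k)
        for k in range(needed, min(hits, draws) + 1)
    )
    return ways / comb(deck, draws)


if __name__ == "__main__":
    for turn in (4, 6):
        for on_play in (True, False):
            chance = p_at_least(turn, cards_seen(turn, on_play))
            print(turn, "play" if on_play else "draw", f"{100 * chance:.1f}%")
```

Under those assumptions this gives 67.5% and 76.6% for 4 lands by turn 4, and 36.8% and 47.5% for 6 lands by turn 6, matching the figures above. Changing `hits` lets you try other land counts.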

There are, of course, ways to make things more consistent. There are a number of cantrips in these colors, though at 1 mana you'd need a creature (though Insolent Neonate can do some amount of work for you), so effectively we have... Anticipate, Cathartic Reunion, Tormenting Voice, Nagging Thoughts, and a number of no-selection draw-1s for 2 (best of which for the deck is probably Prophetic Prism). Best case here is Anticipate, and it really does help a good amount - it takes a card slot, but lets you see 3 deeper. Of course, you need it by turn 2, and it gives you less chance to hit your taplands, so it's not without cost. But if you can get it off, it adds... several percentage points to where you'd otherwise be.

The other thing which presumably helps you, and I'm not taking account of here, is mulligans. Particularly with the scry, your ability to toss back hands which are missing too much is going to help you out. I will note that it's still harder, even with the scry, to get it all on 6 than it is on 7, and that those bad 7s do sometimes get there. So the improvements won't be huge, but they're real.

All told, these improvements may get your chance of combo-ing out up to 20% or so, and while I haven't actually simulated finding an optimal goldfish list or percentage, I find it hard to imagine you can get that to much over 25 or 30%, especially on the play - certainly well below 50%. Goldfishing turn 6 is much more plausible to be able to optimize for, but on the other hand, that's not so impressive - even midrange decks can routinely goldfish at that rate. Heck, limited decks can. If anyone has done this kind of optimization, I'd love to hear about it. I suspect the optimal goldfish-turn-4 list is somewhere around this:

That leaves the question open, though - how long does it take, typically, to get the combo together? By 'typical', I'm going to say the point at which you can expect to have a greater than 50% chance to have the combo assembled (and not the average turn it's assembled on, which is almost surely worse/later). In an unoptimized list (i.e. 4 of each combo piece, a pile of lands, maybe some cards that don't help assemble at all), we're looking at somewhere around turn 10-11 (depending on play vs draw, exact manabase and composition of irrelevant cards, etc.). In an optimized version... well again, I don't know what optimal would be for minimizing time to goldfish, but my guess is that it's probably going to be turn 6 (though turn 5 wouldn't surprise me - getting an extra turn to deploy cantrips helps a LOT).

(Pre-Post Edit: Okay, I missed Contingency Plan, but come on, let's be serious - not THAT much better than Anticipate, and hard to see it actually, you know, making the deck).

2. Interaction

There are lots of ways to stop the combo in Standard - any way of killing a 1/4 at instant speed (Grasp of Darkness, Murder, Unlicensed Disintegration, Stasis Snare, revolted Fatal Push, Warping Wail, Harnessed Lightning + an energy), any way of killing a planeswalker at sorcery speed (Ruinous Path, 4+ damage from combat and burn), 1 damage to a planeswalker at instant speed (Implement of Combustion, Shock, Fiery Temper), counterspells (Void Shatter, Disallow, Metallic Rebuke, etc.), and miscellaneous answers (Thalia, Authority of the Consuls, Dampening Pulse, win faster).

Most of these cards are fairly commonly played already. Moreover, virtually every deck in the format plays at least some of these already, even in the maindeck - and most have more in the sideboard. And while it's certainly possible to have plans to deal with most or all of these... well, you need to have plans to do that. Which take up slots. And time. And make your deck less of a consistent quick combo. That's not to necessarily say they're bad, but it does bring us to

3. But what about Splinter Twin?/So where do we stand?

So the big comparison that gets made with the Copycat combo, of course, is Splinter Twin. Twin was so good, it even got banned in Modern. Everyone knows, of course, that this combo is worse, but Standard is also a weaker format than Modern, and is the combo really that much worse?

So, in terms of consistency at least, yes, it's quite a bit less consistent. This was especially the case when you could Preordain and Ponder - that many good cheap cantrips? You get quite a bit of consistency there. Even afterwards, though, Serum Visions is basically as good at digging as Anticipate, and it's a whole mana cheaper, which means you can use both turns 1 and 2. Furthermore, you got to play with more pieces than 4 of each - typically you played 6 Exarch/Pestermite, which is a 50% increase. That 50% doesn't translate to 'having it' 50% more of the time, of course, but it's not that far off. So this is really significant.

The bigger thing, though, is that Twin got to play a different kind of game. Turn 3 Exarch, tap down your land, untap kill you. It's only vulnerable to instant-speed interaction. It can kill out of nowhere. And realistically, represent doing other good things as well. Lots of flexibility. And lots of generally other good cards - the snapcaster/bolt/remand/cantrips Blue Moon kind of game. The deck was way more consistent, way more efficient, and played a really good game even when it wasn't comboing off.

At this point, I want to make a point about the other shell a lot of people have discussed for the combo, and that's using it as a finisher in a control deck. In some sense, I can see that - there's really not a reason it wouldn't work - but I am not terribly convinced by this, either, for one big reason: Torrential Gearhulk. Gearhulk already provides that deck with quite a quick clock to finish the game off, it takes fewer slots, it comes at instant speed, it helps support the control aspect of the deck much better. So it's a bit tough for me to think that such a deck is going to go for the combo over Gearhulk, or take up enough slots to go with both.

Having said allllllllll of that, I don't think the combo is just terrible. I could see it still being good. I wouldn't be entirely stunned if it was strong enough that a banning needs to happen (though I kind of doubt it). My main point is, you need a really good shell around it that plays to its strengths. You can't just throw it anywhere and have it be busted. It's not like pre-ban Eldrazi was in Modern where every flavor was effectively busted, and it was all about optimizing for the mirror. You need to build the right shell for it to be good, and my guess is that it will be good in that shell, but not busted.

What is that shell? I think a lot of people are reasonably close - you play Jeskai, you play a lot of good ETB creatures, with a mix of disruption and a bit of selection. Not too far from the Panharmonicon decks we started seeing last Standard.

Sunday, 2 October 2016

Math on Treasure Chests, New Entry/Prize Structure, and EV

WotC recently announced a huge amount of changes coming for MTGO. I'm not going to touch here on Redemption changes (which basically seem to be straight negatives for the consumer - only silver lining is that online cards won't be cheaper; older cards having value suggests MTGO economy won't completely collapse, but more than that I can't really say). However there have been many other changes announced, dealing with Entry Fees, Prize Payouts, and the new "Treasure Chests". Let's use some analytical mindset, some math, and try to break it down.

I want to note before we begin that I've spent a decent chunk of time (several hours) putting this together. I've gone through my methodology in a decent amount of detail. Having said that, it's very possible I'm wrong about one or more things here. And I expect, given my conclusions, that many people will want to shout that I am wrong, because they seem dead-set that my end conclusion is wrong. That's fine - if you think I am wrong, I actually actively want to know. But I don't want just a "You're wrong!" or especially a bunch of vitriol without explanation. I want you to point out, at what point am I going wrong? Which of my estimations or assumptions or methodology is wrong? And why? In short, tell me I'm wrong, but give me a reason. Thanks in advance.

Pack Prices:
The most obvious thing here that most people are jumping on is that non-pack entry fees for drafts/sealed events are lower now than before (while entry fees including packs haven't changed). The supposition is then that pack prices are going to decrease, which de facto will reduce the value of any prize support which includes packs. And this isn't nothing - improving the quality of options competing with using packs should make the packs less valuable to some extent.

Things, however, are not so cut-and-dried. If you look at current booster prices, they're well below 3.33 tickets per pack. The cut-off point is simply not a huge factor as to why they are priced where they are. So what does cause this? Supply and Demand, of course. More specifically in this case, the supply is determined by how many packs people buy from the store plus how many packs are being paid out as prizes. The demand is how many packs people want to crack plus how many they use in limited events.

I assume that far more boosters are being used for limited than just cracked. This is simply because cracking packs is really bad EV. Prices of cards are a factor in this, but that bottom cut-off, like the top cut-offs of buying from the store or being worse than equivalent entry using non-pack entry options, is far enough away from actual price points that it doesn't come into consideration.

I also assume far more packs get put into the system from prizes than are being bought from the store. Again, this assumption stems merely from the fact that it's far cheaper to get packs from other places than buying them from the store.

Given these sources of supply and demand, I'm actually expecting pack prices to increase (slightly, if we don't account for the Redemption Change anyway). This is because supply should be going down - the number of packs awarded in constructed events has been slashed fairly significantly, in favor of other prizes. In the meantime, the demand for packs should not change terribly much. Sure, lower price on the Play Point/ticket entries should change that somewhat, but because that option was and still will be more expensive in practice, this difference shouldn't be a large one. A more significant factor will be actually how many people want to play in those limited events - which has a lot to do with the quality of the format. Of course, being a totally different format is going to make a 'real' comparison somewhat difficult. But the important point here is that if you think the decrease in prizes from constructed events is relatively larger than the decrease in demand because people will enter for play points/tix, then pack prices should go up rather than down. I certainly believe this to be likely, but more importantly, I find it very unlikely that it's so far out of whack that pack prices will significantly drop.

Treasure Chest value:
Each treasure chest has 3 items in it. One of the 3 is guaranteed to be a random Rare/Mythic from a modern set OR a 'curated' card OR some number of Play Points. I'm going to call this a 'value' slot hereafter, even though a lot of the time, this slot will have virtually 0 value. The other two slots will usually be a common or uncommon from Standard, but there's a 1 in 4.5 chance that you'll get one additional 'value' slot and a 1 in 239 chance that all three slots will be 'value' slots. This means you expect 1.23 'value' slots per Chest, on average.
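As a quick check on that 1.23 figure, here is a minimal sketch of the expected-value arithmetic. It assumes the "1 in 4.5" case means exactly one extra 'value' slot and that the two cases are mutually exclusive; the names are just for illustration.

```python
P_ONE_EXTRA = 1 / 4.5   # exactly one of the other two slots is a 'value' slot
P_TWO_EXTRA = 1 / 239   # all three slots are 'value' slots

def expected_value_slots(p_one_extra=P_ONE_EXTRA, p_two_extra=P_TWO_EXTRA):
    # One slot is always a 'value' slot; the extras are weighted by their odds.
    return 1 + 1 * p_one_extra + 2 * p_two_extra

print(round(expected_value_slots(), 2))  # 1.23
```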

I'm going to assume standard-legal commons and uncommons have 0 value. This isn't strictly true, of course, but typically very few of them have significant value, and there are many many many commons and uncommons in standard, so this approximation is very likely to not be wrong by more than 1 cent or so, which is small compared to the estimation errors we have to make from approximations.

So then we need to figure out what the EV of a 'value' slot is. This is hard, because there are a good number of unknowns. Let's start with the knowns though. I will note that for the following, I'm taking everything Pre-Kaladesh, since that set isn't online yet, and the economy for that set hasn't stabilized. Kaladesh will change the numbers a bit, but likely not by much (and probably slightly upward at first, given that most of the cards about to rotate out of Standard have lost most of their value already).

Random Rares/Mythics from Modern:
There are 3063 different rare printings in Modern sets (including Timeshifted cards, and each printing separately). There are 434 different Mythic Rare printings (again, counting each printing separately). In the chests, each rare appears twice as often as each mythic, so there is a (3063*2)/(3063*2+434) = 93.4% chance of hitting a rare, leaving 6.6% to hit a mythic.
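The same split as a small sketch, using the printing counts above and the 2:1 rare-to-mythic weighting (the function name is just for illustration):

```python
RARE_PRINTINGS = 3063
MYTHIC_PRINTINGS = 434
RARE_WEIGHT = 2  # each rare appears twice as often as each mythic

def rare_probability(rares=RARE_PRINTINGS, mythics=MYTHIC_PRINTINGS, weight=RARE_WEIGHT):
    weighted_rares = rares * weight
    return weighted_rares / (weighted_rares + mythics)

p_rare = rare_probability()
print(f"{p_rare:.1%} rare, {1 - p_rare:.1%} mythic")  # 93.4% rare, 6.6% mythic
```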

I went through https://www.mtggoldfish.com/ set lists and added the prices of each rare and mythic, rounding for convenience and generally ignoring cards less than about half a ticket. I realize that these prices are sell prices rather than buy prices, but I will try to account for that later on (if I knew of a source that had clean lists of buy prices, I would use that instead, as it would be more accurate. Please please let me know if you have seen someone do this analysis more precisely somewhere else). (I should also note that I think I used non-premium versions for everything, which is wrong for a few cases I believe). I came out with a total value of 2067 tickets (this looks more precise than it is; I expect I'm off by maybe 15 or 20 tickets one way or the other, wouldn't be surprised if it's a bit more). This gets to .675 tix per rare, on average, in modern (again, sell prices). I will note that most of the value here comes from the pre-mythic era, with another significant amount on Standard cards. Mythics come up to 1170 tickets (again, I expect to be off by 10-15 tickets in some direction or other), which gets us to 2.70 tickets per Mythic on average. When you combine all of this, you get to a weighted average of 0.81 tickets per Rare/Mythic from Modern. This is pretty closely in line with the number produced here, which I saw after doing my calculations.

Now, as I mentioned before, this is using sell prices, whereas to actually determine the value to most people opening the chest (i.e. people who wouldn't be buying the card), you need a buy price. Doing spot checks on the differences between these, they seem to, for the most part, be a little bit below 90% of the sell price. So I am going to be a little conservative here and estimate a 15% reduction in the sell price to get an estimated buy price of (after artificially rounding down) 0.68 tickets per Rare/Mythic in a Treasure Chest.
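Putting the price totals and the rare/mythic odds together, the per-card numbers can be sketched like this. The totals are my rough tallies from above, and the 15% haircut is the conservative sell-to-buy estimate, so treat the output as an estimate rather than anything exact.

```python
import math

RARE_TOTAL_TIX, RARE_PRINTINGS = 2067, 3063
MYTHIC_TOTAL_TIX, MYTHIC_PRINTINGS = 1170, 434
SELL_TO_BUY = 0.85  # assumed 15% reduction from sell price to buy price

avg_rare = RARE_TOTAL_TIX / RARE_PRINTINGS        # about 0.675 tix
avg_mythic = MYTHIC_TOTAL_TIX / MYTHIC_PRINTINGS  # about 2.70 tix

# Each rare appears twice as often as each mythic.
p_rare = (2 * RARE_PRINTINGS) / (2 * RARE_PRINTINGS + MYTHIC_PRINTINGS)

sell_price = p_rare * avg_rare + (1 - p_rare) * avg_mythic
buy_price = math.floor(sell_price * SELL_TO_BUY * 100) / 100  # round down to the cent

print(round(sell_price, 2), buy_price)  # 0.81 0.68
```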

Curated Slots
The list of curated cards can be found here: http://magic.wizards.com/en/MTGO/articles/archive/magic-online/treasure-chest-curated-card-list-2016-09-29 I've seen the (unweighted) average of prices of the list as anywhere from 5 to 7 tickets. The best analysis I've seen is here: https://www.reddit.com/r/magicTCG/comments/554g55/value_of_mtgo_treasure_chests_curated_cards/?st=itpryd7g&sh=468d6663 and it suggests slightly over 5 tickets per card, not including any value from the Gearhulks (which will probably bump the average between .1 and .2). It's very hard to know the actual value of what these will be, because the different cards appear at different frequencies. I haven't seen anyone who thinks the opened average will be anything besides less than the unweighted average (generally they are going to want to keep the rarer cards rarer); it's just a question of how much less. My guess is that it will probably be in the 3-4 range, closer to 4. I've seen other people estimate on the order of 2.23. When I calculate different potential EVs for a chest, I'll present a number of different options.

Play Points
I'm going to use a conversion of 10 Play Points = 1 ticket. This holds for entry into events. Tickets are obviously more liquid assets, and thus more valuable. But the loss in value really only comes in in one of two places: First, if you're winning enough that you're always running a surplus of Play Points. In this case, you end up with excess play points, rather than other assets, and because you can't transform them into other assets, it's merely a waste. Second, if you want to sell all the assets out of your account, you can't get any value out of the play points. However, there are very few people in the first scenario (plus they are very profitable already, just less so than they otherwise would be). And my assumption is that people will typically play in more events at a far higher rate than they will sell out of their account. So all in all, yes, I would definitely rather have 1 ticket than 10 play points. But over the large scale, the differences are so small that I'm using the conversion. Take that for what you will.

I will add for a moment that these changes, because they increase the number of play points going out as prizes, necessarily will make more people fall into category one, where they are left with mounds of play points that just go up and up and up over time - particularly players who are pretty good at constructed but rarely if ever draft. That's a serious negative for that group.

In terms of how many Play Points you can expect from Treasure Chests, in the video where Lee Sharpe demonstrates the chests, we see him open 10 total chests. By counting the number of cards contained (26), and subtracting that from the total number of slots (30), we can see that 4 of the slots must have been devoted to Play Points. We also see that he opened a total of 50 Play Points, which gets us to a displayed average of 12.5 Play Points per Play Point payout. Obviously this average is based on a very small sample, so it could be off by a reasonable amount. My guess is that this was 3 sets of 10 Play Points and 1 set of 20 Play Points. So I'm guessing there's some distribution for Play Points which is unknown, but most likely 10 is the lowest and most common payout, sometimes you get 20, maybe sometimes you get more.
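The counting from the video, as a small sketch. The 10/10/10/20 split at the end is only my guess at what the bundles were, checked against the total.

```python
CHESTS_OPENED = 10
SLOTS_PER_CHEST = 3
CARDS_SEEN = 26
PLAY_POINTS_SEEN = 50

# Any slot that didn't hold a card must have held Play Points.
play_point_slots = CHESTS_OPENED * SLOTS_PER_CHEST - CARDS_SEEN
avg_bundle = PLAY_POINTS_SEEN / play_point_slots

print(play_point_slots, avg_bundle)  # 4 12.5

# Guessed bundle sizes (an assumption, not something WotC has confirmed)
guessed_bundles = [10, 10, 10, 20]
print(sum(guessed_bundles) == PLAY_POINTS_SEEN)  # True
```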

Adding it all up
If we look at the same video, we can see that Lee opened, from his 10 Chests, 12 'value' slots, which is close to the average we'd expect based on the reported numbers (in fact, it's the most likely number; DEFINITELY within normal variation). Of these 12 'value' slots, as I said 4 were Play Point bundles. Two were curated cards (Remand and Force of Will). This leaves the remaining 6 as being random rares/mythics from Modern (unless I've missed somewhere that one of those was curated - please let me know if that's the case).

This gets us to seen averages of 1/6 Curated Cards, 1/3 Play Points, and 1/2 random rare/mythic from Modern. Again, this is small sample size, and there's a very good chance it's off a little bit in some direction or the other, but it also seems quite plausible that this is the distribution.

If we take the seen averages as real averages, then if we take average curated card at 3.5 tickets, we end up with 'Value Slot' EV of:
(1/6 chance of curated)*3.5 Tickets + (1/3 chance of play points)*1.25 tickets + (1/2 chance of Rare/Mythic) *.68 Tickets = 1.34 Tickets. This leads to a Chest EV of 1.65 Tickets.
If we shift Curated average to 3 tickets, we fall to 'value slot' EV of 1.26 Tickets and Chest EV of 1.55 Tickets.
At Curated = 2.5, 'value slot' = 1.17, Chest = 1.44. And at Curated = 2.0, 'value slot' = 1.09, Chest = 1.34.

Other values, you can do the Math yourself (or if you ask nicely, I will probably get back to you).
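If you do want to plug in your own numbers, something like the following sketch will do it. The slot odds are the seen averages from the video and the other constants are the estimates from earlier in the post, so every one of them is an assumption you can swap out.

```python
VALUE_SLOTS_PER_CHEST = 1.23
P_CURATED, P_PLAY_POINTS, P_RARE_MYTHIC = 1 / 6, 1 / 3, 1 / 2  # seen averages
PLAY_POINT_TIX = 1.25   # 12.5 Play Points at 10 Play Points per ticket
RARE_MYTHIC_TIX = 0.68  # estimated buy price per random rare/mythic

def value_slot_ev(curated_tix):
    return (P_CURATED * curated_tix
            + P_PLAY_POINTS * PLAY_POINT_TIX
            + P_RARE_MYTHIC * RARE_MYTHIC_TIX)

def chest_ev(curated_tix):
    return VALUE_SLOTS_PER_CHEST * value_slot_ev(curated_tix)

for curated in (3.5, 3.0, 2.5, 2.0):
    print(f"curated={curated:.1f}  slot={value_slot_ev(curated):.2f}  chest={chest_ev(curated):.2f}")
# curated=3.5  slot=1.34  chest=1.65
# curated=3.0  slot=1.26  chest=1.55
# curated=2.5  slot=1.17  chest=1.44
# curated=2.0  slot=1.09  chest=1.34
```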

In terms of the bottom line on restructuring of Prizes and Entries:
This post has the best breakdown I have seen so far. If we apply current pack prices and make our 10 play points to 1 ticket equivalency, then Competitive Leagues need Treasure Chest value to be .733 Tickets/Chest in order to be equal overall (weighted average of all records). As you can see, even the most conservative calculations above are showing Chest values to be well in excess of this figure, which means effectively that prize payouts in these leagues have increased (though not necessarily that they've increased by a large amount).

In Friendly leagues, if we also ignore QPs being gone, then Chests need to have an EV of 1.46 tickets in order for payouts to break even overall compared to before. The pessimistic view would then show these leagues as now paying out less. However the more optimistic or average-case-best-guess estimates show an increase in payouts for friendly leagues, too.
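Another way to look at the two break-even figures (.733 for Competitive, 1.46 for Friendly) is to turn the chest EV formula around and ask what average curated card value each one requires. This is a sketch under the same assumptions as the chest EV estimates above (1.23 'value' slots per chest, the seen 1/6, 1/3, 1/2 split, 1.25 tickets per Play Point bundle, 0.68 tickets per random rare/mythic); the function name is just for illustration.

```python
VALUE_SLOTS_PER_CHEST = 1.23
P_CURATED, P_PLAY_POINTS, P_RARE_MYTHIC = 1 / 6, 1 / 3, 1 / 2  # seen averages
PLAY_POINT_TIX = 1.25
RARE_MYTHIC_TIX = 0.68

def break_even_curated(chest_target):
    """Average curated card value (in tickets) needed for a chest to be worth chest_target."""
    slot_target = chest_target / VALUE_SLOTS_PER_CHEST
    non_curated = P_PLAY_POINTS * PLAY_POINT_TIX + P_RARE_MYTHIC * RARE_MYTHIC_TIX
    return (slot_target - non_curated) / P_CURATED

for league, target in (("Competitive", 0.733), ("Friendly", 1.46)):
    print(f"{league}: curated average needs to be {break_even_curated(target):.2f} tix")
# Competitive: curated average needs to be -0.96 tix
# Friendly: curated average needs to be 2.58 tix
```

A negative number just means the chest clears that bar even if curated cards are worth nothing at all. For Friendly leagues, the tipping point sits at a curated average of roughly 2.6 tickets.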

Bottom Line: Payouts are increasing for Competitive Leagues almost certainly, and there's a good chance (but less sure) they're going up for Friendly Leagues as well.

Caveats and Downsides:
It's not all good.

Of course, we don't actually know the distribution of things. I think you'd have to be quite cynical to think that they've rigged what Lee showed in the video to be unrepresentative of average in a big way, but it's of course possible he got slightly luckier than average on categories. The big question, though, remains in the Curated Cards. If they weight it such that the expensive cards are WAY less likely than the Atogs of the world, EV will plummet. At the worst case of Curated Card EV = 0, Chest EV goes all the way to .93 Tickets per chest, which means Competitive Leagues would STILL go up, but if you factor in that Play Point and other distribution might be different, it could be a little worse.

Furthermore, WotC's lack of transparency on these points is troubling. I certainly don't expect them to tell us the EV of a chest, since that will be in flux, and they want to avoid appearances of paying a set rate of $$ so as to not look like gambling. However, I see little reason why they can't give us more information about the distribution of the Curated cards (or their methodology here) and especially why we can't know how often you get Play Points vs Curated vs random Rare/Mythic. It's also somewhat bothersome that they have given us no information about the distribution of the different sized bundles Play Points come in - or even that there are bundles of different sizes.

Also, this prize situation creates an enormous Variance in prize payouts beyond what existed before. So even though on average things work out ok, in the short run, there's a big chance that you are worse off than before, because many curated cards and most random rares/mythics are worth very little (and also somewhat harder to trade than packs, because they all have different names). People want to have stable prizes. They also want to make sure that a 5-0 will get them a bigger prize than a 4-1, which will be bigger than a 3-2, which is no longer the case. Yes, we were getting more packs before, and those could be opened which also has big variance. But WotC knows, just as the community knows, that people weren't, for the most part, using these packs to open them; they are either entering limited events with them or trading them away - neither of which are things that Treasure Chests can do. Making the Chests themselves tradeable would go a long way to solving the variance problems, but of course it's not a panacea.

As I mentioned above with play points, the prizes now are less liquid than before. And more players will be stuck with Play Points they aren't using (at which point those Play Points become valueless). This has its definite negatives, and... no real positives for the consumer.

Card prices for a lot of the curated cards (as well as the random rares/mythics from Modern) will also likely fall to some extent following this, which will eat into the EV. It's really hard to know exactly the size of this effect, and it probably won't be much, but that is something which can definitely be a negative as well. Actually let me expand on this a little bit. For any given Modern rare, the chance you get one from any given 'value' slot is 1/2 of 2 in 6560 (half of 'value' slots are random rares/mythics, and each rare counts twice in a weighted pool of 6560), so for any chest it's about 1 in 5330. The average league run hands out 1.03 chests. So 1 in 5170 leagues will add an Engineered Explosives to the system (for example). A given Mythic is half as common. 1000 leagues per day (very rough guess) means 1.3 to 1.4 of any given Rare, and somewhat less than 1 of any given mythic, are entering the system per week. The numbers for Curated cards are going to be likely on the same order of magnitude - there is less of a chance to get a curated card, but there are fewer of them as well, and we again don't know the weighting between the different cards. In any event, it seems very clear that the influx rate is significantly lower than we're getting from the Flashback drafts. This is a continual increase in supply of course though, and would need to be offset by a sustained increase in demand over time to keep prices from falling; however, the rate of a few per week entering the system means that not that much net increase in game play (of that specific card) has to occur for prices to remain fairly stable. Therefore, I expect prices to drop on cards which are expensive mostly because they've been stupendously rare (e.g. Rishadan Port), while remaining reasonably steady (not bigger changes than normal variations we've seen in the past) for cards which are expensive because they see gobs of play (e.g. Tarmogoyf, Liliana of the Veil).
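The influx estimate can be sketched as follows. The 1000 leagues per day is a very rough guess, the 1/2 share of random rare/mythic slots is the seen average from the video, and the sketch treats the small per-chest expected count as a probability, so this is an order-of-magnitude check and nothing more.

```python
RARE_PRINTINGS, MYTHIC_PRINTINGS = 3063, 434
VALUE_SLOTS_PER_CHEST = 1.23
P_RANDOM_RARE_MYTHIC = 0.5   # share of 'value' slots that are random rares/mythics (seen average)
CHESTS_PER_LEAGUE = 1.03
LEAGUES_PER_DAY = 1000       # very rough guess

weighted_pool = 2 * RARE_PRINTINGS + MYTHIC_PRINTINGS  # 6560, rares count twice

p_rare_per_slot = P_RANDOM_RARE_MYTHIC * 2 / weighted_pool
p_rare_per_chest = p_rare_per_slot * VALUE_SLOTS_PER_CHEST
p_rare_per_league = p_rare_per_chest * CHESTS_PER_LEAGUE

rares_per_week = 7 * LEAGUES_PER_DAY * p_rare_per_league
mythics_per_week = rares_per_week / 2  # a given mythic is half as common

print(f"1 in {1 / p_rare_per_chest:.0f} chests, 1 in {1 / p_rare_per_league:.0f} leagues")
print(f"{rares_per_week:.2f} of a given rare and {mythics_per_week:.2f} of a given mythic per week")
```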

Finally, and this one hasn't been talked about much, contents of the chests are set on opening, rather than on obtaining the chest. This is not surprising given that it would be a bit of a disaster to try to show "this is the chest you got on date X, this is the one you got on date Y", etc. But the important thing is, it encourages you in many ways to hold your chests until they change the curated card list/distribution to be more monetarily favorable. In fact, there's a decent chance that this is the reason they aren't telling us the curated frequencies. But if that's the case, it's a poor remedy, a medicine that is likely worse than the disease. What would be far preferable would be a way to solve the original problem, to remove the incentive for sitting on chests (by not allowing that to be possible/plausible).