A formal grammar for Magic: the Gathering

I wrote an ANTLR4 formal grammar for Magic: the Gathering cards.

It turns this

A Magic: the Gathering card

Or, more specifically, it turns this

Undergrowth — When you cast this spell, reveal the top X cards of your library, where X is the number of creature cards in your graveyard. You may put a green permanent card with converted mana cost X or less from among them onto the battlefield. Put the rest on the bottom of your library in a random order.

Into this:

You can try it out here: https://soothsilver.github.io/mtg-grammar/ and you can download the grammar and the source code from https://github.com/Soothsilver/mtg-grammar.

It handles all 273 cards of Guilds of Ravnica, the most recent Standard set as of now.

What’s it good for? For nothing, because the Wizards Fan Content Policy prohibits fans from using Wizards’ “gameplay” in their content, which, I suspect, includes Magic: the Gathering rules.

But suppose we weren’t limited by the policy. The Magic Arena team isn’t. They are using a parser, similar to the one I created, to generate code from rules text.

Traditionally in video games, game components such as cards, spells, and abilities are scripted. Here’s Affectionate Indrik:

Here’s what the code for Affectionate Indrik’s ability looks like in an unnamed Magic: the Gathering game:

Ability ability = new EntersBattlefieldTriggeredAbility(
        new FightTargetSourceEffect()
                .setText("you may have it fight target creature you don't control"));
ability.addTarget(new TargetCreaturePermanent(filter));

And here’s what it looks like after parsing from rules text:

The idea is that, from this syntax tree, you can generate the code above automatically. A syntax tree is no longer arbitrary English text incomprehensible to a rules engine; it’s an object that can, in theory, be unambiguously converted into code. The hope is that this makes it easier to add new cards (because nobody needs to script them) and that there will be fewer bugs (because the code is generated automatically, so nobody can introduce bugs in hand-written scripts). There could be “bugs” in the rules text, but that’s unlikely: Wizards of the Coast pays attention to editing. I’ve only encountered a single editing error in the entire Guilds of Ravnica set (there’s a missing space between the end of the rules text and the reminder text for surveil in the card Mission Briefing).
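To make “generate the code from the syntax tree” a little more concrete, here is a minimal sketch in Java. Everything in it is an assumption made for illustration: the `RulesNode` class is a hypothetical, heavily simplified stand-in for a real ANTLR parse tree, the tree for Affectionate Indrik’s ability is built by hand rather than parsed, and the emitted calls (`whenEntersBattlefield`, `optional`, `fight`, `target`) belong to an invented scripting DSL, not to Magic Arena or to the unnamed game above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified syntax tree node: a label, optional text, and children.
final class RulesNode {
    final String label;
    final String text;
    final List<RulesNode> children = new ArrayList<>();

    RulesNode(String label, String text, RulesNode... kids) {
        this.label = label;
        this.text = text;
        for (RulesNode kid : kids) {
            children.add(kid);
        }
    }
}

final class RulesCodegenSketch {
    // Walks the tree depth-first and emits a nested call in an invented DSL.
    static String emit(RulesNode node) {
        if (node.children.isEmpty()) {
            // Leaf: the remaining rules text becomes a string argument.
            return node.label + "(\"" + node.text + "\")";
        }
        StringBuilder out = new StringBuilder(node.label).append("(");
        for (int i = 0; i < node.children.size(); i++) {
            if (i > 0) {
                out.append(", ");
            }
            out.append(emit(node.children.get(i)));
        }
        return out.append(")").toString();
    }

    // Hand-built tree (an assumption, not parser output) for
    // "you may have it fight target creature you don't control".
    static RulesNode indrik() {
        return new RulesNode("whenEntersBattlefield", null,
                new RulesNode("optional", null,
                        new RulesNode("fight", null,
                                new RulesNode("target", "creature you don't control"))));
    }
}
```

Calling `RulesCodegenSketch.emit(RulesCodegenSketch.indrik())` returns `whenEntersBattlefield(optional(fight(target("creature you don't control"))))`. The point is only that each node maps mechanically to a piece of output, so no human has to write the script.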

But of course, the rub is that there could be bugs in the parser or the semantic analysis or the code generation. That has already happened. Look at Beamsplitter Mage:

It reads, in part, “Whenever […], if you control one or more other creatures that spell could target, choose one of those creatures. […]”

So the parser sees the phrase “those creatures”. Well, asks the parser, I wonder what those creatures are, I should look at the preceding sentence to figure that out. It does and it reads “one or more other creatures that spell could target” and joyfully presents to the player as possible choices all creatures that spell could target.

What the parser failed to notice are the words “if you control”: “those creatures” actually refers to “one or more other creatures under your control that the spell could target”. Here’s Benjamin Finkel, a Magic Arena developer, explaining it:

What about the mentioned unnamed program with scripted cards? Its code (slightly modified) looks like this:

public boolean match(Permanent permanent, UUID sourceId, UUID playerId, Game game) {
    return permanent.getController().equals(controllerId) // you control
            && permanent.isCreature()                     // creatures
            && !permanent.getId().equals(notId)           // other
            && target.canTarget(permanent.getId(), game); // that spell could target
}

That’s a small part of the script needed to make Beamsplitter Mage work. Obviously, it’s error-prone. But I’m still on the fence. Magic Arena developers managed to make all cards in Standard work under a reasonable approximation of the Magic rules system, so this idea of generating code from rules text has merit. But while I was writing the grammar, it always felt so brittle. When you script individual cards, new cards you create don’t often have an effect on old cards. But with a grammar, any change you make may produce a different syntax tree, with different meaning.

Maybe there are safeguards a developer can use to make those errors less likely. I would be interested in those. And I will be watching Magic Arena as it evolves, especially if developers continue to share behind-the-scenes details.

I have a couple of caveats before I end here. I said the grammar handles all cards in Guilds of Ravnica, which is true, to an extent. Here are the caveats:


Flower

Flower reads, in part, “Search your library for a basic Forest or Plains card”. The parser interprets it as “a basic Forest” or “Plains card”, as opposed to the correct “a basic card that’s a Forest or that’s a Plains”. This happens because it considers the words “basic Forest” and “Plains” to be of the same kind, when in fact one’s already a compound of a supertype and a subtype and the other is a subtype only.
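The difference between the two groupings is easy to see once they are written as predicates. This is only a sketch: `LandCard` is a hypothetical, minimal card model I made up for the example (just a “basic” flag and a set of subtypes), and neither method comes from the grammar itself.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical card model: only the "basic" supertype and the land subtypes.
final class LandCard {
    final boolean basic;
    final Set<String> subtypes;

    LandCard(boolean basic, String... subtypes) {
        this.basic = basic;
        this.subtypes = new HashSet<>(Arrays.asList(subtypes));
    }
}

final class FlowerFilterSketch {
    // The grouping the parser produces: ("basic Forest") or ("Plains card").
    static boolean asParsed(LandCard card) {
        return (card.basic && card.subtypes.contains("Forest"))
                || card.subtypes.contains("Plains");
    }

    // The intended grouping: basic and (Forest or Plains).
    static boolean intended(LandCard card) {
        return card.basic
                && (card.subtypes.contains("Forest") || card.subtypes.contains("Plains"));
    }
}
```

A nonbasic land with the Plains subtype, such as `new LandCard(false, "Forest", "Plains")`, passes `asParsed` but fails `intended`, which is exactly the kind of card the wrong reading would let you search for.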

Thief of Sanity

Thief of Sanity received undocumented, zero-day errata to its text which I noticed too late, and it’s a pretty unique card, so the parser still works with the text as printed and not the Oracle text.

Aurelia, Exemplar of Justice

Aurelia parses, but incorrectly. It reads, in part, “that creature gets +2/+0, gains trample if it’s red, and gains vigilance if it’s white.” The parser interprets this as “if it’s white, it gains vigilance” and “if it’s white and red, it gains trample and +2/+0”, as opposed to the correct “+2/+0 always”, “vigilance if white”, and “trample if red”.
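Spelled out as code, the two readings look like this. It is a hand-written sketch of the semantics only; the class and method names are made up, and nothing here is generated from the actual syntax tree.

```java
import java.util.ArrayList;
import java.util.List;

final class AureliaSketch {
    // Intended reading: each condition attaches only to its own clause.
    static List<String> intended(boolean red, boolean white) {
        List<String> gained = new ArrayList<>();
        gained.add("+2/+0");                 // unconditional
        if (red) gained.add("trample");      // "if it's red"
        if (white) gained.add("vigilance");  // "if it's white"
        return gained;
    }

    // The parser's reading: "if it's white" swallows the whole sentence,
    // and "if it's red" additionally guards everything before it.
    static List<String> asParsed(boolean red, boolean white) {
        List<String> gained = new ArrayList<>();
        if (white && red) {
            gained.add("+2/+0");
            gained.add("trample");
        }
        if (white) gained.add("vigilance");
        return gained;
    }
}
```

For a creature that is red but not white, `intended(true, false)` gives +2/+0 and trample, while `asParsed(true, false)` gives nothing at all.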

Pelt Collector

Pelt Collector reads, in part, “if that creature’s power is greater than Pelt Collector’s,” which makes sense in English but is difficult to parse because the adjective-like ‘Pelt Collector’s’ doesn’t qualify anything. It would parse correctly if it said “greater than Pelt Collector’s power”.

Ral, Vraska

I added an extra full stop (.) at the end of their ultimate abilities, because as printed, the full stop is inside the double quotes, which is, of course, proper style, just not easy to handle.
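However the change is applied, the normalization itself is tiny. Here is a sketch of it as a preprocessing step, assuming a hypothetical helper that runs on each ability before parsing; it is not part of the grammar repository, and the sample sentence in the usage note is placeholder text, not a real card’s wording.

```java
final class QuoteStopSketch {
    // Appends a full stop when the ability ends with a stop inside closing
    // double quotes (straight or curly), so the outer sentence also ends in ".".
    static String addOuterStop(String ability) {
        String trimmed = ability.trim();
        boolean stopInsideQuotes =
                trimmed.endsWith(".\"") || trimmed.endsWith(".\u201D");
        return stopInsideQuotes ? trimmed + "." : trimmed;
    }
}
```

With this, `You get an emblem with "Placeholder ability text."` gains a trailing full stop after the closing quote, and an ordinary sentence such as `Draw a card.` is returned unchanged.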

Chance for Glory

I thought that the template “[object] gains [abilities] until [something happens]” would work for all ability-gaining abilities, but Chance for Glory reads, “Creatures you control gain indestructible.” There’s no “until”, not after, not in front. They gain indestructible indefinitely. Oh well.
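The fix in spirit is to make the “until” part optional. The grammar itself is written in ANTLR, so take the following only as a stand-in: a small Java regular expression, with names I invented for the example, that accepts the template with or without a duration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GainTemplateSketch {
    // "[object] gain(s) [abilities]" with an optional " until [something happens]" tail.
    private static final Pattern GAIN =
            Pattern.compile("^(.+?) gains? (.+?)(?: until (.+))?\\.$");

    // Returns {object, abilities, duration}; duration is null when there is no "until".
    static String[] parse(String sentence) {
        Matcher m = GAIN.matcher(sentence);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a gain sentence: " + sentence);
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }
}
```

Given “Creatures you control gain indestructible.”, `parse` returns a `null` duration, which a later stage could treat as “indefinitely”; given a sentence that ends in “until end of turn.”, the duration is `end of turn`.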

Gruesome Menagerie

I cheated a little on this one. The effect is unique enough that I basically wrote a term that can only ever apply to Gruesome Menagerie.
