e4. narrowing syntax: an introduction to Minimalist compositionality

merge, move, project, and agree

Jul 25, 2025

For the last 30 years, theoretical syntax has been dominated by some form of the Minimalist Program. I love Minimalism because I am a small-M minimalist at heart, but I also love it because it basically explains away syntax as purely a function of its interfaces with semantics and phonology, and this is pretty great for Eevee’s Laws.

To be clear, Minimalism has different definitions to different people, and it’s often more of an ‘approach’ than a solidified theory. And Minimalism didn’t originally try to explain away syntax—it just tried to make syntax consist of as few bells and whistles as possible.

The common mantra nowadays is that Minimalism is “just Merge.” Meaning, syntax as a whole consists of a single mental operation, which takes two nodes and makes them into a bigger one, labelling it with one of the daughter node labels. Merge takes A and B and makes A: {A, B} or B: {A, B}. This means that all syntax trees are binary-branching: nodes can have at most two daughters.

For example, if you want to get “big cat,” you take “big” and “cat” and put them together.
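Here is a rough sketch of what that could look like as code. The `Node` class, the `merge` function, and the "Adj" label for big are my own illustrative names, not standard notation, and the choice of which daughter supplies the label is simply passed in by hand for now.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    label: str                      # category label, e.g. "N" (hypothetical naming)
    word: Optional[str] = None      # only leaves carry a word
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def merge(a: Node, b: Node, head: Node) -> Node:
    """Combine two nodes into a bigger one, labelled after `head`."""
    if head is not a and head is not b:
        raise ValueError("head must be one of the two merged nodes")
    return Node(label=head.label, left=a, right=b)


big = Node("Adj", word="big")
cat = Node("N", word="cat")

big_cat = merge(big, cat, head=cat)
print(big_cat.label)  # N
```

Every node built this way has exactly two daughters, which is the binary-branching property in miniature.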

…And that’s it. That’s syntax.

Big if true.

But how in the world can you get the elaborate trees that syntacticians draw when you only have a single operation that literally just puts two things together?

Well, the idea is that things from semantics and phonology kind of seep into syntax through the edges, and the most important thing that seeps in from semantics, or from the lexicon, is features.

features, interpretability, and agreement

If you remember our lexicon from last time:

We said that lemonade had the [liquid] feature, and strawberry didn’t, and that’s why drink could take lemonade but not strawberry as an object. But where does it say that—how do we formally represent that drink can only take objects with the [liquid] feature?

The simplest solution is to say that actually each word has two types of features: features that the word has, and features that the word wants in other words that it combines with.

Interpretable features: features a word has.
Uninterpretable features: features a word wants in other words it combines with.

Lemonade has an interpretable [liquid] feature, and drink has an uninterpretable [liquid] feature. We can distinguish these by putting an ‘i’ or ‘u’ in front of the feature name in our lexicon table, giving [iliquid] for lemonade and [uliquid] for drink.

The [liquid] feature is a semantic feature, but there are also other types of features:

categorical features (noun, verb, etc.)
ɸ-features (third-person, feminine, singular, etc.)
case features (nominative, accusative, etc.)

For example, the verb drink has, among others, these features:

interpretable category: [V]
interpretable case: [Accusative]
uninterpretable categories: [N], [N]
uninterpretable semantics: [liquid]
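One way to picture a lexicon entry like this is as a small record with two feature lists. This is only a sketch: `LexicalEntry` and `notation` are names I made up, and the entry for lemonade is pieced together from what this post says about it.

```python
from dataclasses import dataclass, field


@dataclass
class LexicalEntry:
    form: str
    interpretable: list[str] = field(default_factory=list)    # features the word has
    uninterpretable: list[str] = field(default_factory=list)  # features the word wants


drink = LexicalEntry(
    form="drink",
    interpretable=["V", "Accusative"],
    uninterpretable=["N", "N", "liquid"],  # a list, because [N] is wanted twice
)
lemonade = LexicalEntry(
    form="lemonade",
    interpretable=["N", "liquid"],
    uninterpretable=["Case"],
)


def notation(entry: LexicalEntry) -> list[str]:
    """Render features with the 'i' / 'u' prefixes used in the lexicon table."""
    return [f"i{f}" for f in entry.interpretable] + [f"u{f}" for f in entry.uninterpretable]


print(notation(drink))     # ['iV', 'iAccusative', 'uN', 'uN', 'uliquid']
print(notation(lemonade))  # ['iN', 'iliquid', 'uCase']
```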

Uninterpretable features are BAD. If there are any uninterpretable features left in a sentence when syntax is done with it, that sentence won’t work. We won’t be able to interpret it. In other words, if a word wants some characteristic in a word that it combines with, and it doesn’t get it, the sentence fails.
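As a toy illustration of that failure condition, imagine we keep a tally of which uninterpretable features are still sitting on each word once syntax is done. The function names and the dictionary layout are assumptions of mine, and the tally only tracks the semantic feature for brevity.

```python
def leftover_features(unchecked: dict[str, list[str]]) -> list[str]:
    """List every uninterpretable feature that never got what it wanted."""
    return [f"[u{feat}] on {word}" for word, feats in unchecked.items() for feat in feats]


def converges(unchecked: dict[str, list[str]]) -> bool:
    """A derivation works only if no uninterpretable features remain."""
    return not leftover_features(unchecked)


# Hypothetical end states, tracking only the semantic feature:
drink_strawberry = {"drink": ["liquid"], "strawberry": []}
drink_lemonade = {"drink": [], "lemonade": []}

print(converges(drink_lemonade))            # True
print(converges(drink_strawberry))          # False
print(leftover_features(drink_strawberry))  # ['[uliquid] on drink']
```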

We can make uninterpretable features go away by combining them with interpretable features of the same type. This is called the Agree operation.

For example, in order for the uninterpretable feature [liquid] of drink to go away, it needs to combine with something that has the interpretable feature [liquid]. In fact, merging drink and lemonade has multiple benefits: it gets rid of the uninterpretable [liquid] feature on drink, the uninterpretable [Case] feature on lemonade, and one of the uninterpretable [N] features on drink.

When an uninterpretable feature gets deleted, the interpretable feature basically takes its place. At the end of the derivation of drink lemonade, lemonade will have ACC (accusative case)—which in many languages is marked overtly.
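The drink lemonade example can be sketched as a small feature-checking routine. Everything here is an assumption made for illustration: features are (kind, value) pairs, an unvalued want like [uCase] is written with `None`, and each interpretable feature is allowed to delete at most one uninterpretable feature (which is why only one of the two [uN] features on drink goes away).

```python
from dataclasses import dataclass, field
from typing import Optional

Feature = tuple[str, Optional[str]]  # (kind, value); value None means "any value"


@dataclass
class Item:
    form: str
    i: list[Feature] = field(default_factory=list)  # interpretable
    u: list[Feature] = field(default_factory=list)  # uninterpretable


def matches(wanted: Feature, offered: Feature) -> bool:
    """Same kind, and either the want is unvalued or the values are equal."""
    return wanted[0] == offered[0] and wanted[1] in (None, offered[1])


def check(wanter: Item, giver: Item) -> None:
    """Delete every want on `wanter` that `giver` can satisfy."""
    available = list(giver.i)
    remaining = []
    for wanted in wanter.u:
        hit = next((f for f in available if matches(wanted, f)), None)
        if hit is None:
            remaining.append(wanted)
        else:
            available.remove(hit)       # each interpretable feature is used once
            if wanted[1] is None:
                wanter.i.append(hit)    # the valued feature takes its place
    wanter.u = remaining


def agree(a: Item, b: Item) -> None:
    check(a, b)
    check(b, a)


drink = Item("drink", i=[("cat", "V"), ("case", "Acc")],
             u=[("cat", "N"), ("cat", "N"), ("sem", "liquid")])
lemonade = Item("lemonade", i=[("cat", "N"), ("sem", "liquid")], u=[("case", None)])

agree(drink, lemonade)
print(drink.u)     # [('cat', 'N')]
print(lemonade.u)  # []
print(lemonade.i)  # [('cat', 'N'), ('sem', 'liquid'), ('case', 'Acc')]
```

After the call, drink is left with one [uN], and lemonade has picked up accusative case, matching the description above.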

Interpreting features is THE reason why things combine with other things. Merge is the mechanism, and Agree is the motivation.

projection and labelling

Before, we said that Merge can ‘label’ a node with the label of either of its daughters. A big question for our theory is how we choose which label.

The ‘labels’ from before are actually just interpretable category features, and thus the problem of projection isn’t unique to categories: it’s a problem for all types of features. How do we choose which features a mother node gets from which of its daughter nodes? The ability of features to be transferred from daughter node to mother node is called projection.

For example, it seems intuitively clear to me that big cat should be a noun and an animal, not an adjective. That suggests that the features of cat should project, not those of big. So maybe the rightmost item in a merge always projects its features to its mother node?

We usually represent interpretable categories on trees as labels (the letters in red).

Let’s try it on Ella drinks lemonade:

Hmm. It doesn’t seem right to say that drink lemonade is a noun, or a liquid. It seems more intuitive to say that it’s a verb.

Furthermore, there are problems with this tree. Drink still has an uninterpretable [N] feature that hasn’t been deleted. And right now Ella has no reason to merge: it’s not getting rid of any of its uninterpretable features, or any of those of drink lemonade. A key part of Minimalism is that everything in syntax needs to happen for a reason.

If the features of drink projected, and drink lemonade was a verb, then it would have a [uN] feature, which would be interpreted by merging Ella into the tree. This seems way better:
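To make the bookkeeping concrete, here is a minimal sketch in which the projecting daughter hands its features to the mother node. It is deliberately simplified: `Node`, `merge`, and `head_side` are invented names, only category and semantic features are tracked (case and tense are ignored), and only the head's wants are checked.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    form: str
    i: list[str] = field(default_factory=list)  # interpretable features
    u: list[str] = field(default_factory=list)  # uninterpretable features


def merge(left: Node, right: Node, head_side: str) -> Node:
    """Merge two nodes; the daughter on `head_side` projects to the mother."""
    if head_side not in ("left", "right"):
        raise ValueError("head_side must be 'left' or 'right'")
    head, other = (left, right) if head_side == "left" else (right, left)
    offered = list(other.i)
    unchecked = []
    for feat in head.u:
        if feat in offered:
            offered.remove(feat)    # checked and deleted
        else:
            unchecked.append(feat)  # still wanted, so it projects upward
    return Node(f"{left.form} {right.form}", i=list(head.i), u=unchecked)


drink = Node("drink", i=["V"], u=["N", "N", "liquid"])
lemonade = Node("lemonade", i=["N", "liquid"])
ella = Node("Ella", i=["N"])

vp = merge(drink, lemonade, head_side="left")
print(vp.i, vp.u)    # ['V'] ['N']

sentence = merge(ella, vp, head_side="right")
print(sentence.form, sentence.i, sentence.u)  # Ella drink lemonade ['V'] []
```

Because drink projects, the leftover [uN] travels up to drink lemonade, and that is exactly what gives Ella a reason to merge.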

So in the case of big cat, we want cat to project, and in the case of drink lemonade, we want drink to project. How do we represent this formally?

This is a hard question, and honestly, I tried really hard to answer it here, but it relies on things we haven’t gotten to yet, and that would turn this post into a 20+ minute read… so I’m going to leave it hanging for now. Sorry! But it’ll be first on our Running Questions list, and we WILL get back to it soon. Promise.

Running Questions
1. Why does cat project its features in big cat, but drink projects its features in drink lemonade?

So now let’s finish off Minimalism with the last operation: Move.

movement

Move (also known as Internal Merge or Copy and Delete) is our last operation. Some people think it’s a distinct operation, but for others it’s just a combination of Merge and Agree.

You might have noticed that while we did get rid of the second uninterpretable [N] feature on drink by adding Ella to the tree, we actually introduced another uninterpretable feature, and never got rid of it: Ella still has [uCase].

You may have also noticed that this sentence looks a little bit funny. The surface forms in blue actually say Ella drink lemonade, not Ella drinks lemonade. This is because the verb drink is currently just a root—it doesn’t have grammatical tense. We can represent this formally by giving drink an uninterpretable [Tense] feature, meaning “I want tense!”, and we can get rid of this feature by merging the present tense marker -s into our tree.

But -s drink Ella lemonade is not the right sequence of words! That’s not what we think or say! We have to move things around. And Ella still needs case, and drink still needs tense. Despite the fact that -s has been merged into the tree, it still isn’t close enough to drink or Ella to interpret their features.

You might think that since we’ve already merged Ella and drink elsewhere in the tree, all is lost. But actually, we can just merge them again, closer to the tense marker!

The lines through the bottom copies of Ella and drink indicate that we don’t pronounce them when we turn the sequence of words into phonetic form—there’s a more general rule in phonology that we only pronounce the first copy of each word (it’s called ‘Chain Reduction’).
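As a rough sketch of Chain Reduction, we can read the leaves of the tree from left to right and keep only the first copy of each word. The leaf order below is my reading of the derivation described in the text, copies are identified simply by having the same string (which would break for a sentence that genuinely uses a word twice), and gluing -s onto the preceding word is a stand-in for real phonology.

```python
def chain_reduction(leaves: list[str]) -> list[str]:
    """Keep the first copy of each word and drop every later copy."""
    seen = set()
    pronounced = []
    for word in leaves:
        if word not in seen:
            seen.add(word)
            pronounced.append(word)
    return pronounced


def spell_out(words: list[str]) -> str:
    """Attach suffixes (written with a leading '-') to the preceding word."""
    out = []
    for word in words:
        if word.startswith("-") and out:
            out[-1] += word[1:]
        else:
            out.append(word)
    return " ".join(out)


# Leaves after raising drink and then Ella above the tense marker:
leaves = ["Ella", "drink", "-s", "Ella", "drink", "lemonade"]

print(chain_reduction(leaves))             # ['Ella', 'drink', '-s', 'lemonade']
print(spell_out(chain_reduction(leaves)))  # Ella drinks lemonade
```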

Why do we raise drink first and Ella second? Obviously the external motivation is that this produces the right word order, but we still need a theory-internal reason for why English speakers don’t say drink Ella-s lemonade instead.

Actually—why do we raise them at all? How close do interpretable and uninterpretable features need to be to combine with each other?

Let’s start off our linguistic journey with what I'd argue is the null hypothesis. When words want things from other words, they need to be right next to each other:

The Sisterhood Principle: Uninterpretable features can only be interpreted by the interpretable features of their sister.

In other words, to interpret an uninterpretable feature, it needs to be directly merged with an interpretable feature of the same type.

Unfortunately, this doesn’t explain why we need to move drink up closer to the tense marker: the verb phrase Ella drink lemonade has a [uTense] feature, and its sister has an [iTense] feature. Why can’t they combine right when the tense marker first merges in?
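The puzzle is easy to see if we write sisterhood down as a check on a tree. In this sketch, a tree is just nested pairs, and the `sisters` helper and the bracketing are assumptions of mine, based on the order in which things merged above.

```python
def sisters(tree, a, b) -> bool:
    """True if `a` and `b` are the two daughters of some node in `tree`."""
    if not isinstance(tree, tuple):
        return False  # a leaf has no daughters
    left, right = tree
    if (left, right) in ((a, b), (b, a)):
        return True
    return sisters(left, a, b) or sisters(right, a, b)


# [-s [Ella [drink lemonade]]], before anything moves
vp = ("Ella", ("drink", "lemonade"))
tree = ("-s", vp)

print(sisters(tree, "drink", "lemonade"))  # True
print(sisters(tree, "-s", "drink"))        # False
print(sisters(tree, "-s", vp))             # True
```

The tense marker is not a sister of drink itself, but it is a sister of the whole verb phrase, and that phrase carries [uTense] if drink projects. So the Sisterhood Principle on its own does not force the verb to raise.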

These are tough questions to answer now, but I will add them to our Running Questions list and we’ll get to them soon:

Running Questions
1. Why does cat project its features in big cat, but drink projects its features in drink lemonade?
2. Why does the verb raise to be close to the tense marker before the subject does?
3. Why does the verb need to raise at all?

conclusion

This episode might have felt like a bit of a whirlwind if you haven’t been exposed to theoretical syntax before, but unfortunately that was kind of inevitable. This blog isn’t going to be an introduction to syntax, so I just had to get the necessary background out of the way.

But like… you just learned almost all of Minimalism, the leading framework in theoretical syntax, in a single blog post! There are a few more important things (phrases and phase theory, for example), but this is pretty much all of syntax!

I also hope this post was not overly boring if you already knew what Minimalism was. I think I’ve made choices here that aren’t shared by all Minimalists, so hopefully my specific take on things is interesting enough. And it just felt really good to type this stuff out and see if I really understand it on a deeper level.

explaining syntactic variation

One last thing I want to mention is how this setup could explain syntactic variation between languages. For example, let’s say you have a language with SOV word order—where the verb comes last. What would be different about that language from English? Remember that the theory I’m aiming for is trying to ascribe all syntactic variation to the lexicon.

One way to solve this (although by no means the only or the correct way) is to say that in the SOV language, verbs don’t have interpretable case features, so their objects as well as their subjects have to raise in order to get case.

And that’s how variation can be explained by features—by differences in the lexicon!

Lmk your thoughts about Minimalism!

important takeaways

The basic operation of syntax is Merge, where two syntactic objects become one.
Lexical entries have both uninterpretable and interpretable features.
Uninterpretable features need to be directly merged with interpretable features of the same type in order to be deleted. This is called Agree.
If a tree has any uninterpretable features after syntax is done with it, it will fail.
Features Project up the tree through the labelling algorithm.
Sometimes syntactic objects Move up the tree in order to get their uninterpretable features deleted.

note that this is *a* theory. not the most well researched theory or the most accurate theory. just the one david came up with in his backyard for these reasons. david might sound confident but he is not. check out the table of contents here.

This will be revised.

backyard biolinguistics
