Notes to Generalized Quantifiers
1. The use of “quantifier” (German “quantor”, French “quantificateur”, etc.) to denote \(\forall\) and \(\exists\) became established in logic toward the end of the 1920s.
2. I use the standard convention of letting mathematical expressions denote themselves whenever convenient.
3. As a rule of thumb I use typewriter font for quantifier expressions and italics for the signified quantifiers. In logical languages, on the other hand, it is convenient to abuse notation somewhat by using the same symbol for both the expression and the quantifier, when no confusion results. I sometimes do the same for predicate symbols, so that the letters A, B, …, R, … stand for both the symbol and the set or relation it denotes.
4. See Parsons (1997 [2017]) for an illuminating account of the Aristotelian square of opposition and some modern misunderstandings about it, and Westerståhl (2012) for a comparison with the “modern square”, which differs from the Aristotelian one in that all doesn’t have existential import but not all does (for Aristotle and most of his medieval followers, the reverse held).
5. See Peters and Westerståhl (2006: Ch. 2.5), for a proof that there does not exist a semantics assigning individuals or sets of individuals even to phrases of the two forms all A and some B in a systematic (compositional) way.
6. Some such properties were considered even in the early days of predicate logic, for example, the quantifier \(\exists !\) meaning “there exists exactly one”.
7. Rather than seeing generalized quantifiers as mappings from universes to second-order relations, Lindström took them to be classes of models of the corresponding type. This is a negligible difference, since we have \[Q_M(R_1,\ldots,R_k) \iff (M,R_1,\ldots,R_k) \in Q\] (see section 6 for the notation on the right-hand side).
8. It is often held that the idea of a totality of everything has been shown to be incoherent by Russell’s paradox. Indeed, the paradox proved Frege’s original system to be inconsistent, and shows that there cannot be a set containing all sets. Williamson (2003) claimed that an absolute notion of “everything” can be made formally coherent, although a semantics in which interpretations are not objects is needed. This sparked off a debate about absolutely general quantification; see, e.g., Cartwright (1994), Glanzberg (2004), Linnebo (2006), Rayo (2012), and the collection Rayo and Uzquiano (2006). Filin Karlsson (2018) gives an overview and suggests an account in terms of stratified set theory (NFU).
9. These are the modern variants. Aristotle considered all with existential import, i.e.,
\[(\textit{all}_{\,\text{ei}})_M(A,B) \iff \emp\neq A \subseteq B\]and similarly for not all, which he took to be the negation of \(\textit{all}_{\,\text{ei}}\).
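The contrast can be made concrete with a minimal sketch over finite Python sets. The function names `all_modern`, `all_ei`, and `not_all_ei` are illustrative assumptions, not notation from the text, and the universe argument is dropped because it plays no role for these quantifiers.

```python
def all_modern(A, B):
    """Modern all: A is a subset of B (vacuously true when A is empty)."""
    return A <= B

def all_ei(A, B):
    """all with existential import: A is non-empty and a subset of B."""
    return bool(A) and A <= B

def not_all_ei(A, B):
    """not all in the Aristotelian sense, taken as the negation of all_ei."""
    return not all_ei(A, B)

# With an empty restriction the two variants of all come apart.
unicorns, animals = set(), {"cat", "dog"}
print(all_modern(unicorns, animals))  # True
print(all_ei(unicorns, animals))      # False
print(not_all_ei(unicorns, animals))  # True
```

On non-empty A the two versions of all agree, so the difference only shows up for empty restrictions.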
10. Below, we are interpreting most as “more than half of the” (on finite universes). Often, most rather seems to mean something like “a large majority of the”. There is probably some vagueness involved, as well as an ambiguity: the “threshold” may be set at different values in different contexts. We are not dealing with vagueness here, and we assume that a fixed context assumption chooses a suitable meaning. By default, we thus set the “threshold” to 1/2.
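As a rough illustration of the “threshold” idea on finite universes, here is a sketch in which the threshold is an explicit parameter. The parameter name `threshold` and the use of exact fractions are assumptions made for the example; only the default value 1/2 comes from the text.

```python
from fractions import Fraction

def most(A, B, threshold=Fraction(1, 2)):
    """most(A, B) on finite sets: |A ∩ B| > threshold * |A|."""
    return len(A & B) > threshold * len(A)

A = {1, 2, 3, 4}
B = {1, 2, 3}
print(most(A, B))                  # True: 3 of 4 is more than half
print(most(A, B, Fraction(4, 5)))  # False: 3 of 4 is not more than 4/5
```

With the default threshold, the condition is equivalent to \(|A\cap B| > |A-B|\).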
11. A main source for the mathematics of model-theoretic logics is Barwise and Feferman (1985).
12. Here is a fact which is non-trivial but still relatively easy to prove:
\[\FO(\textit{MO}) \equiv \FO(Q_0,\textit{most})\]See Peters and Westerståhl (2006) for proofs of this and similar facts.
13. Nor does it have the completeness property or the Tarski property, though it does have the Löwenheim property.
14. See Ebbinghaus and Flum (1995) for the mathematics, and Westerståhl (1989) or Peters and Westerståhl (2006) for surveys focused on linguistic applications.
15. For the coding of quantifiers of arbitrary monadic types, or polyadic quantifiers, it may be practical to use a few more symbols in the coding language.
16. See, for example, Hopcroft and Ullman (1979) for an introduction to automata theory.
17. In the sense that if a binary word is accepted, so are all permutations of that word (when coded as suggested above).
18. Here 0 can be defined as the unique y in N such that \(y + y = y\). For more results in this area, see M. Mostowski (1998) and Steinert-Threlkeld and Icard (2013).
19. This is one way of providing for recursive definitions in the logic. A simpler operator that one can add is the transitive closure operator. If R is a binary relation, the transitive closure of R, \(\TC(R)\), is the smallest transitive relation containing R. It can also be defined as follows (cf. the definition of RECIP in the list (12)):
\[a\TC(R)b \iff \exists n \geq 1\, \exists x_0,\ldots,x_{n} [x_0 = a \wedge x_{n} = b \wedge x_iRx_{i+1} \text{ for } i<n]\]Note the quantification over n: \(\TC(R)\) is not in general definable in FO from R. It can also be defined recursively:
\[a\TC(R)b \iff aRb \vee \exists x [aRx \wedge x\TC(R)b]\]To be able to do this inside our logic we can add formulas of the form \(\TC(x,y,\f)(u,v)\) whenever \(\f\) is a formula, and the semantic rule that when \(a,b \in M\), \(\p(x,y,\zbar)\) has the free variables shown, and \(\cbar\) corresponds to \(\zbar\),
\[\M\models \TC(x,y,\p(x,y,\cbar))(a,b) \iff (a,b) \in \TC(\p(x,y,\cbar)^{\M,x,y})\]This gives us the logic \(\FO(\TC)\); the LFP operator generalizes this to other forms of recursion. See Ebbinghaus and Flum (1995) for details about the definitions, and for results about these logics, including those mentioned in this and the next two paragraphs.
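On finite relations, the two definitions of \(\TC(R)\) above can be compared directly. The following is a sketch only: relations are represented as Python sets of pairs, and the function names are hypothetical. The first function iterates the recursive clause until nothing new is added (a least fixed point); the second searches for a path of length at least 1.

```python
def transitive_closure(R):
    """Least T with: a T b iff a R b, or a R x and x T b for some x."""
    T = set(R)
    while True:
        # One application of the recursive clause to the current T.
        step = {(a, b) for (a, x) in R for (y, b) in T if x == y}
        if step <= T:
            return T
        T |= step

def reachable_by_path(R, a, b):
    """Path definition: x_0 = a, ..., x_n = b with n >= 1 and x_i R x_{i+1}."""
    seen = set()
    frontier = {y for (x, y) in R if x == a}
    while frontier:
        if b in frontier:
            return True
        seen |= frontier
        frontier = {y for (x, y) in R if x in frontier} - seen
    return False

R = {(1, 2), (2, 3), (3, 4)}
T = transitive_closure(R)
print(sorted(T))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

The number of iterations needed depends on the size of the relation, which reflects the quantification over n in the path definition.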
20. Thereby contradicting common claims that natural languages are too unsystematic and chaotic to allow for a mathematically precise treatment; cf., for example, Russell’s “Misleading Form Thesis”. See Montague (1974), in particular the papers “Universal grammar” and “The proper treatment of quantification in ordinary English”.
21. Barwise & Cooper (1981), Keenan & Stavi (1986), Higginbotham & May (1981). The papers in van Benthem (1986) provided further logical and linguistic development of these ideas. Surveys of the whole area are Westerståhl (1989), Keenan & Westerståhl (2011), and Peters & Westerståhl (2006).
22. I use the traditional NP for “noun phrase” and N for “noun”; linguists today prefer DP and NP, respectively.
23. (a) is practically immediate, and for (b) it is easy to verify that \(Q{^{\text{rel}}}\) always satisfies Conserv and Ext when Q is of type \({\langle}1{\rangle}\). In the other direction, any Conserv and Ext \(Q'\) has a type \({\langle}1{\rangle}\) “counterpart” Q defined by \(Q_{M}(B) \Leftrightarrow Q'_{M}(M,B)\); then
\[ \begin{align} Q{^{\text{rel}}}_{M}(A,B) & \iff Q_{A}(A\cap B) & \text{(by definition of }Q{^{\text{rel}}})\\ & \iff Q'_{A}(A,A\cap B) & \text{(by definition of } Q') \\ & \iff Q'_{M}(A,A\cap B) & \text{(by E}\textsc{XT})\\ &\iff Q'_{M}(A,B) & \text{(by C}\textsc{ONSERV}), \end{align} \]so \(Q' = Q{^{\text{rel}}}\).
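The claim in (b) can be checked by brute force on small finite universes. This is a sketch under stated assumptions: a type \({\langle}1{\rangle}\) quantifier is modelled as a Python function of `(M, B)`, a type \({\langle}1,1{\rangle}\) quantifier as a function of `(M, A, B)`, and the example quantifiers `half` and `fewer_than` are hypothetical choices made for illustration.

```python
from itertools import combinations

def subsets(M):
    """All subsets of the finite set M, as frozensets."""
    items = list(M)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def rel(Q):
    """Relativization: Q^rel_M(A, B) iff Q_A(A ∩ B)."""
    return lambda M, A, B: Q(A, A & B)

def is_conserv(Q2, M):
    """Conserv on M: Q2_M(A, B) iff Q2_M(A, A ∩ B)."""
    return all(Q2(M, A, B) == Q2(M, A, A & B)
               for A in subsets(M) for B in subsets(M))

def is_ext(Q2, M, M_big):
    """Ext for M ⊆ M_big: Q2_M(A, B) iff Q2_{M_big}(A, B) for A, B ⊆ M."""
    return all(Q2(M, A, B) == Q2(M_big, A, B)
               for A in subsets(M) for B in subsets(M))

def half(M, B):
    """Hypothetical type <1> quantifier: at least half of the universe is in B."""
    return 2 * len(B) >= len(M)

def fewer_than(M, A, B):
    """Type <1,1> quantifier that is not Conserv: |A| < |B|."""
    return len(A) < len(B)

M, M_big = frozenset(range(3)), frozenset(range(4))
print(is_conserv(rel(half), M), is_ext(rel(half), M, M_big))  # True True
print(is_conserv(fewer_than, M))                              # False
```

Such a check covers only the universes tried, so it illustrates the general argument rather than replacing it.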
24. But there again it is often natural to consider the Ext quantifier \(W{^{\text{rel}}}\), where \(W{^{\text{rel}}}_M(A,R)\) says that R is a well-ordering of A.
25. Indeed, the intersective quantifiers mentioned so far have the stronger property of being cardinal; i.e., only the cardinality of \(A\cap B\) matters. An example of an intersective but non-cardinal quantifier is no _ except Mary, defined in the next section.
26. See Keenan and Westerståhl (2011: sect. 19.2) for discussion.
27. For detailed discussion and further references, see Peters and Westerståhl (2006): Ch. 6.3 for existential there sentences, and Ch. 5 for much more on monotonicity, including the connection with polarity items.
28. For much more on this, and on the treatment of possessive and exceptive determiners in general, see Peters and Westerståhl (2006, 2013).
29. This is the universal reading of John’s. There is also an existential reading, e.g., in
When John’s dogs escape, his neighbors usually catch them.
The antecedent doesn’t say that all of his dogs escape, only that some of them do.
30. See Bonnay (2008) and references therein for an overview of the discussion of conditions like Isom in the context of logicality.
31. For example, from their relativized versions. This suggestion about sameness or constancy is further discussed in Westerståhl (2017).
32. van Benthem (1989) suggested that (37) could be interpreted as: “R includes a 1-1 function from A to B”, which is not FO-definable.
33. This result is proved in Luosto (2000); the proof is quite difficult. More general results on the undefinability of resumption can be found in Hella, Väänänen, & Westerståhl (1997). Some discussion of the linguistic aspects appears in Peters & Westerståhl (2002).
34. See Dalrymple et al. (1998) for an extended discussion. RECIP is not definable in FO.
35. Branching or partially ordered quantifiers are another way of generalizing (prefixes of) \(\forall\) and \(\exists\); they appeared in logic with Henkin (1961). Hintikka (1973) argued that partially ordered prefixes with \(\forall\) and \(\exists\) that are not first-order definable occur essentially in English too. The debate that followed Hintikka’s proposal was re-analyzed in Barwise (1979), who also suggested that (42) is a clearer example of branching in English than Hintikka’s original examples. Semantically, branching quantifiers are already subsumed under our notion of a generalized quantifier, since they can all be seen as polyadic quantifiers, like \(Br(Q_1,Q_2)\), although the special syntax is then lost.
Note that the construction in (42) only makes good sense when \(Q_1\) and \(Q_2\) are right monotone increasing, i.e., \(Q_i(A,B)\) and \(B\subseteq B'\) entail \(Q_i(A,B')\), \(i = 1,2\). Then
\[Q_i(A,B) \iff \exists X\, [Q_i(A,X) \amp X \subseteq B]\]and one sees that (42) is a generalization of this. There has been some discussion about if and how (42) can be reformulated for other quantifiers; apart from Barwise (1979), see Westerståhl (1987) and Sher (1997).
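The displayed equivalence for right monotone increasing quantifiers can be tested by brute force on a small universe. In this sketch the quantifiers are written as functions of `(A, B)` only, which assumes they do not depend on the universe; the names `has_witness`, `equivalence_holds`, `at_least_two`, and `exactly_two` are illustrative assumptions.

```python
from itertools import combinations

def subsets(M):
    """All subsets of the finite set M, as frozensets."""
    items = list(M)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def has_witness(Q, A, B):
    """Right-hand side: there is some X ⊆ B with Q(A, X)."""
    return any(Q(A, X) for X in subsets(B))

def equivalence_holds(Q, M):
    """Check Q(A, B) iff has_witness(Q, A, B) for all A, B ⊆ M."""
    return all(Q(A, B) == has_witness(Q, A, B)
               for A in subsets(M) for B in subsets(M))

def at_least_two(A, B):
    """Right monotone increasing."""
    return len(A & B) >= 2

def exactly_two(A, B):
    """Not right monotone increasing."""
    return len(A & B) == 2

M = frozenset(range(4))
print(equivalence_holds(at_least_two, M))  # True
print(equivalence_holds(exactly_two, M))   # False
```

For exactly two, the witness form collapses to at least two, which is one way to see why the construction needs reformulating for non-monotone quantifiers.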
36. Peters and Westerståhl (2006) argues that the difference between D- and A-quantification is mainly syntactic, and that all languages appear to be able to express Conserv and Ext type \({\langle}1,1{\rangle}\) quantifiers (whether by determiners or other means), even though some languages are claimed to lack phrases denoting type \({\langle}1{\rangle}\) quantifiers.
37. These 34 chapters are written by linguists or linguistics students, most of whom are native speakers of the respective language.
38. Szabolcsi agrees that some alleged shortcomings of GQ theory are orthogonal to its aims, but thinks the compositionality and scope problems are serious. As to compositional analysis of the meaning of complex determiners, she sees no principled problems with adding such analyses to current GQ theory. Her intricate discussion of scope is too complex to be sketched here.
39. The ubiquity of this property, which is also called smoothness, is discussed at length in Peters and Westerståhl (2006: Ch. 5).
40. For example, is the premise true if most Americans who know three foreign languages speak at least one of them at home? Or must they speak all three, or in general most of the foreign languages they know, at home?
41. In fact, not just monotonicity but also exclusion: being able to reason from the fact that two predicates are disjoint, which also comes naturally to speakers. Technically, the insight, which essentially goes back to Keenan and Faltz (1984), is that while mere monotonicity, i.e., that \(x \leq y\) entails \(f(x) \leq f(y)\), only requires a pre-order (a reflexive and transitive relation), the domains of the relevant functions here are usually bounded distributive lattices, which enables one to express other properties besides monotonicity, in particular exclusion. A recent formulation of the monotonicity calculus is given in Icard, Moss, and Tune (2017).
42. Let C be the predicate “likes every clarinetist”. So Pat is a C (second premise). So if you like every C you like Pat. But then, using the first premise, if you like every C, everyone likes you. And that’s what the conclusion says. Each step in this argument can be construed as an application of the monotonicity profile \(-\textit{every}+\).
43. As should be clear from the above, the monotonicity calculus and the axiomatized syllogistic fragments can be seen as different ways to approach the same phenomena, a point of view explored in Icard (2014).
44. More exactly, it is NP-complete. Further, Mostowski and Wojtyniak (2004) proved that the branching construction in Hintikka’s famous villagers sentence is also NP-complete:
Some relative of each villager and some relative of each townsman hate each other.
45. This is supervised learning (back-propagation through a recurrent neural network): the network is asked if \(Q_M(A,B)\) holds in simple models \((M,A,B)\), reacts to feedback depending on the answer, and tries again. The lack of an effect for Conserv is explained by the fact that in the set-up used there is no difference between A and B in the models: the sets \(A-B\), \(A\cap B\), \(B-A\), and \(M-(A\cup B)\) are all on a par.