Unifying Labeling under Minimal Search in "Single-" and "Multiple-Specifier" Configurations*

1. Strong Minimalist Thesis

According to the strong minimalist thesis (SMT), language is a "perfect system," meeting the interface conditions in a way satisfying third factor principles (Chomsky 1995). Under SMT, the combinatorial operation of the generative procedure is a simple set formation device called Merge that takes α and β, and forms {α, β}. The "perfect system" must have Merge, and ideally only Merge, and we expect this very simple, Merge-based system to interact with third factor principles such as Minimal Search (MS). That’s the ideal picture. Of particular importance is the extent to which we get significant results through the interaction of Merge and MS.

With this in mind, let's examine the framework outlined by Chomsky (2013). First, the single structure-building operation Merge puts two objects α and β into a relation, and the output of this operation is a two-membered set {α, β}. There are two important points concerning the application of Merge: (i) Merge applies freely as long as it conforms to third factor principles, and (ii) Merge does not encode a label; there is no labeled categorial node above α and β, so the categorial status of the set {α, β} is representationally unidentified. For a syntactic object SO to be interpreted, however, it is necessary to know what kind of object it is (e.g., nominal, verbal, etc.). So, there must be some process of finding the relevant, category-identifying information of {α, β}, generated by Merge. Chomsky (2013) takes what is referred to as 'labeling' to be the relevant process, and labeling is "just minimal search, presumably appropriating a third factor principle, as in Agree and other operations." Labeling is MS, i.e. MS finds the relevant head(s) allowing for categorial identification at the interface. Suppose SO = {H, XP}, H a head and XP not a head. Here,


Interpretation is violated at CI; the ambiguity is assumed to be intolerable. Chomsky (2013) argues that there are two ways to remedy this situation: (i) modify SO so that there is only one visible head, or (ii) X and Y are identical in a relevant respect, providing the label of the SO.

Take a concrete case, v*P with external argument NP (SO1 = {{N, α}, {v*, β}}), where both N and v* are located by MS. One way to label SO1 is to raise {N, α} to a higher position. This movement leaves behind an invisible copy of {N, α}, whose invisible status follows if γ is taken to be in domain D iff every occurrence of γ is a term of D (Chomsky 2013).² Under this algorithm, in SO1 = {{N, α}, {v*, β}}, MS finds the only "visible" head v* as the label of SO1 (given that the lower copy of {N, α} is invisible). As for SO2 = {{N, α}, {T, SO1}}, formed by the movement of {N, α}, Chomsky (2013) proposes that MS identifies the label for SO2 with the prominent shared features of N and of T, namely the v(alued)Phi on N and u(nvalued)Phi on T, which participate in agreement. The idea is that MS finds the single feature Phi, and Phi will then count as the category identifier (i.e. the label) for SO2 (see also Chomsky 2015).
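The two labeling scenarios just described can be illustrated with a minimal sketch. It is not Chomsky's (2013) algorithm as such: the names `Head`, `merge`, `heads_in`, and `label` are hypothetical, features are reduced to bare names such as "Phi", phrases like α are opaque strings, and the invisibility of lower copies is not modeled. Under those assumptions, {H, XP} is labeled by H, while {XP, YP} is labeled only if the two heads share exactly one feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    name: str
    features: frozenset = frozenset()  # feature names only, e.g. {"Phi"}


def merge(a, b):
    """Merge(α, β) = {α, β}: an unordered set with no label of its own."""
    return frozenset({a, b})


def heads_in(so):
    """Heads that are immediate members of a set-formed SO."""
    if not isinstance(so, frozenset):
        return []  # opaque placeholder such as "α"
    return [m for m in so if isinstance(m, Head)]


def label(so):
    """Return a head name, a shared feature name, or None if ambiguous."""
    found = heads_in(so)
    if len(found) == 1:  # {H, XP}: the single head is found first
        return found[0].name
    if not found:  # {XP, YP}: compare the head of each member
        inner = [heads_in(m) for m in so]
        if all(len(h) == 1 for h in inner):
            shared = frozenset.intersection(*(h[0].features for h in inner))
            if len(shared) == 1:
                return next(iter(shared))
    return None


# Assumed feature inventories, simplified from the text (vPhi/uPhi -> "Phi").
N = Head("N", frozenset({"Phi", "Case"}))
T = Head("T", frozenset({"Phi", "Tense"}))
v = Head("v*")

NP = merge(N, "α")
vP = merge(v, "β")
so1 = merge(NP, vP)           # {{N, α}, {v*, β}}: no shared feature
so2 = merge(NP, merge(T, so1))  # {{N, α}, {T, SO1}}: Phi is shared

print(label(vP), label(so1), label(so2))
```

In this sketch `so1` receives no label because N and v* share nothing, which corresponds to the ambiguity that raising {N, α} is meant to resolve; `so2` is labeled by the shared feature Phi.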

"Multiple-Specifier" Constructions
We demonstrated above how the labeling theory works in "no specifier" and "single-specifier" configurations: {H, XP} and {XP, YP}. Let us now ask how labeling theory deals with "multiple-specifier" configurations: SO = {XP, {YP, {ZP, WP}}}. Consider (1):

Take a concrete case. Japanese exhibits multiple nominative subjects (Kuno 1973), as in (2), schematically represented in (3) (for expository purposes, indices are assigned and the terminal elements are placed in the head-initial order). Each NP is assigned (or valued) nominative case (NOM) by finite T (Saito 2016):

(2) Bunmeikoku-ga dansei-ga heikin-zyumyoo-ga mizikai
    civilized.country-NOM male-NOM average-life.span-NOM short-Pres
    'It is in civilized countries that men's average life span is short.'

What counts as the first head(s) in (3)? Should it be N1? That would incorrectly label (3) as nominal and fail to capture the fact that NOM appears on all three NPs. Descriptive adequacy requires that, in (3), MS find the three heads N1, N2, N3, and the finite T (and only those four heads), so that the valuation of NOM by finite T on each NP takes place and the uCase-vTense pair counts as the label of SO.⁴

The desired result is thus that MS finds all and only those four heads for computation (specifically, valuation and labeling), but at the same time, we have to explain why "multiple-specifier" configurations, such as Japanese multiple subjects (2), are not available in languages such as English. In effect, we are confronted with the problems of undergeneration and overgeneration, and in what follows, we demonstrate how these problems are overcome through the interaction of Merge and MS.

Minimal Search Defined and Unified
Coyote Papers, Volume 22, 2019

Let us ask again how MS finds its target(s) and makes it (them) accessible for computation (specifically, valuation and labeling). We would like to propose that MS finds a target in the optimal way via the shortest possible path. What counts as "shortest"? Take an H-XP configuration (with no "specifier"), where indices are assigned to each SO for expository purposes. Consider (4):

Here we adopt the idea of Chomsky (1995): a shorter path is selected over a longer one (see Ke 2019 for a detailed formal analysis of the nature of MS). As for what counts as "shorter," we propose that the path of α is the set of all SOs of which α is a term. Note that the non-reflexive definition of term is adopted here.⁵ Then, the path of α is shorter than the path of β iff the path of α is a proper subset of that of β (cf. Pesetsky 1982, May 1985). Given this shortest set-theoretic search, in (4), the path of H is {SO1}, and the path of X is {SO1, SO2}. Since the former is a proper subset of the latter, it is shorter, and MS selects H over X; hence, only H counts as an accessible head for labeling.⁶

5 We assume: X is a term of Y iff X ∈ Y or X ∈ Z, Z a term of Y. See Chomsky 2019 UCLA lectures for further comment.
6 We assume that every term in the workspace is visible; hence, Search sees them, but among them, MS determines which terms count as accessible ones.

What about an XP-YP configuration (with a "single specifier")? Consider (5):

It follows from the path-theoretic account that MS selects both X and Y because neither path is a proper subset of the other.

Interestingly, MS, defined as the shortest set-theoretic search, yields the right result for "multiple-specifier" configurations as well. Consider (6):

Recall that descriptive adequacy requires that MS find the three nominal heads N1, N2, N3, and the single finite T (and only these four heads), so that the valuation of NOM by finite T on each NP will take place (Saito 2016). Under the path-theoretic account, this follows: the path of each of these four heads includes the SO that immediately contains it, which is on none of the other three paths, so no path is a proper subset of another.

The proposed analysis also explains why lower copies, left by movement, are not available for computation. Recall the "subject-raising" case, where the movement of {N, α} from Spec-v* to Spec-T renders the lower copy of {N, α} invisible; hence, v* is unambiguously identifiable, repeated in (7):

As demonstrated above, MS gives unified and predictively correct results. That is, anything that is found in the shortest possible way counts as an accessible term for computation, where the notion "shorter" is understood in terms of "proper subset" (not a uniquely linguistic concept, hence arguably third factor). There is no need to assume any special device (e.g. Chomsky's (2013) algorithm concerning (in)visibility) to get these results: what counts as an accessible term naturally follows from MS defined as the shortest set-theoretic search.
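The path-based definition of "shorter" can be checked mechanically. The following is a minimal sketch, not the authors' formalization: the class `SO` and the functions `occurrences`, `accessible`, and `phrase` are hypothetical names, SOs are identified by their expository indices, phrase-internal structure is left empty where the text does not specify it, and the structures for (5), (6), and (7) are reconstructions (the one for (7) is inferred from the paths the paper reports for the two copies of N).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SO:
    name: str            # expository index, e.g. "SO1"
    members: tuple = ()  # SOs or head names; () = structure left unspecified


def occurrences(root, above=frozenset()):
    """Yield (head, path) for each head occurrence under root.

    The path is the set of names of all SOs of which the occurrence is a term.
    """
    here = above | {root.name}
    for m in root.members:
        if isinstance(m, SO):
            yield from occurrences(m, here)
        else:
            yield m, here


def accessible(root):
    """Heads having an occurrence whose path properly contains no other path."""
    occ = list(occurrences(root))
    return sorted({h for h, p in occ if not any(q < p for _, q in occ)})


def phrase(head, comp):
    """{head, comp}, with comp an opaque phrase."""
    return SO(head + "P", (head, SO(comp)))


# (4) H-XP: path of H = {SO1}, path of X = {SO1, SO2}
ex4 = SO("SO1", ("H", SO("SO2", ("X", SO("YP")))))
# (5) XP-YP
ex5 = SO("SO1", (phrase("X", "α"), phrase("Y", "β")))
# (6) three "specifiers" above {T, δ}
ex6 = SO("SO1", (phrase("N1", "α"), SO("SO2", (phrase("N2", "β"),
         SO("SO3", (phrase("N3", "γ"), phrase("T", "δ")))))))
# (7) subject raising: SO2 = {N, α} occurs twice
so2 = SO("SO2", ("N", SO("α")))
ex7 = SO("SO1", (so2, SO("SO3", ("T", SO("SO4", (so2,
         SO("SO5", ("v*", SO("β")))))))))

higher, lower = sorted((p for h, p in occurrences(ex7) if h == "N"), key=len)

print(accessible(ex4), accessible(ex5), accessible(ex6))
print(higher < lower)  # the higher copy's path is a proper subset
```

Because Python's `<` on frozensets is the proper-subset test, "shorter" in the paper's sense translates directly; no notion of c-command or depth counting is used in the sketch.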

Revisiting "Single-" vs. "Multiple-Specifier" Configurations
The remaining task is to explain why "multiple-specifier" configurations exist in some languages, but not in others; for example, multiple subjects appear in Japanese, but not in English. What determines the presence (or absence) of "multiple-specifier" configurations?
What separates Japanese from English? We would like to propose that "multiple-specifier" configurations appear iff MS finds one and only one valuing head per agreement relation; that is, for each unvalued feature (the uF-valuee), there is one and only one valued feature (the vF-valuer). Why? Because every SO, including "multiple-specifier" configurations, must yield a unique identification for labeling, and multiple valuers would violate this uniqueness requirement. With this requirement in mind, let us examine "single-" and "multiple-specifier" configurations again. First consider the "single-specifier" configuration (8), where N bears vPhi and uCase, while T bears uPhi and vTense:

7 Note further that in (7), elements contained within the lower copy of SO2 (i.e. N and α) are also invisible. The path of N or α within the higher copy of SO2 is {SO2, SO1}, while the path of N or α in the lower copy is {SO2, SO4, SO3, SO1}; the former path is properly included in the latter and hence is shorter. In contrast, note that any analysis relying on c-command will not yield this result; the entire higher copy of object SO2 does c-command the lower copy, but elements within this higher copy do not. If a lower copy is invisible only if it is c-commanded by a higher copy, then the lower copies of N and α would be visible.

In (8), MS finds both N and T, and vPhi on N values uPhi on T, and as a reflex, vTense on T values uCase on N (Chomsky 2013, 2015).⁸ Notice that for each uF-valuee (uPhi, uCase), there is one and only one vF-valuer (vPhi, vTense); hence, the two pairs, uPhi-vPhi and uCase-vTense, constitute a unique label.

Now turn to the "multiple-specifier" configuration (9), where each N bears vPhi and uCase, while T bears vTense but not uPhi:

MS finds the three nominal heads N1, N2, N3, each bearing uCase, and the finite T bearing vTense. Saito (2016) argues that vTense values uCase on N1, N2, and N3. Notice that for each uF-valuee (uCase), there is one and only one vF-valuer (vTense); hence, the uCase-vTense pair constitutes a unique label.

But suppose finite T had uPhi in the "multiple-specifier" configuration. What would happen? Consider (10):

In (10), the three nominal heads N1, N2, N3, bearing vPhi, each participate in valuing uPhi on finite T; hence, there would be no single phi-valuer, because the three nominal heads bear distinct phi-sets (even if they accidentally bear the same values) and each participates in phi-valuation, thereby failing to yield a unique phi-label for SO1.
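The uniqueness requirement on valuers can be stated as a small check over the heads that MS finds. This is a hedged sketch under simplifying assumptions: the table `VALUER_OF` and the function `label_pairs` are hypothetical names, each head is just a set of feature names, and "one and only one valuer" is read as "exactly one head bearing the relevant valued feature."

```python
# Which valued feature values which unvalued one, as described in the text.
VALUER_OF = {"uPhi": "vPhi", "uCase": "vTense"}


def label_pairs(heads):
    """Return the uF-vF pairs if every valuee has exactly one valuer head.

    `heads` maps a head name to its set of feature names.
    Returns None when some valuee has zero or several valuers.
    """
    pairs = set()
    for feats in heads.values():
        for uf in feats:
            if uf not in VALUER_OF:
                continue  # valued features need no valuer
            vf = VALUER_OF[uf]
            valuers = [h for h, f in heads.items() if vf in f]
            if len(valuers) != 1:
                return None
            pairs.add(f"{uf}-{vf}")
    return pairs


nominal = {"vPhi", "uCase"}
config8 = {"N": nominal, "T": {"uPhi", "vTense"}}
config9 = {"N1": nominal, "N2": nominal, "N3": nominal, "T": {"vTense"}}
config10 = {"N1": nominal, "N2": nominal, "N3": nominal,
            "T": {"uPhi", "vTense"}}

print(label_pairs(config8))   # both pairs
print(label_pairs(config9))   # uCase-vTense only
print(label_pairs(config10))  # None: three phi-valuers for one uPhi
```

The sketch treats the three nominal heads in (10) as distinct valuers regardless of their phi-values, in line with the remark that they bear distinct phi-sets even when the values happen to coincide.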

Discussion
So, what separates Japanese from English with respect to the licensing of "multiple-specifier" configurations? We would like to end this paper with the following three possibilities.

The first possibility is that unlike English, Japanese has no uPhi (see (9) and (10)). That is, the presence of uPhi on English T blocks "multiple-specifier" configurations. If there were two or more distinct vF-valuers for one uF-valuee, then a labeling failure would result (for earlier proposals, see Fukui 1986, Kuroda 1988; for recent discussion, see Saito 2016, Sorida 2014).

The second possibility is that Japanese T bears uPhi (as in (10)), but unlike English uPhi, Japanese uPhi has no morpho-phonological realization. That is, if there were two or more distinct vF-valuers for one uF-valuee, then there would be no way to realize such multiple phi-sets on the single head. But unlike English, Japanese has no such morpho-phonological realization of uPhi; hence this externalization problem can be circumvented (Kitada personal communication).⁹

The third possibility is that UG has uF, and in English, uF is realized as uCase on N and as uPhi on T, whereas in Japanese, uF is realized as uCase on N but remains as uF on T.¹⁰ Consider (11), where N bears uCase and vPhi, and T bears vTense and uF:

In English, uF is realized as uPhi, which matches Phi and gets Phi-valued. By contrast, in Japanese, uF is realized as bare uF with no specific property; it matches any F but gets no value. Suppose valuation takes place only if a valuer has some unvalued feature (i.e. the activity condition, Chomsky 2000). Then English finite T values uCase on N only as long as uPhi on T remains unvalued, that is, only once, whereas Japanese finite T values uCase on N as long as uF on T remains unvalued, that is, continuously. In other words, English uPhi is realized morpho-phonologically (sometimes vacuously), while Japanese uF is never realized morpho-phonologically (because there is no value).

9 Though interesting, it is unclear on this approach how a unique label is determined.
10 The notion of uF is somewhat similar to that of "edge feature" (Chomsky 2000, 2001) in that in Japanese uF on T matches with any feature, but it has no specific properties, so it cannot be valued; hence it remains unvalued throughout a derivation. In short, uF alone requires just matching, while uF plus a property (e.g. uCase or uPhi) requires matching and valuation for convergence.
Also notice, the third possibility is arguably consistent with the uniformity hypothesis: "In the absence of compelling evidence to the contrary, assume languages to be uniform, with variety restricted to easily detectable properties of utterances" (Chomsky 2001).
Under this hypothesis, UG expects that finite T uniformly values Case iff it bears unvalued features, while linguistic variation comes down to the problem of externalization. So, in English, uF is realized as uPhi, and uPhi gets Phi-valued, so Case-valuation happens only once. In Japanese, however, uF matches but remains unvalued, so Case-valuation may happen continuously, thereby allowing "single-" as well as "multiple-specifier" configurations to appear. Now, if English changes from uPhi (back) to uF, then it becomes like Japanese; if Japanese changes from uF (on)to uPhi, then it becomes like English. One might argue that such language change is hard to imagine under the assumption that English has uPhi (or uF) but Japanese has no counterpart of it.
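The contrast between valuing Case "only once" and "continuously" can be made concrete with a toy simulation of the activity condition. This is only an illustrative sketch of the third possibility: the function `value_case` and its string arguments are hypothetical, the nominals are visited in an arbitrary fixed order, and matching is assumed always to succeed.

```python
def value_case(t_feature, nominals):
    """Return the nominals whose uCase is valued by finite T.

    t_feature is "uPhi" (English-type T) or "uF" (Japanese-type T).
    T can value Case only while it still bears an unvalued feature.
    """
    t_active = True
    valued = []
    for n in nominals:
        if not t_active:
            break
        valued.append(n)  # vTense on T values uCase on n
        if t_feature == "uPhi":
            t_active = False  # uPhi is valued by vPhi on n; T is now inactive
        # bare uF matches but is never valued, so T stays active
    return valued


subjects = ["N1", "N2", "N3"]
print(value_case("uPhi", subjects))  # one nominal receives Case
print(value_case("uF", subjects))    # every nominal receives Case
```

On this toy model, switching T's feature from "uPhi" to "uF" is the only change needed to move from the English pattern to the Japanese one, which mirrors the language-change scenario described above.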
Finally, let's ask what would happen if there were no uF. Chomsky (2013, 2015) suggests that uF marks phases, so without uF, there would be no way to mark them. In addition, we would like to suggest that, without uF, there would be neither "single-" nor "multiple-specifier" configurations, and there would be no phrasal movement. It is uF that makes available exocentric structures or XP-YP configurations including "multiple-specifier" configurations.¹¹