Wednesday, March 7, 2012

On the conceptual framework for voice phenomena *.

Abstract

This article attempts to lay the conceptual foundations of voice phenomena, ranging from the familiar active/passive contrast to the ergative/antipassive opposition, as well as voice functions of split case-marking in both transitive and intransitive constructions. We advance the claim that major voice phenomena have conceptual bases rooted in the human cognition of actions, which have evolutionary properties pertaining to their origin, development, and termination. The notion of transitivity is integral to the study of voice as evident from the fact that the so-called transitivity parameters identified by Hopper and Thompson (1980) and others are in the main concerned with these evolutionary properties of an action, and also from the fact that the phenomena dealt with in these studies are mostly voice phenomena. A number of claims made in past studies of voice and in some widely-received definitions of voice are shown to be false. In particular, voice oppositions are typically based on conceptual--as opposed to pragmatic--meanings, may not alter argument alignment patterns, may not change verbal valency, and may not even trigger verbal marking. There are also voice oppositions more basic and widespread than the active/passive system, upon which popular definitions of voice are typically based.

1. Introduction

Current studies on voice phenomena suffer from a number of inadequacies at several levels of description and explanation. At the most fundamental level, there is no coherent conceptual framework that adequately addresses the matter, such that we are often left to wonder whether or not a given phenomenon falls in the domain of voice. For one thing, people differ in the treatment of causative and reflexive constructions; some consider them to represent voice categories, while others do not. Still others avoid raising the issue at all. Various definitions currently offered are of little use, as they are typically based on an Indo-European active/passive opposition, and arbitrarily include or exclude a particular phenomenon from the domain of voice. (1)

Properly identifying construction types representing a voice sub-domain is also a serious problem. In Crystal's (2003) definition (cf. Note 1) reflexives are not recognized as proper voice constructions and their relationship to the middle voice is not entirely clear. A similar problem is seen in Kemmer's (1993) extensive study of middle voice constructions.

There are also severe limitations at the level of explanation. Closer to the main theme of this volume is the problem of understanding the increases and decreases in valency and accompanying changes in argument structure observed in voice phenomena. Why do certain phenomena (e.g. the causative and applicative) show an increase in valency, while others (e.g. the passive and antipassive) typically have a valency-reducing effect? What motivates these valency changes in opposite directions?

Functional explanations regarding the distribution of certain voice constructions go a long way toward an explanatory functional study of grammatical phenomena (cf. Haiman 1985). However, being largely based on formal properties such as "linguistic distance" and "full" vs. "reduced form," these explanations are not functional enough to be able to make more general predictions. (2)

The problems outlined above largely stem from two related methodological issues. One is the lack of a coherent conceptual framework for characterizing and analyzing voice phenomena; the other is an over-reliance on formal properties in both analysis and explanation. Clearly the latter problem is caused by the former and by the lack of commitment to the cognition-to-form approach in linguistic analysis. (3) The purpose of this article is thus to lay out a conceptual framework that coherently delineates the domain of voice, which embraces both those phenomena that are traditionally recognized as falling in the voice domain and those that have been kept in limbo. The framework required must deal with the fact that many voice phenomena straddle the semantics-pragmatics boundary, although the active/middle opposition is basically conceptual or semantic, and the active/passive opposition is largely pragmatic. We endeavor to unify these manifestations of voice function by assuming that the pragmatic relevance of clausal units is semantically determined in the first place.

The conceptual foundations of voice can only be arrived at by inspecting contrasting phenomena across languages. Our initial task is therefore to learn how a given language, using its own resources, achieves the goal of expressing a relevant conceptual opposition found in another language. While the ultimate goal of functional typology is to discover the correlative patterns between form and function, this article is concerned primarily with the initial task of postulating conceptual bases of voice phenomena and identifying constructions across languages that express the relevant oppositions.

One final introductory remark is due regarding the controversy over the question of whether the formal relationships between opposing voice categories should be treated as inflectional or derivational. We consider this question to be academic in the absence of rigorous definitions for these processes. In the realm of voice phenomena, some systems, for example, the Ancient Greek active/middle system, incorporate voice morphology in their inflectional paradigm. Others like the English active/passive opposition do not show a simple morphological relationship--inflectional or derivational--since constructions as a whole enter into the formal opposition. The regularity or productivity of the pattern is often taken to be an important criterion distinguishing inflections from derivations; the former are thought to be regular and obligatory, while the latter allow exceptions. But regularity in natural language is always relative, and so are the patterns of voice oppositions. Even among the known ones, nothing is one hundred percent regular. An alternation that is well-integrated within the inflectional paradigm may show irregularity. In Ancient Greek, for example, we find both active forms that do not have middle counterparts (activa tantum) and middle forms lacking the corresponding active (media tantum). In some languages the active/passive opposition also shows a high degree of regularity, without ever being one hundred percent regular (as in the case of English), while others place much severer limitations on the range of permissible passive constructions.

2. The evolution of an action: voice, transitivity, and aspect

The basic claim of this article is that major voice phenomena have their conceptual bases rooted in the human cognition of actions. Because such actions have various effects upon us, we have special interest in the way that they arise, how they develop, and the manner in which they terminate--what is referred to as the evolutionary properties or phases of an action in this article. Through a system of grammatical oppositions, a language provides a means for expressing conceptual contrasts pertaining to the evolutionary properties of an action that the speaker finds relevant for communicative purposes. Among the evolutionary properties, voice is primarily concerned with the way event participants are involved in actions, and with the communicative value, or discourse relevance pertaining to the event participants from the nature of this involvement.

Mention of the evolution of an action immediately brings to mind two other grammatical concepts, namely, transitivity and aspect. It is thus appropriate to clarify the relationships and differences between these notions. Traditionally, voice has been defined in reference to transitivity, or more narrowly in terms of the transitivity of a verb or clause; the active/passive opposition most typically obtains with transitive verbs. A more important connection between transitivity and voice, however, lies in the notion of semantic transitivity, rather than strictly verbal or clausal transitivity. Indeed, it is easy to see this connection, as in the work of Hopper and Thompson (1980), where many of the phenomena discussed in terms of transitivity are nothing but voice phenomena. This important article concludes the section on grammatical transitivity as follows: "It is tempting to find a superordinate semantic notion which will include all the Transitivity components. If there is one, it has so far not been discovered ..." (Hopper and Thompson 1980: 279). Our claim is that what they are looking for is a theory of voice. In fact, the work of Hopper and Thompson lays important groundwork for the study of voice. In this regard, Kemmer (1993: 247) is absolutely correct in noting that "the scale of transitivity ... forms the conceptual underpinning for voice systems in general, and for reflexive and middle marking systems in particular." (4) While none of these works makes it quite clear, voice is a system of correspondences between action or event types and syntactic structures. For example, what is known as the active voice is the pattern of correspondence between the high transitive event type or the prototypical transitive action and the nominative-accusative coding pattern of the event participants, as in the English active sentence She killed him (see Section 5 below).

The parameters of transitivity identified by Hopper and Thompson (1980) pertain to "different facet[s]" of "carrying-over or transferring an action from one participant to another" (Hopper and Thompson 1980: 253), and they in effect represent the evolutionary properties of an action, that is, they pertain to the way an action is brought about, to the way it is transferred to the second participant, and to the way it affects this participant. In order to bring grammar closer to cognition, we propose to examine specific evolutionary properties of an action pertaining to voice oppositions that are distilled as transitivity parameters in Hopper and Thompson (1980) and others dealing with the issues of transitivity.

If transitivity is integral to a theory of voice, how then do aspect and voice differ under the assumption that both are concerned with the way an action evolves? These two grammatical categories invite different kinds of questions. Aspect asks where the vantage point is with regard to the temporal structure of an action. When the action is viewed holistically encompassing all of its temporal phases, we obtain the perfective viewpoint of the described action. On the other hand, if specific sections of internal temporal structure are focused, we obtain various types of imperfective aspectual construal of an event. The contrast between the perfective and the imperfective aspects and the representative subcategories of the latter seen across languages are represented in Figure 1.

[FIGURE 1 OMITTED]

Voice, on the other hand, asks how an action evolves--that is, it asks about the nature of its origin, the manner in which it develops, and the way that it terminates. These evolutionary phases of an action and the various voice categories pertaining to them are depicted schematically in Figure 2.

[FIGURE 2 OMITTED]

3. Major voice oppositions and their conceptual bases

Under the present conception, the three principal evolutionary phases of an action--origin, development, and termination--form the basis for the major voice parameters. These parameters are generally expressible in the form of questions concerning the evolutionary properties of an action, as below:

Major voice parameters:

I. The origin of an action

(a) How is the action brought about?

(b) Where does the action originate?

(c) What is the nature of the agent?

II. The development of an action

How does the action develop?

(a) Does the action extend beyond the agent's personal sphere or is it confined to it?

(b) Does the action achieve the intended effect in a distinct patient, or does it fail to do so?

III. The termination of an action

Does the action develop further than its normal course, extend beyond the immediate participants of the event, and terminate in an additional entity?

Figure 2 summarizes the voice constructions pertaining to these parameters. Throughout the following discussion, we touch upon the theoretical consequences of this diagrammatic representation of the voice domain.

3.1. Parameters pertaining to the origin of an action

The first opposition to be examined has to do with the nature of the origin of an action--namely, whether the action in question is brought about volitionally or nonvolitionally by a human agent.

Volitional/spontaneous opposition:

Is the action brought about volitionally?

Yes → volitional

No → spontaneous

While not widely recognized as a voice opposition, this distinction has been recognized as such in the Japanese grammatical tradition, perhaps because the suffix for the spontaneous voice is identical with that used in the passive construction. In fact, it is generally believed that the Japanese passive arose from the spontaneous construction. Languages (or grammarians' interpretations of the facts?) may differ with regard to the precise meaning contrast seen in the volitional/spontaneous opposition. In Japanese, the spontaneous construction expresses a situation where the agent does not intend to bring about an action, but where there is a circumstantial factor external to the agent that induces an action (such as eating "dancing-mushrooms" as in [1b] below). In other languages, a spontaneous form conveys the meaning of an action accidentally brought about. Other manifestations of the opposition may be alternatively expressed in terms of such notions as intentional/unintentional or controlled/uncontrolled, but we shall take the position that these contrasts are included in the basic function of the volitional/spontaneous opposition. That is, by "volitional voice" we mean a connection between a particular syntactic form and a type of action that is brought about by the willful involvement of an agent who "intends the action," and sees to it that the intended effect is achieved. Departure from this action type in any significant way may be construed as constituting a spontaneous action, expressed by a construction formally contrasting with the volitional construction.

In Modern Japanese, the domain of the volitional/spontaneous opposition has shrunk to such an extent that mental activities are the only ones where the contrast is readily observed, with the spontaneous morphology (-re/-rare) having generally given way to a passive interpretation in the domain of physical actions. In Classical Japanese (ninth-twelfth centuries), the volitional/spontaneous opposition was more widely observed, as in the following examples:

Classical Japanese

(1) a. Kikori-domo mo mai-keri. (volitional)

wood cutter-PL also dance-PAST

'Wood cutters also danced.'

b. Kikori-domo mo mawa-re-keri. (spontaneous)

wood cutter-PL also dance-SPON-PAST

'Wood cutters also danced willy-nilly.'

Spontaneous expressions in Japanese typically do not contain an agent in subject position. Because information regarding the volitional status of an agent is most readily accessible to the speaker, the volitional/spontaneous distinction is typically made with reference to a first person agent; accordingly, the missing agent is understood to be the speaker unless otherwise specified. This non-coding of an agent in subject position paved the way for a spontaneous expression where a patient nominal is coded in subject position, as in the following spontaneous construction (2b). Undoubtedly, this was an important step in the development of the passive from the spontaneous construction.

Modern Japanese

(2) a. Boku-wa yoku mukasi-no-koto-o

I-TOP often old days-GEN-things-ACC

omo-u. (volitional)

think-PRES

'I often think about the things of the old days.'

b. Saikin mukasi-no-koto-ga yoku

recently old days-GEN-things-NOM often

omowa-re-ru. (spontaneous)

think-SPON-PRES

'Recently the things of the old days often come to mind.'

Since the volitional/spontaneous opposition is not widely recognized as a voice phenomenon, it is perhaps worth spending some time showing how widespread in the world's languages it actually is. As in other voice sub-domains, languages make use of different resources in expressing the volitional/spontaneous opposition. Indonesian and Malay use the multifunctional prefix ter- to express unintended or accidental actions:

Indonesian

(3) a. Ali memukul anak-nya. (volitional)

Ali AF.hit child-3SG.POSS

'Ali hit his child.'

b. Ali ter-pukul oleh anak-nya. (spontaneous)

Ali SPON-hit PREP child-3SG.POSS

'Ali accidentally hit his child.'

(I Wayan Arka pers. comm.)

According to Winstedt (1927: 86-87), the function of ter- in Malay is characterized as denoting an action due "not to conscious activity on the part of the subject, but to external compulsion or accident." It is noteworthy that spontaneous constructions in both Japanese and Indonesian/Malay have an affinity with the passive in that they share the same affix in these languages. Compare the spontaneous constructions above with the passives in Japanese and Indonesian below:

(4) a. Taroo-wa Ziroo-ni nagura-re-ta. (Japanese passive)

Taro-TOP Jiro-by hit-PASS-PAST

'Taro was hit by Jiro.'

b. rumah itu tidak ter-beli oleh

house that NEG PASS-buy PREP

saya. (Indonesian passive)

1.SG

'The house cannot be bought by me.'

The diagrammatic representation of voice constructions in Figure 2 can be thought of as a semantic map, where different constructions are distributed over relevant territory within the voice domain. This is a useful way of representing conceptual affinities among various voice constructions, but its utility is predicated only on a comprehensive view of voice as advocated in this article. Spontaneous and passive are both concerned with the origin of an action. What they share is the idea that this lies NOT in the pragmatically most relevant participant; in the case of the passive, it is the agent of low discourse relevance and in the spontaneous case, it is the external circumstance.

The map in Figure 2 also shows the "neighboring" relationship between the spontaneous, the middle, and the antipassive. In Russian and a number of Australian languages, middle forms are recruited for the volitional/spontaneous contrast, as in the following examples:

Russian

(5) a. Kostja poreza-l xleb.

Kostja cut.PERF-PAST.SG.MASC bread

'Kostja cut the bread.'

b. Kostja poreza-l-sja.

Kostja cut.PERF-PAST.SG.MASC-SPON

'Kostja has [accidentally] cut himself.'

(Vera Podlesskaya pers. comm.)

Diyari

(6) a. natu yinana danka-na wara-yi.

1SG.ERG 2SG.O find-PARTC AUX-PRES

'I found you (after deliberately searching).'

b. nani danka-tadi-na wara-yi yinkangu.

1SG.ABS find-SPON-PARTC AUX-PRES 2SG.LOC

'I found you (accidentally).'

(Austin 1981: 154)

Another favorite source for the spontaneous construction--especially prominent among Indo-Aryan and Dravidian languages of India--is the so-called dative-subject construction, which typically expresses uncontrollable states:

Sinhala

(7) a. mame ee wacene kiwwa.

I.NOM that word say.PAST

'I said that word.'

b. mate ee wacene kiyewuna.

I.DAT that word say.P.PAST

'I blurted that word out.'

(Gair 1990: 17)

The adaptation of the dative-subject construction for a spontaneous action is also seen when the "dative-subject" is marked by cases different from the dative as in the following Bengali examples, where the nominal form corresponding to the dative subject is marked with genitive. Here the volitional/spontaneous contrast takes on interesting nuances:

Bengali

(8) a. Ami toma-ke khub pɔchondo kor-i.

1SG.NOM 2ORDSG-OBJ very liking do-PRES.1

'I like you very much.' (According to my own criteria.)

b. Ama-r toma-ke khub pɔchondo

1SG-GEN 2ORDSG-OBJ very liking

hɔy.

become-PRES.3ORD

'I like you very much.' (According to some [socially] set criteria.)

(Onishi 2001: 120)

When the basic meaning of the verb denotes a spontaneous (involuntary) action, the volitional voice form can be obtained by using a self-benefactive construction, as in Marathi and other Indo-Aryan languages:

Marathi

(9) a. sitaa raD-l-i.

Sita.NOM cry-PERF-F

'Sita cried.'

b. sitaa-ne raD-un ghet-l-a.

Sita-ERG cry-CONJ take-PERF-N

'Sita cried (so as to relieve herself).'

(Prashant Pardeshi pers. comm.)

Lhasa Tibetan has a set of auxiliaries expressing different categories of perspective. "Perspective-choice" interacts with both person and evidential categories in a complex way, but the relevant auxiliaries can be divided into a "self-centered" and an "other-centered" group (Denwood 1999). Verbs denoting such intentional actions as reading and dancing normally occur with self-centered auxiliaries when used with first person subjects. They can be made nonintentional or spontaneous with the use of other-centered auxiliaries, as in the following examples:

Tibetan

(10) a. ngas. yi.ge, klog.ba yin.

I-SMP letter read-LINK-AUX (self-centered)

'I read the letter (on purpose).'

b. ngas. yi.ge, klog.song.

I-SMP letter read-AUX (other-centered)

'I read the letter (without meaning to).'

(Denwood 1999: 137)

Conversely, although unintentional verbs expressing involuntary actions such as coughing and seeing normally occur with other-centered auxiliaries, they can be rendered volitional by the use of self-centered auxiliaries:

Tibetan

(11) a. glo. rgyab.byung.

cough-AUX (other-centered)

'I coughed (involuntarily).'

b. glo. rgyab.pa.yin.

cough-LINK-AUX (self-centered)

'I coughed (deliberately).'

(Denwood 1999: 139)

A similar pattern is observed in Newar (Tibeto-Burman), where the relevant contrast is expressed in terms of a distinction between conjunct and disjunct verbal endings--apparently an evidentiality-related phenomenon. Note that only clauses with first person subjects allow this contrast to be expressed.

Newar

(12) a. ji-n kayo tachya-na

1SG-ERG cup break-PC

'I broke the cup (deliberately).'

b. ji-n kayo tachya-ta

1SG-ERG cup break-PD

'I broke the cup (accidentally).'

(Kansakar 1999: 428)

Finally, the phenomenon now widely recognized under the name of "split intransitivity" is rooted in the volitional/spontaneous opposition. Observe first some well-known examples from Eastern Pomo below:

Eastern Pomo

(13) a. ha: c'e:xelka. (volitional)

1SG.A slip

'I am sliding.'

b. wi c'e:xelka. (spontaneous)

1SG.P slip

'I am slipping.'

(McLendon 1978: 1-3)

Although the verb forms are the same, when the pronominal form is inflected for the patient (13b), the sentence conveys a spontaneous action or a "lack of protagonist control" (McLendon 1978: 4). A similar contrast is seen in the Caucasian language Tsova-Tush (Batsbi), where "[the] referent of [an ergative] subject is a voluntary, conscious, controlling participant in the situation named by the verb" (Holisky 1987: 113).

Tsova-Tush (Batsbi)

(14) a. (as) vuiz-n-as.

1SG.ERG fall-AOR-1SG.ERG

'I fell down, on purpose.'

b. (so) voz-en-sO.

1SG.NOM fall-AOR-1SG.NOM

'I fell down, by accident.'

(Holisky 1987: 104)

In addition to these cases of "fluid-S" marking (Dixon 1994), split intransitivity may be realized as a lexically-conditioned phenomenon, where intransitive verbs are classified into an "agentive" class and a "patientive" class. Agentive and patientive nominals respectively trigger marking similar to the corresponding arguments of a transitive clause. The Philippine language Cebuano shows this pattern through a focus system which is characteristic of Formosan and Western Austronesian languages:

Cebuano

(15) Transitive actor-focus construction

Ni-basa ako ug libro.

AF-read I.TOP INDEF book

'I read a book.'

(16) Transitive patient-focus construction

Gi-basa nako ang libro.

PF-read I TOP book

'I read the book.'

(17) a. Agentive intransitive

Ni-dagan ako. (actor-focus form)

AF-run I.TOP

'I ran.'

b. Patientive intransitive

Gi-kapoy ako. (patient-focus form)

PF-tired I.TOP

'I got tired/I am tired.'

Generalizing processes have the effect of obliterating the basic semantic motivation for distinguishing two classes of intransitive verbs; either the larger agentive or larger patientive class of intransitive verbs tends to have semantically heterogeneous verbs. Nevertheless, the split of intransitive verbs into two classes is rooted in the distinction between volitional and involuntary actions involving an animate protagonist. This is seen in a minority class of verbs, such that a minority agentive class contains verbs denoting controlled actions, and a minority patientive class includes verbs denoting involuntary states of affairs (see Merlan 1985). In Cebuano (and perhaps other Philippine languages as well) the larger agentive class includes verbs denoting uncontrolled events such as raining or slipping off, while the minority patientive class contains verbs that express strictly involuntary states of affairs such as being hungry, becoming tired, or contracting diseases.

The patterns of split intransitivity discussed here underscore an important point that we wish to advance in this article: voice can also be expressed by nominal forms. Traditionally, voice has been regarded as a verbal category. Indeed, many linguists take verbal marking or verbal inflection as the defining feature of voice. (5) We reject this restrictive view. As we define it, voice is concerned with the evolutionary properties of an action. It is typically marked on the verb because a verb expresses an action. Verbal voice marking is therefore simply a case of iconicity. An action, however, also involves participants such as agent and patient. Because an action occurs in relation to these protagonist participants, any form representing them could also bear voice marking. The volitional/spontaneous opposition manifested in nominal forms also reflects the underlying relationship between the origin of an action and the volitional status of the agent. (6) Nominal marking for certain voice contrasts is thus also motivated by the iconicity principle.

Let us now turn to the causative/noncausative opposition. As noted in the introduction, the causative has been problematic with respect to its status as a voice category. Widely-received definitions of voice, such as Crystal's in Note 1, maintain that voice oppositions do not entail a semantic contrast, which has prevented many grammarians from readily accepting causative/noncausative as one. As the above discussion on the volitional/spontaneous opposition shows, however, there is no reason to believe that voice is a semantically neutral phenomenon. As it happens, one of the oldest systems of voice contrast in Indo-European--the active/middle opposition--also involves a meaning contrast (see below). (7) The question concerning the causative/noncausative opposition (and other semantic oppositions) is whether the relevant contrasts can be naturally integrated into a coherent conceptual framework of voice. Our answer will be yes.

The causative/noncausative opposition pertains to the origin of an action; that is, whether the action originates with the agent of the main action or with another agent heading the action chain. The causative action chain is represented in Figure 3. (8)

[FIGURE 3 OMITTED]

In a noncausative situation, the initial agent (Agent₂) is also the agent of the main action. In a causative situation, the ultimate origin of the main action lies in the agent (Agent₁) heading the action chain, which is different from the agent (Agent₂) of the main action. The relevant parameter for the causative/noncausative distinction can be formulated as below:

Causative/noncausative opposition:

Does the action originate with an agent heading the action chain that is distinct from the agent or patient of the main action?

Yes → causative

No → noncausative

The contrast between a noncausative situation represented by an expression such as Bill walked and its causative counterpart expressed by a periphrastic causative form like John made Bill walk can thus be naturally captured in terms of the nature of the origin of an action. Situations expressed by lexical causatives such as John killed Bill have an (initial) agent distinct from the patient of the main action.
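As a rough illustration only, the yes/no question behind the causative/noncausative opposition can be written as a small Python sketch. The `ActionChain` class, its field names, and the `voice_by_origin` function are hypothetical labels introduced here and are not part of the framework's own notation. The sketch assumes that each situation can be reduced to two pieces of information: who heads the action chain, and who is the agent or patient of the main action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionChain:
    """Hypothetical record of an action chain (illustrative only).

    head: the agent at the head of the action chain.
    main_participant: the agent or patient of the main action.
    """
    head: str
    main_participant: str


def voice_by_origin(chain: ActionChain) -> str:
    """Answer the causative/noncausative question for one situation.

    'causative' if the chain head is distinct from the agent or
    patient of the main action, otherwise 'noncausative'.
    """
    if chain.head != chain.main_participant:
        return "causative"
    return "noncausative"


# The three English examples discussed in the text, encoded under
# the simplifying assumptions stated above.
examples = {
    "Bill walked": ActionChain(head="Bill", main_participant="Bill"),
    "John made Bill walk": ActionChain(head="John", main_participant="Bill"),
    "John killed Bill": ActionChain(head="John", main_participant="Bill"),
}

for sentence, chain in examples.items():
    print(f"{sentence}: {voice_by_origin(chain)}")
```

Note that the periphrastic and lexical causatives receive the same encoding here, which reflects the point that the opposition is defined by the origin of the action rather than by the form of the expression.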

One of the important points of past studies of causative constructions has to do with the fact that a voice category can be expressed by a construction as a whole, rather than by local morphological entities such as verb inflection or nominal case marking. Lexical and periphrastic causative constructions such as John killed Bill and John made Bill walk are a case in point. They differ in form from morphological causatives such as Quechua wanu-ci (die-CAUSE) 'kill' and Japanese aruka-se (walk-CAUSE) 'make walk', where the causative meaning is expressed morphologically. Traditionally, grammarians have tended to consider only morphological causatives as proper cases. However, such a position leads to the uncomfortable decision of treating the Quechua and Japanese forms cited above as causative, while treating the semantically parallel English expressions kill and make walk as noncausative. The form-based treatment of causatives is tantamount to simply circumscribing morphological causatives, and does not lead to a comprehensive study of causative phenomena. Causation is a semantic, not a morphological notion, and as such the whole range of expression types must be taken into account in a satisfactory analysis. Indeed, a (functional) typological study is predicated on the view that a variety of expression types will obtain in any given conceptual domain. The formal tripartite pattern of lexical, morphological, and periphrastic causative constructions has now been widely accepted, and some revealing correlations between form and function have been identified in the causative domain (see Shibatani and Pardeshi 2002 on recent developments). We see below that a similar pattern holds in other voice domains as well.

Having discussed two voice phenomena pertaining to the origin of an action, we now turn to the next major voice parameter concerning its development. We will consider the other voices associated with the nature of the origin of an action--the passive and the inverse--after dealing with other conceptually-based voice phenomena.

3.2. Parameters pertaining to the development of an action

In this section we recognize at least two sets of contrastive patterns in the developmental phase of an action. One is concerned with whether the action develops beyond the personal sphere of the agent or is instead confined within it. The latter mode of development forms the conceptual basis of what is known as the middle voice. The other contrastive pattern of action development is concerned with whether or not the action has been successfully transferred to the patient and has achieved its intended effect. This contrast forms the conceptual basis for the ergative/antipassive opposition.
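The two contrasts just described are independent yes/no questions, and a minimal Python sketch can make that explicit. The function and parameter names are hypothetical. The sketch also makes one assumption that is not stated in so many words in this passage: a successful transfer of the action to the patient is labelled "ergative" and an unsuccessful one "antipassive".

```python
def active_or_middle(confined_to_agent_sphere: bool) -> str:
    """First contrast: is the action confined to the agent's personal sphere?

    Confinement is the conceptual basis of the middle voice; an action
    that develops beyond the agent's sphere is labelled active.
    """
    return "middle" if confined_to_agent_sphere else "active"


def ergative_or_antipassive(effect_achieved_on_patient: bool) -> str:
    """Second contrast: was the action transferred to the patient with
    its intended effect?

    Assumption (labelled in the lead-in): success maps to 'ergative'
    and failure to 'antipassive'.
    """
    return "ergative" if effect_achieved_on_patient else "antipassive"


# Hypothetical feature values, chosen only to exercise both questions.
for confined in (True, False):
    print(f"confined={confined}: {active_or_middle(confined)}")

for achieved in (True, False):
    print(f"effect achieved={achieved}: {ergative_or_antipassive(achieved)}")
```

Keeping the two questions in separate functions mirrors the text, which treats them as two distinct sets of contrastive patterns rather than as a single decision tree.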

The active/middle voice opposition is best known from studies of classical Indo-European languages such as Ancient Greek and Sanskrit, and calls for a broad understanding of the notion of action confinement in the agent's personal sphere. The clearest case in which the development of an action is confined to the agent's sphere is when simple intransitive activities, such as sitting and walking, are lexicalized as intransitive verbs. Here the development of the action is clearly confined within the domain of the agent, as shown in the schematic representation in Figure 5a. These situations contrast with active (causative) situations (e.g. John sat his son in the chair and John made his son walk) where the relevant actions involve an agent that instigates an action which develops outside the (initial) agent's domain (see Figure 4). In the words of Benveniste (1971 [1950]: 148): "In the active, the verbs denote a process that is accomplished outside the subject. In the middle, which is the diathesis to be defined by the opposition, the verb indicates a process centering in the subject, the subject being inside the process."

[FIGURES 4-5 OMITTED]

Reflexive situations also constitute one of the middle action types, since here the action is also confined within the agent's personal sphere. The active expression John hit Bill contrasts with the reflexive expression John hit himself, where the confinement of the hitting action within one's personal sphere (e.g. hitting one's head or body) is marked by a coreferential reflexive pronoun (see Figure 5b). (9)

Other middle situations of body-care action--bathing, combing one's hair, washing one's hands, and dressing oneself--are straightforward, where the agent's action deals with its own body or body part. Because an action confined to the agent's sphere typically affects the agent itself, this aspect of the middle--an effect accruing to the agent itself--plays an important role in framing certain actions of the middle. Greek middle expressions such as paraschesthai ti 'to give something from one's own means' and paratithesthai siton 'to have food served up' are a case in point. Here the actions actually extend beyond the agent's sphere, but their effects accrue on the agent in the manner of a typical middle depicted in Figures 5b and 5c. In other cases, the notion of the agent's personal sphere is more strictly adhered to, as in the following examples:

Sanskrit

(18) a. devadatto yajnadattasya bharyam upayacchati. (active)

Devadatta.NOM Yajnadatta.GEN wife have.relations.3SG.ACT

'Devadatta has relations with Yajnadatta's wife.'

b. devadatto bharyam upayacchate. (middle)

Devadatta.NOM wife have.relations.3SG.MID

'Devadatta has relations with his (own) wife.'

(Klaiman 1988: 34)

Sanxiang Dulong/Rawang

(19) a. [an.sup.53] [a.sup.31][dzwl.sup.31] [a.sup.31][be[??].sup.55]. (active)

3SG mosquito hit

'S/he is hitting the mosquito.'

b. [an.sup.53] [a.sup.31][dzwl.sup.31] [a.sup.31][be[??].sup.55]-[cm.sup.31]. (middle)

3SG mosquito hit-MID

'S/he is hitting the mosquito (on her/his body).'

(LaPolla 1996: 1945)

The active/middle opposition is diagrammatically shown as above, where the dotted circles and arrows represent the agent's personal sphere and actions respectively.

The conceptual basis of the active/middle opposition can then be formulated in terms of the manner of the development of an action, as follows:

Active/middle opposition:

Active: The action extends beyond the agent's personal sphere and achieves its effect on a distinct patient.

Middle: The development of an action is confined within the agent's personal sphere so that the action's effect accrues on the agent itself.
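Purely as an illustration, the opposition just stated can be read as a decision over a single property of an action: whether its effect accrues to the agent itself. The Python sketch below encodes that reading under simplifying assumptions. The `Action` class, its field name, and `classify_voice` are hypothetical names introduced here, not part of the framework's own notation, and the judgment of where the effect accrues is supplied by hand for each example.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """Toy description of an action (all field names are illustrative assumptions)."""
    gloss: str
    # True when the action's effect accrues to the agent itself, i.e. the
    # action counts as confined within the agent's personal sphere.
    effect_accrues_to_agent: bool


def classify_voice(action: Action) -> str:
    """Apply the active/middle opposition as formulated above."""
    return "middle" if action.effect_accrues_to_agent else "active"


if __name__ == "__main__":
    examples = [
        Action("John hit Bill", effect_accrues_to_agent=False),
        Action("John hit himself", effect_accrues_to_agent=True),
        Action("(18b) devadatto bharyam upayacchate", effect_accrues_to_agent=True),
    ]
    for action in examples:
        print(f"{action.gloss}: {classify_voice(action)}")
```

The sketch deliberately keys on where the effect accrues rather than on the physical extent of the action, so that cases like Greek paratithesthai siton, where the action extends outward but the effect returns to the agent, would still come out as middle.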

Defining the middle voice domain in terms of confinement of an action within the sphere of the agent affords a unified treatment of various types of middle construction. Just as in the case of causatives, middle constructions come in three types--lexical, morphological, and periphrastic--both within individual languages and across different ones. Balinese, for example, exhibits all three types of middle construction, allowing some situation types to be expressed either morphologically or periphrastically, as shown in Table 1.

Our approach to middle voice phenomena is more consistent than Kemmer's (1993), which distinguishes reflexive situations from other middle event types, although these two categories are assumed to form a continuum, as shown in the diagrammatic representation in Figure 6. In our approach, Kemmer's reflexive, middle, and single participant situation types all fall in the middle voice domain, as defined above. Kemmer's distinctions among these types appear to be partly based on the typical forms expressing them. Reflexive situations tend to be expressed periphrastically, as in the case of Balinese nyagur awak 'hit oneself'. Kemmer's middle situation types are typically expressed morphologically, as in ma-cukur 'shave' in Balinese, and single-participant events are typically expressed by forms without any middle markers, as in the Balinese lexical middle negak 'sit'.

[FIGURE 6 OMITTED]

Kemmer arrives at her classification of event types as a result of her decision to "[deal] with ... middle-marking languages, or languages with overt morphological indications of the middle category" (Kemmer 1993: 10; bold face original, underline added). As pointed out in the discussion of causatives above, a strict form-based approach to the middle voice tends to focus on morphological middles, which is similar to the narrow treatment of morphological causatives, ignoring other possible form types. Such an approach would consider the Tarascan (Mexico) form ata-kurhi 'hit oneself' and the Quechua form maqa-ku 'hit oneself' as middles, while treating the English and Balinese equivalents hit oneself and nyagur awak as distinct reflexives. Perhaps Kemmer would consider oneself and awak here as "overt morphological indications of the middle category." But then, why is she distinguishing reflexive situations from the middle situations in her diagram reproduced in Figure 6? Also, what of the German form aufstehen 'stand up', which shows no middle marking? Is it not a middle because it lacks any morphological marking? It is semantically equivalent to the Balinese middle form ma-jujuk 'stand up'.

A more systematic typological investigation of the form-function correlation can be achieved if variation in form is taken as a function of the "naturalness" of the middle action. Natural middle actions--for example, sitting and walking--tend to be lexicalized as intransitive verbs, while actions typically directed to others--for example, hitting and kicking--tend to be expressed by periphrastic constructions involving a reflexive form when they are confined within the agent's personal sphere. What Kemmer (1993) has identified as middles--morphological middles--center on those actions that people typically apply to themselves, but that are applied to others often enough. (10) One must, however, realize that there are both intra- and crosslinguistic variations--such that in Balinese ma-jujuk 'stand up' has a morphological middle prefix, but negak 'sit (down)' is simply lexical. The same marking pattern is reversed in German, where sich hinsetzen 'sit down' has a middle marker, but aufstehen 'stand up' does not. These irregularities require individual accounts, based on historical, cognitive, and even cultural data.

The middle voice system has several important implications for our general understanding of the nature of voice phenomena. Recall that most of the widely received definitions of voice (such as the one quoted from Crystal [2003] in Note 1) hold that voice opposition does not entail a meaning contrast. This is not the case for the active/middle opposition, as shown by the examples above as well as by the contrast between the English active form John hit Bill and the middle form John hit himself.

Secondly, these examples show that voice alternations do not necessarily alter argument alignment patterns. There is no change in grammatical relation in the contrastive pairs in (18) and (19). If the situations depicted there give the impression of unusual utterances, consider the mundane situations described by the following Greek examples, where a meaning contrast is expressed without a realignment of arguments:

Ancient Greek

(20) a. louo khitona. (active)

wash.1SG.ACT shirt.ACC

'I wash a shirt.'

b. louomai khitona. (middle)

wash.1SG.MID shirt.ACC

'I wash my shirt/I wash a shirt for myself.'

While morphological middle constructions in some languages are strictly intransitive (as in the case of the Balinese ma-), and middles derived via the decausative function (as in the Greek forms poreusai 'to cause to go, to convey': poreusasthai 'to go' and kaiein 'to light, kindle': kaiesthai 'to be lighted, to burn') are intransitive, intransitivity is not a defining property of middle constructions. A large number of languages allow middle constructions that are syntactically transitive, as shown in the examples above and (21b) below, where the direct object is clearly marked by the accusative case suffix -n.

Amharic

(21) a. lemma te-lac'ce.

Lemma MID-shave.PERF.3M

'Lemma shaved himself.'

b. lemma ras-u-n te-lac'ce.

Lemma head-POSS.3M-ACC MID-shave.PERF.3M

'Lemma shaved his head.'

(Amberber 2000: 325, 326)

The general tendency for morphological middles to be intransitive is best viewed as the result of historical processes responding to the pressure on the form to conform to the semantic intransitivity, which characterizes middle events. This is exactly what has happened to many of the middle forms expressing reflexive middle situations in European languages, where the relevant affixes evolved from reflexive pronouns in the parent languages. The course of this development can be illustrated by using synchronic data below, where the Swedish example shows an intermediate clitic stage, the Russian form sebja exemplifies the earliest transitive pattern, and -s' (or -sja) the advanced fused pattern.

(22) a. Ivan ubi-l sebj-a. (Russian)

Ivan kill.PERF-PAST.SG.MASC self-ACC

'Ivan killed himself.'

b. Hon kamma-de sig. (Swedish)

she comb-PAST MID

'She combed.'

c. Ona prichesa-l-a-s'. (Russian)

she comb-PAST-FEM-MID

'She combed.'

Finally, in recognizing intransitive and transitive verbs as lexicalized middle and active voice forms, we elevate the active/middle contrast to the status of a central voice opposition observed in all human languages (cf. Dixon's [1979: 68-69] observation that "all languages appear to distinguish activities that necessarily involve two participants from those that necessarily involve one ... Then all languages have classes of transitive and intransitive verbs, to describe these two classes of activity"). (11)

Let us now turn to the antipassive voice. As the name suggests, the syntactic properties of antipassive constructions mirror somewhat those of passives, but the semantic aspect is different in these two voices. In the case of the passive, there is no implication that an agent is not somehow fully involved in the action. Indeed, full involvement of an agent is a crucial feature distinguishing the passive (e.g. John was killed while he was asleep) from the spontaneous middle (e.g. John died while he was asleep). Antipassive situations contrast in meaning with those expressed in the active and the ergative voice regarding the attainment of the intended effect upon a patient, however.

The intended effect of an action on a patient differs depending on the verb type. With contact verbs, the antipassive presents a situation as failing to make contact, as in the following examples:

Chukchee

(23) a. elteg=e keyn=en penre=nen.

father=ERG bear=ABS attack=3SG:3SG/AOR

'The father attacked the bear.'

b. elteg=en penre=tko=g[??]e keyn=ete. (antipassive)

father=ABS attack=APASS=3SG.AOR bear=DAT

'The father rushed at the bear.'

(Kozinsky et al. 1988: 652)

Warlpiri

(24) a. nyuntulu-rlu [??]-npa-ju pantu-rnu ngaju.

you-ERG [??]-2SG.A-1SG.P spear-PAST I.ABS

'You speared me.'

b. nyuntulu-rlu [??]-npa-ju-rla pantu-rnu ngaju-ku. (antipassive)

you-ERG [??]-2SG.A-1SG-DAT spear-PAST I-DAT

'You speared at me; you tried to spear me.'

(Dixon 1980: 449)

According to Dixon (1980: 449), (24b) above "indicates that the action denoted by the verb is not fully carried out, in the sense that it does not have the intended effect on the entity denoted by the object [read "patient", MS]." Similarly, visual contact is not made when situations involving visual perception are presented in the antipassive voice:

Warrungu

(25) a. nyula nyaka+n wurripa+[??].

3SG.NOM see+P/P bee+ABS

'He saw bees.'

b. ngaya nyaka+kali+[??] wurripa+wu katyarra+wu.

1SG.NOM see+APASS+P/P bee+DAT possum+DAT

'I was looking for bees and possums.'

(Tsunoda 1988: 606)

Moreover, for action types affecting a patient, the antipassive voice presents a situation as NOT affecting the patient in totality, as in the following examples:

Samoan

(26) a. Sā 'ai e le teine le i'a.

PAST eat ERG ART girl ART fish

'The girl ate the fish.'

b. Sā 'ai le teine i le i'a.

PAST eat ART girl LOC ART fish

'The girl ate some (of the) fish.'

(Mosel and Hovdhaugen 1992: 108)

The voice parameter focusing on the ergative/antipassive contrast can be formulated as below:

Ergative/antipassive opposition:

Does the action develop to its full extent and achieve its intended effect on a patient?

Yes [right arrow] ergative(/active)

No [right arrow] antipassive
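As a hedged illustration of how this parameter interacts with verb type, the sketch below pairs the yes/no decision with the verb-type-dependent readings seen in examples (23)-(26). The enum `VerbType`, the table `ANTIPASSIVE_READING`, and the functions `voice_for` and `describe` are hypothetical names introduced here; the reading strings are paraphrases of the discussion above, not technical terms of the framework.

```python
from enum import Enum


class VerbType(Enum):
    CONTACT = "contact"        # e.g. 'attack', 'spear' in (23)-(24)
    PERCEPTION = "perception"  # e.g. 'see' in (25)
    AFFECTING = "affecting"    # e.g. 'eat' in (26)


# Paraphrased antipassive readings, keyed by verb type (illustrative only).
ANTIPASSIVE_READING = {
    VerbType.CONTACT: "contact with the patient is not made",
    VerbType.PERCEPTION: "visual contact with the patient is not made",
    VerbType.AFFECTING: "the patient is not affected in totality",
}


def voice_for(effect_fully_achieved: bool) -> str:
    """Answer the parameter's question: yes -> ergative/active, no -> antipassive."""
    return "ergative/active" if effect_fully_achieved else "antipassive"


def describe(verb_type: VerbType, effect_fully_achieved: bool) -> str:
    """Combine the voice choice with the reading expected for the verb type."""
    voice = voice_for(effect_fully_achieved)
    if voice == "antipassive":
        return f"{voice}: {ANTIPASSIVE_READING[verb_type]}"
    return f"{voice}: the action achieves its intended effect on the patient"


if __name__ == "__main__":
    print(describe(VerbType.CONTACT, effect_fully_achieved=False))
    print(describe(VerbType.AFFECTING, effect_fully_achieved=True))
```

The point of separating `voice_for` from the reading table is that the voice parameter itself is a single question, while the particular way in which the effect falls short depends on the lexical semantics of the verb.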

Notice that in (24b) an antipassive event is conveyed solely by the case marking on the patient, underscoring our earlier point that voice may be manifested in a nominal element denoting the relevant participant. In the case of the antipassive, the status of the patient is at issue, and antipassivization iconically affects the form of the patient nominal--either case marking it differently from the active/ergative (a case of the so-called differential object marking [Moravcsik 1978]), or avoiding coding it (examples below).

As conceived here, both the middle and the antipassive relate to the nature of the development of an action. Specifically, both have the ontological feature of an action not (totally) affecting a distinct patient. The conceptual affinity between the two explains the middle/antipassive polysemy seen in a fair number of languages. Observe:

Yidiny

(27) a. wagu:da bambi-dinu.

man.ABS cover-MID

'The man covered himself.'

b. wagu:da wawa-:dinu gudaganda.

man.ABS saw-APASS dog.DAT

'The man saw the dog.'

(Dixon 1977: 277, 280)

Balinese

(28) a. Ia sedek ma-sugi.

3SG ASP MID-wash.face

'She is washing her face.'

b. Tiang ma-daar.

1SG APASS-eat

'I ate.'

(Shibatani and Artawa 2002)

Russian

(29) a. Ivan mojetsja mylom.

Ivan wash.MID soap.INSTR

'Ivan washed himself with soap.'

b. Babuska rugajetsja.

granny.NOM scold.APASS

'Granny is scolding.'

(Geniusiene 1987: 9)

In addition, languages may show the well-known connection between the middle and the passive (12) through the use of the same form as the antipassive, thus illustrating a three-way middle-passive-antipassive polysemy:

Russian (cf. the examples immediately above)

(30) Dom stroitsja turezk-oj firm-oj INKA.

house.NOM is.being.built.PASS Turkish-INST firm-INST INKA

'The house is being built by the Turkish company INKA.'

Kuku Yalanji

(31) a. karrkay julurri-ji-y. (middle)

child.ABS wash-MID-NONPAST

'The child is washing itself.'

b. warru (yaburr-ndu) bayka-ji-ny. (passive)

young.man.ABS shark:LOC:pt bite-PASS-PAST

'The young man was bitten (by a shark).'

c. nyulu dingkar minya-nga nuka-ji-ny. (antipassive)

3SG.NOM man.ABS meat-LOC eat-APASS-PAST

'The man had a good feast of meat (he wasted nothing).'

(Patz 1982: 244, 248, 255)

3.3. The termination of action parameter

In a regular transitive event, an action terminates in a patient. However, the action may extend beyond the patient and affect an additional entity, which then functions as a new terminal point. Benefactives/malefactives and applicatives express this kind of situation. The relevant parameter can be formulated in the following form:

Benefactive/malefactive/applicative parameter:

Does the action develop further than its normal course, such that an entity other than the direct event-participants becomes a new terminal point registering an effect of the action?

No [right arrow] active/middle

Yes [right arrow] benefactive/malefactive/applicative

While the notion of benefit-giving is a broad one, there is one particular type with a perceptible change in the beneficiary. This is the case involving transfer of an object, where the object itself is directly affected by the act of giving. In a typical giving situation, the object is physically moved from one owner to a new one. The recipient beneficiary is secondarily affected because it comes into possession of the transferred object. Languages often have a special benefactive construction that portrays this type of situation, where the effect on the beneficiary is indicated by its argument status in syntactic coding. As shown in Shibatani (1996), benefactive constructions are typically based on the syntactic schema of the give-construction even involving the verb form for giving in some languages, as in the case of Japanese seen below:

(32) a. Taroo-wa Hanako-ni hon-o yat-ta.

Taro-TOP Hanako-DAT book-ACC give-PAST

'Taro gave Hanako a book.'

b. Taroo-wa Hanako-ni hon-o kat-te yat-ta.

Taro-TOP Hanako-DAT book-ACC buy-CONJ BEN-PAST

'Taro bought Hanako a book.'

In (32b) the buying action is extended beyond the patient (the book), and affects the beneficiary nominal (Hanako) coded in the dative form. Compare this construction to the one below, expressing a more general benefit-giving in which the beneficiary takes on a nonargument form.

(33) Taroo-wa Hanako-no tame-ni hon-o kat-te yat-ta.

Taro-TOP Hanako-of sake-for book-ACC buy-CONJ GIVE-PAST

'Taro bought a book for (the sake of) Hanako.'

While (33) may express any type of benefit-giving--including one of buying a book to help Hanako's book-selling business--(32b) specifically conveys the meaning that the transfer of the book was intended. Note also the English translations accompanying these examples, which show the same contrast.

Benefactive/malefactive events are also realized by so-called external possession constructions in Indo-European and some other languages (cf. Payne and Barshi 1999), although the context may determine whether or not a clear benefactive/malefactive reading obtains from them. When a body part is involved as the primary patient (cf. below), the benefactive/malefactive reading is not strongly pronounced beyond that which is conveyed by the verb; cf. (34) and (35a):

German

(34) Ich wasche mir die Hände.

I wash I.DAT the hands

'I wash my hands.' (lit. 'I wash me the hands.')

(35) a. Man hat ihm den Arm gebrochen.

lit. 'They broke him the arm.'

b. Man hat seinen Arm gebrochen.

'They broke his arm.'

Where inalienable possession is implicated as above, the dative nominal indicates that the action has affected it as a new terminal point of the action. In German, the external possession construction is generally obligatory when the affected body part is inalienably possessed; the extension of the action to its owner is inevitable under such circumstances. Indeed, an internal possession construction like (35b) suggests that the arm in question was detached, and no effect on its owner is asserted by such a sentence. Internal possession constructions involving inalienably possessed body parts, as in the English form I broke his arm, suggest that the arm's owner was affected, but the implication is obtained through a commonsensical world view. The dative construction (35a), on the other hand, asserts that the body part owner is affected by the action.

The benefactive/malefactive reading can be seen more readily in the following examples, where the dative nominal represents a mentally affected party:

French

(36) a. Jean lui a cassé sa vaisselle.

lit. 'Jean broke her her dishes.'

b. Jean a cassé sa vaisselle.

'Jean broke her dishes.'

Modern Hebrew

(37) a. ha tinok lixlex li et ha xulca.

the baby dirtied I.DAT ACC the shirt

'The baby dirtied the shirt on me.'

b. ha tinok lixlex et ha-xulca shel-i.

the baby dirtied ACC the-shirt of-me

'The baby dirtied my shirt.'

(Berman 1982; T. Givón pers. comm.)

Where inalienable possession is not evident, as in these examples, a malefactive meaning obtains more readily. The trade-off between inalienability and affective reading shows that a principle of relevance is at work in these constructions: the relevance of the dative arguments to the event must be somehow "guaranteed." Involvement of an inalienably possessed object guarantees the relevance of the possessor to the event, since whatever happens to the body part will affect its possessor automatically. When an inalienable possession relation does not obtain--as in (36a) and (37a)--a benefactive/malefactive effect upon the dative argument is pronounced as a way of establishing its relevance to the event. The attendant interpretation that a possessive relation exists contributes to the establishment of the affective relationship; the owner of an object is more easily affected by what happens to its possession.

Contrary to what the label suggests then, so-called external possession constructions DO NOT assert a possessive relation between the dative argument and the directly affected patient. Indeed, the relevant constructions arise independently from externalization of the possessor, as in the German example below (also in [36a] above), or when the notion of possession is irrelevant, as in the following examples (40)-(41) from River Warihio (Uto-Aztecan): (13)

German

(38) Peter repariert mir mein Fahrrad.

'Peter fixes me my bicycle.'

River Warihio

(39) a. hustina pasu-re muni kukuci icio.

Agustina cook-PERF beans children BEN

'Agustina cooked beans for the children.'

b. hustina pasu-ke-re muni kukuci.

Agustina cook-BEN-PERF beans children

'Agustina cooked beans for the children.'

(40) maniwiri no'o wikahta-ke-ru yoma aari.

Manuel 1SG.NS sing-BEN-PERF all afternoon

'Manuel sang all afternoon for me.'

(41) tapana no'o yuku-ke-ru.

yesterday 1SG.NS rain-BEN-PERF

'Yesterday it rained on/for me.'

(Felix 2005: 253, 257, 258)

That the condition of physical proximity should be more important than the possessive relation in inducing a benefactive/malefactive construction is shown by the following River Warihio examples (see Shibatani 1994 for other cases):

(42) a. maniwiri ihcorewapate-re wani pantaoni-ra.

Manuel get.dirty-PERF John jeans-POSS

'Manuel dirtied John's jeans.' (John's jeans were over the chair.)

b. maniwiri ihcorewapate-ke-re pantaoni wani.

Manuel get.dirty-BEN-PERF jeans John

'Manuel dirtied John's jeans.' (John was wearing his jeans.)

In general, applicative constructions have been considered as syntactic valency-increasing operations that are pragmatically motivated (see Peterson 1999). Our claim is that their conceptual basis is rooted in the ontological feature of an action, as stated in the voice parameter above. Peterson's (1999) survey shows that certain applicatives are more basic and prevalent than others. In the words of Peterson (who lumps benefactives and applicatives together), "the locative and circumstantial applicatives depend on the presence of other applicative constructions, while benefactive and instrumental/comitative applicatives do not. That is, there are two core applicative constructions, benefactive and instrumental/comitative, and these serve as anchors as it were for the development of additional applicative constructions marked either by the same or distinct morphology" (Peterson 1999: 135). This observation is consistent with our view of the benefactive/applicative voice. Benefactive and instrumental/comitative participants are much more directly involved in the event than a causal factor or a setting entity such as a location, and hence much more likely to be affected by the action. That the benefactive applicative is obligatory in some languages also underscores the point regarding the affected nature of the recipient beneficiary (cf. above).

In the past, grammarians may not have paid sufficiently close attention to the subtle meaning differences that exist between applicative constructions and their nonapplicative counterparts. However, recent descriptions of applicative constructions have begun to notice some revealing semantic effects. For example, Donohue (1999) shows that the Tukang Besi comitative applicative conveys a meaning whereby the applied comitative nominal is actively engaged in the event: (14)

Tukang Besi

(43) a. No-moturu kene wowine ane ke hotu mopera.

3R-sleep and woman exist and hair short

'He slept with the woman with the short hair.'

(i.e. they were sleeping near each other.) (# they had sex together.)

b. No-moturu-ngkene te wowine ane ke hotu mopera.

3R-sleep-COM CORE woman exist and hair short

'He slept with the woman with the short hair.' (i.e. they had sex together.)

(Donohue 1999: 231)

The following instrumental applicative from Pulaar also demonstrates how an applied instrumental nominal can implicate a participant more thoroughly affected by the agent's action:

Pulaar

(44) a. mi loot-ii min am a saabunnde hee.

1SG wash-PERF.ACT y.s. 1SG.POSS PREP soap DET

'I washed my younger sibling with (some of) the soap.'

b. mi loot-r-ii min am saabunnde hee.

1SG wash-INST-PERF.ACT y.s. 1SG.POSS soap DET

'I washed my younger sibling with (all of) the soap.'

(Sebastian Ross-Hagebaum pers. comm.)

The various effects of locative applicatives have also been recognized in the literature. The Balinese locative expression in (45b) below, for example, describes a situation where the action of planting banana trees extends in such a way as to affect the garden. Here the entire garden ends up being planted with banana trees, while no such implication is made in the nonapplicative counterpart (45a).

Balinese

(45) a. Tiang mulan biyu di tegalan tiang-e.

1SG plant banana in garden 1SG-POSS

'I planted bananas in my garden.'

b. Tiang mulan-in tegalan tiang-e biyu.

1SG plant-APPL garden 1SG-POSS banana

'I planted my garden with bananas.'

(I. Wayan Arka pers. comm.)

Properly identifying construction types representing a voice sub-domain is also a serious problem. In Crystal's (2003) definition (cf. Note 1) reflexives are not recognized as proper voice constructions and their relationship to the middle voice is not entirely clear. A similar problem is seen in Kemmer's (1993) extensive study of middle voice constructions.

There are also severe limitations at the level of explanation. Closer to the main theme of this volume is the problem of understanding the increases and decreases in valency and accompanying changes in argument structure observed in voice phenomena. Why do certain phenomena (e.g. the causative and applicative) show an increase in valency, while others (e.g. the passive and antipassive) typically have a valency-reducing effect? What motivates these valency changes in opposite directions?

Functional explanations regarding the distribution of certain voice constructions go a long way toward an explanatory functional study of grammatical phenomena (cf. Haiman 1985). Being largely based on formal properties such as "linguistic distance" and "full" vs. "reduced form," these explanations are not functional enough to be able to make more general predictions. (2)

The problems outlined above largely stem from two related methodological issues. One is the lack of a coherent conceptual framework for characterizing and analyzing voice phenomena; the other is an over-reliance on formal properties in both analysis and explanation. Clearly the latter problem is caused by the former and by the lack of commitment to the cognition-to-form approach in linguistic analysis. (3) The purpose of this article is thus to lay out a conceptual framework that coherently delineates the domain of voice, which embraces both those phenomena that are traditionally recognized as falling in the voice domain and those that have been kept in limbo. The framework required must deal with the fact that many voice phenomena straddle the semantics-pragmatics boundary, although the active/middle opposition is basically conceptual or semantic, and the active/passive opposition is largely pragmatic. We endeavor to unify these manifestations of voice function by assuming that the pragmatic relevance of clausal units is semantically determined in the first place.

The conceptual foundations of voice can only be arrived at by inspecting contrasting phenomena across languages. Our initial task is therefore to learn how a given language, using its own resources, achieves the goal of expressing a relevant conceptual opposition found in another language. While the ultimate goal of functional typology is to discover the correlative patterns between form and function, this article is concerned primarily with the initial task of postulating conceptual bases of voice phenomena and identifying constructions across languages that express the relevant oppositions.

One final introductory remark is due regarding the controversy over the question of whether the formal relationships between opposing voice categories should be treated as inflectional or derivational. We consider this question to be academic in the absence of rigorous definitions for these processes. In the realm of voice phenomena, some systems, for example, the Ancient Greek active/middle system, incorporate voice morphology in their inflectional paradigm. Others like the English active/ passive opposition do not show a simple morphological relationship--inflectional or derivational--since constructions as a whole enter into the formal opposition. The regularity or productivity of the pattern is often taken to be an important criterion distinguishing inflections from derivations; the former are thought to be regular and obligatory, while the latter allow exceptions. But regularity in natural language is always relative, and so are the patterns of voice oppositions. Even among the known ones, nothing is one hundred percent regular. An alternation that is well-integrated within the inflectional paradigm may show irregularity. In Ancient Greek, for example, we find both active forms that do not have middle counterparts (activa tantum) and middle forms lacking the corresponding active (media tantum). The active/passive opposition also shows a high degree of regularity, without ever being one hundred percent (as in the case of English), others place much severer limitations on the range of permissible passive constructions.

2. The evolution of an action: voice, transitivity, and aspect

The basic claim of this article is that major voice phenomena have their conceptual bases rooted in the human cognition of actions. Because such actions have various effects upon us, we have special interest in the way that they arise, how they develop, and the manner in which they terminate--what is referred to as the evolutionary properties or phases of an action in this article. Through a system of grammatical oppositions, a language provides a means for expressing conceptual contrasts pertaining to the evolutionary properties of an action that the speaker finds relevant for communicative purposes. Among the evolutionary properties, voice is primarily concerned with the way event participants are involved in actions, and with the communicative value, or discourse relevance, that accrues to the event participants from the nature of this involvement.

Mention of the evolution of an action immediately brings to mind two other grammatical concepts, namely, transitivity and aspect. It is thus appropriate to clarify the relationships and differences between these notions. Traditionally, voice has been defined in reference to transitivity, or more narrowly in terms of the transitivity of a verb or clause; the active/passive opposition most typically obtains with transitive verbs. A more important connection between transitivity and voice, however, lies in the notion of semantic transitivity, rather than strictly verbal or clausal transitivity. Indeed, it is easy to see this connection, as in the work of Hopper and Thompson (1980), where many of the phenomena discussed in terms of transitivity are nothing but voice phenomena. This important article concludes the section on grammatical transitivity as follows: "It is tempting to find a superordinate semantic notion which will include all the Transitivity components. If there is one, it has so far not been discovered ..." (Hopper and Thompson 1980: 279). Our claim is that what they are looking for is a theory of voice. In fact, the work of Hopper and Thompson lays important groundwork for the study of voice. In this regard, Kemmer (1993: 247) is absolutely correct in noting that "the scale of transitivity ... forms the conceptual underpinning for voice systems in general, and for reflexive and middle marking systems in particular." (4) While none of these works makes it quite clear, voice is a system of correspondences between action or event types and syntactic structures. For example, what is known as the active voice is the pattern of correspondence between the high transitive event type or the prototypical transitive action and the nominative-accusative coding pattern of the event participants, as in the English active sentence She killed him (see Section 5 below).

The parameters of transitivity identified by Hopper and Thompson (1980) pertain to "different facet[s]" of "carrying-over or transferring an action from one participant to another" (Hopper and Thompson 1980: 253), and they in effect represent the evolutionary properties of an action, that is, they pertain to the way an action is brought about, to the way it is transferred to the second participant, and to the way it affects this participant. In order to bring grammar closer to cognition, we propose to examine specific evolutionary properties of an action pertaining to voice oppositions that are distilled as transitivity parameters in Hopper and Thompson (1980) and others dealing with the issues of transitivity.

If transitivity is integral to a theory of voice, how then do aspect and voice differ under the assumption that both are concerned with the way an action evolves? These two grammatical categories invite different kinds of questions. Aspect asks where the vantage point is with regard to the temporal structure of an action. When the action is viewed holistically encompassing all of its temporal phases, we obtain the perfective viewpoint of the described action. On the other hand, if specific sections of internal temporal structure are focused, we obtain various types of imperfective aspectual construal of an event. The contrast between the perfective and the imperfective aspects and the representative subcategories of the latter seen across languages are represented in Figure 1.

[FIGURE 1 OMITTED]

Voice, on the other hand, asks how an action evolves--that is, it asks about the nature of its origin, the manner in which it develops, and the way that it terminates. These evolutionary phases of an action and the various voice categories pertaining to them are depicted schematically in Figure 2.

[FIGURE 2 OMITTED]

3. Major voice oppositions and their conceptual bases

Under the present conception, the three principal evolutionary phases of an action--origin, development, and termination--form the basis for the major voice parameters. These parameters are generally expressible in the form of questions concerning the evolutionary properties of an action, as below:

Major voice parameters:

I. The origin of an action

(a) How is the action brought about?

(b) Where does the action originate?

(c) What is the nature of the agent?

II. The development of an action

How does the action develop?

(a) Does the action extend beyond the agent's personal sphere or is it confined to it?

(b) Does the action achieve the intended effect in a distinct patient, or does it fail to do so?

III. The termination of an action

Does the action develop further than its normal course, extend beyond the immediate participants of the event, and terminate in an additional entity?

Figure 2 summarizes the voice constructions pertaining to these parameters. Throughout the following discussion, we touch upon the theoretical consequences of this diagrammatic representation of the voice domain.

3.1. Parameters pertaining to the origin of an action

The first opposition to be examined has to do with the nature of the origin of an action--namely, whether the action in question is brought about volitionally or nonvolitionally by a human agent.

Volitional/spontaneous opposition:

Is the action brought about volitionally?

Yes [right arrow] volitional

No [right arrow] spontaneous

While not widely recognized as a voice opposition, this distinction has been recognized as such in the Japanese grammatical tradition, perhaps because the suffix for the spontaneous voice is identical with that used in the passive construction. In fact, it is generally believed that the Japanese passive arose from the spontaneous construction. Languages (or grammarians' interpretations of the facts?) may differ with regard to the precise meaning contrast seen in the volitional/spontaneous opposition. In Japanese, the spontaneous construction expresses a situation where the agent does not intend to bring about an action, but where there is a circumstantial factor external to the agent that induces an action (such as eating "dancing-mushrooms" as in [1b] below). In other languages, a spontaneous form conveys the meaning of an action accidentally brought about. Other manifestations of the opposition may be alternatively expressed in terms of such notions as intentional/unintentional or controlled/uncontrolled, but we shall take the position that these contrasts are included in the basic function of the volitional/spontaneous opposition. That is, by "volitional voice" we mean a connection between a particular syntactic form and a type of action that is brought about by the willful involvement of an agent who "intends the action," and sees to it that the intended effect is achieved. Departure from this action type in any significant way may be construed as constituting a spontaneous action, expressed by a construction formally contrasting with the volitional construction.

In Modern Japanese, the domain of the volitional/spontaneous opposition has shrunk to such an extent that mental activities are the only ones where the contrast is readily observed, with the spontaneous morphology (-re/-rare) having generally given way to a passive interpretation in the domain of physical actions. In Classical Japanese (ninth-twelfth centuries), the volitional/spontaneous opposition was more widely observed, as in the following examples:

Classical Japanese

(1) a. Kikori-domo mo mai-keri. (volitional)

wood cutter-PL also dance-PAST

'Wood cutters also danced.'

b. Kikori-domo mo mawa-re-keri. (spontaneous)

wood cutter-PL also dance-SPON-PAST

'Wood cutters also danced willy-nilly.'

Spontaneous expressions in Japanese typically do not contain an agent in subject position. Because information regarding the volitional status of an agent is most readily accessible to the speaker, the volitional/spontaneous distinction is typically made with reference to a first person agent; accordingly, the missing agent is understood to be the speaker unless otherwise specified. This non-coding of an agent in subject position paved the way for a spontaneous expression where a patient nominal is coded in subject position, as in the following spontaneous construction (2b). Undoubtedly, this was an important step in the development of the passive from the spontaneous construction.

Modern Japanese

(2) a. Boku-wa yoku mukasi-no-koto-o

I-TOP often old days-GEN-things-ACC

omo-u. (volitional)

think-PRES

'I often think about the things of the old days.'

b. Saikin mukasi-no-koto-ga yoku

recently old days-GEN-things-NOM often

omowa-re-ru. (spontaneous)

think-SPON-PRES

'Recently the things of the old days often come to mind.'

Since the volitional/spontaneous opposition is not widely recognized as a voice phenomenon, it is perhaps worth spending some time showing how widespread in the world's languages it actually is. As in other voice sub-domains, languages make use of different resources in expressing the volitional/spontaneous opposition. Indonesian and Malay use the multifunctional prefix ter- to express unintended or accidental actions:

Indonesian

(3) a. Ali memukul anak-nya. (volitional)

Ali AF.hit child-3SG.POSS

'Ali hit his child.'

b. Ali ter-pukul oleh anak-nya. (spontaneous)

Ali SPON-hit PREP child-3SG.POSS

'Ali accidentally hit his child.'

(I Wayan Arka pers. comm.)

According to Winstedt (1927: 86-87), the function of ter- in Malay is to denote an action due "not to conscious activity on the part of the subject, but to external compulsion or accident." It is noteworthy that spontaneous constructions in both Japanese and Indonesian/Malay have an affinity with the passive in that they share the same affix in these languages. Compare the spontaneous constructions above with the passives in Japanese and Indonesian below:

(4) a. Taroo-wa Ziroo-ni nagura-re-ta. (Japanese passive)

Taro-TOP Jiro-by hit-PASS-PAST

'Taro was hit by Jiro.'

b. rumah itu tidak ter-beli oleh

house that NEG PASS-buy PREP

saya. (Indonesian passive)

1.SG

'The house cannot be bought by me.'

The diagrammatic representation of voice constructions in Figure 2 can be thought of as a semantic map, where different constructions are distributed over relevant territory within the voice domain. This is a useful way of representing conceptual affinities among various voice constructions, but its utility is predicated only on a comprehensive view of voice as advocated in this article. Spontaneous and passive are both concerned with the origin of an action. What they share is the idea that this lies NOT in the pragmatically most relevant participant; in the case of the passive, it is the agent of low discourse relevance and in the spontaneous case, it is the external circumstance.

The map in Figure 2 also shows the "neighboring" relationship between the spontaneous, the middle, and the antipassive. In Russian and a number of Australian languages, middle forms are recruited for the volitional/spontaneous contrast, as in the following examples:

Russian

(5) a. Kostja poreza-l xleb.

Kostja cut.PERF-PAST.SG.MASC bread

'Kostja cut the bread.'

b. Kostja poreza-l-sja.

Kostja cut.PERF-PAST.SG.MASC-SPON

'Kostja has [accidentally] cut himself.'

(Vera Podlesskaya pers. comm.)

Diyari

(6) a. natu yinana danka-na wara-yi.

1SG.ERG 2SG.O find-PARTC AUX-PRES

'I found you (after deliberately searching).'

b. nani danka-tadi-na wara-yi yinkangu.

1SG.ABS find-SPON-PARTC AUX-PRES 2SG.LOC

'I found you (accidentally).'

(Austin 1981: 154)

Another favorite source for the spontaneous construction--especially prominent among Indo-Aryan and Dravidian languages of India--is the so-called dative-subject construction, which typically expresses uncontrollable states:

Sinhala

(7) a. mame ee wacene kiwwa.

I.NOM that word say.PAST

'I said that word.'

b. mate ee wacene kiyewuna.

I.DAT that word say.P.PAST

'I blurted that word out.'

(Gair 1990: 17)

The adaptation of the dative-subject construction for a spontaneous action is also seen when the "dative-subject" is marked by cases different from the dative as in the following Bengali examples, where the nominal form corresponding to the dative subject is marked with genitive. Here the volitional/spontaneous contrast takes on interesting nuances:

Bengali

(8) a. Ami toma-ke khub pɔchondo kor-i.

1SG.NOM 2ORDSG-OBJ very liking do-PRES.1

'I like you very much.' (According to my own criteria.)

b. Ama-r toma-ke khub pɔchondo

1SG-GEN 2ORDSG-OBJ very liking

hɔ-y.

become-PRES.3ORD

'I like you very much.' (According to some [socially] set criteria.)

(Onishi 2001: 120)

When the basic meaning of the verb denotes a spontaneous (involuntary) action, the volitional voice form can be obtained by using a self-benefactive construction, as in Marathi and other Indo-Aryan languages:

Marathi

(9) a. sitaa raD-l-i.

Sita.NOM cry-PERF-F

'Sita cried.'

b. sitaa-ne raD-un ghet-l-a.

Sita-ERG cry-CONJ take-PERF-N

'Sita cried (so as to relieve herself).'

(Prashant Pardeshi pers. comm.)

Lhasa Tibetan has a set of auxiliaries expressing different categories of perspective. "Perspective-choice" interacts with both person and evidential categories in a complex way, but the relevant auxiliaries can be divided into a "self-centered" and an "other-centered" group (Denwood 1999). Verbs denoting such intentional actions as reading and dancing normally occur with self-centered auxiliaries when used with first person subjects. They can be made nonintentional or spontaneous with the use of other-centered auxiliaries, as in the following examples:

Tibetan

(10) a. ngas. yi.ge, klog.ba yin.

I-SMP letter read-LINK-AUX (self-centered)

'I read the letter (on purpose).'

b. ngas. yi.ge, klog.song.

I-SMP letter read-AUX (other-centered)

'I read the letter (without meaning to).'

(Denwood 1999: 137)

Conversely, although unintentional verbs expressing involuntary actions such as coughing and seeing normally occur with other-centered auxiliaries, they can be rendered volitional by the use of self-centered auxiliaries:

Tibetan

(11) a. glo. rgyab.byung.

cough-AUX (other-centered)

'I coughed (involuntarily).'

b. glo. rgyab.pa.yin.

cough-LINK-AUX (self-centered)

'I coughed (deliberately).'

(Denwood 1999: 139)

A similar pattern is observed in Newar (Tibeto-Burman), where the relevant contrast is expressed in terms of a distinction between conjunct and disjunct verbal endings--apparently an evidentiality-related phenomenon. Note that only clauses with first person subjects allow this contrast to be expressed.

Newar

(12) a. ji-n kayo tachya-na

1SG-ERG cup break-PC

'I broke the cup (deliberately).'

b. ji-n kayo tachya-ta

1SG-ERG cup break-PD

'I broke the cup (accidentally).'

(Kansakar 1999: 428)

Finally, the phenomenon now widely recognized under the name of "split intransitivity" is rooted in the volitional/spontaneous opposition. Observe first some well-known examples from Eastern Pomo below:

Eastern Pomo

(13) a. ha: c'e:xelka. (volitional)

1SG.A slip

'I am sliding.'

b. wi c'e:xelka. (spontaneous)

1SG.P slip

'I am slipping.'

(McLendon 1978: 1-3)

Although the verb forms are the same, when the pronominal form is inflected for the patient (13b), the sentence conveys a spontaneous action or a "lack of protagonist control" (McLendon 1978: 4). A similar contrast is seen in the Caucasian language Tsova-Tush (Batsbi), where "[the] referent of [an ergative] subject is a voluntary, conscious, controlling participant in the situation named by the verb" (Holisky 1987: 113).

Tsova-Tush (Batsbi)

(14) a. (as) vuiz-n-as.

1SG.ERG fall-AOR-1SG.ERG

'I fell down, on purpose.'

b. (so) voz-en-sO.

1SG.NOM fall-AOR-1SG.NOM

'I fell down, by accident.'

(Holisky 1987: 104)

In addition to these cases of "fluid-S" marking (Dixon 1994), split intransitivity may be realized as a lexically-conditioned phenomenon, where intransitive verbs are classified into an "agentive" class and a "patientive" class. Agentive and patientive nominals respectively trigger marking similar to the corresponding arguments of a transitive clause. The Philippine language Cebuano shows this pattern through a focus system which is characteristic of Formosan and Western Austronesian languages:

Cebuano

(15) Transitive actor-focus construction

Ni-basa ako ug libro.

AF-read I.TOP INDEF book

'I read a book.'

(16) Transitive patient-focus construction

Gi-basa nako ang libro.

PF-read I TOP book

'I read the book.'

(17) a. Agentive intransitive

Ni-dagan ako. (actor-focus form)

AF-run I.TOP

'I ran.'

b. Patientive intransitive

Gi-kapoy ako. (patient-focus form)

PF-tired I.TOP

'I got tired/I am tired.'

Generalizing processes have the effect of obliterating the basic semantic motivation for distinguishing two classes of intransitive verbs; either the larger agentive or larger patientive class of intransitive verbs tends to have semantically heterogeneous verbs. Nevertheless, the split of intransitive verbs into two classes is rooted in the distinction between volitional and involuntary actions involving an animate protagonist. This is seen in a minority class of verbs, such that a minority agentive class contains verbs denoting controlled actions, and a minority patientive class includes verbs denoting involuntary states of affairs (see Merlan 1985). In Cebuano (and perhaps other Philippine languages as well) the larger agentive class includes verbs denoting uncontrolled events such as raining or slipping off, while the minority patientive class contains verbs that express strictly involuntary states of affairs such as being hungry, becoming tired, or contracting diseases.

The patterns of split intransitivity discussed here underscore an important point that we wish to advance in this article: voice can also be expressed by nominal forms. Traditionally, voice has been regarded as a verbal category. Indeed, many linguists take verbal marking or verbal inflection as the defining feature of voice. (5) We reject this restrictive view. As we define it, voice is concerned with the evolutionary properties of an action. It is typically marked on the verb because a verb expresses an action. Verbal voice marking is therefore simply a case of iconicity. An action, however, also involves participants such as agent and patient. Because an action occurs in relation to these protagonist participants, any form representing them could also bear voice marking. The volitional/spontaneous opposition manifested in nominal forms also reflects the underlying relationship between the origin of an action and the volitional status of the agent. (6) Nominal marking for certain voice contrasts is thus also motivated by the iconicity principle.

Let us now turn to the causative/noncausative opposition. As noted in the introduction, the causative has been problematic with respect to its status as a voice category. Widely-received definitions of voice, such as Crystal's in Note 1, maintain that voice oppositions do not entail a semantic contrast, which has prevented many grammarians from readily accepting causative/noncausative as one. As the above discussion on the volitional/spontaneous opposition shows, however, there is no reason to believe that voice is a semantically neutral phenomenon. As it happens, one of the oldest systems of voice contrast in Indo-European--the active/middle opposition--also involves a meaning contrast (see below). (7) The question concerning the causative/noncausative opposition (and other semantic oppositions) is whether the relevant contrasts can be naturally integrated into a coherent conceptual framework of voice. Our answer will be yes.

The causative/noncausative opposition pertains to the origin of an action; that is, whether the action originates with the agent of the main action or with another agent heading the action chain. The causative action chain is represented in Figure 3. (8)

[FIGURE 3 OMITTED]

In a noncausative situation, the initial agent (Agent₂) is also the agent of the main action. In a causative situation, the ultimate origin of the main action lies in the agent (Agent₁) heading the action chain, which is different from the agent (Agent₂) of the main action. The relevant parameter for the causative/noncausative distinction can be formulated as below:

Causative/noncausative opposition:

Does the action originate with an agent heading the action chain that is distinct from the agent or patient of the main action?

Yes [right arrow] causative

No [right arrow] noncausative
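
Purely as an illustration, the question above can be read as a small decision over an action chain. The following Python sketch is not part of the analysis itself: the class ActionChain, its fields, and the function classify_origin are hypothetical names introduced here, and the sketch assumes that participants can be compared by simple identity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionChain:
    """Hypothetical record of an action chain (illustrative names only)."""
    head_agent: str                      # participant heading the action chain
    main_agent: Optional[str] = None     # agent of the main action, if any
    main_patient: Optional[str] = None   # patient of the main action, if any


def classify_origin(chain: ActionChain) -> str:
    """Return 'causative' if the head of the chain is distinct from the
    agent or patient of the main action, and 'noncausative' otherwise."""
    main_participants = {p for p in (chain.main_agent, chain.main_patient) if p}
    if not main_participants:
        raise ValueError("the main action needs at least one participant")
    if chain.head_agent not in main_participants:
        return "causative"
    return "noncausative"


# 'Bill walked': Bill heads the chain and is also the agent of walking.
print(classify_origin(ActionChain(head_agent="Bill", main_agent="Bill")))
# 'John made Bill walk': John heads the chain; Bill is the agent of walking.
print(classify_origin(ActionChain(head_agent="John", main_agent="Bill")))
# 'John killed Bill': John heads the chain; Bill is the patient of the main action.
print(classify_origin(ActionChain(head_agent="John", main_patient="Bill")))
```

The sketch deliberately ignores the form of the expression (lexical, morphological, or periphrastic) and looks only at where the action originates.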

The contrast between a noncausative situation represented by an expression such as Bill walked and its causative counterpart expressed by a periphrastic causative form like John made Bill walk can thus be naturally captured in terms of the nature of the origin of an action. Situations expressed by lexical causatives such as John killed Bill have an (initial) agent distinct from the patient of the main action.

One of the important points of past studies of causative constructions has to do with the fact that a voice category can be expressed by a construction as a whole, rather than by local morphological entities such as verb inflection or nominal case marking. Lexical and periphrastic causative constructions such as John killed Bill and John made Bill walk are a case in point. They differ in form from morphological causatives such as Quechua wanu-ci (die-CAUSE) 'kill' and Japanese aruka-se (walk-CAUSE) 'make walk', where the causative meaning is expressed morphologically. Traditionally, grammarians have tended to consider only morphological causatives as proper cases. However, such a position leads to the uncomfortable decision of treating the Quechua and Japanese forms cited above as causative, while treating the semantically parallel English expressions kill and make walk as noncausative. The form-based treatment of causatives is tantamount to simply circumscribing morphological causatives, and does not lead to a comprehensive study of causative phenomena. Causation is a semantic, not a morphological notion, and as such the whole range of expression types must be taken into account in a satisfactory analysis. Indeed, a (functional) typological study is predicated on the view that a variety of expression types will obtain in any given conceptual domain. The formal tripartite pattern of lexical, morphological, and periphrastic causative constructions has now been widely accepted, and some revealing correlations between form and function have been identified in the causative domain (see Shibatani and Pardeshi 2002 on recent developments). We see below that a similar pattern holds in other voice domains as well.

Having discussed two voice phenomena pertaining to the origin of an action, we now turn to the next major voice parameter concerning its development. We will consider the other voices associated with the nature of the origin of an action--the passive and the inverse--after dealing with other conceptually-based voice phenomena.

3.2. Parameters pertaining to the development of an action

In this section we recognize at least two sets of contrastive patterns in the developmental phase of an action. One is concerned with whether the action develops beyond the personal sphere of the agent or is instead confined within it. The latter mode of development forms the conceptual basis of what is known as the middle voice. The other contrastive pattern of action development is concerned with whether or not the action has been successfully transferred to the patient and has achieved its intended effect. This contrast forms the conceptual basis for the ergative/antipassive opposition.

The active/middle voice opposition is best known from studies of classical Indo-European languages such as Ancient Greek and Sanskrit, and calls for a broad understanding of the notion of action confinement in the agent's personal sphere. The clearest case in which the development of an action is confined to the agent's sphere is when simple intransitive activities, such as sitting and walking, are lexicalized as intransitive verbs. Here the development of the action is clearly confined within the domain of the agent, as shown in the schematic representation in Figure 5a. These situations contrast with active (causative) situations (e.g. John sat his son in the chair and John made his son walk) where the relevant actions involve an agent that instigates an action which develops outside the (initial) agent's domain (see Figure 4). In the words of Benveniste (1971 [1950]: 148): "In the active, the verbs denote a process that is accomplished outside the subject. In the middle, which is the diathesis to be defined by the opposition, the verb indicates a process centering in the subject, the subject being inside the process."

[FIGURES 4-5 OMITTED]

Reflexive situations also constitute one of the middle action types, since here the action is also confined within the agent's personal sphere. The active expression John hit Bill contrasts with the reflexive expression John hit himself, where the confinement of the hitting action within one's personal sphere (e.g. hitting one's head or body) is marked by a coreferential reflexive pronoun (see Figure 5b). (9)

Other middle situations of body-care action--bathing, combing one's hair, washing one's hands, and dressing oneself--are straightforward, since the agent's action deals with its own body or body part. Because an action confined to the agent's sphere typically affects the agent itself, this aspect of the middle--an effect accruing to the agent itself--plays an important role in framing certain actions of the middle. Greek middle expressions such as paraschesthai ti 'to give something from one's own means' and paratithesthai siton 'to have food served up' are a case in point. Here the actions actually extend beyond the agent's sphere, but their effects accrue to the agent in the manner of a typical middle depicted in Figures 5b and 5c. In other cases, the notion of the agent's personal sphere is more strictly adhered to, as in the following examples:

Sanskrit

(18) a. devadatto yajnadattasya bharyam

Devadatta.NOM Yajnadatta.GEN wife

upayacchati. (active)

have.relations.3SG.ACT

'Devadatta has relations with Yajnadatta's wife.'

b. devadatto bharyam upayacchate. (middle)

Devadatta.NOM wife have.relations.3SG.MID

'Devadatta has relations with his (own) wife.'

(Klaiman 1988: 34)

Sanxiang Dulong/Rawang

(19) a. an⁵³ a³¹dzwl³¹ a³¹be[??]⁵⁵. (active)

3SG mosquito hit

'S/he is hitting the mosquito.'

b. an⁵³ a³¹dzwl³¹ a³¹be[??]⁵⁵-cm³¹. (middle)

3SG mosquito hit-MID

'S/he is hitting the mosquito (on her/his body).'

(LaPolla 1996: 1945)

The active/middle opposition is diagrammatically shown as above, where the dotted circles and arrows represent the agent's personal sphere and actions respectively.

The conceptual basis of the active/middle opposition can then be formulated in terms of the manner of the development of an action, as follows:

Active/middle opposition:

Active: The action extends beyond the agent's personal sphere and achieves its effect on a distinct patient.

Middle: The development of an action is confined within the agent's personal sphere so that the action's effect accrues to the agent itself.
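
As a rough sketch only, the formulation above can be mimicked by checking whether the entity affected by an action lies inside the agent's personal sphere. The Python below is an illustrative simplification and not part of the analysis: the class Action, its fields, and the idea of listing the sphere as an explicit set are assumptions made here. It also leaves aside cases, such as the Greek examples mentioned earlier, where the action extends beyond the agent's sphere while its effect accrues to the agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical record of an action and where its effect lands."""
    agent: str
    affected: str                # entity on which the action's effect accrues
    personal_sphere: frozenset   # entities counted as within the agent's sphere


def classify_development(action: Action) -> str:
    """Return 'middle' if the effect stays within the agent's personal
    sphere (the agent itself included), and 'active' otherwise."""
    sphere = action.personal_sphere | {action.agent}
    return "middle" if action.affected in sphere else "active"


john_sphere = frozenset({"John's head", "John's body"})
# 'John hit Bill': the effect lands on a distinct patient.
print(classify_development(Action("John", "Bill", john_sphere)))
# 'John hit himself': the effect stays with the agent.
print(classify_development(Action("John", "John", john_sphere)))

devadatta_sphere = frozenset({"Devadatta's wife"})
# Sanskrit (18a): the action concerns Yajnadatta's wife.
print(classify_development(Action("Devadatta", "Yajnadatta's wife", devadatta_sphere)))
# Sanskrit (18b): the action concerns Devadatta's own wife.
print(classify_development(Action("Devadatta", "Devadatta's wife", devadatta_sphere)))
```

What counts as belonging to the personal sphere is left open in the sketch; it has to be supplied for each language and situation type.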

Defining the middle voice domain in terms of confinement of an action within the sphere of the agent affords a unified treatment of various types of middle construction. Just as in the case of causatives, middle constructions come in three types--lexical, morphological, and periphrastic--both within individual languages and across different ones. Balinese, for example, exhibits all three types of middle construction, allowing some situation types to be expressed either morphologically or periphrastically, as shown in Table 1.

Our approach to middle voice phenomena is more consistent than Kemmer's (1993), which distinguishes reflexive situations from other middle event types, although these two categories are assumed to form a continuum, as shown in the diagrammatic representation in Figure 6. In our approach, Kemmer's reflexive, middle, and single participant situation types all fall in the middle voice domain, as defined above. Kemmer's distinctions among these types appear to be partly based on the typical forms expressing them. Reflexive situations tend to be expressed periphrastically, as in the case of Balinese nyagur awak 'hit oneself'. Kemmer's middle situation types are typically expressed morphologically, as in ma-cukur 'shave' in Balinese, and single-participant events are typically expressed by forms without any middle markers, as in the Balinese lexical middle negak 'sit'.

[FIGURE 6 OMITTED]

Kemmer arrives at her classification of event types as a result of her decision to "[deal] with ... middle-marking languages, or languages with overt morphological indications of the middle category" (Kemmer 1993: 10; bold face original, underline added). As pointed out in the discussion of causatives above, a strict form-based approach to the middle voice tends to focus on morphological middles, which is similar to the narrow treatment of morphological causatives, ignoring other possible form types. Such an approach would consider the Tarascan (Mexico) form ata-kurhi 'hit oneself' and the Quechua form maqa-ku 'hit oneself' as middles, while treating the English and Balinese equivalents hit oneself and nyagur awak as distinct reflexives. Perhaps Kemmer would consider oneself and awak here as "overt morphological indications of the middle category." But then, why is she distinguishing reflexive situations from the middle situations in her diagram reproduced in Figure 6? Also, what of the German form aufstehen 'stand up', which shows no middle marking? Is it not a middle because it lacks any morphological marking? It is semantically equivalent to the Balinese middle form ma-jujuk 'stand up'.

A more systematic typological investigation of the form-function correlation can be achieved if variation in form is taken as a function of the "naturalness" of the middle action. Natural middle actions--for example, sitting and walking--tend to be lexicalized as intransitive verbs, while actions typically directed to others--for example, hitting and kicking--tend to be expressed by periphrastic constructions involving a reflexive form when they are confined within the agent's personal sphere. What Kemmer (1993) has identified as middles--morphological middles--center on those actions that people typically apply to themselves, but that are applied to others often enough. (10) One must, however, realize that there are both intra- and crosslinguistic variations--such that in Balinese ma-jujuk 'stand up' has a morphological middle prefix, but negak 'sit (down)' is simply lexical. The same marking pattern is reversed in German, where sich hinsetzen 'sit down' has a middle marker, but aufstehen 'stand up' does not. These irregularities require individual accounts, based on historical, cognitive, and even cultural data.

The middle voice system has several important implications for our general understanding of the nature of voice phenomena. Recall that most of the widely received definitions of voice (such as the one quoted from Crystal [2003] in Note 1) hold that voice opposition does not entail a meaning contrast. This is not the case for the active/middle opposition, as shown by the examples above as well as by the contrast between the English active form John hit Bill and the middle form John hit himself.

Secondly, these examples show that voice alternations do not necessarily alter argument alignment patterns. There is no change in grammatical relation in the contrastive pairs in (18) and (19). If the situations depicted there give the impression of unusual utterances, consider the mundane situations described by the following Greek examples, where a meaning contrast is expressed without a realignment of arguments:

Ancient Greek

(20) a. louo khitona. (active)

wash.1SG.ACT shirt.ACC

'I wash a shirt.'

b. louomai khitona. (middle)

wash.1SG.MID shirt.ACC

'I wash my shirt/I wash a shirt for myself.'

While morphological middle constructions in some languages are strictly intransitive (as in the case of the Balinese ma-), and middles derived via the decausative function (as in the Greek forms poreusai 'to cause to go, to convey': poreusasthai 'to go' and kaiein 'to light, kindle': kaiesthai 'to be lighted, to burn') are intransitive, intransitivity is not a defining property of middle constructions. A large number of languages allow middle constructions that are syntactically transitive, as shown in the examples above and (21b) below, where the direct object is clearly marked by the accusative case suffix -n.

Amharic

(21) a. lemma te-lac' ce.

Lemma MID-shave.PERF.3M

'Lemma shaved himself.'

b. lemma ras-u-n te-lac' ce.

Lemma head-POSS.3M-ACC MID-shave.PERF.3M

'Lemma shaved his head.'

(Amberber 2000: 325, 326)

The general tendency for morphological middles to be intransitive is best viewed as the result of historical processes responding to the pressure on the form to conform to the semantic intransitivity, which characterizes middle events. This is exactly what has happened to many of the middle forms expressing reflexive middle situations in European languages, where the relevant affixes evolved from reflexive pronouns in the parent languages. The course of this development can be illustrated by using synchronic data below, where the Swedish example shows an intermediate clitic stage, the Russian form sebja exemplifies the earliest transitive pattern, and -s' (or -sja) the advanced fused pattern.

(22) a. Ivan ubi-l sebj-a. (Russian)

Ivan kill.PERF-PAST.SG.MASC self-ACC

'Ivan killed himself.'

b. Hon kamma-de sig. (Swedish)

she comb-PAST MID

'She combed.'

c. Ona prichesa-l-a-s'. (Russian)

she comb-PAST-FEM-MID

'She combed.'

Finally, in recognizing intransitive and transitive verbs as lexicalized middle and active voice forms, we elevate the active/middle contrast to the status of a central voice opposition observed in all human languages (cf. Dixon's [1979: 68-69] observation that "all languages appear to distinguish activities that necessarily involve two participants from those that necessarily involve one ... Then all languages have classes of transitive and intransitive verbs, to describe these two classes of activity"). (11)

Let us now turn to the antipassive voice. As the name suggests, the syntactic properties of antipassive constructions mirror somewhat those of passives, but the semantic aspect is different in these two voices. In the case of the passive, there is no implication that an agent is not somehow fully involved in the action. Indeed, full involvement of an agent is a crucial feature distinguishing the passive (e.g. John was killed while he was asleep) from the spontaneous middle (e.g. John died while he was asleep). Antipassive situations, however, contrast in meaning with those expressed in the active and the ergative voice with regard to the attainment of the intended effect upon a patient.

The intended effect of an action on a patient differs depending on the verb type. With contact verbs, the antipassive presents a situation as failing to make contact, as in the following examples:

Chukchee

(23) a. elteg=e keyn=en penre-nen.

father=ERG bear=ABS attack=3SG:3SG/AOR

'The father attacked the bear.'

b. elteg=en penre=tko=g[??]e

father=ABS attack=APASS=3SG.AOR

keyn=ete. (antipassive)

bear=DAT

'The father rushed at the bear.'

(Kozinsky et al. 1988: 652)

Warlpiri

(24) a. nyuntulu-rlu [??]-npa-ju pantu-rnu ngaju.

you-ERG [??]-2SG.A-1SG.P spear-PAST I.ABS

'You speared me.'

b. nyuntulu-rlu [??]-npa-ju-rla pantu-rnu

you-ERG [??]-2SG.A-1SG-DAT spear-PAST

ngaju-ku. (antipassive)

I-DAT

'You speared at me; you tried to spear me.'

(Dixon 1980: 449)

According to Dixon (1980: 449), (24b) above "indicates that the action denoted by the verb is not fully carried out, in the sense that it does not have the intended effect on the entity denoted by the object [read "patient", MS]." Similarly, visual contact is not made when situations involving visual perception are presented in the antipassive voice:

Warrungu

(25) a. nyula nyaka+n wurripa+[??].

3SG.NOM see+P/P bee+ABS

'He saw bees.'

b. ngaya nyaka+kali+[??] wurripa+wu katyarra+wu.

1SG.NOM see+APASS+P/P bee+DAT possum+DAT

'I was looking for bees and possums.'

(Tsunoda 1988: 606)

Moreover, for action types affecting a patient, the antipassive voice presents a situation as NOT affecting the patient in totality, as in the following examples:

Samoan

(26) a. Sā 'ai e le teine le i'a.

PAST eat ERG ART girl ART fish

'The girl ate the fish.'

b. Sā 'ai le teine i le i'a.

PAST eat ART girl LOC ART fish

'The girl ate some (of the) fish.'

(Mosel and Hovdhaugen 1992: 108)

The voice parameter focusing on the ergative/antipassive contrast can be formulated as below:

Ergative/antipassive opposition:

Does the action develop to its full extent and achieve its intended effect on a patient?

Yes → ergative(/active)

No → antipassive

Notice that in (24b) an antipassive event is conveyed solely by the case marking on the patient, underscoring our earlier point that voice may be manifested in a nominal element denoting the relevant participant. In the case of the antipassive, the status of the patient is at issue, and antipassivization iconically affects the form of the patient nominal--either case marking it differently from the active/ergative (a case of the so-called differential object marking [Moravcsik 1978]), or avoiding coding it (examples below).

As conceived here, both the middle and the antipassive relate to the nature of the development of an action. Specifically, both have the ontological feature of an action not (totally) affecting a distinct patient. The conceptual affinity between the two explains the middle/antipassive polysemy seen in a fair number of languages. Observe:

Yidiny

(27) a. wagu:da bambi-dinu.

man.ABS cover-MID

'The man covered himself.'

b. wagu:da wawa-:dinu gudaganda.

man.ABS saw-APASS dog.DAT

'The man saw the dog.'

(Dixon 1977: 277, 280)

Balinese

(28) a. Ia sedek ma-sugi.

3SG ASP MID-wash.face

'She is washing her face.'

b. Tiang ma-daar.

1SG APASS-eat

'I ate.'

(Shibatani and Artawa 2002)

Russian

(29) a. Ivan mojetsja mylom.

Ivan wash.MID soap.INSTR

'Ivan washed himself with soap.'

b. Babuska rugajetsja.

granny.NOM scold.APASS

'Granny is scolding.'

(Geniusiene 1987: 9)

In addition, languages may show the well-known connection between the middle and the passive (12) through the use of the same form as the antipassive, thus illustrating a three-way middle-passive-antipassive polysemy:

Russian (cf. the examples immediately above)

(30) Dom stroitsja turezk-oj firm-oj

house.NOM is.being.built.PASS Turkish-INST firm-INST

INKA.

INKA

'The house is being built by the Turkish company INKA.'

Kuku Yalanji

(31) a. karrkay julurri-ji-y. (middle)

child.ABS wash-MID-NONPAST

'The child is washing itself.'

b. warru (yaburr-ndu) bayka-ji-ny. (passive)

young man.ABS shark:LOC:pt bite-PASS-PAST

'The young man was bitten (by a shark).'

c. nyulu dingkar minya-nga nuka-ji-ny. (antipassive)

3SG.NOM man.ABS meat-LOC eat-APASS-PAST

'The man had a good feast of meat (he wasted nothing).'

(Patz 1982: 244, 248, 255)

3.3. The termination of action parameter

In a regular transitive event, an action terminates in a patient. However, the action may extend beyond the patient and affect an additional entity, which then functions as a new terminal point. Benefactives/malefactives and applicatives express this kind of situation. The relevant parameter can be formulated in the following form:

Benefactive/malefactive/applicative parameter:

Does the action develop further than its normal course, such that an entity other than the direct event-participants becomes a new terminal point registering an effect of the action?

No → active/middle

Yes → benefactive/malefactive/applicative

While the notion of benefit-giving is a broad one, there is one particular type with a perceptible change in the beneficiary. This is the case involving transfer of an object, where the object itself is directly affected by the act of giving. In a typical giving situation, the object is physically moved from one owner to a new one. The recipient beneficiary is secondarily affected because it comes into possession of the transferred object. Languages often have a special benefactive construction that portrays this type of situation, where the effect on the beneficiary is indicated by its argument status in syntactic coding. As shown in Shibatani (1996), benefactive constructions are typically based on the syntactic schema of the give-construction even involving the verb form for giving in some languages, as in the case of Japanese seen below:

(32) a. Taroo-wa Hanako-ni hon-o yat-ta.

Taro-TOP Hanako-DAT book-ACC give-PAST

'Taro gave Hanako a book.'

b. Taroo-wa Hanako-ni hon-o kat-te

Taro-TOP Hanako-DAT book-ACC buy-CONJ

yat-ta.

BEN-PAST

'Taro bought Hanako a book.'

In (32b) the buying action is extended beyond the patient (the book), and affects the beneficiary nominal (Hanako) coded in the dative form. Compare this construction to the one below, expressing a more general benefit-giving in which the beneficiary takes on a nonargument form.

(33) Taroo-wa Hanako-no tame-ni hon-o

Taro-TOP Hanako-of sake-for book-ACC

kat-te yat-ta.

buy-CONJ GIVE-PAST

'Taro bought a book for (the sake of) Hanako.'

While (33) may express any type of benefit-giving--including one of buying a book to help Hanako's book-selling business--(32b) specifically conveys the meaning that the transfer of the book was intended. Note also the English translations accompanying these examples, which show the same contrast.

Benefactive/malefactive events are also realized by so-called external possession constructions in Indo-European and some other languages (cf. Payne and Barshi 1999), although the context may determine whether or not a clear benefactive/malefactive reading obtains from them. When a body part is involved as the primary patient (cf. below), the benefactive/malefactive reading is not strongly pronounced beyond that which is conveyed by the verb; cf. (34) and (35a):

German

(34) Ich wasche mir die Hände.

I wash I.DAT the hands

'I wash my hands.' (lit. 'I wash me the hands.')

(35) a. Man hat ihm den Arm gebrochen.

lit. 'They broke him the arm.'

b. Man hat seinen Arm gebrochen.

'They broke his arm.'

Where inalienable possession is implicated as above, the dative nominal indicates that the action has affected it as a new terminal point of the action. In German, the external possession construction is generally obligatory when the affected body part is inalienably possessed; the extension of the action to its owner is inevitable under such circumstances. Indeed, an internal possession construction like (35b) suggests that the arm in question was detached, and no effect on its owner is asserted by such a sentence. Internal possession constructions involving inalienably possessed body parts, as in the English form I broke his arm, suggest that the arm's owner was affected, but the implication is obtained through a commonsensical world view. The dative construction (35a), on the other hand, asserts that the body part owner is affected by the action.

The benefactive/malefactive reading can be seen more readily in the following examples, where the dative nominal represents a mentally affected party:

French

(36) a. Jean lui a cassé sa vaisselle.

lit. 'Jean broke her her dishes.'

b. Jean a cassé sa vaisselle.

'Jean broke her dishes.'

Modern Hebrew

(37) a. ha tinok lixlex li et ha xulca.

the baby dirtied I.DAT ACC the shirt

'The baby dirtied the shirt on me.'

b. ha tinok lixlex et ha-xulca shel-i.

the baby dirtied ACC the-shirt of-me

'The baby dirtied my shirt.'

(Berman 1982; T. Gibon pers. comm.)

Where inalienable possession is not evident, as in these examples, a malefactive meaning obtains more readily. The trade-off between inalienability and affective reading shows that a principle of relevance is at work in these constructions: the relevance of the dative arguments to the event must be somehow "guaranteed." Involvement of an inalienably possessed object guarantees the relevance of the possessor to the event, since whatever happens to the body part will affect its possessor automatically. When an inalienable possession relation does not obtain--as in (36a) and (37a)--a benefactive/malefactive effect upon the dative argument is pronounced as a way of establishing its relevance to the event. The attendant interpretation that a possessive relation exists contributes to the establishment of the affective relationship; the owner of an object is more easily affected by what happens to its possession.

Contrary to what the label suggests then, so-called external possession constructions DO NOT assert a possessive relation between the dative argument and the directly affected patient. Indeed, the relevant constructions arise independently from externalization of the possessor, as in the German example below (also in [36a] above), or when the notion of possession is irrelevant, as in the following examples (40)-(41) from River Warihio (Uto-Aztecan): (13)

German

(38) Peter repariert mir mein Fahrrad.

'Peter fixes me my bicycle.'

River Warihio

(39) a. hustina pasu-re muni kukuci icio.

Agustina cook-PERF beans children BEN

'Agustina cooked beans for the children.'

b. hustina pasu-ke-re muni kukuci.

Agustina cook-BEN-PERF beans children

'Agustina cooked beans for the children.'

(40) maniwiri no'o wikahta-ke-ru yoma aari.

Manuel 1SG.NS sing-BEN-PERF all afternoon

'Manuel sang all afternoon for me.'

(41) tapana no'o yuku-ke-ru.

yesterday 1SG.NS rain-BEN-PERF

'Yesterday it rained on/for me.'

(Felix 2005: 253, 257, 258)

That the condition of physical proximity should be more important than the possessive relation in inducing a benefactive/malefactive construction is shown by the following River Warihio examples (see Shibatani 1994 for other cases):

(42) a. maniwiri ihcorewapate-re wani pantaoni-ra.

Manuel get.dirty-PERF John jeans-POSS

'Manuel dirtied John's jeans.' (John's jeans were over the chair.)

b. maniwiri ihcorewapate-ke-re pantaoni wani.

Manuel get.dirty-BEN-PERF jeans John

'Manuel dirtied John's jeans.' (John was wearing his jeans.)

In general, applicative constructions have been considered as syntactic valency-increasing operations that are pragmatically motivated (see Peterson 1999). Our claim is that their conceptual basis is rooted in the ontological feature of an action, as stated in the voice parameter above. Peterson's (1999) survey shows that certain applicatives are more basic and prevalent than others. In the words of Peterson (who lumps benefactives and applicatives together), "the locative and circumstantial applicatives depend on the presence of other applicative constructions, while benefactive and instrumental/comitative applicatives do not. That is, there are two core applicative constructions, benefactive and instrumental/comitative, and these serve as anchors as it were for the development of additional applicative constructions marked either by the same or distinct morphology" (Peterson 1999: 135). This observation is consistent with our view of the benefactive/applicative voice. Benefactive and instrumental/comitative participants are much more directly involved in the event than a causal factor or a setting entity such as a location, and hence are much more likely to be affected by the action. That the benefactive applicative is obligatory in some languages also underscores the point regarding the affected nature of the recipient beneficiary (cf. above).

In the past, grammarians may not have paid sufficiently close attention to the subtle meaning differences that exist between applicative constructions and their nonapplicative counterparts. However, recent descriptions of applicative constructions have begun to notice some revealing semantic effects. For example, Donohue (1999) shows that the Tukang Besi comitative applicative conveys a meaning whereby the applied comitative nominal is actively engaged in the event: (14)

Tukang Besi

(43) a. No-moturu kene wowine ane ke hotu mopera.

3R-sleep and woman exist and hair short

'He slept with the woman with the short hair.'

(i.e. they were sleeping near each other.) (# they had sex together.)

b. No-moturu-ngkene te wowine ane ke hotu

3R-sleep-COM CORE woman exist and hair

mopera.

short

'He slept with the woman with the short hair.' (i.e. they had sex together.)

(Donohue 1999: 231)

The following instrumental applicative from Pulaar also demonstrates how an applied instrumental nominal can implicate a participant more thoroughly affected by the agent's action:

Pulaar

(44) a. mi loot-ii min am a

1SG wash-PERF.ACT y.s. 1SG.POSS PREP

saabunnde hee.

soap DET

'I washed my younger sibling with (some of) the soap.'

b. mi loot-r-ii min am

1SG wash-INST-PERF.ACT y.s. 1SG.POSS

saabunnde hee.

soap DET

'I washed my younger sibling with (all of) the soap.'

(Sebastian Ross-Hagebaum pers. comm.)

The various effects of locative applicatives have also been recognized in the literature. The Balinese locative expression in (45b) below, for example, describes a situation where the action of planting banana trees extends in such a way as to affect the garden. Here the entire garden ends up being planted with banana trees, while no such implication is made in the nonapplicative counterpart (45a).

Balinese

(45) a. Tiang mulan biyu di tegalan tiang-e.

1SG plant banana in garden 1SG-POSS

'I planted bananas in my garden.'

b. Tiang mulan-in tegalan tiang-e biyu.

1SG plant-APPL garden 1SG-POSS banana

'I planted my garden with bananas.'

(I. Wayan Arka pers. comm.)

Properly identifying construction types representing a voice sub-domain is also a serious problem. In Crystal's (2003) definition (cf. Note 1) reflexives are not recognized as proper voice constructions and their relationship to the middle voice is not entirely clear. A similar problem is seen in Kemmer's (1993) extensive study of middle voice constructions.

There are also severe limitations at the level of explanation. Closer to the main theme of this volume is the problem of understanding the increases and decreases in valency and accompanying changes in argument structure observed in voice phenomena. Why do certain phenomena (e.g. the causative and applicative) show an increase in valency, while others (e.g. the passive and antipassive) typically have a valency-reducing effect? What motivates these valency changes in opposite directions?

Functional explanations regarding the distribution of certain voice constructions go a long way toward an explanatory functional study of grammatical phenomena (cf. Haiman 1985). Being largely based on formal properties such as "linguistic distance" and "full" vs. "reduced form," these explanations are not functional enough to be able to make more general predictions. (2)

The problems outlined above largely stem from two related methodological issues. One is the lack of a coherent conceptual framework for characterizing and analyzing voice phenomena; the other is an over-reliance on formal properties in both analysis and explanation. Clearly the latter problem is caused by the former and by the lack of commitment to the cognition-to-form approach in linguistic analysis. (3) The purpose of this article is thus to lay out a conceptual framework that coherently delineates the domain of voice, which embraces both those phenomena that are traditionally recognized as falling in the voice domain and those that have been kept in limbo. The framework required must deal with the fact that many voice phenomena straddle the semantics-pragmatics boundary, although the active/middle opposition is basically conceptual or semantic, and the active/passive opposition is largely pragmatic. We endeavor to unify these manifestations of voice function by assuming that the pragmatic relevance of clausal units is semantically determined in the first place.

The conceptual foundations of voice can only be arrived at by inspecting contrasting phenomena across languages. Our initial task is therefore to learn how a given language, using its own resources, achieves the goal of expressing a relevant conceptual opposition found in another language. While the ultimate goal of functional typology is to discover the correlative patterns between form and function, this article is concerned primarily with the initial task of postulating conceptual bases of voice phenomena and identifying constructions across languages that express the relevant oppositions.

One final introductory remark is due regarding the controversy over the question of whether the formal relationships between opposing voice categories should be treated as inflectional or derivational. We consider this question to be academic in the absence of rigorous definitions for these processes. In the realm of voice phenomena, some systems, for example, the Ancient Greek active/middle system, incorporate voice morphology in their inflectional paradigm. Others like the English active/ passive opposition do not show a simple morphological relationship--inflectional or derivational--since constructions as a whole enter into the formal opposition. The regularity or productivity of the pattern is often taken to be an important criterion distinguishing inflections from derivations; the former are thought to be regular and obligatory, while the latter allow exceptions. But regularity in natural language is always relative, and so are the patterns of voice oppositions. Even among the known ones, nothing is one hundred percent regular. An alternation that is well-integrated within the inflectional paradigm may show irregularity. In Ancient Greek, for example, we find both active forms that do not have middle counterparts (activa tantum) and middle forms lacking the corresponding active (media tantum). The active/passive opposition also shows a high degree of regularity, without ever being one hundred percent (as in the case of English), others place much severer limitations on the range of permissible passive constructions.

2. The evolution of an action: voice, transitivity, and aspect

The basic claim of this article is that major voice phenomena have their conceptual bases rooted in the human cognition of actions. Because such actions have various effects upon us, we have special interest in the way that they arise, how they develop, and the manner in which they terminate--what is referred to as the evolutionary properties or phases of an action in this article. Through a system of grammatical oppositions, a language provides a means for expressing conceptual contrasts pertaining to the evolutionary properties of an action that the speaker finds relevant for communicative purposes. Among the evolutionary properties, voice is primarily concerned with the way event participants are involved in actions, and with the communicative value, or discourse relevance pertaining to the event participants from the nature of this involvement.

Mention of the evolution of an action immediately brings to mind two other grammatical concepts, namely, transitivity and aspect. It is thus appropriate to clarify the relationships and differences between these notions. Traditionally, voice has been defined in reference to transitivity, or more narrowly in terms of the transitivity of a verb or clause; the active/ passive opposition most typically obtains with transitive verbs. A more important connection between transitivity and voice, however, lies in the notion of semantic transitivity, rather than strictly verbal or clausal transitivity. Indeed, it is easy to see this connection, as in the work of Hopper and Thompson (1980), where many of the phenomena discussed in terms of transitivity are nothing but voice phenomena. This important article concludes the section on grammatical transitivity as follows: "It is tempting to find a superordinate semantic notion which will include all the Transitivity components. If there is one, it has so far not been discovered ..." (Hopper and Thompson 1980: 279). Our claim is that what they are looking for is a theory of voice. In fact, the work of Hopper and Thompson lays important ground work for the study of voice. In this regard, Kemmer (1993: 247) is absolutely correct in noting that "the scale of transitivity ... forms the conceptual underpinning for voice systems in general, and for reflexive and middle marking systems in particular." (4) While none of these works makes it quite clear, voice is a system of correspondences between action or event types and syntactic structures. For example, what is known as the active voice is the pattern of correspondence between the high transitive event type or the prototypical transitive action and the nominative-accusative coding pattern of the event participants, as in the English active sentence She killed him (see Section 5 below).

The parameters of transitivity identified by Hopper and Thompson (1980) pertain to "different facet[s]" of "carrying-over or transferring an action from one participant to another" (Hopper and Thompson 1980: 253), and they in effect represent the evolutionary properties of an action, that is, they pertain to the way an action is brought about, to the way it is transferred to the second participant, and to the way it affects this participant. In order to bring grammar closer to cognition, we propose to examine specific evolutionary properties of an action pertaining to voice oppositions that are distilled as transitivity parameters in Hopper and Thompson (1980) and others dealing with the issues of transitivity.

If transitivity is integral to a theory of voice, how then do aspect and voice differ under the assumption that both are concerned with the way an action evolves? These two grammatical categories invite different kinds of questions. Aspect asks where the vantage point is with regard to the temporal structure of an action. When the action is viewed holistically encompassing all of its temporal phases, we obtain the perfective viewpoint of the described action. On the other hand, if specific sections of internal temporal structure are focused, we obtain various types of imperfective aspectual construal of an event. The contrast between the perfective and the imperfective aspects and the representative subcategories of the latter seen across languages are represented in Figure 1.

[FIGURE 1 OMITTED]

Voice, on the other hand, asks how an action evolves--that is, it asks about the nature of its origin, the manner in which it develops, and the way that it terminates. These evolutionary phases of an action and the various voice categories pertaining to them are depicted schematically in Figure 2.

[FIGURE 2 OMITTED]

3. Major voice oppositions and their conceptual bases

Under the present conception, the three principal evolutionary phases of an action--origin, development, and termination--form the basis for the major voice parameters. These parameters are generally expressible in the form of questions concerning the evolutionary properties of an action, as below:

Major voice parameters:

I. The origin of an action

(a) How is the action brought about?

(b) Where does the action originate?

(c) What is the nature of the agent?

II. The development of an action

How does the action develop?

(a) Does the action extend beyond the agent's personal sphere or is it confined to it?

(b) Does the action achieve the intended effect in a distinct patient, or does it fail to do so?

III. The termination of an action

Does the action develop further than its normal course, extend beyond the immediate participants of the event, and terminate in an additional entity?

Figure 2 summarizes the voice constructions pertaining to these parameters. Throughout the following discussion, we touch upon the theoretical consequences of this diagrammatic representation of the voice domain.

3.1. Parameters pertaining to the origin of an action

The first opposition to be examined has to do with the nature of the origin of an action--namely, whether the action in question is brought about volitionally or nonvolitionally by a human agent.

Volitional/spontaneous opposition:

Is the action brought about volitionally?

Yes [right arrow] volitional

No [right arrow] spontaneous

While not widely recognized as a voice opposition, this distinction has been recognized as such in the Japanese grammatical tradition, perhaps because the suffix for the spontaneous voice is identical with that used in the passive construction. In fact, it is generally believed that the Japanese passive arose from the spontaneous construction. Languages (or grammarians' interpretations of the facts?) may differ with regard to the precise meaning contrast seen in the volitional/spontaneous opposition. In Japanese, the spontaneous construction expresses a situation where the agent does not intend to bring about an action, but where there is a circumstantial factor external to the agent that induces an action (such as eating "dancing-mushrooms" as in [1b] below). In other languages, a spontaneous form conveys the meaning of an action accidentally brought about. Other manifestations of the opposition may be alternatively expressed in terms of such notions as intentional/unintentional or controlled/uncontrolled, but we shall take the position that these contrasts are included in the basic function of the volitional/spontaneous opposition. That is, by "volitional voice" we mean a connection between a particular syntactic form and a type of action that is brought about by the willful involvement of an agent who "intends the action," and sees to it that the intended effect is achieved. Departure from this action type in any significant way may be construed as constituting a spontaneous action, expressed by a construction formally contrasting with the volitional construction.

In Modern Japanese, the domain of the volitional/spontaneous opposition has shrunk to such an extent that mental activities are the only ones where the contrast is readily observed, with the spontaneous morphology (-re/-rare) having generally given way to a passive interpretation in the domain of physical actions. In Classical Japanese (ninth-twelfth centuries), the volitional/spontaneous opposition was more widely observed, as in the following examples:

Classical Japanese

(1) a. Kikori-domo mo mai-keri. (volitional)

wood cutter-PL also dance-PAST

'Wood cutters also danced.'

b. Kikori-domo mo mawa-re-keri. (spontaneous)

wood cutter-PL also dance-SPON-PAST

'Wood cutters also danced willy-nilly.'

Spontaneous expressions in Japanese typically do not contain an agent in subject position. Because information regarding the volitional status of an agent is most readily accessible to the speaker, the volitional/spontaneous distinction is typically made with reference to a first person agent; accordingly, the missing agent is understood to be the speaker unless otherwise specified. This non-coding of an agent in subject position paved the way for a spontaneous expression where a patient nominal is coded in subject position, as in the following spontaneous construction (2b). Undoubtedly, this was an important step in the development of the passive from the spontaneous construction.

Modern Japanese

(2) a. Boku-wa yoku mukasi-no-koto-o

I-TOP often old days-GEN-things-ACC

omo-u. (volitional)

think-PRES

'I often think about the things of the old days.'

b. Saikin mukasi-no-koto-ga yoku

recently old days-GEN-things-NOM often

omowa-re-ru. (spontaneous)

think-SPON-PRES

'Recently the things of the old days often come to mind.'

Since the volitional/spontaneous opposition is not widely recognized as a voice phenomenon, it is perhaps worth spending some time showing how widespread in the world's languages it actually is. As in other voice sub-domains, languages make use of different resources in expressing the volitional/spontaneous opposition. Indonesian and Malay use the multifunctional prefix ter- to express unintended or accidental actions:

Indonesian

(3) a. Ali memukul anak-nya. (volitional)

Ali AF.hit child-3SG.POSS

'Ali hit his child.'

b. Ali ter-pukul oleh anak-nya. (spontaneous)

Ali SPON-hit PREP child-3SG.POSS

'Ali accidentally hit his child.'

(I Wayan Arka pers. comm.)

According to Winstedt (1927: 86-87), the function of ter- in Malay is characterized as denoting an action due "not to conscious activity on the part of the subject, but to external compulsion or accident." It is noteworthy that spontaneous constructions in both Japanese and Indonesian/Malay have an affinity with the passive in that they share the same affix in these languages. Compare the spontaneous constructions above with the passives in Japanese and Indonesian below:

(4) a. Taroo-wa Ziroo-ni nagura-re-ta. (Japanese passive)

Taro-TOP Jiro-by hit-PASS-PAST

'Taro was hit by Jiro.'

b. rumah itu tidak ter-beli oleh

house that NEG PASS-buy PREP

saya. (Indonesian passive)

1.SG

'The house cannot be bought by me.'

The diagrammatic representation of voice constructions in Figure 2 can be thought of as a semantic map, where different constructions are distributed over relevant territory within the voice domain. This is a useful way of representing conceptual affinities among various voice constructions, but its utility is predicated only on a comprehensive view of voice as advocated in this article. Spontaneous and passive are both concerned with the origin of an action. What they share is the idea that this lies NOT in the pragmatically most relevant participant; in the case of the passive, it is the agent of low discourse relevance and in the spontaneous case, it is the external circumstance.

The map in Figure 2 also shows the "neighboring" relationship between the spontaneous, the middle, and the antipassive. In Russian and a number of Australian languages, middle forms are recruited for the volitional/spontaneous contrast, as in the following examples:

Russian

(5) a. Kostja poreza-l xleb.

Kostja cut.PERF-PAST.SG.MASC bread

'Kostja cut the bread.'

b. Kostja poreza-l-sja.

Kostja cut.PERF-PAST.SG.MASC-SPON

'Kostja has [accidentally] cut himself.'

(Vera Podlesskaya pers. comm.)

Diyari

(6) a. natu yinana danka-na wara-yi.

1SG.ERG 2SG.O find-PARTC AUX-PRES

'I found you (after deliberately searching).'

b. nani danka-tadi-na wara-yi yinkangu.

1SG.ABS find-SPON-PARTC AUX-PRES 2SG.LOC

'I found you (accidentally).'

(Austin 1981: 154)

Another favorite source for the spontaneous construction--especially prominent among Indo-Aryan and Dravidian languages of India--is the so-called dative-subject construction, which typically expresses uncontrollable states:

Sinhala

(7) a. mame ee wacene kiwwa.

I.NOM that word say.PAST

'I said that word.'

b. mate ee wacene kiyewuna.

I.DAT that word say.P.PAST

'I blurted that word out.'

(Gair 1990: 17)

The adaptation of the dative-subject construction for a spontaneous action is also seen when the "dative-subject" is marked by cases different from the dative as in the following Bengali examples, where the nominal form corresponding to the dative subject is marked with genitive. Here the volitional/spontaneous contrast takes on interesting nuances:

Bengali

(8) a. Ami toma-ke khub pɔchondo kor-i.

1SG.NOM 2ORDSG-OBJ very liking do-PRES.1

'I like you very much.' (According to my own criteria.)

b. Ama-r toma-ke khub pɔchondo

1SG-GEN 2ORDSG-OBJ very liking

hɔ-y.

become-PRES.3ORD

'I like you very much.' (According to some [socially] set criteria.)

(Onishi 2001: 120)

When the basic meaning of the verb denotes a spontaneous (involuntary) action, the volitional voice form can be obtained by using a self-benefactive construction, as in Marathi and other Indo-Aryan languages:

Marathi

(9) a. sitaa raD-l-i.

Sita.NOM cry-PERF-F

'Sita cried.'

b. sitaa-ne raD-un ghet-l-a.

Sita-ERG cry-CONJ take-PERF-N

'Sita cried (so as to relieve herself).'

(Prashant Pardeshi pers. comm.)

Lhasa Tibetan has a set of auxiliaries expressing different categories of perspective. "Perspective-choice" interacts with both person and evidential categories in a complex way, but the relevant auxiliaries can be divided into a "self-centered" and an "other-centered" group (Denwood 1999). Verbs denoting such intentional actions as reading and dancing normally occur with self-centered auxiliaries when used with first person subjects. They can be made nonintentional or spontaneous with the use of other-centered auxiliaries, as in the following examples:

Tibetan

(10) a. ngas. yi.ge, klog.ba yin.

I-SMP letter read-LINK-AUX (self-centered)

'I read the letter (on purpose).'

b. ngas. yi.ge, klog.song.

I-SMP letter read-AUX (other-centered)

'I read the letter (without meaning to).'

(Denwood 1999: 137)

Conversely, although unintentional verbs expressing involuntary actions such as coughing and seeing normally occur with other-centered auxiliaries, they can be rendered volitional by the use of self-centered auxiliaries:

Tibetan

(11) a. glo. rgyab.byung.

cough-AUX (other-centered)

'I coughed (involuntarily).'

b. glo. rgyab.pa.yin.

cough-LINK-AUX (self-centered)

'I coughed (deliberately).'

(Denwood 1999: 139)

A similar pattern is observed in Newar (Tibeto-Burman), where the relevant contrast is expressed in terms of a distinction between conjunct and disjunct verbal endings--apparently an evidentiality-related phenomenon. Note that only clauses with first person subjects allow this contrast to be expressed.

Newar

(12) a. ji-n kayo tachya-na

1SG-ERG cup break-PC

'I broke the cup (deliberately).'

b. ji-n kayo tachya-ta

1SG-ERG cup break-PD

'I broke the cup (accidentally).'

(Kansakar 1999: 428)

Finally, the phenomenon now widely recognized under the name of "split intransitivity" is rooted in the volitional/spontaneous opposition. Observe first some well-known examples from Eastern Pomo below:

Eastern Pomo

(13) a. ha: c'e:xelka. (volitional)

1SG.A slip

'I am sliding.'

b. wi c'e:xelka. (spontaneous)

1SG.P slip

'I am slipping.'

(McLendon 1978: 1-3)

Although the verb forms are the same, when the pronominal form is inflected for the patient (13b), the sentence conveys a spontaneous action or a "lack of protagonist control" (McLendon 1978: 4). A similar contrast is seen in the Caucasian language Tsova-Tush (Batsbi), where "[the] referent of [an ergative] subject is a voluntary, conscious, controlling participant in the situation named by the verb" (Holisky 1987: 113).

Tsova-Tush (Batsbi)

(14) a. (as) vuiz-n-as.

1SG.ERG fall-AOR-1SG.ERG

'I fell down, on purpose.'

b. (so) voz-en-sO.

1SG.NOM fall-AOR-1SG.NOM

'I fell down, by accident.'

(Holisky 1987: 104)

In addition to these cases of "fluid-S" marking (Dixon 1994), split intransitivity may be realized as a lexically-conditioned phenomenon, where intransitive verbs are classified into an "agentive" class and a "patientive" class. Agentive and patientive nominals respectively trigger marking similar to the corresponding arguments of a transitive clause. The Philippine language Cebuano shows this pattern through a focus system which is characteristic of Formosan and Western Austronesian languages:

Cebuano

(15) Transitive actor-focus construction

Ni-basa ako ug libro.

AF-read I.TOP INDEF book

'I read a book.'

(16) Transitive patient-focus construction

Gi-basa nako ang libro.

PF-read I TOP book

'I read the book.'

(17) a. Agentive intransitive

Ni-dagan ako. (actor-focus form)

AF-run I.TOP

'I ran.'

b. Patientive intransitive

Gi-kapoy ako. (patient-focus form)

PF-tired I.TOP

'I got tired/I am tired.'

Generalizing processes have the effect of obliterating the basic semantic motivation for distinguishing two classes of intransitive verbs; either the larger agentive or larger patientive class of intransitive verbs tends to have semantically heterogeneous verbs. Nevertheless, the split of intransitive verbs into two classes is rooted in the distinction between volitional and involuntary actions involving an animate protagonist. This is seen in a minority class of verbs, such that a minority agentive class contains verbs denoting controlled actions, and a minority patientive class includes verbs denoting involuntary states of affairs (see Merlan 1985). In Cebuano (and perhaps other Philippine languages as well) the larger agentive class includes verbs denoting uncontrolled events such as raining or slipping off, while the minority patientive class contains verbs that express strictly involuntary states of affairs such as being hungry, becoming tired, or contracting diseases.

The patterns of split intransitivity discussed here underscore an important point that we wish to advance in this article: voice can also be expressed by nominal forms. Traditionally, voice has been regarded as a verbal category. Indeed, many linguists take verbal marking or verbal inflection as the defining feature of voice. (5) We reject this restrictive view. As we define it, voice is concerned with the evolutionary properties of an action. It is typically marked on the verb because a verb expresses an action. Verbal voice marking is therefore simply a case of iconicity. An action, however, also involves participants such as agent and patient. Because an action occurs in relation to these protagonist participants, any form representing them could also bear voice marking. The volitional/spontaneous opposition manifested in nominal forms also reflects the underlying relationship between the origin of an action and the volitional status of the agent. (6) Nominal marking for certain voice contrasts is thus also motivated by the iconicity principle.

Let us now turn to the causative/noncausative opposition. As noted in the introduction, the causative has been problematic with respect to its status as a voice category. Widely-received definitions of voice, such as Crystal's in Note 1, maintain that voice oppositions do not entail a semantic contrast, which has prevented many grammarians from readily accepting causative/noncausative as one. As the above discussion on the volitional/spontaneous opposition shows, however, there is no reason to believe that voice is a semantically neutral phenomenon. As it happens, one of the oldest systems of voice contrast in Indo-European--the active/middle opposition--also involves a meaning contrast (see below). (7) The question concerning the causative/noncausative opposition (and other semantic oppositions) is whether the relevant contrasts can be naturally integrated into a coherent conceptual framework of voice. Our answer will be yes.

The causative/noncausative opposition pertains to the origin of an action; that is, whether the action originates with the agent of the main action or with another agent heading the action chain. The causative action chain is represented in Figure 3. (8)

[FIGURE 3 OMITTED]

In a noncausative situation, the initial agent (Agent₂) is also the agent of the main action. In a causative situation, the ultimate origin of the main action lies in the agent (Agent₁) heading the action chain, which is different from the agent (Agent₂) of the main action. The relevant parameter for the causative/noncausative distinction can be formulated as below:

Causative/noncausative opposition:

Does the action originate with an agent heading the action chain that is distinct from the agent or patient of the main action?

Yes → causative

No → noncausative
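Purely as an informal illustration, the two origin-related parameters formulated so far can be written as yes/no tests. The sketch below is in Python; the function and argument names are hypothetical labels introduced here for convenience and are not part of the analysis. Each parameter is treated as a simple binary choice, which ignores the language-specific nuances noted in the text.

```python
def volition_voice(brought_about_volitionally: bool) -> str:
    """Volitional/spontaneous opposition: is the action brought about volitionally?"""
    return "volitional" if brought_about_volitionally else "spontaneous"


def causation_voice(distinct_agent_heads_chain: bool) -> str:
    """Causative/noncausative opposition: does the action originate with an agent
    heading the action chain that is distinct from the agent or patient of the
    main action?"""
    return "causative" if distinct_agent_heads_chain else "noncausative"


# (1a) 'Wood cutters also danced.' vs. (1b) 'Wood cutters also danced willy-nilly.'
print(volition_voice(True), volition_voice(False))

# 'Bill walked': the initial agent is also the agent of the main action.
print(causation_voice(False))
# 'John made Bill walk': John heads the chain; Bill is the agent of the main action.
print(causation_voice(True))
```

The two questions are kept as separate functions because the text formulates them as separate parameters; how they interact in a given language is left open here.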

The contrast between a noncausative situation represented by an expression such as Bill walked and its causative counterpart expressed by a periphrastic causative form like John made Bill walk can thus be naturally captured in terms of the nature of the origin of an action. Situations expressed by lexical causatives such as John killed Bill have an (initial) agent distinct from the patient of the main action.

One of the important points of past studies of causative constructions has to do with the fact that a voice category can be expressed by a construction as a whole, rather than by local morphological entities such as verb inflection or nominal case marking. Lexical and periphrastic causative constructions such as John killed Bill and John made Bill walk are a case in point. They differ in form from morphological causatives such as Quechua wanu-ci (die-CAUSE) 'kill' and Japanese aruka-se (walk-CAUSE) 'make walk', where the causative meaning is expressed morphologically. Traditionally, grammarians have tended to consider only morphological causatives as proper cases. However, such a position leads to the uncomfortable decision of treating the Quechua and Japanese forms cited above as causative, while treating the semantically parallel English expressions kill and make walk as noncausative. The form-based treatment of causatives is tantamount to simply circumscribing morphological causatives, and does not lead to a comprehensive study of causative phenomena. Causation is a semantic, not a morphological notion, and as such the whole range of expression types must be taken into account in a satisfactory analysis. Indeed, a (functional) typological study is predicated on the view that a variety of expression types will obtain in any given conceptual domain. The formal tripartite pattern of lexical, morphological, and periphrastic causative constructions has now been widely accepted, and some revealing correlations between form and function have been identified in the causative domain (see Shibatani and Pardeshi 2002 on recent developments). We see below that a similar pattern holds in other voice domains as well.

Having discussed two voice phenomena pertaining to the origin of an action, we now turn to the next major voice parameter concerning its development. We will consider the other voices associated with the nature of the origin of an action--the passive and the inverse--after dealing with other conceptually-based voice phenomena.

3.2. Parameters pertaining to the development of an action

In this section we recognize at least two sets of contrastive patterns in the developmental phase of an action. One is concerned with whether the action develops beyond the personal sphere of the agent or is instead confined within it. The latter mode of development forms the conceptual basis of what is known as the middle voice. The other contrastive pattern of action development is concerned with whether or not the action has been successfully transferred to the patient and has achieved its intended effect. This contrast forms the conceptual basis for the ergative/antipassive opposition.

The active/middle voice opposition is best known from studies of classical Indo-European languages such as Ancient Greek and Sanskrit, and calls for a broad understanding of the notion of action confinement in the agent's personal sphere. The clearest case in which the development of an action is confined to the agent's sphere is when simple intransitive activities, such as sitting and walking, are lexicalized as intransitive verbs. Here the development of the action is clearly confined within the domain of the agent, as shown in the schematic representation in Figure 5a. These situations contrast with active (causative) situations (e.g. John sat his son in the chair and John made his son walk) where the relevant actions involve an agent that instigates an action which develops outside the (initial) agent's domain (see Figure 4). In the words of Benveniste (1971 [1950]: 148): "In the active, the verbs denote a process that is accomplished outside the subject. In the middle, which is the diathesis to be defined by the opposition, the verb indicates a process centering in the subject, the subject being inside the process."

[FIGURES 4-5 OMITTED]

Reflexive situations also constitute one of the middle action types, since here the action is also confined within the agent's personal sphere. The active expression John hit Bill contrasts with the reflexive expression John hit himself, where the confinement of the hitting action within one's personal sphere (e.g. hitting one's head or body) is marked by a coreferential reflexive pronoun (see Figure 5b). (9)

Other middle situations of body-care action--bathing, combing one's hair, washing one's hands, and dressing oneself--are straightforward, where the agent's action deals with its own body or body part. Because an action confined to the agent's sphere typically affects the agent itself, this aspect of the middle--an effect accruing to the agent itself--plays an important role in framing certain actions of the middle. Greek middle expressions such as paraschesthai ti 'to give something from one's own means' and paratithesthai siton 'to have food served up' are a case in point. Here the actions actually extend beyond the agent's sphere, but their effects accrue to the agent in the manner of a typical middle depicted in Figures 5b and 5c. In other cases, the notion of the agent's personal sphere is more strictly adhered to, as in the following examples:

Sanskrit

(18) a. devadatto yajnadattasya bharyam

Devadatta.NOM Yajnadatta.GEN wife

upayacchati. (active)

have.relations.3SG.ACT

'Devadatta has relations with Yajnadatta's wife.'

b. devadatto bharyam upayacchate. (middle)

Devadatta.NOM wife have.relations.3SG.MID

'Devadatta has relations with his (own) wife.'

(Klaiman 1988: 34)

Sanxiang Dulong/Rawang

(19) a. an⁵³ a³¹dzwl³¹ a³¹be[??]⁵⁵. (active)

3SG mosquito hit

'S/he is hitting the mosquito.'

b. an⁵³ a³¹dzwl³¹ a³¹be[??]⁵⁵-cm³¹. (middle)

3SG mosquito hit-MID

'S/he is hitting the mosquito (on her/his body).'

(LaPolla 1996: 1945)

The active/middle opposition is diagrammatically shown as above, where the dotted circles and arrows represent the agent's personal sphere and actions respectively.

The conceptual basis of the active/middle opposition can then be formulated in terms of the manner of the development of an action, as follows:

Active/middle opposition:

Active: The action extends beyond the agent's personal sphere and achieves its effect on a distinct patient.

Middle: The development of an action is confined within the agent's personal sphere so that the action's effect accrues to the agent itself.

Defining the middle voice domain in terms of confinement of an action within the sphere of the agent affords a unified treatment of various types of middle construction. Just as in the case of causatives, middle constructions come in three types--lexical, morphological, and periphrastic--both within individual languages and across different ones. Balinese, for example, exhibits all three types of middle construction, allowing some situation types to be expressed either morphologically or periphrastically, as shown in Table 1.

Our approach to middle voice phenomena is more consistent than Kemmer's (1993), which distinguishes reflexive situations from other middle event types, although these two categories are assumed to form a continuum, as shown in the diagrammatic representation in Figure 6. In our approach, Kemmer's reflexive, middle, and single participant situation types all fall in the middle voice domain, as defined above. Kemmer's distinctions among these types appear to be partly based on the typical forms expressing them. Reflexive situations tend to be expressed periphrastically, as in the case of Balinese nyagur awak 'hit oneself'. Kemmer's middle situation types are typically expressed morphologically, as in ma-cukur 'shave' in Balinese, and single-participant events are typically expressed by forms without any middle markers, as in the Balinese lexical middle negak 'sit'.

[FIGURE 6 OMITTED]

Kemmer arrives at her classification of event types as a result of her decision to "[deal] with ... middle-marking languages, or languages with overt morphological indications of the middle category" (Kemmer 1993: 10; bold face original, underline added). As pointed out in the discussion of causatives above, a strict form-based approach to the middle voice tends to focus on morphological middles, which is similar to the narrow treatment of morphological causatives, ignoring other possible form types. Such an approach would consider the Tarascan (Mexico) form ata-kurhi 'hit oneself' and the Quechua form maqa-ku 'hit oneself' as middles, while treating the English and Balinese equivalents hit oneself and nyagur awak as distinct reflexives. Perhaps Kemmer would consider oneself and awak here as "overt morphological indications of the middle category." But then, why is she distinguishing reflexive situations from the middle situations in her diagram reproduced in Figure 6? Also, what of the German form aufstehen 'stand up', which shows no middle marking? Is it not a middle because it lacks any morphological marking? It is semantically equivalent to the Balinese middle form ma-jujuk 'stand up'.

A more systematic typological investigation of the form-function correlation can be achieved if variation in form is taken as a function of the "naturalness" of the middle action. Natural middle actions--for example, sitting and walking--tend to be lexicalized as intransitive verbs, while actions typically directed to others--for example, hitting and kicking--tend to be expressed by periphrastic constructions involving a reflexive form when they are confined within the agent's personal sphere. What Kemmer (1993) has identified as middles--morphological middles--center on those actions that people typically apply to themselves, but that are applied to others often enough. (10) One must, however, realize that there are both intra- and crosslinguistic variations--such that in Balinese ma-jujuk 'stand up' has a morphological middle prefix, but negak 'sit (down)' is simply lexical. The same marking pattern is reversed in German, where sich hinsetzen 'sit down' has a middle marker, but aufstehen 'stand up' does not. These irregularities require individual accounts, based on historical, cognitive, and even cultural data.

The middle voice system has several important implications for our general understanding of the nature of voice phenomena. Recall that most of the widely received definitions of voice (such as the one quoted from Crystal [2003] in Note 1) hold that voice opposition does not entail a meaning contrast. This is not the case for the active/middle opposition, as shown by the examples above as well as by the contrast between the English active form John hit Bill and the middle form John hit himself.

Secondly, these examples show that voice alternations do not necessarily alter argument alignment patterns. There is no change in grammatical relation in the contrastive pairs in (18) and (19). If the situations depicted there give the impression of unusual utterances, consider the mundane situations described by the following Greek examples, where a meaning contrast is expressed without a realignment of arguments:

Ancient Greek

(20) a. louo khitona. (active)

wash.1SG.ACT shirt.ACC

'I wash a shirt.'

b. louomai khitona. (middle)

wash.1SG.MID shirt.ACC

'I wash my shirt/I wash a shirt for myself.'

While morphological middle constructions in some languages are strictly intransitive (as in the case of the Balinese ma-), and middles derived via the decausative function (as in the Greek forms poreusai 'to cause to go, to convey': poreusasthai 'to go' and kaiein 'to light, kindle': kaiesthai 'to be lighted, to burn') are intransitive, intransitivity is not a defining property of middle constructions. A large number of languages allow middle constructions that are syntactically transitive, as shown in the examples above and (21b) below, where the direct object is clearly marked by the accusative case suffix -n.

Amharic

(21) a. lemma te-lac' ce.

Lemma MID-shave.PERF.3M

'Lemma shaved himself.'

b. lemma ras-u-n te-lac' ce.

Lemma head-POSS.3M-ACC MID-shave.PERF.3M

'Lemma shaved his head.'

(Amberber 2000: 325, 326)

The general tendency for morphological middles to be intransitive is best viewed as the result of historical processes responding to the pressure on the form to conform to the semantic intransitivity, which characterizes middle events. This is exactly what has happened to many of the middle forms expressing reflexive middle situations in European languages, where the relevant affixes evolved from reflexive pronouns in the parent languages. The course of this development can be illustrated by using synchronic data below, where the Swedish example shows an intermediate clitic stage, the Russian form sebja exemplifies the earliest transitive pattern, and -s' (or -sja) the advanced fused pattern.

(22) a. Ivan ubi-l sebj-a. (Russian)

Ivan kill.PERF-PAST.SG.MASC self-ACC

'Ivan killed himself.'

b. Hon kamma-de sig. (Swedish)

she comb-PAST MID

'She combed.'

c. Ona prichesa-l-a-s'. (Russian)

she comb-PAST-FEM-MID

'She combed.'

Finally, in recognizing intransitive and transitive verbs as lexicalized middle and active voice forms, we elevate the active/middle contrast to the status of a central voice opposition observed in all human languages (cf. Dixon's [1979: 68-69] observation that "all languages appear to distinguish activities that necessarily involve two participants from those that necessarily involve one ... Then all languages have classes of transitive and intransitive verbs, to describe these two classes of activity"). (11)

Let us now turn to the antipassive voice. As the name suggests, the syntactic properties of antipassive constructions mirror somewhat those of passives, but the semantic aspect is different in these two voices. In the case of the passive, there is no implication that an agent is not somehow fully involved in the action. Indeed, full involvement of an agent is a crucial feature distinguishing the passive (e.g. John was killed while he was asleep) from the spontaneous middle (e.g. John died while he was asleep). Antipassive situations contrast in meaning with those expressed in the active and the ergative voice regarding the attainment of the intended effect upon a patient, however.

The intended effect of an action on a patient differs depending on the verb type. With contact verbs, the antipassive presents a situation as failing to make contact, as in the following examples:

Chukchee

(23) a. elteg=e keyn=en penre=nen.

father=ERG bear=ABS attack=3SG:3SG/AOR

'The father attacked the bear.'

b. elteg=en penre=tko=g[??]e

father=ABS attack=APASS=3SG.AOR

keyn=ete. (antipassive)

bear=DAT

'The father rushed at the bear.'

(Kozinsky et al. 1988: 652)

Warlpiri

(24) a. nyuntulu-rlu [??]-npa-ju pantu-rnu ngaju.

you-ERG [??]-2SG.A-1SG.P spear-PAST I.ABS

'You speared me.'

b. nyuntulu-rlu [??]-npa-ju-rla pantu-rnu

you-ERG [??]-2SG.A-1SG-DAT spear-PAST

ngaju-ku. (antipassive)

I-DAT

'You speared at me; you tried to spear me.'

(Dixon 1980: 449)

According to Dixon (1980: 449), (24b) above "indicates that the action denoted by the verb is not fully carried out, in the sense that it does not have the intended effect on the entity denoted by the object [read "patient", MS]." Similarly, visual contact is not made when situations involving visual perception are presented in the antipassive voice:

Warrungu

(25) a. nyula nyaka+n wurripa+[??].

3SG.NOM see+P/P bee+ABS

'He saw bees.'

b. ngaya nyaka+kali+[??] wurripa+wu katyarra+wu.

1SG.NOM see+APASS+P/P bee+DAT possum+DAT

'I was looking for bees and possums.'

(Tsunoda 1988: 606)

Moreover, for action types affecting a patient, the antipassive voice presents a situation as NOT affecting the patient in totality, as in the following examples:

Samoan

(26) a. Sā 'ai e le teine le i'a.

PAST eat ERG ART girl ART fish

'The girl ate the fish.'

b. Sā 'ai le teine i le i'a.

PAST eat ART girl LOC ART fish

'The girl ate some (of the) fish.'

(Mosel and Hovdhaugen 1992: 108)

The voice parameter focusing on the ergative/antipassive contrast can be formulated as below:

Ergative/antipassive opposition:

Does the action develop to its full extent and achieve its intended effect on a patient?

Yes → ergative(/active)

No → antipassive
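As a rough illustration only, the two development-related contrasts of this section can be combined into a single toy decision procedure. The Python sketch below uses hypothetical names introduced here. It also makes one assumption that the text does not state: the active/middle question is asked before the ergative/antipassive question, on the grounds that a middle action has no distinct patient whose affectedness could be assessed.

```python
def development_voice(confined_to_agent_sphere: bool,
                      full_effect_on_patient: bool) -> str:
    """Classify an action by the manner of its development (toy sketch)."""
    # Assumed ordering: confinement within the agent's personal sphere is checked first.
    if confined_to_agent_sphere:
        return "middle"
    # The action extends beyond the agent's sphere toward a distinct patient.
    if full_effect_on_patient:
        return "ergative/active"
    return "antipassive"


# (24a) 'You speared me.': the intended effect on the patient is achieved.
print(development_voice(confined_to_agent_sphere=False, full_effect_on_patient=True))
# (24b) 'You speared at me.': the action is not fully carried out.
print(development_voice(confined_to_agent_sphere=False, full_effect_on_patient=False))
# 'John hit himself': the action is confined within the agent's personal sphere.
print(development_voice(confined_to_agent_sphere=True, full_effect_on_patient=False))
```

In this encoding the middle and antipassive outcomes both arise when no distinct patient is fully affected, which is one way of picturing why the two voices sit close together.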

Notice that in (24b) an antipassive event is conveyed solely by the case marking on the patient, underscoring our earlier point that voice may be manifested in a nominal element denoting the relevant participant. In the case of the antipassive, the status of the patient is at issue, and antipassivization iconically affects the form of the patient nominal--either case marking it differently from the active/ergative (a case of the so-called differential object marking [Moravcsik 1978]), or avoiding coding it (examples below).

As conceived here, both the middle and the antipassive relate to the nature of the development of an action. Specifically, both have the ontological feature of an action not (totally) affecting a distinct patient. The conceptual affinity between the two explains the middle/antipassive polysemy seen in a fair number of languages. Observe:

Yidiny

(27) a. wagu:da bambi-dinu.

man.ABS cover-MID

'The man covered himself.'

b. wagu:da wawa-:dinu gudaganda.

man.ABS saw-APASS dog.DAT

'The man saw the dog.'

(Dixon 1977: 277, 280)

Balinese

(28) a. Ia sedek ma-sugi.

3SG ASP MID-wash.face

'She is washing her face.'

b. Tiang ma-daar.

1SG APASS-eat

'I ate.'

(Shibatani and Artawa 2002)

Russian

(29) a. Ivan mojetsja mylom.

Ivan wash.MID soap.INSTR

'Ivan washed himself with soap.'

b. Babuska rugajetsja.

granny.NOM scold.APASS

'Granny is scolding.'

(Geniusiene 1987: 9)

In addition, languages may show the well-known connection between the middle and the passive (12) through the use of the same form as the antipassive, thus illustrating a three-way middle-passive-antipassive polysemy:

Russian (cf. the examples immediately above)

(30) Dom stroitsja turezk-oj firm-oj

house.NOM is.being.built.PASS Turkish-INST firm-INST

INKA.

INKA

'The house is being built by the Turkish company INKA.'

Kuku Yalanji

(31) a. karrkay julurri-ji-y. (middle)

child.ABS wash-MID-NONPAST

'The child is washing itself.'

b. warru (yaburr-ndu) bayka-ji-ny. (passive)

young man.ABS shark:LOC:pt bite-PASS-PAST

'The young man was bitten (by a shark).'

c. nyulu dingkar minya-nga nuka-ji-ny. (antipassive)

3SG.NOM man.ABS meat-LOC eat-APASS-PAST

'The man had a good feast of meat (he wasted nothing).'

(Patz 1982: 244, 248, 255)

3.3. The termination of action parameter

In a regular transitive event, an action terminates in a patient. However, the action may extend beyond the patient and affect an additional entity, which then functions as a new terminal point. Benefactives/malefactives and applicatives express this kind of situation. The relevant parameter can be formulated in the following form:

Benefactive/malefactive/applicative parameter:

Does the action develop further than its normal course, such that an entity other than the direct event-participants becomes a new terminal point registering an effect of the action?

No → active/middle

Yes → benefactive/malefactive/applicative

While the notion of benefit-giving is a broad one, there is one particular type with a perceptible change in the beneficiary. This is the case involving transfer of an object, where the object itself is directly affected by the act of giving. In a typical giving situation, the object is physically moved from one owner to a new one. The recipient beneficiary is secondarily affected because it comes into possession of the transferred object. Languages often have a special benefactive construction that portrays this type of situation, where the effect on the beneficiary is indicated by its argument status in syntactic coding. As shown in Shibatani (1996), benefactive constructions are typically based on the syntactic schema of the give-construction even involving the verb form for giving in some languages, as in the case of Japanese seen below:

(32) a. Taroo-wa Hanako-ni hon-o yat-ta.

Taro-TOP Hanako-DAT book-ACC give-PAST

'Taro gave Hanako a book.'

b. Taroo-wa Hanako-ni hon-o kat-te

Taro-TOP Hanako-DAT book-ACC buy-CONJ

yat-ta.

BEN-PAST

'Taro bought Hanako a book.'

In (32b) the buying action is extended beyond the patient (the book), and affects the beneficiary nominal (Hanako) coded in the dative form. Compare this construction to the one below, expressing a more general benefit-giving in which the beneficiary takes on a nonargument form.

(33) Taroo-wa Hanako-no tame-ni hon-o

Taro-TOP Hanako-of sake-for book-ACC

kat-te yat-ta.

buy-CONJ GIVE-PAST

'Taro bought a book for (the sake of) Hanako.'

While (33) may express any type of benefit-giving--including one of buying a book to help Hanako's book-selling business--(32b) specifically conveys the meaning that the transfer of the book was intended. Note also the English translations accompanying these examples, which show the same contrast.

Benefactive/malefactive events are also realized by so-called external possession constructions in Indo-European and some other languages (cf. Payne and Barshi 1999), although the context may determine whether or not a clear benefactive/malefactive reading obtains from them. When a body part is involved as the primary patient (cf. below), the benefactive/malefactive reading is not strongly pronounced beyond that which is conveyed by the verb; cf. (34) and (35a):

German

(34) Ich wasche mir die Hände.

I wash I.DAT the hands

'I wash my hands.' (lit. 'I wash me the hands.')

(35) a. Man hat ihm den Arm gebrochen.

lit. 'They broke him the arm.'

b. Man hat seinen Arm gebrochen.

'They broke his arm.'

Where inalienable possession is implicated as above, the dative nominal indicates that the action has affected it as a new terminal point of the action. In German, the external possession construction is generally obligatory when the affected body part is inalienably possessed; the extension of the action to its owner is inevitable under such circumstances. Indeed, an internal possession construction like (35b) suggests that the arm in question was detached, and no effect on its owner is asserted by such a sentence. Internal possession constructions involving inalienably possessed body parts, as in the English form I broke his arm, suggest that the arm's owner was affected, but the implication is obtained through a commonsensical world view. The dative construction (35a), on the other hand, asserts that the body part owner is affected by the action.

The benefactive/malefactive reading can be seen more readily in the following examples, where the dative nominal represents a mentally affected party:

French

(36) a. Jean lui a cassé sa vaisselle.

lit. 'Jean broke her her dishes.'

b. Jean a cassé sa vaisselle.

'Jean broke her dishes.'

Modern Hebrew

(37) a. ha tinok lixlex li et ha xulca.

the baby dirtied I.DAT ACC the shirt

'The baby dirtied the shirt on me.'

b. ha tinok lixlex et ha-xulca shel-i.

the baby dirtied ACC the-shirt of-me

'The baby dirtied my shirt.'

(Berman 1982; T. Gibon pers. comm.)

Where inalienable possession is not evident, as in these examples, a malefactive meaning obtains more readily. The trade-off between inalienability and affective reading shows that a principle of relevance is at work in these constructions: the relevance of the dative arguments to the event must be somehow "guaranteed." Involvement of an inalienably possessed object guarantees the relevance of the possessor to the event, since whatever happens to the body part will affect its possessor automatically. When an inalienable possession relation does not obtain--as in (36a) and (37a)--a benefactive/malefactive effect upon the dative argument is pronounced as a way of establishing its relevance to the event. The attendant interpretation that a possessive relation exists contributes to the establishment of the affective relationship; the owner of an object is more easily affected by what happens to its possession.

Contrary to what the label suggests then, so-called external possession constructions DO NOT assert a possessive relation between the dative argument and the directly affected patient. Indeed, the relevant constructions arise independently from externalization of the possessor, as in the German example below (also in [36a] above), or when the notion of possession is irrelevant, as in the following examples (40)-(41) from River Warihio (Uto-Aztecan): (13)

German

(38) Peter repariert mir mein Fahrrad.

'Peter fixes me my bicycle.'

River Warihio

(39) a. hustina pasu-re muni kukuci icio.

Agustina cook-PERF beans children BEN

'Agustina cooked beans for the children.'

b. hustina pasu-ke-re muni kukuci.

Agustina cook-BEN-PERF beans children

'Agustina cooked beans for the children.'

(40) maniwiri no'o wikahta-ke-ru yoma aari.

Manuel 1SG.NS sing-BEN-PERF all afternoon

'Manuel sang all afternoon for me.'

(41) tapana no'o yuku-ke-ru.

yesterday 1SG.NS rain-BEN-PERF

'Yesterday it rained on/for me.'

(Felix 2005: 253, 257, 258)

That the condition of physical proximity should be more important than the possessive relation in inducing a benefactive/malefactive construction is shown by the following River Warihio examples (see Shibatani 1994 for other cases):

(42) a. maniwiri ihcorewapate-re wani pantaoni-ra.

Manuel get.dirty-PERF John jeans-POSS

'Manuel dirtied John's jeans.' (John's jeans were over the chair.)

b. maniwiri ihcorewapate-ke-re pantaoni wani.

Manuel get.dirty-BEN-PERF jeans John

'Manuel dirtied John's jeans.' (John was wearing his jeans.)

In general, applicative constructions have been considered as syntactic valency-increasing operations that are pragmatically motivated (see Peterson 1999). Our claim is that their conceptual basis is rooted in the ontological feature of an action, as stated in the voice parameter above. Peterson's (1999) survey shows that certain applicatives are more basic and prevalent than others. In the words of Peterson (who lumps benefactives and applicatives together), "the locative and circumstantial applicatives depend on the presence of other applicative constructions, while benefactive and instrumental/comitative applicatives do not. That is, there are two core applicative constructions, benefactive and instrumental/comitative, and these serve as anchors as it were for the development of additional applicative constructions marked either by the same or distinct morphology" (Peterson 1999: 135). This observation is consistent with our view of the benefactive/applicative voice. Benefactive and instrumental/comitative participants are much more directly involved in the event than a causal factor or a setting entity such as a location, and hence much more likely to be affected by the action. That the benefactive applicative is obligatory in some languages also underscores the point regarding the affected nature of the recipient beneficiary (cf. above).
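Peterson's dependency can be read as an implicational check on a language's inventory of applicative types. The Python sketch below encodes one such reading, namely that a locative or circumstantial applicative is expected only alongside at least one of the two core types. The function name, the type labels, and the sample inventories are hypothetical and are not drawn from Peterson's survey.

```python
# The two "core" applicative types named in the quotation above.
CORE = {"benefactive", "instrumental/comitative"}
# The types said to depend on the presence of other applicatives.
DEPENDENT = {"locative", "circumstantial"}


def consistent_with_dependency(inventory: set) -> bool:
    """Return True if any dependent applicative co-occurs with a core one."""
    has_dependent = bool(inventory & DEPENDENT)
    has_core = bool(inventory & CORE)
    return has_core or not has_dependent


# Invented inventories, for illustration only.
samples = {
    "language A": {"benefactive"},
    "language B": {"benefactive", "locative"},
    "language C": {"locative"},
}

for name, inventory in samples.items():
    print(name, consistent_with_dependency(inventory))
```

Under this reading, only the invented "language C" would count as an exception, since it has a locative applicative without either core type.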

In the past, grammarians may not have paid sufficiently close attention to the subtle meaning differences that exist between applicative constructions and their nonapplicative counterparts. However, recent descriptions of applicative constructions have begun to notice some revealing semantic effects. For example, Donohue (1999) shows that the Tukang Besi comitative applicative conveys a meaning whereby the applied comitative nominal is actively engaged in the event: (14)

Tukang Besi

(43) a. No-moturu kene wowine ane ke hotu mopera.

3R-sleep and woman exist and hair short

'He slept with the woman with the short hair.'

(i.e. they were sleeping near each other.) (# they had sex together.)

b. No-moturu-ngkene te wowine ane ke hotu

3R-sleep-COM CORE woman exist and hair

mopera.

short

'He slept with the woman with the short hair.' (i.e. they had sex together.)

(Donohue 1999: 231)

The following instrumental applicative from Pulaar also demonstrates how an applied instrumental nominal can implicate a participant more thoroughly affected by the agent's action:

Pulaar

(44) a. mi loot-ii min am a

1SG wash-PERF.ACT y.s. 1SG.POSS PREP

saabunnde hee.

soap DET

'I washed my younger sibling with (some of) the soap.'

b. mi loot-r-ii min am

1SG wash-INST-PERF.ACT y.s. 1SG.POSS

saabunnde hee.

soap DET

'I washed my younger sibling with (all of) the soap.'

(Sebastian Ross-Hagebaum pers. comm.)

The various effects of locative applicatives have also been recognized in the literature. The Balinese locative expression in (45b) below, for example, describes a situation where the action of planting banana trees extends in such a way as to affect the garden. Here the entire garden ends up being planted with banana trees, while no such implication is made in the nonapplicative counterpart (45a).

Balinese

(45) a. Tiang mulan biyu di tegalan tiang-e.

1SG plant banana in garden 1SG-POSS

'I planted bananas in my garden.'

b. Tiang mulan-in tegalan tiang-e biyu.

1SG plant-APPL garden 1SG-POSS banana

'I planted my garden with bananas.'

(I Wayan Arka pers. comm.)

1. Introduction

Current studies on voice phenomena suffer from a number of inadequacies at several levels of description and explanation. At the most fundamental level, there is no coherent conceptual framework that adequately addresses the matter, such that we are often left to wonder whether or not a given phenomenon falls in the domain of voice. For one thing, people differ in the treatment of causative and reflexive constructions; some consider them to represent voice categories, while others do not. Still others avoid raising the issue at all. Various definitions currently offered are of little use, as they are typically based on an Indo-European active/passive opposition, and arbitrarily include or exclude a particular phenomenon from the domain of voice. (1)

Properly identifying construction types representing a voice sub-domain is also a serious problem. In Crystal's (2003) definition (cf. Note 1) reflexives are not recognized as proper voice constructions and their relationship to the middle voice is not entirely clear. A similar problem is seen in Kemmer's (1993) extensive study of middle voice constructions.

There are also severe limitations at the level of explanation. Closer to the main theme of this volume is the problem of understanding the increases and decreases in valency and accompanying changes in argument structure observed in voice phenomena. Why do certain phenomena (e.g. the causative and applicative) show an increase in valency, while others (e.g. the passive and antipassive) typically have a valency-reducing effect? What motivates these valency changes in opposite directions?

Functional explanations regarding the distribution of certain voice constructions go a long way toward an explanatory functional study of grammatical phenomena (cf. Haiman 1985). However, being largely based on formal properties such as "linguistic distance" and "full" vs. "reduced form," these explanations are not functional enough to be able to make more general predictions. (2)

The problems outlined above largely stem from two related methodological issues. One is the lack of a coherent conceptual framework for characterizing and analyzing voice phenomena; the other is an over-reliance on formal properties in both analysis and explanation. Clearly the latter problem is caused by the former and by the lack of commitment to the cognition-to-form approach in linguistic analysis. (3) The purpose of this article is thus to lay out a conceptual framework that coherently delineates the domain of voice, which embraces both those phenomena that are traditionally recognized as falling in the voice domain and those that have been kept in limbo. The framework required must deal with the fact that many voice phenomena straddle the semantics-pragmatics boundary, although the active/middle opposition is basically conceptual or semantic, and the active/passive opposition is largely pragmatic. We endeavor to unify these manifestations of voice function by assuming that the pragmatic relevance of clausal units is semantically determined in the first place.

The conceptual foundations of voice can only be arrived at by inspecting contrasting phenomena across languages. Our initial task is therefore to learn how a given language, using its own resources, achieves the goal of expressing a relevant conceptual opposition found in another language. While the ultimate goal of functional typology is to discover the correlative patterns between form and function, this article is concerned primarily with the initial task of postulating conceptual bases of voice phenomena and identifying constructions across languages that express the relevant oppositions.

One final introductory remark is due regarding the controversy over the question of whether the formal relationships between opposing voice categories should be treated as inflectional or derivational. We consider this question to be academic in the absence of rigorous definitions for these processes. In the realm of voice phenomena, some systems, for example, the Ancient Greek active/middle system, incorporate voice morphology in their inflectional paradigm. Others like the English active/passive opposition do not show a simple morphological relationship--inflectional or derivational--since constructions as a whole enter into the formal opposition. The regularity or productivity of the pattern is often taken to be an important criterion distinguishing inflections from derivations; the former are thought to be regular and obligatory, while the latter allow exceptions. But regularity in natural language is always relative, and so are the patterns of voice oppositions. Even among the known ones, nothing is one hundred percent regular. An alternation that is well-integrated within the inflectional paradigm may show irregularity. In Ancient Greek, for example, we find both active forms that do not have middle counterparts (activa tantum) and middle forms lacking the corresponding active (media tantum). The regularity of the active/passive opposition also varies: while some languages show a high degree of regularity, without ever reaching one hundred percent (as in the case of English), others place much severer limitations on the range of permissible passive constructions.

2. The evolution of an action: voice, transitivity, and aspect

The basic claim of this article is that major voice phenomena have their conceptual bases rooted in the human cognition of actions. Because such actions have various effects upon us, we have special interest in the way that they arise, how they develop, and the manner in which they terminate--what is referred to as the evolutionary properties or phases of an action in this article. Through a system of grammatical oppositions, a language provides a means for expressing conceptual contrasts pertaining to the evolutionary properties of an action that the speaker finds relevant for communicative purposes. Among the evolutionary properties, voice is primarily concerned with the way event participants are involved in actions, and with the communicative value, or discourse relevance pertaining to the event participants from the nature of this involvement.

Mention of the evolution of an action immediately brings to mind two other grammatical concepts, namely, transitivity and aspect. It is thus appropriate to clarify the relationships and differences between these notions. Traditionally, voice has been defined in reference to transitivity, or more narrowly in terms of the transitivity of a verb or clause; the active/passive opposition most typically obtains with transitive verbs. A more important connection between transitivity and voice, however, lies in the notion of semantic transitivity, rather than strictly verbal or clausal transitivity. Indeed, it is easy to see this connection, as in the work of Hopper and Thompson (1980), where many of the phenomena discussed in terms of transitivity are nothing but voice phenomena. This important article concludes the section on grammatical transitivity as follows: "It is tempting to find a superordinate semantic notion which will include all the Transitivity components. If there is one, it has so far not been discovered ..." (Hopper and Thompson 1980: 279). Our claim is that what they are looking for is a theory of voice. In fact, the work of Hopper and Thompson lays important ground work for the study of voice. In this regard, Kemmer (1993: 247) is absolutely correct in noting that "the scale of transitivity ... forms the conceptual underpinning for voice systems in general, and for reflexive and middle marking systems in particular." (4) While none of these works makes it quite clear, voice is a system of correspondences between action or event types and syntactic structures. For example, what is known as the active voice is the pattern of correspondence between the high transitive event type or the prototypical transitive action and the nominative-accusative coding pattern of the event participants, as in the English active sentence She killed him (see Section 5 below).

The parameters of transitivity identified by Hopper and Thompson (1980) pertain to "different facet[s]" of "carrying-over or transferring an action from one participant to another" (Hopper and Thompson 1980: 253), and they in effect represent the evolutionary properties of an action, that is, they pertain to the way an action is brought about, to the way it is transferred to the second participant, and to the way it affects this participant. In order to bring grammar closer to cognition, we propose to examine specific evolutionary properties of an action pertaining to voice oppositions that are distilled as transitivity parameters in Hopper and Thompson (1980) and others dealing with the issues of transitivity.

If transitivity is integral to a theory of voice, how then do aspect and voice differ under the assumption that both are concerned with the way an action evolves? These two grammatical categories invite different kinds of questions. Aspect asks where the vantage point is with regard to the temporal structure of an action. When the action is viewed holistically encompassing all of its temporal phases, we obtain the perfective viewpoint of the described action. On the other hand, if specific sections of internal temporal structure are focused, we obtain various types of imperfective aspectual construal of an event. The contrast between the perfective and the imperfective aspects and the representative subcategories of the latter seen across languages are represented in Figure 1.

[FIGURE 1 OMITTED]

Voice, on the other hand, asks how an action evolves--that is, it asks about the nature of its origin, the manner in which it develops, and the way that it terminates. These evolutionary phases of an action and the various voice categories pertaining to them are depicted schematically in Figure 2.

[FIGURE 2 OMITTED]

3. Major voice oppositions and their conceptual bases

Under the present conception, the three principal evolutionary phases of an action--origin, development, and termination--form the basis for the major voice parameters. These parameters are generally expressible in the form of questions concerning the evolutionary properties of an action, as below:

Major voice parameters:

I. The origin of an action

(a) How is the action brought about?

(b) Where does the action originate?

(c) What is the nature of the agent?

II. The development of an action

How does the action develop?

(a) Does the action extend beyond the agent's personal sphere or is it confined to it?

(b) Does the action achieve the intended effect in a distinct patient, or does it fail to do so?

III. The termination of an action

Does the action develop further than its normal course, extend beyond the immediate participants of the event, and terminate in an additional entity?

Figure 2 summarizes the voice constructions pertaining to these parameters. Throughout the following discussion, we touch upon the theoretical consequences of this diagrammatic representation of the voice domain.
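One way to keep this three-phase organization in view is as a simple lookup from evolutionary phase to the voice categories that the discussion associates with it. The Python sketch below is a reading aid rather than a rendering of Figure 2: it lists only categories that the text explicitly ties to a phase, and the names `VOICE_BY_PHASE` and `phase_of` are hypothetical.

```python
# Phase -> voice categories, restricted to those the text explicitly
# assigns to a phase; this is an assumption-laden summary, not Figure 2.
VOICE_BY_PHASE = {
    "origin": ["volitional", "spontaneous", "passive"],
    "development": ["middle", "antipassive"],
    "termination": ["benefactive", "malefactive", "applicative"],
}


def phase_of(category: str) -> str:
    """Return the evolutionary phase a voice category is listed under."""
    for phase, categories in VOICE_BY_PHASE.items():
        if category in categories:
            return phase
    raise KeyError(f"no phase recorded for {category!r}")


print(phase_of("spontaneous"))  # origin
print(phase_of("antipassive"))  # development
print(phase_of("applicative"))  # termination
```

A flat lookup of this kind cannot represent the neighboring relations of a semantic map, so it should not be taken to predict which categories share a marker.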

3.1. Parameters pertaining to the origin of an action

The first opposition to be examined has to do with the nature of the origin of an action--namely, whether the action in question is brought about volitionally or nonvolitionally by a human agent.

Volitional/spontaneous opposition:

Is the action brought about volitionally?

Yes → volitional

No → spontaneous

While not widely recognized as a voice opposition, this distinction has been recognized as such in the Japanese grammatical tradition, perhaps because the suffix for the spontaneous voice is identical with that used in the passive construction. In fact, it is generally believed that the Japanese passive arose from the spontaneous construction. Languages (or grammarians' interpretations of the facts?) may differ with regard to the precise meaning contrast seen in the volitional/spontaneous opposition. In Japanese, the spontaneous construction expresses a situation where the agent does not intend to bring about an action, but where there is a circumstantial factor external to the agent that induces an action (such as eating "dancing-mushrooms" as in [1b] below). In other languages, a spontaneous form conveys the meaning of an action accidentally brought about. Other manifestations of the opposition may be alternatively expressed in terms of such notions as intentional/unintentional or controlled/uncontrolled, but we shall take the position that these contrasts are included in the basic function of the volitional/spontaneous opposition. That is, by "volitional voice" we mean a connection between a particular syntactic form and a type of action that is brought about by the willful involvement of an agent who "intends the action," and sees to it that the intended effect is achieved. Departure from this action type in any significant way may be construed as constituting a spontaneous action, expressed by a construction formally contrasting with the volitional construction.

In Modern Japanese, the domain of the volitional/spontaneous opposition has shrunk to such an extent that mental activities are the only ones where the contrast is readily observed, with the spontaneous morphology (-re/-rare) having generally given way to a passive interpretation in the domain of physical actions. In Classical Japanese (ninth-twelfth centuries), the volitional/spontaneous opposition was more widely observed, as in the following examples:

Classical Japanese

(1) a. Kikori-domo mo mai-keri. (volitional)

wood cutter-PL also dance-PAST

'Wood cutters also danced.'

b. Kikori-domo mo mawa-re-keri. (spontaneous)

wood cutter-PL also dance-SPON-PAST

'Wood cutters also danced willy-nilly.'

Spontaneous expressions in Japanese typically do not contain an agent in subject position. Because information regarding the volitional status of an agent is most readily accessible to the speaker, the volitional/spontaneous distinction is typically made with reference to a first person agent; accordingly, the missing agent is understood to be the speaker unless otherwise specified. This non-coding of an agent in subject position paved the way for a spontaneous expression where a patient nominal is coded in subject position, as in the following spontaneous construction (2b). Undoubtedly, this was an important step in the development of the passive from the spontaneous construction.

Modern Japanese

(2) a. Boku-wa yoku mukasi-no-koto-o

I-TOP often old days-GEN-things-ACC

omo-u. (volitional)

think-PRES

'I often think about the things of the old days.'

b. Saikin mukasi-no-koto-ga yoku

recently old days-GEN-things-NOM often

omowa-re-ru. (spontaneous)

think-SPON-PRES

'Recently the things of the old days often come to mind.'

Since the volitional/spontaneous opposition is not widely recognized as a voice phenomenon, it is perhaps worth spending some time showing how widespread in the world's languages it actually is. As in other voice sub-domains, languages make use of different resources in expressing the volitional/spontaneous opposition. Indonesian and Malay use the multifunctional prefix ter- to express unintended or accidental actions:

Indonesian

(3) a. Ali memukul anak-nya. (volitional)

Ali AF.hit child-3SG.POSS

'Ali hit his child.'

b. Ali ter-pukul oleh anak-nya. (spontaneous)

Ali SPON-hit PREP child-3SG.POSS

'Ali accidentally hit his child.'

(I Wayan Arka pers. comm.)

According to Winstedt (1927: 86-87), the function of ter- in Malay is characterized as denoting an action due "not to conscious activity on the part of the subject, but to external compulsion or accident." It is noteworthy that spontaneous constructions in both Japanese and Indonesian/Malay have an affinity with the passive in that they share the same affix in these languages. Compare the spontaneous constructions above with the passives in Japanese and Indonesian below:

(4) a. Taroo-wa Ziroo-ni nagura-re-ta. (Japanese passive)

Taro-TOP Jiro-by hit-PASS-PAST

'Taro was hit by Jiro.'

b. rumah itu tidak ter-beli oleh

house that NEG PASS-buy PREP

saya. (Indonesian passive)

1SG

'The house cannot be bought by me.'

The diagrammatic representation of voice constructions in Figure 2 can be thought of as a semantic map, where different constructions are distributed over relevant territory within the voice domain. This is a useful way of representing conceptual affinities among various voice constructions, but its utility is predicated only on a comprehensive view of voice as advocated in this article. Spontaneous and passive are both concerned with the origin of an action. What they share is the idea that this lies NOT in the pragmatically most relevant participant; in the case of the passive, it is the agent of low discourse relevance and in the spontaneous case, it is the external circumstance.

The map in Figure 2 also shows the "neighboring" relationship between the spontaneous, the middle, and the antipassive. In Russian and a number of Australian languages, middle forms are recruited for the volitional/spontaneous contrast, as in the following examples:

Russian

(5) a. Kostja poreza-l xleb.

Kostja cut.PERF-PAST.SG.MASC bread

'Kostja cut the bread.'

b. Kostja poreza-l-sja.

Kostja cut.PERF-PAST.SG.MASC-SPON

'Kostja has [accidentally] cut himself.'

(Vera Podlesskaya pers. comm.)

Diyari

(6) a. natu yinana danka-na wara-yi.

1SG.ERG 2SG.O find-PARTC AUX-PRES

'I found you (after deliberately searching).'

b. nani danka-tadi-na wara-yi yinka ngu.

1SG.ABS find-SPON-PARTC AUX-PRES 2SG.LOC

'I found you (accidentally).'

(Austin 1981: 154)

Another favorite source for the spontaneous construction--especially prominent among Indo-Aryan and Dravidian languages of India--is the so-called dative-subject construction, which typically expresses uncontrollable states:

Sinhala

(7) a. mame ee wacene kiwwa.

I.NOM that word say.PAST

'I said that word.'

b. mate ee wacene kiyewuna.

I.DAT that word say.P.PAST

'I blurted that word out.'

(Gair 1990: 17)

The adaptation of the dative-subject construction for a spontaneous action is also seen when the "dative-subject" is marked by cases different from the dative as in the following Bengali examples, where the nominal form corresponding to the dative subject is marked with genitive. Here the volitional/spontaneous contrast takes on interesting nuances:

Bengali

(8) a. Ami toma-ke khub pɔchondo kor-i.

1SG.NOM 2ORDSG-OBJ very liking do-PRES.1

'I like you very much.' (According to my own criteria.)

b. Ama-r toma-ke khub pɔchondo

1SG-GEN 2ORDSG-OBJ very liking

hɔy.

become-PRES.3ORD

'I like you very much.' (According to some [socially] set criteria.)

(Onishi 2001: 120)

When the basic meaning of the verb denotes a spontaneous (involuntary) action, the volitional voice form can be obtained by using a self-benefactive construction, as in Marathi and other Indo-Aryan languages:

Marathi

(9) a. sitaa raD-l-i.

Sita.NOM cry-PERF-F

'Sita cried.'

b. sitaa-ne raD-un ghet-l-a.

Sita-ERG cry-CONJ take-PERF-N

'Sita cried (so as to relieve herself).'

(Prashant Pardeshi pers. comm.)

Lhasa Tibetan has a set of auxiliaries expressing different categories of perspective. "Perspective-choice" interacts with both person and evidential categories in a complex way, but the relevant auxiliaries can be divided into a "self-centered" and an "other-centered" group (Denwood 1999). Verbs denoting such intentional actions as reading and dancing normally occur with self-centered auxiliaries when used with first person subjects. They can be made nonintentional or spontaneous with the use of other-centered auxiliaries, as in the following examples:

Tibetan

(10) a. ngas. yi.ge, klog.ba yin.

I-SMP letter read-LINK-AUX (self-centered)

'I read the letter (on purpose).'

b. ngas. yi.ge, klog.song.

I-SMP letter read-AUX (other-centered)

'I read the letter (without meaning to).'

(Denwood 1999: 137)

Conversely, although unintentional verbs expressing involuntary actions such as coughing and seeing normally occur with other-centered auxiliaries, they can be rendered volitional by the use of self-centered auxiliaries:

Tibetan

(11) a. glo. rgyab.byung.

cough-AUX (other-centered)

'I coughed (involuntarily).'

b. glo. rgyab.pa.yin.

cough-LINK-AUX (self-centered)

'I coughed (deliberately).'

(Denwood 1999: 139)

A similar pattern is observed in Newar (Tibeto-Burman), where the relevant contrast is expressed in terms of a distinction between conjunct and disjunct verbal endings--apparently an evidentiality-related phenomenon. Note that only clauses with first person subjects allow this contrast to be expressed.

Newar

(12) a. ji-n kayo tachya-na

1SG-ERG cup break-PC

'I broke the cup (deliberately).'

b. ji-n kayo tachya-ta

1SG-ERG cup break-PD

'I broke the cup (accidentally).'

(Kansakar 1999: 428)

Finally, the phenomenon now widely recognized under the name of "split intransitivity" is rooted in the volitional/spontaneous opposition. Observe first some well-known examples from Eastern Pomo below:

Eastern Pomo

(13) a. ha: c'e:xelka. (volitional)

1SG.A slip

'I am sliding.'

b. wi c'e:xelka. (spontaneous)

1SG.P slip

'I am slipping.'

(McLendon 1978: 1-3)

Although the verb forms are the same, when the pronominal form is inflected for the patient (13b), the sentence conveys a spontaneous action or a "lack of protagonist control" (McLendon 1978: 4). A similar contrast is seen in the Caucasian language Tsova-Tush (Batsbi), where "[the] referent of [an ergative] subject is a voluntary, conscious, controlling participant in the situation named by the verb" (Holisky 1987: 113).

Tsova-Tush (Batsbi)

(14) a. (as) vuiz-n-as.

1SG.ERG fall-AOR-1SG.ERG

'I fell down, on purpose.'

b. (so) voz-en-sO.

1SG.NOM fall-AOR-1SG.NOM

'I fell down, by accident.'

(Holisky 1987: 104)

In addition to these cases of "fluid-S" marking (Dixon 1994), split intransitivity may be realized as a lexically-conditioned phenomenon, where intransitive verbs are classified into an "agentive" class and a "patientive" class. Agentive and patientive nominals respectively trigger marking similar to the corresponding arguments of a transitive clause. The Philippine language Cebuano shows this pattern through a focus system which is characteristic of Formosan and Western Austronesian languages:

Cebuano

(15) Transitive actor-focus construction

Ni-basa ako ug libro.

AF-read I.TOP INDEF book

'I read a book.'

(16) Transitive patient-focus construction

Gi-basa nako ang libro.

PF-read I TOP book

'I read the book.'

(17) a. Agentive intransitive

Ni-dagan ako. (actor-focus form)

AF-run I.TOP

'I ran.'

b. Patientive intransitive

Gi-kapoy ako. (patient-focus form)

PF-tired I.TOP

'I got tired/I am tired.'

Generalizing processes have the effect of obliterating the basic semantic motivation for distinguishing two classes of intransitive verbs; either the larger agentive or larger patientive class of intransitive verbs tends to have semantically heterogeneous verbs. Nevertheless, the split of intransitive verbs into two classes is rooted in the distinction between volitional and involuntary actions involving an animate protagonist. This is seen in a minority class of verbs, such that a minority agentive class contains verbs denoting controlled actions, and a minority patientive class includes verbs denoting involuntary states of affairs (see Merlan 1985). In Cebuano (and perhaps other Philippine languages as well) the larger agentive class includes verbs denoting uncontrolled events such as raining or slipping off, while the minority patientive class contains verbs that express strictly involuntary states of affairs such as being hungry, becoming tired, or contracting diseases.

The patterns of split intransitivity discussed here underscore an important point that we wish to advance in this article: voice can also be expressed by nominal forms. Traditionally, voice has been regarded as a verbal category. Indeed, many linguists take verbal marking or verbal inflection as the defining feature of voice. (5) We reject this restrictive view. As we define it, voice is concerned with the evolutionary properties of an action. It is typically marked on the verb because a verb expresses an action. Verbal voice marking is therefore simply a case of iconicity. An action, however, also involves participants such as agent and patient. Because an action occurs in relation to these protagonist participants, any form representing them could also bear voice marking. The volitional/spontaneous opposition manifested in nominal forms also reflects the underlying relationship between the origin of an action and the volitional status of the agent. (6) Nominal marking for certain voice contrasts is thus also motivated by the iconicity principle.

Let us now turn to the causative/noncausative opposition. As noted in the introduction, the causative has been problematic with respect to its status as a voice category. Widely-received definitions of voice, such as Crystal's in Note 1, maintain that voice oppositions do not entail a semantic contrast, which has prevented many grammarians from readily accepting causative/noncausative as one. As the above discussion on the volitional/spontaneous opposition shows, however, there is no reason to believe that voice is a semantically neutral phenomenon. As it happens, one of the oldest systems of voice contrast in Indo-European--the active/middle opposition--also involves a meaning contrast (see below). (7) The question concerning the causative/noncausative opposition (and other semantic oppositions) is whether the relevant contrasts can be naturally integrated into a coherent conceptual framework of voice. Our answer will be yes.

The causative/noncausative opposition pertains to the origin of an action; that is, whether the action originates with the agent of the main action or with another agent heading the action chain. The causative action chain is represented in Figure 3. (8)

[FIGURE 3 OMITTED]

In a noncausative situation, the initial agent (Agent₂) is also the agent of the main action. In a causative situation, the ultimate origin of the main action lies in the agent (Agent₁) heading the action chain, which is different from the agent (Agent₂) of the main action. The relevant parameter for the causative/noncausative distinction can be formulated as below:

Causative/noncausative opposition:

Does the action originate with an agent heading the action chain that is distinct from the agent or patient of the main action?

Yes → causative

No → noncausative
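As an informal illustration, the parameter can be read as a yes/no decision procedure. The Python sketch below is a minimal rendering under stated assumptions: the `Action` record, its field names, and the function `causative_voice` are hypothetical and serve only to make the question explicit; they are not part of the framework itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Action:
    """Hypothetical record of an action chain (illustrative only)."""
    initiator: str               # participant heading the action chain
    main_agent: Optional[str]    # agent of the main action, if any
    main_patient: Optional[str]  # patient of the main action, if any


def causative_voice(action: Action) -> str:
    """Answer the causative/noncausative question for one action."""
    # Causative: the chain is headed by a participant distinct from
    # both the agent and the patient of the main action.
    distinct = action.initiator not in (action.main_agent, action.main_patient)
    return "causative" if distinct else "noncausative"


# 'Bill walked': the initial agent is also the agent of the main action.
bill_walked = Action("Bill", main_agent="Bill", main_patient=None)
# 'John made Bill walk': John heads the chain, Bill performs the walking.
john_made_bill_walk = Action("John", main_agent="Bill", main_patient=None)
# 'John killed Bill': John heads the chain, Bill is the patient of dying.
john_killed_bill = Action("John", main_agent=None, main_patient="Bill")
```

On this encoding the periphrastic and the lexical causative receive the same answer, which is the point of a meaning-based rather than a form-based treatment.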

The contrast between a noncausative situation represented by an expression such as Bill walked and its causative counterpart expressed by a periphrastic causative form like John made Bill walk can thus be naturally captured in terms of the nature of the origin of an action. Situations expressed by lexical causatives such as John killed Bill have an (initial) agent distinct from the patient of the main action.

One of the important points of past studies of causative constructions has to do with the fact that a voice category can be expressed by a construction as a whole, rather than by local morphological entities such as verb inflection or nominal case marking. Lexical and periphrastic causative constructions such as John killed Bill and John made Bill walk are a case in point. They differ in form from morphological causatives such as Quechua wanu-ci (die-CAUSE) 'kill' and Japanese aruka-se (walk-CAUSE) 'make walk', where the causative meaning is expressed morphologically. Traditionally, grammarians have tended to consider only morphological causatives as proper cases. However, such a position leads to the uncomfortable decision of treating the Quechua and Japanese forms cited above as causative, while treating the semantically parallel English expressions kill and make walk as noncausative. The form-based treatment of causatives is tantamount to simply circumscribing morphological causatives, and does not lead to a comprehensive study of causative phenomena. Causation is a semantic, not a morphological notion, and as such the whole range of expression types must be taken into account in a satisfactory analysis. Indeed, a (functional) typological study is predicated on the view that a variety of expression types will obtain in any given conceptual domain. The formal tripartite pattern of lexical, morphological, and periphrastic causative constructions has now been widely accepted, and some revealing correlations between form and function have been identified in the causative domain (see Shibatani and Pardeshi 2002 on recent developments). We see below that a similar pattern holds in other voice domains as well.

Having discussed two voice phenomena pertaining to the origin of an action, we now turn to the next major voice parameter concerning its development. We will consider the other voices associated with the nature of the origin of an action--the passive and the inverse--after dealing with other conceptually-based voice phenomena.

3.2. Parameters pertaining to the development of an action

In this section we recognize at least two sets of contrastive patterns in the developmental phase of an action. One is concerned with whether the action develops beyond the personal sphere of the agent or is instead confined within it. The latter mode of development forms the conceptual basis of what is known as the middle voice. The other contrastive pattern of action development is concerned with whether or not the action has been successfully transferred to the patient and has achieved its intended effect. This contrast forms the conceptual basis for the ergative/antipassive opposition.

The active/middle voice opposition is best known from studies of classical Indo-European languages such as Ancient Greek and Sanskrit, and calls for a broad understanding of the notion of action confinement in the agent's personal sphere. The clearest case in which the development of an action is confined to the agent's sphere is when simple intransitive activities, such as sitting and walking, are lexicalized as intransitive verbs. Here the development of the action is clearly confined within the domain of the agent, as shown in the schematic representation in Figure 5a. These situations contrast with active (causative) situations (e.g. John sat his son in the chair and John made his son walk) where the relevant actions involve an agent that instigates an action which develops outside the (initial) agent's domain (see Figure 4). In the words of Benveniste (1971 [1950]: 148): "In the active, the verbs denote a process that is accomplished outside the subject. In the middle, which is the diathesis to be defined by the opposition, the verb indicates a process centering in the subject, the subject being inside the process."

[FIGURES 4-5 OMITTED]

Reflexive situations also constitute one of the middle action types, since here the action is also confined within the agent's personal sphere. The active expression John hit Bill contrasts with the reflexive expression John hit himself, where the confinement of the hitting action within one's personal sphere (e.g. hitting one's head or body) is marked by a coreferential reflexive pronoun (see Figure 5b). (9)

Other middle situations of body-care action--bathing, combing one's hair, washing one's hands, and dressing oneself--are straightforward, where the agent's action deals with its own body or body part. Because an action confined to the agent's sphere typically affects the agent itself, this aspect of the middle--an effect accruing to the agent itself--plays an important role in framing certain actions of the middle. Greek middle expressions such as paraschesthai ti 'to give something from one's own means' and paratithesthai siton 'to have food served up' are a case in point. Here the actions actually extend beyond the agent's sphere, but their effects accrue on the agent in the manner of a typical middle depicted in Figures 5b and 5c. In other cases, the notion of the agent's personal sphere is more strictly adhered to, as in the following examples:

Sanskrit

(18) a. devadatto yajnadattasya bharyam upayacchati. (active)

Devadatta.NOM Yajnadatta.GEN wife have.relations.3SG.ACT

'Devadatta has relations with Yajnadatta's wife.'

b. devadatto bharyam upayacchate. (middle)

Devadatta.NOM wife have.relations.3SG.MID

'Devadatta has relations with his (own) wife.'

(Klaiman 1988: 34)

Sanxiang Dulong/Rawang

(19) a. an⁵³ a³¹dzwl³¹ a³¹be[??]⁵⁵. (active)

3SG mosquito hit

'S/he is hitting the mosquito.'

b. an⁵³ a³¹dzwl³¹ a³¹be[??]⁵⁵-cm³¹. (middle)

3SG mosquito hit-MID

'S/he is hitting the mosquito (on her/his body).'

(LaPolla 1996: 1945)

The active/middle opposition is diagrammatically shown as above, where the dotted circles and arrows represent the agent's personal sphere and actions respectively.

The conceptual basis of the active/middle opposition can then be formulated in terms of the manner of the development of an action, as follows:

Active/middle opposition:

Active: The action extends beyond the agent's personal sphere and achieves its effect on a distinct patient.

Middle: The development of an action is confined within the agent's personal sphere so that the action's effect accrues on the agent itself.
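The two definitions can be sketched informally as a test on where the effect of an action lands. The Python fragment below is a minimal sketch under assumptions of our own: the personal sphere is modeled as a plain set of entities (body parts, possessions, kin), and the function name `active_or_middle` is hypothetical. What belongs to the sphere is language-specific and is simply stipulated here.

```python
from typing import Optional, Set


def active_or_middle(agent: str, affected: Optional[str], sphere: Set[str]) -> str:
    """Return 'middle' if the effect stays inside the agent's personal sphere."""
    # No distinct patient (e.g. sitting, walking), the agent itself, or
    # something inside the stipulated sphere all count as confinement.
    if affected is None or affected == agent or affected in sphere:
        return "middle"
    return "active"


# 'John hit Bill' vs. 'John hit himself'
hit_other = active_or_middle("John", "Bill", sphere=set())
hit_self = active_or_middle("John", "John", sphere=set())
# Sanskrit (18b): the agent's own wife is treated as inside his sphere.
own_wife = active_or_middle("Devadatta", "own wife", sphere={"own wife"})
# Simple intransitive activity such as sitting: no distinct patient.
sitting = active_or_middle("John", None, sphere=set())
```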

Defining the middle voice domain in terms of confinement of an action within the sphere of the agent affords a unified treatment of various types of middle construction. Just as in the case of causatives, middle constructions come in three types--lexical, morphological, and periphrastic--both within individual languages and across different ones. Balinese, for example, exhibits all three types of middle construction, allowing some situation types to be expressed either morphologically or periphrastically, as shown in Table 1.

Our approach to middle voice phenomena is more consistent than Kemmer's (1993), which distinguishes reflexive situations from other middle event types, although these two categories are assumed to form a continuum, as shown in the diagrammatic representation in Figure 6. In our approach, Kemmer's reflexive, middle, and single participant situation types all fall in the middle voice domain, as defined above. Kemmer's distinctions among these types appear to be partly based on the typical forms expressing them. Reflexive situations tend to be expressed periphrastically, as in the case of Balinese nyagur awak 'hit oneself'. Kemmer's middle situation types are typically expressed morphologically, as in ma-cukur 'shave' in Balinese, and single-participant events are typically expressed by forms without any middle markers, as in the Balinese lexical middle negak 'sit'.

[FIGURE 6 OMITTED]

Kemmer arrives at her classification of event types as a result of her decision to "[deal] with ... middle-marking languages, or languages with overt morphological indications of the middle category" (Kemmer 1993: 10; bold face original, underline added). As pointed out in the discussion of causatives above, a strict form-based approach to the middle voice tends to focus on morphological middles, which is similar to the narrow treatment of morphological causatives, ignoring other possible form types. Such an approach would consider the Tarascan (Mexico) form ata-kurhi 'hit oneself' and the Quechua form maqa-ku 'hit oneself' as middles, while treating the English and Balinese equivalents hit oneself and nyagur awak as distinct reflexives. Perhaps Kemmer would consider oneself and awak here as "overt morphological indications of the middle category." But then, why is she distinguishing reflexive situations from the middle situations in her diagram reproduced in Figure 6? Also, what of the German form aufstehen 'stand up', which shows no middle marking? Is it not a middle because it lacks any morphological marking? It is semantically equivalent to the Balinese middle form ma-jujuk 'stand up'.

A more systematic typological investigation of the form-function correlation can be achieved if variation in form is taken as a function of the "naturalness" of the middle action. Natural middle actions--for example, sitting and walking--tend to be lexicalized as intransitive verbs, while actions typically directed to others--for example, hitting and kicking--tend to be expressed by periphrastic constructions involving a reflexive form when they are confined within the agent's personal sphere. What Kemmer (1993) has identified as middles--morphological middles--center on those actions that people typically apply to themselves, but that are applied to others often enough. (10) One must, however, realize that there are both intra- and crosslinguistic variations--such that in Balinese ma-jujuk 'stand up' has a morphological middle prefix, but negak 'sit (down)' is simply lexical. The same marking pattern is reversed in German, where sich hinsetzen 'sit down' has a middle marker, but aufstehen 'stand up' does not. These irregularities require individual accounts, based on historical, cognitive, and even cultural data.

The middle voice system has several important implications for our general understanding of the nature of voice phenomena. Recall that most of the widely received definitions of voice (such as the one quoted from Crystal [2003] in Note 1) hold that voice opposition does not entail a meaning contrast. This is not the case for the active/middle opposition, as shown by the examples above as well as by the contrast between the English active form John hit Bill and the middle form John hit himself.

Secondly, these examples show that voice alternations do not necessarily alter argument alignment patterns. There is no change in grammatical relation in the contrastive pairs in (18) and (19). If the situations depicted there give the impression of unusual utterances, consider the mundane situations described by the following Greek examples, where a meaning contrast is expressed without a realignment of arguments:

Ancient Greek

(20) a. louo khitona. (active)

wash.1SG.ACT shirt.ACC

'I wash a shirt.'

b. louomai khitona. (middle)

wash.1SG.MID shirt.ACC

'I wash my shirt/I wash a shirt for myself.'

While morphological middle constructions in some languages are strictly intransitive (as in the case of the Balinese ma-), and middles derived via the decausative function (as in the Greek forms poreusai 'to cause to go, to convey': poreusasthai 'to go' and kaiein 'to light, kindle': kaiesthai 'to be lighted, to burn') are intransitive, intransitivity is not a defining property of middle constructions. A large number of languages allow middle constructions that are syntactically transitive, as shown in the examples above and (21b) below, where the direct object is clearly marked by the accusative case suffix -n.

Amharic

(21) a. lemma te-lac' ce.

Lemma MID-shave.PERF.3M

'Lemma shaved himself.'

b. lemma ras-u-n te-lac' ce.

Lemma head-POSS.3M-ACC MID-shave.PERF.3M

'Lemma shaved his head.'

(Amberber 2000: 325, 326)

The general tendency for morphological middles to be intransitive is best viewed as the result of historical processes responding to the pressure on the form to conform to the semantic intransitivity, which characterizes middle events. This is exactly what has happened to many of the middle forms expressing reflexive middle situations in European languages, where the relevant affixes evolved from reflexive pronouns in the parent languages. The course of this development can be illustrated by using synchronic data below, where the Swedish example shows an intermediate clitic stage, the Russian form sebja exemplifies the earliest transitive pattern, and -s' (or -sja) the advanced fused pattern.

(22) a. Ivan ubi-l sebj-a. (Russian)

Ivan kill.PERF-PAST.SG.MASC self-ACC

'Ivan killed himself.'

b. Hon kamma-de sig. (Swedish)

she comb-PAST MID

'She combed.'

c. Ona prichesa-l-a-s'. (Russian)

she comb-PAST-FEM-MID

'She combed.'

Finally, in recognizing intransitive and transitive verbs as lexicalized middle and active voice forms, we elevate the active/middle contrast to the status of a central voice opposition observed in all human languages (cf. Dixon's [1979: 68-69] observation that "all languages appear to distinguish activities that necessarily involve two participants from those that necessarily involve one ... Then all languages have classes of transitive and intransitive verbs, to describe these two classes of activity"). (11)

Let us now turn to the antipassive voice. As the name suggests, the syntactic properties of antipassive constructions mirror somewhat those of passives, but the semantic aspect is different in these two voices. In the case of the passive, there is no implication that an agent is not somehow fully involved in the action. Indeed, full involvement of an agent is a crucial feature distinguishing the passive (e.g. John was killed while he was asleep) from the spontaneous middle (e.g. John died while he was asleep). Antipassive situations contrast in meaning with those expressed in the active and the ergative voice regarding the attainment of the intended effect upon a patient, however.

The intended effect of an action on a patient differs depending on the verb type. With contact verbs, the antipassive presents a situation as failing to make contact, as in the following examples:

Chukchee

(23) a. elteg=e keyn=en penre-nen.

father=ERG bear=ABS attack=3SG:3SG/AOR

'The father attacked the bear.'

b. elteg=en penre=tko=g[??]e

father=ABS attack=APASS=3SG.AOR

keyn=ete. (antipassive)

bear=DAT

'The father rushed at the bear.'

(Kozinsky et al. 1988: 652)

Warlpiri

(24) a. nyuntulu-rlu [??]-npa-ju pantu-rnu ngaju.

you-ERG [??]-2SG.A-1SG.P spear-PAST I.ABS

'You speared me.'

b. nyuntulu-rlu [??]-npa-ju-rla pantu-rnu

you-ERG [??]-2SG.A-1SG-DAT spear-PAST

ngaju-ku. (antipassive)

I-DAT

'You speared at me; you tried to spear me.'

(Dixon 1980: 449)

According to Dixon (1980: 449), (24b) above "indicates that the action denoted by the verb is not fully carried out, in the sense that it does not have the intended effect on the entity denoted by the object [read "patient", MS]." Similarly, visual contact is not made when situations involving visual perception are presented in the antipassive voice:

Warrungu

(25) a. nyula nyaka+n wurripa+[??].

3SG.NOM see+P/P bee+ABS

'He saw bees.'

b. ngaya nyaka+kali+[??] wurripa+wu katyarra+wu.

1SG.NOM see+APASS+P/P bee+DAT possum+DAT

'I was looking for bees and possums.'

(Tsunoda 1988: 606)

Moreover, for action types affecting a patient, the antipassive voice presents a situation as NOT affecting the patient in totality, as in the following examples:

Samoan

(26) a. Sā 'ai e le teine le i'a.

PAST eat ERG ART girl ART fish

'The girl ate the fish.'

b. Sā 'ai le teine i le i'a.

PAST eat ART girl LOC ART fish

'The girl ate some (of the) fish.'

(Mosel and Hovdhaugen 1992: 108)

The voice parameter focusing on the ergative/antipassive contrast can be formulated as below:

Ergative/antipassive opposition:

Does the action develop to its full extent and achieve its intended effect on a patient?

Yes → ergative(/active)

No → antipassive
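Because the intended effect differs with the verb type, the parameter can be sketched informally as a lookup followed by a yes/no test. The Python fragment below is a minimal sketch only: the table `INTENDED_EFFECT`, its labels, and the function name are hypothetical names chosen for illustration, covering just the three verb types discussed in the examples above.

```python
from typing import Set

# Hypothetical labels for the intended effect of each verb type.
INTENDED_EFFECT = {
    "contact": "contact_made",                # e.g. attacking, spearing
    "perception": "percept_obtained",         # e.g. seeing
    "affecting": "patient_totally_affected",  # e.g. eating
}


def ergative_or_antipassive(verb_type: str, achieved: Set[str]) -> str:
    """Ask whether the action achieved its intended effect on the patient."""
    intended = INTENDED_EFFECT[verb_type]
    return "ergative/active" if intended in achieved else "antipassive"


# Warlpiri (24a) 'You speared me' vs. (24b) 'You speared at me'
speared = ergative_or_antipassive("contact", {"contact_made"})
speared_at = ergative_or_antipassive("contact", set())
# Samoan (26b) 'The girl ate some (of the) fish': patient not totally affected
ate_some = ergative_or_antipassive("affecting", set())
```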

Notice that in (24b) an antipassive event is conveyed solely by the case marking on the patient, underscoring our earlier point that voice may be manifested in a nominal element denoting the relevant participant. In the case of the antipassive, the status of the patient is at issue, and antipassivization iconically affects the form of the patient nominal--either case marking it differently from the active/ergative (a case of the so-called differential object marking [Moravcsik 1978]), or avoiding coding it (examples below).

As conceived here, both the middle and the antipassive relate to the nature of the development of an action. Specifically, both have the ontological feature of an action not (totally) affecting a distinct patient. The conceptual affinity between the two explains the middle/antipassive polysemy seen in a fair number of languages. Observe:

Yidiny

(27) a. wagu:da bambi-dinu.

man.ABS cover-MID

'The man covered himself.'

b. wagu:da wawa-:dinu gudaganda.

man.ABS saw-APASS dog.DAT

'The man saw the dog.'

(Dixon 1977: 277, 280)

Balinese

(28) a. Ia sedek ma-sugi.

3SG ASP MID-wash.face

'She is washing her face.'

b. Tiang ma-daar.

1SG APASS-eat

'I ate.'

(Shibatani and Artawa 2002)

Russian

(29) a. Ivan mojetsja mylom.

Ivan wash.MID soap.INSTR

'Ivan washed himself with soap.'

b. Babuska rugajetsja.

granny.NOM scold.APASS

'Granny is scolding.'

(Geniusiene 1987: 9)

In addition, languages may show the well-known connection between the middle and the passive (12) through the use of the same form as the antipassive, thus illustrating a three-way middle-passive-antipassive polysemy:

Russian (cf. the examples immediately above)

(30) Dom stroitsja turezk-oj firm-oj

house.NOM is.being.built.PASS Turkish-INST firm-INST

INKA.

INKA

'The house is being built by the Turkish company INKA.'

Kuku Yalanji

(31) a. karrkay julurri-ji-y. (middle)

child.ABS wash-MID-NONPAST

'The child is washing itself.'

b. warru (yaburr-ndu) bayka-ji-ny. (passive)

young man.ABS shark:LOC:pt bite-PASS-PAST

'The young man was bitten (by a shark).'

c. nyulu dingkar minya-nga nuka-ji-ny. (antipassive)

3SG.NOM man.ABS meat-LOC eat-APASS-PAST

'The man had a good feast of meat (he wasted nothing).'

(Patz 1982: 244, 248, 255)

3.3. The termination of action parameter

In a regular transitive event, an action terminates in a patient. However, the action may extend beyond the patient and affect an additional entity, which then functions as a new terminal point. Benefactives/malefactives and applicatives express this kind of situation. The relevant parameter can be formulated in the following form:

Benefactive/malefactive/applicative parameter:

Does the action develop further than its normal course, such that an entity other than the direct event-participants becomes a new terminal point registering an effect of the action?

No → active/middle

Yes → benefactive/malefactive/applicative
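Read informally, the parameter asks whether anything outside the set of direct event-participants registers an effect. The Python fragment below is a minimal sketch under that reading: the function name `termination_voice` and the representation of participants as sets of strings are assumptions made purely for illustration.

```python
from typing import Set


def termination_voice(direct_participants: Set[str], affected: Set[str]) -> str:
    """Check whether the action reaches a new terminal point."""
    # Entities that register an effect but are not direct event-participants.
    new_terminal_points = affected - direct_participants
    if new_terminal_points:
        return "benefactive/malefactive/applicative"
    return "active/middle"


# Plain transitive buying: the action terminates in the book.
plain_buying = termination_voice({"Taro", "book"}, affected={"book"})
# Japanese (32b): the buying extends beyond the book and affects Hanako.
buying_for_hanako = termination_voice({"Taro", "book"}, affected={"book", "Hanako"})
```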

While the notion of benefit-giving is a broad one, there is one particular type with a perceptible change in the beneficiary. This is the case involving transfer of an object, where the object itself is directly affected by the act of giving. In a typical giving situation, the object is physically moved from one owner to a new one. The recipient beneficiary is secondarily affected because it comes into possession of the transferred object. Languages often have a special benefactive construction that portrays this type of situation, where the effect on the beneficiary is indicated by its argument status in syntactic coding. As shown in Shibatani (1996), benefactive constructions are typically based on the syntactic schema of the give-construction even involving the verb form for giving in some languages, as in the case of Japanese seen below:

(32) a. Taroo-wa Hanako-ni hon-o yat-ta.

Taro-TOP Hanako-DAT book-ACC give-PAST

'Taro gave Hanako a book.'

b. Taroo-wa Hanako-ni hon-o kat-te

Taro-TOP Hanako-DAT book-ACC buy-CONJ

yat-ta.

BEN-PAST

'Taro bought Hanako a book.'

In (32b) the buying action is extended beyond the patient (the book), and affects the beneficiary nominal (Hanako) coded in the dative form. Compare this construction to the one below, expressing a more general benefit-giving in which the beneficiary takes on a nonargument form.

(33) Taroo-wa Hanako-no tame-ni hon-o

Taro-TOP Hanako-of sake-for book-ACC

kat-te yat-ta.

buy-CONJ GIVE-PAST

'Taro bought a book for (the sake of) Hanako.'

While (33) may express any type of benefit-giving--including one of buying a book to help Hanako's book-selling business--(32b) specifically conveys the meaning that the transfer of the book was intended. Note also the English translations accompanying these examples, which show the same contrast.

Benefactive/malefactive events are also realized by so-called external possession constructions in Indo-European and some other languages (cf. Payne and Barshi 1999), although the context may determine whether or not a clear benefactive/malefactive reading obtains from them. When a body part is involved as the primary patient (cf. below), the benefactive/malefactive reading is not strongly pronounced beyond that which is conveyed by the verb; cf. (34) and (35a):

German

(34) Ich wasche mir die Hände.

I wash I.DAT the hands

'I wash my hands.' (lit. 'I wash me the hands.')

(35) a. Man hat ihm den Arm gebrochen.

lit. 'They broke him the arm.'

b. Man hat seinen Arm gebrochen.

'They broke his arm.'

Where inalienable possession is implicated as above, the dative nominal indicates that the action has affected it as a new terminal point of the action. In German, the external possession construction is generally obligatory when the affected body part is inalienably possessed; the extension of the action to its owner is inevitable under such circumstances. Indeed, an internal possession construction like (35b) suggests that the arm in question was detached, and no effect on its owner is asserted by such a sentence. Internal possession constructions involving inalienably possessed body parts, as in the English form I broke his arm, suggest that the arm's owner was affected, but the implication is obtained through a commonsensical world view. The dative construction (35a), on the other hand, asserts that the body part owner is affected by the action.

The benefactive/malefactive reading can be seen more readily in the following examples, where the dative nominal represents a mentally affected party:

French

(36) a. Jean lui a cassé sa vaisselle.

lit. 'Jean broke her her dishes.'

b. Jean a cassé sa vaisselle.

'Jean broke her dishes.'

Modern Hebrew

(37) a. ha tinok lixlex li et ha xulca.

the baby dirtied I.DAT ACC the shirt

'The baby dirtied the shirt on me.'

b. ha tinok lixlex et ha-xulca shel-i.

the baby dirtied ACC the-shirt of-me

'The baby dirtied my shirt.'

(Berman 1982; T. Gibon pers. comm.)

Where the possession is alienable, as in these examples, a malefactive meaning obtains more readily. The trade-off between inalienability and affective reading shows that a principle of relevance is at work in these constructions: the relevance of the dative arguments to the event must be somehow "guaranteed." Involvement of an inalienably possessed object guarantees the relevance of the possessor to the event, since whatever happens to the body part will affect its possessor automatically. When an inalienable possession relation does not obtain--as in (36a) and (37a)--a benefactive/malefactive effect upon the dative argument is pronounced as a way of establishing its relevance to the event. The attendant interpretation that a possessive relation exists contributes to the establishment of the affective relationship; the owner of an object is more easily affected by what happens to its possession.

Contrary to what the label suggests then, so-called external possession constructions DO NOT assert a possessive relation between the dative argument and the directly affected patient. Indeed, the relevant constructions arise independently from externalization of the possessor, as in the German example below (also in [36a] above), or when the notion of possession is irrelevant, as in the following examples (40)-(41) from River Warihio (Uto-Aztecan): (13)

German

(38) Peter repariert mir mein Fahrrad.

'Peter fixes me my bicycle.'

River Warihio

(39) a. hustina pasu-re muni kukuci icio.

Agustina cook-PERF beans children BEN

'Agustina cooked beans for the children.'

b. hustina pasu-ke-re muni kukuci.

Agustina cook-BEN-PERF beans children

'Agustina cooked beans for the children.'

(40) maniwiri no'o wikahta-ke-ru yoma aari.

Manuel 1SG.NS sing-BEN-PERF all afternoon

'Manuel sang all afternoon for me.'

(41) tapana no'o yuku-ke-ru.

yesterday 1SG.NS rain-BEN-PERF

'Yesterday it rained on/for me.'

(Felix 2005: 253, 257, 258)

That the condition of physical proximity should be more important than the possessive relation in inducing a benefactive/malefactive construction is shown by the following River Warihio examples (see Shibatani 1994 for other cases):

(42) a. maniwiri ihcorewapate-re wani pantaoni-ra.

Manuel get.dirty-PERF John jeans-POSS

'Manuel dirtied John's jeans.' (John's jeans were over the chair.)

b. maniwiri ihcorewapate-ke-re pantaoni wani.

Manuel get.dirty-BEN-PERF jeans John

'Manuel dirtied John's jeans.' (John was wearing his jeans.)

In general, applicative constructions have been considered as syntactic valency-increasing operations that are pragmatically motivated (see Peterson 1999). Our claim is that their conceptual basis is rooted in the ontological feature of an action, as stated in the voice parameter above. Peterson's (1999) survey shows that certain applicatives are more basic and prevalent than others. In the words of Peterson (who lumps benefactives and applicatives together), "the locative and circumstantial applicatives depend on the presence of other applicative constructions, while benefactive and instrumental/comitative applicatives do not. That is, there are two core applicative constructions, benefactive and instrumental/comitative, and these serve as anchors as it were for the development of additional applicative constructions marked either by the same or distinct morphology" (Peterson 1999: 135). This observation is consistent with our view of the benefactive/applicative voice. Benefactive and instrumental/comitative participants are much more directly involved in the event than a causal factor or a setting entity such as a location, and hence are much more likely to be affected by the action. That the benefactive applicative is obligatory in some languages also underscores the point regarding the affected nature of the recipient beneficiary (cf. above).

In the past, grammarians may not have paid sufficiently close attention to the subtle meaning differences that exist between applicative constructions and their nonapplicative counterparts. However, recent descriptions of applicative constructions have begun to notice some revealing semantic effects. For example, Donohue (1999) shows that the Tukang Besi comitative applicative conveys a meaning whereby the applied comitative nominal is actively engaged in the event: (14)

Tukang Besi

(43) a. No-moturu kene wowine ane ke hotu mopera.

3R-sleep and woman exist and hair short

'He slept with the woman with the short hair.'

(i.e. they were sleeping near each other.) (# they had sex together.)

b. No-moturu-ngkene te wowine ane ke hotu

3R-sleep-COM CORE woman exist and hair

mopera.

short

'He slept with the woman with the short hair.' (i.e. they had sex together.)

(Donohue 1999: 231)

The following instrumental applicative from Pulaar also demonstrates how an applied instrumental nominal can implicate a participant more thoroughly affected by the agent's action:

Pulaar

(44) a. mi loot-ii min am a

1SG wash-PERF.ACT y.s. 1SG.POSS PREP

saabunnde hee.

soap DET

'I washed my younger sibling with (some of) the soap.'

b. mi loot-r-ii min am

1SG wash-INST-PERF.ACT y.s. 1SG.POSS

saabunnde hee.

soap DET

'I washed my younger sibling with (all of) the soap.'

(Sebastian Ross-Hagebaum pers. comm.)

The various effects of locative applicatives have also been recognized in the literature. The Balinese locative expression in (45b) below, for example, describes a situation where the action of planting banana trees extends in such a way as to affect the garden. Here the entire garden ends up being planted with banana trees, while no such implication is made in the nonapplicative counterpart (45a).

Balinese

(45) a. Tiang mulan biyu di tegalan tiang-e.

1SG plant banana in garden 1SG-POSS

'I planted bananas in my garden.'

b. Tiang mulan-in tegalan tiang-e biyu.

1SG plant-APPL garden 1SG-POSS banana

'I planted my garden with bananas.'

(I Wayan Arka pers. comm.)

1. Introduction

Current studies on voice phenomena suffer from a number of inadequacies at several levels of description and explanation. At the most fundamental level, there is no coherent conceptual framework that adequately addresses the matter, such that we are often left to wonder whether or not a given phenomenon falls in the domain of voice. For one thing, people differ in the treatment of causative and reflexive constructions; some consider them to represent voice categories, while others do not. Still others avoid raising the issue at all. Various definitions currently offered are of little use, as they are typically based on an Indo-European active/passive opposition, and arbitrarily include or exclude a particular phenomenon from the domain of voice. (1)

Properly identifying construction types representing a voice sub-domain is also a serious problem. In Crystal's (2003) definition (cf. Note 1) reflexives are not recognized as proper voice constructions and their relationship to the middle voice is not entirely clear. A similar problem is seen in Kemmer's (1993) extensive study of middle voice constructions.

There are also severe limitations at the level of explanation. Closer to the main theme of this volume is the problem of understanding the increases and decreases in valency and accompanying changes in argument structure observed in voice phenomena. Why do certain phenomena (e.g. the causative and applicative) show an increase in valency, while others (e.g. the passive and antipassive) typically have a valency-reducing effect? What motivates these valency changes in opposite directions?

Functional explanations regarding the distribution of certain voice constructions go a long way toward an explanatory functional study of grammatical phenomena (cf. Haiman 1985). However, being largely based on formal properties such as "linguistic distance" and "full" vs. "reduced form," these explanations are not functional enough to be able to make more general predictions. (2)

The problems outlined above largely stem from two related methodological issues. One is the lack of a coherent conceptual framework for characterizing and analyzing voice phenomena; the other is an over-reliance on formal properties in both analysis and explanation. Clearly the latter problem is caused by the former and by the lack of commitment to the cognition-to-form approach in linguistic analysis. (3) The purpose of this article is thus to lay out a conceptual framework that coherently delineates the domain of voice, which embraces both those phenomena that are traditionally recognized as falling in the voice domain and those that have been kept in limbo. The framework required must deal with the fact that voice phenomena straddle the semantics-pragmatics boundary: the active/middle opposition is basically conceptual or semantic, whereas the active/passive opposition is largely pragmatic. We endeavor to unify these manifestations of voice function by assuming that the pragmatic relevance of clausal units is semantically determined in the first place.

The conceptual foundations of voice can only be arrived at by inspecting contrasting phenomena across languages. Our initial task is therefore to learn how a given language, using its own resources, achieves the goal of expressing a relevant conceptual opposition found in another language. While the ultimate goal of functional typology is to discover the correlative patterns between form and function, this article is concerned primarily with the initial task of postulating conceptual bases of voice phenomena and identifying constructions across languages that express the relevant oppositions.

One final introductory remark is due regarding the controversy over the question of whether the formal relationships between opposing voice categories should be treated as inflectional or derivational. We consider this question to be academic in the absence of rigorous definitions for these processes. In the realm of voice phenomena, some systems, for example, the Ancient Greek active/middle system, incorporate voice morphology in their inflectional paradigm. Others like the English active/passive opposition do not show a simple morphological relationship--inflectional or derivational--since constructions as a whole enter into the formal opposition. The regularity or productivity of the pattern is often taken to be an important criterion distinguishing inflections from derivations; the former are thought to be regular and obligatory, while the latter allow exceptions. But regularity in natural language is always relative, and so are the patterns of voice oppositions. Even among the known ones, nothing is one hundred percent regular. An alternation that is well-integrated within the inflectional paradigm may show irregularity. In Ancient Greek, for example, we find both active forms that do not have middle counterparts (activa tantum) and middle forms lacking the corresponding active (media tantum). The active/passive opposition varies in the same way: some languages show a high degree of regularity without ever reaching one hundred percent (as in the case of English), while others place much severer limitations on the range of permissible passive constructions.

2. The evolution of an action: voice, transitivity, and aspect

The basic claim of this article is that major voice phenomena have their conceptual bases rooted in the human cognition of actions. Because such actions have various effects upon us, we have special interest in the way that they arise, how they develop, and the manner in which they terminate--what is referred to as the evolutionary properties or phases of an action in this article. Through a system of grammatical oppositions, a language provides a means for expressing conceptual contrasts pertaining to the evolutionary properties of an action that the speaker finds relevant for communicative purposes. Among the evolutionary properties, voice is primarily concerned with the way event participants are involved in actions, and with the communicative value, or discourse relevance, that accrues to the event participants from the nature of this involvement.

Mention of the evolution of an action immediately brings to mind two other grammatical concepts, namely, transitivity and aspect. It is thus appropriate to clarify the relationships and differences between these notions. Traditionally, voice has been defined in reference to transitivity, or more narrowly in terms of the transitivity of a verb or clause; the active/passive opposition most typically obtains with transitive verbs. A more important connection between transitivity and voice, however, lies in the notion of semantic transitivity, rather than strictly verbal or clausal transitivity. Indeed, it is easy to see this connection in the work of Hopper and Thompson (1980), where many of the phenomena discussed in terms of transitivity are nothing but voice phenomena. This important article concludes the section on grammatical transitivity as follows: "It is tempting to find a superordinate semantic notion which will include all the Transitivity components. If there is one, it has so far not been discovered ..." (Hopper and Thompson 1980: 279). Our claim is that what they are looking for is a theory of voice. In fact, the work of Hopper and Thompson lays important groundwork for the study of voice. In this regard, Kemmer (1993: 247) is absolutely correct in noting that "the scale of transitivity ... forms the conceptual underpinning for voice systems in general, and for reflexive and middle marking systems in particular." (4) While none of these works makes it quite clear, voice is a system of correspondences between action or event types and syntactic structures. For example, what is known as the active voice is the pattern of correspondence between the high transitive event type or the prototypical transitive action and the nominative-accusative coding pattern of the event participants, as in the English active sentence She killed him (see Section 5 below).

The parameters of transitivity identified by Hopper and Thompson (1980) pertain to "different facet[s]" of "carrying-over or transferring an action from one participant to another" (Hopper and Thompson 1980: 253), and they in effect represent the evolutionary properties of an action, that is, they pertain to the way an action is brought about, to the way it is transferred to the second participant, and to the way it affects this participant. In order to bring grammar closer to cognition, we propose to examine specific evolutionary properties of an action pertaining to voice oppositions that are distilled as transitivity parameters in Hopper and Thompson (1980) and others dealing with the issues of transitivity.

If transitivity is integral to a theory of voice, how then do aspect and voice differ under the assumption that both are concerned with the way an action evolves? These two grammatical categories invite different kinds of questions. Aspect asks where the vantage point is with regard to the temporal structure of an action. When the action is viewed holistically encompassing all of its temporal phases, we obtain the perfective viewpoint of the described action. On the other hand, if specific sections of internal temporal structure are focused, we obtain various types of imperfective aspectual construal of an event. The contrast between the perfective and the imperfective aspects and the representative subcategories of the latter seen across languages are represented in Figure 1.

[FIGURE 1 OMITTED]

Voice, on the other hand, asks how an action evolves--that is, it asks about the nature of its origin, the manner in which it develops, and the way that it terminates. These evolutionary phases of an action and the various voice categories pertaining to them are depicted schematically in Figure 2.

[FIGURE 2 OMITTED]

3. Major voice oppositions and their conceptual bases

Under the present conception, the three principal evolutionary phases of an action--origin, development, and termination--form the basis for the major voice parameters. These parameters are generally expressible in the form of questions concerning the evolutionary properties of an action, as below:

Major voice parameters:

I. The origin of an action

(a) How is the action brought about?

(b) Where does the action originate?

(c) What is the nature of the agent?

II. The development of an action

How does the action develop?

(a) Does the action extend beyond the agent's personal sphere or is it confined to it?

(b) Does the action achieve the intended effect in a distinct patient, or does it fail to do so?

III. The termination of an action

Does the action develop further than its normal course, extend beyond the immediate participants of the event, and terminate in an additional entity?

Figure 2 summarizes the voice constructions pertaining to these parameters. Throughout the following discussion, we touch upon the theoretical consequences of this diagrammatic representation of the voice domain.

3.1. Parameters pertaining to the origin of an action

The first opposition to be examined has to do with the nature of the origin of an action--namely, whether the action in question is brought about volitionally or nonvolitionally by a human agent.

Volitional/spontaneous opposition:

Is the action brought about volitionally?

Yes → volitional

No → spontaneous

While not widely recognized as a voice opposition, this distinction has been recognized as such in the Japanese grammatical tradition, perhaps because the suffix for the spontaneous voice is identical with that used in the passive construction. In fact, it is generally believed that the Japanese passive arose from the spontaneous construction. Languages (or grammarians' interpretations of the facts?) may differ with regard to the precise meaning contrast seen in the volitional/spontaneous opposition. In Japanese, the spontaneous construction expresses a situation where the agent does not intend to bring about an action, but where there is a circumstantial factor external to the agent that induces an action (such as eating "dancing-mushrooms" as in [1b] below). In other languages, a spontaneous form conveys the meaning of an action accidentally brought about. Other manifestations of the opposition may be alternatively expressed in terms of such notions as intentional/unintentional or controlled/uncontrolled, but we shall take the position that these contrasts are included in the basic function of the volitional/spontaneous opposition. That is, by "volitional voice" we mean a connection between a particular syntactic form and a type of action that is brought about by the willful involvement of an agent who "intends the action," and sees to it that the intended effect is achieved. Departure from this action type in any significant way may be construed as constituting a spontaneous action, expressed by a construction formally contrasting with the volitional construction.

In Modern Japanese, the domain of the volitional/spontaneous opposition has shrunk to such an extent that mental activities are the only ones where the contrast is readily observed, with the spontaneous morphology (-re/-rare) having generally given way to a passive interpretation in the domain of physical actions. In Classical Japanese (ninth-twelfth centuries), the volitional/spontaneous opposition was more widely observed, as in the following examples:

Classical Japanese

(1) a. Kikori-domo mo mai-keri. (volitional)

wood cutter-PL also dance-PAST

'Wood cutters also danced.'

b. Kikori-domo mo mawa-re-keri. (spontaneous)

wood cutter-PL also dance-SPON-PAST

'Wood cutters also danced willy-nilly.'

Spontaneous expressions in Japanese typically do not contain an agent in subject position. Because information regarding the volitional status of an agent is most readily accessible to the speaker, the volitional/spontaneous distinction is typically made with reference to a first person agent; accordingly, the missing agent is understood to be the speaker unless otherwise specified. This non-coding of an agent in subject position paved the way for a spontaneous expression where a patient nominal is coded in subject position, as in the following spontaneous construction (2b). Undoubtedly, this was an important step in the development of the passive from the spontaneous construction.

Modern Japanese

(2) a. Boku-wa yoku mukasi-no-koto-o

I-TOP often old days-GEN-things-ACC

omo-u. (volitional)

think-PRES

'I often think about the things of the old days.'

b. Saikin mukasi-no-koto-ga yoku

recently old days-GEN-things-NOM often

omowa-re-ru. (spontaneous)

think-SPON-PRES

'Recently the things of the old days often come to mind.'

Since the volitional/spontaneous opposition is not widely recognized as a voice phenomenon, it is perhaps worth spending some time showing how widespread in the world's languages it actually is. As in other voice sub-domains, languages make use of different resources in expressing the volitional/spontaneous opposition. Indonesian and Malay use the multifunctional prefix ter- to express unintended or accidental actions:

Indonesian

(3) a. Ali memukul anak-nya. (volitional)

Ali AF.hit child-3SG.POSS

'Ali hit his child.'

b. Ali ter-pukul oleh anak-nya. (spontaneous)

Ali SPON-hit PREP child-3SG.POSS

'Ali accidentally hit his child.'

(I Wayan Arka pers. comm.)

Winstedt (1927: 86-87) characterizes the function of ter- in Malay as denoting an action due "not to conscious activity on the part of the subject, but to external compulsion or accident." It is noteworthy that spontaneous constructions in both Japanese and Indonesian/Malay have an affinity with the passive in that they share the same affix in these languages. Compare the spontaneous constructions above with the passives in Japanese and Indonesian below:

(4) a. Taroo-wa Ziroo-ni nagura-re-ta. (Japanese passive)

Taro-TOP Jiro-by hit-PASS-PAST

'Taro was hit by Jiro.'

b. rumah itu tidak ter-beli oleh

house that NEG PASS-buy PREP

saya. (Indonesian passive)

1.SG

'The house cannot be bought by me.'

The diagrammatic representation of voice constructions in Figure 2 can be thought of as a semantic map, where different constructions are distributed over relevant territory within the voice domain. This is a useful way of representing conceptual affinities among various voice constructions, but its utility is predicated only on a comprehensive view of voice as advocated in this article. Spontaneous and passive are both concerned with the origin of an action. What they share is the idea that this lies NOT in the pragmatically most relevant participant; in the case of the passive, it is the agent of low discourse relevance and in the spontaneous case, it is the external circumstance.

The map in Figure 2 also shows the "neighboring" relationship between the spontaneous, the middle, and the antipassive. In Russian and a number of Australian languages, middle forms are recruited for the volitional/spontaneous contrast, as in the following examples:

Russian

(5) a. Kostja poreza-l xleb.

Kostja cut.PERF-PAST.SG.MASC bread

'Kostja cut the bread.'

b. Kostja poreza-l-sja.

Kostja cut.PERF-PAST.SG.MASC-SPON

'Kostja has [accidentally] cut himself.'

(Vera Podlesskaya pers. comm.)

Diyari

(6) a. natu yinana danka-na wara-yi.

1SG.ERG 2SG.O find-PARTC AUX-PRES

'I found you (after deliberately searching).'

b. nani danka-tadi-na wara-yi yinkangu.

1SG.ABS find-SPON-PARTC AUX-PRES 2SG.LOC

'I found you (accidentally).'

(Austin 1981: 154)

Another favorite source for the spontaneous construction--especially prominent among Indo-Aryan and Dravidian languages of India--is the so-called dative-subject construction, which typically expresses uncontrollable states:

Sinhala

(7) a. mame ee wacene kiwwa.

I.NOM that word say.PAST

'I said that word.'

b. mate ee wacene kiyewuna.

I.DAT that word say.P.PAST

'I blurted that word out.'

(Gair 1990: 17)

The adaptation of the dative-subject construction for a spontaneous action is also seen when the "dative-subject" is marked by cases different from the dative, as in the following Bengali examples, where the nominal form corresponding to the dative subject is marked with the genitive. Here the volitional/spontaneous contrast takes on interesting nuances:

Bengali

(8) a. Ami toma-ke khub pɔchondo kor-i.

1SG.NOM 2ORDSG-OBJ very liking do-PRES.1

'I like you very much.' (According to my own criteria.)

b. Ama-r toma-ke khub pɔchondo

1SG-GEN 2ORDSG-OBJ very liking

hɔy.

become-PRES.3ORD

'I like you very much.' (According to some [socially] set criteria.)

(Onishi 2001: 120)

When the basic meaning of the verb denotes a spontaneous (involuntary) action, the volitional voice form can be obtained by using a self-benefactive construction, as in Marathi and other Indo-Aryan languages:

Marathi

(9) a. sitaa raD-l-i.

Sita.NOM cry-PERF-F

'Sita cried.'

b. sitaa-ne raD-un ghet-l-a.

Sita-ERG cry-CONJ take-PERF-N

'Sita cried (so as to relieve herself).'

(Prashant Pardeshi pers. comm.)

Lhasa Tibetan has a set of auxiliaries expressing different categories of perspective. "Perspective-choice" interacts with both person and evidential categories in a complex way, but the relevant auxiliaries can be divided into a "self-centered" and an "other-centered" group (Denwood 1999). Verbs denoting such intentional actions as reading and dancing normally occur with self-centered auxiliaries when used with first person subjects. They can be made nonintentional or spontaneous with the use of other-centered auxiliaries, as in the following examples:

Tibetan

(10) a. ngas. yi.ge, klog.ba yin.

I-SMP letter read-LINK-AUX (self-centered)

'I read the letter (on purpose).'

b. ngas. yi.ge, klog.song.

I-SMP letter read-AUX (other-centered)

'I read the letter (without meaning to).'

(Denwood 1999: 137)

Conversely, although unintentional verbs expressing involuntary actions such as coughing and seeing normally occur with other-centered auxiliaries, they can be rendered volitional by the use of self-centered auxiliaries:

Tibetan

(11) a. glo. rgyab.byung.

cough-AUX (other-centered)

'I coughed (involuntarily).'

b. glo. rgyab.pa.yin.

cough-LINK-AUX (self-centered)

'I coughed (deliberately).'

(Denwood 1999: 139)

A similar pattern is observed in Newar (Tibeto-Burman), where the relevant contrast is expressed in terms of a distinction between conjunct and disjunct verbal endings--apparently an evidentiality-related phenomenon. Note that only clauses with first person subjects allow this contrast to be expressed.

Newar

(12) a. ji-n kayo tachya-na

1SG-ERG cup break-PC

'I broke the cup (deliberately).'

b. ji-n kayo tachya-ta

1SG-ERG cup break-PD

'I broke the cup (accidentally).'

(Kansakar 1999: 428)

Finally, the phenomenon now widely recognized under the name of "split intransitivity" is rooted in the volitional/spontaneous opposition. Observe first some well-known examples from Eastern Pomo below:

Eastern Pomo

(13) a. ha: c'e:xelka. (volitional)

1SG.A slip

'I am sliding.'

b. wi c'e:xelka. (spontaneous)

1SG.P slip

'I am slipping.'

(McLendon 1978: 1-3)

Although the verb forms are the same, when the pronominal form is inflected for the patient (13b), the sentence conveys a spontaneous action or a "lack of protagonist control" (McLendon 1978: 4). A similar contrast is seen in the Caucasian language Tsova-Tush (Batsbi), where "[the] referent of [an ergative] subject is a voluntary, conscious, controlling participant in the situation named by the verb" (Holisky 1987: 113).

Tsova-Tush (Batsbi)

(14) a. (as) vuiz-n-as.

1SG.ERG fall-AOR-1SG.ERG

'I fell down, on purpose.'

b. (so) voz-en-sO.

1SG.NOM fall-AOR-1SG.NOM

'I fell down, by accident.'

(Holisky 1987: 104)

In addition to these cases of "fluid-S" marking (Dixon 1994), split intransitivity may be realized as a lexically-conditioned phenomenon, where intransitive verbs are classified into an "agentive" class and a "patientive class." Agentive and patientive nominals respectively trigger marking similar to the corresponding arguments of a transitive clause. The Philippine language Cebuano shows this pattern through a focus system which is characteristic of Formosan and Western Austronesian languages:

Cebuano

(15) Transitive actor-focus construction

Ni-basa ako ug libro.

AF-read I.TOP INDEF book

'I read a book.'

(16) Transitive patient-focus construction

Gi-basa nako ang libro.

PF-read I TOP book

'I read the book.'

(17) a. Agentive intransitive

Ni-dagan ako. (actor-focus form)

AF-run I.TOP

'I ran.'

b. Patientive intransitive

Gi-kapoy ako. (patient-focus form)

PF-tired I.TOP

'I got tired/I am tired.'

Generalizing processes have the effect of obliterating the basic semantic motivation for distinguishing two classes of intransitive verbs; either the larger agentive or larger patientive class of intransitive verbs tends to have semantically heterogeneous verbs. Nevertheless, the split of intransitive verbs into two classes is rooted in the distinction between volitional and involuntary actions involving an animate protagonist. This is seen in a minority class of verbs, such that a minority agentive class contains verbs denoting controlled actions, and a minority patientive class includes verbs denoting involuntary states of affairs (see Merlan 1985). In Cebuano (and perhaps other Philippine languages as well) the larger agentive class includes verbs denoting uncontrolled events such as raining or slipping off, while the minority patientive class contains verbs that express strictly involuntary states of affairs such as being hungry, becoming tired, or contracting diseases.

The patterns of split intransitivity discussed here underscore an important point that we wish to advance in this article: voice can also be expressed by nominal forms. Traditionally, voice has been regarded as a verbal category. Indeed, many linguists take verbal marking or verbal inflection as the defining feature of voice. (5) We reject this restrictive view. As we define it, voice is concerned with the evolutionary properties of an action. It is typically marked on the verb because a verb expresses an action. Verbal voice marking is therefore simply a case of iconicity. An action, however, also involves participants such as agent and patient. Because an action occurs in relation to these protagonist participants, any form representing them could also bear voice marking. The volitional/spontaneous opposition manifested in nominal forms also reflects the underlying relationship between the origin of an action and the volitional status of the agent. (6) Nominal marking for certain voice contrasts is thus also motivated by the iconicity principle.

Let us now turn to the causative/noncausative opposition. As noted in the introduction, the causative has been problematic with respect to its status as a voice category. Widely-received definitions of voice, such as Crystal's in Note 1, maintain that voice oppositions do not entail a semantic contrast, which has prevented many grammarians from readily accepting causative/noncausative as one. As the above discussion on the volitional/spontaneous opposition shows, however, there is no reason to believe that voice is a semantically neutral phenomenon. As it happens, one of the oldest systems of voice contrast in Indo-European--the active/middle opposition--also involves a meaning contrast (see below). (7) The question concerning the causative/noncausative opposition (and other semantic oppositions) is whether the relevant contrasts can be naturally integrated into a coherent conceptual framework of voice. Our answer will be yes.

The causative/noncausative opposition pertains to the origin of an action; that is, whether the action originates with the agent of the main action or with another agent heading the action chain. The causative action chain is represented in Figure 3. (8)

[FIGURE 3 OMITTED]

In a noncausative situation, the initial agent (Agent₂) is also the agent of the main action. In a causative situation, the ultimate origin of the main action lies in the agent (Agent₁) heading the action chain, which is different from the agent (Agent₂) of the main action. The relevant parameter for the causative/noncausative distinction can be formulated as below:

Causative/noncausative opposition:

Does the action originate with an agent heading the action chain that is distinct from the agent or patient of the main action?

Yes → causative

No → noncausative

The contrast between a noncausative situation represented by an expression such as Bill walked and its causative counterpart expressed by a periphrastic causative form like John made Bill walk can thus be naturally captured in terms of the nature of the origin of an action. Situations expressed by lexical causatives such as John killed Bill have an (initial) agent distinct from the patient of the main action.
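The parameter above can be read as a simple decision procedure. The following Python sketch is purely illustrative: the `ActionChain` record and its field names are assumptions introduced here, not notation from the framework, and the participant labels are taken from the English examples just discussed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionChain:
    """Hypothetical record of an action chain (names are illustrative only)."""
    initiator: str                      # participant heading the action chain
    main_agent: Optional[str] = None    # agent of the main action, if any
    main_patient: Optional[str] = None  # patient of the main action, if any


def origin_voice(chain: ActionChain) -> str:
    """Apply the causative/noncausative parameter as worded in the text."""
    main_participants = {chain.main_agent, chain.main_patient} - {None}
    # "Yes": the chain is headed by an agent distinct from the agent or
    # patient of the main action.
    if chain.initiator not in main_participants:
        return "causative"
    return "noncausative"


bill_walked = ActionChain(initiator="Bill", main_agent="Bill")
john_made_bill_walk = ActionChain(initiator="John", main_agent="Bill")
john_killed_bill = ActionChain(initiator="John", main_patient="Bill")

for chain in (bill_walked, john_made_bill_walk, john_killed_bill):
    print(chain, "->", origin_voice(chain))
```

On this reading the periphrastic and the lexical causative receive the same classification, since the check looks only at who heads the chain and not at how the expression is formed.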

One of the important points of past studies of causative constructions has to do with the fact that a voice category can be expressed by a construction as a whole, rather than by local morphological entities such as verb inflection or nominal case marking. Lexical and periphrastic causative constructions such as John killed Bill and John made Bill walk are a case in point. They differ in form from morphological causatives such as Quechua wanu-ci (die-CAUSE) 'kill' and Japanese aruka-se (walk-CAUSE) 'make walk', where the causative meaning is expressed morphologically. Traditionally, grammarians have tended to consider only morphological causatives as proper cases. However, such a position leads to the uncomfortable decision of treating the Quechua and Japanese forms cited above as causative, while treating the semantically parallel English expressions kill and make walk as noncausative. The form-based treatment of causatives is tantamount to simply circumscribing morphological causatives, and does not lead to a comprehensive study of causative phenomena. Causation is a semantic, not a morphological notion, and as such the whole range of expression types must be taken into account in a satisfactory analysis. Indeed, a (functional) typological study is predicated on the view that a variety of expression types will obtain in any given conceptual domain. The formal tripartite pattern of lexical, morphological, and periphrastic causative constructions has now been widely accepted, and some revealing correlations between form and function have been identified in the causative domain (see Shibatani and Pardeshi 2002 on recent developments). We see below that a similar pattern holds in other voice domains as well.

Having discussed two voice phenomena pertaining to the origin of an action, we now turn to the next major voice parameter concerning its development. We will consider the other voices associated with the nature of the origin of an action--the passive and the inverse--after dealing with other conceptually-based voice phenomena.

3.2. Parameters pertaining to the development of an action

In this section we recognize at least two sets of contrastive patterns in the developmental phase of an action. One is concerned with whether the action develops beyond the personal sphere of the agent or is instead confined within it. The latter mode of development forms the conceptual basis of what is known as the middle voice. The other contrastive pattern of action development is concerned with whether or not the action has been successfully transferred to the patient and has achieved its intended effect. This contrast forms the conceptual basis for the ergative/antipassive opposition.

The active/middle voice opposition is best known from studies of classical Indo-European languages such as Ancient Greek and Sanskrit, and calls for a broad understanding of the notion of action confinement in the agent's personal sphere. The clearest case in which the development of an action is confined to the agent's sphere is when simple intransitive activities, such as sitting and walking, are lexicalized as intransitive verbs. Here the development of the action is clearly confined within the domain of the agent, as shown in the schematic representation in Figure 5a. These situations contrast with active (causative) situations (e.g. John sat his son in the chair and John made his son walk) where the relevant actions involve an agent that instigates an action which develops outside the (initial) agent's domain (see Figure 4). In the words of Benveniste (1971 [1950]: 148): "In the active, the verbs denote a process that is accomplished outside the subject. In the middle, which is the diathesis to be defined by the opposition, the verb indicates a process centering in the subject, the subject being inside the process."

[FIGURES 4-5 OMITTED]

Reflexive situations also constitute one of the middle action types, since here the action is also confined within the agent's personal sphere. The active expression John hit Bill contrasts with the reflexive expression John hit himself, where the confinement of the hitting action within one's personal sphere (e.g. hitting one's head or body) is marked by a coreferential reflexive pronoun (see Figure 5b). (9)

Other middle situations of body-care action--bathing, combing one's hair, washing one's hands, and dressing oneself--are straightforward, where the agent's action deals with its own body or body part. Because an action confined to the agent's sphere typically affects the agent itself, this aspect of the middle--an effect accruing to the agent itself--plays an important role in framing certain actions of the middle. Greek middle expressions such as paraschesthai ti 'to give something from one's own means' and paratithesthai siton 'to have food served up' are a case in point. Here the actions actually extend beyond the agent's sphere, but their effects accrue on the agent in the manner of a typical middle depicted in Figures 5b and 5c. In other cases, the notion of the agent's personal sphere is more strictly adhered to, as in the following examples:

Sanskrit

(18) a. devadatto yajnadattasya bharyam upayacchati. (active)

Devadatta.NOM Yajnadatta.GEN wife have.relations.3SG.ACT

'Devadatta has relations with Yajnadatta's wife.'

b. devadatto bharyam upayacchate. (middle)

Devadatta.NOM wife have.relations.3SG.MID

'Devadatta has relations with his (own) wife.'

(Klaiman 1988: 34)

Sanxiang Dulong/Rawang

(19) a. an⁵³ a³¹dzwl³¹ a³¹be[??]⁵⁵. (active)

3SG mosquito hit

'S/he is hitting the mosquito.'

b. an⁵³ a³¹dzwl³¹ a³¹be[??]⁵⁵-cm³¹. (middle)

3SG mosquito hit-MID

'S/he is hitting the mosquito (on her/his body).'

(LaPolla 1996: 1945)

The active/middle opposition is diagrammatically shown as above, where the dotted circles and arrows represent the agent's personal sphere and actions respectively.

The conceptual basis of the active/middle opposition can then be formulated in terms of the manner of the development of an action, as follows:

Active/middle opposition:

Active: The action extends beyond the agent's personal sphere and achieves its effect on a distinct patient.

Middle: The development of an action is confined within the agent's personal sphere so that the action's effect accrues on the agent itself.
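As a rough illustration of this formulation, the agent's personal sphere can be modelled as a set of entities, with the voice value following from whether the affected entity falls inside that set. The Python sketch below is only an assumption-laden toy: the function name, the set representation, and the membership of the sphere for the Sanskrit pair in (18) are introduced here for illustration and are not part of the framework.

```python
from typing import AbstractSet


def sphere_voice(affected: str, agent_sphere: AbstractSet[str]) -> str:
    """Return 'middle' if the action's effect stays inside the agent's
    personal sphere, and 'active' if it reaches a distinct patient."""
    return "middle" if affected in agent_sphere else "active"


# Assumed personal sphere of Devadatta for example (18): the agent himself
# plus what belongs to him.
devadatta_sphere = frozenset({"Devadatta", "Devadatta's wife"})

print(sphere_voice("Yajnadatta's wife", devadatta_sphere))  # (18a)
print(sphere_voice("Devadatta's wife", devadatta_sphere))   # (18b)
```

A plain reflexive such as John hit himself comes out as middle under the same test, because the agent is trivially a member of its own sphere.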

Defining the middle voice domain in terms of confinement of an action within the sphere of the agent affords a unified treatment of various types of middle construction. Just as in the case of causatives, middle constructions come in three types--lexical, morphological, and periphrastic--both within individual languages and across different ones. Balinese, for example, exhibits all three types of middle construction, allowing some situation types to be expressed either morphologically or periphrastically, as shown in Table 1.

Our approach to middle voice phenomena is more consistent than Kemmer's (1993), which distinguishes reflexive situations from other middle event types, although these two categories are assumed to form a continuum, as shown in the diagrammatic representation in Figure 6. In our approach, Kemmer's reflexive, middle, and single participant situation types all fall in the middle voice domain, as defined above. Kemmer's distinctions among these types appear to be partly based on the typical forms expressing them. Reflexive situations tend to be expressed periphrastically, as in the case of Balinese nyagur awak 'hit oneself'. Kemmer's middle situation types are typically expressed morphologically, as in ma-cukur 'shave' in Balinese, and single-participant events are typically expressed by forms without any middle markers, as in the Balinese lexical middle negak 'sit'.

[FIGURE 6 OMITTED]

Kemmer arrives at her classification of event types as a result of her decision to "[deal] with ... middle-marking languages, or languages with overt morphological indications of the middle category" (Kemmer 1993: 10; bold face original, underline added). As pointed out in the discussion of causatives above, a strict form-based approach to the middle voice tends to focus on morphological middles, which is similar to the narrow treatment of morphological causatives, ignoring other possible form types. Such an approach would consider the Tarascan (Mexico) form ata-kurhi 'hit oneself' and the Quechua form maqa-ku 'hit oneself' as middles, while treating the English and Balinese equivalents hit oneself and nyagur awak as distinct reflexives. Perhaps Kemmer would consider oneself and awak here as "overt morphological indications of the middle category." But then, why is she distinguishing reflexive situations from the middle situations in her diagram reproduced in Figure 6? Also, what of the German form aufstehen 'stand up', which shows no middle marking? Is it not a middle because it lacks any morphological marking? It is semantically equivalent to the Balinese middle form ma-jujuk 'stand up'.

A more systematic typological investigation of the form-function correlation can be achieved if variation in form is taken as a function of the "naturalness" of the middle action. Natural middle actions--for example, sitting and walking--tend to be lexicalized as intransitive verbs, while actions typically directed to others--for example, hitting and kicking--tend to be expressed by periphrastic constructions involving a reflexive form when they are confined within the agent's personal sphere. What Kemmer (1993) has identified as middles--morphological middles--center on those actions that people typically apply to themselves, but that are applied to others often enough. (10) One must, however, realize that there are both intra- and crosslinguistic variations--such that in Balinese ma-jujuk 'stand up' has a morphological middle prefix, but negak 'sit (down)' is simply lexical. The same marking pattern is reversed in German, where sich hinsetzen 'sit down' has a middle marker, but aufstehen 'stand up' does not. These irregularities require individual accounts, based on historical, cognitive, and even cultural data.

The middle voice system has several important implications for our general understanding of the nature of voice phenomena. Recall that most of the widely received definitions of voice (such as the one quoted from Crystal [2003] in Note 1) hold that voice opposition does not entail a meaning contrast. This is not the case for the active/middle opposition, as shown by the examples above as well as by the contrast between the English active form John hit Bill and the middle form John hit himself.

Secondly, these examples show that voice alternations do not necessarily alter argument alignment patterns. There is no change in grammatical relation in the contrastive pairs in (18) and (19). If the situations depicted there give the impression of unusual utterances, consider the mundane situations described by the following Greek examples, where a meaning contrast is expressed without a realignment of arguments:

Ancient Greek

(20) a. louo khitona. (active)

wash.1SG.ACT shirt.ACC

'I wash a shirt.'

b. louomai khitona. (middle)

wash.1SG.MID shirt.ACC

'I wash my shirt/I wash a shirt for myself.'

While morphological middle constructions in some languages are strictly intransitive (as in the case of the Balinese ma-), and middles derived via the decausative function (as in the Greek forms poreusai 'to cause to go, to convey': poreusasthai 'to go' and kaiein 'to light, kindle': kaiesthai 'to be lighted, to burn') are intransitive, intransitivity is not a defining property of middle constructions. A large number of languages allow middle constructions that are syntactically transitive, as shown in the examples above and (21b) below, where the direct object is clearly marked by the accusative case suffix -n.

Amharic

(21) a. lemma te-lac'ce.

Lemma MID-shave.PERF.3M

'Lemma shaved himself.'

b. lemma ras-u-n te-lac'ce.

Lemma head-POSS.3M-ACC MID-shave.PERF.3M

'Lemma shaved his head.'

(Amberber 2000: 325, 326)

The general tendency for morphological middles to be intransitive is best viewed as the result of historical processes responding to the pressure on the form to conform to the semantic intransitivity, which characterizes middle events. This is exactly what has happened to many of the middle forms expressing reflexive middle situations in European languages, where the relevant affixes evolved from reflexive pronouns in the parent languages. The course of this development can be illustrated by using synchronic data below, where the Swedish example shows an intermediate clitic stage, the Russian form sebja exemplifies the earliest transitive pattern, and -s' (or -sja) the advanced fused pattern.

(22) a. Ivan ubi-l sebj-a. (Russian)

Ivan kill.PERF-PAST.SG.MASC self-ACC

'Ivan killed himself.'

b. Hon kamma-de sig. (Swedish)

she comb-PAST MID

'She combed.'

c. Ona prichesa-l-a-s'. (Russian)

she comb-PAST-FEM-MID

'She combed.'

Finally, in recognizing intransitive and transitive verbs as lexicalized middle and active voice forms, we elevate the active/middle contrast to the status of a central voice opposition observed in all human languages (cf. Dixon's [1979: 68-69] observation that "all languages appear to distinguish activities that necessarily involve two participants from those that necessarily involve one ... Then all languages have classes of transitive and intransitive verbs, to describe these two classes of activity"). (11)

Let us now turn to the antipassive voice. As the name suggests, the syntactic properties of antipassive constructions mirror somewhat those of passives, but the semantic aspect is different in these two voices. In the case of the passive, there is no implication that an agent is not somehow fully involved in the action. Indeed, full involvement of an agent is a crucial feature distinguishing the passive (e.g. John was killed while he was asleep) from the spontaneous middle (e.g. John died while he was asleep). Antipassive situations contrast in meaning with those expressed in the active and the ergative voice regarding the attainment of the intended effect upon a patient, however.

The intended effect of an action on a patient differs depending on the verb type. With contact verbs, the antipassive presents a situation as failing to make contact, as in the following examples:

Chukchee

(23) a. elteg=e keyn=en penre=nen.

father=ERG bear=ABS attack=3SG:3SG/AOR

'The father attacked the bear.'

b. elteg=en penre=tko=g[??]e

father=ABS attack=APASS=3SG.AOR

keyn=ete. (antipassive)

bear=DAT

'The father rushed at the bear.'

(Kozinsky et al. 1988: 652)

Warlpiri

(24) a. nyuntulu-rlu [??]-npa-ju pantu-rnu ngaju.

you-ERG [??]-2SG.A-1SG.P spear-PAST I.ABS

'You speared me.'

b. nyuntulu-rlu [??]-npa-ju-rla pantu-rnu

you-ERG [??]-2SG.A-1SG-DAT spear-PAST

ngaju-ku. (antipassive)

I-DAT

'You speared at me; you tried to spear me.'

(Dixon 1980: 449)

According to Dixon (1980: 449), (24b) above "indicates that the action denoted by the verb is not fully carried out, in the sense that it does not have the intended effect on the entity denoted by the object [read "patient", MS]." Similarly, visual contact is not made when situations involving visual perception are presented in the antipassive voice:

Warrungu

(25) a. nyula nyaka+n wurripa+[??].

3SG.NOM see+P/P bee+ABS

'He saw bees.'

b. ngaya nyaka+kali+[??] wurripa+wu katyarra+wu.

1SG.NOM see+APASS+P/P bee+DAT possum+DAT

'I was looking for bees and possums.'

(Tsunoda 1988: 606)

Moreover, for action types affecting a patient, the antipassive voice presents a situation as NOT affecting the patient in totality, as in the following examples:

Samoan

(26) a. Sā 'ai e le teine le i'a.

PAST eat ERG ART girl ART fish

'The girl ate the fish.'

b. Sā 'ai le teine i le i'a.

PAST eat ART girl LOC ART fish

'The girl ate some (of the) fish.'

(Mosel and Hovdhaugen 1992: 108)

The voice parameter focusing on the ergative/antipassive contrast can be formulated as below:

Ergative/antipassive opposition:

Does the action develop to its full extent and achieve its intended effect on a patient?

Yes → ergative(/active)

No → antipassive
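The parameter can be paraphrased as a check on how far the intended effect was attained. The Python sketch below is a hedged illustration only: the three-way `Effect` scale and the mapping of examples (23)-(26) onto it are assumptions made here to summarize the data above, not a classification proposed in the text.

```python
from enum import Enum


class Effect(Enum):
    """Assumed degrees of attainment of the intended effect on the patient."""
    FULL = "full"        # effect attained, e.g. (23a), (24a), (26a)
    PARTIAL = "partial"  # patient not affected in totality, e.g. (26b)
    NONE = "none"        # contact or perception not achieved, e.g. (23b), (24b), (25b)


def effect_voice(effect: Effect) -> str:
    """Apply the ergative/antipassive parameter: only full attainment is a 'Yes'."""
    return "ergative/active" if effect is Effect.FULL else "antipassive"


examples = {
    "(23a) attacked the bear": Effect.FULL,
    "(23b) rushed at the bear": Effect.NONE,
    "(26a) ate the fish": Effect.FULL,
    "(26b) ate some of the fish": Effect.PARTIAL,
}

for label, effect in examples.items():
    print(label, "->", effect_voice(effect))
```

Both failure of contact and partial affectedness fall on the antipassive side, which reflects the single Yes/No question in the parameter.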

Notice that in (24b) an antipassive event is conveyed solely by the case marking on the patient, underscoring our earlier point that voice may be manifested in a nominal element denoting the relevant participant. In the case of the antipassive, the status of the patient is at issue, and antipassivization iconically affects the form of the patient nominal--either case marking it differently from the active/ergative (a case of the so-called differential object marking [Moravcsik 1978]), or avoiding coding it (examples below).

As conceived here, both the middle and the antipassive relate to the nature of the development of an action. Specifically, both have the ontological feature of an action not (totally) affecting a distinct patient. The conceptual affinity between the two explains the middle/antipassive polysemy seen in a fair number of languages. Observe:

Yidiny

(27) a. wagu:da bambi-dinu.

man.ABS cover-MID

'The man covered himself.'

b. wagu:da wawa-:dinu gudaganda.

man.ABS saw-APASS dog.DAT

'The man saw the dog.'

(Dixon 1977: 277, 280)

Balinese

(28) a. Ia sedek ma-sugi.

3SG ASP MID-wash.face

'She is washing her face.'

b. Tiang ma-daar.

1SG APASS-eat

'I ate.'

(Shibatani and Artawa 2002)

Russian

(29) a. Ivan mojetsja mylom.

Ivan wash.MID soap.INSTR

'Ivan washed himself with soap.'

b. Babuska rugajetsja.

granny.NOM scold.APASS

'Granny is scolding.'

(Geniusiene 1987: 9)

In addition, languages may show the well-known connection between the middle and the passive (12) through the use of the same form as the antipassive, thus illustrating a three-way middle-passive-antipassive polysemy:

Russian (cf. the examples immediately above)

(30) Dom stroitsja turezk-oj firm-oj

house.NOM is.being.built.PASS Turkish-INST firm-INST

INKA.

INKA

'The house is being built by the Turkish company INKA.'

Kuku Yalanji

(31) a. karrkay julurri-ji-y. (middle)

child.ABS wash-MID-NONPAST

'The child is washing itself.'

b. warru (yaburr-ndu) bayka-ji-ny. (passive)

young man.ABS shark:LOC:pt bite-PASS-PAST

'The young man was bitten (by a shark).'

c. nyulu dingkar minya-nga nuka-ji-ny. (antipassive)

3SG.NOM man.ABS meat-LOC eat-APASS-PAST

'The man had a good feast of meat (he wasted nothing).'

(Patz 1982: 244, 248, 255)

3.3. The termination of action parameter

In a regular transitive event, an action terminates in a patient. However, the action may extend beyond the patient and affect an additional entity, which then functions as a new terminal point. Benefactives/malefactives and applicatives express this kind of situation. The relevant parameter can be formulated in the following form:

Benefactive/malefactive/applicative parameter:

Does the action develop further than its normal course, such that an entity other than the direct event-participants becomes a new terminal point registering an effect of the action?

No → active/middle

Yes → benefactive/malefactive/applicative
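Read as a decision procedure, the parameter asks whether some entity other than the direct event-participants registers an effect of the action. The Python sketch below is a minimal illustration under stated assumptions: the `Event` record and its field names are hypothetical, and the sample events are modelled loosely on the Japanese benefactive in (32b) below and on a plain transitive counterpart without a beneficiary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical event record (field names are illustrative only)."""
    agent: str
    patient: Optional[str] = None
    # Entity beyond the direct participants that registers an effect, if any.
    new_terminal_point: Optional[str] = None


def termination_voice(event: Event) -> str:
    """Apply the benefactive/malefactive/applicative parameter."""
    direct_participants = {event.agent, event.patient}
    extended = (
        event.new_terminal_point is not None
        and event.new_terminal_point not in direct_participants
    )
    return "benefactive/malefactive/applicative" if extended else "active/middle"


bought_a_book = Event(agent="Taro", patient="book")
bought_hanako_a_book = Event(agent="Taro", patient="book", new_terminal_point="Hanako")

print(termination_voice(bought_a_book))
print(termination_voice(bought_hanako_a_book))
```

The sketch deliberately leaves open which entities count as registering an effect; that is the substantive question the following discussion addresses.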

While the notion of benefit-giving is a broad one, there is one particular type with a perceptible change in the beneficiary. This is the case involving transfer of an object, where the object itself is directly affected by the act of giving. In a typical giving situation, the object is physically moved from one owner to a new one. The recipient beneficiary is secondarily affected because it comes into possession of the transferred object. Languages often have a special benefactive construction that portrays this type of situation, where the effect on the beneficiary is indicated by its argument status in syntactic coding. As shown in Shibatani (1996), benefactive constructions are typically based on the syntactic schema of the give-construction even involving the verb form for giving in some languages, as in the case of Japanese seen below:

(32) a. Taroo-wa Hanako-ni hon-o yat-ta.

Taro-TOP Hanako-DAT book-ACC give-PAST

'Taro gave Hanako a book.'

b. Taroo-wa Hanako-ni hon-o kat-te

Taro-TOP Hanako-DAT book-ACC buy-CONJ

yat-ta.

BEN-PAST

'Taro bought Hanako a book.'

In (32b) the buying action is extended beyond the patient (the book), and affects the beneficiary nominal (Hanako) coded in the dative form. Compare this construction to the one below, expressing a more general benefit-giving in which the beneficiary takes on a nonargument form.

(33) Taroo-wa Hanako-no tame-ni hon-o

Taro-TOP Hanako-of sake-for book-ACC

kat-te yat-ta.

buy-CONJ GIVE-PAST

'Taro bought a book for (the sake of) Hanako.'

While (33) may express any type of benefit-giving--including one of buying a book to help Hanako's book-selling business--(32b) specifically conveys the meaning that the transfer of the book was intended. Note also the English translations accompanying these examples, which show the same contrast.

Benefactive/malefactive events are also realized by so-called external possession constructions in Indo-European and some other languages (cf. Payne and Barshi 1999), although the context may determine whether or not a clear benefactive/malefactive reading obtains from them. When a body part is involved as the primary patient (cf. below), the benefactive/malefactive reading is not strongly pronounced beyond that which is conveyed by the verb; cf. (34) and (35a):

German

(34) Ich wasche mir die Hände.

I wash I.DAT the hands

'I wash my hands.' (lit. 'I wash me the hands.')

(35) a. Man hat ihm den Arm gebrochen.

lit. 'They broke him the arm.'

b. Man hat seinen Arm gebrochen.

'They broke his arm.'

Where inalienable possession is implicated as above, the dative nominal indicates that the action has affected it as a new terminal point of the action. In German, the external possession construction is generally obligatory when the affected body part is inalienably possessed; the extension of the action to its owner is inevitable under such circumstances. Indeed, an internal possession construction like (35b) suggests that the arm in question was detached, and no effect on its owner is asserted by such a sentence. Internal possession constructions involving inalienably possessed body parts, as in the English form I broke his arm, suggest that the arm's owner was affected, but the implication is obtained through a commonsensical world view. The dative construction (35a), on the other hand, asserts that the body part owner is affected by the action.

The benefactive/malefactive reading can be seen more readily in the following examples, where the dative nominal represents a mentally affected party:

French

(36) a. Jean lui a cassé sa vaisselle.

lit. 'Jean broke her her dishes.'

b. Jean a cassé sa vaisselle.

'Jean broke her dishes.'

Modern Hebrew

(37) a. ha tinok lixlex li et ha xulca.

the baby dirtied I.DAT ACC the shirt

'The baby dirtied the shirt on me.'

b. ha tinok lixlex et ha-xulca shel-i.

the baby dirtied ACC the-shirt of-me

'The baby dirtied my shirt.'

(Berman 1982; T. Givón pers. comm.)

Where inalienable possession is not evident, as in these examples, a malefactive meaning obtains more readily. The trade-off between inalienability and affective reading shows that a principle of relevance is at work in these constructions: the relevance of the dative arguments to the event must be somehow "guaranteed." Involvement of an inalienably possessed object guarantees the relevance of the possessor to the event, since whatever happens to the body part will affect its possessor automatically. When an inalienable possession relation does not obtain--as in (36a) and (37a)--a benefactive/malefactive effect upon the dative argument is pronounced as a way of establishing its relevance to the event. The attendant interpretation that a possessive relation exists contributes to the establishment of the affective relationship; the owner of an object is more easily affected by what happens to its possession.

Contrary to what the label suggests then, so-called external possession constructions DO NOT assert a possessive relation between the dative argument and the directly affected patient. Indeed, the relevant constructions arise independently from externalization of the possessor, as in the German example below (also in [36a] above), or when the notion of possession is irrelevant, as in the following examples (40)-(41) from River Warihio (Uto-Aztecan): (13)

German

(38) Peter repariert mir mein Fahrrad.

'Peter fixes me my bicycle.'

River Warihio

(39) a. hustina pasu-re muni kukuci icio.

Agustina cook-PERF beans children BEN

'Agustina cooked beans for the children.'

b. hustina pasu-ke-re muni kukuci.

Agustina cook-BEN-PERF beans children

'Agustina cooked beans for the children.'

(40) maniwiri no'o wikahta-ke-ru yoma aari.

Manuel 1SG.NS sing-BEN-PERF all afternoon

'Manuel sang all afternoon for me.'

(41) tapana no'o yuku-ke-ru.

yesterday 1SG.NS rain-BEN-PERF

'Yesterday it rained on/for me.'

(Felix 2005: 253, 257, 258)

That the condition of physical proximity should be more important than the possessive relation in inducing a benefactive/malefactive construction is shown by the following River Warihio examples (see Shibatani 1994 for other cases):

(42) a. maniwiri ihcorewapate-re wani pantaoni-ra.

Manuel get.dirty-PERF John jeans-POSS

'Manuel dirtied John's jeans.' (John's jeans were over the chair.)

b. maniwiri ihcorewapate-ke-re pantaoni wani.

Manuel get.dirty-BEN-PERF jeans John

'Manuel dirtied John's jeans.' (John was wearing his jeans.)

In general, applicative constructions have been considered as syntactic valency-increasing operations that are pragmatically motivated (see Peterson 1999). Our claim is that their conceptual basis is rooted in the ontological feature of an action, as stated in the voice parameter above. Peterson's (1999) survey shows that certain applicatives are more basic and prevalent than others. In the words of Peterson (who lumps benefactives and applicatives together), "the locative and circumstantial applicatives depend on the presence of other applicative constructions, while benefactive and instrumental/comitative applicatives do not. That is, there are two core applicative constructions, benefactive and instrumental/comitative, and these serve as anchors as it were for the development of additional applicative constructions marked either by the same or distinct morphology" (Peterson 1999: 135). This observation is consistent with our view of the benefactive/applicative voice. Benefactive and instrumental/comitative participants are much more directly involved in the event than a causal factor, or a setting entity such as a location, hence much more likely to be affected by the action. That the benefactive applicative is obligatory in some languages also underscores the point regarding the affected nature of the recipient beneficiary (cf. above).

In the past, grammarians may not have paid sufficiently close attention to the subtle meaning differences that exist between applicative constructions and their nonapplicative counterparts. However, recent descriptions of applicative constructions have begun to notice some revealing semantic effects. For example, Donohue (1999) shows that the Tukang Besi comitative applicative conveys a meaning whereby the applied comitative nominal is actively engaged in the event: (14)

Tukang Besi

(43) a. No-moturu kene wowine ane ke hotu mopera.

3R-sleep and woman exist and hair short

'He slept with the woman with the short hair.'

(i.e. they were sleeping near each other.) (# they had sex together.)

b. No-moturu-ngkene te wowine ane ke hotu

3R-sleep-COM CORE woman exist and hair

mopera.

short

'He slept with the woman with the short hair.' (i.e. they had sex together.)

(Donohue 1999: 231)

The following instrumental applicative from Pulaar also demonstrates how an applied instrumental nominal can implicate a participant more thoroughly affected by the agent's action:

Pulaar

(44) a. mi loot-ii min am a

1SG wash-PERF.ACT y.s. 1SG.POSS PREP

saabunnde hee.

soap DET

'I washed my younger sibling with (some of) the soap.'

b. mi loot-r-ii min am

1SG wash-INST-PERF.ACT y.s. 1SG.POSS

saabunnde hee.

soap DET

'I washed my younger sibling with (all of) the soap.'

(Sebastian Ross-Hagebaum pers. comm.)

The various effects of locative applicatives have also been recognized in the literature. The Balinese locative expression in (45b) below, for example, describes a situation where the action of planting banana trees extends in such a way as to affect the garden. Here the entire garden ends up being planted with banana trees, while no such implication is made in the nonapplicative counterpart (45a).

Balinese

(45) a. Tiang mulan biyu di tegalan tiang-e.

1SG plant banana in garden 1SG-POSS

'I planted bananas in my garden.'

b. Tiang mulan-in tegalan tiang-e biyu.

1SG plant-APPL garden 1SG-POSS banana

'I planted my garden with bananas.'

(I. Wayan Arka pers. comm.)

Properly identifying construction types representing a voice sub-domain is also a serious problem. In Crystal's (2003) definition (cf. Note 1) reflexives are not recognized as proper voice constructions and their relationship to the middle voice is not entirely clear. A similar problem is seen in Kemmer's (1993) extensive study of middle voice constructions.

There are also severe limitations at the level of explanation. Closer to the main theme of this volume is the problem of understanding the increases and decreases in valency and accompanying changes in argument structure observed in voice phenomena. Why do certain phenomena (e.g. the causative and applicative) show an increase in valency, while others (e.g. the passive and antipassive) typically have a valency-reducing effect? What motivates these valency changes in opposite directions?

Functional explanations regarding the distribution of certain voice constructions go a long way toward an explanatory functional study of grammatical phenomena (cf. Haiman 1985). Being largely based on formal properties such as "linguistic distance" and "full" vs. "reduced form," however, these explanations are not functional enough to be able to make more general predictions. (2)

The problems outlined above largely stem from two related methodological issues. One is the lack of a coherent conceptual framework for characterizing and analyzing voice phenomena; the other is an over-reliance on formal properties in both analysis and explanation. Clearly the latter problem is caused by the former and by the lack of commitment to the cognition-to-form approach in linguistic analysis. (3) The purpose of this article is thus to lay out a conceptual framework that coherently delineates the domain of voice, which embraces both those phenomena that are traditionally recognized as falling in the voice domain and those that have been kept in limbo. The framework required must deal with the fact that many voice phenomena straddle the semantics-pragmatics boundary, although the active/middle opposition is basically conceptual or semantic, and the active/passive opposition is largely pragmatic. We endeavor to unify these manifestations of voice function by assuming that the pragmatic relevance of clausal units is semantically determined in the first place.

The conceptual foundations of voice can only be arrived at by inspecting contrasting phenomena across languages. Our initial task is therefore to learn how a given language, using its own resources, achieves the goal of expressing a relevant conceptual opposition found in another language. While the ultimate goal of functional typology is to discover the correlative patterns between form and function, this article is concerned primarily with the initial task of postulating conceptual bases of voice phenomena and identifying constructions across languages that express the relevant oppositions.

One final introductory remark is due regarding the controversy over the question of whether the formal relationships between opposing voice categories should be treated as inflectional or derivational. We consider this question to be academic in the absence of rigorous definitions for these processes. In the realm of voice phenomena, some systems, for example, the Ancient Greek active/middle system, incorporate voice morphology in their inflectional paradigm. Others like the English active/passive opposition do not show a simple morphological relationship--inflectional or derivational--since constructions as a whole enter into the formal opposition. The regularity or productivity of the pattern is often taken to be an important criterion distinguishing inflections from derivations; the former are thought to be regular and obligatory, while the latter allow exceptions. But regularity in natural language is always relative, and so are the patterns of voice oppositions. Even among the known ones, nothing is one hundred percent regular. An alternation that is well-integrated within the inflectional paradigm may show irregularity. In Ancient Greek, for example, we find both active forms that do not have middle counterparts (activa tantum) and middle forms lacking the corresponding active (media tantum). The active/passive opposition likewise varies across languages: in some it shows a high degree of regularity without ever reaching one hundred percent (as in the case of English), while others place much severer limitations on the range of permissible passive constructions.

2. The evolution of an action: voice, transitivity, and aspect

The basic claim of this article is that major voice phenomena have their conceptual bases rooted in the human cognition of actions. Because such actions have various effects upon us, we have special interest in the way that they arise, how they develop, and the manner in which they terminate--what is referred to as the evolutionary properties or phases of an action in this article. Through a system of grammatical oppositions, a language provides a means for expressing conceptual contrasts pertaining to the evolutionary properties of an action that the speaker finds relevant for communicative purposes. Among the evolutionary properties, voice is primarily concerned with the way event participants are involved in actions, and with the communicative value, or discourse relevance pertaining to the event participants from the nature of this involvement.

Mention of the evolution of an action immediately brings to mind two other grammatical concepts, namely, transitivity and aspect. It is thus appropriate to clarify the relationships and differences between these notions. Traditionally, voice has been defined in reference to transitivity, or more narrowly in terms of the transitivity of a verb or clause; the active/passive opposition most typically obtains with transitive verbs. A more important connection between transitivity and voice, however, lies in the notion of semantic transitivity, rather than strictly verbal or clausal transitivity. Indeed, it is easy to see this connection, as in the work of Hopper and Thompson (1980), where many of the phenomena discussed in terms of transitivity are nothing but voice phenomena. This important article concludes the section on grammatical transitivity as follows: "It is tempting to find a superordinate semantic notion which will include all the Transitivity components. If there is one, it has so far not been discovered ..." (Hopper and Thompson 1980: 279). Our claim is that what they are looking for is a theory of voice. In fact, the work of Hopper and Thompson lays important groundwork for the study of voice. In this regard, Kemmer (1993: 247) is absolutely correct in noting that "the scale of transitivity ... forms the conceptual underpinning for voice systems in general, and for reflexive and middle marking systems in particular." (4) While none of these works makes it quite clear, voice is a system of correspondences between action or event types and syntactic structures. For example, what is known as the active voice is the pattern of correspondence between the high transitive event type or the prototypical transitive action and the nominative-accusative coding pattern of the event participants, as in the English active sentence She killed him (see Section 5 below).

The parameters of transitivity identified by Hopper and Thompson (1980) pertain to "different facet[s]" of "carrying-over or transferring an action from one participant to another" (Hopper and Thompson 1980: 253), and they in effect represent the evolutionary properties of an action, that is, they pertain to the way an action is brought about, to the way it is transferred to the second participant, and to the way it affects this participant. In order to bring grammar closer to cognition, we propose to examine specific evolutionary properties of an action pertaining to voice oppositions that are distilled as transitivity parameters in Hopper and Thompson (1980) and others dealing with the issues of transitivity.

If transitivity is integral to a theory of voice, how then do aspect and voice differ under the assumption that both are concerned with the way an action evolves? These two grammatical categories invite different kinds of questions. Aspect asks where the vantage point is with regard to the temporal structure of an action. When the action is viewed holistically encompassing all of its temporal phases, we obtain the perfective viewpoint of the described action. On the other hand, if specific sections of internal temporal structure are focused, we obtain various types of imperfective aspectual construal of an event. The contrast between the perfective and the imperfective aspects and the representative subcategories of the latter seen across languages are represented in Figure 1.

[FIGURE 1 OMITTED]

Voice, on the other hand, asks how an action evolves--that is, it asks about the nature of its origin, the manner in which it develops, and the way that it terminates. These evolutionary phases of an action and the various voice categories pertaining to them are depicted schematically in Figure 2.

[FIGURE 2 OMITTED]

3. Major voice oppositions and their conceptual bases

Under the present conception, the three principal evolutionary phases of an action--origin, development, and termination--form the basis for the major voice parameters. These parameters are generally expressible in the form of questions concerning the evolutionary properties of an action, as below:

Major voice parameters:

I. The origin of an action

(a) How is the action brought about?

(b) Where does the action originate?

(c) What is the nature of the agent?

II. The development of an action

How does the action develop?

(a) Does the action extend beyond the agent's personal sphere or is it confined to it?

(b) Does the action achieve the intended effect in a distinct patient, or does it fail to do so?

III. The termination of an action

Does the action develop further than its normal course, extend beyond the immediate participants of the event, and terminate in an additional entity?

Figure 2 summarizes the voice constructions pertaining to these parameters. Throughout the following discussion, we touch upon the theoretical consequences of this diagrammatic representation of the voice domain.

3.1. Parameters pertaining to the origin of an action

The first opposition to be examined has to do with the nature of the origin of an action--namely, whether the action in question is brought about volitionally or nonvolitionally by a human agent.

Volitional/spontaneous opposition:

Is the action brought about volitionally?

Yes → volitional

No → spontaneous

While not widely recognized as a voice opposition, this distinction has been recognized as such in the Japanese grammatical tradition, perhaps because the suffix for the spontaneous voice is identical with that used in the passive construction. In fact, it is generally believed that the Japanese passive arose from the spontaneous construction. Languages (or grammarians' interpretations of the facts?) may differ with regard to the precise meaning contrast seen in the volitional/spontaneous opposition. In Japanese, the spontaneous construction expresses a situation where the agent does not intend to bring about an action, but where there is a circumstantial factor external to the agent that induces an action (such as eating "dancing-mushrooms" as in [1b] below). In other languages, a spontaneous form conveys the meaning of an action accidentally brought about. Other manifestations of the opposition may be alternatively expressed in terms of such notions as intentional/unintentional or controlled/uncontrolled, but we shall take the position that these contrasts are included in the basic function of the volitional/spontaneous opposition. That is, by "volitional voice" we mean a connection between a particular syntactic form and a type of action that is brought about by the willful involvement of an agent who "intends the action," and sees to it that the intended effect is achieved. Departure from this action type in any significant way may be construed as constituting a spontaneous action, expressed by a construction formally contrasting with the volitional construction.

In Modern Japanese, the domain of the volitional/spontaneous opposition has shrunk to such an extent that mental activities are the only ones where the contrast is readily observed, with the spontaneous morphology (-re/-rare) having generally given way to a passive interpretation in the domain of physical actions. In Classical Japanese (ninth-twelfth centuries), the volitional/spontaneous opposition was more widely observed, as in the following examples:

Classical Japanese

(1) a. Kikori-domo mo mai-keri. (volitional)

wood cutter-PL also dance-PAST

'Wood cutters also danced.'

b. Kikori-domo mo mawa-re-keri. (spontaneous)

wood cutter-PL also dance-SPON-PAST

'Wood cutters also danced willy-nilly.'

Spontaneous expressions in Japanese typically do not contain an agent in subject position. Because information regarding the volitional status of an agent is most readily accessible to the speaker, the volitional/spontaneous distinction is typically made with reference to a first person agent; accordingly, the missing agent is understood to be the speaker unless otherwise specified. This non-coding of an agent in subject position paved the way for a spontaneous expression where a patient nominal is coded in subject position, as in the following spontaneous construction (2b). Undoubtedly, this was an important step in the development of the passive from the spontaneous construction.

Modern Japanese

(2) a. Boku-wa yoku mukasi-no-koto-o

I-TOP often old days-GEN-things-ACC

omo-u. (volitional)

think-PRES

'I often think about the things of the old days.'

b. Saikin mukasi-no-koto-ga yoku

recently old days-GEN-things-NOM often

omowa-re-ru. (spontaneous)

think-SPON-PRES

'Recently the things of the old days often come to mind.'

Since the volitional/spontaneous opposition is not widely recognized as a voice phenomenon, it is perhaps worth spending some time showing how widespread in the world's languages it actually is. As in other voice sub-domains, languages make use of different resources in expressing the volitional/spontaneous opposition. Indonesian and Malay use the multifunctional prefix ter- to express unintended or accidental actions:

Indonesian

(3) a. Ali memukul anak-nya. (volitional)

Ali AF.hit child-3SG.POSS

'Ali hit his child.'

b. Ali ter-pukul oleh anak-nya. (spontaneous)

Ali SPON-hit PREP child-3SG.POSS

'Ali accidentally hit his child.'

(I Wayan Arka pers. comm.)

According to Winstedt (1927: 86-87), the function of ter- in Malay is characterized as denoting an action due "not to conscious activity on the part of the subject, but to external compulsion or accident." It is noteworthy that spontaneous constructions in both Japanese and Indonesian/Malay have an affinity with the passive in that they share the same affix in these languages. Compare the spontaneous constructions above with the passives in Japanese and Indonesian below:

(4) a. Taroo-wa Ziroo-ni nagura-re-ta. (Japanese passive)

Taro-TOP Jiro-by hit-PASS-PAST

'Taro was hit by Jiro.'

b. rumah itu tidak ter-beli oleh

house that NEG PASS-buy PREP

saya. (Indonesian passive)

1.SG

'The house cannot be bought by me.'

The diagrammatic representation of voice constructions in Figure 2 can be thought of as a semantic map, where different constructions are distributed over relevant territory within the voice domain. This is a useful way of representing conceptual affinities among various voice constructions, but its utility is predicated only on a comprehensive view of voice as advocated in this article. Spontaneous and passive are both concerned with the origin of an action. What they share is the idea that this lies NOT in the pragmatically most relevant participant; in the case of the passive, it is the agent of low discourse relevance and in the spontaneous case, it is the external circumstance.

The map in Figure 2 also shows the "neighboring" relationship between the spontaneous, the middle, and the antipassive. In Russian and a number of Australian languages, middle forms are recruited for the volitional/spontaneous contrast, as in the following examples:

Russian

(5) a. Kostja poreza-l xleb.

Kostja cut.PERF-PAST.SG.MASC bread

'Kostja cut the bread.'

b. Kostja poreza-l-sja.

Kostja cut.PERF-PAST.SG.MASC-SPON

'Kostja has [accidentally] cut himself.'

(Vera Podlesskaya pers. comm.)

Diyari

(6) a. natu yinana danka-na wara-yi.

1SG.ERG 2SG.O find-PARTC AUX-PRES

'I found you (after deliberately searching).'

b. nani danka-tadi-na wara-yi yinka ngu.

1SG.ABS find-SPON-PARTC AUX-PRES 2SG.LOC

'I found you (accidentally).'

(Austin 1981: 154)

Another favorite source for the spontaneous construction--especially prominent among Indo-Aryan and Dravidian languages of India--is the so-called dative-subject construction, which typically expresses uncontrollable states:

Sinhala

(7) a. mame ee wacene kiwwa.

I.NOM that word say.PAST

'I said that word.'

b. mate ee wacene kiyewuna.

I.DAT that word say.P.PAST

'I blurted that word out.'

(Gair 1990: 17)

The adaptation of the dative-subject construction for a spontaneous action is also seen when the "dative-subject" is marked by cases different from the dative as in the following Bengali examples, where the nominal form corresponding to the dative subject is marked with genitive. Here the volitional/spontaneous contrast takes on interesting nuances:

Bengali

(8) a. Ami toma-ke khub pɔchondo kor-i.

1SG.NOM 2ORDSG-OBJ very liking do-PRES.1

'I like you very much.' (According to my own criteria.)

b. Ama-r toma-ke khub pɔchondo

1SG-GEN 2ORDSG-OBJ very liking

hɔy.

become-PRES.3ORD

'I like you very much.' (According to some [socially] set criteria.)

(Onishi 2001: 120)

When the basic meaning of the verb denotes a spontaneous (involuntary) action, the volitional voice form can be obtained by using a self-benefactive construction, as in Marathi and other Indo-Aryan languages:

Marathi

(9) a. sitaa raD-l-i.

Sita.NOM cry-PERF-F

'Sita cried.'

b. sitaa-ne raD-un ghet-l-a.

Sita-ERG cry-CONJ take-PERF-N

'Sita cried (so as to relieve herself).'

(Prashant Pardeshi pers. comm.)

Lhasa Tibetan has a set of auxiliaries expressing different categories of perspective. "Perspective-choice" interacts with both person and evidential categories in a complex way, but the relevant auxiliaries can be divided into a "self-centered" and an "other-centered" group (Denwood 1999). Verbs denoting such intentional actions as reading and dancing normally occur with self-centered auxiliaries when used with first person subjects. They can be made nonintentional or spontaneous with the use of other-centered auxiliaries, as in the following examples:

Tibetan

(10) a. ngas. yi.ge, klog.ba yin.

I-SMP letter read-LINK-AUX (self-centered)

'I read the letter (on purpose).'

b. ngas. yi.ge, klog.song.

I-SMP letter read-AUX (other-centered)

'I read the letter (without meaning to).'

(Denwood 1999: 137)

Conversely, although unintentional verbs expressing involuntary actions such as coughing and seeing normally occur with other-centered auxiliaries, they can be rendered volitional by the use of self-centered auxiliaries:

Tibetan

(11) a. glo. rgyab.byung.

cough-AUX (other-centered)

'I coughed (involuntarily).'

b. glo. rgyab.pa.yin.

cough-LINK-AUX (self-centered)

'I coughed (deliberately).'

(Denwood 1999: 139)

A similar pattern is observed in Newar (Tibeto-Burman), where the relevant contrast is expressed in terms of a distinction between conjunct and disjunct verbal endings--apparently an evidentiality-related phenomenon. Note that only clauses with first person subjects allow this contrast to be expressed.

Newar

(12) a. ji-n kayo tachya-na

1SG-ERG cup break-PC

'I broke the cup (deliberately).'

b. ji-n kayo tachya-ta

1SG-ERG cup break-PD

'I broke the cup (accidentally).'

(Kansakar 1999: 428)

Finally, the phenomenon now widely recognized in the name of "split intransitivity" is rooted in the volitional/spontaneous opposition. Observe first some well-known examples from Eastern Pomo below:

Eastern Pomo

(13) a. ha: c'e:xelka. (volitional)

1SG.A slip

'I am sliding.'

b. wi c'e:xelka. (spontaneous)

1SG.P slip

'I am slipping.'

(McLendon 1978: 1-3)

Although the verb forms are the same, when the pronominal form is inflected for the patient (13b), the sentence conveys a spontaneous action or a "lack of protagonist control" (McLendon 1978: 4). A similar contrast is seen in the Caucasian language Tsova-Tush (Batsbi), where "[the] referent of [an ergative] subject is a voluntary, conscious, controlling participant in the situation named by the verb" (Holisky 1987: 113).

Tsova-Tush (Batsbi)

(14) a. (as) vuiz-n-as.

1SG.ERG fall-AOR-1SG.ERG

'I fell down, on purpose.'

b. (so) voz-en-sO.

1SG.NOM fall-AOR-1SG.NOM

'I fell down, by accident.' (Holisky 1987: 104)

In addition to these cases of "fluid-S" marking (Dixon 1994), split intransitivity may be realized as a lexically-conditioned phenomenon, where intransitive verbs are classified into an "agentive" class and a "patientive class." Agentive and patientive nominals respectively trigger marking similar to the corresponding arguments of a transitive clause. The Philippine language Cebuano shows this pattern through a focus system which is characteristic of Formosan and Western Austronesian languages:

Cebuano

(15) Transitive actor-focus construction

Ni-basa ako ug libro.

AF-read I.TOP INDEF book

'I read a book.'

(16) Transitive patient-focus construction

Gi-basa nako ang libro.

PF-read I TOP book

'I read the book.'

(17) a. Agentive intransitive

Ni-dagan ako. (actor-focus form)

AF-run I.TOP

'I ran.'

b. Patientive intransitive

Gi-kapoy ako. (patient-focus form)

PF-tired I.TOP

'I got tired/I am tired.'

Generalizing processes have the effect of obliterating the basic semantic motivation for distinguishing two classes of intransitive verbs; either the larger agentive or larger patientive class of intransitive verbs tends to have semantically heterogeneous verbs. Nevertheless, the split of intransitive verbs into two classes is rooted in the distinction between volitional and involuntary actions involving an animate protagonist. This is seen in a minority class of verbs, such that a minority agentive class contains verbs denoting controlled actions, and a minority patientive class includes verbs denoting involuntary states of affairs (see Merlan 1985). In Cebuano (and perhaps other Philippine languages as well) the larger agentive class includes verbs denoting uncontrolled events such as raining or slipping off, while the minority patientive class contains verbs that express strictly involuntary states of affairs such as being hungry, becoming tired, or contracting diseases.

The patterns of split intransitivity discussed here underscore an important point that we wish to advance in this article: voice can also be expressed by nominal forms. Traditionally, voice has been regarded as a verbal category. Indeed, many linguists take verbal marking or verbal inflection as the defining feature of voice. (5) We reject this restrictive view. As we define it, voice is concerned with the evolutionary properties of an action. It is typically marked on the verb because a verb expresses an action. Verbal voice marking is therefore simply a case of iconicity. An action, however, also involves participants such as agent and patient. Because an action occurs in relation to these protagonist participants, any form representing them could also bear voice marking. The volitional/spontaneous opposition manifested in nominal forms also reflects the underlying relationship between the origin of an action and the volitional status of the agent. (6) Nominal marking for certain voice contrasts is thus also motivated by the iconicity principle.

Let us now turn to the causative/noncausative opposition. As noted in the introduction, the causative has been problematic with respect to its status as a voice category. Widely-received definitions of voice, such as Crystal's in Note 1, maintain that voice oppositions do not entail a semantic contrast, which has prevented many grammarians from readily accepting causative/noncausative as one. As the above discussion on the volitional/spontaneous opposition shows, however, there is no reason to believe that voice is a semantically neutral phenomenon. As it happens, one of the oldest systems of voice contrast in Indo-European--the active/middle opposition--also involves a meaning contrast (see below). (7) The question concerning the causative/noncausative opposition (and other semantic oppositions) is whether the relevant contrasts can be naturally integrated into a coherent conceptual framework of voice. Our answer will be yes.

The causative/noncausative opposition pertains to the origin of an action; that is, whether the action originates with the agent of the main action or with another agent heading the action chain. The causative action chain is represented in Figure 3. (8)

[FIGURE 3 OMITTED]

In a noncausative situation, the initial agent (Agent₂) is also the agent of the main action. In a causative situation, the ultimate origin of the main action lies in the agent (Agent₁) heading the action chain, which is different from the agent (Agent₂) of the main action. The relevant parameter for the causative/noncausative distinction can be formulated as below:

Causative/noncausative opposition:

Does the action originate with an agent heading the action chain that is distinct from the agent or patient of the main action?

Yes → causative

No → noncausative

The contrast between a noncausative situation represented by an expression such as Bill walked and its causative counterpart expressed by a periphrastic causative form like John made Bill walk can thus be naturally captured in terms of the nature of the origin of an action. Situations expressed by lexical causatives such as John killed Bill have an (initial) agent distinct from the patient of the main action.
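The parameter just formulated can be read as a simple decision over the participants of an action chain. The following Python fragment is offered only as an illustrative sketch of that reading, not as part of the framework itself: the class ActionChain, its field names, and the function origin_voice are hypothetical labels introduced here, and the encoding of John killed Bill (a main action with a patient but no separate agent) is an assumption made for the sake of the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionChain:
    """Hypothetical encoding of an action chain (cf. Figure 3)."""
    head_agent: str                     # participant heading the action chain
    main_agent: Optional[str] = None    # agent of the main action, if any
    main_patient: Optional[str] = None  # patient of the main action, if any


def origin_voice(chain: ActionChain) -> str:
    """Answer the causative/noncausative question for one action chain."""
    main_participants = {
        p for p in (chain.main_agent, chain.main_patient) if p is not None
    }
    # Causative: the action originates with an agent heading the chain who is
    # distinct from the agent or patient of the main action.
    if chain.head_agent not in main_participants:
        return "causative"
    return "noncausative"


# The three English examples discussed in the text.
bill_walked = ActionChain(head_agent="Bill", main_agent="Bill")
john_made_bill_walk = ActionChain(head_agent="John", main_agent="Bill")
john_killed_bill = ActionChain(head_agent="John", main_patient="Bill")

for chain in (bill_walked, john_made_bill_walk, john_killed_bill):
    print(chain, "->", origin_voice(chain))
```

On this encoding the periphrastic and the lexical causative receive the same classification, which is the point of treating causation as a semantic rather than a morphological notion.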

One of the important points of past studies of causative constructions has to do with the fact that a voice category can be expressed by a construction as a whole, rather than by local morphological entities such as verb inflection or nominal case marking. Lexical and periphrastic causative constructions such as John killed Bill and John made Bill walk are a case in point. They differ in form from morphological causatives such as Quechua wanu-ci (die-CAUSE) 'kill' and Japanese aruka-se (walk-CAUSE) 'make walk', where the causative meaning is expressed morphologically. Traditionally, grammarians have tended to consider only morphological causatives as proper cases. However, such a position leads to the uncomfortable decision of treating the Quechua and Japanese forms cited above as causative, while treating the semantically parallel English expressions kill and make walk as noncausative. The form-based treatment of causatives is tantamount to simply circumscribing morphological causatives, and does not lead to a comprehensive study of causative phenomena. Causation is a semantic, not a morphological notion, and as such the whole range of expression types must be taken into account in a satisfactory analysis. Indeed, a (functional) typological study is predicated on the view that a variety of expression types will obtain in any given conceptual domain. The formal tripartite pattern of lexical, morphological, and periphrastic causative constructions has now been widely accepted, and some revealing correlations between form and function have been identified in the causative domain (see Shibatani and Pardeshi 2002 on recent developments). We see below that a similar pattern holds in other voice domains as well.

Having discussed two voice phenomena pertaining to the origin of an action, we now turn to the next major voice parameter concerning its development. We will consider the other voices associated with the nature of the origin of an action--the passive and the inverse--after dealing with other conceptually-based voice phenomena.

3.2. Parameters pertaining to the development of an action

In this section we recognize at least two sets of contrastive patterns in the developmental phase of an action. One is concerned with whether the action develops beyond the personal sphere of the agent or is instead confined within it. The latter mode of development forms the conceptual basis of what is known as the middle voice. The other contrastive pattern of action development is concerned with whether or not the action has been successfully transferred to the patient and has achieved its intended effect. This contrast forms the conceptual basis for the ergative/antipassive opposition.

The active/middle voice opposition is best known from studies of classical Indo-European languages such as Ancient Greek and Sanskrit, and calls for a broad understanding of the notion of action confinement in the agent's personal sphere. The clearest case in which the development of an action is confined to the agent's sphere is when simple intransitive activities, such as sitting and walking, are lexicalized as intransitive verbs. Here the development of the action is clearly confined within the domain of the agent, as shown in the schematic representation in Figure 5a. These situations contrast with active (causative) situations (e.g. John sat his son in the chair and John made his son walk) where the relevant actions involve an agent that instigates an action which develops outside the (initial) agent's domain (see Figure 4). In the words of Benveniste (1971 [1950]: 148): "In the active, the verbs denote a process that is accomplished outside the subject. In the middle, which is the diathesis to be defined by the opposition, the verb indicates a process centering in the subject, the subject being inside the process."

[FIGURES 4-5 OMITTED]

Reflexive situations also constitute one of the middle action types, since here the action is also confined within the agent's personal sphere. The active expression John hit Bill contrasts with the reflexive expression John hit himself, where the confinement of the hitting action within one's personal sphere (e.g. hitting one's head or body) is marked by a coreferential reflexive pronoun (see Figure 5b). (9)

Other middle situations of body-care action--bathing, combing one's hair, washing one's hands, and dressing oneself--are straightforward, where the agent's action deals with its own body or body part. Because an action confined to the agent's sphere typically affects the agent itself, this aspect of the middle--an effect accruing to the agent itself--plays an important role in framing certain actions of the middle. Greek middle expressions such as paraschesthai ti 'to give something from one's own means' and paratithesthai siton 'to have food served up' are a case in point. Here the actions actually extend beyond the agent's sphere, but their effects accrue on the agent in the manner of a typical middle depicted in Figures 5b and 5c. In other cases, the notion of the agent's personal sphere is more strictly adhered to, as in the following examples:

Sanskrit

(18) a. devadatto yajnadattasya bharyam

Devadatta.NOM Yajnadatta.GEN wife

upayacchati. (active)

have.relations.3SG.ACT

'Devadatta has relations with Yajnadatta's wife.'

b. devadatto bharyam upayacchate. (middle)

Devadatta.NOM wife have.relations.3SG.MID

'Devadatta has relations with his (own) wife.'

(Klaiman 1988: 34)

Sanxiang Dulong/Rawang

(19) a. an⁵³ a³¹dzwl³¹ a³¹be[??]⁵⁵. (active)

3SG mosquito hit

'S/he is hitting the mosquito.'

b. [an.sup.53] [a.sup.31][dzwl.sup.31] [a.sup.31][be[??].sup.55] -[cm.sup.31]. (middle)

3SG mosquito hit-MID

'S/he is hitting the mosquito (on her/his body).'

(LaPolla 1996: 1945)

The active/middle opposition is diagrammatically shown as above, where the dotted circles and arrows represent the agent's personal sphere and actions respectively.

The conceptual basis of the active/middle opposition can then be formulated in terms of the manner of the development of an action, as follows:

Active/middle opposition:

Active: The action extends beyond the agent's personal sphere and achieves its effect on a distinct patient.

Middle: The development of an action is confined within the agent's personal sphere so that the action's effect accrues on the agent itself.

Defining the middle voice domain in terms of confinement of an action within the sphere of the agent affords a unified treatment of various types of middle construction. Just as in the case of causatives, middle constructions come in three types--lexical, morphological, and periphrastic--both within individual languages and across different ones. Balinese, for example, exhibits all three types of middle construction, allowing some situation types to be expressed either morphologically or periphrastically, as shown in Table 1.

Our approach to middle voice phenomena is more consistent than Kemmer's (1993), which distinguishes reflexive situations from other middle event types, although these two categories are assumed to form a continuum, as shown in the diagrammatic representation in Figure 6. In our approach, Kemmer's reflexive, middle, and single participant situation types all fall in the middle voice domain, as defined above. Kemmer's distinctions among these types appear to be partly based on the typical forms expressing them. Reflexive situations tend to be expressed periphrastically, as in the case of Balinese nyagur awak 'hit oneself'. Kemmer's middle situation types are typically expressed morphologically, as in ma-cukur 'shave' in Balinese, and single-participant events are typically expressed by forms without any middle markers, as in the Balinese lexical middle negak 'sit'.

[FIGURE 6 OMITTED]

Kemmer arrives at her classification of event types as a result of her decision to "[deal] with ... middle-marking languages, or languages with overt morphological indications of the middle category" (Kemmer 1993: 10; bold face original, underline added). As pointed out in the discussion of causatives above, a strict form-based approach to the middle voice tends to focus on morphological middles, which is similar to the narrow treatment of morphological causatives, ignoring other possible form types. Such an approach would consider the Tarascan (Mexico) form ata-kurhi 'hit oneself' and the Quechua form maqa-ku 'hit oneself' as middles, while treating the English and Balinese equivalents hit oneself and nyagur awak as distinct reflexives. Perhaps Kemmer would consider oneself and awak here as "overt morphological indications of the middle category." But then, why is she distinguishing reflexive situations from the middle situations in her diagram reproduced in Figure 6? Also, what of the German form aufstehen 'stand up', which shows no middle marking? Is it not a middle because it lacks any morphological marking? It is semantically equivalent to the Balinese middle form ma-jujuk 'stand up'.

A more systematic typological investigation of the form-function correlation can be achieved if variation in form is taken as a function of the "naturalness" of the middle action. Natural middle actions--for example, sitting and walking--tend to be lexicalized as intransitive verbs, while actions typically directed to others--for example, hitting and kicking--tend to be expressed by periphrastic constructions involving a reflexive form when they are confined within the agent's personal sphere. What Kemmer (1993) has identified as middles--morphological middles--center on those actions that people typically apply to themselves, but that are applied to others often enough. (10) One must, however, realize that there are both intra- and crosslinguistic variations--such that in Balinese ma-jujuk 'stand up' has a morphological middle prefix, but negak 'sit (down)' is simply lexical. The same marking pattern is reversed in German, where sich hinsetzen 'sit down' has a middle marker, but aufstehen 'stand up' does not. These irregularities require individual accounts, based on historical, cognitive, and even cultural data.

The middle voice system has several important implications for our general understanding of the nature of voice phenomena. Recall that most of the widely received definitions of voice (such as the one quoted from Crystal [2003] in Note 1) hold that voice opposition does not entail a meaning contrast. This is not the case for the active/middle opposition, as shown by the examples above as well as by the contrast between the English active form John hit Bill and the middle form John hit himself.

Secondly, these examples show that voice alternations do not necessarily alter argument alignment patterns. There is no change in grammatical relation in the contrastive pairs in (18) and (19). If the situations depicted there give the impression of unusual utterances, consider the mundane situations described by the following Greek examples, where a meaning contrast is expressed without a realignment of arguments:

Ancient Greek

(20) a. louo khitona. (active)

wash.1SG.ACT shirt.ACC

'I wash a shirt.'

b. louomai khitona. (middle)

wash.1SG.MID shirt.ACC

'I wash my shirt/I wash a shirt for myself.'

While morphological middle constructions in some languages are strictly intransitive (as in the case of the Balinese ma-), and middles derived via the decausative function (as in the Greek forms poreusai 'to cause to go, to convey': poreusasthai 'to go' and kaiein 'to light, kindle': kaiesthai 'to be lighted, to burn') are intransitive, intransitivity is not a defining property of middle constructions. A large number of languages allow middle constructions that are syntactically transitive, as shown in the examples above and (21b) below, where the direct object is clearly marked by the accusative case suffix -n.

Amharic

(21) a. lemma te-lac' ce.

Lemma MID-shave.PERF.3M

'Lemma shaved himself.'

b. lemma ras-u-n te-lac' ce.

Lemma head-POSS.3M-ACC MID-shave.PERF.3M

'Lemma shaved his head.'

(Amberber 2000: 325, 326)

The general tendency for morphological middles to be intransitive is best viewed as the result of historical processes responding to the pressure on the form to conform to the semantic intransitivity, which characterizes middle events. This is exactly what has happened to many of the middle forms expressing reflexive middle situations in European languages, where the relevant affixes evolved from reflexive pronouns in the parent languages. The course of this development can be illustrated by using synchronic data below, where the Swedish example shows an intermediate clitic stage, the Russian form sebja exemplifies the earliest transitive pattern, and -s' (or -sja) the advanced fused pattern.

(22) a. Ivan ubi-l sebj-a. (Russian)

Ivan kill.PERF-PAST.SG.MASC self-ACC

'Ivan killed himself.'

b. Hon kamma-de sig. (Swedish)

she comb-PAST MID

'She combed.'

c. Ona prichesa-l-a-s'. (Russian)

she comb-PAST-FEM-MID

'She combed.'

Finally, in recognizing intransitive and transitive verbs as lexicalized middle and active voice forms, we elevate the active/middle contrast to the status of a central voice opposition observed in all human languages (cf. Dixon's [1979: 68-69] observation that "all languages appear to distinguish activities that necessarily involve two participants from those that necessarily involve one ... Then all languages have classes of transitive and intransitive verbs, to describe these two classes of activity"). (11)

Let us now turn to the antipassive voice. As the name suggests, the syntactic properties of antipassive constructions mirror somewhat those of passives, but the semantic aspect is different in these two voices. In the case of the passive, there is no implication that an agent is not somehow fully involved in the action. Indeed, full involvement of an agent is a crucial feature distinguishing the passive (e.g. John was killed while he was asleep) from the spontaneous middle (e.g. John died while he was asleep). Antipassive situations, by contrast, differ in meaning from those expressed in the active and the ergative voice with regard to the attainment of the intended effect upon a patient.

The intended effect of an action on a patient differs depending on the verb type. With contact verbs, the antipassive presents a situation as failing to make contact, as in the following examples:

Chukchee

(23) a. elteg=e keyn=en penre=nen.

father=ERG bear=ABS attack=3SG:3SG/AOR

'The father attacked the bear.'

b. elteg=en penre=tko=g[??]e

father=ABS attack=APASS=3SG.AOR

keyn=ete. (antipassive)

bear=DAT

'The father rushed at the bear.'

(Kozinsky et al. 1988: 652)

Warlpiri

(24) a. nyuntulu-rlu [??]-npa-ju pantu-rnu ngaju.

you-ERG [??]-2SG.A-1SG.P spear-PAST I.ABS

'You speared me.'

b. nyuntulu-rlu [??]-npa-ju-rla pantu-rnu

you-ERG [??]-2SG.A-1SG-DAT spear-PAST

ngaju-ku. (antipassive)

I-DAT

'You speared at me; you tried to spear me.'

(Dixon 1980: 449)

According to Dixon (1980: 449), (24b) above "indicates that the action denoted by the verb is not fully carried out, in the sense that it does not have the intended effect on the entity denoted by the object [read "patient", MS]." Similarly, visual contact is not made when situations involving visual perception are presented in the antipassive voice:

Warrungu

(25) a. nyula nyaka+n wurripa+[??].

3SG.NOM see+P/P bee+ABS

'He saw bees.'

b. ngaya nyaka+kali+[??] wurripa+wu katyarra+wu.

1SG.NOM see+APASS+P/P bee+DAT possum+DAT

'I was looking for bees and possums.'

(Tsunoda 1988: 606)

Moreover, for action types affecting a patient, the antipassive voice presents a situation as NOT affecting the patient in totality, as in the following examples:

Samoan

(26) a. Sā 'ai e le teine le i'a.

PAST eat ERG ART girl ART fish

'The girl ate the fish.'

b. Sā 'ai le teine i le i'a.

PAST eat ART girl LOC ART fish

'The girl ate some (of the) fish.'

(Mosel and Hovdhaugen 1992: 108)

The voice parameter focusing on the ergative/antipassive contrast can be formulated as below:

Ergative/antipassive opposition:

Does the action develop to its full extent and achieve its intended effect on a patient?

Yes → ergative(/active)

No → antipassive
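
The parameter above amounts to a yes/no decision on how far an action develops. The following is only a minimal sketch in Python of that decision, not part of the original analysis: the `Event` record and its field names (`develops_fully`, `achieves_intended_effect`) are hypothetical labels introduced for illustration, and the feature values assigned to examples (24) and (26) are our reading of the glosses and translations given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical record of how an action develops (illustrative names only)."""
    description: str
    develops_fully: bool            # does the action develop to its full extent?
    achieves_intended_effect: bool  # is the intended effect on the patient attained?


def ergative_or_antipassive(event: Event) -> str:
    """Answer the ergative/antipassive question for one event."""
    if event.develops_fully and event.achieves_intended_effect:
        return "ergative/active"
    return "antipassive"


# Feature values below are assumptions read off the translations of (24) and (26).
examples = [
    Event("(24a) 'You speared me.'", True, True),
    Event("(24b) 'You speared at me; you tried to spear me.'", False, False),
    Event("(26a) 'The girl ate the fish.'", True, True),
    Event("(26b) 'The girl ate some (of the) fish.'", False, False),
]

for ev in examples:
    print(f"{ev.description} -> {ergative_or_antipassive(ev)}")
```

The sketch treats the parameter as a conjunction: failing either to develop fully or to attain the intended effect is enough to place the event on the antipassive side.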

Notice that in (24b) an antipassive event is conveyed solely by the case marking on the patient, underscoring our earlier point that voice may be manifested in a nominal element denoting the relevant participant. In the case of the antipassive, the status of the patient is at issue, and antipassivization iconically affects the form of the patient nominal--either case marking it differently from the active/ergative (a case of the so-called differential object marking [Moravcsik 1978]), or avoiding coding it (examples below).

As conceived here, both the middle and the antipassive relate to the nature of the development of an action. Specifically, both have the ontological feature of an action not (totally) affecting a distinct patient. The conceptual affinity between the two explains the middle/antipassive polysemy seen in a fair number of languages. Observe:

Yidiny

(27) a. wagu:da bambi-dinu.

man.ABS cover-MID

'The man covered himself.'

b. wagu:da wawa-:dinu gudaganda.

man.ABS saw-APASS dog.DAT

'The man saw the dog.'

(Dixon 1977: 277, 280)

Balinese

(28) a. Ia sedek ma-sugi.

3SG ASP MID-wash.face

'She is washing her face.'

b. Tiang ma-daar.

1SG APASS-eat

'I ate.'

(Shibatani and Artawa 2002)

Russian

(29) a. Ivan mojetsja mylom.

Ivan wash.MID soap.INSTR

'Ivan washed himself with soap.'

b. Babuska rugajetsja.

granny.NOM scold.APASS

'Granny is scolding.'

(Geniusiene 1987: 9)

In addition, languages may show the well-known connection between the middle and the passive (12) through the use of the same form as the antipassive, thus illustrating a three-way middle-passive-antipassive polysemy:

Russian (cf. the examples immediately above)

(30) Dom stroitsja turezk-oj firm-oj

house.NOM is.being.built.PASS Turkish-INST firm-INST

INKA.

INKA

'The house is being built by the Turkish company INKA.'

Kuku Yalanji

(31) a. karrkay julurri-ji-y. (middle)

child.ABS wash-MID-NONPAST

'The child is washing itself.'

b. warru (yaburr-ndu) bayka-ji-ny. (passive)

young man.ABS shark:LOC:pt bite-PASS-PAST

'The young man was bitten (by a shark).'

c. nyulu dingkar minya-nga nuka-ji-ny. (antipassive)

3SG.NOM man.ABS meat-LOC eat-APASS-PAST

'The man had a good feast of meat (he wasted nothing).'

(Patz 1982: 244, 248, 255)

3.3. The termination of action parameter

In a regular transitive event, an action terminates in a patient. However, the action may extend beyond the patient and affect an additional entity, which then functions as a new terminal point. Benefactives/malefactives and applicatives express this kind of situation. The relevant parameter can be formulated in the following form:

Benefactive/malefactive/applicative parameter:

Does the action develop further than its normal course, such that an entity other than the direct event-participants becomes a new terminal point registering an effect of the action?

No → active/middle

Yes → benefactive/malefactive/applicative

While the notion of benefit-giving is a broad one, there is one particular type with a perceptible change in the beneficiary. This is the case involving transfer of an object, where the object itself is directly affected by the act of giving. In a typical giving situation, the object is physically moved from one owner to a new one. The recipient beneficiary is secondarily affected because it comes into possession of the transferred object. Languages often have a special benefactive construction that portrays this type of situation, where the effect on the beneficiary is indicated by its argument status in syntactic coding. As shown in Shibatani (1996), benefactive constructions are typically based on the syntactic schema of the give-construction even involving the verb form for giving in some languages, as in the case of Japanese seen below:

(32) a. Taroo-wa Hanako-ni hon-o yat-ta.

Taro-TOP Hanako-DAT book-ACC give-PAST

'Taro gave Hanako a book.'

b. Taroo-wa Hanako-ni hon-o kat-te

Taro-TOP Hanako-DAT book-ACC buy-CONJ

yat-ta.

BEN-PAST

'Taro bought Hanako a book.'

In (32b) the buying action is extended beyond the patient (the book), and affects the beneficiary nominal (Hanako) coded in the dative form. Compare this construction to the one below, expressing a more general benefit-giving in which the beneficiary takes on a nonargument form.

(33) Taroo-wa Hanako-no tame-ni hon-o

Taro-TOP Hanako-of sake-for book-ACC

kat-te yat-ta.

buy-CONJ GIVE-PAST

'Taro bought a book for (the sake of) Hanako.'

While (33) may express any type of benefit-giving--including one of buying a book to help Hanako's book-selling business--(32b) specifically conveys the meaning that the transfer of the book was intended. Note also the English translations accompanying these examples, which show the same contrast.

Benefactive/malefactive events are also realized by so-called external possession constructions in Indo-European and some other languages (cf. Payne and Barshi 1999), although the context may determine whether or not a clear benefactive/malefactive reading obtains from them. When a body part is involved as the primary patient (cf. below), the benefactive/malefactive reading is not strongly pronounced beyond that which is conveyed by the verb; cf. (34) and (35a):

German

(34) Ich wasche mir die Hände.

I wash I.DAT the hands

'I wash my hands.' (lit. 'I wash me the hands.')

(35) a. Man hat ihm den Arm gebrochen.

lit. 'They broke him the arm.'

b. Man hat seinen Arm gebrochen.

'They broke his arm.'

Where inalienable possession is implicated as above, the dative nominal indicates that the action has affected it as a new terminal point of the action. In German, the external possession construction is generally obligatory when the affected body part is inalienably possessed; the extension of the action to its owner is inevitable under such circumstances. Indeed, an internal possession construction like (35b) suggests that the arm in question was detached, and no effect on its owner is asserted by such a sentence. Internal possession constructions involving inalienably possessed body parts, as in the English form I broke his arm, suggest that the arm's owner was affected, but the implication is obtained through a commonsensical world view. The dative construction (35a), on the other hand, asserts that the body part owner is affected by the action.
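
The contrast between (35a) and (35b) can be restated in terms of the termination parameter formulated earlier: is there an affected entity beyond the direct event-participants? The following Python sketch is an illustration under stated assumptions rather than the article's own formalism. The `Action` record, the function names, and the participant labels are all hypothetical, and the sets of affected entities encode only what each sentence is said above to assert.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical record of an action and what it is asserted to affect."""
    description: str
    direct_participants: frozenset  # agent and patient of the basic event
    asserted_affected: frozenset    # entities asserted to register an effect


def new_terminal_points(action: Action) -> frozenset:
    """Affected entities that are not direct event-participants."""
    return action.asserted_affected - action.direct_participants


def termination_voice(action: Action) -> str:
    """Answer the benefactive/malefactive/applicative question."""
    if new_terminal_points(action):
        return "benefactive/malefactive/applicative"
    return "active/middle"


# (35b) internal possession: only the arm is asserted to be affected.
internal = Action("(35b) Man hat seinen Arm gebrochen.",
                  frozenset({"they", "arm"}), frozenset({"arm"}))
# (35a) dative construction: the arm's owner is asserted to be affected too.
dative = Action("(35a) Man hat ihm den Arm gebrochen.",
                frozenset({"they", "arm"}), frozenset({"arm", "him"}))

for act in (internal, dative):
    extra = sorted(new_terminal_points(act))
    print(f"{act.description} -> {termination_voice(act)} {extra}")
```

Modelling what is asserted, rather than what is inferred from world knowledge, keeps the sketch in line with the point that an internal possession construction may imply an effect on the owner without asserting it.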

The benefactive/malefactive reading can be seen more readily in the following examples, where the dative nominal represents a mentally affected party:

French

(36) a. Jean lui a cassé sa vaisselle.

lit. 'Jean broke her her dishes.'

b. Jean a cassé sa vaisselle.

'Jean broke her dishes.'

Modern Hebrew

(37) a. ha tinok lixlex li et ha xulca.

the baby dirtied I.DAT ACC the shirt

'The baby dirtied the shirt on me.'

b. ha tinok lixlex et ha-xulca shel-i.

the baby dirtied ACC the-shirt of-me

'The baby dirtied my shirt.'

(Berman 1982; T. Gibon pers. comm.)

Where inalienable possession is not evident, as in these examples, a malefactive meaning obtains more readily. The trade-off between inalienability and affective reading shows that a principle of relevance is at work in these constructions: the relevance of the dative arguments to the event must be somehow "guaranteed." Involvement of an inalienably possessed object guarantees the relevance of the possessor to the event, since whatever happens to the body part will affect its possessor automatically. When an inalienable possession relation does not obtain--as in (36a) and (37a)--a benefactive/malefactive effect upon the dative argument is pronounced as a way of establishing its relevance to the event. The attendant interpretation that a possessive relation exists contributes to the establishment of the affective relationship; the owner of an object is more easily affected by what happens to its possession.

Contrary to what the label suggests then, so-called external possession constructions DO NOT assert a possessive relation between the dative argument and the directly affected patient. Indeed, the relevant constructions arise independently from externalization of the possessor, as in the German example below (also in [36a] above), or when the notion of possession is irrelevant, as in the following examples (40)-(41) from River Warihio (Uto-Aztecan): (13)

German

(38) Peter repariert mir mein Fahrrad.

'Peter fixes me my bicycle.'

River Warihio

(39) a. hustina pasu-re muni kukuci icio.

Agustina cook-PERF beans children BEN

'Agustina cooked beans for the children.'

b. hustina pasu-ke-re muni kukuci.

Agustina cook-BEN-PERF beans children

'Agustina cooked beans for the children.'

(40) maniwiri no'o wikahta-ke-ru yoma aari.

Manuel 1SG.NS sing-BEN-PERF all afternoon

'Manuel sang all afternoon for me.'

(41) tapana no'o yuku-ke-ru.

yesterday 1SG.NS rain-BEN-PERF

'Yesterday it rained on/for me.'

(Felix 2005: 253, 257, 258)

That the condition of physical proximity should be more important than the possessive relation in inducing a benefactive/malefactive construction is shown by the following River Warihio examples (see Shibatani 1994 for other cases):

(42) a. maniwiri ihcorewapate-re wani pantaoni-ra.

Manuel get.dirty-PERF John jeans-POSS

'Manuel dirtied John's jeans.' (John's jeans were over the chair.)

b. maniwiri ihcorewapate-ke-re pantaoni wani.

Manuel get.dirty-BEN-PERF jeans John

'Manuel dirtied John's jeans.' (John was wearing his jeans.)

In general, applicative constructions have been considered as syntactic valency-increasing operations that are pragmatically motivated (see Peterson 1999). Our claim is that their conceptual basis is rooted in the ontological feature of an action, as stated in the voice parameter above. Peterson's (1999) survey shows that certain applicatives are more basic and prevalent than others. In the words of Peterson (who lumps benefactives and applicatives together), "the locative and circumstantial applicatives depend on the presence of other applicative constructions, while benefactive and instrumental/comitative applicatives do not. That is, there are two core applicative constructions, benefactive and instrumental/comitative, and these serve as anchors as it were for the development of additional applicative constructions marked either by the same or distinct morphology" (Peterson 1999: 135). This observation is consistent with our view of the benefactive/applicative voice. Benefactive and instrumental/comitative participants are much more directly involved in the event than a causal factor or a setting entity such as a location, and hence much more likely to be affected by the action. That the benefactive applicative is obligatory in some languages also underscores the point regarding the affected nature of the recipient beneficiary (cf. above).

In the past, grammarians may not have paid sufficiently close attention to the subtle meaning differences that exist between applicative constructions and their nonapplicative counterparts. However, recent descriptions of applicative constructions have begun to notice some revealing semantic effects. For example, Donohue (1999) shows that the Tukang Besi comitative applicative conveys a meaning whereby the applied comitative nominal is actively engaged in the event: (14)

Tukang Besi

(43) a. No-moturu kene wowine ane ke hotu mopera.

3R-sleep and woman exist and hair short

'He slept with the woman with the short hair.'

(i.e. they were sleeping near each other.) (# they had sex together.)

b. No-moturu-ngkene te wowine ane ke hotu

3R-sleep-COM CORE woman exist and hair

mopera.

short

'He slept with the woman with the short hair.' (i.e. they had sex together.)

(Donohue 1999: 231)

The following instrumental applicative from Pulaar also demonstrates how an applied instrumental nominal can implicate a participant more thoroughly affected by the agent's action:

Pulaar

(44) a. mi loot-ii min am a

1SG wash-PERF.ACT y.s. 1SG.POSS PREP

saabunnde hee.

soap DET

'I washed my younger sibling with (some of) the soap.'

b. mi loot-r-ii min am

1SG wash-INST-PERF.ACT y.s. 1SG.POSS

saabunnde hee.

soap DET

'I washed my younger sibling with (all of) the soap.'

(Sebastian Ross-Hagebaum pers. comm.)

The various effects of locative applicatives have also been recognized in the literature. The Balinese locative expression in (45b) below, for example, describes a situation where the action of planting banana trees extends in such a way as to affect the garden. Here the entire garden ends up being planted with banana trees, while no such implication is made in the nonapplicative counterpart (45a).

Balinese

(45) a. Tiang mulan biyu di tegalan tiang-e.

1SG plant banana in garden 1SG-POSS

'I planted bananas in my garden.'

b. Tiang mulan-in tegalan tiang-e biyu.

1SG plant-APPL garden 1SG-POSS banana

'I planted my garden with bananas.'

(I. Wayan Arka pers. comm.)
