1. Introduction
The South American continent harbors exceptional linguistic diversity, as it shows the highest proportion of language families and isolatesFootnote 1 per continent (Campbell, 2012; Seifart & Hammarström, 2017:260). This raises questions about the history of the settlement of South America, the most recently populated continent according to current scientific knowledge (e.g. O’Connor & Kolipakam, 2014). At the same time, within South America, some features, such as a five-member vowel system, extensive classifier system, or cross-referencing of only one argument on the verb have been found to be widely shared among languages from different stocks specifically within the area called Amazonia (Dixon & Aikhenvald, 1999:8–10). Such patterns raise the question of whether they are due to contact or are remnants of deep genetic relationships (Epps, 2009), and their close examination is instrumental in the debate on whether the Amazonian basin forms a linguistic area (see a summary in Epps & Michael, 2017).
As descriptions of Amazonian languages increase in both number and quality, work identifying macro-areas on the basis of relatively fine-grained linguistic phenomena will probably become more common. A promising example of this type is Guillaume & Rose’s (2010) suggestion that sociative causatives may be an areal feature of southwest Amazonia, with the distribution of sociative causatives outside this area attributed to the spread of Tupían languages from their southwestern homeland. A systematic examination of the distribution of such morphemes both within South America and beyond is an obvious target for future research. (Epps & Michael 2017:952)
In this work, we re-evaluate the global distribution of dedicated expressions of sociative causation in a worldwide sample of over 300 languages. Our results confirm the main hypotheses of Guillaume & Rose (2010) and establish dedicated sociative causative constructions as a rare phenomenon worldwide, with the notable exception of South America.
1.1. Sociative causation
Sociative causation is a particular type of causation where the causer not only makes the causee do an action but also participates in it (Shibatani & Pardeshi, 2002; Zúñiga & Kittilä, 2019).Footnote 2 It is further divided into three semantic sub-types (Shibatani & Pardeshi, 2002), illustrated by the examples in (1), where sociative causation is expressed by a regular causative marker.

Shibatani & Pardeshi (2002:148ff.) consider sociative causation as an intermediate category between direct and indirect causation on the causative continuum and show that it is usually expressed either by direct or indirect causative markers, as in (1). Languages can also have markers dedicated to this category (see also Kulikov, 2001:892), as is the case with the sociative causative ha- in Alamblak (2b).

1.2. Previous research on the distribution of sociative causation
Guillaume & Rose (2010) presented a preliminary worldwide survey of the dedicated expression of sociative causation by grammatical morphemes and listed 17 such cases, the large majority of which were found in South America (Map 1). The authors thus hypothesized that sociative causatives constitute an areal feature of South American languages, more precisely south-western Amazonia. Since more than half of the 15 languages listed in South America belong to the Tupian family, the authors also suggested that the feature could have developed in the Tupian family first, and then diffused to neighboring languages of other stocks. This hypothesis was supported by the likely origin of the Tupian family in the same area (Rodrigues, 1999:108).

Map 1. Survey of sociative causative markers in the world (Guillaume & Rose, 2010:390).
Due to its exploratory nature, Guillaume & Rose’s (2010) survey has certain limitations, some of which are already acknowledged in the article itself. First, the sample was neither geographically nor genetically balanced, and included only positive examples, i.e. languages where a sociative causative morpheme was attested, making it impossible to quantify the feature’s prevalence. Additionally, the results could be biased not only by the authors being Amerindianists, but also by the descriptive tradition of valency-changing mechanisms in general and sociative causation in particular in South America (at least since the sixteenth century with Anchieta, 1595:48–49, on Tupinambá). From our experience in the current work, we have observed that descriptions of South American languages almost systematically account for valency-changing derivations, while such sections are strikingly absent from many grammars of North American languages. As for sociative causation, for Tupian languages (and most notably languages from the Tupi–Guarani branch), the usual template for grammars includes a section on “comitative causative.” As a consequence, we think that a dedicated sociative causative marker is more likely to be described for South American languages, and under a rather transparent label.
Pöllänen (2022, 2024) is a follow-up study to Guillaume & Rose (2010) focusing on the “core” geographical area where most sociative causative markers had been found, and widening the scope to non-morphological means.Footnote 4 The survey examines a genealogically balanced sample of 32 languages from a zone covering western and southern Amazonia, the Andes, the dry Chaco Basin area and the Atacama Desert, and finds two more languages with a dedicated sociative causative marker. A detailed account of the discrepancies between Guillaume & Rose (2010), Pöllänen (2022), and the present study is available in Supplementary material 3.
Section 2 describes the aims and methodology of the present study (sample, questions, and coding). Section 3 presents the results and Section 4 discusses them. Section 5 offers a summary of the paper.
2. Aims and methodology
2.1. Aims
Guillaume & Rose (2010) have hypothesized that the presence of dedicated sociative causative markers is an areal feature of South America, with a cluster in south-western Amazonia, on the basis of their pilot survey of dedicated sociative causative markers in a worldwide convenience sample.
The major aim of the present paper is precisely to reassess the spatial distribution of dedicated sociative causative markers on the basis of a survey on a large worldwide sample of 325 languages. A secondary aim is to widen the scope beyond morphemes by including syntactic constructions that could be dedicated to sociative causation.
2.2. Language sample
The language sample for this study was built within a larger multidisciplinary project (Out of Asia SNSF Sinergia project) which, among other aims, was designed to (re)assess known linguistic areas and discover new ones in the Americas. The 325-language sample designed for this project over-samples American languages. It includes 220 languages of the Americas and 105 languages from other parts of the world, i.e. 25 languages per macro-area.Footnote 5 Within each macro-area (as defined in Hammarström & Donohue, 2014), our language sample maximizes phylogenetic diversity and favors isolates over language families, while trying to cover as much geographical space as possible. A consequence of the American bias in the sample and the maximization of phylogenetic diversity is that it is almost only in the Americas that a stock has several representative languages in the sample. Another consequence is that our results are more telling for the Americas than for the other macro-areas, where they may be under- or over-estimated. The geographical and phylogenetic distributions of the languages are illustrated in Map 2 and summarized in Table 1. Language names, genetic and macro-areal classifications, as well as geographical coordinates, follow Glottolog (Hammarström et al. 2022).

Map 2. Geographical and phylogenetic distributions of the languages of the sample.
Table 1. Geographical and genetic distribution of the languages in the sample

a Two stocks appear in both North and South America: this is why the total of stocks is reduced by 2.
2.3. Questions and coding
All languages of the sample have been coded for the three following questions, as part of a larger questionnaire on sociative causation available in Supplementary material 1, also serving as a coding guide.
(i) Does the language have a dedicated construction to express sociative causation?
This question, labelled SocCaus.01 in the questionnaire, targets the presence of a dedicated construction for the expression of sociative causation, whatever the means of grammatical expression. We consider a construction to be dedicated when sociative causation is expressed by a grammatical device which exclusively expresses sociative causation. Possible answers are “yes” or “no.”
(ii) If yes to the preceding question, what kind of construction is sociative causation expressed with?
This second question (SocCaus.02 in the questionnaire) concerns the form of the sociative causative construction, i.e. whether it is a dedicated morpheme or a specialized combination of morphemes. This is in line with Pöllänen (2022, 2024) but contrasts with Guillaume & Rose (2010), who exclusively searched for grammatical morphemes and excluded periphrasis and complex predication (see Supplementary material 3 for details of the discrepancies across studies).
(iii) Does the language use a non-dedicated construction to encode sociative causation?
The last question (SocCaus.04) codes for non-dedicated expressions of sociative causation. An example is when a language has a causative marker that sometimes entails sociative causation. Beyond its interest for typologists, the aim of this question within the present study is to observe the distribution of the various expressions of sociative causation in different macro-areas, without a restriction to dedicated constructions.
In the remainder of the section, we describe the coding process. Because the concept of sociative causation is still not widespread, it is not always clearly identified as present or absent by grammar authors.Footnote 6 Hence the valency-changing sections of grammars were carefully scrutinized, and dozens of keywords like “help,” “aid,” “together” were systematically searched. Also, the semantics of causation is often not explored.Footnote 7 As a result, examples and their contexts were carefully reviewed and possible interpretations were examined and discussed among authors and research assistants (see the case of Mojeño Trinitario in the next paragraph). Consistent sociative causation meaning throughout the examples was judged necessary to consider the construction to be dedicated and yield a “yes” to the first question. In case of inconsistent meaning, we considered the polyfunctional construction not to be a dedicated sociative causation construction (“no” to the first question). Consequently, we sometimes disregarded constructions analyzed as sociative causatives in other papers, i.e. expressing sociative causation among other functions, and even as their “primary function.” In the case of uncertainty, we contacted the grammar authors when possible (see Acknowledgments). Edge cases were always discussed collectively.
An example of a polyfunctional morpheme is the Mojeño Trinitario prefix im-∼em-. Its occurrence in (3a) is a good illustration of the expression of sociative causation. However, other occurrences like (3b) express causation only, while some rare occurrences like (3c) seem to express sociative only (or with a very weak causation). Despite occurrences like (3a), we did not consider im-∼em- to be a dedicated sociative causative morpheme.

Our methodology is thus quantitative in terms of the number of languages surveyed, and qualitative in terms of the care provided in harvesting and coding the data.
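The coding criterion described in this section can be summarized as a small decision rule. The following is only an illustrative sketch, not the coding procedure actually used: the `Example` class, its two boolean fields, and the choice to code a construction without examples as "no" are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One attested occurrence of a construction (hypothetical encoding)."""
    causation: bool  # the causer brings about the event
    sociative: bool  # the causer also takes part in the event


def is_dedicated(examples):
    """Return True only if every example expresses sociative causation.

    Assumption: a construction with no usable examples is coded "no".
    """
    if not examples:
        return False
    return all(ex.causation and ex.sociative for ex in examples)


# Hypothetical encoding of the three Mojeño Trinitario patterns in (3):
# (3a) sociative causation, (3b) causation only, (3c) sociative only.
mojeno_im = [Example(True, True), Example(True, False), Example(False, True)]

# A hypothetical marker whose examples are all sociative causative.
consistent_marker = [Example(True, True), Example(True, True)]

print(is_dedicated(mojeno_im))          # inconsistent meaning -> "no"
print(is_dedicated(consistent_marker))  # consistent meaning -> "yes"
```

In practice the judgment for each example rested on context and discussion among coders, which a boolean flag cannot capture; the sketch only makes explicit that a single non-sociative or non-causative reading is enough to yield a "no".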
3. Results
The raw data are given in Inman et al. (2025)Footnote 8 and Supplementary material 2. The feature set ‘Categorical genderlect’ in Inman et al. (2025) gives the full coding for the three questions and the 325 languages. Supplementary material 2 details each of the dedicated constructions in our sample, with information on the author’s label of the construction, its attested semantics and illustrative examples.
This section highlights the major results. Section 3.1 presents the quantitative and geographical distribution of languages with dedicated sociative causative constructions in our sample. Section 3.2 examines the possible forms of the dedicated constructions. Section 3.3 measures the use of a non-dedicated construction to encode sociative causation.
3.1. Dedicated constructions for sociative causation: how many and where
Positive answers about the presence of a dedicated construction for sociative causation amount to 19 out of 325 languages. The detailed list is given in Table 2. Languages with a dedicated sociative causative construction amount to 5.8% of our sample, while negative answers amount to 94.2%, with 306 languages. We can conclude that sociative causative constructions are non-marginally present, as roughly one in 17 languages of our sample has them.
Table 2. Languages with dedicated sociative causative constructions

However, the spatial distribution of dedicated sociative causative constructions is very skewed towards South America (see Map 3). As detailed in Table 2, of the 19 languages with a dedicated construction for sociative causation in the sample, 15 are spoken in South America. The 15 South American languages with dedicated constructions for sociative causation belong to ten stocks: eight of them individually belong to different stocks (Barbacoan, Guahiboan, Naduhup, Cahuapanan, Harakmbut, Nuclear-Macro-Je, Quechuan, and one isolate, Yuracaré), five are Tupian languages, and two Pano-Tacanan. Note that our sample also includes another six Tupian, two Pano-Tacanan, eight Nuclear-Macro-Je, and four Quechuan languages that do not display a sociative causative construction.Footnote 9 The four languages with dedicated sociative causative constructions spoken outside of South America are Nama (Khoe-Kwadi, Namibia, Botswana, and South Africa), Galo (Sino-Tibetan, India), Marind (Anim, Papua New Guinea and Indonesia), and Alamblak (Sepik, Papua New Guinea).

Map 3. Worldwide distribution of dedicated sociative causative constructions.
This means that among the South American languages of the sample, 14.3%, or one in seven languages, have a dedicated sociative causative construction. On the other hand, if we only consider the languages spoken outside of South America in the sample, the prevalence of dedicated constructions for sociative causation falls to only 1.8%. Even when taking into account the effect of the American bias, the results are suggestive of a certain skewing in the geographical distribution of dedicated sociative causative constructions in the languages of the world.
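The percentages reported in this section follow directly from the counts given in the text. As a minimal check, the sketch below recomputes them; the South American totals are not stated as such in this section and are derived here by subtracting the figures for languages outside South America (220 languages, 4 positive cases) from the full sample (325 languages, 19 positive cases).

```python
def prevalence(positive, total):
    """Share of languages with a dedicated construction, in percent."""
    return 100 * positive / total


# Counts stated in the text.
total_langs, total_pos = 325, 19
outside_sa_langs, outside_sa_pos = 220, 4

# Derived counts for South America (by subtraction).
sa_langs = total_langs - outside_sa_langs
sa_pos = total_pos - outside_sa_pos

rows = [
    ("Whole sample", total_pos, total_langs),
    ("South America", sa_pos, sa_langs),
    ("Outside South America", outside_sa_pos, outside_sa_langs),
]
for label, pos, n in rows:
    print(f"{label}: {pos}/{n} = {prevalence(pos, n):.1f}%")
```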
Our sample is unbalanced for the number of languages and stocks taken into consideration for each macro-area. In order to balance these, we generated 250 random subsamples, with 15 languages of different stocks for each macro-area. The average presence of dedicated sociative causative constructions in each of these macro-areas is very similar to the one in our sample (compare Table 4 with Table 3), confirming that the American focus in our sample is not distorting macro-areal differences. In all cases, the presence of dedicated sociative causative constructions in South America remains noteworthy.
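One way such a balancing procedure could be implemented is sketched below. This is not the script used for the study: the function name, the tuple layout of the input, the fixed random seed, and the toy data are all assumptions made for illustration. For each macro-area, the sketch repeatedly draws a fixed number of distinct stocks, picks one language per drawn stock, and averages the share of languages with a dedicated construction over all draws.

```python
import random
from collections import defaultdict
from statistics import mean


def subsample_prevalence(languages, per_area=15, n_draws=250, seed=0):
    """Average share of positive languages per macro-area over random subsamples.

    languages: iterable of (macro_area, stock, has_dedicated) tuples.
    Each subsample contains at most one language per stock.
    """
    rng = random.Random(seed)

    # Group the coded values by macro-area, then by stock.
    by_area = defaultdict(lambda: defaultdict(list))
    for area, stock, has_dedicated in languages:
        by_area[area][stock].append(has_dedicated)

    result = {}
    for area, stocks in by_area.items():
        k = min(per_area, len(stocks))  # cannot draw more stocks than exist
        shares = []
        for _ in range(n_draws):
            chosen = rng.sample(sorted(stocks), k)          # k distinct stocks
            picks = [rng.choice(stocks[s]) for s in chosen]  # one language each
            shares.append(sum(picks) / k)
        result[area] = mean(shares)
    return result


# Toy data (hypothetical): 20 single-language stocks per area,
# 3 positive stocks in AreaA and none in AreaB.
toy = (
    [("AreaA", f"a{i}", i < 3) for i in range(20)]
    + [("AreaB", f"b{i}", False) for i in range(20)]
)
averages = subsample_prevalence(toy)
for area, share in averages.items():
    print(f"{area}: {share:.3f}")
```

Drawing stocks first and languages second keeps large families from dominating a subsample, which is the point of requiring "languages of different stocks" in each draw.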
Table 4. Prevalence of dedicated sociative causative constructions across 250 subsamples

Table 3. Dedicated sociative causative constructions across macro-areas

3.2. Forms of the dedicated constructions for sociative causation
Our survey targeted any grammatical construction dedicated to the expression of sociative causation, so as to include languages which express sociative causation with dedicated devices other than a dedicated morpheme, e.g. a combination of a causative morpheme with some other marker on the verb or on the causee.
The results are given in the last column of Table 2. Most languages with a dedicated sociative causative construction encode it through a dedicated morpheme (17/19). Only two languages do not: Ese Ejja combines a causative and a comitative applicative marker, and Yuracaré uses a specific object paradigm associated with the absence of a valency-changing marker. These three types of dedicated constructions are presented next.
Dedicated sociative causative morphemes are illustrated with Marind and Teko in (4).

The two cases of dedicated sociative constructions encoded differently than with a dedicated sociative causative morpheme merit special attention. We first look at Ese Ejja, which expresses sociative causation with a combination of a causative and an applicative marker. The marker -sawa is a comitative applicative which can occur on its own to encode co-participation only, as in (5a). It most frequently combines with the causative -mee to express sociative causation: in (5b), the subject helps the people to learn (lit. make them know) their language, i.e. participates in their learning.Footnote 10

In Ese Ejja, the combination of the sociative and the causative marker seems to systematically express sociative causation, which is not necessarily the case cross-linguistically. In Yimas, this same combination gives rise to a caused event with an additional causee (in the dative), rather than involving the causer in the caused event.

Finally, Yuracaré expresses sociative causation by prefixing a special paradigm of object indexes onto intransitive verbs. Example (7a) illustrates the intransitive verb root yupa- ‘go in’, which has a zero-marked third person subject. Example (7b) demonstrates that this intransitive root needs some valency-changing process to be used transitively. Here causation is encoded through reduplication of the final syllable, and van Gijn postulates the presence of zero A and P third person affixes. Finally, example (7c) shows this same intransitive verb root with only an overt, third person object prefix ka-, which not only transitivizes the verb but also implies the semantics of sociative causation.

The construction in (7c) is a dedicated sociative causation construction because the use of this object indexing paradigm on intransitive verbs necessarily triggers the sociative causation meaning. Note that this paradigm differs from the regular object indexing paradigm only in the third person (Ø- for the ‘regular’ object indexing paradigm vs. ka- for the sociative causative). This means that with first and second person, it is the mere presence of the object index (and the absence of a valency-changing marker) which induces sociative causation. We have consequently considered this grammatical expression to be a construction rather than a single morpheme.
The Yuracaré sociative causative marker partially recalls the “special causee marking” in Punjabi, a type of dedicated construction reported in only one language in Guillaume & Rose (2007), and not attested in our sample. As shown in (8a), the verb form cukvãiã is causativized with a causative suffix, which expresses a regular causative if the causee is encoded with an ablative. In (8b), the same causativized verb form cukvãiã along with the accusative/dative marker nũ marking the causee expresses sociative causation.

The choice of including formal means other than dedicated morphemes does not affect the results much: the great majority of dedicated expressions of sociative causation are morphemes, found in 17 of the 19 languages. Pöllänen (2024:23) rightly notes that this is expected given the tendency towards rich and agglutinative verbal morphology in western South American languages.
3.3. Non-dedicated constructions to encode sociative causation: where?
This last result targets the non-dedicated constructions encoding the sociative causation meaning. By non-dedicated construction, we mean that the language makes use of a construction which can but does not systematically express sociative causation. Example (9) illustrates the use of the comitative marker mu- in Totontepec Mixe, which adds an instrument to the argument structure of the verb in (9a), but serves as a marker of sociative causation with motion verbs involving inanimate objects, as in (9b).

We will not go into the details of the particular constructions because our goal in surveying these non-dedicated constructions within this article is to contrast their geographical distribution with that of dedicated constructions (Section 3.1), rather than to refine the existing typology of the expression of the sociative causation meaning.Footnote 11
Table 5 presents the distribution of non-dedicated constructions for sociative causation in the different macro-areas. It is not geographically skewed, and in particular, there is no strong bias towards South America: on average, 31.5% of the South American languages of the sample present a non-dedicated construction, while the average for the total sample is 30.5%.
Table 5. Non-dedicated sociative causative constructions across macro-areas

4. Discussion
Section 4.1 discusses the worldwide presence and distribution of dedicated sociative causative constructions, while Sections 4.2 and 4.3 discuss their genetic and geographical distribution, respectively.
4.1. Worldwide presence of dedicated sociative causative constructions
Our large sample allows us to evaluate the overall frequency of dedicated sociative causative constructions. They are found in 19 languages out of 325, about 5.8% of the languages of the sample. As such, they cannot be considered extremely rare. However, because their presence is denser in South America, their prevalence outside of South America is particularly low, and even negligible, with four cases out of 220 languages spoken outside of South America (< 2%).
The present paper is not meant to be a comprehensive, worldwide report of dedicated sociative causative markers in the literature, but a survey of their distribution in a large, balanced and worldwide sample. For a fuller inventory, one should add to the 19 languages of our study the nine additional ones listed in Guillaume & Rose (2010), and the one additional language in Pöllänen (2022, 2024). However, this would require an assessment of all reported cases, as we do not necessarily endorse the analyses by these authors. Supplementary material 3 documents the cases of disagreement between the three studies, but does not assess the analysis of languages which are not part of our sample.
The absence of sociative causative constructions from North American languages is very telling: they are completely absent from a total of 115 languages distributed in 62 language stocks. By contrast, we cannot exclude that sociative causative constructions might be more present in Africa, Papunesia, and Eurasia because the number of languages investigated in these macro-areas is much lower (25 to 30). Sociative causatives are plausibly present in more than one or two languages in these areas, especially given the biases explained in the methodology. This represents an exciting topic to investigate in the future.
4.2. Genetic distribution
Languages with dedicated sociative causative constructions in our sample belong to 14 different stocks (13 language families and an isolate, Yuracaré). Of these 13 families, only four, all from South America, are represented by more than one language in our sample. Two of the four families display a dedicated sociative causative construction in several languages: two out of four Pano-Tacanan languages, and five out of 11 Tupian languages.
The third and fourth families, the Nuclear-Macro-Je and Quechuan families, albeit represented by nine and five languages respectively, have only one language each with a dedicated marker (Krenak and Yauyos Quechua).
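The family-level figures can be laid out side by side for comparison. The sketch below only restates counts given in the text (languages with a dedicated construction, and languages of that family in the sample); the dictionary layout is an illustrative choice.

```python
# (languages with a dedicated construction, languages in the sample)
families = {
    "Tupian": (5, 11),
    "Pano-Tacanan": (2, 4),
    "Nuclear-Macro-Je": (1, 9),
    "Quechuan": (1, 5),
}

for name, (positive, sampled) in families.items():
    without = sampled - positive  # languages lacking a dedicated construction
    print(f"{name}: {positive}/{sampled} with, {without} without "
          f"({positive / sampled:.0%})")

# Languages with a dedicated construction in these four families; the other
# South American cases belong to stocks with a single language in the sample.
in_multi_language_families = sum(pos for pos, _ in families.values())
print(in_multi_language_families)
```

The "without" counts (six, two, eight, and four) match the figures given in Section 3.1 for languages of these families that lack a sociative causative construction.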
Within the Pano-Tacanan stock, Cavineña and Ese Ejja, both of the Tacanan branch, show a dedicated construction. However, the two constructions are not directly related diachronically: the construction in Cavineña is based on a dedicated sociative causative morpheme -kere and the construction in Ese Ejja is bimorphemic with the causative -mee and the sociative -sawa.Footnote 12 Note that the two languages from the Panoan branch present in the sample have non-dedicated constructions to express sociative causation, so that the semantic domain of sociative causation might be particularly salient in the Pano-Tacanan stock, whether expressed by a dedicated construction or not.
As for the Tupian stock, a dedicated sociative causative morpheme is found in five languages of the survey, belonging to four different branches of the family: Teko and Aweti belong to the large Maweti–Guarani branch, Karo to the Purubora-Ramarama branch, Mekens to the Tuparic branch and Mundurukú to the Mundurukuic branch. As mentioned above, these dedicated markers are traditionally called “comitative causative” and are considered cognates and reflexes of the Proto-Tupi verbal prefix *eɾʲe- ∼eɾʲo- reconstructed by Rodrigues & Cabral (2012:509, 531–533).Footnote 13 As for the six Tupian languages within our sample that do not show a dedicated sociative causative marker, they belong either to branches with no reflex of *eɾʲe- ∼eɾʲo- (Gavião do Jiparaná, Jurúna, Karitiâna), or to the Maweti–Guarani branch (Tupinambá, Paraguayan Guarani, and Cocama-Cocamilla). In Tupinambá, the functions of the reflex of *eɾʲe- ∼eɾʲo- cover non-causative meanings, so that we do not treat it as a dedicated sociative causative marker (see Section 2.3 on the exclusion of polyfunctional morphemes). Paraguayan Guarani and Cocama-Cocamilla have lost the dedicated sociative causative marker. In Paraguayan Guarani, the reflex of Maweti–Guarani *erʲe- (Corrêa da Silva, 2010:218) is not fully productive and its combination with some verb roots acquired a conventionalized meaning (Estigarribia, 2020:218–219). In Cocama-Cocamilla, it has fossilized; we uncovered it in a few verb roots, but it is not recognized as an independent morpheme in grammar descriptions. The Tupian family thus constitutes a nice showcase for dedicated sociative causative markers, with a suggested reconstructed form, a hypothetical lexical source for it,Footnote 14 inheritance with formal differentiation through a number of branches,Footnote 15 and some examples of loss of the dedicated marker.
4.3. Geographic distribution
Section 3.1 states that the distribution of dedicated sociative causative constructions is skewed towards South America. In contrast, Section 3.3 highlights that there is no such bias for non-dedicated constructions. This shows that the general concept of sociative causation is expressed to a similar degree around the world, but tends to grammaticalize almost exclusively in South America, confirming the clearly areal status of dedicated sociative causative constructions.
The prevalence of dedicated sociative causative constructions in South America cannot be explained by chance, as they are almost absent from the rest of the world, nor only by genetic inheritance, since they are found in many different stocks in South America. Consequently, their particular geographical distribution seems to be explainable as a result of diffusion across languages. A further contribution of this paper is to point out that dedicated sociative causative constructions may in the future serve as an excellent feature to observe diffusion of linguistic features in an area, and to argue for a linguistic area. As already presented in the introduction, shared linguistic features have already been argued to account for linguistic areas in South America: sometimes for Amazonia as a whole, recently more often for a western/eastern divide of South America, and for more reduced areas such as the Guaporé–Mamoré region. Aikhenvald (2012) posits a divide between three major areas: Andean, Amazonian, and Southern Cone. Nevertheless, the possibility of the whole Amazonian basin forming a single linguistic area has been a matter of debate (see a summary in Epps & Michael, 2017), with some quantitative studies supporting instead a western/eastern division of South America (see e.g. Birchall, 2014).Footnote 16
Map 4 zooms in on the distribution of languages with dedicated sociative causative constructions within our South American sample. We observe that most cases are found within the area generally defined as “Greater Amazonia,”Footnote 17 with the two exceptions Awa-Cuaiquer and Yauyos Quechua, which are spoken on the western slopes of the Andes. Within Greater Amazonia itself, the highest density of cases is found in south-western Amazonia. However, we are not yet in a position to comment further on the areal distribution of the sociative causative within South America, because a denser sample would be necessary to allow for a finer delimitation of areas comprising the languages with a dedicated construction. In fact, the main goal of the major research project Out of Asia, of which the present study is part, is to analyze the distribution of a set of linguistic features, including the sociative causative, with a Bayesian algorithm (sBayes, by Ranacher et al., 2021) to detect areal signal, controlling for universal preference and genetic inheritance.

Map 4. Geographical distribution of dedicated sociative constructions in South America.
Our survey has not uncovered a formal resemblance between dedicated sociative causative morphemes or constructions across stocks, so we can only suggest that a potential diffusion of this feature would not have taken place through borrowing of a construction or morpheme, but through replication of a pattern (the plain fact of having a morpheme or a special construction for the specific function of sociative causation).Footnote 18 Each innovative language would have used inherited material to create a dedicated construction on the basis of an external model, and the diffusion would have resulted from a series of individual replications from one language to the other. It is an important result for contact linguistics, because “little attention has been granted in the literature to the borrowing of features belonging to the domain of verbs” (Matras, 2007:44). Our results conform to the generalization that “contact phenomena in the area of voice and valency are almost exclusively pattern-oriented” (Matras, 2007:47). Pattern replication is often characteristic of linguistic convergence that can lead to the building of linguistic areas (Matras & Sakel, 2007). Finding evidence of concrete examples of contact-induced transfer is far beyond the reach of the present study, but we definitely call for such an investigation.
A final contribution our survey offers to further research on the diffusion of a dedicated sociative causative construction is to highlight the importance of Tupian languages among the cases of dedicated sociative causative constructions, with five cases out of 19. Moreover, Map 5 shows that these Tupian languages are central to the area where languages with dedicated sociative causative constructions are found. This is suggestive of a potentially central role of Tupian languages in that diffusion process (across stocks, but maybe also within the Tupian stock; see Footnote 15). Very briefly, three factors could have made them instrumental in the diffusion of dedicated sociative causative constructions: the large and dense geographical diffusion of Tupian languages from a core in south-western Amazonia (Noelli, 2008; Rodrigues & Cabral, 2012; dos Santos et al., 2015; O’Hagan et al., 2019), their dedicated sociative causative morpheme, which seems to have been rather stable through time (Rodrigues & Cabral, 2012:509, 531–533), and their many contact situations (Cabral, 1995; Rodrigues, 1996; Muysken, 2012). The specifics of this hypothetical scenario are left for future research.

Map 5. Tupian vs. non-Tupian languages with a dedicated sociative causative construction.
5. Summary
The goal of this paper was to reassess, on the basis of a worldwide sample of over 300 languages with a focus on the Americas, the areal relevance of sociative causative markers hypothesized by Guillaume & Rose (2010). The scope of our investigation was wider than that of Guillaume & Rose (2010), as it included expressions dedicated to sociative causation other than morphemes. The result is that out of 325 languages worldwide, 19 show a dedicated construction for sociative causation, and in 17 cases out of 19, it is expressed by a plain dedicated sociative causative marker. Importantly, 15 of these 19 languages are spoken in South America. The present study has confirmed that dedicated sociative causative constructions are rara outside of South America, but not within it. The prevalence of dedicated constructions in this macro-area is all the more telling given that non-dedicated constructions for sociative causation are evenly distributed across macro-areas, reaching an average of one in four languages: South America is the only macro-area where sociative causation is frequently grammaticalized.
Anchored within a project aiming at uncovering historical contact among languages in the Americas, our survey provides data that may, in conjunction with comparable data on other linguistic features, be instrumental in informing future research on areal patterns in South America.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/jlg.2025.10005
Acknowledgments
This work is based on the ATLAs database (Inman et al., 2025), which was funded primarily by the Out of Asia SNSF Sinergia project CRSII5_183578 and the Swiss National Centre of Competence in Research Evolving Language. We would like to warmly thank David Inman and Natalia Chousou-Polydouri for invaluable discussions on the questionnaire and for revisions of our work, as well as Kellen Parker van Dam for generating the maps. The coding work by Oscar Cocaud-Degrève and Raphaël Luffroy was also fundamental to this study. It was in very large part based on grammars generously shared by Harald Hammarström. Finally, we would also like to thank the linguists who provided additional information about individual languages, namely Alexandra Aikhenvald on Tariana, Les Bruce on Alamblak, Fernando de Carvalho and Emerson José Silveira da Costa on Tupinambá, Eva Lindström on Kuot, Florian Lionnet on Laal, Andrey Nikulin Guzmán on Borum (Krenak), Bruno Olsson on Marind, Mark Post on Galo, Yvonne Treis on Kambaata, and An Van linden on Harakmbut.
Competing interests
The authors declare none.