Introduction
I use the term mentalese to describe a co-occurrence style whose usage obscures the nature of human experience from itself. Its written forms pervade the traditions of empiricism and rationalism from which the twentieth-century social sciences emerged. Localizable features of the style include metapragmatic locutions that rely on verba sentiendi denoting sensory activity (see-, hear-, perceive-, etc.) or cognition (think-, believe-, remember-, etc.), and derived nominals (perception, memory, etc.), through which a wide range of narratives about human activities – including those cast in genres of perception-talk and mind-talk – are formulated. Yet those who rely on them obscure the characteristics of their own activities from themselves. Why?
Leakage
One difficulty with mentalese is that verba sentiendi exhibit leakage with verba dicendi – verbs of speech and communication – so that perception-talk routinely conflates solitary sensation with discursively mediated social interaction. A range of situations can be conflated through such leakage by using a single lexeme. In (1), where Larry is speaking to Moe, all three utterances employ inflected forms (boldface) of the same verb, see-, a transitive verb that takes an animate subject and a direct object, but the verb’s direct objects (italicized) and their denotata (marked by superscripts) differ, as do the kinds of activities describable through composite effects shaped by cotextual cues.

We interpret (1a) to mean that entities J and K (denoted by Tony J and sparrowhawk K) were in co-present proximity, so that light rays reflected by K entered J’s eyes, making K visible to J; here, the verb has the sense see 1 ‘perceive with eyes.’ The utterance in (1b) lacks this sense: entities like Carthage C or its government G are no longer visible (they perished centuries ago), and, in any case, the direct object of see- is the italicized subordinate clause as a whole.
The utterance in (1b) describes an act of reaching a conclusion about a situation in the past. This is only possible by participating in a discursive process through which historical facts are described in the present. The discursive process consists of perceiving and then construing speech, but (1b) doesn’t describe its first phase-segment: If Sally reached her conclusion by listening to a classroom lecture, she used her ears (to listen to the lecturer); if she reached it by reading a historical text, she used her eyes (to read print matter) or her fingers (if she read it in braille), but, in all cases, the utterance describes a conclusion based on discursive reasoning, not its sensory or evidentiary source; we might gloss it as see 2 ‘discursively conclude.’
The utterance in (1c) describes Larry’s agreeing with Moe. If (1c) is uttered during a phone conversation (Larry is in Alabama, Moe in Ohio), Larry is agreeing with what he heard Moe say (using ears not eyes) and the consensus is reached through talk. We might gloss the verb as see 2, ‘discursively conclude [that addressee is right]’ or ‘agree.’ But what exactly are we glossing?
None of these are construals of isolable verbs. They are construals of multi-channel sign-configurations in which discursive and non-discursive signs cotextually accompany verbs and yield composite effects. Let us first focus on features of discursive co-text. The cotextual construal we have labeled see 1 ‘perceive with eyes’ emerges when the verb’s direct object is a simple noun phrase that denotes a localizable and concrete entity susceptible to co-presence and visual inspection, as in (1a), so that other direct objects of this noun class (mountain, tree, pebble, etc.) also yield the ‘perceive with eyes’ construal in locutions like “Tony saw a___.” The second construal, see 2 ‘discursively conclude,’ emerges when the direct object is a complex nominalization, such as a subordinate clause of the form that-S, where the sentence S denotes a proposition p about a non-copresent state of affairs, and the array as a whole conveys the sense ‘discursively conclude that p.’ Whether the conclusion describes a situation in the distant past, as in (1b), or an abstract conjecture (that truth is elusive; that 293 is a prime number…), improving one’s ability to see 2 involves gaining facility in (genred) forms of discursive reasoning (history, philosophy, math), not getting an eye exam. And in (1c), if the direct object, your point, is idiomatic for that which you just said – which is a direct object of the same that-S type as in (1b) – a similar propositional-conclusion sense emerges; but, since see your point also refers to addressee’s speech, the overall sense is closer to ‘agree.’
Other verba sentiendi, like hear-v, touch-v, and feel-v, exhibit comparable contrasts. Although the range of cotextual arrays – and composite senses – is too varied to summarize here, a few naturally occurring examples of usage can clarify the main issues. In the examples in (2), the text-fragments in boldface are verbs (or derived nominals), those in italics add further specificity to activities denoted by the former, and composite construals are shaped not by the former alone but by co-textual arrays that contain both.

The cases in (2a) specify something done with the ears as sound sensors, and the cases in (2a’) specify complex forms of discursive reasoning for which sound sensors alone do not suffice. The auricular perception effect in (2a) is shaped by the kinds of direct object specified in each case (chirping, commotion, sound), which formulate their referents (K) as entities of a kind that can be audible and co-present with perceiver (J), which results in the composite effect that ears suffice to hear them, so that anyoneJ who is not hearing impaired can engage in the activities described in (2a). By contrast, in (2a’), the kinds of entities specified as direct objects of hear-v are construable effects of discourse (construable as backlash, argument, feedback, judgment, approval, and so on), whose construability requires that the subject actantJ of hear-v not merely have ears but also be able to engage in communicative interactions with interlocutory others that share a common language, and must already have acquired the socialized ability to identify and construe utterances in that language as forms of backlash, argument, feedback, and so forth, an ability distinct from whatever the having of ears equips every mammal to do.
Sensory and discursive construals of touch-v, taste-v, feel-v (and of derived nominal locutions; see footnote 1) are differentiated in much the same way as for hear-v. Although sensory construals differ (for touch-v, they tend to be tactile; for taste-v, gustemic; and for feel-v, tactile or affective), they are shaped by specific cotextual arrays in each case; other cotextual arrays yield discursive construals.
Concrete nouns favor corresponding sensory construals, while complex locutions denoting speech or writing favor discursive construals. For instance, engine K, edges of box K, and arm K (and adjectives warm, sharp, soft) favor the tactile construal of touch in (2b), while praising K, story K, and section K [of] article K favor the “no hands” discursive construals of (2b’). For taste, the edible denotata of vegetables K, noodles K, and food K favor gustemic construals in (2c); but any denotata of speech framed as (dis)preferred, such as her debating style K, etc., favor the “no tongues” metadiscursive construals of taste locutions in (2c’). In the feel-v constructions in (2d) – such as ØK feel-V adj to ØJ, or phrasal verb ØJ feel-for v ØK [with ØL] ‘seek ØK [with probeL]’ – concrete noun arguments (like saris K, sixpence K, pulse K) and sensory adjectives (like coarse) favor tactile construals, while adjectival nomina sentiendi of emotion (loved, sad, angry, embarrassed) favor affective construals; however, in the biclausal locutions in (2d’), feel-v functions as a verbum dicendi (like say-v, write-v, claim-v, etc.) whose subordinate clause denotes represented speech, whether framed as direct speech (“Bicycles… landscape”), or as indirect speech that reproduces the content of claims made by othersJ but not their exact words.
Transposing speech to thought
The cases in (2d’) illustrate a more general point: statements by others can readily be transposed from represented speech to represented thought frames. Such transpositions have two general features. They can be transposed with minimal deformation of the textual material transposed. But even when this is done, the actant frame of the matrix verb is radically transformed.
The actants of a verbal locution are the denotata of its case-assigned arguments. Every verb assigns case relations like agent-of (A), patient-of (O), or indirect-object-of (D) the verb to its arguments (see footnote 2). Its actants are the denotata of these arguments. Attributes of denotata (and thus of actants) are jointly specified by verbs and their arguments in localizable text-segments and serially elaborated in discourse. In an utterance like he J told him K, actants J and K are both specified as masculine (by he and him), while J is specified as agentive actant of transitive tell-V (by the nominative form, he), and K as patientive (by accusative him).
Under conditions of reference-maintenance, actant attributes are cumulatively (re)specified across stretches of discourse to yield composite sketches of actant characteristics. In a sequence like That lawyer J knew a lot … Everyone liked her J, the same person (J) is characterized twice, first as an agentive actant (of know-V) and for profession (by lawyer), then as a patientive actant (of like-V) and for gender (by her). Hearers (or readers) of discourse arrive at an understanding of entities and states of affairs described in discourse by attending to actant formulations that serially unfold across stretches of discourse, one text-segment at a time, and, when discourse is relatively cohesive, arrive at a composite sketch of actant attributes, including characteristics of situations involving them. Actant formulations are not reducible to “actor” (the special case of agentive actants) nor to denotata of nouns (since that-S clauses denote object actants of see-V in (1b), of feel-V in (2d’) and of other verbs in (3)-(4) and subsequent examples), nor shaped only by case relations (which are specified only within sentence-boundaries) because actant formulations are serially re-shaped across stretches of discourse much larger than its sentence-fragments, and thus across large spans of social history linked by discourse, as the discussion in subsequent sections shows.
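The cumulative (re)specification of actant attributes can be pictured, very roughly, as the merging of partial records. The sketch below is only an illustration under stated assumptions: the hand-annotated segments, the attribute labels, and the function name composite_sketch are hypothetical conveniences, not an analytic procedure proposed in this article.

```python
from collections import defaultdict

# Hand-annotated (hypothetical) readings of the two text-segments in
# "That lawyer_J knew a lot ... Everyone liked her_J".
# Each segment specifies only some attributes of actant J.
segments = [
    ("J", {"role": "agent-of know", "profession": "lawyer"}),
    ("J", {"role": "patient-of like", "gender": "feminine"}),
]

def composite_sketch(segments):
    """Accumulate attributes per actant index, one text-segment at a time."""
    sketch = defaultdict(lambda: defaultdict(list))
    for actant, attributes in segments:
        for name, value in attributes.items():
            # Later segments add to earlier specifications; nothing is overwritten.
            sketch[actant][name].append(value)
    return {actant: dict(attrs) for actant, attrs in sketch.items()}

sketch = composite_sketch(segments)
print(sketch["J"])
```

The point of keeping lists rather than single values is that the same actant can be formulated as agentive in one segment and patientive in the next, so that the composite sketch belongs to the stretch of discourse and not to any one sentence.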
All encounters with verbal locutions are encounters with actant formulations of entities and states of affairs that are serially (re)configured through speech. The non-transparency of the cues that configure them creates tendencies among language users to describe cumulative effects through a metalanguage of “things” and “events,” which readily obscures how such effects are configured, and what is done to interlocutors through processes (and personnel) that configure them, as discussed below. Let us begin with cases of reconfiguration where speech is transposed to thought.
Transposing speech to thought
Transposing represented speech into represented thought relies on fractional congruence of actant formulations across stretches of discourse. Why? The actant frames of verba dicendi and verba sentiendi are parallel in some respects, divergent in others.
One parallel is that subordinate clauses can be transposed without alteration for verbs that are paired as cognates insofar as they take the same nominalized clause types as direct objects, even though they differ in denotation, as illustrated in (3).

The verbs say/insist/declare in (3a) denote acts of speaking, while the verbs think/believe/feel in (3a’) denote mental states, but both take that-S nominalizations as direct objects. Thus, represented speech locutions like “Property owners {say/insist/declare…} that [taxes should be levied on trade]S” can later be represented as “Property owners {think/believe/feel…} that [taxes should be levied on trade]S” without altering the subordinate clause, thereby converting its propositional content from something publicly asserted (said, insisted upon, declared, etc.) by actual people, whose views can then be debated by interlocutors (as tends to happen in liberal democracies) into someone’s (or even some collectivity’s) private mental states, which are much harder to locate.
Similarly, the verbs ask/inquire in (3b) denote acts of speaking, while the verb wonder in (3b’) denotes a mental state, but both classes of verbs take whether/when/how-S nominalizations as direct objects. Thus, represented speech locutions like “The kids {asked/inquired…} whether/when/how [the teacher would come back]S” can later be represented as “The kids wondered… whether/when/how [the teacher would come back]S,” thereby converting publicly perceivable questions, which can be answered by interlocutors, into mental states that appear to have no interlocutors. Why?
Such conversions are possible because, despite the analogical convergence of (3), the actant frames of verba dicendi and verba sentiendi diverge in important ways, as in (4).

One divergence is that verba dicendi tend to be ditransitive verbs, VERB (XA, YO, ZD) – where the YO constituent is a subordinate clause, and XA and ZD are NPs denoting interlocutors – but verba sentiendi tend to be transitive verbs, VERB (XA, YO), which lack indirect object slots for addressee-interlocutors. Thus, the utterances with verba sentiendi in (4) are perfectly fine without indirect objects, but the (parenthetical) variants in which indirect objects denote addressees are infelicitous. Since both verba dicendi and verba sentiendi take nominalized subordinate clauses as YO arguments, the propositional content of these clauses can be transposed from represented speech to represented thought frames – with minimal alteration of transposed locutions, as noted above – but when propositional content is transposed in this way, what is said appears to get converted into what is thought. Thus, Miss Simmons said to them “I am hungry” is perfectly fine, but Miss Simmons felt [# to them] that she was hungry is infelicitous with parenthetical insertion, and, without it, converts the propositional content of its subordinate clause into a mental state that no interlocutor has witnessed, so that the only candidate available for “witness” appears to be Miss Simmons herself.
What is the composite effect and how does it arise? The congruence of object clause types in (3) allows language users readily to convert statements publicly asserted by others into their mental states, and the noncongruence in (4) privatizes them.
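The interplay of congruence and noncongruence can be made concrete in a small sketch. It assumes, purely for illustration, that the two verb classes are closed lists (repeating only verbs named in (3)) and that a locution is a record with slots for agent, object clause, and optional addressee; the names Locution and transpose, and the addressee "the assembly", are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical closed classes; members repeat verbs cited in (3).
VERBA_DICENDI = {"say", "insist", "declare"}     # frame: VERB(X_A, Y_O, Z_D)
VERBA_SENTIENDI = {"think", "believe", "feel"}   # frame: VERB(X_A, Y_O)

@dataclass(frozen=True)
class Locution:
    verb: str
    agent: str                        # X_A
    clause: str                       # Y_O, a that-S nominalization
    addressee: Optional[str] = None   # Z_D, available only to verba dicendi

    def __post_init__(self):
        # Verba sentiendi lack an indirect object slot for an addressee.
        if self.verb in VERBA_SENTIENDI and self.addressee is not None:
            raise ValueError(f"{self.verb!r} has no addressee slot")

def transpose(speech: Locution, thought_verb: str) -> Locution:
    """Re-frame represented speech as represented thought."""
    if speech.verb not in VERBA_DICENDI or thought_verb not in VERBA_SENTIENDI:
        raise ValueError("expected a verbum dicendi and a verbum sentiendi")
    # The clause carries over unaltered; the addressee has nowhere to go.
    return Locution(thought_verb, speech.agent, speech.clause)

said = Locution("say", "Property owners",
                "taxes should be levied on trade", addressee="the assembly")
thought = transpose(said, "believe")
print(thought)
```

In this toy model the subordinate clause survives transposition intact while the interlocutor is silently dropped, which is the privatizing effect described above.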
It is worth noting that actant formulations also play a part in shaping sensory and discursive construal of the types discussed earlier.
Representing speech and sensation
The contrast between sensory and discursive construal in (1)-(2) appears to depend on cryptotypic contrasts among sentence-internal variables of structural sense [or “grammar”], such as contrasts in verb’s direct object (that-S vs. NP). It does not. The contrast actually depends on co-textually construable circumstances of utterance production, including characteristics of actants (or referents of verb’s arguments), which clarify whether sensory or discursive construals are plausible (for both types of direct objects, NPs and that-S clauses).
For instance, the that-S direct object of (1b) yields a discursive construal (see 2 ‘discursively conclude’) because CarthageC and its governmentG are entities that Sally cannot see 1 ‘perceive with eyes’ in the 21st century. But in a different situation, such as (1b’), where LarryL tells MoeM, “SallyS saw that Tom T was sitting down but Bill B wasn’t,” the that-S direct object (italicized) readily permits the see 1 ‘perceive with eyes’ construal if the referents of TomT, BillB and SallyS are known to be co-present in the same room at the same time.
Similarly, the NP direct objects of hear-V in (2a) and (2a’) have variable construals that depend on non-sentential factors. Since “auricular perception” [of speech] is a phase-segment of the “construal of speech,” the latter is not diametrically opposed to the former, but a laminated effect that arises when specific cotextual conditions hold. Thus, in contrast to the first case in (2a’), if we consider a case such as (2a’’), where LarryL tells MoeM, “HeJ can hear the backlash K,” Larry’s utterance describes person J’s “auricular perception” if J doesn’t know the language in which the utterance is spoken but can still “hear” its audible sound; however, if person J knows the language, an additional layer of significance – J’s “construal of speech” – readily gets laminated upon “auricular perception.”
Appeals to attributes of actants such as prior language socialization (in cases like (2a’’)), or co-presence in the same room (in cases like (1b’)) are appeals to non-linguistic circumstances of utterance production that are entirely independent of sentence-internal grammatical patterns, such as “cryptotypes” (Whorf 1945), which are defined solely in terms of concatenation of Bloomfieldean form-classes, and thus have sense-compositional construals that do not vary by circumstances of utterance production. The difference between sensory and discursive construal in (1)-(2) depends, by contrast, on cotextually construable features of utterance production, including actant formulations, which may be specified by speech participants within or across sentence-long stretches of discourse.
Conflating cotextual semiosis with denotational stereotypy
Let us return to see-v. We have considered a few cases in (1). Other locutionary arrays, such as the see X as Y construction (e.g., she sees his willingness as hope for the future), bring two nominalizations together to denote more complex forms of discursive reasoning, as in (5).

In (5), a stacked series of see X as Y locutions describe claims published in scientific texts: When faced with difficulties in reconciling individual statements about specific phenomena with sets of statements that count as theories, some scientists describe these difficulties as “puzzles” in their writings, while others describe them as “counterinstances” that allow them to refute a theory and replace it with a better one. Here, see X as Y locutions describe complex forms of discursive reasoning that link scientists and publications to each other across the centuries.
The denotational stereotypy (Agha 2007, 101-124) of a verb like see doesn’t capture the range of effects describable through it. If speakers of English are asked “what is seeing?” (or, “what is it to see?”) they are likely to say something like “well, you do it with your eyes…” or “it’s when you look at something.” But the preceding discussion shows this to be incorrect: verbs like see acquire this construal only in specific cotextual arrays, not in others. Similarly, English speakers are likely to say that activities of hearing and being heard involve the ears, or of touching and being touched involve the hands, but not that they involve discursive reasoning – as illustrated in (2a’) and (2b’) – and thereby produce incorrect or incomplete descriptions of what all English speakers (including themselves) can do with English. Why do such impairments arise?
One reason is that lexical items are relevant to human communication only as fragments of multi-channel sign-configurations, which readily re-specify the construal of their fragments, as in (1)-(5), but are harder to describe than their lexical fragments. Since words are minimal independently utterable denoting expressions, they are readily reproduced in metalinguistic discussions, a tendency that reproduces the widespread delusion that language is just a “bag o’ words.” Such lexical fetishism is one reason for the difficulty noted above: attempts to describe the significance of multi-channel semiotic arrays by glossing word-long extractabilia – like see, hear or touch (and their derivatives) – tend to replace the construal of a sign-configuration with the denotational stereotypy of a fragment.
A second reason is that brief locutions tend denotationally to encapsulate or under-differentiate phase-segments of activities that complex locutions more readily differentiate. This issue has already arisen in preceding discussion: I noted earlier that locutionary arrays containing see-V, such as (1b), describe processes that have more than one phase-segment, but their phase segmentation is not recoverable by glossing any individual lexeme they contain. This has nothing to do with (1b) in particular. It is a general feature of denotation.
All denoting expressions produce a segmentation of their denotata. Complex expressions produce a fuller segmentation. The principles that organize such segmentations are varied and interact with each other to yield composite effects (Agha 2007, 37-55, 84-144). I focus here only on a single issue, which may be illustrated through a simple example: Joe gets mugged at knifepoint in an alley. Later, at the police station, he is asked by a police-officer to review photographs of local criminals in order to identify his mugger, (6a). What happens next is described in (6b).

Different phases or segments of Joe’s activity, differentiated in (6b), lines 1-3, can later be described through word-long locutions, but not all options are equally plausible; implausible descriptions are marked by # in (6c). Joe’s activity in line 1 is plausibly described as perceiving (or looking at) something, but not as characterizing it because he has said nothing. The activity in line 3 is plausibly described as characterizing something (or someone), but to say that he is perceiving (or looking at) it is to ignore the fact that he spoke. The overall sequence of activities (in lines 1-3) can also be described later on by using forms of the verb recognize-V, for example, as (Joe’s) recognizing (someone), as noted on the right in (6c). In such cases, the activity described as recognizing someone still has two phases or segments – a non-discursive phase (line 1) and a discursive phase (line 3) – but these are encapsulated together (not segmented or differentiated) when the word recognizing (or recognition) is used, and the fact that Joe spoke is not recoverable by glossing the word.
Similar contrasts emerge around locutionary arrays containing see-V and touch-V, as in (7).

Such statements are true only if the verb’s subject actant (TonyJ, EileenJ) had a sensory encounter with somethingK that is correctly characterized by the italicized phrase, so that in each case, a non-discursive phase of activity – a visual phase in (7a), a tactile phase in (7b) – is followed by a discursive phase in which visible or tactile sensoria (encountered in the first phase) are then characterized as a sparrowhawk K or as her friend’s soft arm K (in a subsequent phase). The sentence-long locutions of (7a) and (7b) permit questions to be asked about both the non-discursive and discursive phases of the activities they describe: For (7a), if Tony was blindfolded, the non-discursive activity (involving eyes) could not have occurred; if he was not, the expression sparrowhawk (involving speech) may or may not correctly describe what he saw. However, if we encapsulate the activities denoted in (7a) or (7b) under word-long labels like seeing or touching, only non-discursive phases are recoverable from the denotational stereotypy of our labels, and the discursive phases of such activities – in which people characterize what is seen or touched – become unavailable for discussion. Since actual human beings are always able to offer some characterization of whatever they see or touch (including characterizations which they later seek to rectify or improve), metalinguistic discussions of seeing or touching tend to produce inadequate accounts of the activities they are taken to describe.
This issue becomes poignant when we realize that the one who characterizes what was seen or touched is not necessarily the one who saw or touched it. This is the issue of voiced cognition, to which we now turn. Let us begin by focusing on so-called “visual perception,” the effect configured by cotextual arrays of the type described for cases like (1a), but about which there’s more to be said.
Origo of voiced cognition
Any attempt to describe an event of visual perception links two events to each other, the event in which perception is described and the event in which perception allegedly occurs. How are the activities that unfold in these events related to each other? To answer this question is to describe the voicing structure of visual perception. Several variables are involved. Let us begin with an example.
In (1a), when LarryL tells MoeM that TonyJ saw a sparrowhawkK, the utterance links two events to each other: it describes a narrated event (En) in which J allegedly encounters K through an act of seeing, and it constitutes a speech event (Es) in which L describes that encounter to M through an act of speaking. The past tense of the verb saw, or see-pst, deictically formulates the En (in which J encounters K) as having occurred prior in time to the Es (in which L speaks to M), and indicative mood formulates L as now asserting something about J’s prior experience. By denoting K, the expression sparrowhawk K provides characterizability conditions on what type of entity K is, but leaves open the question of whether the characterization was supplied by J (in En) or is created by L (in Es). This is the issue of voiced cognition, the issue of whether what is perceived is characterized by the one perceiving it, or by someone else. This issue is entirely obscured by all talk of “reported speech.”
Our folk-metalinguistic term “reported speech” is a misleading and pitiable way of describing something that is better termed formulated conduct. Why formulated? In any instance where L describes J’s prior speech (e.g., LarryL tells MoeM: “Give me five dollars,” TonyJ demanded/asked/requested/…), it is Larry who formulates what happened (Tony may have said or done nothing, may not even exist, etc.). Why conduct? What Larry formulates through choice of metapragmatic frame is not just speech but the type of conduct performed through speech and its accompaniments, whether formulated as having a distinctive socio-pragmatic contour through choice of verb (demanded vs. asked vs. requested …); or formulated as represented thought through cross-clause co-reference (“How could IJ get it so wrong,” IJ demanded of myselfJ, but said nothing…); or formulated as a nonverbal response to speech (His silence/glare/scowl… was tantamount to a demand). Although such examples show that a single metapragmatic verb (demand) can co-textually be re-specified to yield denotational leakage across many types of formulated conduct (represented speech/thought/action), the cases to which we now turn make plain that formulated conduct may involve far more interesting forms of voicing, such as formulations of voiced cognition that interlocutors do not even grasp as formulations.
Such non-transparent voicing contrasts arise from contrasts of deictic organization. They permeate descriptive utterances in all languages. A few possibilities are illustrated in (8), where L is speaking to M. All of L’s utterances in (8a-f) contain deictically inflected forms of the transitive verb see- (in boldface), and thus unavoidably contain the actant array Ø J see- Ø K – since the verb denotes a relation between two entities, someone who sees (J) and something seen (K) – but differ among themselves in all other respects, including the manner in which slots in this array are filled (or not filled) by linguistic expressions (and thus in whether or how actants J and K are specified), and in the manner in which tense and mood deixis are specified in the utterance (and thus in whether or how En relates to Es). These differences are indicated in columns A-E.

The “direct speech” formulation in (8a) assigns distinct voices to each clause by differentiating deictic origos: the matrix clause is formulated as words used by L – since 3rd person and past tense (TonyJ said…) are the deictics L would use to describe J’s prior activities; and the subordinate clause is formulated as J’s words – since 1st person and present tense (IJ see…) are the deictics J would use to describe his encounter with co-present K – so that sparrowhawk K is voiced in (8a) as J’s description of K. The “indirect speech” formulations in (8b) collapse this distinction by deploying L-origo deictics throughout: the 3rd person and past tense in both clauses (TonyJ said… and …heJ saw…) are the deictics L would use to describe J’s prior speech or activity, so that any descriptions of K chosen from among the cline of hypernymic variants in (8b) are formulated as words chosen by L (…a hawk K/bird K/…) to describe K, leaving open for construal whether they match any choices J may (or may not) have made (if J even exists, said anything, etc.), which remains undecidable by M; this ordinary language ambiguity was erroneously dichotomized as de dicto and de re in scholastic logic (an error that persists in modern philosophy), where, despite the dichotomy, de re is just the “special case” of de dicto (Kneale 1966, 630) where an omniscient narrator allegedly knows what J “really” said about K. In (8c), L identifies J through proper name deixis as TonyJ – much as in (8a–b), as indicated in Column C – but makes no reference to J’s speech (column B), and any descriptions of what was seen (…sparrowhawk K/bird K…) that may occur in (8c) are understood as descriptions formulated by L (since no-one else is alleged to have spoken). In (8d), L does offer some characterization of whatK was seen (as in (8a-c); column D) – whether formulated as hawk K, bird K, etc. – but doesn’t specify whoJ saw itK, or what theyJ said, thereby formulating (8d) as L’s own words.
In (8e), L asserts that a sighting occurred without specifying whatK was seen, whoJ saw itK, or anyone else’s speech. In (8f), L’s utterance preserves the transitive verb’s actant structure (but not tense and mood inflection) through an -ing nominalization; the utterance is at best a sentence fragment that leaves subject and object NPs (and thus attributes of actants J and K) unspecified. Across the range (8a-f), L’s utterances decrease in sense-specificity and deictic selectivity across a cline of possibilities, as contrasts among rows indicate (i.e., all five variables, A-E, are marked “+” for (8a), and “-” for (8f)).
Let us now focus on how entity K is variably characterized across this range of possibilities. Some characterizations of K are less specific (more hypernymic) than others, but not less correct: If K is correctly characterized as a sparrowhawk K, then (since “a sparrowhawk is a (kind of) hawk”) it is also correctly characterized as a hawk K; and (since “a hawk is a (kind of) bird”) it is also correctly characterized as a bird K; and (since “a bird is a (kind of) thing”) it is also correctly characterized as something K. And so if the biclausal statement in (8a) is true, so are the ones in (8b). Similarly, if the monoclausal statements in (8c) are true, so are corresponding hypernymic ones in (8d-e). The non-specific statements are not less true than their specific variants, they merely characterize K with a lower degree of sense-specificity.
Non-specificity has some advantages. If a specific variant (TonyJ saw a sparrowhawkK) is uttered in prior discourse, a non-specific variant (SomeoneJ saw a hawkK/…) can be asserted later without introducing new evidence. Non-specificity also has some disadvantages. It is not possible to use a non-specific variant to infer a specific variant: It is not possible to deduce from SomeoneJ saw somethingK that TonyJ saw a sparrowhawkK.
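The asymmetry between specific and non-specific variants can be sketched as a one-way walk up a hypernym chain. The chain below simply repeats the terms used above (sparrowhawk, hawk, bird, something); the dictionary layout and the function names hypernyms and licenses are hypothetical, and the sketch models only the sense relations among nouns, not the voicing contrasts of (8).

```python
# Hypernym chain taken from the discussion above:
# sparrowhawk < hawk < bird < something.
HYPERNYM = {"sparrowhawk": "hawk", "hawk": "bird", "bird": "something"}

def hypernyms(term):
    """Return the term followed by its successively more generic hypernyms."""
    chain = [term]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def licenses(specific, generic):
    """True if a characterization as `specific` also supports one as `generic`."""
    return generic in hypernyms(specific)

print(hypernyms("sparrowhawk"))
print(licenses("sparrowhawk", "bird"))       # specific to generic
print(licenses("something", "sparrowhawk"))  # generic to specific
```

Moving up the chain needs no new evidence, whereas nothing in the chain allows a move back down from something to sparrowhawk.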
The locution Someone saw something is more interesting than the rest because it is non-specific in ways not yet discussed. It is more capacious than (can cotextually encapsulate) the others. How inclusive is it? Let us consider the entire range of locution types with see-V that we have discussed so far:

Any of the locutions in (9) can be re-described in subsequent discursive interaction as someone saw something without introducing new evidence; however, if all of these utterances have occurred in prior discourse, and someone saw something is uttered later, it is no longer possible to decide which among the options in (9) counts as its antecedent.
The locution Someone saw something preserves the actant frame Ø J see- Ø K but replaces all expressions that can fill subject and object positions with generic nouns (someone and something) thereby eliminating all information that denotationally specific variants can contribute to this frame when they fill it, as in (9a-d). Its generic form gives Someone saw something the advantages noted above (it can be used to redescribe its antecedent without further evidence) as well as the disadvantages (you can’t use it to pick out its antecedent from options like (9a-d)). Its disadvantages catch up with its users: although the deictically selective locutions (9a-d) allow us to distinguish the activities they denote through the descriptions on the right, A-D, the generic locution Someone saw something will tend to be glossed as an act of visual identification.
Yet since the locution Someone saw something preserves selective verb deixis – past tense and indicative mood (see (8e), column E) – it asserts that a specific act occurred at a specific time.
The nominalization ØJ see-ing ØK in (8f) eliminates this as well. Its uttered form is seeing; that is, since see-V neither stops being a transitive verb nor magically loses its actant frame, someone seeing (ØJ) and something seen (ØK) are implied (but not mentioned) in any instance (or stretch) of talk in which seeing occurs as a standalone word. Since values of ØJ and ØK can always be filled in during subsequent stretches of talk (e.g., in answering WhoJ? and WhatK? questions) the frame hasn’t disappeared; it is merely idling, temporarily suspended, held in abeyance.
What happens to interlocutors in frame-abeyant stretches of talk that contain standalone seeing? Since language users cannot gloss bound suffixes like -ing, all they can gloss is see, whose denotational stereotypy is narrower, as noted earlier, than the construals differentiated (as see 1 ‘perceive with eyes’ vs. see 2 ‘discursively conclude’) when its actant frame is filled. Since noun and verb deixis have also been eliminated, activities denoted by (ØJ) see-ing (ØK) are not even locatable in space or time, nor specified for whoJ saw whatK, nor for which cotextual construal of see-V is at issue. These issues may be summarized as follows:

How does such talk impair the ones talking and listening?
Since questions about seeing (or about derived complex nominals like the nature of seeing) are plausibly answered by describing what people do with their eyes, such questions (and their answers) necessarily obscure what people do to each other through non-zeroed X J see- V Y K locutions in actual communication, including all issues of discursive semiosis and voiced cognition. In fact, to suppose that answers to such questions describe the abilities of human beings is to engage in a reductionist form of voiced cognition without knowing it. It gets worse. Once you notice that “seeing is (a kind of) perceiving,” you can find your way to yet more generic hypernyms, like perceiving and perception, as objects of concern. But if you now pose questions about the nature of perception, you have suspended the contrast between see-V and hear-V and taste-V and touch-V and feel-V during the act of posing the question; and, concurrently, through frame-abeyance, you have eliminated all actant frames within which such verbs occur, and thus lack any means of distinguishing the activities describable through more specific locutionary arrays in (1) through (9) from each other. In short, to ask about the nature of perception is to conflate the conditions under which everyone else can distinguish countlessly many activities – including auricular perception, represented speech, tactile perception, construing compliments, preferring attributes of people and describable situations, visual identification, reaching conclusions, agreeing with someone, refuting a theory, and so on – from each other. Why does this effect arise?

Having lost sight of the social world as we know it, you can now explore the denotational stereotypy of (standalone, frame-abeyant, hypernymic) perception, and even find your way into neurobiology, where you can discover a great many new things, but no means of finding your way back to everything that perception cannot help you perceive. Other abstractabilia like awareness, affect, and consciousness are obviously only a step away. Meanwhile, other abstractions, which are linked to such locutions in less obvious ways, can be fabricated metalinguistically, as we’ll see below. Yet we have no interest in nouns (these ones, or any other) considered by themselves.
The preceding discussion has introduced some of the variables of discursive semiosis through whose co-occurrence the style I call “mentalese” is crafted. All of them need not be in play in every instance, and each can be combined with others, as we shall see. Even though mentalese often gets crafted through the pursuit of goals other than that of crafting it, it (and its impairments) readily emerge, syllable by syllable, whenever certain operations are performed. I use the term exaptation to describe the group of operations that are often involved. Four of them, which are of special interest, are summarized preemptively below and illustrated in the pages that follow:

When the resulting project gets taken up by incumbents within a division of labor, we can sometimes identify a new tradition or school of thought, as well as constructs (fashioned through uses of the above operations) that come to count as diacritics of membership within it. Such registers of discourse appear discrete insofar as specific diacritics and value projects become emblematic of the practices of a specifiable social domain of practitioners. Yet since these operations can be repeated, we also find processes of serial conversion through which constructs devised by one group are refashioned into constructs that are emblematic of membership in yet another group, whose terminology bears no phonolexical resemblance to the terminology it displaces (thereby distinguishing groups from each other) but nonetheless repackages other semiotic variables from the antecedents it displaces in the pursuit of new projects. The next few sections illustrate the serial conversion of constructs devised in early modern philosophy into constructs that have found homes in the twentieth century social sciences (psychology, anthropology, sociology) in a manner unnoticed by their practitioners.
Mind-talk in Locke and Hume
Locke and Hume describe the mind by performing metalinguistic operations on the sense structure of English without knowing it. The definitions excerpted in (13) provide some illustration:

What does the phrase the mind denote in such passages? This is not the ordinary English word (cf. “Do you mind?”; “Oh, never mind!”) but a technical term in a philosophical register to which specific attributes are assigned. Where do its attributes come from? They have two main sources.
One set of attributes is abstracted from the soul: the phrase the mind denotes entities that “make their first appearance in the soul” (Hume), or “when the Soul comes to reflect” (Locke). Both Locke and Hume (like Descartes) seek to describe a secular – non-theological – analogue of the soul, for which a new word-form (phonolexical shape) is needed: Just as the soul denotes a part of a person, so does the mind.
Other attributes of this new person-part are projected from the actant structure of verbs taking human subjects, mainly verbs of two types: class 1 verba sentiendi like Ø J see-V ØK (whose direct objects may be concrete nouns or subordinate clauses) and class 2 verba sentiendi like ØJ think-V ØK (which favor the latter).
Locke and Hume denote the Ø J actant of such verbs with the phrase the mind J, whether characterized as an actor (Hume lists itsJ “operations” as “actions” that occur whenever itJ sees, thinks, etc.), or characterized as an actor-possessum located within a person (Locke speaks of “the operations of our own minds within us”). Such locutions endow the mind with some of the attributes of the soul (traditionally also described as something “within us” that acts), though not specifically scriptural ones (like immortality or an afterlife).
Other attributes of the mind are configured by operations performed on the ØK actant of verba sentiendi, and are described through a register of technical terms that are homonyms but not synonyms of ordinary English words. Locke and Hume proceed in parallel ways but their terminologies differ.
Locke baptizes the Ø K actant of such verbs as Ideas K, which are said to be located “in” the “mind.” This is just what we would expect: For locutions of the form John thinks [that-S]K, whatever is denoted by [that-S]K can later be described as John’s thoughtsK, and located as in him (but not in his shirt). Locke’s IdeasK are denoted by nominal expressions linked to verba sentiendi, but come in two varieties because two kinds of nominal expressions are involved. One “kind” of IdeaK (called sensationsK1) is denoted by sensory adjectives (bitter, yellow, hot, etc.) that are paired with class 1 verba sentiendi as noun modifiers within direct objects (viz, see-V [yellow hats]NP; taste-V [bitter fruit]NP; touch-V [hot water]NP, etc.); hence sensations K1 are (the kind of) IdeaK that you can taste-V/see-V/touch-V (and thereby “experience” as bitterness K1/ yellowness K1/ heat K1). The second species of IdeaK, termed Reflections K2, consist of whatever is denoted by nominalized clauses (that-S, whether-S, etc.) when these occur as direct objects of class 2 verbs like think-V, doubt-V, believe-V, etc.; they are whateverK2 you can think-V/ doubt-V/ believe-V (or “experience” as thoughts K2/doubts K2/beliefs K2, or as phase-segmentsK2 of processes called thinking, doubting, believing, etc.). Since these locutions readily take first person subjects with cross-clause co-reference (‘I J think/doubt…[that IJ will win the race]K2’), the subject actantJ is traditionally described as a selfJ-engaged-in-reflection by Augustine and Descartes (Matthews 1992), and by those that follow them, including Locke and Hume [footnote 3], so that direct objectsK2 are readily described (by Locke) as “reflected on by our selvesJ.”
Hume offers a parallel account of Ø K actants but his terminology differs. Hume’s terms are neither synonymous with their ordinary English homonyms, nor with Locke’s terms (e.g., Locke’s term Ideas and Hume’s term Ideas denote different things). Since Locke and Hume are coining technical terms, I use the letters L and H as preposed superscripts (as in LIdeas and HIdeas) to keep track of the voicing structure of their proposals. Hume’s generic term for the Ø K actant of all verba sentiendi is HPerceptions, which are said to be of two kinds, HImpressions and HIdeas, where HImpressions are metalinguistically bifurcated further into HSensations K1 and HReflexions K2 (in a manner parallel – but not identical [footnote 4] – to Locke’s procedure for distinguishing LSensations K1 from LReflections K2; see above), but although HReflexions K2 are species of HImpressions (not of HIdeas) they are said to be “derived in great measure from our ideas” (Hume 1739, 1.ii), and are thus linked to HIdeas too. Why? Hume’s own explanation – that even though HImpressions “strike upon the senses” there is always “a copy taken by the mind” which “we call an idea” – is prima facie incoherent, because, as Thomas Reid observed, this thesis requires that “every simple idea that enters the human mind be examined, and shown to be copied from a resembling impression …[but]…No man can pretend to have made this examination of all our simple ideas without exception” so that Hume’s claim that “the rule here holds without exception” has no basis (Reid 1788, 26). The real reason for Hume’s muddle is that HImpressions and HIdeas are projected (respectively) from the Ø K actants of class 1 (see-V type) and class 2 (think-V type) verbs, which overlap in taking that-S clauses as direct objects (so that their denotata can be transposed as Ø K actants across locutions; cf. 2.0).
Hume reserves his coinage, HIdeas, for the Ø K actants denoted by direct objects of verbs like think-V – which can anaphorically be called ideas even in present day English, viz., “Joan thinks [there’s money in farming]K. I don’t know what gave her that idea K.” – but regards an HIdea K as a mental state, not the content of a description. Hume’s discussion of “simple” vs. “complex” HIdeas is projected from the obvious fact that descriptions can be short or long, but is tragically presented as an anatomy of mental states. Similar troubles follow from the transposability of actants from speech to thought constructions: After posing a question – “What is our idea of necessity, when we say that two objects are necessarily connected together” – Hume anaphorically refers back to the (italicized) that-S actant of say- as “this idea K of necessity,” a variant of his favored “the idea of X” construction, which recurs throughout the book. Relationships among linguistic expressions are then described as relationships among HIdeas: “Thus the idea of [A] an equilateral triangle of an inch perpendicular may serve us in talking of a [B1] figure, of a [B2] recti-lineal figure, of a [B3] regular figure, of a [B4] triangle, and of an [B5] equilateral triangle. All these terms, therefore, are in this case attended with the same idea” (Hume 1739, I.vii; interpolations and italics added).
To say that “all these terms” are “attended with the same idea” is just an obscure way of saying that the italicized expressions have sense-relationships to each other, so that the first one (A) can be linked to any of the others (Bn) through the statement “An A is a (kind of) Bn.” This type of talk is then extended through parallels – see (3) above – between say-V type and think-V/feel-V type locutions to constructs like Hpassions, Hsentiments, Hself-interest, which, particularly when taken up by Hume’s friend and protégé Adam Smith (in Smith 1759, 1776), later contribute to mythologies of capitalism (Hirschmann 1977).
Three points are worth noting. First, Locke and Hume both perform metalinguistic experiments on the sense structure, deixis, and cotextual indexicality of English locutions without knowing it. They rely on native speaker intuitions to extract items (actants) from English locutionary patterns but cannot describe what these patterns are, nor where they come from. Second, once they have extracted actants from these arrays, they describe them as fragments of a new construct, the mind, and, by predicating various attributes of the fragments they name (LIdeas, HImpressions, etc.), they respecify their attributes and narratologically create entities of a new kind. Unlike the sense structure or deictic categories of any language – which (a) differ from that of every other, (b) change through the sociohistorical practices of language users, and (c) allow acts of referring to be co-textually negotiated (and disputed) by those linked to each other through utterances (Agha 2007, 85-103) – the entities that Locke and Hume baptismally create are repositioned into narratives about HHuman-Nature (or LHuman-Understanding), alleged to be the same for all human beings at all times, and are even imagined, somehow, to be manifest in sensory inputs that all people everywhere can see, hear, or touch (even though they are extracted from locutionary arrays of English). Third, all of this is an exercise in voiced cognition. It describes capacities for thinking, knowing, judging, etc., possessed by a creature called Man. Such a creature can never be mistaken for any actual human being, however, for the reasons just noted in (a)-(c), despite the noun that denotes it. Yet the conflation persists for the next few centuries. Why?
Those who read Locke and Hume are no better equipped to describe these metalinguistic operations than the authors they read. Through the efforts of these readers, classical mentalese exfoliates into newer forms. While we cannot discuss the entire range of constructs that are proposed – particularly in the 19th century, as in the writings of Kant and Hegel (through which forms of idealism are created) or in the writings of Jevons and Edgeworth (through which the rational choice homunculus of economics takes shape), which I describe elsewhere – the cases discussed below illustrate how old wine reaches the 20th century in bottles that look new. I focus on two cases of exfoliation in the sections that follow, the species of things/objects-talk crafted by James Gibson, and of mind-talk by Emile Durkheim. Before we proceed, however, let us consider why things/objects-talk emerges as a corollary of mind-talk in classical mentalese.
Creating “things” for “mind”
The metalinguistic procedure through which Locke and Hume extract sensations K from locutions and reassign them to mind J leaves things or external objects as remainders. Starting from locutions that serve as sites of extraction (e.g., IJ/theyJ /… see-V [yellow hats]K/taste-V [bitter fruit]K /….), once attributes denoted by adjectives (like yellow, bitter, etc.) are extracted from ØK actants and termed sensations K1 or sensible qualities K1 (“experienced” by mind J), the denotata of modified nouns (hats K1’, fruit K1’, etc.) are left over as remainders called external objects K1’. How are theseK1’ related to mind J? In posing this problem, Locke and Hume adopt an inherited scholastic idiom and innovate within it.
The term object – Latin Objectum – is a technical term coined by medieval scholastic philosophers (as a translation of Aristotle’s τὸ ὑποκείμενον “substrate, substance”) for what is left over (in materia “stuff, material”) when all properties are abstracted by the powers of apprehension of the soul, yielding a contrast between obiecta extrinsecus “external objects” and forinsecus obiectae qualitates “the qualities of outward objects” (Dewan 1981). Even after soulJ begets mind J, a parallel contrast between external objects K1’ and sensible qualities K1 remains pervasive in Locke and Hume. What is to be done with it? Their answers differ. Locke supposes that once sensible qualities K1 (such as color, taste, weight) are abstracted from objects K1’ by the mind J, something is left over, which he describes as “I know not what” (thereby converting ØK1’ into ØK?, which Kant later converts into the thing-in-itself). Hume takes a different approach, as in (14), his bundle theory, which says that when (14a) external objects “are seen and felt” they become “present” to the “mind,” which is itself (14b) just a “heap” of “perceptions,” which, (14c), make up “every thing,” including (14d) the sensible qualities “of which objects are composed,” leaving no underlying substance as remainder, so that, (14e), only sensible qualities (color, sound, etc.) can “afford” us an idea of body (thereby bringing ØK1’ into very close convergence with ØK1, while leaving differences between them obscure):

The entire problem of how to relate sensible qualities K1 to external objects K1’ – on which Locke and Hume express different opinions – derives from a common fallacy: Rather than noting that object actantsK of locutions like I J/they J /… see-V [yellow hats]K are concurrently assigned attributes by distinct features of locutions, including (a) sensoryK1 attributes by adjectives like yellow, (b) concrete-entityK1’ formulations by nouns like hats, and (c) formulations of Ø K’s copresence to ØJ by several concurrent features of Ø J see-V Ø K locutions (discussed for (1a) above), Locke and Hume describe these denotational effects as (a’) sensible qualities of (b’) external objects that are (c’) present to the mind J, and thereby extract and respecify entities like (a’), (b’) and (c’) from attributes specified by locutionary features like (a), (b) and (c). In doing so, they collapse nearly all of the cross-cutting dimensions of discursive semiosis we have considered so far, and thereby convert the multidimensional world of enactable and construable possibilities that human beings can inhabit, modify or negotiate through speech into a kind of flatland populated by new entities that create new puzzles for posterity.
For instance, although all attributes of such entities (and of others that cannot be found in flatland) are readily specified by utterable and audible speech in public communication, once the idiom of sensible qualities furnishing ideas about objects to the mind takes over, the idiom respecifies its entities by privatizing and encapsulating them. Privatization: The locution He has ideas [#to her] of objects is infelicitous with the parenthetical variant, but fine without it, much like Miss Simmons felt [#to them] that she was hungry, as discussed in (4)ff. Encapsulation: Although Locke and Hume rely on extended narratives to obtain their results, once they encapsulate their results in word-long descriptions (their technical terms), the fact that entities like (a’), (b’) and (c’) were created through metalinguistic procedures – performed on (a), (b) and (c) – becomes non-recoverable by attempting to gloss a technical term like external objects K, which is a generic term for Ø K actants of Ø J see-V/feel-V/touch-V /…Ø K locutions, as is evident from (14a), while the mind J is a generic term for Ø J actants; such non-recoverability of narratively segmented content from the encapsulating term is exactly parallel to that in (6), where the details of the narrative in (6b) become non-recoverable when glossing words like recognizing or recognition in (6c). Encapsulation here obscures the fact that talk of external objects K is a correlative of mind J-talk. The act of replacing the locutions upon which metalinguistic experiments are performed with a new idiom creates new entities that obscure their own origins and create new puzzles. These obscurities persist in the writings of all subsequent authors who use this idiom or its equivalent [footnote 5].
The idiom grows. In describing his newly minted entities, Hume sometimes prefers Latin etyma (external objects K, (14a) < obiecta extrinsecus), and sometimes Germanic etyma [thing K, (14c); body K, (14e)], or links the two to each other, (14d), thereby repurposing Latinisms to invent a homegrown style of post-Newtonian things/bodies-talk for doing metaphysics in English, which subsequent writers adopt, making Humean flatland wildly popular for generations of humans to come. Notice also that (14e) exhibits a curious use of the term “afford”: we are told that only sensible qualities can afford us an idea of body. This idiom is elaborated into the doctrine of the “affordances” of “things” proposed by Gibson, as we shall see in the next section.
Homegrown styles permit new obscurities to be added to old ones through similar methods. Whereas Locke and Hume mainly assign sensible qualities to adjectival modifiers in ØK actants ([yellow hats]K, [bitter fruit]K,…) subsequent empiricists (like John Stuart Mill) try to extend this analysis to the nouns themselves ([yellow hats]K, [bitter fruit]K, …) but new problems arise when they aggregate them under homegrown nouns (like things K) that are maximally generic hypernyms (“a hat/fruit… is a (kind of) thing”) as well as anaphoric deictics (see below). Whereas nouns that have specific denotational stereotypy supply characterizability conditions on denotata of correspondingly specific kinds – as do hat (‘apparel, worn on the head,…’), fruit (‘edible, grows on plants, …’), and so on – hypernyms like thing (which are maximally generic because all sentences of the form “(a/the) __Noun is a (kind of) thing” are true sentences of English for every noun) sharply under-differentiate characterizability conditions on denotata in comparison with their hyponyms (i.e., (a) what is a thing? is a more puzzling kind of question than (b) what is a pencil /hat/fruit…?, so that language learners readily ask questions like (b), but when philosophers ask questions like (a) they end up writing treatises to answer them). Meanwhile, thing is also related to a class of quantifiers (something, anything, everything) that readily form anaphoric deictics in resumptive reference to (collections of) actants, which, if differentiated earlier in more specific ways (by hat K, pencil K, fruit K, etc.), can later be referred to as “some/any/every-thing K you were talking about” or more simply as “things K.” The possibilities for such resumptive hypernymic reference are far more expansive than the trivial case of monolexemic noun antecedents.
Any conceivable topic that person A holds forth upon (Queen Victoria’s third last hiccup K; the ring structure K of macromolecules; frail hopes K; what General Relativity tells us about spacetime K; nail cosmetics K,…) can (along with anything said about itK) later be described by person B as “the thingK youA were talking about,” or, through generic anaphora, as “thatK thing” or “thoseK things,” while incremental topic shifts and differences of opinion (K + n) can be introduced cataphorically as “but the thingK+n is….” These issues may be summarized as follows:

This dual indeterminacy creates a mystery: What exactly are perceivable things? After 200 pages of speculation, Mill refers to Hume and concludes that things K (which he also calls visible and tangible objects K, or (following Newton) forms of matter K) are “permanent possibilities of sensation” (Mill 1865, 198), which he then links to “the object of the sensation” (Mill 1865, 215), and later to objects K of touch-V and feel-V, and thus to their object actants (hats K, fruits K, etc.). But when such entities are described as things K or objects K of perception, characterizability conditions on all specific antecedentsK (and on all states of affairs of the universe involving themK) become non-recoverable at the moment of resumptive hypernymic reference. At this point, all Mill can do is to gloss his own technical terms. Although glosses of things K or objects K of perception as “permanent possibilities of sensation” are plausible (by sense-compositionality), they (and the kinds of voiced cognition they perform) are universally irrelevant to actual human beings, who can use perceivable speech and selective deixis to differentiate whatever they can talk about (including things that can’t be found in flatland) from each other, or, if they can’t, can simply give up talking about what no-one can locate (such as the things-of-flatland, as non-empiricists have done), whichever language they use.
The deeper irony of mentalese is that the things K that are said to be perceived always exclude speech – yet you can see or perceive the words on this page, can’t you? – but include certain referents of speech (mainly referents of concrete nouns), which eliminates any consideration of how what is included is assigned attributes by what is excluded, a problem whose solution requires remarkable feats of voiced cognition, such as the ones discussed above and the ones to which we now turn.
Gibson’s affordances
Gibson acknowledges “a similarity” between his own account and Mill’s “hypothesis of the permanent possibilities of sensation” (Gibson 1966, 223) because he is trying to complete Mill’s project of trying to incorporate things K-talk into mind J-talk. In doing so, he coins the idiom of the affordances of things to propose that “the ‘values’ and ‘meanings’ of things in the environment can be directly perceived” (2015 [1979], 119; emphases added). Regarding affordances, we are told that “I have coined this word as a substitute for values,” and that the affordances of things K depend on theirK “properties” (1966, 285). But what is it to directly perceive a thing K? Gibson’s talk of perceivable things faces the same pincer action of dual indeterminacy noted for Mill in (15), and attempts to gloss Gibson’s own terms by sense compositionality obscure their denotata for the reasons just discussed. Let us focus instead on how the doctrine of direct perception pairs things with properties. Gibson gives the doctrine an initial formulation, then switches to a more elaborate formulation.
The initial formulation is sourced from Koffka’s Hume-inspired gestalt psychology, which gives things K the ability to describe their own attributes through a very specific voicing structure:

How are specific things K paired with exactly what they say? For instance: Why doesn’t thunder say “eat me!”? Why doesn’t fruit say “fear me!”? Inanimate objects are paired with their dicta by projecting from the sense structure of intelligible verb-object locutions (eat fruit K makes sense, but eat thunder K doesn’t) attributes denoted by the verb (eat) onto – and only onto – its own direct object actant (fruit K), then allowing itK to communicate them “directly” to perceiversJ by treating themJ (in “the experience itself”) as addresseesJ of say- clauses and agentsJ of imperatives in biclausal represented speech – “a fruitK says [to ØJ] ‘[ØJ] eat meK !.’” – and similarly in the other cases. Voiced thing-communication is called “direct perception.” Difficulties arise: How does fruitK learn to speak English? Since its perceiver-addresseesJ may not all speak English, how many other languages does it speak? Is thing-multilingualism possible?
Gibson (2015 [1979]) tells us that “Kurt Lewin coined the term Aufforderungscharakter” around 1929, and Koffka (1935) re-interpreted it, as in (16), and that, although Gibson’s own “concept of affordance is derived from” these proposals, it has been modified. How? Gibson transforms Lewin’s and Koffka’s proposal into Anglo-American mentalese: “the affordances of things K for an observer J are specified in stimulus information K1” (Gibson 2015, 130), just as the idea of body K is afforded us J by sensible qualities K1 (Hume, (14e)), and things K/objects K are [for perceiversJ] permanent possibilities of sensation K1 (Mill, §6.0), where the parallelism of actants (italics and superscripts mine) metrically preserves a co-occurrence style that allows lexical variation across the ages. Whereas Lewin’s term die Aufforderung (“a prompt, request, summons,” etc.) is a metadiscursive noun that denotes the perlocutionary effects of speech-acts, and Koffka converts these speech-acts into the voices of things K, as in (16), Gibson’s term affordances converts the voices of things K into those “properties” of objects K that are manifest to perceiversJ, or subject actantsJ of perceive-V and look at-V, as always already “in” their object actantsK: “whatK weJ perceive when weJ look at objectsK are their affordancesK” (p. 126). New metalinguistic operations yield a new voicing structure, which creates Gibson’s affordances.
Although Gibson seeks to describe how objects K are perceived by animals J, the new voicing structure requires redefining these terms. Animals? He begins with talk of “any animal” and “all animals” but soon opts for “a more particular description” that applies “to animals that behave more or less as we do,” or are “animals like ourselves” (p. 31); and so, although the noun animals recurs throughout the book, it mainly describes humans, who, given his examples, are mainly speakers of English. What about objects? We are told that since “the term object as used in philosophy and psychology is so inclusive as to be almost undefinable,” his more restricted use of the term “refers only to a persisting substance with a closed or nearly closed surface” or “a ‘concrete’ object, not an ‘abstract’ one” (p. 31), which restricts objects K to denotata of concrete nouns (making explicit what remains tacit in Locke, Hume, and Mill). Nouns of this class – path, obstacle, brink, surface – are precisely the ones that show up in sentences like those in (17), through which affordances are created, curated and distinguished from all other entities in the universe.

These are obviously not sentences of ordinary English. They belong to a quasi-technical register that Gibson himself creates. How are sentences that describe affordances – and the register to which they belong – created?
Since affordances become manifest when objects K serve as ØK actants of perceive-V (and its hyponyms) – “whatK weJ perceive when weJ look at objectsK are their affordancesK” – objectK actants (like path K, obstacles K, brinks K, and surfaces K) must be linked to perceive-V in some way. Koffka’s method of direct voicing, as in (16), through which objectsK (like fruit K) describe their own attributes to perceiver-addresseesJ is not the method that Gibson uses. Rather, Gibson relies on the ability of animals-like-usJ to interpret ordinary English locutions that contain hyponyms of thing(s) K or object(s) K (like path K, obstacles K, brinks K, and surfaces K) and, by performing metalinguistic operations on such locutions, produces statements like (17), which purportedly describe (through voiced cognition) what animals-like-usJ perceive-V while looking at themK. Since Gibson never makes these methods explicit, even though they apply to all of his cases, he appears to be operating with what Sapir (1921, 167) called a “form feeling” (or intuitive grasp) for analogical linguistic patterns that language users cannot readily describe. Gibson’s inability to describe the patterns on which he relies is precisely what creates his affordances.
Since “affordancesK” are “whatK weJ perceive,” theyK are definitionally equivalent to Hume’s HPerceptions K. A new phonolexical string has been invented to recapitulate – and mask – an old error. Koffka’s method of direct voicing – which lets objectsK describe themselves to perceiversJ – won’t do because “fruitK says…” is an oxymoron from the standpoint of structural sense (say-V requires a human subject noun, which fruit isn’t). A new verb – under a new definition – is needed. Why not “fruitK affords…”?, and, more hypernymically, “objectsK afford…”? We now have the locutionary template for beginning all of the statements in (17), each of which begins with a hyponym of object(s) K positioned as subject of afford- V. How can these sentences be completed?
Since newly minted afford- V is a transitive verb, it needs a direct object. Which one? Koffka’s oxymoron is an object K lesson: The nouns to be found for direct object slots of afford- V must have structural sense conformity with other sentence-partials (so that pitfalls like fruit K says can be avoided). Where can the right nouns be found? The denotational stereotypy (attributes predicable) of nouns that fill the subject slot of afford- V can supply the nouns that fill direct object slots, but only if certain metalinguistic operations are performed on attributive predicates. What attributes of surfaces K, cliff-edges K, obstacles K and paths K are relevant to animals-like-usJ? Well, one of the attributes predicable of surfaces K is that theyK “support animals,” and nominalizing support- V gives us a support N, which allows us to complete (17a); but, in addition, (17a) must be construed as tacitly supplying a second (understood) noun of the right sort (as “surfaces afford (animals) a support”), not a noun of the wrong sort (e.g., not as “surfaces afford (virtue/wit/exogamy/…) a support”), a tacit sense that provides evidence for the (undisclosed) metalinguistic operation just performed. Similarly, one of the attributes predicable of cliff-edges K is that “animals are injured by falling off themK,” and nominalizing be injured- V gives us injury N, which allows us to complete (17b) (and, optionally, to switch out cliff-edge with brink without altering propositional content); and (17b) must also be understood as tacitly supplying an understood noun of the right sort (as “a cliff-edgeK affords (animals) injury”), not a noun of the wrong sort (e.g., not as “a cliff-edgeK affords (pencils/burps/tetrahedra/…) injury”), thereby signaling the (undisclosed) step through its own construal.
Similarly, one of the attributes predicable of obstacles K is that “animals can collide with themK,” and nominalizing collide- V gives us collision N, which allows us to complete (17c); and (17c) must also be understood as tacitly supplying an understood noun of the right sort (as “obstaclesK afford (moving animals) collision”), not a noun of the wrong sort (e.g., not as “obstaclesK afford (lisps/pastries/ hiccups/…) collision”), thereby signaling the (undisclosed) step through its own construal. Similarly, one of the attributes predicable of paths K is that “pedestrians (can) walk on themK,” and nominalizing walk- V gives us walking N; and, optionally, if you switch out walking N with a hypernym like locomotion N (cf. “walking is (a kind of) locomotion”), and if you retain pedestrian as modifier from the previous step, you get the variant in (17d), which is propositionally equivalent, but sounds more scientific because it exhibits features of scientific English (generic nominalizations, no gerunds, technical terms like afford-V).
The above steps are followed by a final step, which introduces taxonomic structure. If you nominalize afford-V into a new noun, affordances N, you can treat it as the hypernym for all the other nominalizations created above – locomotion, collision, injury, support – which can now be described as species of affordances. Notice that each of these affordances is paired in a one-to-one fashion with the subject term with which we began – so that locomotion is an affordance of paths K, but not of obstacles K or cliff-edges K, and similarly for the rest – thus preserving the structural sense regularities from which these claims are derived.
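The sequence of operations just described can be restated as a small Python sketch. It is only an illustration of the reconstruction given in the text, not a procedure that Gibson states. The class and function names (SourceStatement, afford_sentence, affordance_taxonomy) are hypothetical, the bracketed "understood" noun for paths is an assumption made for uniformity, and the generated strings follow the quoted construals in the text, not the exact wording of the sentences in (17).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SourceStatement:
    """One ordinary-English source statement and the operations applied to it."""
    subject: str                    # hyponym of object(s), placed as subject of afford-
    predicate: str                  # attribute predicable of the subject in ordinary English
    nominal: str                    # nominalization of the predicate's verb
    understood: str                 # tacitly supplied noun "of the right sort"
    hypernym: Optional[str] = None  # optional swap, e.g. walking -> locomotion

    def object_noun(self) -> str:
        # The direct object of afford- is the hypernym if one was swapped in.
        return self.hypernym or self.nominal


# Source statements as given in the text; "pedestrians" as the understood
# noun for paths is an assumption of this sketch.
SOURCES: List[SourceStatement] = [
    SourceStatement("surfaces", "support animals", "a support", "animals"),
    SourceStatement("a cliff-edge", "animals are injured by falling off them",
                    "injury", "animals"),
    SourceStatement("obstacles", "animals can collide with them",
                    "collision", "moving animals"),
    SourceStatement("paths", "pedestrians (can) walk on them",
                    "walking", "pedestrians", hypernym="locomotion"),
]


def afford_sentence(s: SourceStatement) -> str:
    """Build the afford- statement, with the understood noun shown in brackets."""
    verb = "affords" if s.subject.startswith("a ") else "afford"
    return f"{s.subject} {verb} ({s.understood}) {s.object_noun()}"


def affordance_taxonomy(sources: List[SourceStatement]) -> Dict[str, str]:
    """Final step: treat each nominalization as a species of 'affordance',
    paired one-to-one with the subject term it was derived from."""
    taxonomy: Dict[str, str] = {}
    for s in sources:
        noun = s.object_noun()
        if noun in taxonomy:
            raise ValueError(f"pairing is not one-to-one for {noun!r}")
        taxonomy[noun] = s.subject
    return taxonomy


if __name__ == "__main__":
    for s in SOURCES:
        print(afford_sentence(s))
    print(affordance_taxonomy(SOURCES))
```

Every output of the sketch is computed from the English source statements supplied as input, and nothing in it consults the objects themselves, which is the point the surrounding argument makes about how such statements are produced.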
Gibson thus creates a quasi-scientific register (complete with nominalizations, taxonomies, nomic verb inflection, etc.) for discussing things K, in which things K that cannot speak for themselves (unlike Koffka’s things K) can be made to disclose their attributes through a series of (undisclosed) metalinguistic operations, which convert (through voiced cognition) attributes predicable of themK (by speakers of English) into “properties” that are allegedly manifest “in” themK (to all animalsJ who perceive themK). Are they? Any speaker of English can see 2 ‘discursively conclude’ that these sense relationships do hold between source statements (of ordinary English) and statements-about-affordances (in scientific register), particularly when the latter are explicitly justified by appeal to the former, but no speaker of English can merely see 1 ‘perceive with eyes’ such relationships, and those who speak no English can neither see 2 them nor see 1 them. And so for all Gaffordances K [= “whatK weJ perceive” ≃ HPerceptions K], which are allegedly manifest in “direct perception,” perception-talk eliminates discursive ability and social life, as in (11), and attempts to give sense-compositional glosses of affordances or perception(s) K make non-recoverable the fact that Gibson manufactures themK by performing metalinguistic experiments on the sense structure of English, even if Gibson, unaware that languages have sense structure, claims that all animals-like-usJ can see 1 themK in “ambient light.”
Despite the appearance of science-i-ness, the ordinary English sentences – from which these claims are derived – are much more contingent. Cliff-edges K and injury have no necessary connection to each other. Various animals – including people – can be injured by cliff-edges, mainly when they fall off them (another term you have to supply for the affordances claim to work). Yet people have other relationships to cliff-edges too. A cliff-edge can become someone’s property and, when they build a cottage or mansion on it, can be regrouped as real estate. It can become someone’s brand. Artistically re-designed images of mountain sides and cliff-edges form part of the brand logo of REI® and North Face® sports equipment (purchased by millions of people who never go near cliff-edges). Romances unfold while people gaze at sunsets, and each other, while sitting near them. Like any standalone noun, cliff-edges produces an object formulation of referents through attributes predicable of it, but this is only a beginning. The range of attributes that can be assigned to the referents of any expression – a tiny subset of which Gibson encapsulates as (the) “affordances” of (the) “things” (to which the expression refers) – is not confined or delimited by the manner in which standalone expressions formulate their referents but is limitlessly extendable by the narratives and practices in which locutionary arrays containing such expressions (and whatever they denote) occur in the lives they enable, which makes this range limitlessly large, and always immune to taxonomy, simply because new attributes are incrementally produced each time people are linked to each other through narratives and practices involving them and their denotata. Speakers of standalone nouns (if any exist) may well live in a voiced theatre of affordances. People don’t.
More importantly, narratives that articulate attributes may be crafted by people other than the ones who rely on them. Let us approach this issue by considering items of food and drink. Karrebæk (2014) shows that “food items do not have intrinsic meaning” but are assigned attributes (like delicious and unhealthy) by competing metasemiotic discourses in Danish schools, which also articulate contrastive registers of conduct involving them. Children whose lunch boxes contain rye bread K (which is promoted by the Ministry of Health as nutritious) are evaluated by teachers who see itK as “good appropriate and respectable school-children” (p. 22); meanwhile, immigrant parents of minority children evaluate pork K as haram “impermissible” (in accordance with Islamic religious dicta) unlike parents of other children. The same food items are therefore “differently enregistered” (p. 20) both by criteria of edibility (of food) and by criteria of social positioning (of persons who eat). Similarly, Roseberry (1996) shows that coffee-products were transformed by marketing methods into emblems of social distinction in the United States in the 1980s, assigning “Yuppie coffee” and its users a class position legible to others. Similarly, in her study of registers of consumption promoted by French multinational corporations, Yount-André shows that “the values that speech registers index allow speakers (and eaters) to position themselves relative to one another in social interaction” (Yount-André 2021, 90). These cases point to issues that pertain to a much wider class of phenomena.
A great many narratives about the entities that peopleJ see-V/touch-V/taste-V/…ØK are not crafted by themJ at the moment theyJ encounter themK. They are formulated altogether elsewhere, and acquire an asymmetric distribution in society through their own dissemination, situating and differentiating those who rely on them within frameworks of social organization, and this entire range of discursively mediated social processes becomes non-recoverable, as in (6), during attempts to gloss phrases like the affordances of things just when the solipsistic daze of voiced cognition kicks in the doors of perception.
Durkheim’s mana
In Durkheim’s (1995 [1912]) usage, nouns like religion and society are homonyms of ordinary language words but differ in sense, and are thus distinct lexemes. Similarly, the word-form mana, which plays a pivotal role in his narrative, is a homonym of a Melanesian word but has a Durkheimian sense unknown to Melanesians (see below). The strategy of lexemic homonymy is of course a means of introducing a voicing structure into his subject matter. In order to clarify sense and voicing contrasts among homonyms, I write Durkheim’s terms with preceding superscripts (D for Durkheim) in what follows. We realize that Dreligion denotes something different from the ordinary word religion, when Durkheim makes it plain that you can’t study the former by studying present-day Christianity, Judaism, etc., because they lie at great evolutionary remove from his own subject-matter, the “elementary” forms of Dreligious life that he can locate in “primitive” societies. This voicing structure enables Durkheim to incorporate Western philosophical terms into his definition of Dreligion, and then to ascribe them to primitives: “At the root of our judgments, there are certain fundamental notions that dominate our entire intellectual life,” which “philosophers, beginning with Aristotle, have called the categories of understanding: notions of time, space, number, cause, substance, personality,” and this hodgepodge of categories, drawn from such mutually incompatible doctrines as those of Aristotle and Kant, is said to “confine thought” in such a way that it cannot “break out of them without destroying itself,” while an analysis of something Durkheim calls “primitive religious beliefs” will reveal that such categories are “born in and from religion” (pp. 8-9).
The denotational non-equivalence of religion and Dreligion should now be evident: No one supposes that Christianity or Buddhism, for instance, have articulated concepts of time, space, number, cause, etc., that correspond either to Aristotle’s or Kant’s proposals or constitute a “skeleton of thought.” Meanwhile, Durkheim’s own proposals about Dreligion are not based on ethnographic research on any religion. His book is a library dissertation: He consults contemporary writers on “primitive” societies, cherry-picks some passages as “data,” and uses these to answer his own questions. Once created through this technique, the construct he terms Dreligion relies heavily on interlinked claims about Dmana and Dtotemism.
In refuting the classical doctrine of totemism, Lévi-Strauss castigates Durkheim for “the kind of amalgamation” he attempts “between the notions of mana, totem and tabu” (Lévi-Strauss 1991 [1962], 32). For Durkheim (1995 [1912]), Dtotemism is “the religion of a kind of anonymous and impersonal force” that can “bring about certain effects mechanically” (p. 192), attributes he assigns to Dmana as well, once the latter has been transfigured into a “substratum of belief” underlying Dreligion everywhere. But if “religious representations are collective representations” (p. 9), what’s a representation? A Drepresentation is similar to a Lockean idea or a Kantian Vorstellung in that it is a mental state that precedes language. Why? Here is Durkheim’s account of language:

Durkheim’s conjectural evolutionism holds that “language” is merely a superficial and latter-day addendum to pre-discursive Dideas (a modified blend of Lideas and Hideas), which form the deeper substrate that Durkheim seeks to describe. He asks “Now let’s see how useful language is to thought. Is it essential or can we think without signs?” (p 224). He answers: “…language aids thoughts but isn’t a complete substitute for ideas.” (p 226).
The primacy of Dideas pervades the analysis given in Elementary Forms. Much like the “anonymous and impersonal force” he calls Dreligion, Durkheim’s Dideas are anonymous and impersonal too: they can be scooped out of lexical items in unrelated languages and, once liberated from the narratives of others, can be respecified through his own narrative and encapsulated into thoughts held by collectivities anywhere. This is accomplished through a sequence of metalinguistic operations, which include at least four steps.
The first step is to decouple Dideas from whatever people “call” them and then to claim their equivalences across locales:

Since perfect synonymy between expressions is never found even in a single language, it is evident that the expressions wakan [Sioux], orenda [Iroquois], pokunt [Shoshone], manitou [Algonquin], nauala [Kwakiutl], yek [Tlingit], and sgâna [Haida] are not even potentially plausible as candidates for synonymy. Yet conditions on their identity as Dideas – understood as conditions on identity independent of the words used to express them – are easily supplied through a strategy of voiced cognition: whenever these various “primitives” utter such words, the pre-discursive Didea they are trying to express is the one glossed (by Durkheim) as “Power in the absolute, without qualification or limitation of any kind.” Denotation aside, the expressions from each language he cites are also speaker-focal social indexicals: the ability to produce and have any comprehension of utterances containing the word wakan is indexical of membership in the community of Sioux speakers; of those containing the word yek is indexical of membership in the community of Tlingit speakers; and so on. Durkheim eliminates these contrasts. In this era of colonial dominance, Europeans can flay the syllables from the mouths of the colonized with supreme self-assurance. Even more consequential, as we shall see, is the manner in which stories about distant others organize projects at home.
Having decoupled his unitary Didea from the languages that express it, Durkheim proceeds to his second step: to find an ur-principle that underlies “the same idea” everywhere.
In statements like “…it is not peculiar to the Indians of America; it was first studied in Melanesia…,” the anaphor “it” refers to the Didea he has just identified. He continues: “We find among these peoples [of Melanesia], under the name ‘mana,’ a notion that is exactly equivalent to the wakan of the Sioux and the orenda of the Iroquois.” In order to track “exactly equivalent” samples of his Didea across locales, he needs to supply some locutionary form that expresses it. He continues with the same anaphoric “it”:

The quote from Codrington supplies Durkheim with a ØJ believe-V ØK frame (and tools for transposing speech to thought; see 2.0 above), where ØJ gets named The Melanesians J, understood as a social collectivity (i.e., all the residents of the 2,000 or so islands that Europeans call “Melanesia,” all treated as a single believer), and the ØK term gets called mana K, now understood as a label for a collective belief. Continuing in his own voice, Durkheim equates Codrington’s gloss for mana with “the same notion” that Durkheim himself has identified elsewhere, thereby replacing their speech (which Codrington glosses by glossing mana) with his own thoughts.
Durkheim’s third step is to propose that mana be used as a cover term for this underlying Didea in all societies. He first extends it to Australia: “Therefore, it is by no means reckless to impute to the Australian societies an idea such as the one I have drawn from my analysis of totemic beliefs …” (p. 197). He then extends it to “magic as a whole” (“Hubert and Mauss…established that magic as a whole is based on the notion of mana,” pp. 203-4), so that “mana” is found everywhere “magic” is found. He then extends it to “all quarters” and to the “more fundamental” Dideas encapsulated in the totemic principle (thus creating the bundle which Lévi-Strauss unravels in the passage cited earlier):

The conjectural evolutionism we noted earlier in Durkheim’s Lectures – “In the beginning, the signs were very simple and expressed ideas only in vague terms…”; see (18) – is deployed in Elementary Forms to identify the “simpler and more obscure” Dideas that constitute a “primitive substratum” for Dreligion everywhere. Just as language-specific words and expressions were discarded earlier on, extended narratives, such as “mythological constructions,” can now be discarded as mere “secondary products” of the “substratum of belief” that underlies them. Decoupling his Didea from the speech of those who articulate it allows Durkheim to respecify it: “It is the notion of force in its earliest form.” This “notion” is then re-voiced as the mental states of generic collectivities – as what “the Sioux conceive,” as what “The Melanesian imputes,” and so on – in a miscellany of passages that follow, as in (22):

A vast aggregation of imputed beliefs about causal forces in tow, Durkheim proceeds to step four, where familiar phonolexical strings are endowed with new senses: His new definition of Dreligion (created by voicing forms of cognition to Dsocieties elsewhere) can now be used to redefine what ordinary nouns like religion and society (and nouns denoting practices found in such societies, like philosophy and science) really mean for the Europeans who use them: “Thus the idea of force is of religious origin. From religion, philosophy first and later the sciences borrowed it.” (206)
When Dreligion and Dsociety come home, the circle of voiced cognition is complete. Durkheim, who is trained as a philosopher, seeks not merely to create a new field of study, called sociology, but to constitute its object of study, called society, as pre-eminent among objects of study insofar as philosophers (like students of Aristotle or Kant), and scientific researchers (in physics or chemistry) must recognize that the “concepts” they use are merely latter-day descendants of the more original source that Durkheim himself has identified. As long as the sequence of metaphors remains unbroken, the magnitude and power of this new entity can be respecified at pleasure (“A society is to its members what a god is to its faithful,” 208), and collective Drepresentations K (thoughtsK [-decoupled-from-speech-and-then] housed in the mind J of [speaker-]aggregates) can be re-specified as the voice of all K when society speaks within an individual’s utterances: “It is society that speaks through the mouths of those who affirm them in our presence; it is society that we hear when we hear them; and the voice of all itself has a tone that an individual voice cannot have” (210). No-one has so far identified the mysterious tone that distinguishes the voice of all from an individual voice, needless to say.
How did these mysteries emerge? Keesing (1984) shows that the metaphysics of Dmana (and thus of Dreligion and Dsociety) is based on confusion about the sense of the word mana in Oceanic languages.

Durkheim ignored Codrington’s doubts and re-defined the term to suit his purposes: In claiming that “The whole religion of the Melanesian consists in procuring mana for himself, for his own benefit or someone else’s” (Durkheim 1995 [1912], 196), Durkheim treats mana as a noun denoting a thingK (which a person can seek or acquire), and The Melanesian J as a generic being (whose Dreligion includes beliefs about itK).
Keesing (1984) shows that mana is “a stative verb, not a noun” (138), and since Oceanic languages differ from each other, speakers cannot be aggregated into a unitary being like The Melanesian, so that the non-existence of this entity hinders attempts to assign it any beliefs.
Although nouns can be derived from verbs in Oceanic languages, much as in English, derivational patterns and word senses differ across these languages. Keesing’s survey of Oceanic languages shows that members of this language family exhibit two patterns for deriving nouns from verbs: In one pattern (Class I), an overt suffix is added to a verb stem to create a noun; in the other pattern (Class II), no suffix is added to the verb stem, so that noun and verb forms are identical. Some languages (like Kwaio) use only the Class I pattern; other languages (like Bugotu) use only the Class II pattern; and a third class of languages (like Gela) exhibit both patterns, as illustrated in (24):

Both patterns are found in English as well. In Class I, the Gela verb tambu-V yields the noun tambu-ga N by addition of the suffix -ga, just as the English verb resist-V yields the noun resistance N by addition of the suffix -ance. In Class II, however, the Gela word-form mbou sometimes functions as the verb mbou-V (with the sense indicated in column A) and sometimes as the noun mbou-N (where the zero suffix form, -ø, column B, marks absence of any overt suffix), just as many English words (paper, hand, walk, etc.) have the same word-form in verb and noun function (to paperV the wall vs. a piece of paperN; to handV over the money vs. the human handN; to walkV the dog vs. a long walkN), and, in both languages, the construction in which the word-form occurs (italicized in the English examples) clarifies whether the word-form counts as a verb or noun in the instance, thereby posing no difficulties for language users, whatever conundrums it may pose for those unable to describe the language they themselves speak (Codrington) or languages spoken by others (Durkheim). Despite similarities, however, the zero-suffix noun remains an abstract noun in Oceanic languages, but need not be in English, which has consequences for the word-form mana.
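The contrast between the two derivational patterns can be stated compactly in a few lines of Python. This is a minimal sketch that assumes plain string concatenation is an adequate stand-in for suffixation; the function name derive_noun is hypothetical, and only the forms cited in the text (tambu, -ga, mbou, resist, -ance, paper) are used as data. The sketch writes the derived Gela noun without the morpheme-boundary hyphen that the text uses in tambu-ga.

```python
from typing import Tuple


def derive_noun(verb_stem: str, suffix: str = "") -> Tuple[str, str]:
    """Derive a noun from a verb stem and report which pattern was used.

    Class I:  an overt suffix is added to the verb stem.
    Class II: no suffix is added (zero derivation), so the noun form is
              identical to the verb form.
    """
    noun = verb_stem + suffix
    pattern = "Class I" if suffix else "Class II"
    return noun, pattern


if __name__ == "__main__":
    # Class I: Gela tambu + -ga, English resist + -ance
    print(derive_noun("tambu", "ga"))
    print(derive_noun("resist", "ance"))
    # Class II: Gela mbou, English paper (noun identical to verb)
    print(derive_noun("mbou"))
    print(derive_noun("paper"))
```

For Class II forms the function returns a string identical to its input, so the word-form alone cannot show whether it counts as verb or noun. As the text notes, that information comes from the construction in which the form occurs, which this sketch does not model.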
Keesing argues that mana was a Class II stative verb that formed abstract verbal nouns with zero suffix in the parent language (Proto-Oceanic) and is realized (in various constructional forms) in descendant languages, whose usages thereby share some features: since the verbal noun denotes abstract situations, people and things involved in them can be described as ‘having mana-ness’ or as ‘being mana’; or, a derived transitive verb (with a different suffix) can be predicated of god/ancestor as agentive subject to say that ancestors support/ protect/ empower people; but these are glosses of utterances containing mana locutions, which contextually vary in meaning, not descriptions of a unitary belief that can be captioned by the noun mana and ascribed to all speakers at all times and places.
What about present-day usages of mana? Mana locutions differ in form and sense across the Oceanic languages, and none are related to the metaphysics proposed by Durkheim and others. In Fijian and Tongan, the stative verb mana can be glossed as ‘be effective’ or ‘it works’ but what is described varies by topic: “A Fijian cure which is mana for one complaint, may not be mana for another…” (p. 144). Mana locutions permit a wider range of glosses in Maori – viz., ‘be effective’ (he kupu mana tana kupu ‘his word is effective’); ‘fulfills’ (ka mana taku kupu i au ‘I will fulfil my word’); ‘potent’ (he karakia mana ‘a potent charm’); ‘granted’ (e kore to tono e whakamana ‘your request will not be granted’) – but utterance construal varies in the same way. Other Oceanic languages exhibit similar variation.

Given manifest results, utterances containing mana locutions can convey the efficacy or success of an antecedent cause; but, depending on the subject noun phrase, a mana locution (as a whole) might convey the efficacy of penicillin, or success by doctors, or support by ancestral spirits, and so on, and such forms of efficacy have no ingredient or “substrate” that is common to all cases: “…mana as invisible medium of power was an invention of Europeans, drawing on their own folk metaphors of power and the theories of nineteenth-century physics” (148). More particularly:

We might conclude by observing that Durkheim’s strategy of captioning Dideas with words from other languages, while ignoring how any words are used or interpreted by their speakers, creates a metaphysics of Dmana, which, once fabricated, can readily be imputed to a miscellany of collectivized others in North America, Australia, or anywhere else, without further ado. And given the voicing structure of tropes of homonymy (which equate Dreligion with religion, and Dsociety with society) a new enterprise can seemingly be created, which, if others pursue it, reproduces itself over “subsequent generations of anthropologists” (p. 138), even if such fancy has no basis in fact, let alone in “social fact” – or Dsocial fact – whatever that is.
On the other hand, if we reject such encapsulations by noting that “the difficulty with the Durkheimian notion of social fact… is the question of how such a collective understanding itself comes about,” and ask “how then does a social regularity of recognition emerge?” (Agha 2003, 245-6), the role of discursive semiosis in making and re-making such constructs, and in equipping people with socially positioned registers of emblematic conduct within them, becomes quite clear.
Conclusion
All human beings continuously rely on the medium of articulate speech to manage their affairs, and verba sentiendi are among the tools through which they manage them. Such verbs can be handy tools for denoting a wide range of effects in relation to accompanying co-textual arrays, but this range is far wider than their glosses might suggest (as discussed in the first four sections of this article). They can also be extracted from co-textual arrays into metalinguistic queries about standalone nomina sentiendi (whether these be nominalizations of verbs [(10), (11)] or of their actants [(15)]) through procedures that eliminate the very conditions under which adequate answers can be given (as discussed in the next three sections). The act of posing and answering such questions (and the processes of encapsulation, privatization, denotational re-segmentation, etc., on which they rely) create conduct formulations that obscure the conduct of human beings from themselves. The entities created through such practices – the signatures of modernity, the darlings of “mind”-craft – provide no clue as to how they were created or curated from the expressions that appear to denote them, especially when neither those who propose them nor those who accept them are able readily to describe the properties of the medium on which all their efforts rely, and which these entities now appear to sideline or supplant, namely the medium of articulate speech. Yet the various constructs proposed in the above cases are of less intrinsic interest than are the methods of entity manufacture on which they rely, which are highly productive, and which therefore can – and have, and will again – be used to assemble many other variants too, so that describing the processes through which they are assembled appears to be a worthwhile thing to do.
Yet the issues of real interest lie elsewhere. Sapir observed that “even a child may speak the most difficult language with idiomatic ease” but cannot “define the mere elements of that incredibly subtle linguistic mechanism” (Sapir 1963 [1927], 549). All adults are exactly like Sapir’s child in one – and only one – respect: all human beings have a facility in linguistic communication far greater in scope and power than the ability to describe explicitly the properties of the medium of communication on which they rely, which always remains meager in comparison unless effort is made to acquire the tools necessary to analyze speech, a task seldom attempted by most people, however, because facility of the former kind obscures inability of the latter kind.
These are the conditions under which new constructs are created and brought into the social world through discursive semiosis, carved out of quotidian varieties of cotextual locutionary form, which everyone can deploy with perfect ease, but which the most industrious members of each generation may lack the ability to describe, perhaps because they put in no effort for the reasons just noted, perhaps because they were too busy fabricating social constructs and false hopes instead. After all, once items are extracted from locutionary arrays, they are readily respecified through new labels, and, through attributes predicable of them, can be repositioned into new narratives and explanatory schemes, to which their readers, hapless in the same way, can be drawn for a while, perhaps because these constructs seem to fill those very gaps within locutionary arrays – such as those in Ø J see-/think-/believe-… Ø K, and derived locutions – from which these constructs were exapted in the first place, and to which common sense guides haplessness once again. At least for a while.
Yet all of this is as nothing compared to the possibility that should we attempt to investigate the manner in which such constructs are fabricated in any or all of the discursive practices on which our lives rely – law, medicine, commerce, science, politics, and so on – and can acquire the means to investigate them, we might become better equipped to navigate our way through the constructs we encounter as seemingly prefabricated within them. Does this seem like a worthwhile thing to do?