January 25, 2015

(Koen Sebregts points out that the title of my post is misleading, since the authors explicitly state they do not predict humidity to cause complex tones. I’ll leave it unchanged as a warning to myself.)

There is a new ‘geophonetics’ paper out at PNAS this morning, ‘Climate, vocal folds, and tonal languages: Connecting the physiological and geographic dots’, by Caleb Everett, Damián Blasi, and Seán Roberts. In a nutshell, the claim, as I understand it, is that since low humidity is inimical to precise manipulation of the vocal folds, languages with ‘complex tone’ (understood as having a more than two-way pitch-based contrast) are unlikely to thrive in regions with low humidity (i.e. hot and arid, and all types of cold). This hypothesis appears to explain the areal patterns in the prevalence of tone.

The phonetic component of the argument seems quite sound to me, and I’m not really qualified to judge the evolutionary part of this paper in any reasonable way. As a phonologist, there’s something that really bothers me though, and that is the almost total lack of critical discussion of what it means to be a tonal language. It’s noticeable that the phonological part of the bibliography is really sparse, especially compared to the phonetic part: there’s Moira Yip’s Tone textbook, several large-scale overviews like Maddieson (1978) and Hyman (2001), a couple of tonogenesis citations, and Jonathan Kaye’s (1990) textbook cited as an example of the consensus as disputed by the authors. No other work by Larry Hyman, who has published a lot on the typology of ‘tone languages’, no discussion of the whole idea of ‘pitch accent’ and how it relates to tone: the only definition given (p. 2) is ‘phonemic tone, in which pitch is used to contrast lexical meaning’. The data sources are WALS and especially the ANU phonotactics database. Both of these sources, while extremely valuable, put a lot of emphasis on documenting ‘diversity’ and quite a bit less on pinpointing what exactly it is they’re talking about.

I’ve gone to the ANU database to have a look at what it says about tone in languages I know about. Now I work on fairly ‘boring’ European languages, but recently I’ve been getting quite interested in the European ‘pitch accent’ systems, plus I actually know a bit about these languages, so I thought I’d have a go at a selection. I have to say I wasn’t exactly happy with the results. The following list gives the number of tonal contrasts recorded for each language in the ANU database, and my comments if any.

  • Celtic
    • Scots Gaelic: 0. This can’t be right; there is a ‘pitch accent’ or glottalization system in most varieties as far as we know. As a shameless plug, here is a paper of mine with more references inside.
    • Welsh: 0. (The pitch system of Welsh is non-trivial, but fair enough.)
    • Cornish: 0. (Probably, but how do we know for sure?)
    • Manx: 0. (Probably true, but how did it end up somewhere in the Trossachs?)
    • Breton: 0. (Fair enough, although see the paper linked above for some discussion.)
  • North Germanic
    • Danish: 0. Here’s a good example of the controversy. Danish has stød, a type of laryngealization. Does it count? Maybe not, but it has been analysed as being tonal (e.g. by Junko Itô & Armin Mester and Tomas Riad), although reasonable people disagree. In any case, tonal dialects of Danish definitely exist.
    • ‘Jysk’: 0. (Why not Jutland Danish or the like? Anyway, it has both regular stød and so-called Western Jutland stød, so how do we count?)
    • Old Norse: 0. We don’t know that! Not zero according to Tomas Riad, anyway. Or see Haukur Þorgeirsson’s dissertation – OK, it’s in Icelandic, but still…
    • Nynorsk: 2. Nynorsk is a written standard…
    • Aurland: 2. Why Aurland specifically? But fair enough.
    • Trondheim: 2. Does this mean Standard Norwegian as spoken in Trondheim? Well then yes. But all around Trondheim you have dialects with so-called ‘circumflex accent’, where apocope creates a tonal contrast in monosyllables (usually absent in North Germanic). Does it count as 2 (after all, there is only ever a two-way contrast in monosyllables vs polysyllables) or 3 (the traditional way of looking at things)?
  • West Germanic
    • Limburgish [sic]: 2. OK, so we count pitch accents as tones. But many people have argued that it’s not a lexical tone contrast but instead something else. How do we count? (Actually, the same question applies to much of North Germanic.)
    • Kölsch: 0. Unless I’m grossly mistaken, it does have a similar type of ‘tonal accent’ system to Limburg. So either both have tones or neither.
  • Baltic
    • Latvian: 2. That’s just incorrect, I think – at least in conservative varieties you have a three-way contrast in stressed syllables. OK, the third ‘accent’ involves glottalization as well, but how do we decide to exclude that?
    • Lithuanian: 0. Definitely has a pitch accent system. If Latvian is treated as having two contrasts then surely Lithuanian should have (at least) two?
  • Slavic
    • Serbian: 2
    • Croatian: 0
    • Chakavian [sic]: 3
    • Bosnian: 0
    • To me, this doesn’t really compute: there are certainly differences among these varieties but as far as I know they’re not quite as drastic. (That is off the top of my head, though, I’m willing to be corrected.)
  • I also note that Basque has 0 contrasts, while Japanese has 2 or 3; there are dialects of Basque that have been argued to have a pitch accent system similar to the Japanese one.

None of this is to say that the ANU database is bad and unusable, or that the paper is incorrect rubbish. This is a random, small selection of unrelated languages in a particular region. There is also much to be said for treating European tonal accents as a typologically unusual special case: Jakobson did that with the Baltic tonal Sprachbund as early as 1931.

On the other hand, the authors are clear that their predictions should be particularly salient in relatively extreme environmental conditions, so it’s not entirely beside the point to look at northern Europe more closely. And it’s interesting that in the survey above the ANU database is consistently underreporting the number of tonal contrasts in this fairly cold, fairly dry region.

This does worry me. I’m sure the statistical models give the correct results for the data that’s fed into them. I’m willing to accept that the large amount of data might drown out this sort of noise. And yet, still, I’m finding it difficult to be convinced in matters I know little about when the things that I do know something about are a bit dodgy.

Again, this isn’t about bashing the munging of large-scale data. This paper is certainly quite explicit in its hypotheses and proposes a causal explanation based on more than correlations alone. But the data appears to have been collected without much regard to the theoretical underpinnings of what it is that we’re actually collecting here. And that means I’m not sure.


About me

I’m Pavel Iosad, and I’m a Senior Lecturer in the department of Linguistics and English Language at the University of Edinburgh.


