23 Apr 2018

India, a land of diversity, is home to many cultures, and the vast number of languages spoken here is a testament to that fact. While learning these languages is exciting, studying them is fascinating for another reason: they help us understand the little-known history of the Dravidians, who lived in the subcontinent before the speakers of Indo-Aryan languages arrived around 1500 BC. So far, not much is known about their origin and their dispersal across the country. Now, a study by a group of researchers from the Max Planck Institute for Psycholinguistics, the Netherlands, has reconstructed the Dravidian language family tree in an attempt to understand Dravidian history. The results suggest that the language family is about 4500 years old! So what does that mean?

The Dravidian language family is one of the world's primary language families. It has 80 varieties, spoken by over 200 million people in south, central and north India. Interestingly, the Kurukh language of Nepal and the Brahui language of Pakistan and Afghanistan belong to the Dravidian family. Certain languages of the Dravidian language family are known to have been written for over 2000 years, and influenced Vedic Sanskrit and modern Indo-Aryan languages, thus playing a significant role in contact between Indo-European and Austroasiatic language families.

Previous linguistic studies have tried to understand the relationship between various Dravidian language subgroups and have recognised four main branches of the language family. The South Dravidian I subgroup consists of Tamil, Malayalam, Irula, Kodava, Kurumba, Kota, Toda, Badaga, Kannada, Koraga and Tulu, among others. The South Dravidian II subgroup includes Telugu, Gondi and Kuvi, among others. The Central Dravidian subgroup comprises Gadaba, Parji and Kolami, and the North Dravidian subgroup comprises the Brahui, Kurukh and Malto languages.

An interesting fact about the Dravidian family of languages is that there are many smaller, spoken ones that are not well studied. The researchers of this study, published in the journal Royal Society Open Science, focused on a representative sample of languages and used first-hand data to reconstruct the Dravidian language family tree.

“We collected 100 items of basic vocabulary from native speakers of a diverse sample of Dravidian languages”, says Dr. Annemarie Verkerk, an author of the study. They used the Swadesh list, a compilation of basic vocabulary devised by the linguist Morris Swadesh, which helps in assessing how existing languages are related and how they have changed over time.

While previous studies on the Dravidian family of languages have used datasets derived from dictionaries, this study combined data collected first-hand with advanced statistical analysis. “Swadesh’s 100-concept elicitation list was used to collect lexical data for 20 languages. Responses were recorded and written down if this was possible. For some spoken languages, informants did not want to write down responses in non-native scripts. Recordings and written transcripts were transcribed to the International Phonetic Alphabet (IPA). Cognate (relatedness) coding of the responses was performed using the Dravidian Etymological Dictionary”, explains Dr. Verkerk, talking about the methods used in the study.
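The cognate-coding step described above can be pictured with a small sketch. It assumes that each response has already been assigned a cognate class label, and it turns those labels into the kind of presence/absence table that phylogenetic programs commonly take as input. The language names, concepts and labels below are invented placeholders, not data from the study, and the function name is hypothetical.

```python
# Hypothetical cognate judgements: concept -> {language: cognate class label}.
# All names and labels are invented for illustration; they are not study data.
cognate_classes = {
    "water": {"LangA": "1", "LangB": "1", "LangC": "1", "LangD": "2"},
    "eye":   {"LangA": "1", "LangB": "1", "LangC": "1", "LangD": "1"},
    "tree":  {"LangA": "1", "LangB": "1", "LangC": "2", "LangD": "3"},
}


def binary_matrix(classes):
    """Return (columns, matrix): one 0/1 column per (concept, cognate class)."""
    languages = sorted({lang for judgements in classes.values() for lang in judgements})
    columns = sorted(
        {(concept, label) for concept, judgements in classes.items()
         for label in judgements.values()}
    )
    matrix = {
        lang: [1 if classes[concept].get(lang) == label else 0
               for concept, label in columns]
        for lang in languages
    }
    return columns, matrix


columns, matrix = binary_matrix(cognate_classes)
for lang, row in matrix.items():
    print(lang, row)
```

Each row records which cognate classes a language has; languages that share many 1s in the same columns share more inherited vocabulary.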

Map of the Dravidian languages in India, Pakistan, Afghanistan and Nepal, adapted from Ethnologue (Source: R. Soc. open sci. 5: 171504)

The researchers used a Bayesian phylogenetic inference method, a probability-based statistical approach that estimates the evolutionary history and relationships of a set of species or languages, taking prior information into account. Commonly used in ecology, this approach yields a set of highly probable language trees, rather than a single ‘best’ language tree.
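To see why a set of trees is more informative than a single tree, consider how such a set is usually summarised. The sketch below does not perform the inference itself. It assumes a tiny, invented sample of trees and counts how often each grouping of languages appears, which gives a rough measure of how well supported that grouping is. The language labels and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical sample of trees over four placeholder languages A-D.
# Each tree is written as the set of its clades (groups of languages that
# share an exclusive common ancestor). Invented for illustration only.
tree_sample = [
    frozenset({frozenset({"A", "B"}), frozenset({"C", "D"})}),
    frozenset({frozenset({"A", "B"}), frozenset({"C", "D"})}),
    frozenset({frozenset({"A", "B"}), frozenset({"A", "B", "C"})}),
    frozenset({frozenset({"A", "C"}), frozenset({"B", "D"})}),
]


def clade_support(trees):
    """Fraction of sampled trees that contain each clade."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: n / len(trees) for clade, n in counts.items()}


support = clade_support(tree_sample)
for clade, share in sorted(support.items(), key=lambda item: -item[1]):
    print(sorted(clade), share)
```

In this toy sample the grouping of A with B appears in three of the four trees, so it is better supported than any other grouping, even though the trees disagree elsewhere.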

The findings were then compared with a commonly used reference tree, namely Krishnamurti’s family tree, which classifies the languages into four main groups: South I, South II, Central and North. B. Krishnamurti, a linguist, is one of the most prominent figures in the study of the Dravidian languages, as he applied modern comparative linguistic theory to the diverse Dravidian languages and their relationships. The results of his work are used as a reference in further studies on Dravidian languages.

However, the researchers of this study found a significant split between South I and the three other groups, which is estimated to have happened 2500-3000 years ago, a little later than the expansion of the Southern Neolithic civilisation between 4000 and 3000 years ago. The agricultural practices of this period provide external evidence, through crop vocabulary, for the findings of this study.

The researchers incorporated historical and linguistic data, mostly evidence on when the written Dravidian languages are first attested in inscriptions, into the language tree to estimate the age of the Dravidian language family. Using these data, they found the Dravidian language family to be approximately 4500 years old, coinciding with the archaeological findings regarding the Southern Neolithic, a definitive period on the archaeological timescale. The Neolithic period in South India, which lasted from 5000 years ago to 3400 years ago, is characterised by man-made artefacts like the ash mounds produced by early agricultural communities.

A NeighborNet visualization of differences relating to the vocabulary of languages. Colour-coding gives subgroup affiliation: red, South I; blue, Central; purple, North; yellow, South II. (Source: R. Soc. open sci. 5: 171504)
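Network visualisations of this kind are built from pairwise distances between languages. As a minimal sketch, and assuming a simple distance that may differ from the one used in the study, the code below measures the share of concepts for which two languages have non-cognate words. The languages, concepts and cognate labels are invented placeholders, and the function name is hypothetical.

```python
from itertools import combinations

# Hypothetical cognate judgements: concept -> {language: cognate class label}.
# Invented for illustration; not data from the study.
cognate_classes = {
    "water": {"LangA": "1", "LangB": "1", "LangC": "1", "LangD": "2"},
    "eye":   {"LangA": "1", "LangB": "1", "LangC": "1", "LangD": "1"},
    "tree":  {"LangA": "1", "LangB": "1", "LangC": "2", "LangD": "3"},
}


def lexical_distance(lang_x, lang_y, classes):
    """Share of concepts (attested in both languages) with different cognate classes."""
    shared = [c for c, j in classes.items() if lang_x in j and lang_y in j]
    if not shared:
        raise ValueError("the two languages share no attested concepts")
    differing = sum(1 for c in shared if classes[c][lang_x] != classes[c][lang_y])
    return differing / len(shared)


languages = ["LangA", "LangB", "LangC", "LangD"]
for x, y in combinations(languages, 2):
    print(x, y, round(lexical_distance(x, y, cognate_classes), 2))
```

A distance of 0 means the two languages agree on every concept in the list, while values closer to 1 indicate little shared basic vocabulary.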

While the analysis is informative about when the Dravidian languages originated and when they diversified, it does not tell us much about where this may have happened.

“The current analysis is an investigation of the genealogy of languages. It does not incorporate information about the current geographical information of languages, nor does it reconstruct where ancestors of the Dravidian languages were spoken in the past. However, this is possible, and this has been done for other language families”, says Dr. Verkerk.

The use of a Bayesian phylogenetic technique for linguistics makes the current study an interesting one. “The methods employed in this study have been used in linguistics from 2000 onwards. They have been met with scepticism by historical linguists, who invented the 'standard' way to study the history of language, the so-called comparative method. The comparative approach makes a feature-by-feature comparison of two or more languages with common descent from a shared ancestor, to extrapolate and infer the properties of the ancestor. However, phylogenetic techniques for linguists build on the comparative method and are gaining more acceptance now”, explains Dr. Verkerk.

The study's best-supported analysis provides the current best estimate of Dravidian descent, shedding light on the movement of ancient populations into and through South Asia. It shows that dating languages is one of the critical steps in understanding the history of Eurasia.