Status of our language



                1. Presentation
                List of topics
                2. Language identification
                3. Statistical and geographical data
                4. Language corpus
                5. Script and spelling
                6. Status
                7. Literature
                8. Schools
                9. Oral channels
                10. Language usage
                11. Reference framework
                12. General remarks
                1. Presentation
                A descriptive sociology of language with the base line data: what language
                varieties (name, linguistic and legal status), spoken by whom (numerical
                strength and geographic distribution), are written in what manner (script
                and spelling), disseminated in what form (pamphlets, periodicals, books,
                etc.) and used for what purposes (education, religion, government, etc.) is
                a prerequisite for understanding the overall societal picture of a language
                in a multilingual and developing country like India. This volume on India,
                the second in the series The Written Languages of the World, thus provides
                baseline data on the written languages of India. A second volume on the
                spoken but unwritten languages is also in progress outside the purview of
                this joint project and is to be brought out independently by the Office of
                the Registrar General, India.
                The division of languages as written and unwritten is more methodological
                than ideological, and perhaps underlines the fact that alphabetization plays
                a crucial role in the development of linguistic infrastructures in society.
                This does not rule out the possibility that certain languages that are
                spoken and not written may, however, show a high degree of refinement in
                their oral traditions and may prove more important than written languages in
                certain contexts. Yet, the criterion, written languages, as opposed to
                spoken ones, was not only the simplest dichotomy but the most productive of
                concrete results. This operation naturally presupposes an inventory of all
                living languages in India. The Linguistic Survey of India, conducted in the
                early part of the 20th century (from 1886 to 1927) under the editorship of
                Sir George Abraham Grierson, listed a total number of 179 languages and 544
                dialects (Grierson, 1927). This list is of limited use in the present Indian
                context mainly because several territories which were included under that
                survey no longer form a part of the Indian Union, and others which do form a
                part of the Union, did not receive adequate coverage under that survey.
                After Independence, an attempt was made in the 1961 census to present the
                mother tongue data in the same classification scheme as that of Grierson and
                a list of 193 classified languages was prepared corresponding to 1,652
                mother tongues actually returned (1961 Census Language Tables). This list
                excluded unclassified and foreign mother tongues. The languages were
                identified as belonging to four families: Austric - 20, Dravidian - 20,
                Indo-European - 54, Tibeto-Chinese - 97 and 1 of doubtful affiliation.
                However, the language list of the 1971 census which provides the frame of
                reference for the present study, was found to be more suitable, for it was
                the latest as far as language statistics was concerned but more importantly
                it defined "language" in terms of broad demo- and geolinguistic units. The
                census consists of a list of 105 languages each with a speaker strength of
                10,000 and above on the all India level.
                Excluding foreign languages and a few others of doubtful linguistic status,
                a total number of 96 languages were surveyed, of which 50 were found to be
                written and the rest unwritten. This division was based on the following
                considerations. The first and most obvious one is the existence of some
                sort of script or scripts which appeared in print. It is well known that
                writing in a large number of Indian languages was practised from a very
                early date and the advent of the printing press in the early 19th century
                helped in standardizing the existing scripts. It simultaneously provided an
                impetus for devising scripts for a number of yet unwritten languages. Thus,
                by the end of the 19th century most Indian languages could boast of some
                writing or other, albeit not always used by the society as a whole. These
                writings fall into two types: 1) writings by scholars of various sorts, like
                linguists, anthropologists, etc., which strictly speaking are not addressed
                to the speakers of the languages themselves and 2) writings by non-native
                speakers, such as missionaries, rather than members of the speech community.
                We found that existence of both these types of writings did not prove
                sufficient to call a language written. Mere transcribed texts in some
                scholarly journal or book, of the kind, "Useful Words and Sentences in
                Dafla" by I.M. Simon published in 1900, are not enough. In the second
                category too, written and printed matter (mainly biblical translations into
                native languages) appeared fairly early, as in Malto (1881), Korku (1900),
                Kinnauri (1909), Kuvi (1916), Vaiphei (1917), Shina (1929), etc. and
                although this literature certainly was addressed to the speakers of the
                languages themselves, its advantage to this language community was minimal,
                as it is quite doubtful if anything further came out in these languages
                after this erstwhile beginning. Without native participation the language
                lapsed back to an unwritten state. On the other hand, although a similar
                kind of beginning was also made in the case of the Santali language, native
                participation enhanced the process, not only in devising a new script, but
                by developing it further through various literary activities. There is
                little doubt now concerning the established written status of the Santali
                language.
                A second consideration that was found useful was whether primary education
                was or was not offered in a language. Language in education is an obvious
                corollary to language expansion. In fact, formal education works as a
                catalytic agent in the overall development of a language, that is, in its
                elaboration and modernization process. Most importantly, it encourages the
                production of popular, refined and learned prose in the form of text books,
                which contributes to the reshaping of a language by transferring it from a
                preliterate stage to a more advanced stage. Reshaping, according to Kloss
                (Kloss, 1978), is done mainly through the realm of information and not of
                imagination. Text books, even at a modest level, contribute to this realm. It
                will be seen that even small tribal languages such as Ho, Bodo/Boro,
                Kharia, Kabui, Bhili/Bhilodi, etc. are well on their way to producing
                non-narrative (dialectic) prose, which has been taken as a positive
                indicator in favour of their written status.
                Data on all the 96 languages has been uniformly gathered mainly by means of
                a questionnaire (the format of which is included in the section on language
                reports in this volume) and secondarily by consulting knowledgeable and
                official agencies on any single aspect. For example, the data on school
                education was collected through the survey questionnaire and secondarily
                from two other published sources: 1) the National Council for Educational
                Research and Training and 2) the Commission for Linguistic Minorities.
                Comparative data from all three sources appear in this volume. This
                procedure was unavoidable in certain cases, as no single data source was
                found fully satisfactory. For the survey fieldwork based on the
                questionnaire, languages were allotted to territories on the basis of the
                numerical strength of the mother tongue speakers. In the case of scheduled
                languages the strength required was 100,000 speakers and above and in the
                case of non-scheduled languages 5,000 or above. Thus, in a state like Andhra
                Pradesh a total of 14 languages falling within both these limits were
                surveyed, namely: Telugu, Urdu, Tamil, Hindi, Kannada, Marathi, Oriya and
                Gadaba, Jatapu, Konda, Koya, Savara, Gondi and Kolami. On the other hand,
                Konkani was surveyed in 5 States or territories i.e. Kerala, Maharashtra,
                Karnataka, Tamil Nadu and Goa, Daman & Diu, where it satisfied the above
                criteria. We have made one exception in the case of Urdu in Jammu and
                Kashmir, where although it does not fulfill the population criterion of
                100,000 mother tongue speakers, it is the official language of the State,
                and hence has been included in the survey.
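                The inclusion rule described above can be sketched in a few lines of
                Python. This is only an illustrative sketch, not the survey's own
                procedure: the names LanguageInTerritory and is_surveyed are hypothetical,
                the speaker count in the example is a placeholder rather than census data,
                and treating official-language status as a general override is an
                assumption generalised from the single Urdu exception.

```python
from dataclasses import dataclass

# Thresholds quoted in the text above (mother tongue speakers per territory).
SCHEDULED_MIN_SPEAKERS = 100_000
NON_SCHEDULED_MIN_SPEAKERS = 5_000


@dataclass(frozen=True)
class LanguageInTerritory:
    """Hypothetical record for one language in one State or Union Territory."""
    language: str
    territory: str
    mother_tongue_speakers: int
    is_scheduled: bool
    is_official_state_language: bool = False


def is_surveyed(entry: LanguageInTerritory) -> bool:
    """Return True if the language qualifies for fieldwork in the territory."""
    # Assumption: official-language status overrides the population criterion,
    # generalised from the one exception the text names (Urdu in Jammu and Kashmir).
    if entry.is_official_state_language:
        return True
    if entry.is_scheduled:
        minimum = SCHEDULED_MIN_SPEAKERS
    else:
        minimum = NON_SCHEDULED_MIN_SPEAKERS
    return entry.mother_tongue_speakers >= minimum


# Placeholder speaker count for illustration only, not a census figure.
urdu_jk = LanguageInTerritory("Urdu", "Jammu and Kashmir", 12_000, True, True)
print(is_surveyed(urdu_jk))
```

                Keeping the two thresholds as named constants makes it easy to see how
                the list of surveyed languages would change under different cut-offs.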
                The questionnaire is designed to draw up sociolinguistic profiles for
                individual languages. A primary task of such a design is to juxtapose the
                number of speakers with the degree of language elaboration or
                implementation. The key variables which have gone into this profile formula
                are listed below and are known as the Union List of Topics.
                List of Topics
                1. Language Identification (Questions: Q.1*)
                2. Statistical & Geographical data (Q.2)
                3. Language Corpus (Q.3)
                4. Script & Spelling (Q.4)
                5. Status (Q.5)
                6. Language Elaboration (Q.6, 7, 8, 9)
                7. Language in Education (Q.10)
                8. Language in Mass Media (Q.11)
                9. Language in Administration (Q.12, 13, 14)
                10. Language in Courts of Justice (Q.15)
                11. Language in Legislature (Q.16)
                12. Language in Industries (Q.17, 18)
                13. Reference Framework & Promoting Agencies (Q.19)
                14. Summary (Historical and Sociolinguistic Background) (Q.20)
                *Q = Question

                2. Language Identification
                In a geographically vast multilingual country of over six hundred million
                speakers like India, language identification is not a simple matter,
                particularly in the absence of a definitive inventory of names recognised by
                linguists to be languages in their own right, i.e. possessing linguistically
                autonomous systems. The main source therefore that is adopted for our
                purpose here is the Census of India - a nation-wide operation conducted
                every ten years with a history of over one hundred years to its credit. The
                census returns in terms of "mother tongues" are usually presented as a
                scheme of "languages", the spirit of which basically comes from the
                Linguistic Survey of India. The editor of the survey, Sir G.A. Grierson, was
                himself associated with the language aspects of the Indian census during the
                first decades of this century. In more recent times, the 1961 census data
                was totally cast in Grierson's language scheme. The 1971 census which
                provides the language format for this survey, also presented mother tongue
                data in terms of languages but diverged from the Grierson scheme. In all
                cases, the principal language names of our survey are taken from the 1971
                census (Social & Cultural Tables, 1971).
                3. Statistical and Geographical Data
                These topics provide data on the numerical strength of a language and its
                geographic distribution. Numerical strength is a combined figure of: 1)
                native speakers of the language and 2) second language speakers (Q.2.2).
                Up-to-date data on languages spoken by fewer than 10,000 speakers on the all
                India level are not available in print. Individual languages on the basis of
                their numerical strength are shown under population ranges like 10 - 20,000,
                20 - 50,000, 50 - 100,000, 100,000 - 1 million, 1 - 5 million, 10 million
                and above. The groupings sometimes characteristically reflect their
                linguistic affiliation and status. For example, all languages specified in
                Schedule VIII to the Indian Constitution other than Kashmiri, Sindhi and
                Sanskrit have a numerical strength of 10 million and above. On the basis of
                their native speaker strength alone (Q.2.21), these fifteen constitutional
                languages make up an overwhelming 95.37% of the Indian population. On the
                other hand, to the lower order speaker blocks between 10 - 100,000, belong
                most of the languages of the Tibeto-Burmese sub-family. To the middle order
                blocks of 1 - 5 million belong the rest of the languages including Kashmiri
                and Sindhi. This gross numerical strength is qualified by the presence or
                absence of monolinguality and bilinguality. The question on bilinguality can
                be viewed in two ways: 1) bilinguals who are part of mother tongue strength
                and 2) second language speakers who are added to the strength of a mother
                tongue. The above can result in either a stable or replacive bilingualism.
                In the Indian context, English sets the highest limit of the second kind,
                i.e. 99.24% of English speakers are second language speakers. For other
                Indian languages second language strength is marginal. Only four languages
                viz. Assamese (17.1%), Kannada (17.55%), Tamil (10.39%), and Tulu (19.03%)
                could claim a 10% and above addition to their total speaker strength by
                second language speakers. For a large number of languages the increase is
                almost nil. Therefore, Indian languages in general account for their
                strength mainly through native speaker strength. The other dimension of
                bilingualism (see 1 above) can be measured on a three-point scale of
                high-medium-low.
                Table 1
                High, Medium and Low Bilingualism by Mother Tongue Groups

                HIGH                    MEDIUM                     LOW
                (30 - 50% and above)    (10 - 30%)                 (10% & below)
                Bishnupuriya (52.38)    Assamese (13.20)           Angami (negligible)
                Bodo/Boro (54.62)       Bengali (12.01)            Ao (negligible)
                Dimasa (31.00)          Dogri (21.88)              Bhili/Bhilodi (5.21)
                Gondi (41.93)           Garo (13.26)               Bhotia (negligible)
                Kharia (51.72)          Gorkhali/Nepali (28.69)    Hindi (6.41)
                Konkani (57.49)         Gujarati (13.05)           Hmar (negligible)
                Kurukh/Oraon (46.69)    Ho (22.27)                 Kabui (negligible)
                Lepcha (52.25)          Kannada (17.11)            Khasi (9.35)
                Mikir (30.07)           Kashmiri (16.00)           Khezha (negligible)
                Mundari (36.33)         Lushai/Mizo (11.62)        Konyak (negligible)
                Santali (31.72)         Malayalam (18.67)          Ladakhi (negligible)
                Sindhi (42.95)          Manipuri/Meithei           Lotha (negligible)
                Tangkhul (43.46)        Marathi (14.66)            Nicobarese (negligible)
                Thado (40.01)           Punjabi (20.97)            Oriya (8.46)
                Tripuri (30.48)         Tamil (13.85)              Phom (negligible)
                Tulu (45.09)            Telugu (17.00)             Sangtam (negligible)
                                        Urdu (27.92)               Sema (negligible)

                This scale has no rationale behind it except for the fact that most of the
                scheduled languages show a rate of bilingualism between 10 - 30%. Although
                second language-preference data (Q.2.215) are given, it is hazardous to
                conclude whether the said bilingualism is stable or replacive. However, there is
                compelling evidence regarding the Indian linguistic situation that it is
                maintenance-prone, and that bilingualism is likely to be stable rather than
                replacive (Pandit, 1971). The ethnicity data (Q.2.1), which is available for
                comparison with mother tongue data in a limited number of cases only
                (scheduled tribes), might serve as an exception to the case in point. For
                example, only 15.12% of the people belonging to the Mundari tribe, and
                14.92% of the Ho tribe in Orissa claim Mundari and Ho respectively to be
                their languages. The rest of the 17,813 Mundari population and 31,916 of the
                Ho have switched over to other languages. Questions 2.213 and 2.214 add
                another dimension to bilingualism, i.e. whether the phenomenon is largely
                male-based and more connected with the economic activities of the community
                rather than the home, thereby playing a major role in the language
                socialization of the child. Female bilingualism has a significant bearing on
                the linguistic environment of the society. In a large number of groups,
                particularly tribes like Gondi (47.25), Kurukh/Oraon (48.23), Mundari
                (44.06), Santali (40.54), Kharia (52.94), etc. female bilingualism is
                appreciable. It is also on the high side among some groups like Konkani
                (45.76), Sindhi (43.04), Tulu (43.63), etc.
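                The three-point scale of Table 1 can be sketched as a small Python
                function. This is a hedged illustration rather than the survey's own
                method: the function name bilingualism_level is hypothetical, and since
                the table does not say which band the exact boundary values of 10% and 30%
                fall into, the sketch assumes they belong to the higher band. The sample
                percentages are taken from Table 1.

```python
def bilingualism_level(percent_bilingual: float) -> str:
    """Map a mother tongue group's bilingualism percentage to HIGH/MEDIUM/LOW."""
    if not 0.0 <= percent_bilingual <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    # Assumption: a value exactly on a boundary counts towards the higher band.
    if percent_bilingual >= 30.0:
        return "HIGH"
    if percent_bilingual >= 10.0:
        return "MEDIUM"
    return "LOW"


# Percentages as listed in Table 1.
samples = {"Santali": 31.72, "Urdu": 27.92, "Khasi": 9.35}
levels = {name: bilingualism_level(value) for name, value in samples.items()}
print(levels)
```

                Because the text itself notes that the scale has no deeper rationale, the
                cut-offs are best read as descriptive conveniences rather than thresholds
                with theoretical weight.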
                The other dimension, in addition to numerical strength, is location
                (Q.2.216), which is divided into rural/urban. Locality as a demolinguistic
                dimension, particularly urbanisation, is yet to be appreciated fully. Urban
                centres serve a very important role in language standardization, as they act
                not only as melting pots but also as prestige centres or trend-setters.
                These centres also turn out to be important for mass communication, like
                media, broadcasting, etc. At least two major cities like Calcutta and Delhi
                have played singular roles in the processes of standardization of Bengali
                and Hindi respectively. Generally speaking, most Indian languages are
                totally rural. Barring Sindhi (74.42), Urdu (44.84) and Konkani (43.54), in
                all other cases the percentage of urban speakers is 30% or below. Even the
                scheduled languages do not show a uniform percentage of urban population and
                vary between 5 and 30%. There are only a few non-scheduled languages like
                Angami (7.72), Ao (11.39), Bhotia (8.22), Dogri (4.48), Gorkhali/Nepali
                (18.83), Kabui (9.70), Khasi (12.15), Ladakhi (9.69), Lushai/Mizo (14.87),
                Manipuri/Meithei (16.10) and Tulu (20.91), which come within this range.
                The geographical distribution of languages (Q.2.4) is primarily based on
                figures within India only, although many Indian languages are spoken across
                the borders in neighbouring countries and beyond. It has been possible in a
                limited way to get data on Indian languages spoken outside the country.
                Their distribution within India is shown in terms of States and Union
                Territories as demarcated by the Government of India. Taken from the
                linguistic composition of each territory, a particular language may be
                overwhelmingly concentrated in a State or States, and other languages spoken
                therein constitute minority languages. For example, Orissa State has 84.15%
                Oriya language speakers, the rest are other language speakers of which 6.91%
                belong to other scheduled languages and 8.94% to non-scheduled languages.
                This linguistic distribution has not only been decisive in carving out
                linguistic states but also in identifying particular languages as
                predominant, official, state languages. Thus, Orissa State is a distinct
                geo-political unit of the Indian Union and Oriya is the official language of
                that state. However, the linguistic composition of states clearly shows that
                no state is totally unilingual, and this therefore gives rise to linguistic
                minorities. Although both scheduled and non-scheduled languages, as minority
                languages, stand on a par with each other, their traditions and experiences
                are not the same. Also their patterns of distribution are of two types:

                1. A minority scheduled language in one state may be the language of the
                majority in another state (or states). For example, Hindi in Orissa
                vis-a-vis Hindi in Uttar Pradesh, Haryana etc.
                2. A minority non-scheduled language is nowhere the language of a majority.
                For example, Santali in Orissa, Bihar and West Bengal, etc.
                Thus geographical distribution of languages has been not only decisive in
                demarcating linguistic states but also deciding upon the language policies
                to be adopted within the states.
                4. Language Corpus
                This section provides only the basic linguistic facts concerning the
                language corpus and is therefore indicative rather than exhaustive. The
                items included are family affiliation (Q.3.1.), immediate cognate languages
                (Q.2.3.), major named regional variants (Q.3.3.) and grammatical features
                (Q.3.2.). All the information is based upon existing published works of a
                fairly established and complete nature. In a few cases regarding grammatical
                information, we have used materials from unpublished sources based upon
                their intrinsic merit. These data complement and complete the
                sociolinguistic data in an important way without pretending to be of focal
                interest to the survey.
                The languages spoken in India belong to four distinct language families or
                their sub-families: Indo-Aryan, Dravidian, Austroasiatic and Tibeto-Burmese,
                barring a few languages like Andamanese, Onge etc. which are yet to be
                identified. None of these unidentified languages appear in the list of
                written languages. The classification scheme adopted for the Indo-European
                family of languages is that of G.A. Grierson (Grierson, 1927). This is the
                single largest family of languages comprising 73.93% of the Indian
                population. A total number of 15 languages of our list belong to this family
                or more particularly to its sub-family, Indo-Aryan i.e. Assamese, Bengali,
                Gujarati, Hindi, Kashmiri, Marathi, Oriya, Punjabi, Sindhi, Urdu, Konkani,
                Bishnupuriya, Bhili/Bhilodi, Gorkhali/Nepali and Dogri. The second largest is
                the Dravidian family spoken by 23.95% of the total population. The
                classification scheme adopted for Dravidian is from Bh. Krishnamurty
                (Krishnamurty, 1969): a total number of 7 languages of our list belong to
                this family i.e. Tamil, Telugu, Malayalam, Kannada, Kurukh/Oraon, Gondi and
                Tulu. Austric is the third most populous family of languages spoken by 1.27%
                of India's population. A total number of 6 languages in our list belong to
                this family or more specifically to its sub-family Austroasiatic.
                Austroasiatic in India is mainly divided into Munda and non-Munda languages.
                Of the 6 languages in our list Khasi and Nicobarese belong to the non-Munda
                group and the rest to the Munda group. For the purpose of classification of
                the Munda languages, the scheme adopted is from N.H. Zide (Zide, 1969), viz.
                Khasi, Nicobarese, Kharia, Santali, Mundari and Ho. The other language
                sub-family is the Tibeto-Burmese branch, whose speakers comprise 0.79% of
                the total population. Following Grierson's scheme (Grierson, 1927), 22
                languages of our list belong to it, i.e. Tripuri, Ladakhi, Bodo/Boro, Bhotia,
                Manipuri/Meithei, Kabui, Konyak, Tangkhul, Mikir, Sema, Khezha, Sangtam,
                Phom, Angami, Ao, Lotha, Thado, Hmar, Garo, Lushai/Mizo, Dimasa, Lepcha. In
                determining the immediate cognate languages (Q.2.3.), the above mentioned
                classifications are also followed.
                The grammatical features for all the languages (Q.3.2.) are compiled from
                descriptions which are both the latest and linguistically succinct in their
                treatment. In a few cases we have fallen upon works of less substance, but
                these were the most readily available.
                The presentation of regional variants of a language (Q.3.3.), which
                preferably could have been given in terms of "dialects", presupposes a
                matrix of its own and even puts the whole list of written languages to a
                very different sort of test. Earlier linguistic surveys have done so and
                have succeeded in producing a list of languages and their variants which are
                strictly speaking "dialects". This operation has never been fool-proof, i.e.
                carried out by using a single set of criteria, preferably linguistic. It became a mixed
                bag of many criteria ranging from linguistic, to sociological, to juridical.
                Without disowning these attempts, we merely accept the fact that languages
                and dialects are demarcated in a number of ways, which Heinz Kloss
                conceptualizes as Abstand and Ausbau corresponding to what he calls
                "language by distance" - a linguistic one, and "language by development" - a
                sociological one (Kloss, 1972). This does not mean that specific area
                surveys, particularly "dialect surveys" were not carried out in specific
                language areas in recent times in India. Many of these surveys are quite
                rigorous and the results are dependable. The main obstacle to our
                incorporating the findings of some of these "dialect surveys" is that these
                have not been carried out for all languages included in this survey. It is
                often the case in the field of dialectology that more mis-information seems
                to exist than facts. The present volume lists the names of "language
                variants" which may or may not have strict dialect reference, but their
                mother tongue status is unquestionable and they appear under languages in
                the census with a bond of relationship which is more functional than
                linguistic. However, their relevance is not to be denied in drawing up the
                sociolinguistic portrait of the language in question under which these
                "variants" are patently active.
                5. Script and Spelling
                Against this background data, the information on literature, a primary index
                of language unfolding, is of focal importance. These have been divided into
                five subcategories:
                a. Script and Spelling (Q.4)
                b. Background of Literature (Q.6)
                c. Religious and Ideological Writings (Q.7)
                d. Categories of Literature (Q.8)
                e. Periodicals (Q.9)
                The scripts used for the written Indian languages are many, some
                belonging to distinct origins and others to a common original source. There
                are three main kinds: 1) derivatives of Brahmi, 2) Arabic and 3) Roman. The
                Brahmi script which is syllabic, is considered indigenously Indic and gave
                birth to a number of distinct but related scripts both in and outside India,
                whereas Arabic and Roman entered the sub-continent with the advent of the
                Muslim and Christian religions respectively. Most major Indian languages
                across families have either independently developed this Brahmi variety
                (e.g. Tamil, Oriya, Punjabi, Gujarati, Manipuri, etc.) for writing their
                languages, or used partially modified varieties to suit their particular
                genres (e.g. Assamese-Bengali, Hindi-Marathi, Kannada-Telugu,
                Tibetan-Ladakhi, etc.). Arabic has been adopted to write Urdu. A vast number
                of smaller languages, emerging of late as written languages through the
                efforts of the Christian missionaries are written in Roman script, which is
                an alphabetic system. There may be a rare case of alphabetization, as in the
                case of the Santali language, where a whole new script called Olchiki was
                invented to write the language. The creator claimed that the shapes of the
                letters were partly a revelation and partly determined by the flora and
                fauna, personalities and other familiar objects of the Santal culture
                (Mahapatra, 1987). One would not be too surprised if the Santali experiment
                turns out to be a trend-setter in the process of alphabetization at least
                for some, when the script is later made to play a much bigger role in the
                socio-political and cultural identity of the group. For a number of language
                groups in the Chotanagpur area, the question of script has become an issue.
                Another trend-setter is the progressive "Nagarization", or the adoption of
                the Devanagari script for some languages, which were not totally
                preliterate. Languages like Konkani, Santali, Bodo/Boro, etc. continue to be
                written in more than one script, due to their geographical distribution
                across several states, where there is pressure to adopt the script of the
                majority language. A large number of languages are still either totally unwritten or
                have reached only an incipient stage of alphabetization. In some cases there
                may exist a few specimens of transcribed texts, but that is no proof of true
                graphization. Incipient graphization is generally restricted: a) to
                compilation of a dictionary or a grammar and b) to the preparation of
                transcribed texts, of orally transmitted traditions like songs, legends,
                tales, etc., for purely scholarly purposes. These may eventually lead to an
                occasional publication of a manual or a primer and ultimately lead to
                further literary achievements. Literacy campaigns are closely connected with
                the process of graphization.
                6. Status
                The status (Q.5) of a language accrues from two main sources: linguistic
                (Q.5.1.) and legal (Q.5.3.). The linguistic autonomy of a language is
                established by its intrinsic distance or Abstand from all other systems or
                by its development through oral and literary activities or Ausbau. India
                being viewed as a comprehensive linguistic area, there exist many
                linguistic zones of high intensity communication and contact. As a result,
                intrinsic distance between languages may not prove definitive, as in the
                case of Bengali/Assamese, Hindi/Punjabi, Marathi/Konkani, Tamil/Malayalam,
                etc. Most major languages of India are established more as tools of advanced
                societies and cultures, rather than by distinctive linguistic
                characteristics. Many of these languages have independent histories going
                back several centuries. The attitudes of speakers reinforce the distinct
                ethnolinguistic communities, which are built around this experience.
                However, in a number of less established languages, where Ausbau has set in
                with a fair to good indication of language elaboration, it is natural that
                many languages face growing problems. In the beginning stage of
                modernisation every language is deficient. It is in this context that the
                process of standardization has to be viewed as a part of overall language
                planning. It is to be noted that various standardizing products, such as
                text books, news sheets and other expository prose for a written norm, and
                radio and television etc., for a spoken norm, are already present in all
                language areas. It is only a question of time before the subdialects give
                way to the printed page and to normative broadcasts, i.e. to standardized
                written and spoken norms.
                The other dimension through which status accrues to a language is legal. The
                Indian Constitution, which is the fountain-head of official language policy,
                defines the primary, status-oriented, juridical role of the Indian
                languages. The specific provisions contained in the Constitution of India on
                the language question are to be found in Part XVII, entitled Official
                Language. These provisions, Articles 343 to 351, are organised in four
                chapters: Chapter I, Language of the Union (Articles 343, 344); Chapter II,
                Regional Languages (Articles 345-347); Chapter III, Language of the Supreme
                Court, High Courts, etc. (Articles 348, 349), and Chapter IV, Special
                Directives (Articles 350, 351). To Articles 344(1) and 351 has been appended
                the Eighth Schedule to the Constitution. This classification is a two-tiered
                system prescribing Hindi in Devanagari script as the official language of
                the Union of India, subject to the continuance of English for official
                purposes for a limited period of fifteen years from the commencement of the
                Constitution (Article 343). Secondly, article 345 allows the legislature of
                a State to adopt any one or more languages in use in the State (or Hindi)
                for use for all official purposes in place of English. In view of this
                provision, most states have passed specific legislation declaring State
                languages as official. However, it should be noted that, except for Bhotia,
                Lepcha and Nepali in Sikkim; Lushai/Mizo in the districts of Aizwal and
                Lunglei in Mizoram; Manipuri/Meithei in Manipur and Nepali in the three
                sub-divisions of the district of Darjeeling in West Bengal, regional
                official status is restricted to Schedule VIII languages. Sanskrit, Sindhi
                and Kashmiri are the three exceptions. The following Table 2 gives the legal
                status of the Indian Languages:
                Table 2
                The Legal Status of Indian Languages
                LANGUAGE          CONSTITUTION    OFFICIAL IN STATE        OFFICIAL IN REGION
                Assamese          VIII Schedule   Assam
                Bengali           VIII Schedule   West Bengal, Tripura     Cachar district of Assam
                Gujarati          VIII Schedule   Gujarat
                Hindi (Official   VIII Schedule   Uttar Pradesh,           Some regions of
                language of the                   Madhya Pradesh,
                Union of India)                   Himachal Pradesh
                Kannada           VIII Schedule   Karnataka
                Kashmiri          VIII Schedule   ---
                Malayalam         VIII Schedule   Kerala                   Mahe of Pondicherry, some regions of
                Marathi           VIII Schedule   Maharashtra
                Oriya             VIII Schedule   Orissa
                Punjabi           VIII Schedule   Punjab
                Sanskrit          VIII Schedule   ---
                Sindhi            VIII Schedule   ---
                Tamil             VIII Schedule   Tamilnadu, Pondicherry   Some regions of Karnataka
                Telugu            VIII Schedule   Andhra Pradesh           Some regions of Yanam of Ganjam and
                                                                           Koraput districts of Orissa
                Urdu              VIII Schedule   Bihar, Jammu & Kashmir   Some regions of Karnataka
                7. Literature
                As we have said earlier, literature or achievements in the realm of written
                tradition has an important bearing on building up the socio-cultural
                strength of a language. This broad area includes five categories, but script
                and spelling have already been discussed. The remaining categories are:
                a) Background of Literature (Q.6)
                b) Religion and Ideological Writings (Q.7)
                c) Categories of Literature (Q.8)
                d) Newspapers (Q.9)
                It is imperative that the bulk of any literature be produced mainly by the
                native speakers of a language and that these be original writings and not
                translations. In the survey the total publications in the language cover a
                period of twenty years divided into three time blocks, namely: 1961-71,
                1971-80 and 1981. The block-wise presentation of statistics adds depth to the
                achievements by providing a comparative picture over time. If we take the
                publication of biblical literature as a convenient point for the onset of
                publication in the Indian languages, it would seem that first publications
                go back to the early 18th century, as in Tamil (1714) or Urdu (1747). A vast
                number of languages could claim some printed materials by the middle of the
                19th century. A number of other languages that also went through a similar
                beginning, however, could not sustain this achievement and relapsed into a
                non-literate stage, for example, Malto, Sora, etc. Nevertheless, publication
                and biblical literature continued to remain related for a fairly long period
                of time, particularly in the case of small languages. With the decline of
                missionary activities in India and the spread of mother tongue education,
                school text-books began to form the bulk of language publications. Our
                survey covering this twenty-year period shows that the scale of publication
                (frequent, occasional, lacking) could be more profitably framed as a
                four-point scale (highly frequent, frequent, occasional, lacking), lest we
                group, for example, Assamese (3,519), Lushai/Mizo (913) and Khasi (1,886)
                together as "frequent" in terms of total publications. We have no doubt
                that the achievement in the case of Lushai/Mizo or Khasi is the result of a
                fairly sustained movement, but although they publish more than Mikir (53)
                and Kurukh/Oraon (35), these languages are far less prolific than Hindi
                (37,034) or Bengali (19,949). The 1981 statistics show that the scheduled
                languages, other than perhaps Kashmiri, add 100 to 1,500 titles a year to
                their total, while other languages may add only a few. Therefore, in a number
                of cases like Lushai/Mizo, Hmar, Manipuri/Meithei, Konkani, etc., where
                production efforts may be on a much more modest scale, they nevertheless
                cannot be viewed as unimportant. Between the unwritten and the scheduled
                languages, there falls a large number of languages where the frequency and
                quantum of publication may not be uniform.
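The four-point scale suggested above can be illustrated with a minimal Python sketch. The publication totals are those quoted in the text; the two cut-off values and the function name `publication_scale` are hypothetical, chosen only so that the example reproduces the grouping the text describes, and are not thresholds defined by the survey.

```python
# Twenty-year publication totals quoted in the text above.
TOTALS = {
    "Hindi": 37034,
    "Bengali": 19949,
    "Assamese": 3519,
    "Khasi": 1886,
    "Lushai/Mizo": 913,
    "Mikir": 53,
    "Kurukh/Oraon": 35,
}

# Hypothetical cut-offs for illustration only; the survey does not define them.
HIGHLY_FREQUENT_MIN = 10_000
FREQUENT_MIN = 100


def publication_scale(total: int) -> str:
    """Place a publication total on the four-point scale."""
    if total >= HIGHLY_FREQUENT_MIN:
        return "highly frequent"
    if total >= FREQUENT_MIN:
        return "frequent"
    if total > 0:
        return "occasional"
    return "lacking"


ratings = {language: publication_scale(total) for language, total in TOTALS.items()}
for language, rating in ratings.items():
    print(f"{language}: {rating}")
```

Under these assumed cut-offs Hindi and Bengali are separated from Assamese, Khasi and Lushai/Mizo, which a three-point scale would have merged into a single "frequent" group.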

                The process of language elaboration is a deliberate but slow process. Also,
                achievements may not be manifested by literature alone. There are other
                means, like the print media, which keep the involvement of people alive to
                the literacy tradition. For example, even small languages like Bishnupuriya,
                Garo, Hmar, Khasi, Ladakhi, Lepcha, Lushai/Mizo, Nicobarese, Santali,
                Tangkhul, Tripuri, Thado, etc. publish their own news sheets and magazines.
                As Kloss (1978) has pointed out, it is not the absolute number of
                such publications that matters. Three periodicals may mean a lot in the case
                of a speech community numbering 10,000 persons, while six periodicals or
                other publications would hardly be impressive in the case of a speech
                community with more than five million speakers. The achievement should be
                put into proportion to the relative size of the community. But by any
                standards, the scheduled languages other than Sindhi and Kashmiri are highly
                prolific in the matter of publication within the given time span of twenty
                years. These languages can be divided into two groups: languages having more
                than 10,000 publications, i.e. Hindi, Bengali, Marathi, Tamil, Malayalam,
                Telugu, Gujarati and Kannada and languages having between 3,000 and 10,000
                i.e. Oriya, Punjabi, Assamese and Urdu. Other languages having a range
                between 100 and 2,000 publications are: Bodo/Boro, Dogri, Lushai/Mizo,
                Khasi, Thado, Hmar, Ladakhi, Manipuri/Meithei, Garo, Gorkhali/Nepali,
                Konkani, Tulu and Santali. The rest of the 50 languages have fewer than 100
                publications. In the matter of periodicals like newspapers, news sheets and
                magazines, although the scheduled languages may show fewer
                newspapers, magazines, etc. than other languages, sheer number is not the
                only indication of their superiority; circulation and frequency also
                count. We have not been able to give the circulation figures due to an
                unevenness in the data returns. There are still some languages like
                Bhili/Bhilodi, Bhotia, Gondi, Kheza, Konyak, Kurukh/Oraon, Lotha, Sangtam,
                which have no periodicals to their credit.
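Kloss's point about proportion, made earlier in this paragraph, can be worked through as simple arithmetic. The sketch below, in Python, uses the two cases given in the text; the per-100,000 normalisation and the function name `periodicals_per_100k` are assumptions made for illustration, and five million is taken as the lower bound of "more than five million speakers".

```python
def periodicals_per_100k(periodicals: int, speakers: int) -> float:
    """Number of periodicals per 100,000 speakers (an assumed normalisation)."""
    if speakers <= 0:
        raise ValueError("speakers must be positive")
    return periodicals * 100_000 / speakers


# The two cases described in the text.
small_community = periodicals_per_100k(3, 10_000)
large_community = periodicals_per_100k(6, 5_000_000)

print(f"Small community: {small_community:.2f} per 100,000 speakers")  # 30.00
print(f"Large community: {large_community:.2f} per 100,000 speakers")  # 0.12
```

On this measure the small community, with half as many periodicals in absolute terms, is served 250 times more densely than the large one.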
                The literary growth of a language and its use in an increasing number of
                domains can evolve along three major directions: 1) poetry and fiction, 2)
                non-narrative (expository) prose and 3) oral channels such as speeches,
                broadcasts, etc. It is also true that the demarcation line between fiction
                and expository prose is not always clear, but it is the latter type of
                literature which is said to have a greater impact upon newly literate
                speakers. The dichotomy is between imaginative versus informative
                literature. In written literature, prose stands more in need of language
                standardization than poetry, expository prose more than fiction.
                Standardization becomes urgent at the refined level and indispensable at the
                learned level of non-narrative prose. Therefore, for purposes of the survey,
                literature has been divided into two broad categories, narrative and
                non-narrative.
                The sub-categories of narrative literature are provided in two sections,
                those of lyrics and fiction. The non-narrative prose, which ranges from devotional
                or ideological writings to school textbooks, has the sub-categories
                (popular, refined and learned) corresponding to the three levels of
                education (primary, secondary and university). Textbooks produced for these
                three levels automatically contribute to the progressive standardization of
                the language. It may be seen from the data, that in the case of most major
                languages, it is the non-narrative literature that exceeds the narrative. As
                a language grows, its interest no longer remains confined to narratives but
                becomes more and more refined in the sense that information transfer of a
                higher order becomes more imperative. In the case of smaller languages,
                particularly the ones which are still struggling to acquire a written
                tradition, achievements are limited to production of school textbooks. The
                questions of language standardization and the overall development of
                languages have remained unsatisfactory as they are, but can now perhaps be
                measured fairly objectively according to the dimension of language
                elaboration, i.e., the level of standardization a language has reached
                through the development of its non-narrative prose, or for that matter all
                its written and oral channels.
                8. Schools
                If a language stands to gain from being employed in education, it equally
                stands to lose when it is not used in this domain. The domain of education
                is dealt with under three basic headings: primary (Q.10.1), secondary
                (Q.10.3) and university (Q.10.6). A second dimension is added to this data
                by asking whether the employment is exclusive, i.e. as the only teaching
                medium or inclusive, i.e. teaching medium along with another language.
                Inclusive education propagates institutional bilingualism inherent to the
                system and produces as a result group bilingualism. The data on schools in
                terms of parameters provided in the questionnaire was not easy to collate as
                education is primarily a state jurisdiction, so that statistics have to be
                collected through numerous state agencies. Secondly, education is not wholly
                run by governments and there are many non-governmental agencies which run
                their own private schools. However, due to the complexity of the private
                school dimension, it had to be excluded from the survey. In the case of many
                smaller languages and minorities, data collected through the survey was not
                exhaustive. However, to counterbalance this weakness we have given
                information from two other sources, but the one that we value more from the
                point of dependability, is the present survey source, collected through
                various state Directorates of Education, on both primary and higher levels.
                Only Punjab state is an exception and could not give us the necessary data
                in time. Although quantitative data on the number of schools and number of
                students enrolled in these schools, or the number of literates produced in
                the language in a year are very relevant, it is no less important to know
                first whether the domain of education is uniformly occupied by all languages
                on all levels.
                The school level may be further divided into languages which have reached
                secondary level and others which are only at primary level and then into the
                production of literature corresponding to these two levels. These can be
                termed "popular non-narrative prose" and "refined non-narrative prose"
                respectively, as opposed to "learned non-narrative prose", which corresponds
                to the University level of education. Thus, the schools are not just centres
                for education but also trigger off activities related to language
                development. Without this text-level linkage, evaluating texts would be a
                difficult proposition.
                9. Oral Channels
                Oral channels refer to three major media of language usage: 1) use on radio
                (Q.11.1), 2) use on television (Q.11.2) and 3) use in movies (Q.11.3). It
                may perhaps be conceded that the spoken word does not require the same
                degree of unification and codification as the printed page. The problem of
                script and spelling, which so often hampers the growth of a language in
                writing, does not arise in speech. A largely unwritten language may be standardized and
                modernised to a very high degree chiefly through established oral media like
                radio broadcasts, television or movies. Of course, oral deliveries may be at
                different levels, starting from regional to reasonably sophisticated styles.
                With reference to broadcasts four such levels have been identified: a)
                folklore, poems, etc., b) reports, announcements, etc., c) lectures,
                sermons, etc. and d) scholarly programmes (Kloss, 1978). It must be noted
                that in India, radio and television are fully controlled by the Government
                and a large share of movie production is also Government controlled. In
                broadcasts, programmes are divided into three categories by All India
                Radio: 1) music, 2) spoken words and 3) news, and apparently there is no
                order in the choice of the programmes corresponding to the developmental
                stage reached by the language. Except for Bishnupuriya, all the languages
                appearing in our list of written languages have some radio programmes
                broadcast in them.
                Languages can be broadly divided into two groups, those which have regular
                daily programmes, and those which have irregular programmes spread over a
                week to a month. Languages which have occasional programmes are
                Bhili/Bhilodi, Gondi, Ho, Kharia, Mundari, Kurukh/Oraon and Tulu. Among the
                languages which have a regular daily programme, ranging from a few minutes
                to almost round the clock, some are limited to a single station while others
                are multi-station languages. The languages which are heard in single
                stations only are: Angami, Bhotia, Bodo/Boro, Dimasa, Dogri, Garo, Hmar,
                Kheza, Khasi, Kabui, Mikir, Konyak, Lotha, Manipuri/Meithei, Nicobarese,
                Phom, Sangtam, Sema, Tangkhul, Tripuri and Thado. Multi-station languages of
                which Hindi is the most important (44 stations) also have daily national
                programmes, daily programmes on external services and are broadcast from
                foreign stations. These programmes are enriched by many features
                accompanying a generally sophisticated and demanding programme.
                In 1981, the year of reference for this survey, television was just being
                introduced in India. Therefore, the data on the use of language in T.V. does
                not give a picture of the current position, which has grown multifold in
                India since the year 1983.
                A third and very important category of oral channel is the movies. Films
                fall into two types: 1) feature and 2) short. Although feature films are
                produced mainly by private agencies based on commercial considerations,
                short films, including documentaries and other publicity oriented matters,
                are produced mainly by the governments. As a result, short films are also
                produced now in quite a few languages other than the scheduled ones. In
                comparison to the position in 1961 when hardly a handful of languages like
                Bengali, Hindi, Tamil, Telugu, Marathi could truly claim exploitation in
                this channel, the position in 1981 is remarkable from the point of view of
                the number of languages involved and the quantum of production. Our 1981
                data show that a number of non-scheduled languages are still on the threshold
                of production, like Dogri, Garo, Gondi, Mikir, while others are well
                advanced, like Gorkhali/Nepali, Konkani, Manipuri/Meithei and Tulu. Similarly,
                all the scheduled languages have uniformly succeeded in producing a large
                number of films, particularly short films. Nevertheless, production of films
                is an expensive proposition and needs commercial viability, which is still
                beyond the reach of many small languages. Less expensive means of
                circulation of oral literature are the production of records (Q.11.4) and
                tapes and cassettes (Q.11.5). Again, both these media are exploited by
                private agencies commercially or by the government for developmental or
                welfare purposes. It is natural, that many of the smaller languages cannot
                compete with the former and solely depend upon governmental efforts. This is
                clear from the types of agencies which are mainly responsible for the
                production of these materials.
                10. Language Usage
                The present survey introduced an important set of key domains to ascertain
                the range and intensity of language usage or functions. These domains are:
                Administration (Q.14) at the national, state and local levels; Judiciary
                (Q.15) - at the national, state and local levels, and Legislature (Q.16) at
                the national and state levels. Industries subdivided into manufacturing
                (Q.17) and sales and services (Q.18) were also surveyed to provide a
                function-based portrait of individual languages. Although the results are
                compiled on a scale of frequent - occasional - lacking, the data is
                generally normative rather than statistical. Needless to say, in a macro-level
                survey such as this, data on language functions are likely to suffer
                from some amount of tentativeness, leading to over- or under-generalization,
                until the functions of a number of these languages in key
                domains are further substantiated by micro-level studies.
                However, certain major trends of language usage are still discernible, i.e.
                most languages other than the scheduled languages have very few functions
                outside their immediate local environment and that too in informal milieu
                and in oral communication. The industries, including sales and services, may
                turn to Indian languages in a very restricted way, such as: in publicity or
                labeling of products. The larger the industry the more restrictive is the
                language choice. It is also probably true that industries like sales and
                services and those which produce consumer goods as against capital goods,
                are more open to utilize various languages, as they have a vested interest
                in reaching out to the general public. Yet, their involvement with Indian
                languages is still hesitant and is based more on trial and error than on a
                committed policy. The general picture of language usage that emerges, mainly
                points to a pyramidal structure, where a large number of languages operate
                at the base or local level. On the higher level, i.e. through state to
                national levels the competing languages turn out to be Hindi and English.
                Only more intensive studies based on the format as developed for this
                project can give a sharper picture of the character of the changing
                linguistic situation in India. But what is important is that the present
                survey has staked out a clear path for further investigation.
                11. Reference Framework
                This section (Q.19) provides a selected bibliography of reference works,
                such as: dictionaries, grammars and language teaching aids available in the
                language, which usually serve as tools for language standardization. It also
                gives a list of specialists and agencies, which can be consulted with regard
                to efforts made in language planning activities.
                12. General Remarks
                This section (Q.20) gives an overview of the languages of the survey tracing
                their history and growth through time. It also provides some initial
                conclusions regarding the findings of the survey.
                In conclusion, the project has a number of bearings on the language
                situation of India. It provides a descriptive grid of language development
                processes, which should act as a stimulus to language planning activities in
                India ranging from Hindi to other scheduled and non-scheduled languages.
                This will enable us to provide a measurement of language development in
                terms of a "vitality rating" (McConnell, 1988; McConnell-Gendron, 1988;
                Lieberson, 1981; Mahapatra, 1986), which can be monitored over time to
                observe the course of language development and then compared to such
                variables as speaker strength. Considering the fact that literacy is one of
                the major problems of India, the survey provides insights into the situation
                in terms of what language is taught to how many, where and when for
                maximising the benefits. To the social and cultural diversity of India, to
                which languages contribute in no small way, this survey will provide a
                systematic approach to understanding the sociolinguistic reality of the
                country. This is another major step forward in designing nation-oriented
                language profiles (Ferguson, 1966).
                B.P. MAHAPATRA
                Deputy Registrar General (Languages)
                Calcutta, 1988
                CENSUS OF INDIA, 1961 - 1964: Vol. I, Part II-C(ii), Language Tables, Delhi.
                CENSUS OF INDIA, 1971 - 1977: Part-II-C(ii) Social and Cultural Tables,
                FERGUSON, Charles A., 1971: "National Sociolinguistic Profile Formulas" in
                Sociolinguistics, Bright, William (Ed.), Mouton & Co.
                GRIERSON, G.A., 1927: Linguistic Survey of India, Vol. I, Pt. I, Motilal
                Banarsidass, Delhi, (Reprint, 1967).
                KLOSS, H., 1978: Introduction to the Written Languages of the World, Vol. I.
                The Americas, Kloss, H. and G.D. McConnell (Eds), Les Presses de
                l'Université Laval, Québec.
                KLOSS, H., 1972: Moderator's Statement in Indian Census Centenary Seminar,
                Registrar General of India, New Delhi.
                KRISHNAMURTI, Bhadriraju, 1969: "Comparative Dravidian Studies" in Current
                Trends in Linguistics, Vol. 5, Sebeok, T.A. (Ed.) Mouton, The Hague, pp.
                LIEBERSON, S., 1981: "Language Shift in the United States: Some Demographic
                Clues" in Language Diversity and Language Contact, Introduced by Anwar S.
                Dil, Stanford University Press.
                MAHAPATRA, B.P., 1986: "Language Maintenance and Shift in Bihar", Bhasha
                Anurakshan ebam Visthapan, Central Institute of Hindi, Agra, pp. 89-101.
                MAHAPATRA, B.P., 1987: "Tribal Language Pedagogy: A Case for Santali Guru",
                Indian Linguistics, Vol. 47, No. 1-4.
                McCONNELL, G.D., 1988: "A Model of Language Development and Vitality" -
                Unpublished, paper presented at the International Conference on Language and
                National Development: The Case of India, Hyderabad.
                McCONNELL, G.D. & Jean-Denis GENDRON, 1988: Dimensions et mesure de la
                vitalité linguistique, Volume 1, CIRB, #G-9, 170 p.
                PANDIT, P.B., 1971: "Tamil-Saurashtri Bilingualism - a Case study",
                Department of Linguistics, University of Delhi, Delhi (Mimeographed).
                THE 16th REPORT, 1973-1974: The Commissioner for Linguistic Minorities in
                India, Delhi.
                ZIDE, Norman H., 1969: "Munda and Non-Munda Austro-Asiatic Languages" in
                Current Trends in Linguistics, Vol. 5, Sebeok, T.A. (Ed.), Mouton, The
                Hague, pp. 53-80.
