Dirk Speelman and Dirk Geeraerts
- Published in print: 2009
- Published Online: September 2012
- ISBN: 9780748640300
- eISBN: 9780748671380
- Item type: chapter
- Publisher: Edinburgh University Press
- DOI: 10.3366/edinburgh/9780748640300.003.0013
- Subject: Linguistics, Applied Linguistics and Pedagogy
An important assumption underlying most, if not all, methods of dialectometry is that the automated analysis of differences in language use between locations, as recorded by dialectologists in large-scale surveys, can reveal patterns that directly reflect regional variation. Focusing on lexical variation, this chapter examines the role of concept characteristics in lexical dialectometry in three consecutive logical steps. First, it conducts a regression analysis of data taken from a large lexical database of Limburgish dialects in Belgium and the Netherlands to show that concept characteristics such as concept salience, concept vagueness and negative affect contribute to the lexical heterogeneity in the dialect data. Next, it demonstrates that the relationship between concept characteristics and lexical heterogeneity influences the results of conventional lexical dialectometric measurements. Finally, the chapter proposes a lexical dialectometric method in which concept characteristics form the basis of a weighting schema that determines the extent to which concept-specific dissimilarities can contribute to the aggregate dissimilarities between locations.
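The weighting idea in this abstract can be illustrated with a minimal sketch: per-concept dissimilarities between two locations are combined into one aggregate score, with each concept contributing in proportion to a weight. The function name, the weighted-mean formula, the concepts, and all numeric values below are assumptions made for illustration; the abstract does not specify the chapter's actual weighting schema.

```python
def weighted_aggregate_dissimilarity(dissimilarities, weights):
    """Weighted mean of per-concept dissimilarities between two locations.

    dissimilarities: dict mapping concept -> dissimilarity in [0, 1]
    weights: dict mapping concept -> non-negative weight (hypothetical values)
    """
    total_weight = sum(weights[c] for c in dissimilarities)
    if total_weight == 0:
        raise ValueError("weights must not all be zero")
    weighted_sum = sum(weights[c] * d for c, d in dissimilarities.items())
    return weighted_sum / total_weight


# Toy data, invented for illustration: 1.0 means the two locations use
# different words for the concept, 0.0 means they use the same word.
per_concept = {"butterfly": 1.0, "table": 0.0, "drunkard": 1.0}

# Hypothetical weights: a concept assumed to be vague or affect-laden
# is allowed to contribute less to the aggregate.
concept_weights = {"butterfly": 1.0, "table": 1.0, "drunkard": 0.5}

score = weighted_aggregate_dissimilarity(per_concept, concept_weights)
print(round(score, 3))
```

With all weights equal, the function reduces to an unweighted mean, which makes it easy to compare a weighted and an unweighted aggregate on the same data.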
Benedikt Szmrecsanyi
- Published in print: 2009
- Published Online: September 2012
- ISBN: 9780748640300
- eISBN: 9780748671380
- Item type: chapter
- Publisher: Edinburgh University Press
- DOI: 10.3366/edinburgh/9780748640300.003.0016
- Subject: Linguistics, Applied Linguistics and Pedagogy
This chapter summarises the results of a study which departs from most previous work in dialectometry in several ways. Empirically, it draws on frequency vectors derived from naturalistic corpus data rather than on discrete atlas classifications. Linguistically, it is concerned with morphosyntactic (as opposed to lexical or pronunciational) variability. Methodologically, it combines the careful analysis of dialect phenomena in authentic, naturalistic texts with aggregational-dialectometrical techniques. Two research questions guide the investigation. First, on methodological grounds, is corpus-based dialectometry viable at all? Second, to what extent is morphosyntactic variation in non-standard British dialects patterned geographically? By way of validation, findings are matched against previous work on the dialect geography of Great Britain. The study draws on the Freiburg English Dialect Corpus, a naturalistic speech corpus sampling interview material from 162 different locations in 38 different counties all over the British Isles, excluding Ireland.
Jelena Prokić and John Nerbonne
- Published in print: 2009
- Published Online: September 2012
- ISBN: 9780748640300
- eISBN: 9780748671380
- Item type: chapter
- Publisher: Edinburgh University Press
- DOI: 10.3366/edinburgh/9780748640300.003.0009
- Subject: Linguistics, Applied Linguistics and Pedagogy
Dialectometry is a multidisciplinary field that uses various quantitative methods in the analysis of dialect data. Very often those techniques include classification algorithms, such as the hierarchical clustering algorithms used to detect groups within a certain dialect area. Although known for their instability, clustering algorithms are often applied without evaluation or with only partial evaluation. Very small differences in the input data can produce substantially different groupings of dialects. This chapter evaluates algorithms used to detect groups among dialect varieties measured at the aggregate level. The data used in this research are dialect pronunciations of 156 words collected all over Bulgaria. The distances between word pronunciations are calculated using the Levenshtein algorithm, and these in turn yield the distance between each pair of sites in the data set. Seven hierarchical clustering algorithms, as well as the k-means and neighbor-joining algorithms, are applied to the calculated distances.
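The distance step described in this abstract can be sketched as follows: a plain Levenshtein (edit) distance between two pronunciations, and a site-to-site distance derived from the word distances. Unit edit costs, averaging over shared words, the function names, and the toy transcriptions are all assumptions made for illustration; the abstract does not state how the chapter weights edits or aggregates word distances.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (unit cost per insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, symbol_a in enumerate(a, start=1):
        current = [i]
        for j, symbol_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (symbol_a != symbol_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def site_distance(site_a, site_b):
    """Mean Levenshtein distance over the words recorded at both sites.

    Averaging is an assumed aggregation, chosen here for simplicity.
    """
    shared = sorted(set(site_a) & set(site_b))
    if not shared:
        raise ValueError("sites share no words")
    return sum(levenshtein(site_a[w], site_b[w]) for w in shared) / len(shared)


# Toy transcriptions, invented for illustration (not the chapter's data).
site_x = {"milk": "mleko", "star": "zvezda"}
site_y = {"milk": "mlyako", "star": "zvezda"}

print(levenshtein("mleko", "mlyako"))
print(site_distance(site_x, site_y))
```

Computing `site_distance` for every pair of sites gives a symmetric distance matrix, which is the kind of input that hierarchical clustering routines typically expect.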
Hans Goebl
- Published in print: 2016
- Published Online: August 2016
- ISBN: 9780199677108
- eISBN: 9780191808821
- Item type: chapter
- Publisher: Oxford University Press
- DOI: 10.1093/acprof:oso/9780199677108.003.0007
- Subject: Linguistics, Language Families, Historical Linguistics
This chapter focuses on the diachronic and synchronic relation between language and space, critically considering some of the most important advances made within Romance linguistic geography and dialectometry. It reviews the early work of the dialect geographers in recording regional variation by means of detailed linguistic atlases and in tracing dialect areas, while reflecting upon the linguistic nature of the major discontinuities and their historical significance in explaining the fragmentation of Latin. The second part of the chapter concentrates on more recent taxometric and cartographic achievements of dialectometry in its quantitative investigations and interpretations of traditional linguistic atlases, exploring the non-coincidence of single areas and their surrounding isoglosses; difficulties in measuring the data of linguistic atlases; integration of quantitative methods with traditional qualitative geolinguistics; discovery of lower and higher ranking structural patterns concealed in traditional presentations of atlas data; and cartographic exploitation of similarity and distance matrices.