Gary Libben and Gonia Jarema (eds)
- Published in print: 2007
- Published Online: January 2010
- ISBN: 9780199228911
- eISBN: 9780191711213
- Item type: book
- Publisher: Oxford University Press
- DOI: 10.1093/acprof:oso/9780199228911.001.0001
- Subject: Linguistics, Psycholinguistics / Neurolinguistics / Cognitive Linguistics
This book presents new work on the psycholinguistics and neurolinguistics of compound words. It shows the insights this work offers on natural language processing and the relation between language, mind, and memory. Compounding is an easy and effective way to create and transfer meanings. By building new lexical items based on the meanings of existing items, compounds can usually be understood on first presentation, though—as, for example, breadboard, cardboard, cupboard, and sandwich-board show—the rules governing the relations between the components’ meanings are not always straightforward. Compound words may be segmentable into their constituent morphemes in much the same way as sentences can be divided into their constituent words: children and adults would not otherwise find them interpretable. But compound sequences may also be independent lexical items that can be retrieved for production as single entities and whose idiosyncratic meanings are stored in the mind. Compound words reflect the properties both of linguistic representation in the mind and of grammatical processing. They thus offer opportunities for investigating key aspects of the mental operations involved in language: for example, the interplay between storage and computation; the manner in which morphological and semantic factors impact on the nature of storage; and the way the mind’s computational processes serve on-line language comprehension and production. This book explores the nature of these opportunities, assesses what is known, and considers what may yet be discovered and how.
Edward P. Stabler and Jeff MacSwan
- Published in print: 2014
- Published Online: May 2016
- ISBN: 9780262027892
- eISBN: 9780262320351
- Item type: chapter
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262027892.003.0011
- Subject: Linguistics, Sociolinguistics / Anthropological Linguistics
A fully lexicalized grammar can represent the knowledge of a multilingual speaker simply by putting lexical items from the various languages together. This conception suggests that multilingualism should be a quite natural state, an idea that fits well with a conception according to which every adjustment in “register” or dialect for context is regarded as a kind of codeswitching (CS). This conception of CS does not invoke any special mechanisms to control the interaction among the languages a learner knows. Nonetheless, some restrictions appear to be needed, since CS does not occur at arbitrary positions in an utterance. Exploring these restrictions, MacSwan (1999, 2014) proposes that the syntactic elements of various languages can mix freely, but the morphophonologies have different properties and cannot be mixed. Hence, CS is possible only at those points in an utterance that would not disrupt morphophonological dependencies. A computational model of language analysis proposed by Stabler (2001), which separates morphophonological and syntactic processing, is explicit and general enough to allow these CS proposals to be realized. This chapter shows such a computational model at work, concluding that (1) when the syntax is lexicalized, so all crosslinguistic variation is due to the lexicon, CS is immediately predicted for mixed utterances; (2) the extent of codeswitching in the syntax is based on the extent to which the categories of the different languages are assimilated, which allows for the influence of learned CS patterns; and (3) morphophonology, based on linear position and ranked constraints, does not allow word-internal CS.
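To make the first of these conclusions concrete: if a grammar is nothing but a lexicon, combining two languages is just taking the union of their lexicons, and mixed utterances parse for free. The toy sketch below illustrates that idea with a miniature categorial lexicon and a CKY-style parser; the lexical items, categories, and English/Spanish glosses are invented for illustration and are not the chapter's Minimalist Grammar implementation.

```python
# Toy sketch (not the chapter's Minimalist Grammar implementation): a fully
# lexicalized grammar is just a lexicon, so "merging" two languages amounts to
# taking the union of their lexicons. Categories are simple categorial types;
# all lexical items below are invented examples.

ENGLISH = {
    "the":    [("NP", "/", "N")],
    "dog":    [("N",)],
    "sleeps": [("S", "\\", "NP")],
}
SPANISH = {
    "el":     [("NP", "/", "N")],
    "perro":  [("N",)],
    "duerme": [("S", "\\", "NP")],
}
MIXED = {w: ENGLISH.get(w, []) + SPANISH.get(w, [])
         for w in set(ENGLISH) | set(SPANISH)}

def combine(left, right):
    """Forward application (X/Y  Y -> X) and backward application (Y  X\\Y -> X)."""
    out = []
    if len(left) == 3 and left[1] == "/" and len(right) == 1 and right[0] == left[2]:
        out.append((left[0],))
    if len(right) == 3 and right[1] == "\\" and len(left) == 1 and left[0] == right[2]:
        out.append((right[0],))
    return out

def parse(words, lexicon):
    """CKY-style chart over whole words; word-internal mixing never arises here
    because lexical lookup only ever sees complete word forms."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexicon.get(w, []))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for j in range(i + 1, i + span):
                for left in chart[i][j]:
                    for right in chart[j][i + span]:
                        chart[i][i + span].update(combine(left, right))
    return chart[0][n]

print(parse("the perro duerme".split(), MIXED))  # {('S',)}: code-switched utterance parses
print(parse("the dog sleeps".split(), MIXED))    # {('S',)}: monolingual utterance still parses
```

Because lexical lookup applies only to whole words, nothing in this setup ever produces a word-internal mix, which is the intuition behind conclusion (3); the chapter's actual account of the morphophonological ban is constraint-based rather than a by-product of tokenization.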
Florian Cramer
- Published in print: 2008
- Published Online: August 2013
- ISBN: 9780262062749
- eISBN: 9780262273343
- Item type: chapter
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262062749.003.0023
- Subject: Society and Culture, Media Studies
This chapter describes in detail the relationship between software and language; it states that software both processes natural language and is constructed in programming languages. The chapter offers an in-depth analysis of the languages in which software is implemented and written, and discusses the role of computer control languages, machine languages, common human languages, and different language variants. It concludes by differentiating between computer programming languages and coding concepts in the computing sector.
Parisa Kordjamshidi, Joana Hois, and Marie-Francine Moens
- Published in print: 2013
- Published Online: January 2014
- ISBN: 9780199679911
- eISBN: 9780191760112
- Item type: chapter
- Publisher: Oxford University Press
- DOI: 10.1093/acprof:oso/9780199679911.003.0007
- Subject: Linguistics, Psycholinguistics / Neurolinguistics / Cognitive Linguistics, Semantics and Pragmatics
Computational approaches in spatial language understanding nowadays distinguish and use different aspects of spatial and contextual information. These aspects comprise linguistic grammatical features, qualitative formal representations, and situational context-aware data. In this chapter, we apply formal models and machine learning techniques to map spatial semantics in natural language to qualitative spatial representations. In particular, we investigate whether and how well linguistic features can be classified and automatically extracted and mapped to region-based qualitative relations using corpus-based learning. We separate the challenge of spatial language understanding into two tasks: (i) we identify and automatically extract those parts from linguistic utterances that provide specifically spatial information, and (ii) we map the extracted parts that result from the first task to qualitative spatial representations. In this chapter, we present both tasks and we particularly discuss experimental results of the second part of mapping linguistic features to qualitative spatial relations. Our results show that region-based spatial relations can indeed be learned to a high degree and that they are distinguishable on the basis of different linguistic features.
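As a concrete (and heavily simplified) illustration of task (ii), a corpus-based learner can be set up as an ordinary feature-based classifier from linguistic features of a spatial expression to a region-based relation label. The sketch below uses scikit-learn; the feature set, the RCC-style labels, and the tiny training sample are invented for illustration and do not reproduce the chapter's corpus or learner.

```python
# Minimal sketch of task (ii): mapping linguistic features of a spatial
# expression to a region-based qualitative relation. Features, labels, and
# the toy training set are illustrative assumptions only.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train = [
    ({"prep": "in",      "landmark_head": "box"},     "NTPP"),  # non-tangential proper part
    ({"prep": "inside",  "landmark_head": "room"},    "NTPP"),
    ({"prep": "on",      "landmark_head": "table"},   "EC"),    # externally connected
    ({"prep": "against", "landmark_head": "wall"},    "EC"),
    ({"prep": "near",    "landmark_head": "station"}, "DC"),    # disconnected
    ({"prep": "beside",  "landmark_head": "lamp"},    "DC"),
]
X, y = zip(*train)

model = make_pipeline(DictVectorizer(sparse=False), LogisticRegression(max_iter=1000))
model.fit(list(X), list(y))

test = {"prep": "in", "landmark_head": "bag"}
print(model.predict([test])[0])  # expected: a region-based relation such as "NTPP"
```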
Mark Steedman
- Published in print: 2011
- Published Online: August 2013
- ISBN: 9780262017077
- eISBN: 9780262301404
- Item type: chapter
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262017077.003.0013
- Subject: Linguistics, Semantics and Pragmatics
Combinatory Categorial Grammar (CCG) allows semantically equivalent alternate surface derivations typified by the sentence “Harry admires Louise,” as well as English noun phrases, to have all of the type-raised categories allowed by a full-blown morphological case system. Many critics have argued that this so-called spurious ambiguity makes CCG quite impracticable to apply to useful tasks such as parsing and question answering in open domains, regardless of its linguistic attractions. This chapter examines how CCG can be used for efficient natural language processing. It first considers algorithms that have formed the basis of a number of practical CCG parsers before turning to logical forms and how they are built with CCG. The chapter also discusses processing scope and pronominal reference in CCG, generation of strings from logical forms using CCG, the use of scope for rapid inference in support of question answering or textual entailment, and human sentence processing.
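The "spurious ambiguity" at issue can be made concrete with the example sentence above. The derivations below use standard CCG rule notation (forward/backward application, type-raising, and composition) as a generic textbook illustration rather than a quotation from the chapter:

\[
\begin{aligned}
&\text{Lexicon:}\quad \text{Harry},\ \text{Louise} := NP; \qquad \text{admires} := (S\backslash NP)/NP\\[4pt]
&\text{Derivation 1:}\quad \text{admires}\ \text{Louise} \Rightarrow S\backslash NP\ \ (>); \qquad \text{Harry}\ [S\backslash NP] \Rightarrow S\ \ (<)\\
&\text{Derivation 2:}\quad \text{Harry} \Rightarrow S/(S\backslash NP)\ \ (>\!T); \qquad [S/(S\backslash NP)]\ [(S\backslash NP)/NP] \Rightarrow S/NP\ \ (>\!B); \qquad [S/NP]\ \text{Louise} \Rightarrow S\ \ (>)
\end{aligned}
\]

Both derivations assign the same logical form, \( \mathit{admires}'\,\mathit{louise}'\,\mathit{harry}' \), which is why the extra derivation is called spurious; practical CCG parsers cope by normal-form parsing or by packing equivalent derivations in a shared chart.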
Masashi Sugiyama and Motoaki Kawanabe
- Published in print: 2012
- Published Online: September 2013
- ISBN: 9780262017091
- eISBN: 9780262301220
- Item type: book
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262017091.001.0001
- Subject: Computer Science, Machine Learning
As the power of computing has grown over the past few decades, the field of machine learning has advanced rapidly in both theory and practice. Machine learning methods are usually based on the assumption that the data generation mechanism does not change over time. Yet real-world applications of machine learning, including image recognition, natural language processing, speech recognition, robot control, and bioinformatics, often violate this common assumption. Dealing with non-stationarity is one of modern machine learning’s greatest challenges. This book focuses on a specific non-stationary environment known as covariate shift, in which the distributions of inputs (queries) change but the conditional distribution of outputs (answers) is unchanged, and presents machine learning theory, algorithms, and applications to overcome this variety of non-stationarity. After reviewing the state-of-the-art research in the field, the book discusses topics that include learning under covariate shift, model selection, importance estimation, and active learning. It describes such real-world applications of covariate shift adaptation as brain-computer interfaces, speaker identification, and age prediction from facial images.
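In the standard formulation (stated here for orientation, not quoted from the book), covariate shift means that the input distribution changes while the input-conditional output distribution stays fixed:

\[
p_{\mathrm{train}}(x) \neq p_{\mathrm{test}}(x), \qquad p_{\mathrm{train}}(y \mid x) = p_{\mathrm{test}}(y \mid x).
\]

Importance estimation then targets the density ratio \( w(x) = p_{\mathrm{test}}(x)/p_{\mathrm{train}}(x) \), and learning under covariate shift typically reweights the empirical loss:

\[
\hat{\theta} = \arg\min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n} w(x_i)\,\ell\bigl(y_i, f_{\theta}(x_i)\bigr).
\]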
Mark Steedman
- Published in print: 2011
- Published Online: August 2013
- ISBN: 9780262017077
- eISBN: 9780262301404
- Item type: chapter
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262017077.003.0001
- Subject: Linguistics, Semantics and Pragmatics
Computational linguists and psycholinguists show little interest in the formal semantics of quantification, even though quantifiers and other scoping elements play an important role in human and computational natural language processing. Quantifiers, along with negation and other categories such as modality, tense, and aspect, matter because they support inference and entailment. To understand the relevance of quantifier scope, it is necessary to look at how rare events reveal the nature of the system and where quantifiers might actually have practical applications. Questions provide few opportunities to deploy quantifiers, other than simple definites and indefinites. Some quantifiers convey subtle implicatures about the completeness or otherwise of the speaker’s knowledge.
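A standard illustration of why scope matters for inference and entailment (a generic textbook example, not one drawn from the chapter): "Every boy admires some saxophonist" has two readings,

\[
\forall x\,\bigl(\mathit{boy}(x) \rightarrow \exists y\,(\mathit{saxophonist}(y) \wedge \mathit{admires}(x,y))\bigr)
\qquad\text{vs.}\qquad
\exists y\,\bigl(\mathit{saxophonist}(y) \wedge \forall x\,(\mathit{boy}(x) \rightarrow \mathit{admires}(x,y))\bigr),
\]

and the second (wide-scope existential) reading entails the first but not conversely, which is exactly the sense in which quantifier scope supports inference and entailment.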
Andrew Piper
- Published in print: 2018
- Published Online: January 2019
- ISBN: 9780226568614
- eISBN: 9780226568898
- Item type: book
- Publisher: University of Chicago Press
- DOI: 10.7208/chicago/9780226568898.001.0001
- Subject: Literature, Criticism/Theory
For well over a century, academic disciplines have studied human behavior using quantitative information. Until recently, however, the humanities have remained largely immune to the use of data—or vigorously resisted it. Thanks to new developments in computer science and natural language processing, literary scholars have embraced the quantitative study of literary works and have helped make Digital Humanities a rapidly growing field. But these developments raise a fundamental and as yet unanswered question: what is the meaning of literary quantity? This book answers that question across a variety of domains fundamental to the study of literature. It focuses on the elementary particles of literature, from the role of punctuation in poetry and the matter of plot in novels, to the study of topoi and the behavior of characters, to the nature of fictional language and the shape of a poet’s career. How does quantity affect our understanding of these categories? What happens when we look at 3,388,230 punctuation marks, 1.4 billion words, or 650,000 fictional characters? Does this change how we think about poetry, the novel, fictionality, character, the commonplace, or the writer’s career? In the course of answering these questions the book introduces readers to the analytical building blocks of computational text analysis and brings them to bear on fundamental concerns of literary scholarship.
Subhadra Dutta and Eric M. O’Rourke
- Published in print: 2020
- Published Online: May 2020
- ISBN: 9780190939717
- eISBN: 9780190939748
- Item type: chapter
- Publisher: Oxford University Press
- DOI: 10.1093/oso/9780190939717.003.0013
- Subject: Psychology, Social Psychology
Natural language processing (NLP) is the field of decoding human written language. This chapter responds to the growing interest in using machine learning–based NLP approaches for analyzing open-ended employee survey responses. These techniques address scalability and the ability to provide real-time insights to make qualitative data collection equally or more desirable in organizations. The chapter walks through the evolution of text analytics in industrial–organizational psychology and discusses relevant supervised and unsupervised machine learning NLP methods for survey text data, such as latent Dirichlet allocation, latent semantic analysis, sentiment analysis, word relatedness methods, and so on. The chapter also lays out preprocessing techniques and the trade-offs of growing NLP capabilities internally versus externally, points the readers to available resources, and ends with discussing implications and future directions of these approaches.
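As a minimal sketch of one unsupervised method named above, the snippet below runs latent Dirichlet allocation over a handful of open-ended survey comments using scikit-learn; the toy responses, topic count, and preprocessing choices are illustrative only.

```python
# Minimal sketch: latent Dirichlet allocation over open-ended survey comments.
# Toy responses and parameter choices are illustrative, not a recommended setup.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

responses = [
    "My manager gives clear feedback and supports my growth",
    "Pay and benefits have not kept up with the workload",
    "Great team culture, but meetings take up too much time",
    "I would like more training and career development opportunities",
    "Compensation is fair but communication from leadership is poor",
]

# Preprocessing: lowercasing, English stop-word removal, bag of words
vectorizer = CountVectorizer(lowercase=True, stop_words="english")
doc_term = vectorizer.fit_transform(responses)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"topic {k}: {', '.join(top)}")   # top words per inferred theme
```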
Charles Yang
- Published in print: 2016
- Published Online: May 2017
- ISBN: 9780262035323
- eISBN: 9780262336376
- Item type: chapter
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262035323.003.0002
- Subject: Psychology, Cognitive Psychology
A review of the statistical facts of language, especially morphology, and of children’s acquisition of morphology, with a focus on productivity. Contrary to popular belief, productivity should be understood as a categorical notion in language, judging from the now extensive cross-linguistic studies of language acquisition. The chapter also asks why language must make use of a small set of wide-ranging rules, rather than memorized expressions, and why this is a difficult task for the child learner, who acquires language in a few short years.
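The categorical notion of productivity at stake here is formalized elsewhere in the same book as the Tolerance Principle; the abstract does not spell it out, so the statement below is supplied from general knowledge of Yang's proposal rather than from this chapter summary:

\[
e \;\leq\; \theta_N \;=\; \frac{N}{\ln N},
\]

that is, a rule applying to \(N\) lexical items remains productive only if the number of exceptions \(e\) stays below this threshold; above it, the learner falls back on item-by-item memorization.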
Bernardo Cuenca Grau and Adolfo Plasencia
- Published in print: 2017
- Published Online: January 2018
- ISBN: 9780262036016
- eISBN: 9780262339308
- Item type: chapter
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262036016.003.0015
- Subject: Society and Culture, Technology and Society
In this dialogue, Bernardo Cuenca Grau, a computer scientist at the Department of Computer Science, University of Oxford, begins by explaining his research in technology based on ontologies and knowledge representation, somewhere between mathematics, philosophy, and computer science. He goes on to argue why we need to represent knowledge in a way that can be processed by a computer, and thereby enable automated reasoning over this knowledge using artificial intelligence. Later he explains how his investigation probes the limits of mathematics to find the most appropriate languages for developing practical applications, for example the large-scale processing of structured information linked to comprehensive health systems. Bernardo is supportive of collective tools such as Wikipedia. He also discusses why, in his opinion, the success of a scientific or technological idea depends very much on luck, and why the semantic web has not been defined. Furthermore, he argues why bureaucracy confuses process with progress.
Andrew Piper
- Published in print: 2018
- Published Online: January 2019
- ISBN: 9780226568614
- eISBN: 9780226568898
- Item type: chapter
- Publisher: University of Chicago Press
- DOI: 10.7208/chicago/9780226568898.003.0004
- Subject: Literature, Criticism/Theory
This chapter shows how probabilistic topic modeling drives interest in the study of topics, and how topic modeling has proven a successful tool for identifying coherent linguistic categories within collections of texts. Yet despite interest in topic models, no one has yet asked the question “What is a topic?” (either in classical rhetoric or computational study). If we derive large-scale semantic significance from texts, how does this relate to the longer philosophical/philological tradition? Beginning with an overview of the long (pre-computational) history of topics (from Aristotle to Renaissance commonplace books to nineteenth-century encyclopedism), then moving to a quantitative approach to topic modeling’s link to the past, the chapter uses a single topic (a single model run on a collection of German novels written over the course of the long nineteenth century) to explore larger metaphorical constellations associated with this topic (through the close reading of individual passages), and to apply a more quantitative approach (chapter 2’s method of multidimensional scaling, where word distributions are transformed into spatial representations). The topic’s semantic coherence (when it’s more or less present) at different states of likelihood within a text is compared to the spatial relationships and interconnectedness of topics (the way they coalesce/disperse).
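The multidimensional-scaling step referred to above, in which word distributions are turned into spatial representations, can be sketched as follows; the toy documents and the choice of Jensen-Shannon distance are illustrative assumptions, not the chapter's exact procedure.

```python
# Sketch: compare word distributions pairwise and project them into 2-D space
# with multidimensional scaling. Toy documents and the Jensen-Shannon distance
# are illustrative assumptions only.
import numpy as np
from scipy.spatial.distance import jensenshannon
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.manifold import MDS

docs = [
    "the sea and the storm and the ship",
    "the ship sailed the quiet sea at dawn",
    "the city street was loud with carriages",
    "carriages filled the loud city at dusk",
]

counts = CountVectorizer().fit_transform(docs).toarray().astype(float)
dists = counts / counts.sum(axis=1, keepdims=True)        # word distribution per text

n = len(docs)
D = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        D[i, j] = jensenshannon(dists[i], dists[j])        # pairwise distribution distance

coords = MDS(n_components=2, dissimilarity="precomputed", random_state=0).fit_transform(D)
print(coords)   # nearby points = texts whose word distributions resemble each other
```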
François Grosjean
- Published in print: 2019
- Published Online: June 2019
- ISBN: 9780198754947
- eISBN: 9780191816437
- Item type: chapter
- Publisher: Oxford University Press
- DOI: 10.1093/oso/9780198754947.003.0013
- Subject: Linguistics, Psycholinguistics / Neurolinguistics / Cognitive Linguistics, Sociolinguistics / Anthropological Linguistics
The author and his family went back to Europe for good, and had to acculturate to a new culture. His boys needed a bit of help during their first years but then things worked out well. The author set up his laboratory and started collaborating with firms involved in natural language processing (NLP). One of the main projects he worked on with his team was an English writing tool and grammar checker for French speakers. He also developed a long-term partnership with the Lausanne University Hospital (CHUV). The author explains how he helped students do experimental research in his laboratory. The chapter ends with some statistics showing how successful the laboratory was over a span of twenty years.
Masashi Sugiyama and Motoaki Kawanabe
- Published in print: 2012
- Published Online: September 2013
- ISBN: 9780262017091
- eISBN: 9780262301220
- Item type: chapter
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262017091.003.0007
- Subject: Computer Science, Machine Learning
This chapter discusses state-of-the-art applications of covariate shift adaptation techniques to various real-world problems. It covers non-stationarity adaptation in brain-computer interfaces; speaker identification through change in voice quality; domain adaptation in natural language processing; age prediction from face images under changing illumination conditions; user adaptation in human activity recognition; and efficient sample reuse in autonomous robot control.
Andrew Piper
- Published in print: 2018
- Published Online: January 2019
- ISBN: 9780226568614
- eISBN: 9780226568898
- Item type: chapter
- Publisher: University of Chicago Press
- DOI: 10.7208/chicago/9780226568898.003.0006
- Subject: Literature, Criticism/Theory
Quantity has a role to play in understanding the nature of characters and the process of characterization (the writerly act of generating animate entities through language). With Alex Woloch’s question of “the many” in mind, the chapter begins with a survey of an estimated 85 characters per novel in the nineteenth century, a conservative estimate of 20,000 novels published during this period in English, producing ca. 1.7 million unique characters appearing in one century in one language. Simultaneously, the process of characterization poses challenges of scale: the great number of characters, plus the vast amount of information surrounding even a single, main character. Characters (like other textual features) are abundant across the pages of novels. Through an examination of over 650,000 characters using new techniques in natural language processing and entity recognition, this chapter explores the "character-text" of novels (how characters are activated, described, objectified). The (surprising) evidence here suggests that the process of characterization is best described as one of stylistic constraint, aligning the practice of characterization more closely with a character’s etymological origins (as representative, general, or “characteristic”—not individualistic). The chapter then explores the rise of “interiorly oriented” characters and Nancy Armstrong’s notion of strongly gendered “deep character.”
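The entity-recognition step behind counts on this scale can be sketched with an off-the-shelf NLP pipeline; spaCy is used here as one common choice (the chapter does not name its tooling), and the passage is invented for illustration.

```python
# Sketch of the entity-recognition step behind large character counts, using
# spaCy as an assumed tool; the passage below is invented for illustration.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")

passage = (
    "Elizabeth walked to Netherfield, where Jane had fallen ill. "
    "Mr. Darcy watched Elizabeth cross the muddy fields."
)

doc = nlp(passage)
characters = Counter(ent.text for ent in doc.ents if ent.label_ == "PERSON")
print(characters.most_common())   # e.g. [('Elizabeth', 2), ('Jane', 1), ('Darcy', 1)]
```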
George A. Khachatryan
- Published in print: 2020
- Published Online: May 2020
- ISBN: 9780190910709
- eISBN: 9780190910730
- Item type: chapter
- Publisher: Oxford University Press
- DOI: 10.1093/oso/9780190910709.003.0010
- Subject: Psychology, Developmental Psychology
Instruction modeling is still in its early stages. This chapter discusses promising directions in which instruction modeling could develop in coming years. This includes increasing the richness of interfaces used in instruction modeling programs (e.g., by allowing students to enter responses in free form and have them graded via natural language processing); applying instruction modeling to subjects beyond mathematics, including English, foreign language, and science; using educational data mining to create automated “coaches” to help teachers better implement instruction modeling programs in their classrooms; creating approaches to instruction modeling that allow for rapid authorship of content; redesigning schools (in schedules as well as architecture) to optimize the use of instruction modeling; and putting in place government policies to encourage the use of comprehensive blended learning programs (such as those developed through instruction modeling).
Andrew Piper
- Published in print: 2018
- Published Online: January 2019
- ISBN: 9780226568614
- eISBN: 9780226568898
- Item type: chapter
- Publisher: University of Chicago Press
- DOI: 10.7208/chicago/9780226568898.003.0001
- Subject: Literature, Criticism/Theory
This book studies the meaning of repetitions of reading instead of its more singular moments--the quantities of texts and the quantities of things within texts--that give meaning to a reader's experience. The introduction establishes a new theoretical framework for thinking about literary quantity through three key terms: implication, distribution, and diagram. First, implication puts the idea of modeling and its contingency into the foreground of research. Focusing on the implicatedness of computational modeling allows a rethinking of our investments in either purely empirical or subjective reading experiences. Implicatedness acknowledges both the constructedness of our knowledge about the past and an underlying affirmatory belief contained in any construction of the world. Distributed reading, by contrast, suggests a fundamentally relational/reflexive way of thinking about literary meaning, the way the sense of a text is always mutually produced through the construction of context. Distributional models allow for a more spatial, contingent modeling of texts and contexts. Finally, diagrammatic reading indicates how the practice of computation produces meaning by "drawing together" different sign systems (letter, number, image). Replacing the haptic totality of the book--its graspability--computational reading relies on the diagram's perspectival totality, the continual mediation between letters and numbers.
Andrew Piper
- Published in print: 2018
- Published Online: January 2019
- ISBN: 9780226568614
- eISBN: 9780226568898
- Item type: chapter
- Publisher: University of Chicago Press
- DOI: 10.7208/chicago/9780226568898.003.0003
- Subject: Literature, Criticism/Theory
This chapter is about words, not in the individual sense, but in the distributional sense (a larger set of patterns/behaviors, as a form of usage). Contra a single luminous word, distributional semantics shows relationships existing between words; meaning shaped through probabilistic distributions. Understanding texts as word distributions, a way of thinking about plot (the way actions/beliefs are encoded in narrative form), tracks the shift/drift of language in a text as it signals to readers a change in the text’s concerns. Using a trilingual collection (450 mostly canonical novels from the long nineteenth century), the chapter shows these novels as distinctive in their lexical contraction. Though for much of their history novels have been imagined as an abundant, often exceedingly long form, multiplying dramatically over time, vector space model techniques show these novels pushing against their perceived history of imagined excess. These novels are unique in how the linguistic “space” within them contracts as they explore social constraint experienced through language. Here the art of lack offers insights into what it means to contract inward and, in so doing, potentially saying more. The art of lack is the dream of insight where there is increasingly less and less to say.
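One rough way to operationalize "lexical contraction" is to cut a novel into equal-sized windows and track both vocabulary richness per window and how far apart the windows sit in vector space. The sketch below is an illustrative measure under those assumptions, not the book's exact procedure; load_novel is a hypothetical helper.

```python
# Illustrative measures of "lexical contraction": type-token ratio per window
# and the average vector-space distance between windows. Not the book's exact
# procedure; load_novel is a hypothetical loader.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_distances

def windows(tokens, size=5000):
    """Split a token list into consecutive equal-sized windows."""
    return [" ".join(tokens[i:i + size]) for i in range(0, len(tokens) - size + 1, size)]

def contraction_profile(text):
    toks = text.lower().split()
    wins = windows(toks)
    ttr = [len(set(w.split())) / len(w.split()) for w in wins]   # vocabulary richness per window
    vecs = TfidfVectorizer().fit_transform(wins)
    spread = cosine_distances(vecs).mean()                       # how dispersed the windows are
    return ttr, spread

# text = load_novel("example_novel.txt")       # hypothetical loader
# ttr, spread = contraction_profile(text)
# A falling TTR curve and a small spread would both point toward contraction.
```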
Andrew Piper
- Published in print: 2018
- Published Online: January 2019
- ISBN: 9780226568614
- eISBN: 9780226568898
- Item type: chapter
- Publisher: University of Chicago Press
- DOI: 10.7208/chicago/9780226568898.003.0007
- Subject: Literature, Criticism/Theory
The “corpus” (or body of work, since Cicero) of an author is meant to be organic, integral--well connected--but also distinct and whole; it marks limits; it is the material complement to the author’s life. What does it mean to imagine writing as a body, something with a distinct shape/form, but also subject to vulnerability? How do we understand those moments when a writer opens herself or her corpus up to change, how radical or gradual are these movements, how permanent, fleeting, or even recurrent? Is there something called “late style,” a distinctive signature that characterizes the end of a career as contours of an aging body mapped onto the weave of writing? When and how do we intellectually/creatively exfoliate? While we have very successful ways of detecting “authors” or “style,” we have considerably fewer techniques for talking about change, the nature of the variability within an author’s corporal outline, the variety of measures to study the shape of a writer’s career. Working with a trilingual collection of roughly 30,000 poems in French, German and English, this chapter explores questions of local/global vulnerability and late style, concluding with a computationally informed reading of the work of Wanda Coleman.
Andrew Piper
- Published in print: 2018
- Published Online: January 2019
- ISBN: 9780226568614
- eISBN: 9780226568898
- Item type: chapter
- Publisher: University of Chicago Press
- DOI: 10.7208/chicago/9780226568898.003.0002
- Subject: Literature, Criticism/Theory
This chapter is a history of what Bataille might call the general economy of punctuation: its distributions, luxuriant overaccumulation, and rhythmic rise and fall (Amiri Baraka’s “delay of language”). Economy of punctuation shows how spacing/pacing create meaning on the page, also how tactics of interruption, delay, rhythm, periodicity, and stoppage are all essential means of communicating within literature’s long history. Economy of punctuation reveals the social norms surrounding how we feel about the discontinuities of what we want to say. Viewing the relationship between punctuation’s excess and its manifestation in twentieth-century poetry through a collection of 75,000 English poems by 452 poets who were active during the twentieth century, the chapter explores methods that move from the elementary function ("grep") to more sophisticated uses of word embeddings; it also explores poems that deploy periods well in excess of the norms of their age. Few narratives are more strongly ingrained in the field of poetics than this era's growing antipathy to punctuation. Yet we observe how the period became increasingly deployed by these poets. The period’s abundance creates a language space marked not only by a sense of the elementary (deictic/rudimentary) but also of opposition/conjunction, a sense of the irreconcilable.
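The "elementary function" the chapter starts from, grep-style counting of periods, can be sketched in a few lines; the toy poems are invented, and rates are normalized per word so poems of different lengths are comparable.

```python
# Grep-style counting of periods, normalized per word so poems of different
# lengths can be compared against a corpus norm. Toy poems are invented.
import re

poems = {
    "poem_a": "Stop. Look. The door. The hinge. The dark.",
    "poem_b": "the river ran on without pause through the long unbroken evening light",
}

def period_rate(text):
    periods = len(re.findall(r"\.", text))     # literal periods
    words = len(re.findall(r"\w+", text))      # word tokens
    return periods / words if words else 0.0

rates = {title: period_rate(text) for title, text in poems.items()}
corpus_mean = sum(rates.values()) / len(rates)

for title, rate in rates.items():
    flag = "above corpus norm" if rate > corpus_mean else "at or below corpus norm"
    print(f"{title}: {rate:.2f} periods per word ({flag})")
```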