Geoffrey Rockwell and Stéfan Sinclair
- Published in print: 2016
- Published Online: January 2017
- ISBN: 9780262034357
- eISBN: 9780262332064
- Item type: chapter
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262034357.003.0007
- Subject: Society and Culture, Cultural Studies
What is big data, and what does it have to do with the humanities? The Snowden revelations have drawn attention to the opportunities and dangers of gathering large collections of data, including the collection of text messages and email. Techniques that digital humanists have used to study individual texts are now being scaled up to study large collections. The digital humanities offer a valuable historical and ethical perspective on big data analytics. Questions about what to do with too much information go back to Plato. Questions about the completeness of data, the usefulness of metadata, and the value of analytics can help us understand what big data can and cannot do. In particular, we need to be wary of false positives, that is, false predictions based on data too large to check with other methods.