Text Mining: A Guidebook for the Social Sciences

By: Gabe Ignatow (author), Rada F. Mihalcea (author)

Paperback

In Stock


Online communities generate massive volumes of natural language data and the social sciences continue to learn how to best make use of this new information and the technology available for analyzing it. Text Mining: A Guidebook for the Social Sciences brings together a broad range of contemporary qualitative and quantitative methods to provide strategic and practical guidance on analyzing large text collections. This accessible book, written by sociologist Gabe Ignatow and computer scientist Rada Mihalcea, surveys the fast-changing landscape of data sources, programming languages, software packages, and methods of analysis available today. Suitable for novice and experienced researchers alike, the book will help readers use text mining techniques more efficiently and productively.

About the Authors

Gabe Ignatow is an Associate Professor of Sociology at the University of North Texas where he has taught since 2007. His research interests are in the areas of sociological theory, text mining and analysis methods, new media, and information policy. Gabe's current research involves working with computer scientists and statisticians to adapt text mining and topic modeling techniques for social science applications. Gabe has been working with mixed methods of text analysis since the 1990s, and has published this work in journals including Social Forces, Sociological Forum, Poetics, the Journal for the Theory of Social Behaviour, and the Journal of Computer-Mediated Communication. He is the author of over 30 peer-reviewed articles and book chapters and serves on the editorial boards of the journals Sociological Forum, the Journal for the Theory of Social Behaviour, and Studies in Media and Communications. He has served as the UNT Department of Sociology's graduate program co-director and undergraduate program director and has been selected as a faculty fellow at the Center for Cultural Sociology at Yale University. He is also a co-founder and the CEO of GradTrek, a graduate degree search engine company.

Rada Mihalcea is a professor of computer science and engineering at the University of Michigan. Her research interests are in computational linguistics, with a focus on lexical semantics, multilingual natural language processing, and computational social sciences. She serves or has served on the editorial boards of the following journals: Computational Linguistics, Language Resources and Evaluation, Natural Language Engineering, Research on Language and Computation, IEEE Transactions on Affective Computing, and Transactions of the Association for Computational Linguistics. She was a general chair for the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL, 2015) and a program cochair for the Conference of the Association for Computational Linguistics (2011) and the Conference on Empirical Methods in Natural Language Processing (2009). She is the recipient of a National Science Foundation CAREER award (2008) and a Presidential Early Career Award for Scientists and Engineers (2009). In 2013, she was made an honorary citizen of her hometown of Cluj-Napoca, Romania.


Part I: Digital Texts, Digital Social Science

  • 1. Social Science and the Digital Text Revolution: Learning Objectives; Introduction; History of Text Analysis; Risk and Rewards of Text Mining for the Social Sciences; Social Data from Digital Environments; Theory and Metatheory; Ethics of Text Mining; Organization of This Volume
  • 2. Research Design Strategies: Learning Objectives; Introduction; Levels of Analysis; Strategies for Document Selection and Sampling; Types of Inferential Logic; Approaches to Research Design

Part II: Text Mining Fundamentals

  • 3. Web Crawling and Scraping: Learning Objectives; Introduction; Web Statistics; Web Crawling; Web Scraping; Software for Web Crawling and Scraping
  • 4. Lexical Resources: Learning Objectives; Introduction; WordNet; Roget's Thesaurus; Linguistic Inquiry and Word Count; General Inquirer; Wikipedia; Downloadable Lexical Resources and APIs
  • 5. Basic Text Processing: Learning Objectives; Introduction; Tokenization; Stopword Removal; Stemming and Lemmatization; Text Statistics; Language Models; Other Text Processing; Software for Text Processing
  • 6. Supervised Learning: Learning Objectives; Feature Representation and Weighting; Supervised Learning Algorithms; Evaluation of Supervised Learning; Software for Supervised Learning

Part III: Text Analysis Methods from the Humanities and Social Sciences

  • 7. Thematic Analysis, QDAS, and Visualization: Learning Objectives; Thematic Analysis; Qualitative Data Analysis Software; Visualization Tools
  • 8. Narrative Analysis: Learning Objectives; Introduction; Conceptual Foundations; Mixed Methods of Narrative Analysis; Automated Approaches to Narrative Analysis; Future Directions; Specialized Software for Narrative Analysis
  • 9. Metaphor Analysis: Learning Objectives; Introduction; Theoretical Foundations; Qualitative Metaphor Analysis; Mixed Methods of Metaphor Analysis; Automated Metaphor Identification Methods; Software for Metaphor Analysis

Part IV: Text Mining Methods from Computer Science

  • 10. Word and Text Relatedness: Learning Objectives; Introduction; Theoretical Foundations; Corpus-based and Knowledge-based Measures of Relatedness; Software and Datasets for Word and Text Relatedness; Further Reading
  • 11. Text Classification: Learning Objectives; Introduction; Applications of Text Classification; Representing Texts for Supervised Text Classification; Text Classification Algorithms; Bootstrapping in Text Classification; Evaluation of Text Classification; Software and Datasets for Text Classification
  • 12. Information Extraction: Learning Objectives; Introduction; Entity Extraction; Relation Extraction; Web Information Extraction; Template Filling; Software and Datasets for Information Extraction and Text Mining
  • 13. Information Retrieval: Learning Objectives; Introduction; Theoretical Foundations; Components of an Information Retrieval System; Information Retrieval Models; The Vector-Space Model; Evaluation of Information Retrieval Models; Web-Based Information Retrieval; Software and Datasets for Information Retrieval
  • 14. Sentiment Analysis: Learning Objectives; Introduction; Theoretical Foundations; Lexicons; Corpora; Tools; Future Directions; Software and Datasets for Word and Text Relatedness
  • 15. Topic Models: Learning Objectives; Introduction; Digital Humanities; Political Science; Sociology; Software for Topic Modeling

Part V: Conclusions

  • 16. Text Mining, Text Analysis, and the Future of Social Science: Introduction; Social and Computer Science Collaboration

Product Details

  • ISBN13: 9781483369341
  • Format: Paperback
  • Number Of Pages: 208
  • ID: 9781483369341
  • Weight: 359
  • ISBN10: 148336934X

Delivery Information

  • Saver Delivery: Yes
  • 1st Class Delivery: Yes
  • Courier Delivery: Yes
  • Store Delivery: Yes

Prices are for internet purchases only. Prices and availability in WHSmith Stores may vary significantly.