- Item type: Book
- Language: English
- Publication year: 2021
- Edition no.: 2nd ed.
- Contributors
- ISBN: 978-0-367-56859-7
- Note
Big Data and Social Science: Data Science Methods and Tools for Research and Practice, Second Edition shows how to apply data science to real-world problems, covering all stages of a data-intensive social science or policy project. Prominent leaders in the social sciences, statistics, and computer science as well as the field of data science provide a unique perspective on how to apply modern social science research principles and current analytical and computational tools. The text teaches you how to identify and collect appropriate data, apply data science methods and tools to the data, and recognize and respond to data errors, biases, and limitations.
Features:
- Takes an accessible, hands-on approach to handling new types of data in the social sciences
- Presents the key data science tools in a non-intimidating way to both social and data scientists while keeping the focus on research questions and purposes
- Illustrates social science and data science principles through real-world problems
- Links computer science concepts to practical social science research
- Promotes good scientific practice
- Provides freely available workbooks with data, code, and practical programming exercises, through Binder and GitHub
New to the Second Edition:
- Increased use of examples from different areas of social sciences
- New chapter on dealing with Bias and Fairness in Machine Learning models
- Expanded chapters focusing on Machine Learning and Text Analysis
- Revamped hands-on Jupyter notebooks to reinforce concepts covered in each chapter
This classroom-tested book fills a major gap in graduate- and professional-level data science and social science education. It can be used to train a new generation of social data scientists to tackle real-world problems and improve the skills and competencies of applied social scientists and public policy practitioners. It empowers you to use the massive and rapidly growing amounts of available data to interpret economic and social activities in a scientific and rigorous manner.
Table of Contents
1. Introduction
2. Working with Web Data and APIs - Cameron Neylon
3. Record Linkage - Joshua Tokle and Stefan Bender
4. Databases - Ian Foster and Pascal Heus
5. Scaling up through Parallel and Distributed Computing - Huy Vo and Claudio Silva
6. Information Visualization - M. Adil Yalcin and Catherine Plaisant
7. Machine Learning - Rayid Ghani and Malte Schierholz
8. Text Analysis - Evgeny Klochikhin and Jordan Boyd-Graber
9. Networks: The Basics - Jason Owen-Smith
10. Data Quality and Inference Errors - Paul P. Biemer
11. Bias and Fairness - Kit T. Rodolfa, Pedro Saleiro, and Rayid Ghani
12. Privacy and Confidentiality - Stefan Bender, Ron Jarmin, Frauke Kreuter, and Julia Lane
13. Workbooks - Brian Kim, Christoph Kern, Jonathan Scott Morgan, Clayton Hunter, and Avishek Kumar