This virtual seminar will be given in 14 biweekly sessions of 3 hours each, starting January 22, 2016. The sessions will start at 11:00 AM EST. The application deadline is November 30th, 2015.
About the Seminar
Recent advances in both the availability of digital trace data and the tools for analyzing such data open up new vistas for social science research. These trends make new forms of research both possible and necessary. This seminar intends to provide its participants with the tools and insights necessary to navigate the emerging data landscape. It is aimed at scholars who come from primarily qualitative and narrative traditions but who still want to take advantage of digital trace data and computational tools, without necessarily resorting to purely econometric methods. In particular, the seminar intends to accomplish the following goals:
- To give junior scholars an overview of available digital trace data sources and attendant computational tools
- To train junior scholars in rigorous computational-qualitative research design, methods, and attendant styles of theorizing
- To initiate a number of computational-qualitative research projects
About the Instructor
The seminar will be led by Aron Lindberg, Assistant Professor of Information Systems at Stevens Institute of Technology. He is also a research fellow at the Swedish Center for Digital Innovation. His research focuses on applying computational tools to digital trace data with a qualitative mindset. This involves extracting and analyzing data with regard to processes, latent patterns, meaning, and relationships. Substantively, his research focuses on open innovation, collective intelligence, and the structure of temporal processes such as routines and narratives. The seminar will be assisted by Jonas Andersen, postdoctoral scholar at Gothenburg University and the Swedish Center for Digital Innovation. The seminar will also have multiple visits from experienced scholars active in computational-qualitative field research, such as Nick Berente, James Gaskin, James Howison, Natalia Levina, Rikard Lindgren, Kalle Lyytinen, Ann Majchrzak, Brian Pentland, and Youngjin Yoo.
Structure
Participation can, if necessary, be restricted to the seminar component only. The seminar has three main components:
- Seminar on computational-qualitative research
- Tutoring on computational-qualitative research proposals
- Training on computational methods based on online resources
Because the seminar is virtual, sessions will be conducted on a biweekly basis over Skype. Before each session, readings will be indicated, which will then be discussed in depth in an open-ended brownbag style. Participants enrolled in the tutoring and training components will also work continuously on an individual research proposal, based on the idea of integrating computational and qualitative research. Further, each participant in the tutoring and training components will make a personalized training plan to learn methods that will help him or her solve the computational challenges associated with his or her research project.
Seminar Modules
The seminar itself is composed of four different modules, each covering an essential aspect of computational-qualitative field research. Below is a brief overview of these modules. As the seminar rests on dynamic participation, the exact topics for discussion will evolve throughout the course depending on the background and learning trajectories of the participants.
1. The Rise of Computational Social Science
- Session 1 (1/22): A Brave New World (Aron Lindberg)
- Session 2 (2/5): A Pragmatist Approach to Computational-Qualitative Research (Aron Lindberg)*
2. New Data, New Methods
- Session 3 (2/19): Digital Traces (Aron Lindberg)
- Session 4 (3/4): Studying Archives of Online Behavior (James Howison)*
- Session 5 (3/18): Narrative Positivism (Brian Pentland)
- Session 6 (4/1): Causal Effects of Sequential Activity Configurations (Ann Majchrzak)
- Session 7 (4/29): Organizational Genetics (Youngjin Yoo)*
3. Combining Heterogeneous Data & Analyses
- Session 8 (5/27): Grounded Theory meets Inductive Analytics (Natalia Levina)
- Session 9 (6/3): Process Theory (Kalle Lyytinen)
- Session 10 (6/10): Big Data Grounded Theory (Nick Berente)*
- Session 11 (6/28): Seeing the Forest and the Trees (James Gaskin)
- Session 12 (7/1): Configural Analyses (Ola Henfridsson)
4. New Directions
- Session 13 (7/8): Text Mining (Hani Safadi)
- Session 14 (7/22): Quantitative Storytelling*
* Sessions marked with an asterisk (*) will consist of a 2h seminar and a concluding 1h tutoring+training session
Pricing & Scheduling
The seminar will be delivered virtually in 14 biweekly, 3h long sessions on Fridays at 11 AM EST, starting on January 22nd, 2016. The cost of the seminar component is $280 for students and $325 for faculty. If you also want to take part in the more individualized tutoring and training components, there is an additional cost of $200. Doctoral students, pre-doctoral Master's students, post-doctoral scholars, and junior faculty are all welcome. To ensure a high-quality experience, participation will be limited to 16 for the full seminar, and 8 for tutoring on research proposals & training on computational methods. To sign up, send an email to Aron Lindberg with your CV and a short statement of your research interests. Please also specify whether you are interested in only the seminar, at the lower rate, or in all three components (seminar, tutoring, & training). The deadline for applications is November 30th, 2015. Decisions on admittance to the seminar will be made by December 15th, 2015.
Resources
The following resources are essential for all participants:
- Tools
- The R environment for statistical computing. It goes well with RStudio, an open-source IDE.
- A GitHub account is used for version control and collaboration. See comprehensive tutorials here and here.
- Stack Overflow is a Q&A platform with a vibrant community and a lot of answers to typical problems.
- MOOCs on R and Data Analysis
- The Data Scientist’s Toolbox on Coursera gives an overview of R, RStudio, and GitHub.
- R Programming on Coursera gives a good foundation for programming in R: objects, reading data, functions, and loops. DataCamp's introductory and intermediate courses provide similar information, but in-browser and self-paced.
- Data Analysis with R on Udacity is another course for beginners that also includes exploratory data analysis and visualization techniques. For dedicated courses on visualization, see also Coursera and DataCamp.
- Social Network Analysis Labs in R and SoNIA at Stanford.
- A course on Sequence Data Analysis with all materials available here.
- Books
- Advanced R by Hadley Wickham, online.
- An Introduction to Statistical Learning by James et al. (pdf) and expert videos.
- The R Inferno by Patrick Burns (pdf).
- Guide to Programming and Algorithms Using R by Özgür Ergül.
- XML and Web Technologies for Data Sciences with R (Use R!) by Deborah Nolan & Duncan Temple Lang.
- Text Analysis with R for Students of Literature (Quantitative Methods in the Humanities and Social Sciences) by Matthew Jockers.
- Qualitative Comparative Analysis with R: A User's Guide (SpringerBriefs in Political Science) by Alrik Thiem, Adrian Dusa.
- The Art of R Programming: A Tour of Statistical Software Design by Norman Matloff (pdf).
- Data Manipulation with R (Use R!) by Phil Spector.
- Handling and Processing Strings in R by Gaston Sanchez.
- SQL
- SQL Zoo in-browser tutorials and practice for SQL.
- GalaXQL: in-browser interactive SQL tutorial.
- SQL from Stanford is a self-paced online mini course that covers the basics with more courses here.
- Helpful resources online
- R-bloggers is one of the most visited resources about R and news from the industry.
- DataTau - data science online community.
- Codecademy HTML & CSS: free in-browser course.
- Regular expressions: directory, interactive exercises, online testers: regexr, regex101.
- Podcasts
- @RLangTip offers daily R tips.
- @R_Programming tweets R learning tips from rstatistics.net