Big Data

Biased sample, messy

2 cons of big data

Intercoder reliability test

Two human coders analyze the same news articles independently; their codes are then compared to measure the level of agreement, which reduces the effect of individual subjectivity
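The comparison step can be sketched in a few lines. The card does not name a statistic, so the two measures below (simple percent agreement and Cohen's kappa) are assumptions chosen because they are common choices; the labels and function names are hypothetical.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Share of items on which both coders chose the same category."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance.

    Undefined (division by zero) when chance agreement equals 1,
    i.e. when both coders use a single identical category throughout.
    """
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: probability both coders pick the same category at random.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels assigned by two coders to the same six articles.
coder_a = ["politics", "sports", "politics", "health", "sports", "politics"]
coder_b = ["politics", "sports", "health", "health", "sports", "politics"]

agreement = percent_agreement(coder_a, coder_b)  # 5 of 6 items match
kappa = cohens_kappa(coder_a, coder_b)
print(round(agreement, 3), round(kappa, 3))
```

Kappa is lower than raw agreement here because some matches would be expected even if both coders labeled at random.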

Analyze user profiles, analyze tweets posted by the same user, analyze network statistics

3 methods of sorting out users

Unsolicited, public opinion of the past, greater flexibility

3 pros of big data

Collect big social data, sort out social media users, analyze big social data

3 steps of big data analysis

Data limitation, biased samples, need to plan ahead, hardware requirements

4 limitations of Twitter's streaming API

Data cleaning, term frequency, word association, word cloud, sentiment analysis

5 components of basic text mining

LDA Analysis

A computer algorithm (latent Dirichlet allocation) reports the topics it detects in tweets

Topic modeling

Treats each document as a mixture of latent topics

Representative sample/census

A smaller subset drawn from a larger population, used for manual content analysis

API

Allows programs to tap into one another, e.g. registering for one service with an account from another

Computational social science

An emerging research field of social scientists and computer scientists

Big Data

Used to analyze news media and public opinion; domain-dependent and ever-evolving; goes beyond the capabilities of traditional tools; used to test and advance social science theories

Dictionary-based analysis

Big social data analysis based on predetermined categories and keywords
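A minimal sketch of the idea, assuming a made-up two-category dictionary: every category and keyword below is hypothetical and stands in for a list that a researcher would fix before looking at the data.

```python
import re
from collections import Counter

# Hypothetical dictionary: category -> keywords decided in advance.
DICTIONARY = {
    "economy": {"jobs", "taxes", "inflation"},
    "health": {"vaccine", "hospital", "insurance"},
}

def code_text(text, dictionary=DICTIONARY):
    """Count how many keyword hits each category gets in one text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for category, keywords in dictionary.items():
        counts[category] = sum(token in keywords for token in tokens)
    return counts

tweet = "Inflation is up and jobs are down, but the vaccine rollout continues."
result = code_text(tweet)
print(dict(result))
```

Exact keyword matching like this misses word variants ("job" vs. "jobs"), which is one reason texts are usually cleaned and stemmed first.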

Supervised machine learning

Classifying documents into known categories (sometimes using topic modeling)

Basic text mining

Data cleaning, term frequency, word association, word cloud, sentiment analysis

Sentiment analysis

Manual content analysis vs. SentiStrength

Bigram

A pair of adjacent words; bigram counts show the most frequent word pairs

Unigram

A single word; unigram counts show the most frequent words
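Unigram and bigram counts can be produced by the same small function. This is a sketch with a hypothetical helper name (`ngram_counts`) and a toy sentence, not a reference to any particular text-mining package.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every run of n adjacent tokens (n=1: unigrams, n=2: bigrams)."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)

# Toy example text, already lowercased and split into words.
tokens = "big data is big and big data is messy".split()

unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)

print(unigrams.most_common(1))  # the single most frequent word
print(bigrams.most_common(2))   # the two most frequent adjacent pairs
```

Each n-gram is stored as a tuple, so a unigram appears as `("big",)` and a bigram as `("big", "data")`.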

Codebook

Pre-defined categories used for manual content analysis

Hand coding

Reading items (e.g. tweets) and deciding their topic by hand

Stemming

Reducing a word to its stem by stripping suffixes (e.g. "running" to "run")

Lemmatization

Reducing inflected forms of a word, such as comparatives and superlatives (even ones that don't look like the root, e.g. "better" to "good"), to the dictionary base form

Data cleaning

Stemming; lemmatization; removing stop words, extra whitespace, and punctuation; converting text to lowercase
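Several of these cleaning steps, followed by a term-frequency count, can be sketched with the standard library alone. The stop-word list is a tiny hypothetical one, and stemming and lemmatization are left out because they normally rely on an external library.

```python
import string
from collections import Counter

# A tiny, hypothetical stop-word list; real projects use a much longer one.
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "of", "to", "in"}

def clean(text):
    """Lowercase, strip punctuation, collapse whitespace, drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()  # split() with no argument also collapses extra spaces
    return [t for t in tokens if t not in STOP_WORDS]

raw = "The  DATA is messy,   and the data are BIG!"
tokens = clean(raw)
term_frequency = Counter(tokens)

print(tokens)
print(term_frequency.most_common(1))
```

Cleaning first means "DATA" and "data" are counted as the same term, which is what makes the frequency count meaningful.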

The core of communication research

To answer our questions

