data science - final
How is data arranged in a pivot table?
- All related data is in the same column allowing you to aggregate data by column value - Tableau uses this file format
Sentence level sentiment analysis
- Document may have multiple views - Finer - Issues with questions and sarcastic sentences
Aspect-based sentiment analysis
- Entities with many aspects - ex - choosing an iPhone
What is analytics?
- Extracting information from data - Discovering meaningful patterns
Document-level sentiment analysis
- Simplest - Supervised and Unsupervised learning - Subjective vs objective
How do Hadoop and MapReduce facilitate the analysis of Big Data?
- The Hadoop ecosystem provides the furnishings that turn the framework into a comfortable home for big data activity
In a database table there are...
- columns - rows - characters
Comparative sentiment analysis
- comparing one brand or product to another - ex - Apple vs. Android
What are the benefits of sentiment analysis?
- help your brand create better products/services - understand consumers
How is data stored?
- in a database. - as lots of binary numbers, by magnetism, electronics or optics.
What factors influence the confidence interval of a trend line (i.e., the Tableau exercise)?
- increasing/decreasing sample size
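A rough sketch of why sample size matters, using the simpler case of a margin of error around a sample mean rather than Tableau's actual trend-line calculation (which is not shown in these notes). The function name, the standard deviation of 10, and the z value of 1.96 are illustrative assumptions.

```python
import math

def margin_of_error(std_dev, n, z=1.96):
    """Approximate 95% margin of error for a sample mean (normal approximation)."""
    return z * std_dev / math.sqrt(n)

# Same spread in the data, different sample sizes (hypothetical numbers).
small_sample = margin_of_error(std_dev=10.0, n=25)
large_sample = margin_of_error(std_dev=10.0, n=100)
print(small_sample, large_sample)
```

With everything else held fixed, the larger sample gives the narrower band.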
What is the difference between relational databases and data formatted for Pivot Table analysis?
A relational database is a structured database used to recognize relations among stored items of information, while data formatted for a Pivot Table allows you to extract significance from a large, detailed data set.
What is sentiment analysis?
A technique used to detect favorable and unfavorable opinions toward specific products and services using large numbers of textual data sources
A basic rule of the Pivot Table data structure is that
All values of the same type need to be in one column
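A minimal sketch of why this rule helps, assuming a hypothetical list of sales records (the field names `region`, `product`, and `amount` are made up for illustration). Because every amount sits in one column, a single loop can aggregate by any other column.

```python
from collections import defaultdict

# Hypothetical sales records in "long" form: each kind of value
# (region, product, amount) lives in exactly one column.
rows = [
    {"region": "East", "product": "A", "amount": 10},
    {"region": "East", "product": "B", "amount": 5},
    {"region": "West", "product": "A", "amount": 7},
    {"region": "East", "product": "A", "amount": 3},
]

def pivot_sum(records, row_key, col_key, value_key):
    """Sum value_key for each (row_key, col_key) pair, like a pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for rec in records:
        table[rec[row_key]][rec[col_key]] += rec[value_key]
    return {r: dict(cols) for r, cols in table.items()}

pivot = pivot_sum(rows, "region", "product", "amount")
print(pivot)
```

If the amounts were spread over several columns (one per product, say), the same aggregation would need to know every column name in advance.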
The most detailed type of sentiment analysis is:
Aspect-based sentiment analysis
How is analytics related to Big Data?
Big data analytics is the often complex process of examining large and varied data sets -- or big data -- to uncover information including hidden patterns, unknown correlations, market trends and customer preferences that can help organizations make informed business decisions
What is MapReduce?
Breaks up the task and gives each connected computer a small piece to work on.
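The idea can be sketched on a single machine with a word count, the usual teaching example. This is an illustration of the map, shuffle, and reduce steps only, not Hadoop's actual API; the function names and the two text chunks are assumptions.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine the values for each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical chunks; in a real cluster each would sit on a different machine.
chunks = ["big data is big", "data is everywhere"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

Each call to `map_phase` only needs its own chunk, which is what lets the work be handed out to separate computers.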
What is the most appropriate technique to analyze the statement "Coke tastes better than Pepsi"?
Comparative sentiment analysis
How is data analytics different from Big Data?
Data analytics is generally more focused than big data because instead of gathering huge piles of unstructured data, data analysts have a specific goal in mind and sort through relevant data to look for ways to gain support
According to Bertolucci, the simplest type of analytics is:
Descriptive
What is the difference between descriptive, predictive, and prescriptive analytics?
Descriptive: summarizes past data. Predictive: forecasts what may happen in the future. Prescriptive: helps facilitate decision-making.
According to Wohlsen, Facebook's recent study of its users revealed:
Exposure to fewer positive messages led to fewer positive posts
What is the difference between Hadoop and MapReduce?
Hadoop = storage; MapReduce = analysis
According to Peck, the use of analytics to determine workers' potential is most widely used in
Hourly work, where the jobs are standardized
What can you do to widen or shrink the shaded area (margin of error) of forecasts in a time-trend forecasting analysis?
Increase time
What is the purpose of a database?
Keep track of things with more than one theme
Hadoop is often paired with another piece of software called
MapReduce
What is association mining and where is it used?
Most used: shopping habits. What it is: data analysis to determine patterns in data, e.g., if a shopper buys X, how likely they are to buy Y
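The "buys X, how likely to buy Y" question is usually measured with support and confidence. A minimal sketch, assuming four made-up shopping baskets; the function names are illustrative, not taken from any particular library.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk", "butter"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(x, y, transactions):
    """Estimated probability that a basket containing x also contains y."""
    return support(x | y, transactions) / support(x, transactions)

bread_butter_support = support({"bread", "butter"}, baskets)
bread_to_butter = confidence({"bread"}, {"butter"}, baskets)
print(bread_butter_support, bread_to_butter)
```

Here two of the three bread baskets also contain butter, so the confidence of "bread implies butter" is two thirds.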
According to Feldman, the most common application of sentiment analysis is:
Reviews of consumer products and services
What is NOT a type of unstructured data:
Stock prices
What is Hadoop?
Stores big databases in smaller pieces across a network of connected computers
According to Peck, "people analytics" is
The application of predictive analytics to people's careers
According to Paine, an analysis found that a team's probability of scoring increases as:
They string together more successful passes
How is data arranged in a relational database?
This model organizes data into one or more tables (or "relations") of columns and rows, with a unique key identifying each row. Generally, each table/relation represents one "entity type" (such as customer or product).
How do the tables of data become associated in a relational database?
With the use of unique and foreign keys
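A small sketch using Python's built-in `sqlite3` module shows the mechanism. The `customer` and `purchase` tables and their contents are hypothetical; the point is that `purchase.customer_id` is a foreign key pointing at the unique key `customer.id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE purchase ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id),"
    " item TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO purchase (customer_id, item) VALUES (1, 'laptop')")

# The foreign key lets a join associate rows from the two tables.
joined = conn.execute(
    "SELECT customer.name, purchase.item FROM purchase"
    " JOIN customer ON customer.id = purchase.customer_id"
).fetchall()
print(joined)

# A purchase pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO purchase (customer_id, item) VALUES (99, 'phone')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```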
In a database table, another name for a character is
a byte
In a database table, another name for a column is
a field
What are the uses of association mining?
a procedure which is meant to find frequent patterns, correlations, associations, or causal structures from data sets found in various kinds of databases such as relational databases, transactional databases, and other forms of data repositories.
In a database table, another name for a row is
a record
What is a database?
a self-describing collection of integrated records
What are the uses of forecasting?
a technique that uses historical data as inputs to make informed estimates that are predictive in determining the direction of future trends.
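One simple way to turn historical data into an estimate is a straight-line trend fitted by least squares. This is only a sketch of that one method, with made-up sales figures; forecasting tools typically offer other models too.

```python
def fit_trend(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical historical data: sales for periods 1 through 5.
periods = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_trend(periods, sales)
forecast_period_6 = slope * 6 + intercept  # extend the fitted line one period ahead
print(forecast_period_6)
```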
How does sentiment analysis software determine a positive/negative score for a block of text?
analyzing the words used and their connotation
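A minimal lexicon-based sketch of that idea: each known word carries a score and the scores are summed. The tiny `LEXICON` and the example sentences are assumptions for illustration; real sentiment software uses far larger word lists or trained models.

```python
# Hypothetical word scores: +1 for positive connotation, -1 for negative.
LEXICON = {"great": 1, "love": 1, "good": 1, "bad": -1, "terrible": -1, "hate": -1}

def sentiment_score(text):
    """Sum the scores of known words: above 0 positive, below 0 negative, 0 neutral."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return sum(LEXICON.get(w, 0) for w in words)

positive = sentiment_score("I love this phone, the camera is great!")
negative = sentiment_score("Terrible battery and bad screen.")
sarcastic = sentiment_score("Oh great, it broke again.")
print(positive, negative, sarcastic)
```

The sarcastic sentence still scores as positive because the scorer only sees the word "great", which is one reason word-level scoring can be inaccurate.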
Example of descriptive analytics.
company reports
Example of prescriptive analytics.
decision trees
Hadoop is a platform that:
makes big data easier to manage
Bertolucci claims an ongoing problem with Hadoop for companies is that
managers and executives don't really understand what it does
What are some examples of unstructured data?
often includes text and multimedia content. some examples are email and social media
According to Paine, people have been tracking soccer data for:
over 60 years
What are some examples of structured data?
refers to information with a high degree of organization. some examples are census and federal records.
Explain the difference between structured and unstructured data.
structured data is easily searchable by basic algorithms, whereas unstructured data is more like human language. Unstructured data doesn't fit nicely into relational (SQL) databases, and searching it with the old algorithms ranges from difficult to completely impossible.
In a database, a group of similar records is called
a table or file
How does sentiment analysis work?
the process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude towards a particular topic, product, etc., is positive, negative, or neutral.
Relational databases follow a set of practices called
the rules of normalization
T/F - According to Hurwitz, Hadoop is capable of handling unstructured data:
true
Example of predictive analytics.
weather forecasting
In what situations can sentiment analysis be inaccurate?
wording can be misleading and often misunderstood by computer programs.