data driven test 3
Positioning
Determine how your offering competes with other offerings in the target segment.
Gaining Comfort With The Analytics Gap
1. Bolstering Your Knowledge Base
- Managers may fail to fully understand specifics
- Expanding their analytics vocabulary will ultimately enhance their ability to understand and make decisions
2. Build off Prior Experience
- Experience with analytics builds trust
- Repeated experience improves the ability to frame the right questions
- Experience builds familiarity with the organization's data
3. Create Analytical Options
- Places focus on incremental analytic improvement
- "Keep the train rolling w/ steady deliverables"
4. Capitalize on Domain Knowledge
- Deeper understanding of the business context
- Resulting insight should resonate with managers
5. Recognize the Limitations of the Model
- Reconcile discrepancies in analytical models
Sentiment scoring:
Access the comments associated with a post. Code each comment as positive = 1, negative = -1, or neutral = 0.
Composite score: average the comment-specific scores, then scale the result to between 0 and 100 (0 is negative and 100 is positive).
P/N ratio: the ratio of positive to negative comments.
Distribution: the percentage of positive, negative, and neutral comments.
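The three metrics above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `sentiment_summary` and the sample codes are made up, and the linear mapping of the average from [-1, 1] onto [0, 100] is an assumption, since the notes only say the score is scaled to 0 to 100.

```python
from collections import Counter

def sentiment_summary(codes):
    """codes: one entry per comment, 1 = positive, -1 = negative, 0 = neutral."""
    n = len(codes)
    counts = Counter(codes)
    pos, neg, neu = counts[1], counts[-1], counts[0]
    # Assumed rescaling: the average lies in [-1, 1]; map it linearly to [0, 100].
    # 50 * (sum + n) / n is the same as (average + 1) / 2 * 100.
    composite = 50 * (sum(codes) + n) / n
    # P/N ratio ignores neutral comments; undefined when there are no negatives.
    pn_ratio = pos / neg if neg else None
    distribution = {
        "positive": 100 * pos / n,
        "negative": 100 * neg / n,
        "neutral": 100 * neu / n,
    }
    return {"composite": composite, "pn_ratio": pn_ratio, "distribution": distribution}

# Hypothetical coded comments for one post
codes = [1, 1, 1, -1, 0, 0, 0, 0, 1, -1]
summary = sentiment_summary(codes)
print(summary)
```

Note how the composite score and the P/N ratio each compress the data into one number, while the distribution keeps all three percentages.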
Caution:
Automated analysis results in a large percentage of comments being coded neutral. Comments with both positive and negative sentiments end up classified as neutral. Benchmark and compare against manually coded sentiment, and use the manual coding as input to train the automated analysis algorithm.
Segmentation Variables
Basis variables: the most important. Try to find out why one group is different from another, and uncover the needs, wants, or benefits that each group is seeking from a category. The "why" variable.
Descriptor variables: explain the characteristics of the people in the segment, such as demographics. The "who" variable. Helpful in finding and reaching these segments.
Purchase and other behavioral variables: look at the differences in behavior between segments. The "what" variable. Metrics that capture behavior.
minding the analytics gap: main problem
Business is becoming more data driven as executives increasingly recognize the upside that analytics can bring to their business.
- Issue: many managers cannot make decisions based on these analytics and do not know how to translate them into actionable solutions that will help the company.
- A study of 2,719 business executives, managers, and analytics professionals from organizations around the world shows that the number one problem was translating business analytics into decisions (the problems were not workers' complex modeling skills or data management).
Targeting
Determine which groups to serve and how.
• Evaluate attractiveness in terms of demand levels, opportunities, costs, and fit with the firm's capabilities.
• Determine the level of resources to be allocated.
• Find and reach customers using appropriate marketing activities.
• Determine whether we are able to meet the consumers' needs and wants.
content analysis : product reviews
Clean and prepare the data: remove stop words and brand names. Generate word clouds. This can help with positioning analysis, and can also help build a vocabulary for developing communications.
Segmentation Analysis
Conditions for evaluating the quality of a segmentation analysis:
- Homogeneity/heterogeneity: how close knit (homogeneous) the customers within a segment are, and how much the different segments differ from each other
- Identifiability/accessibility: can you identify who belongs to which segment, and can you access them? Look at the descriptor variables
- Parsimony: is the segmentation cost effective?
Cluster analysis with two variables:
First make a scatter plot. Then examine it visually and group members that are close to each other on the plot into one segment. Ensure sufficient distance between segments. The number of segments is easy to arrive at when close-knit groups are well separated from each other. We may need a more scientific way to validate the segmentation basis and/or to handle objects that defy easy grouping. Euclidean distance is the measure best used for this.
Segmentation
Group customers with similar wants, needs, and responses.
• Customers pay a premium for products that better meet their needs and wants.
• Customers in one group should differ from customers in another group.
• The cost of serving customers in each group must be less than or equal to the prices these customers are willing to pay.
Two methods of cluster analysis :
Hierarchical: look at the data row by row in a stepwise process. Find who is closest to a particular customer and group them together.
Partitioning: reassign objects to clusters until optimality is reached. Starts from a random assignment, then looks at who gets assigned to which cluster.
Consuming, Not Just Producing
An increased amount of data allows:
- A better understanding of the business, customers, and environment
- More sophisticated methods of extracting insights from data
But organizations are not yet using it in a meaningful way. This can be fixed by:
- Training more data scientists with a background in business
- Training managers to be more savvy with data and insights
Managers have a hard time with data due to:
- Burgeoning analytics sophistication: Analytically Challenged, Analytical Practitioners, and Analytical Innovators
- Competing demands for attention: because of the dynamic market, managers do not have time to do everything, so analytics takes a back seat ("financial crisis cancels analytics")
Cluster analysis : many variables:
With three variables you need a 3D plot. With more than three, visualization is difficult or impossible. However, the Euclidean distance idea can be extended to any number of variables.
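As a small sketch of how the distance idea carries over to many variables, the function below computes the Euclidean distance between two customers described by any number of variables. The function name and the sample values are illustrative only.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of variables."""
    if len(a) != len(b):
        raise ValueError("both points must have the same number of variables")
    # Square the difference on each variable, add them up, take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two variables: the familiar distance on a scatter plot
d2 = euclidean_distance([0, 0], [3, 4])

# Four variables: cannot be plotted, but the same formula applies
d4 = euclidean_distance([1, 2, 3, 4], [2, 4, 5, 8])

print(d2, d4)
```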
K means Clustering :
Not hierarchical. Starts with a random partitioning of the data. You need to specify the number of groups you want before clustering. A centroid represents a cluster: the centroid is the mean of each variable used as input for the cluster analysis, taken over the cluster's members (for example, the cluster centroid coordinates for a cluster with n members in a two-dimensional plane). In K-means partitioning you first compute the mean of all the data; that mean becomes the center point. Then divide the data into two clusters and recompute the centroids for the two clusters.
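The notes mention the centroid coordinates without giving the formula. Assuming the usual definition (the mean of each coordinate over the cluster's members), for a cluster with n members at points (x_i, y_i) in a two-dimensional plane it would read:

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i
```

For example, a cluster with members (1, 2), (3, 4), and (5, 9) has its centroid at (3, 5).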
Cluster analysis :
Split the data into groups so that members of a group resemble each other (within-group similarity) and members of different groups differ from each other (between-group dissimilarity).
steps in K means clustering
Step 1. Organize the data: standardize all numeric variables that represent needs, wants, attitudes, and benefits
Step 2. Choose the number of clusters desired
Step 3. Choose the initial centroids arbitrarily
Step 4. Assign each object to a cluster by choosing the centroid that is closest to the object
Step 5. Compute the position of the new cluster centroids by taking the average for each axis/dimension across all objects within a cluster
Step 6. Recompute the distances of each object from the new cluster centroids
Step 7. Reassign each object to the closest centroid
Step 8. Repeat steps 5, 6, and 7 until cluster membership does not change and the within-cluster variance is minimized
Step 9. Find the optimal number of clusters (visually and with a combination of measures, e.g. NbClust in R)
Step 10. Profile each cluster by examining the demographics (and choice or other behaviors, if available) of the members of each cluster
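Steps 1 through 8 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a production implementation: the function names, the toy data, and the choice of the first and fourth rows as initial centroids are all made up for the example, and standardization here uses z-scores with the population standard deviation. Steps 9 and 10 (choosing the number of clusters and profiling) are not covered.

```python
import math

def standardize(rows):
    """Step 1: z-score each column (assumed here: population standard deviation)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]
    return [[(v - m) / s if s else 0.0 for v, m, s in zip(row, means, sds)] for row in rows]

def distance(a, b):
    """Euclidean distance between two objects."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(rows, initial_centroids, max_iter=100):
    # Steps 2 and 3: the number of clusters and the starting centroids
    # are given by the caller through initial_centroids.
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Steps 4, 6, 7: (re)assign each object to its closest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: distance(row, centroids[k]))
            for row in rows
        ]
        # Step 8: stop once cluster membership no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 5: move each centroid to the mean of its members on every dimension.
        for k in range(len(centroids)):
            members = [row for row, lab in zip(rows, labels) if lab == k]
            if members:  # leave an empty cluster's centroid where it is
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical ratings on two variables for six customers
data = [[1, 2], [2, 1], [1, 1], [8, 9], [9, 8], [9, 9]]
scaled = standardize(data)
labels, centroids = k_means(scaled, initial_centroids=[scaled[0], scaled[3]])
print(labels)
```

Passing the initial centroids in explicitly keeps the example repeatable; with random starting points, different runs can end in different clusterings.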
Uncovering Segments
Uses cluster analysis, which depends on having a similarity measure in place (is there a measure that tells us whether two customers are similar to each other?). Works with scaled data (data that has been collected on a scale) or nominal data (for example, when someone answers yes or no to something).
Reducing data:
Variables might overlap, as they could be measuring similar or interrelated constructs. Factor analysis reduces a large set of segmentation variables to a smaller set of independent factors.
Composite scoring
accounts for neutral comments, averaging hides useful information
Content analysis: word clouds
Based on a frequency analysis of the words that appear in posts, blogs, and reviews of products/brands. The size of a word is proportional to the frequency with which it appears. Limitations of word clouds: the information is static, and word combinations and context are ignored. A word cloud gives a qualitative feel only and does not allow further analysis.
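The frequency count behind a word cloud can be sketched as below. Everything specific here is an assumption for illustration: the tiny stop-word list, the brand name "acme", and the two sample reviews are invented, and a real analysis would use a fuller stop-word list.

```python
import re
from collections import Counter

# Illustrative lists only; both are assumptions, not part of the notes.
STOP_WORDS = {"the", "a", "is", "and", "but", "it", "this"}
BRAND_NAMES = {"acme"}

def word_frequencies(reviews, stop_words=STOP_WORDS, brand_names=BRAND_NAMES):
    """Count how often each remaining word appears across all reviews."""
    counts = Counter()
    for text in reviews:
        # Lowercase and keep runs of letters (and apostrophes) as words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stop_words and w not in brand_names)
    return counts

# Hypothetical reviews
reviews = [
    "The Acme battery is great and the screen is great",
    "Battery life is poor but the screen is great",
]
freqs = word_frequencies(reviews)
# In a word cloud, each word would be drawn at a size proportional to its count.
print(freqs.most_common(3))
```

The counts treat every word separately, which is exactly the limitation noted above: "battery life is poor" and "battery is great" both add one to "battery".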
Content analysis depends on
contributors posting comments. Are contributors representative of all the users of a product?
Number of clusters
Elbow plot: helps you specify the number of clusters. It looks at the ratio of within-cluster to between-cluster variance. The best number of clusters is at the elbow of the graph.
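To make the quantity on an elbow plot concrete, the sketch below computes the within-cluster and between-cluster sums of squares for a few candidate groupings. It is only an illustration: the points and the hand-written cluster labels for 1, 2, and 3 clusters are made up (in practice the labels would come from a clustering run), and sums of squares are used as the measure of variance.

```python
def mean_point(points):
    """Mean of each coordinate across the given points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def within_between(points, labels):
    """Return (within-cluster, between-cluster) sums of squares for one grouping."""
    grand_mean = mean_point(points)
    within = 0.0
    between = 0.0
    for k in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == k]
        center = mean_point(members)
        within += sum(squared_distance(p, center) for p in members)
        between += len(members) * squared_distance(center, grand_mean)
    return within, between

# Hypothetical data with two obvious groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
# Hand-assigned labels standing in for clustering results with 1, 2, and 3 clusters
partitions = {
    1: [0, 0, 0, 0, 0, 0],
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 2],
}
results = {k: within_between(points, labels) for k, labels in partitions.items()}
for k, (within, between) in results.items():
    ratio = within / between if between else float("inf")
    print(k, round(within, 3), round(between, 3), ratio)
```

The within-cluster value drops sharply from one cluster to two and only slightly from two to three; that bend is the elbow, pointing to two clusters for this data.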
P/N ratio
ignores neutral comments
Discriminant
Also known as descriptive data (the descriptor variables).
Distribution
most complete information , flexibility to construct your own metric
Hierarchical clustering
Start at the bottommost level and then move up. At the bottom, each point is its own cluster.
Look at the interpoint distances. Use the formula n(n-1)/2 for the number of pairs you have to look at: with N = 10 points, that is 45 pairs. Compute the distances and put them in the distance matrix.
Start with the two points that are closest to each other, then keep moving through the ones that are next closest to each other.
Dendrogram: tells you how many clusters you should look at. It starts at the bottom and then works up. Dendrogram: Ward's method (google it).
Scree plot: compares the sum of squared error for each number of clusters. Each cluster starts with one point, but as you keep combining clusters the sum of squared error goes up. The best number of clusters is at the elbow of the graph.
Two variables, two clusters.
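The first step described above (list every pair, fill the distance matrix, find the closest pair) can be sketched as follows. The ten points are invented for the example, and the sketch stops after locating the first pair to merge; it does not implement Ward's method or build a dendrogram.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Distance for every unordered pair of points: n(n-1)/2 entries."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }

# Ten hypothetical points, each starting as its own cluster
points = [
    (0, 0), (0, 1), (5, 5), (5, 7), (10, 0),
    (10, 3), (20, 20), (24, 20), (30, 5), (35, 5),
]
distances = pairwise_distances(points)

n = len(points)
expected_pairs = n * (n - 1) // 2

# The pair with the smallest distance is the first one to be merged.
closest_pair = min(distances, key=distances.get)
print(len(distances), expected_pairs, closest_pair, distances[closest_pair])
```

After that first merge, the same search is repeated on the reduced set of clusters, which is what the dendrogram records from the bottom up.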
Constant sum method
A higher number indicates greater importance of the variable. Forces respondents to make trade-offs.
segmentation: why, who, what
• Why: differences in needs and wants, lifestyle, attitudes, preferences, decision process, benefits sought...
• Who: characteristics of the members of the segments that are helpful in finding and reaching these segments
  • Age, income, education, gender, media habits, social groups...
• What: metrics that capture manifested behavior and responsiveness to marketing efforts
  • Usage, loyalty, price/promotion sensitivity...