Data Visualization
Accessibility Tips for People with CVD
-Use CVD-friendly color combinations.
  -A common combination of CVD-friendly colors includes shades of gray along with blue and orange or blue and red.
-Select different tints of the same color.
  -Using light and dark shades of one color allows an end user with CVD to associate the darker tints and lighter tints with differences in the visualization.
-Explore alternative methods to distinguish data in visualizations.
  -Add text, icons, arrows, annotations, different line widths, or different line styles (such as solid line versus dashed line) to clarify the message in a data visualization.
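The tips above can be combined by giving each data series both a color and a line style, so the series stay distinguishable even when the colors are not. This is a minimal sketch; the palette, style names, and the `assign_series_styles` function are illustrative assumptions and are not tied to any particular charting tool.

```python
# Hypothetical palette and line styles (illustrative names only).
CVD_FRIENDLY_COLORS = ["blue", "orange", "gray"]
LINE_STYLES = ["solid", "dashed", "dotted"]


def assign_series_styles(series_names):
    """Give every series its own color AND its own line style."""
    if len(series_names) > len(CVD_FRIENDLY_COLORS):
        raise ValueError("More series than available color/line-style pairs")
    return {
        name: {"color": color, "line_style": style}
        for name, color, style in zip(series_names, CVD_FRIENDLY_COLORS, LINE_STYLES)
    }


# Hypothetical series from a financial chart.
styles = assign_series_styles(["Revenue", "Expenses", "Net income"])
for name, style in styles.items():
    print(f"{name}: {style['color']}, {style['line_style']} line")
```

Because no two series share a line style, a reader who cannot tell the colors apart can still separate the lines.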
Color Wheel
A color wheel, which is a useful tool for design, shows relationships among colors and represents different combinations a designer can choose. Designers often also use gradients, or shades, of a single color to create minimalist designs. Best practice is to limit the number of colors used in a single visualization to no more than four. Using too many colors can create visual chaos and make it difficult for readers to understand the message the visualization is trying to convey.
Dashboards
A dashboard is a collection of individual visualizations that allows the audience to view multiple pieces of data at once. A company should perform a cost-benefit analysis when planning a large dashboard project to ensure that the value of the insights the dashboard provides is worth the investment of time, talent, and other resources. For accounting professionals, whose work often focuses on financial data captured by an AIS, dashboards are commonly used, and highly desired, for reporting. Examples of dashboard use cases include analyzing financial key performance indicators (KPIs) and more.
Histogram
A histogram is a visual representation of numeric distributions based on user-defined ranges. It graphs the frequency of values occurring in a data set. The defined ranges, called bins, are plotted as bars, where each bar's height is equal to the number of values in the data set that fall within that bin. The width of a bar is equal to the range of the bin, and all the ranges are the same size.
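The binning described above can be sketched in a few lines. This is a minimal illustration, assuming equal-width bins that start at a chosen value; the `histogram_counts` function and the invoice amounts are hypothetical.

```python
from collections import Counter


def histogram_counts(values, bin_width, start=0):
    """Count how many values fall into each equal-width bin.

    Bin i covers the range from start + i * bin_width (inclusive)
    up to start + (i + 1) * bin_width (exclusive).
    """
    if not values:
        return {}
    counts = Counter((v - start) // bin_width for v in values)
    first, last = int(min(counts)), int(max(counts))
    # Include empty bins between the lowest and highest occupied bin.
    return {
        (start + i * bin_width, start + (i + 1) * bin_width): counts.get(i, 0)
        for i in range(first, last + 1)
    }


# Hypothetical invoice amounts.
invoice_amounts = [120, 340, 90, 410, 275, 60, 330, 180]
bins = histogram_counts(invoice_amounts, bin_width=100)

# Print a simple text histogram: one '#' per value in the bin.
for (low, high), count in bins.items():
    print(f"{low:>4} to {high:<4} {'#' * count}")
```

Each printed bar's length plays the role of the bar height in a charted histogram.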
Static Vs Interactive Dashboards
A static dashboard is a still image with no interactivity, such as a JPEG of a Tableau dashboard exported into a PowerPoint presentation. An interactive dashboard is presented within the software used to create it or on a website in such a way that the audience can interact with it. An interactive dashboard has a landing page, which is the first dashboard view when the dashboard is opened. Dashboard views allow users to save specific settings, like filters, for ease of use.
Storyboards
A storyboard is a collection of dashboards, stand-alone visualizations, infographics, and other presentation materials that turn the data analytics and visualization into a business presentation. When we use a storyboard to present data analytics and visualizations, the storyboard follows the story arc, which has five parts.
Color Accessibility
Accessibility is one of the most important considerations when choosing colors, because not all sighted people experience color the same way. Since accounting data is numeric and often financial, it's natural to want to use green to indicate positive numbers or increases and red to indicate negative numbers or decreases, but that message will be lost on someone with color vision deficiency (CVD). CVD is most commonly referred to as color blindness.
Infographics
An infographic is a stand-alone visual that tells a story through graphic design and rarely needs to be accompanied by verbal communication. Effective infographics quickly engage the audience's attention using carefully crafted charts, statistics, quotes, and summarized information that highlight key takeaways. The complexity of infographics ranges from simple, single illustrations to complex larger spreads.
Box Plots
Another distribution visualization, the box plot, illustrates the distribution of quantitative values for a categorical value. Within a box plot visualization, every category has its own box plot diagram, which identifies:
-Minimum: Smallest value in the category
-Maximum: Largest value in the category
-Median: Middle value in the category
-Quartiles: Division of the category values into four sections
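The values a box plot identifies can be computed directly. The sketch below uses Python's standard `statistics.quantiles` with the "inclusive" method, which is one of several quartile conventions (charting tools may use a different one). The `five_number_summary` function and the department data are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return the minimum, quartiles, median, and maximum for one category.

    Requires at least two values, because statistics.quantiles does.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


# Hypothetical expense amounts grouped by department (the categorical value).
expenses_by_department = {
    "Sales": [10, 20, 30, 40, 50],
    "IT": [5, 15, 25, 35, 45],
}

# One summary per category, just as each category gets its own box plot.
summaries = {
    department: five_number_summary(amounts)
    for department, amounts in expenses_by_department.items()
}
for department, summary in summaries.items():
    print(department, summary)
```

The box itself spans `q1` to `q3`, with the median drawn inside it and the minimum and maximum drawn as the bars at either end.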
Purple
Associations: Often appears in products targeting women or children; associated with femininity
Positive: Mystery, Pride, Wisdom
Negative: Impracticality, Corruption, Immaturity
Yellow
Associations: Draws attention, but more subtly than red; used in children's products; highlights important information by drawing attention
Positive: Happiness, Fun, Optimism
Negative: Egotism, Cowardice, Criticism
Blue
Associations: Most used color in design; used by conservative businesses where trust and security are core values
Positive: Loyalty, Integrity, Calmness
Negative: Coldness, Conservatism, Rigidity
Red
Associations: Noticeable and bright; negative business connotations; associated with debt ("in the red")
Positive: Love, Confidence, Power, Energy
Negative: Anger, Aggression, Negativity, Failure
Orange
Associations: Suggests adventure and fun; used in products for children
Positive: Creativity, Enthusiasm, Vibrancy
Negative: Cheapness, Superficiality, Insincerity
Green
Associations: Used in promotion of environmental sustainability; associated with health, food, and wellness; opposite effects from red; indicates wealth and money; used by financial companies and grocery stores
Positive: Dependability, Nurturing, Sustainability, Improvement, Good performance
Negative: Envy, Greed
Box Plots for Anomaly Detection
Box plots are useful for showing details within a distribution, as well as for anomaly detection. The assumption of a box plot is that most of the quantitative values of the category fall inside the box:
-Values closer to the box are more expected.
-The further away the minimum or maximum bars are from the box, the more unexpected those values are.
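One way to make "far from the box" concrete is the widely used 1.5 x IQR rule, where the interquartile range (IQR) is the height of the box. This rule is a common convention rather than something these notes prescribe, and the `find_outliers` function and sample data below are hypothetical.

```python
import statistics


def find_outliers(values, k=1.5):
    """Flag values lying more than k box-heights (IQRs) beyond the box."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1                 # height of the box
    lower_fence = q1 - k * iqr    # values below this are unexpected
    upper_fence = q3 + k * iqr    # values above this are unexpected
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical daily transaction counts with one unusual day.
daily_transactions = [10, 12, 11, 13, 12, 11, 40]
print(find_outliers(daily_transactions))
```

Raising `k` makes the check stricter about what counts as unexpected, and lowering it flags more values.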
Dashboard Characteristics
By compiling different visualizations, a dashboard presents multiple key takeaways supporting the overall story of the dashboard. It's important to make sure the visualizations included in a dashboard have certain characteristics:
-They relate to one another
-Each visualization supports the overall story
-Each visualization has unique and important takeaways
-They are necessary
-They do not distract from the message
Exploratory Analysis
By turning raw data into a visual, we can easily perform flexible and insightful descriptive and diagnostic analytics. In visualization, we call this process of investigating unknown data exploratory analysis.
Techniques for Exploratory Analysis
Data visualization techniques can be used for exploratory analysis to answer a number of questions:
-Composition of data: What variables are included in the data set?
-Comparison of data points: How is one variable performing compared to other variables in the data set?
-Distribution of data: How often does a variable occur in the data set?
-Relationships between data points: How do the different variables in the data set relate to one another?
-Geospatial location of data points: Where are variables geographically located in the data set?
Design Concept
Data visualizations, like all other visual cues and graphics, are based on design concepts. A design concept is the central idea, or theme, that drives a design's meaning and tone. In business, professionals who work with visual cues, including data visualizations, pay careful attention to details to ensure that the different elements of the design work together to tell the appropriate story.
Color Psychology
Different colors convey different emotions. Color psychology is the study of human behavior related to colors. Every color is associated with positive and negative traits, and colors can influence human perception by evoking emotions.
Relationship
Discovering relationships between categorical values in a data set is a common goal of descriptive, diagnostic, and predictive analytics. Relationship techniques range from regression analysis, which identifies the impact of one value on the outcome of another, to forecasting, which predicts what comes next.
Distribution
Distribution techniques are used to determine how often a variable occurs in a data set. This is different from comparison in a bar chart or heat map because distribution is a statistical analysis. A summary of data in a bar chart or heat map doesn't consider the relationship of the data points. In contrast, distribution uses statistical functions to determine the relationships between categorical values and quantitative values.
Sweet Spot of Explanatory Analysis
Explanatory analysis occurs at the sweet spot where data, visuals, and the story intersect.
Explanatory Analysis
Explanatory analysis occurs once the story, or purpose, of a visualization is known and the data is ready to be presented to the audience. During explanatory analysis, visualizations are crafted, the presentation is fine-tuned, and key points are highlighted. When crafting explanatory data visualizations, remember to consider these questions:
-Who is the audience?
-What is the key takeaway of this story?
-How does the data support the key takeaway?
-Which visualization techniques best portray the story?
Geospatial Maps
Geospatial mapping refers to analytics and visualizations that use geographic data, such as map coordinates and GPS data. Business uses for mapping geospatial data include:
-Showing corporate activity, such as sales and expenses, across geographic regions
-Providing insights into distribution of employees, office locations, and customers around the globe
-Adding an easy-to-understand, real-world context to corporate data, such as presenting a map that relates corporate risk to physical locations
Heat Maps
Heat maps are useful for any type of data where quantitative values can be separated into different ranges, or groups. In a heat map:
-Colors are applied from darker to lighter shades, based on quantitative values
-A single color and its shades can be used
-Gradients across multiple colors can be used, such as shades of green, yellow, and red that indicate lower to higher values
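The idea of separating quantitative values into ranges and giving each range a shade can be sketched as follows. This is a minimal illustration that assumes equal-width ranges between the smallest and largest value; the `assign_shades` function, shade names, and sales figures are hypothetical.

```python
def assign_shades(values, shades):
    """Map each value to a shade, ordered from the lowest range to the highest."""
    low, high = min(values), max(values)
    span = high - low
    result = []
    for value in values:
        if span == 0:
            index = 0  # every value is identical, so use a single shade
        else:
            # Scale the value into one of len(shades) equal-width ranges;
            # the maximum value is clamped into the top range.
            index = min(int((value - low) / span * len(shades)), len(shades) - 1)
        result.append(shades[index])
    return result


# Hypothetical regional sales and a single-color scale from light to dark.
regional_sales = [0, 25, 50, 75, 100]
blue_shades = ["lightest", "light", "dark", "darkest"]
print(assign_shades(regional_sales, blue_shades))
```

Passing a list such as green, yellow, and red shades instead of one color's shades gives the multi-color gradient variant.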
Iconography
Iconography is the use of visual images and symbols to represent ideas. In data visualization, icons are symbols that provide visual cues along with data, text, or charts and graphs. An icon directly communicates information without distraction.
Advantages of Icons Over Photos
Icons have an advantage over photographs in that photos can distract the audience from the message of a visualization. For example, photos might:
-Show people wearing outdated clothing
-Depict unconvincing behaviors
-Be dull or overly dramatic
-Lack diversity and accurate representation
Scatter Plots
Important data analytics techniques that depict relationships also include clustering and classification, which group similar categorical values based on similarities in the data set. The best visualization for depicting these types of relationships is a scatter plot, which shows the relationship between two variables. Scatter plots use shapes, such as dots or circles, positioned along the X-axis and Y-axis, which represent the two variables being examined for a relationship.
The Five Parts of a Story Arc
-Introduction: Characters and settings are introduced.
-Rising action: This includes articulating the problem being addressed or the question being answered.
-Climax: The climax shows the results of the analytics that have been performed.
-Falling action: If the presentation has identified something negative, like a problem or risk, the falling action may include suggestions for improvements.
-Resolution: At the end of the presentation, the audience is presented with a powerful call to action.
Pie Charts
One of the most basic composition visualizations is the pie chart, which divides data into groups proportional to their size within the data set. While pie charts are common visualizations, they don't always paint the clearest picture of data. The human brain is not trained to compare angles around a circle. For this reason, it can be difficult to correctly gauge the comparison between slices of a pie chart.
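To see what "proportional to their size" means in practice, the sketch below converts category totals into the percentage and the angle of each slice. The `pie_slices` function and the expense figures are hypothetical.

```python
def pie_slices(totals):
    """Return each category's share of the whole as a percentage and an angle."""
    grand_total = sum(totals.values())
    return {
        category: {
            "percent": amount * 100 / grand_total,
            "degrees": amount * 360 / grand_total,  # a full circle is 360 degrees
        }
        for category, amount in totals.items()
    }


# Hypothetical expense totals by category.
expenses = {"Rent": 50, "Payroll": 30, "Supplies": 20}
slices = pie_slices(expenses)
for category, piece in slices.items():
    print(f"{category}: {piece['percent']:.0f}% of the pie, {piece['degrees']:.0f} degrees")
```

Printing the percentages next to the slices is one way to compensate for how hard the angles are to compare by eye.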
Color
One of the most important decisions in a design concept is choosing colors. Colors draw attention, incite emotional reactions, and determine tone. There are two families of colors:
-Warm colors: Red, orange, and yellow are often associated with joy, energy, and playfulness.
-Cool colors: Green, blue, and purple often evoke feelings of relaxation, calm, and stability.
Line Charts
One of the most popular types of relationships that visualizations can explore is changes over time. A time series captures data that occurs in chronological order across a period of time. Line charts visualize time series analysis, which identifies trends or anomalies in time series data. Line charts are popular because they are simple to understand and can be used for trend analysis, forecasting, and even anomaly detection.
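A trend in time series data is often easier to see after smoothing. The sketch below computes a simple moving average, which is one common smoothing approach and not a method these notes specify. The `moving_average` function and the monthly sales figures are hypothetical.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive points in a time series."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of values")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical monthly sales in chronological order.
monthly_sales = [100, 110, 105, 120, 130, 90]
trend = moving_average(monthly_sales, window=3)
print([round(point, 1) for point in trend])
```

Plotting `trend` as a second line on the same chart as the raw series makes the overall direction stand out from month-to-month noise.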
Tree Maps
One popular substitute for a pie chart is a tree map, which is a mosaic chart that presents groups and subgroups as rectangular portions of the larger whole. A key difference between a tree map and a pie chart is that a tree map can handle multiple groupings of categorical values, such as hierarchies. A tree map is a useful, quick visual reference to a data set. Due to the rectangular shapes, a tree map makes it easy to interpret proportional sizes.
Importance of Visualization Audience
Overlooking the audience when creating a data visualization is a mistake. Data scientists, data analysts, or data engineers may know advanced data analytics techniques, but they may not understand certain principles, such as:
-What a business presentation needs in order to be effective
-How a business stakeholder may interpret a visualization
-The appropriate level of detail for the business need
-The value of white space
Bar Charts
Pie charts and tree maps show the breakdown of categories within a data set as part of the whole, making them ideal for both composition and comparison. In contrast, traditional bar charts present categorical data as rectangular bars with heights or lengths proportional to the quantitative values they represent. Because bar charts display the categories within a data set side by side, they're best used for comparison and aren't ideal for showing composition.
Setting the Tone
Subliminal messages in data visualizations influence end users by conveying emotions. A well-crafted visualization sets a tone the audience members will remember, even if they forget the exact numbers. Setting the tone of a data visualization allows end users to feel the data as well as read it. A great data visualization strikes a balance between providing facts and communicating the desired emotional response from those facts.
Turning Data into a Story
There are many ways to create data visualizations:
-Many information systems, including enterprise resource planning (ERP) systems, have built-in reporting modules that can generate some of the basic visualizations.
-Alternatively, data visualization software, like Tableau or Power BI, can visualize data captured by a system.
Designing for a User
There's an art to creating dashboards that are beautiful, intuitive, and, most importantly, user-friendly. The first step of any data visualization project is knowing your visualization audience, which includes all the end users of your data visualization, and understanding their requests.
Typography
Typography describes the style and appearance of printed matter.
-One aspect of typography concerns the font choice for a visualization.
-Fonts influence the tone of a story, and using a professional font increases the credibility of a visualization with the audience.
-Fonts should be readable, even at small sizes.
White Space
While color is important, so is its absence. A cluttered data visualization distracts the end user from its intended message. White space is negative space that creates a visual pause. It allows end users to process a visualization's message and reduces cognitive load by eliminating unnecessary elements and distractions. Its use is also considered modern and visually appealing.
Stacked Bar Charts
While traditional vertical and horizontal bar charts don't intuitively display composition, there is another type of bar chart that does. Stacked bar charts illustrate parts of a whole within the rectangular bars of a traditional bar chart. They are useful when illustrating major and minor categories within one visualization.
Importance of White Space
White space is important when creating visualizations for many reasons, including:
-Comprehension: Eliminates distracting visual elements, which improves understanding
-Simplicity: Avoids overwhelming the reader
-Focus: Highlights key information by isolating it
Keep in mind that overusing white space can indicate a lack of content, so it's important to find the right balance.
Determining a Visualization's Audience
-Who is the end user? The chief executive officer (CEO) needs a different approach than the internal audit department.
-How will the visualization be used? Design considerations are different when a visualization is going to be used as part of a PowerPoint presentation than when it is used to drill down into data to investigate fraud.
-What are the technical user requirements? End users may need certain filters to analyze a visualization, may have preferences as to the layout of a dashboard, and may even prefer specific colors, depending on the purpose of the visualization.
Data Visualization
A data analytics technique that presents data in a graphical format, such as a chart or graph, for analysis and communication. Creating a high-quality visualization requires certain things of the designer:
-Understanding the audience and their requirements
-Using fundamental design principles to create user-friendly visualizations
-Understanding the data and selecting the appropriate visualization technique
-Putting it all together to develop the visualization