Google Data Analytics
Fill Handle
A box in the lower-right-hand corner of a selected spreadsheet cell that can be dragged through neighboring cells in order to continue an instruction
Cell Reference
A cell or a range of cells in a worksheet typically used in formulas and functions
Attribute
A characteristic or quality of data used to label a column in a table
Pivot Chart
A chart created from the fields in a pivot table
Dataset
A collection of data that can be manipulated or analyzed as one unit
Data
A collection of facts
Record
A collection of related data in a data table, usually synonymous with row
Range
A collection of two or more cells in a spreadsheet
Bias
A conscious or subconscious preference in favor of or against a person, group of people, or thing
Bad Data Source
A data source that is not reliable, original, comprehensive, current, and cited (ROCCC) (Refer to Good data source)
Good Data Source
A data source that is reliable, original, comprehensive, current, and cited (ROCCC) (Refer to Bad data source)
Pivot Table
A data summarization tool used to sort, reorganize, group, count, total, or average data
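As a rough illustration of the grouping and totaling a pivot table performs, the sketch below sums units by region and product in plain Python. The sales rows and the `pivot_sum` helper are hypothetical, made up for this example.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, units sold).
sales = [
    ("East", "Pens", 10),
    ("East", "Paper", 5),
    ("West", "Pens", 7),
    ("East", "Pens", 3),
]

def pivot_sum(rows):
    """Group rows by (region, product) and total the units in each group."""
    totals = defaultdict(int)
    for region, product, units in rows:
        totals[(region, product)] += units
    return dict(totals)

summary = pivot_sum(sales)
print(summary[("East", "Pens")])  # prints 13
```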
Boolean Data
A data type with only two possible values, usually true or false
Metadata Repository
A database created to store metadata
Normalized Database
A database in which only related data is stored in each table
Relational Database
A database that contains a series of tables that can be connected to form relationships
Long Data
A dataset in which each row is one time point per subject, so each subject has data in multiple rows
Wide Data
A dataset in which every data subject has a single row with multiple columns to hold the values of various attributes of the subject
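The difference between long and wide data may be easier to see with a small reshape. The sketch below turns hypothetical long-format rows (one row per subject per year) into wide-format rows (one row per subject); the field names and the `long_to_wide` helper are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical long data: each subject appears in multiple rows, one per year.
long_rows = [
    {"subject": "A", "year": 2020, "score": 71},
    {"subject": "A", "year": 2021, "score": 75},
    {"subject": "B", "year": 2020, "score": 64},
    {"subject": "B", "year": 2021, "score": 69},
]

def long_to_wide(rows, key, column, value):
    """Collapse long rows into one row per key, with one column per time point."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[key]][f"{value}_{row[column]}"] = row[value]
    return [{key: k, **cols} for k, cols in wide.items()]

wide_rows = long_to_wide(long_rows, "subject", "year", "score")
for row in wide_rows:
    print(row)
```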
Data Science
A field of study that uses raw data to create new ways of modeling and understanding the unknown
Foreign Key
A field within a database table that is a primary key in another table (Refer to Primary key)
Return on Investment (ROI)
A formula that uses the metrics of investment and profit to evaluate the success of an investment
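One common form of the ROI formula divides net profit by the cost of the investment. The glossary does not spell out the exact formula, so treat this form, the `roi` helper, and the numbers below as assumptions.

```python
def roi(net_profit, investment_cost):
    """Return ROI as a fraction (0.25 means 25%)."""
    if investment_cost == 0:
        raise ValueError("investment_cost must be non-zero")
    return net_profit / investment_cost

# Hypothetical numbers: a campaign cost 2,000 and earned 500 in net profit.
print(f"{roi(500, 2000):.0%}")  # prints 25%
```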
DATEDIF Function
A function that calculates the number of days, months, or years between two dates
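For comparison outside a spreadsheet, the day-count case can be sketched with Python's standard `datetime` module. This is not the spreadsheet function itself; the `days_between` helper is a hypothetical name and it covers days only.

```python
from datetime import date

def days_between(start, end):
    """Return the number of whole days from start to end."""
    return (end - start).days

# 2024 is a leap year, so January (31 days) plus February (29 days) gives 60.
print(days_between(date(2024, 1, 1), date(2024, 3, 1)))  # prints 60
```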
MAX
A function that returns the largest numeric value from a range of cells
MIN
A function that returns the smallest numeric value from a range of cells
Data Interoperability
The ability of different systems and organizations to exchange and use one another's data; a key factor leading to the successful use of open data among companies and governments
Metric Goal
A measurable goal set by a company and evaluated using metrics
Gap Analysis
A method for examining and evaluating the current state of a process in order to identify opportunities for improvement in the future
Data Element
A piece of information in a dataset
Function
A preset command that automatically performs a process or task using the data in a spreadsheet
Data Governance
A process for ensuring the formal management of a company's data assets
Algorithm
A process or set of rules followed for a specific task
Cross-Field Validation
A process that ensures certain conditions for multiple data fields are satisfied
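A minimal sketch of cross-field validation, assuming a hypothetical record with two date fields and two percentage fields. The field names and the two rules are invented for illustration; real rules depend on the dataset.

```python
from datetime import date

def cross_field_errors(record):
    """Return a list of rule violations that involve more than one field."""
    errors = []
    # Rule 1: a period cannot end before it starts.
    if record["end_date"] < record["start_date"]:
        errors.append("end_date is before start_date")
    # Rule 2: the two percentage fields must account for the whole.
    if record["pct_online"] + record["pct_in_store"] != 100:
        errors.append("percentages do not sum to 100")
    return errors

good = {"start_date": date(2024, 1, 1), "end_date": date(2024, 1, 31),
        "pct_online": 60, "pct_in_store": 40}
bad = {"start_date": date(2024, 2, 1), "end_date": date(2024, 1, 31),
       "pct_online": 60, "pct_in_store": 30}

print(cross_field_errors(good))  # prints []
print(cross_field_errors(bad))
```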
Hypothesis Testing
A process to determine if a survey or experiment has meaningful results
Fairness
A quality of data analysis that does not create or reinforce bias
Relevant Question
A question that has significance to the problem to be solved
Specific Question
A question that is simple, significant, and focused on a single topic or a few closely related ideas
Time-Bound Question
A question that specifies a timeframe to be studied
Leading Question
A question that steers people toward a certain response
Measurable Question
A question whose answers can be quantified and assessed
Action-Oriented Question
A question whose answers lead to change
Confidence Interval
A range of values that conveys how likely a statistical estimate reflects the population
Query
A request for data or information from a database
Regular Expression (Regex)
A sequence of characters that defines a search pattern; used as a validation rule, it requires the values in a table to match a prescribed pattern
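To show pattern matching in practice, the sketch below checks values against a hypothetical rule using Python's standard `re` module. The product-code format (two uppercase letters, a hyphen, four digits) is an assumption made up for this example.

```python
import re

# Hypothetical rule: two uppercase letters, a hyphen, then exactly four digits.
PRODUCT_CODE = re.compile(r"[A-Z]{2}-\d{4}")

def matches_pattern(value):
    """Return True only if the whole value matches the pattern."""
    return PRODUCT_CODE.fullmatch(value) is not None

values = ["AB-1234", "ab-1234", "AB-12345"]
invalid = [v for v in values if not matches_pattern(v)]
print(invalid)  # prints ['ab-1234', 'AB-12345']
```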
Text Data Type
A sequence of characters and punctuation that contains textual information (Refer to String data type)
String Data Type
A sequence of characters and punctuation that contains textual information (Refer to Text data type)
Formula
A set of instructions used to perform a calculation using the data in a spreadsheet
Field
A single piece of information from a row or column of a spreadsheet; in a data table, typically a column in the table
Metric
A single, quantifiable type of data that is used for measurement
Quantitative Data
A specific and objective measure, such as a number, quantity, or range
COUNT
A spreadsheet function that counts the number of cells in a range that contain numeric values
VLOOKUP
A spreadsheet function that vertically searches for a certain value in a column to return a corresponding piece of information
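The lookup logic can be approximated in Python as follows. This is a sketch of the exact-match case only, not the spreadsheet function: the `vlookup` helper and the employee table are hypothetical, and it returns `None` where a spreadsheet would show an error.

```python
def vlookup(search_key, table, index):
    """Search the first column top to bottom; return the value in the
    1-based column `index` of the first matching row, or None if absent."""
    for row in table:
        if row[0] == search_key:
            return row[index - 1]
    return None

# Hypothetical table: (employee ID, name, department).
employees = [
    (101, "Ana", "Finance"),
    (102, "Ben", "Marketing"),
    (103, "Cho", "Sales"),
]

print(vlookup(102, employees, 3))  # prints Marketing
```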
Report
A static collection of data periodically given to stakeholders
Qualitative Data
A subjective and explanatory measure of a quality or characteristic
Scope of Work (SOW)
An agreed-upon outline of the tasks to be performed during a project
Data Type
An attribute that describes a piece of data based on its values, its programming language, or the operations it can perform
Primary Key
An identifier in a database that references a column in which each value is unique (Refer to Foreign key)
Notebook
An interactive, editable programming environment for creating data reports and showcasing data skills
CSV File
Comma-separated values file; a delimited text file that uses a comma to separate values
Naming Conventions
Consistent guidelines that describe the content, creation date, and version of a file in its name
Metadata
Data about data; in database management, it helps data analysts interpret the contents of the data within a database
Second-Party Data
Data collected by a group directly from its audience and then sold
First-Party Data
Data collected by an individual or group using their own resources
Structured Data
Data organized in a certain format such as rows and columns
Third-Party Data
Data provided by outside sources that did not collect it directly
Open Data
Data that is available to the public
Discrete Data
Data that is counted and has a limited number of values; e.g., the number of students in a class, the number of test questions answered correctly
Continuous Data
Data that is measured and can have almost any numeric value; e.g., height of children, speed of cars
Unstructured Data
Data that is not organized in any easily identifiable manner
External Data
Data that lives and is generated outside of an organization
Internal Data
Data that lives within a company's own systems
Access Control
Features such as password protection, user permissions, and encryption that are used to protect a spreadsheet
Data Design
How information is organized
Sample
In data analytics, a segment of a population that is representative of the entire population
Population
In data analytics, all possible data values in a dataset
Big Data
Large, complex datasets typically involving long periods of time, which enable data analysts to address far-reaching business problems
Descriptive Metadata
Metadata that describes a piece of data and can be used to identify it at a later point in time
Structural Metadata
Metadata that indicates how a piece of data is organized and whether it is part of one or more than one data collection
Administrative Metadata
Metadata that indicates the technical source of a digital asset
Data Range
Numerical values that fall between predefined maximum and minimum values
Sampling Bias
Overrepresenting or underrepresenting certain members of a population as a result of working with a sample that is not representative of the population as a whole
Stakeholders
People who invest time and resources into a project and are interested in its outcome
General Data Protection Regulation of the European Union (GDPR)
A regulation of the European Union governing the collection and use of personal data, created to help protect people and their data
Data Privacy
Preserving a data subject's information any time a data transaction occurs
Data Security
Protecting data from unauthorized access or corruption by adopting safety measures
Ordinal Data
Qualitative data with a set order or scale; e.g., income level, education level, satisfaction
Small Data
Small, specific data points typically involving a short period of time, which are useful for making day-to-day decisions
Data Analyst
Someone who collects, transforms, and organizes data in order to draw conclusions, make predictions, and drive informed decision-making
Technical Mindset
The ability to break things down into smaller steps or pieces and work with them in an orderly and logical way
Data Integrity
The accuracy, completeness, consistency, and trustworthiness of data throughout its lifecycle
Problem Domain
The area of analysis that encompasses every activity affecting or affected by a problem
Transaction Transparency
The aspect of data ethics that presumes all data-processing activities and algorithms should be explainable and understood by the individual who provides the data
Consent
The aspect of data ethics that presumes an individual's right to know how and why their personal data will be used before agreeing to provide it
Ownership
The aspect of data ethics that presumes individuals own the raw data they provide and have primary control over its usage, processing, and sharing
Currency
The aspect of data ethics that presumes individuals should be aware of financial transactions resulting from the use of their personal data and the scale of those transactions
Openness
The aspect of data ethics that promotes the free access, usage, and sharing of data
Observation
The attributes that describe a piece of data contained in a row of a table
Estimated Response Rate
The average number of people who typically complete a survey
Data Analysis
The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making
Data Constraints
The criteria that determine whether a piece of data is clean and valid
Consistency
The degree to which data is repeatable from different points of entry or collection
Validity
The degree to which the data conforms to constraints when it is input, collected, or created
Accuracy
The degree to which the data conforms to the actual entity being measured or described
Completeness
The degree to which the data contains all desired components or measures
Geolocation
The geographical location of a person or device by means of digital information
Data Visualization
The graphical representation of data
Data Strategy
The management of the people, processes, and tools used in data analysis
Margin of Error
The maximum amount that the sample results are expected to differ from those of the actual population
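A worked example may help here. The sketch below uses the standard large-sample approximation for a proportion, z * sqrt(p * (1 - p) / n); the glossary does not give a formula, so this choice, the 1.96 z-score (roughly a 95% confidence level), and the conservative p = 0.5 default are all assumptions.

```python
import math

def margin_of_error(sample_size, z_score=1.96, proportion=0.5):
    """Approximate margin of error for a sample proportion, as a fraction."""
    return z_score * math.sqrt(proportion * (1 - proportion) / sample_size)

# Hypothetical survey with 1,000 respondents.
moe = margin_of_error(1000)
print(f"{moe:.1%}")  # prints 3.1%
```

Under these assumptions, a larger sample gives a smaller margin of error, because the sample size sits in the denominator under the square root.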
Confidence Level
The probability that a sample accurately reflects the greater population
Statistical Power
The probability that a test of significance will recognize an effect that is present
Statistical Significance
The probability that sample results are not due to random chance
Sorting
The process of arranging data into a meaningful order to make it easier to understand, analyze, and visualize
Data Manipulation
The process of changing data to make it more organized and easier to read
Data Transfer
The process of copying data from a storage device to computer memory or from one computer to another
Data-Inspired Decision-Making
The process of exploring different data sources to find out what they have in common
Analytical Thinking
The process of identifying and defining a problem, then solving it by using data in an organized, step-by-step manner
Data Anonymization
The process of protecting people's private or sensitive data by eliminating identifying information
Structured Thinking
The process of recognizing the current problem or situation, organizing available information, revealing gaps and opportunities, and identifying options
Reframing
The process of restating a problem or challenge, then redirecting it toward a potential resolution
Filtering
The process of showing only the data that meets specified criteria while hiding the rest
Data Replication
The process of storing data in multiple locations
A/B Testing
The process of testing two variations of the same web page to determine which page is more successful at attracting user traffic and generating revenue
Business Task
The question or problem data analysis answers for a business
Root Cause
The reason why a problem occurs
Data Analytics
The science of data
SELECT
The section of a query that indicates from which column(s) to extract the data
FROM
The section of a query that indicates from which table(s) to extract the data
WHERE
The section of a query that specifies criteria that the extracted data must meet
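The three sections SELECT, FROM, and WHERE can be seen working together in a single query. The sketch below runs one against an in-memory SQLite database using Python's standard `sqlite3` module; the `customers` table, its columns, and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, total_spent REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ana", "Lima", 250.0), ("Ben", "Oslo", 80.0), ("Cho", "Seoul", 120.5)],
)

query = """
    SELECT name, total_spent   -- which columns to extract
    FROM customers             -- which table to extract them from
    WHERE total_spent > 100    -- criteria the extracted rows must meet
"""
rows = conn.execute(query).fetchall()
conn.close()

for name, total_spent in rows:
    print(name, total_spent)
```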
Data Life Cycle
The sequence of stages that data experiences, which include plan, capture, manage, analyze, archive, and destroy
Data Analysis Process
The six phases of ask, prepare, process, analyze, share, and act whose purpose is to gain insights that drive informed decision-making
Observer Bias
The tendency for different people to observe things differently (Refer to Experimenter bias)
Experimenter Bias
The tendency for different people to observe things differently (Refer to Observer bias)
Interpretation Bias
The tendency to interpret ambiguous situations in a positive or negative way
Confirmation Bias
The tendency to search for or interpret information in a way that confirms pre-existing beliefs
Data Ecosystem
The various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data
Problem Types
The various problems that data analysts encounter, including categorizing things, discovering connections, finding patterns, identifying themes, making predictions, and spotting something unusual
Data Ethics
Well-founded standards of right and wrong that dictate how data is collected, shared, and used
Data Bias
When a preference in favor of or against a person, group of people, or thing systematically skews data analysis results in a certain direction
Redundancy
When the same piece of data is stored in two or more places
Unbiased Sampling
When the sample of the population being measured is representative of the population as a whole
Operator
A symbol that names the operation or calculation to be performed
SMART Methodology
A tool for determining a question's effectiveness based on whether it is specific, measurable, action-oriented, relevant, and time-bound
Data Model
A tool for organizing data elements and how they relate to one another
Dashboard
A tool that monitors live, incoming data
Nominal Data
A type of qualitative data that is categorized without a set order; e.g., country, gender, race, hair color
Unique Value
A value that can't have a duplicate
Schema
A way of describing how something, such as data, is organized
Random Sampling
A way of selecting a sample from a population so that every member of the population has an equal chance of being chosen
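A minimal sketch of simple random sampling without replacement, using Python's standard `random` module. The `simple_random_sample` helper, the customer IDs, and the seed are hypothetical; the seed is only there to make the draw repeatable.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct items, each item equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(list(population), k)

# Hypothetical population: customer IDs 1 through 100.
customer_ids = range(1, 101)
sample = simple_random_sample(customer_ids, 10, seed=42)
print(sample)
```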