Ch.16 Data and Competitive Advantage: Databases, Analytics, AI, and Machine Learning

column or field:

A column in a database table. Columns represent each category of data contained in a record.

Data lake:

A data lake is a central storage repository that holds big data from many sources in a raw, granular format. It can store structured, semi-structured, or unstructured data, which means data can be kept in a more flexible format for future use. When storing data, a data lake associates it with identifiers and metadata tags for faster retrieval.

Data sourcing:

A data source is where the data used to run a report or gain information originates. For a database management system, the source is the database.

Foreign Key:

A field in a related table pointing back to the Primary Key in another table.

Python:

A general-purpose programming language that is also popular for data analytics.

Structured query language (SQL):

A language used to create and manipulate databases

Primary Key:

A primary key is a column (or set of columns) within a relational database table that uniquely identifies each record in the table. For example, the ideal primary key for a table of students would be their ID number, as this would uniquely identify each student in the table. Each table within a database will have its own primary key. The main purpose of designating a primary key is to identify each unique record in a particular table.

R:

A programming language created specifically for analytics, statistical computing, and graphics.

relational database:

A relational database contains tables that relate to one another through relationships. The tables are connected by a common field.

row or record:

A row in a database table. Records represent a single instance of whatever the table keeps track of.

What is Business intelligence (BI)?

A term combining aspects of reporting, data exploration and ad hoc queries, and sophisticated data modeling and analysis.

What is Analytics?

A term describing the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions.

query tools:

A tool to interrogate a data source or multiple sources and return a subset of data, possibly summarized, based on a set of criteria.
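
As a rough illustration of this definition, the sketch below uses Python's built-in sqlite3 module to interrogate a small table and return a summarized subset based on a criterion. The `sales` table, its columns, and its rows are invented for the example.

```python
import sqlite3

# In-memory database with invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('East', 'Coffee', 4.0), ('East', 'Tea', 3.0),
        ('West', 'Coffee', 5.0), ('West', 'Coffee', 4.5);
""")

# Criterion: only coffee sales. Summary: count and revenue per region.
summary = conn.execute("""
    SELECT region, COUNT(*) AS n_sales, SUM(amount) AS revenue
    FROM sales
    WHERE product = 'Coffee'
    GROUP BY region
    ORDER BY region
""").fetchall()

print(summary)
```

The WHERE clause selects the subset and GROUP BY produces the summary; a graphical query tool would typically build a similar query behind a point-and-click interface.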

What is Machine learning?

A type of artificial intelligence that leverages massive amounts of data so that computers can act and improve on their own without additional programming.

serverless computing:

A type of cloud computing where a third-party vendor manages servers, replication, fault tolerance, computing scalability, and certain aspects of security, freeing software developers to focus on building "Business Solutions" and eliminating the need to spend time and resources managing the technology complexity of much of the underlying "IT Solution."

semi-supervised learning:

A type of machine learning where the data used to build models includes explicit classifications, but the system is also free to develop its own additional classifications that may further enhance result accuracy.

graphical query tools:

Allow a user to create a query through a point-and-click or drag-and-drop interface, rather than requiring programming knowledge.

CAPTCHAs:

An acronym standing for completely automated public Turing test to tell computers and humans apart.

Extract:

Bringing in data from a variety of different data sources. Once all of the data is together, it can be "Transformed."

Statistics

Building models; interpreting the strength and validity of results.

Data Warehouses, Data Marts, Data Lakes, and the Technology Behind "Big Data": Data warehouse - A set of databases to support decision-making in an organization. Data handling in a data warehouse takes a long time.

Collects data from many different operational systems

Turing test:

Conceived by Alan Turing, a test of software's ability to exhibit behavior equivalent to, or indistinguishable from, that of a human being.

Data mining is the process of extracting useful information from an accumulation of data, often from a data warehouse or collection of linked data sets.

Data mining tools include powerful statistical, mathematical, and analytics capabilities whose primary purpose is to sift through large sets of data to identify trends, patterns, and relationships to support informed decision-making and planning.

Deep learning:

Deep learning is an artificial intelligence function that imitates the working of the human brain in processing data and creating patterns for use in decision making.

neural networks:

Examines data and hunts down and exposes patterns, in order to build models to exploit findings.

Many to Many- occurs when multiple records in a table are associated with multiple records in another table. This relationship exists between customers and products: customers can purchase various products, and products can be purchased by many customers.

Example: For enrollment in the university, the enrollment office must record which students are enrolled in each class. A class can have many students, and we must also record that students can be enrolled in many courses.
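
One common way to model a many-to-many relationship like the enrollment example is a third "junction" table that holds one row per student-course pair. The sketch below is a minimal version using Python's built-in sqlite3 module; the table names (`student`, `course`, `enrollment`) and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Junction table: each row links one student to one course.
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(student_id),
        course_id INTEGER REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO course VALUES (10, 'Databases'), (20, 'Statistics');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# A class can have many students...
students_in_databases = [row[0] for row in conn.execute("""
    SELECT s.name FROM enrollment AS e
    JOIN student AS s ON s.student_id = e.student_id
    WHERE e.course_id = 10
    ORDER BY s.name
""")]

# ...and a student can be enrolled in many courses.
courses_for_ana = [row[0] for row in conn.execute("""
    SELECT c.title FROM enrollment AS e
    JOIN course AS c ON c.course_id = e.course_id
    WHERE e.student_id = 1
    ORDER BY c.title
""")]

print(students_in_databases, courses_for_ana)
```

The junction table turns one many-to-many relationship into two one-to-many relationships, which is what a relational database can represent directly.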

Understanding How Data is Organized- One to Many- this is the most common type of table relationship. For every record in Table A, there are multiple records in Table B.

Example: One customer may make several purchases, but each purchase is made by a single customer.

dashboards:

Heads-up display of critical indicators that allows managers to get a graphical glance at key performance metrics.

Business knowledge

Helping set system goals and requirements; offering deeper insight into what the data really says about the firm's operating environment.

Data Mastery- How data-powered organizations outperform their competitors

In today's increasingly volatile and competitive business environment, organizations are forced to look to data to hone their strategic and tactical responses. But some organizations are clearly better at making that data work for them.

Data quantity:

Information that can be quantified. It can be counted or measured and given a numerical value—such as length in centimeters or revenue in dollars.

Extract, transform, and load (ETL) is a data pipeline used to collect data from various sources.

It then transforms the data according to business rules
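
The three ETL stages can be sketched as three small functions. This is only an illustrative outline using Python's standard library: the two in-memory "sources," the `sales` table, and the business rules (store amounts as integer cents, drop rows with no amount) are all assumptions invented for the example.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources in different formats (invented sample data).
CSV_SOURCE = "sale_id,amount_usd\n1,19.99\n2,5.00\n"
JSON_SOURCE = '[{"id": 3, "amount_cents": 1250}, {"id": 4, "amount_cents": null}]'

def extract():
    """Extract: pull raw rows from each source as-is."""
    csv_rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    json_rows = json.loads(JSON_SOURCE)
    return csv_rows, json_rows

def transform(csv_rows, json_rows):
    """Transform: apply the assumed rules (one schema, integer cents, no missing amounts)."""
    cleaned = [(int(r["sale_id"]), round(float(r["amount_usd"]) * 100)) for r in csv_rows]
    cleaned += [(r["id"], r["amount_cents"]) for r in json_rows
                if r["amount_cents"] is not None]
    return cleaned

def load(rows, conn):
    """Load: write the curated rows into the destination data store."""
    conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount_cents INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(transform(*extract()), warehouse)
total_cents = warehouse.execute("SELECT SUM(amount_cents) FROM sales").fetchone()[0]
print(total_cents)
```

Note that the destination (a SQLite table) is a different kind of store than either source, and one source row is dropped during cleaning.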

Database administrator (DBA):

Job title focused on directing, performing, or overseeing activities associated with a database or set of databases.

expert systems:

Leverages rules or examples to perform a task in a way that mimics applied human expertise.

table or file:

List of data, arranged in columns or fields and rows or records

genetic algorithms:

Model building techniques where computers examine many potential solutions to a problem.

Understanding How Data is Organized: Database - Single table or a collection of related tables.

Most organizations have several databases, perhaps even hundreds or thousands. And these various databases might be focused on any combination of functional areas.

OCR:

Optical Character Recognition. Software that can scan images and identify text within them.

data cloud:

Over 400 million Software as a Service (SaaS) datasets remained siloed globally, isolated in cloud data storage and on-premise data centers. The Data Cloud eliminates these silos, allowing you to seamlessly unify, analyze, share, and even monetize your data.

canned reports:

Provide regular summaries of information in a predetermined format.

ad hoc reporting tools:

Puts users in control so that they can create custom reports on an as-needed basis by selecting fields, ranges, summary conditions, and other parameters.

Data quality:

Quality data is useful data. To be of high quality, data must be consistent and unambiguous.

Database management systems (DBMS):

Software for creating, maintaining, and manipulating data (also known as database software)

Self-supervised learning:

Sometimes called unsupervised learning, where systems build pattern-recognizing algorithms using data that has not been pre-classified.

Supervised learning:

Supervised learning helps organizations solve for a variety of real-world problems at scale, such as classifying spam in a separate folder from your inbox.
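
To make the spam example concrete, here is a toy sketch of supervised learning: a model is built from examples that already carry labels, then used to classify new messages. The classifier below is a deliberately simple word-count vote (not a production technique such as naive Bayes), and the training messages are invented.

```python
from collections import Counter

# Labeled training examples (invented): the labels are what make this "supervised."
labeled = [
    ("win free prize now", "spam"),
    ("free money claim prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday?", "ham"),
]

def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(text, counts):
    """Label a message by whichever class its words were seen under more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["ham"][w] for w in words)
    return "spam" if spam_score > ham_score else "ham"

model = train(labeled)
print(classify("claim your free prize", model))
print(classify("agenda for lunch meeting", model))
```

Ties default to "ham" in this sketch, a design choice that avoids sending a legitimate message to the spam folder when the evidence is balanced.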

loyalty card:

System that provides rewards in exchange for consumers allowing tracking and recording of their activities. Enhances data collection and represents a significant switching cost.

Transaction processing system:

Systems that record a transaction or some form of business-related exchange, such as a cash register sale, ATM withdrawal, or product return.

The Customer table contains data about the customer: Customer ID (primary key), Customer name, Billing address, and Shipping address. In the Customer table, the customer ID is a primary key that uniquely identifies who the customer is in the relational database. No other customer would have the same Customer ID.

The Order table contains transactional information about an order: Order ID (primary key), Customer ID (foreign key), Order date, Shipping date, and Order status. Here, the primary key to identify a specific order is the Order ID. You can connect a customer with an order by using a foreign key to link the customer ID from the Customer table.
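
The Customer and Order tables described above could be sketched with Python's built-in sqlite3 module as follows. The column list follows the flashcards; the sample rows are invented, and the order table is named `orders` here because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        billing_address TEXT,
        shipping_address TEXT
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date TEXT,
        shipping_date TEXT,
        order_status TEXT
    )
""")

# One customer, two orders: a one-to-many relationship (invented rows).
conn.execute("INSERT INTO customer VALUES (1, 'Ada Example', '1 Main St', '1 Main St')")
conn.execute("INSERT INTO orders VALUES (100, 1, '2024-01-05', '2024-01-07', 'shipped')")
conn.execute("INSERT INTO orders VALUES (101, 1, '2024-02-10', NULL, 'pending')")

# Connect each order to its customer through the foreign key.
rows = conn.execute("""
    SELECT c.customer_name, o.order_id, o.order_status
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# An order that points at a nonexistent customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (102, 999, '2024-03-01', NULL, 'pending')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows, orphan_rejected)
```

The rejected insert shows what the foreign key buys: every order is guaranteed to belong to a customer that actually exists in the Customer table.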

Starbucks Rewards program is considered one of the most successful in all of retail.
• It has over 20 million members, and the firm claims that 40 percent of its overall sales are from Starbucks Rewards.
• Nearly three-quarters of all Starbucks app users visit a store at least once a week, with app users 5.6 times more likely to visit every day.

The Starbucks App allows:
• Customers to scan purchases made outside the store.
• Customers to order ahead of time, which decreases line waits and increases customer satisfaction.
• Customers to send Starbucks-spendable love to a friend.
• Linking to Alexa for voice reorder.
• Customers to earn stars by employing gamification.
• Corporate messaging.

What is big data?

The collection, storage, and analysis of extremely large, complex, and often unstructured data sets that can be used by organizations to generate insights that would otherwise be impossible.

data visualization:

The graphical representation of data and information.

Information technology

Understanding how to pull together data; selecting analysis tools.

Big Data has 3 major factors:

Volume, Velocity, Variety

Data relevance:

A level of consistency between the data content and the area of interest of the user.

Machine learning:

A subcategory of artificial intelligence that involves using pattern recognition software to find trends in data, building models that explain the trends or patterns, and then using the models to predict something.
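
The find-a-trend, build-a-model, predict sequence can be shown with about the simplest model there is: a straight line fitted by ordinary least squares. The sketch below uses plain Python; the advertising-spend and sales figures are invented, and real machine learning systems use far richer models and far more data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented historical data: ad spend vs. sales (arbitrary units).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.1, 4.9, 7.2, 8.8]

# Build the model from the observed pattern...
slope, intercept = fit_line(ad_spend, sales)

# ...then use it to predict an unseen case.
predicted = slope * 5.0 + intercept
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```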

Load:

After we have this new view, this new perspective on our data, we load the newly curated data into another data source.

A common problem that organizations face is

how to gather data from multiple sources, in multiple formats. We need to move it to one or more data stores. The destination might not be the same type of data store as the source. Often the format is different, or the data needs to be shaped or cleaned before loading it into its destination.

Data governance:

A principled approach to managing data during its life cycle, from acquisition to use to disposal.

Hadoop:

An open-source framework that is used to efficiently store and process large datasets ranging in size from gigabytes to petabytes of data.

Transform:

The process of decoupling, denormalizing, and combining data that you never had the perspective to put together before. Now you have your own playground to start making new relationships. You might also use a relational database and SQL to do some of the processing.

Data hosting:

The act of storing the data on a stable and accessible web platform.

Data, Analytics, and Competitive Advantage: (T/F) Anyone can acquire technology—but data is considered a defensible source of competitive advantage

true

Data, Analytics, and Competitive Advantage: (T/F) Differentiation will be the key in distinguishing operationally effective data use from those efforts that can yield true strategic positioning.

true

Data, Information, and Knowledge- (T/F): When this information can be combined with a manager's knowledge (their insight from experience and expertise), stronger decisions can be made.

true

data mart- Database or databases focused on addressing the concerns of a specific problem or business unit.

• By reducing the volume of data, a data mart helps to improve user response time and offers quick access to frequently used data.
• It is easy to implement with much less cost, as compared to implementing a full data warehouse.
• It is scalable and agile, which comes in handy when changing models.
• Marts and warehouses may contain huge volumes of data.

The Internet also allows for easy access to data that had been public but is otherwise difficult to access:

• Consider home sale prices and home value assessments.

Benefits from data mastery:

• Data leverage lies at the center of competitive advantage in many of the firms that we've studied, including Amazon, Netflix, and Zara.
• In many organizations data lies dormant, spread across inconsistent formats and incompatible systems, unable to be turned into anything of value.

There are four primary advantages when using Big Data technologies in a data cloud: flexibility, scalability, cost-effectiveness, and fault tolerance.

• Flexibility: Data lakes can absorb any type of data, structured or not, from any type of source.
• Scalability: These systems can start on a single PC, but thousands of machines can eventually be combined to work together for storage and analysis.
• Cost-effectiveness: May further reduce hardware and management costs.
• Fault tolerance: Big Data storage is designed in such a way that there will be no single point of failure.

The data a firm can leverage is a true strategic asset when it's valuable, rare, imperfectly imitable, and lacking in substitutes:

• If more data brings more accurate modeling, moving early to capture this rare asset can be the difference between a dominating firm and an also-ran.
• Advantages based on formulas, algorithms, and data that others can acquire will be short-lived.

One must be aware of the digital tracking of individuals:

• Made possible by the availability of personal information online.

Data aggregators: Firms that collect and resell data.

• One must be aware of the digital tracking of individuals.
• The Internet also allows for easy access to data that had been public but is otherwise difficult to access.
• There are accuracy concerns.
• Sometimes data aggregators are just plain sloppy, committing errors that can be costly for the firm and potentially devastating for victimized users.

Decision-making is data-driven, fact-based, and enabled by:

• Standardized corporate data.
• Access to third-party data sets through cheap/fast computing and easier-to-use software.

There are accuracy concerns:

• The terror watch-list has had several embarrassing, high-profile errors in its data

