Information Technology Management Essentials - D075 UNIT 3 Module 4 & 5

Extract:

Once you have determined where your data reside, you can start extracting. The data are often extracted from customer relationship management (CRM) or enterprise resource planning (ERP) systems. One of ERP's main functions is to centralize an organization's data, turning them into a wealth of data with value across the organization.

The data are full of redundancies, which the data analyst needs to remove. Which process can the data analyst use to remove the redundancies?

Normalization Correct! Normalization is the process of removing redundancies in data.

Freedom Rock Bicycles found that it has multiple records due to multiple sales to the same individual. This has caused access problems when looking for specific customers. What can Freedom Rock Bicycles do to alleviate this problem?

Normalize the database Correct! Normalization will reduce redundancy in the database.

Load:

Once data are transformed and normalized, they are ready to be transferred into the data warehouse or data mart. The more often this is done, the more up-to-date analytic reports can be.

Transform:

Once you have extracted data, they need to be normalized. Normalizing data typically means organizing them into the fields and records of a relational database. Normalization also reduces redundancy in storing data, which saves space and helps ensure that data are consistent.
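
As a rough sketch of what normalization can look like in practice (the table and column names below are made up for illustration, not taken from the course material), the SQL moves repeated customer details out of a flat sales table into their own table and links the two by key:

```sql
-- Hypothetical starting point: a denormalized table where customer details
-- repeat on every sale.
--   flat_sales(sale_id, customer_name, customer_email, product, sale_date)

-- 1. Store each customer exactly once.
CREATE TABLE customers (
    customer_id    INTEGER PRIMARY KEY,
    customer_name  TEXT,
    customer_email TEXT UNIQUE
);

INSERT INTO customers (customer_name, customer_email)
SELECT DISTINCT customer_name, customer_email
FROM flat_sales;

-- 2. Store each sale once, pointing back to its customer by key.
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    product     TEXT,
    sale_date   TEXT
);

INSERT INTO sales (sale_id, customer_id, product, sale_date)
SELECT f.sale_id, c.customer_id, f.product, f.sale_date
FROM flat_sales AS f
JOIN customers  AS c ON c.customer_email = f.customer_email;
```

Each customer now appears only once, so repeat buyers like those in the Freedom Rock Bicycles example no longer produce duplicate records.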

Which approach can be used to retrieve this information efficiently?

Online analytical processing (OLAP) Correct! OLAP runs basic queries against the database to determine counts and other elementary relationships in the data.

What is the process of normalization of a database?

Removing redundancies within the database Correct! Removing redundant data improves the accuracy of the data and saves space.

Data can be categorized into three groups:

Structured data
Unstructured data
Semi-structured data

The data warehouse is very costly and time-consuming to manage, and the data analyst notices that the data warehouse only used 20 percent of its capacity during discovery. How can the company reduce the cost and time associated with managing the data warehouse?

Switch to a data mart Correct! A data mart is a smaller and more targeted version of a data warehouse.

A business analyst wants to use the social media data to create and present business intelligence. She will create visualizations that will be used by the executive team. Which tool is appropriate for creating and presenting this business intelligence?

Tableau Correct! Tableau is business intelligence software.

A data analyst wants to search through unstructured data from social media posts to look for useful customer behavior patterns and sentiments. Which type of analytics is appropriate for this task?

Text analytics Correct! Text analytics searches through unstructured text data to look for useful patterns.

Some concepts of data analytics are as follows:

Text analytics
Topic analytics
Data mining

Volume

The main characteristic of big data is that it is big; simply holding and managing that remarkable volume can be an effort in itself. The total volume of big data is growing exponentially year after year.

Data level security

This is where businesses implement processes to protect the actual data from being stolen or tampered with on the database computers. One method of securing the information is database encryption (encrypting the data so that only authorized users can decrypt them).

Online analytical processing (OLAP)

when basic queries are run against the database to gain insight into elementary data relations.
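
As a simple illustration (the sales table and its columns are hypothetical), an OLAP-style query is often nothing more than grouping and counting to expose elementary relationships:

```sql
-- Hypothetical table: sales(sale_id, region, product, amount, sale_date)
-- Count sales and total revenue by region and product.
SELECT region,
       product,
       COUNT(*)    AS number_of_sales,
       SUM(amount) AS total_revenue
FROM sales
GROUP BY region, product
ORDER BY total_revenue DESC;
```

A result like this is also the kind of business intelligence highlighted elsewhere in this set: knowing which products sell best in each region.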

Foreign key

A field in a database table that provides a link between two tables in a relational database.

Primary key

A field in a database table that uniquely identifies a record in the table. For example, every student at a university has a unique student ID number. That would be the primary key in the student table.
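
A minimal SQL sketch of both key types, using the student example (the table layouts are assumptions for illustration): the student ID is the primary key of one table, and a second table carries it as a foreign key to link the two.

```sql
-- Each student is uniquely identified by student_id (the primary key).
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    full_name  TEXT NOT NULL
);

-- Each enrollment links back to exactly one student through a foreign key.
CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY,
    student_id    INTEGER NOT NULL REFERENCES students(student_id),
    course_code   TEXT NOT NULL
);
```

These CREATE TABLE statements are also what the schema describes: the tables, fields, and keys that form the blueprint of the database.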

data mapping

A simple definition of data maps is that they are extra notes about the definition of a field, its data, and its use.

Tables

Are where a database holds data. Tables consist of records (rows) divided into fields (columns) and can be queried (questioned) to produce subsets of information.

big data

Data come from everywhere—including smartphone metadata, internet usage records, social media activity, computer usage records, and countless other data sources—to be sifted for patterns and trends

Which term refers to managing the availability, integrity, and security of an organization's data to ensure that the data remain high quality and valid?

Data governance Correct! Data governance is the management of the availability, integrity, and security of an organization's data to ensure that the data remain high quality and valid.

There are three main levels of security that need implementation to protect database information:

Data level security
System level security
User level security

A data analyst wants to use software to look for useful patterns and hidden relationships in this large set of social media data. Which process can be used to look for these patterns and relationships?

Data mining Correct! Data mining is the process of looking for meaningful patterns in data.

Database Management Systems (DBMS)

Databases are created using these software systems. Data are stored in computer files called tables, and tables are connected to other tables with related information; hence, a relational database.

The following are forms of business analytics:

Descriptive analytics
Predictive analytics
Decision analytics

What can the data analyst use to retrieve and process the data to put into the data warehouse?

ETL Correct! ETL is the process of extracting, transforming, and loading the data into the database.

There are two main big data tool sets used to extract data:

ETL and Hadoop.

Which restriction applies to the data in the primary field of a database?

Each key must be unique. Correct! The key in the primary field of a database must be unique and cannot be repeated.

The data analyst at Freedom Rock Bicycles wants to link two tables in the database. What is the name of the field in a database table that is used to connect or link two tables together?

Foreign key Correct! A foreign key is the field in a database that links two tables together.

Freedom Rock Bicycles is considering adopting a cloud-database system. What is the primary advantage of a cloud database?

It can be accessed from anywhere there is an internet connection. Correct! Cloud databases are not located on a physical server within a corporation.

By segmenting data in a data management system, a business achieves a significant advantage over nonsegmentation. What is that advantage?

It targets the right customers. Correct! Data segmentation segments information into useful categories.

Which valuable business intelligence (BI) can Freedom Rock Bicycles acquire from the database that directly relates to overall revenue and profitability based on this scenario?

Knowing which products sell the best in each region Correct! This is valuable BI for marketing that would increase revenue and profitability.

System level security

This means protecting the hardware that the database resides on and other communications equipment from malicious software that tries to enter the system. Firewalls and other network security systems prevent unauthorized intrusion into the system. Corporations fear a ransomware attack, where their systems are invaded and ransom software is installed on their database systems so that the data cannot be accessed unless a ransom is paid and a key is provided by the attacker.

User level security

User level security starts with logon IDs and passwords but can go much further in verification, restricting the user from visiting unauthorized websites or downloading from untrusted sources. Network administrators, along with corporate policies, define who has access to what and what type of access that can be (read, write, etc.).

Which attribute of big data relates to whether the data are structured or unstructured?

Variety

Velocity

Velocity is the rate at which data arrive and need to be stored or processed over a given time period. Streaming applications such as Amazon Web Services are an example of high-velocity data, not to mention the data your cell phone generates each minute or that satellites generate and stream down for analysis.

There are four general attributes that define big data. They are the 4 Vs.

Volume
Variety
Veracity
Velocity

Master data management, or MDM

a methodology or process used to define, organize, and manage all the data of an organization, providing a reference for decision-making. MDM tools support this by removing duplicates, standardizing data, and incorporating rules to prevent incorrect data from entering the system, thus creating an accurate source of master data.

relational database

a structured database that allows a business to identify and access data in relation to another piece of data within the same database

Querying

a tool that lets you ask your data questions that in turn lead to answers and assist in making decisions
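
For instance, a query can ask "which customers bought a mountain bike this year?" (the tables, columns, and date below are hypothetical):

```sql
-- Hypothetical tables: customers(customer_id, customer_name)
--                      sales(sale_id, customer_id, product, sale_date)
SELECT DISTINCT c.customer_name
FROM customers AS c
JOIN sales     AS s ON s.customer_id = c.customer_id
WHERE s.product   = 'Mountain Bike'
  AND s.sale_date >= '2024-01-01';   -- example date filter
```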

Data management processes

acquiring data, making sure the data are valid, and then storing and processing the data into usable information for a business.

ETL

an acronym for extract, transform, and load; ETL tools are used to standardize data across systems and allow the data to be queried.
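
A heavily simplified ETL step sketched in SQL (the staging and warehouse tables are assumptions, not a specific tool's workflow): data extracted from the source system sit in a staging table, are transformed (trimmed, standardized, de-duplicated), and are then loaded into the warehouse table.

```sql
-- Extract: staging_orders is a raw copy pulled from the source (e.g., a CRM or ERP) system.
-- Transform and Load: clean and standardize the rows while inserting them
-- into the warehouse table.
INSERT INTO warehouse_orders (order_id, customer_email, region, amount)
SELECT DISTINCT                       -- drop exact duplicate rows
       order_id,
       LOWER(TRIM(customer_email)),   -- standardize formatting
       UPPER(TRIM(region)),
       amount
FROM staging_orders
WHERE amount IS NOT NULL;             -- simple validity check
```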

Data analysis

applying statistics and logic techniques to define, illustrate, and evaluate data. Simply stated, data analysis attempts to make sense of an organization's collected data, turn those data into useful information, and validate the organization's future decisions (like what product to sell or whom the organization should hire)

Databases

are well-thought-out collections of computer files, the most important of which are called tables

Business analytics

attempts to make connections between data so organizations can try to predict future trends that may give them a competitive advantage. Business analytics can also uncover computer system inadequacies within an organization.

Predictive analytics

attempts to reveal future patterns in a marketplace, essentially trying to predict the future by looking for data correlations between one thing and any other things that pertain to it.

Decision analytics

builds on predictive analysis to make decisions about future industries and marketplaces. Decision analytics looks at an organization's internal data, analyzes external conditions like supply abundance, and then endorses the best course of action.

data management (DM)

consists of the practices, architectural techniques, and tools for achieving consistent access to and delivery of data across the spectrum of data subject areas and data structure types in the enterprise.

Oracle Data Mining (ODM)

features Structured Query Language (SQL) that can dig data out of big data sets.

Variety

indicates that data come from both structured and unstructured areas and in various forms. Structured data are data that you can easily recognize, such as your electric bill, and this information easily fits into a relational database. But there are also unstructured data that come from personal writings such as Facebook posts and blogs, as well as MRI images, photographs, and data from satellites. This is valuable information that does not fit a form.

Veracity

is akin to the "trustworthiness" of the data. Do the data represent what you believe they should? And, are there discrepancies within the data that must be ferreted out and "scrubbed" to make the data worthwhile and valuable?

Descriptive analytics

is the baseline that other types of analytics are built upon. Descriptive analytics defines past data you already have that can be grouped into significant pieces, like a department's sales results, and also starts to reveal trends. This is categorizing the information.

Semi-structured data

lands somewhere in-between structured and unstructured data. It can possibly be converted into structured data, but not without a lot of work.

Data governance

managing the availability, integrity, and security of the data to ensure that the data remain high quality and valid for data analytics. Policies and procedures are established that define the data governance program, such as who has access, who has update capabilities, when and how backups are made and stored, and who administers the policies to ensure that they are followed

Some of the more common uses of data mining tools are in..

marketing, fraud protection, and surveillance

The data management process.....

obtains, authenticates, stores, and protects data and then processes the data to ensure that they are accessible, reliable, secure, and timely. This knowledge could have kept the mailing campaign from going bad

terabyte

one thousand gigabytes

Yottabytes

one trillion terabytes of data

Tableau

produces interactive data visualization products focused on business intelligence. Much like spreadsheets, it helps with simplifying raw data into different formats that can be understood using graphs, charts, and numerical analysis

Structured data

reside in fixed formats. These data are typically well labeled, often in the traditional fields and records of common data tables. Structured data do not necessarily have to be "table-like," but they need to at least be in a standard format and have recognizable patterns that allow them to be more easily queried and searched.

Data mining

sometimes called data discovery, is the examination of huge sets of data to find patterns and connections and identify outliers. Data mining also provides insight into relationships that the user may not recognize but that are useful as information.
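
Real data mining tools go far beyond a single query, but as a toy illustration of pattern hunting (the order_items table is hypothetical), the self-join below finds pairs of products that are most often bought together:

```sql
-- Hypothetical table: order_items(order_id, product)
SELECT a.product AS product_1,
       b.product AS product_2,
       COUNT(*)  AS times_bought_together
FROM order_items AS a
JOIN order_items AS b
  ON  a.order_id = b.order_id
  AND a.product  < b.product          -- avoid self-pairs and mirrored duplicates
GROUP BY a.product, b.product
ORDER BY times_bought_together DESC
LIMIT 10;
```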

Text analytics

sometimes called text mining, hunts through unstructured text data to look for useful patterns, such as whether an organization's customers on Facebook or Instagram are unsatisfied with its products or services.
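
Dedicated text analytics software is far more sophisticated, but a crude keyword search sketches the idea (the social_posts table and the keywords are made up for illustration):

```sql
-- Hypothetical table: social_posts(post_id, platform, post_text)
-- Count posts per platform that contain simple negative-sentiment keywords.
SELECT platform,
       COUNT(*) AS negative_posts
FROM social_posts
WHERE LOWER(post_text) LIKE '%refund%'
   OR LOWER(post_text) LIKE '%broken%'
   OR LOWER(post_text) LIKE '%disappointed%'
GROUP BY platform;
```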

business intelligence (BI)

strategies and technologies used by enterprises for the data analysis of business information; an assortment of software applications used to analyze an organization's raw data. BI can be described as computer applications that change data into significant, meaningful information that helps organizations make better decisions.

Data mining, sometimes called data discovery

the examination of huge sets of data to find patterns and connections and identify outliers and hidden relationships

Structured Query Language (SQL)

the most widely used standard computer language for relational databases, as it allows a programmer to manipulate and query data. SQL creates infinite sets of data output for reporting, analysis, and, most importantly, helping with tactical and strategic decisions.

schema

the organization or layout of the database that defines the tables, fields, keys, and integrity of the database. It is the reference or the blueprint of the database.

validity checks

the process of ensuring that a concept or construct is acceptable in the context of the process or system that it is to be used in
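
In a relational database, one common way to implement validity checks is with constraints declared on the table itself; the sketch below (hypothetical table) rejects records that cannot be valid in context:

```sql
-- Rows violating these rules are rejected before they ever enter the table.
CREATE TABLE bike_sales (
    sale_id    INTEGER PRIMARY KEY,
    quantity   INTEGER NOT NULL CHECK (quantity > 0),
    unit_price REAL    NOT NULL CHECK (unit_price >= 0),
    sale_date  TEXT    NOT NULL
);
```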

Topic analytics

tries to catalog phrases of an organization's customer feedback into relevant topics. For example, if a customer said, "the barista was friendly," that would be categorized under the topic "Employee Friendliness."

Unstructured data

unorganized data that cannot be easily read or processed by a computer because it is not stored in rows and columns like traditional data tables. Imagine collecting massive numbers of Facebook messages and Instagram posts to determine future fashion trends. Organizations attempt to collect videos, documents, and all manner of fragmented data to eventually make sense of it. A whopping 80% of all data is unstructured!

Datamarts

use data from smaller parts of an organization, like the marketing or purchasing departments. Data marts limit the complexity of databases because they focus on one particular subject of data storage.
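
One way to picture a data mart (the warehouse table and columns here are hypothetical) is as a focused extract of the larger warehouse built for a single department:

```sql
-- Hypothetical warehouse table:
--   warehouse_sales(customer_id, region, product, amount, sale_date)
-- Build a smaller, marketing-focused data mart from it.
CREATE TABLE marketing_mart AS
SELECT customer_id, region, product, amount, sale_date
FROM warehouse_sales
WHERE sale_date >= '2023-01-01';      -- keep only the recent history marketing needs
```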

Hadoop

uses a distributed file system that allows files to be stored on multiple servers and attempts to identify data files on other servers. Hadoop needs a highly qualified data scientist to run it and is best for large companies like Facebook, eBay, and American Express that create terabytes and petabytes of data every day.

qualitative ROI

addresses the intangible value of data loss or a decrease in operating efficiencies.

