ban exam 1
Why is an effective ETL process essential to data warehousing?
"Dirty data" can result in incorrect or misleading statistics used for decision making.
Briefly describe how a Wi-Fi network operates.
A computer's wireless adapter translates data into a radio signal and transmits it using an antenna. A wireless router receives the signal and decodes it, then sends the information to the Internet over a wired connection.
Which of the following statements is NOT true about a mainframe computer?
A single computer with a single user is typical of a mainframe computer.
In what ways can a BI and analytics system be useful to a grocery store or pharmacy?
A BI and analytics system could help a grocery store track what customers buy and decide where to place items so that customers are more likely to purchase them. For a pharmacy, BI and analytics could be used to record and store data about customers' medications, including which prescriptions they have, and to flag when customers are due for a refill.
All of the following are examples of someone using personal productivity software EXCEPT _________.
Callie entering a customer's order into a restaurant's ordering system
Briefly describe the hierarchy of data.
Data is generally organized in a hierarchy that begins with the smallest piece of data used by computers (a bit) and progresses through the hierarchy to a database. A bit (a binary digit) represents a circuit that is either on or off. Bits can be organized into units called bytes. A byte is typically eight bits. Each byte represents a character, which is the basic building block of most information. Characters are put together to form a field. A field is typically a name, number, or combination of characters that describes an aspect of a business object or activity. A collection of data fields all related to one object, activity, or individual is called a record. A collection of related records is a file. At the highest level of the data hierarchy is a database, a collection of integrated and related files. Together, bits, characters, fields, records, files, and databases form the hierarchy of data.
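The hierarchy described above can be illustrated with a minimal Python sketch. The customer records, field names, and values are hypothetical, and plain Python containers stand in for real files and databases.

```python
# Illustrative sketch of the data hierarchy; all names and values are made up.

# Character -> byte -> bits: the letter "A" stored as one byte (eight bits).
char = "A"
byte_value = char.encode("ascii")[0]   # integer code for "A"
bits = format(byte_value, "08b")       # eight binary digits as a string

# Field: characters combined to describe one aspect of a business object.
last_name = "Smith"

# Record: related fields describing one individual.
record = {"customer_id": 1, "last_name": last_name, "city": "Denver"}

# File: a collection of related records.
customer_file = [
    record,
    {"customer_id": 2, "last_name": "Lopez", "city": "Austin"},
]

# Database: a collection of integrated and related files.
database = {"customers": customer_file, "orders": []}

print(bits)
print(len(database["customers"]))
```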
Explain entity-relationship (ER) diagrams.
Entity-relationship (ER) diagrams use basic graphical symbols to show the organization of and relationships between data. In other words, ER diagrams show data items in tables (entities) and the ways they are related. ER diagrams help ensure that the relationships among the data entities in a database are correctly structured so that any application programs developed are consistent with business operations and user needs. In addition, ER diagrams can serve as reference documents after a database is in use. If changes are made to the database, ER diagrams help design them.
_____ is a markup language designed to transport and store data on the Web.
Extensible Markup Language (XML)
Which of the following is required to create a traditional data warehouse but NOT a data lake?
Extract Transform Load process
Outline the steps in the Extract Transform Load (ETL) process and explain the purpose of each step.
Extract the data from its original source; transform it by deduplicating it, combining it, and ensuring its quality; then load it into the target database.
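The three steps can be sketched in Python as below. This is a toy illustration, not a real ETL tool: the source rows, the email pattern, and the unique-id constraint are all assumptions chosen for the example. Rejecting badly formed rows during extract and checking constraints during load follow the ETL cards later in this set.

```python
# Hypothetical source rows; field names and values are illustrative only.
import re

source_rows = [
    {"id": "1", "email": "ann@example.com"},
    {"id": "2", "email": "not-an-email"},      # fails the expected pattern
    {"id": "1", "email": "ann@example.com"},   # duplicate
    {"id": "3", "email": "BOB@example.com"},
]

# Assumed "expected pattern" for this example: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def extract(rows):
    """Pull rows from the source, rejecting those that fail the expected pattern."""
    return [r for r in rows if EMAIL_PATTERN.match(r["email"])]

def transform(rows):
    """Standardize values and remove duplicates (keyed on id)."""
    seen, cleaned = set(), []
    for r in rows:
        row = {"id": int(r["id"]), "email": r["email"].lower()}
        if row["id"] not in seen:
            seen.add(row["id"])
            cleaned.append(row)
    return cleaned

def load(rows, target):
    """Write rows to the target, enforcing a simple constraint (unique id)."""
    for r in rows:
        if r["id"] in target:
            raise ValueError(f"duplicate key {r['id']}")
        target[r["id"]] = r
    return target

warehouse = load(transform(extract(source_rows)), {})
print(sorted(warehouse))
```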
_____ is an approach concerned with the efficient and environmentally responsible design, manufacture, operation, and disposal of IS-related products.
Green computing
Which statement about Hadoop is correct?
Hadoop's HDFS divides data into subsets and distributes them onto different servers.
In the data center for Mack's company, one server is dedicated to each application. In contrast, the data center for Jack's company uses virtualized servers. Which statement would you expect to be true?
Jack's company pays for fewer software licenses
_____ is an open-source operating system whose source code is freely available to everyone.
Linux
_____ is an example of a popular general-purpose software suite for personal computer users.
Microsoft Office
Monte's employer provides SaaS applications for its staff to use for their daily job functions. This means that ____.
Monte can sign in to use these applications from any computer or device
A newly discovered entity or attribute can be added to a NoSQL database dynamically because _____.
NoSQL databases do not require a predefined schema
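A rough illustration of what "no predefined schema" means, using a plain Python list of dictionaries as a stand-in for a document store. This is not the API of any real NoSQL product, and the product data is hypothetical.

```python
# Hypothetical document collection modeled as a list of dicts; no schema is declared.
products = [
    {"sku": "A100", "name": "Shirt", "price": 19.99},
]

# A newly discovered attribute ("color") is added simply by including it
# in a new document; existing documents are left untouched.
products.append({"sku": "A101", "name": "Hat", "price": 9.99, "color": "red"})

# Queries must tolerate documents that lack the attribute.
colors = [p.get("color", "unknown") for p in products]
print(colors)
```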
One difference between NoSQL and relational databases is that _____.
NoSQL databases have a greater horizontal scaling capability
_____ are used to grade standardized tests, including the SAT and GMAT tests, and to record votes in elections.
Optical mark recognition (OMR) systems
_____ is a process for driving traffic to a Web site by using techniques that improve the site's ranking in search results.
Search engine optimization
_____ is a special-purpose programming language for accessing and manipulating data stored in a relational database.
Structured Query Language (SQL)
Distinguish between systems software and applications software.
Systems software is the interface between the application software and the computer's hardware. It includes operating systems, utilities, and middleware, and it coordinates the activities and functions of the hardware and other programs throughout the computer system. Applications software consists of programs that help users solve computing problems.
_____ is a communication standard that enables computers to route communications traffic from one network to another as needed.
The Internet Protocol (IP)
Briefly explain the passing of packets over the Internet.
The Internet works by chopping data into chunks called packets. Each packet then moves through the network in a series of hops.
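A simplified sketch of chopping data into packets. The sequence numbers and the reassembly step are assumptions added for illustration, since the card covers only the chopping and hopping. Real IP packets also carry headers with source and destination addresses, which are omitted here.

```python
# Simplified illustration only; not a model of real IP packet headers.
import random

def to_packets(message: bytes, size: int):
    """Chop the message into numbered chunks of at most `size` bytes."""
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def reassemble(packets):
    """Reorder packets by sequence number and join their payloads."""
    return b"".join(payload for _, payload in sorted(packets))

message = b"Hello, Internet!"
packets = to_packets(message, size=5)
random.shuffle(packets)            # packets may take different routes and arrive out of order
print(reassemble(packets) == message)
```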
_____ consists of server and client software, the Hypertext Transfer Protocol (HTTP), standards, and markup languages that combine to deliver information and services over the Internet.
The Web
The graphical representation that summarizes the steps a consumer takes in making the decision to buy your product and become a customer is called _____.
a conversion funnel
Hadoop's two major components are _____.
a data processing component and a distributed file system
New cars come with onboard computer systems that control antilock brakes, air bag deployment, fuel injection, etc. They run operating system software known as ____.
an embedded operating system
During the modeling phase of the CRISP-DM method, the team conducting the data mining project ______.
applies selected modeling techniques
Graph NoSQL databases _____.
are well-suited for analyzing interconnections
A tier 1 or 2 data center would be most appropriate for an organization with which characteristic?
capable of conducting critical operations manually
Which of the following is NOT one of the common purposes of utility programs?
creating spreadsheets
Raw facts such as a social security number or catalog item number for a shirt are known as _____.
data
A _____ is a climate- and access-controlled building or a set of buildings that houses the computer hardware that delivers an organization's data and information services.
data center
A ______ is a collection of instructions and commands used to define and describe data and relationships in a specific database.
data definition language
Melanie's company takes a "store everything" approach to big data, saving all of it in a raw, unaltered form. Only when she needs to analyze some of the data is it extracted from this _____.
data lake
A _____ is a subset of a data warehouse that is used by small- and medium-sized businesses and departments within large companies to support decision making.
data mart
_____ is used to explore large amounts of data for hidden patterns to predict future trends.
data mining
Because Airbnb needs to support a rapid growth in its number of users but prefers to hire a service to organize and configure its information systems infrastructure, it relies on a(n) _____.
database as a service
With _____, the database is stored on a service provider's servers and accessed by the client over a network, typically the Internet.
database as a service
A tier 3 or 4 data center would be most necessary for an organization with which characteristic?
dependent on computers to manage manufacturing operations
Each attribute in a relational database model can be constrained to a range of allowable values called a _____.
domain
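One way a domain might be enforced in practice is a CHECK constraint. The sketch below uses Python's built-in sqlite3 module. The table name, column name, and the 1 to 100 range are arbitrary assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CHECK clause restricts quantity to the domain 1..100 (an arbitrary example range).
conn.execute(
    "CREATE TABLE order_lines ("
    " line_id INTEGER PRIMARY KEY,"
    " quantity INTEGER CHECK (quantity BETWEEN 1 AND 100))"
)
conn.execute("INSERT INTO order_lines (quantity) VALUES (5)")   # inside the domain

try:
    conn.execute("INSERT INTO order_lines (quantity) VALUES (500)")  # outside the domain
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM order_lines").fetchone()[0]
print(rejected, row_count)
conn.close()
```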
What criteria are used by the Uptime Institute to classify data centers into four tiers?
expected annual downtime, fault tolerance, and power outage protection
During which step of the ETL process can data that fails to meet expected patterns or values be rejected to help clean up "dirty data"?
extract
Which component of the data hierarchy is ranked just below the database and represents a collection of similar entities?
file
_____ is the use of a collection of computers, often owned by many people or different organizations, to work in a coordinated manner to solve a common problem.
grid computing
A DaaS arrangement can be especially cost effective for businesses that _____.
have fluctuating needs for database storage capacity
When Peter purchases software, _____.
he is acquiring a license to use it on his computer
A database system that stores the entire database in random access memory is known as a(n) _____.
in-memory database
KDDI Corporation chose to consolidate their servers into a single Oracle SuperCluster running the Oracle TimesTen in-memory database in order to _____.
increase data access rates and efficiency
Which type of end user license is considered a volume license because it allows the licensee to install the software on a certain number of computers?
individual/multiuser
When rules and relationships are set up to organize raw facts, creating value beyond that of those individual facts, this produces _____.
information
Compared with commercially licensed software, open-source software _____.
is available for similar purposes such as CPU operation and database management
A SaaS provider such as Oracle or SAP manages service levels and availability. This is advantageous because _____.
it allows SaaS customers to increase the number of users without expanding their communications capacity
One of the disadvantages of proprietary software is that ______________.
it can take a long time and significant resources to develop the required software features
For the ____ operation, it is required that the two tables have a common data attribute.
join
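A small sketch of a join using Python's built-in sqlite3 module. The customers and orders tables and their contents are hypothetical. The point is that customer_id appears in both tables and serves as the common attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 'Shirt'), (11, 2, 'Hat'), (12, 1, 'Socks');
""")

# customer_id is the common attribute that makes the join possible.
rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
conn.close()
```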
The _____ is the heart of the operating system and controls its most critical processes.
kernel
The component of a computer that provides the CPU with a working storage area for program instructions and data is called the __________.
main memory
An operating system with _____ capabilities allows a user to run more than one program concurrently.
multitasking
Which type of end user license requires that a single copy of the software reside on a file server?
network/multiuser
Mike chooses a DaaS arrangement for his small business because it reduces costs. Why is DaaS less costly than the traditional alternative?
no in-house database installation, maintenance, or monitoring
A type of memory whose contents are not lost if the power is turned off or interrupted is said to be _____.
nonvolatile
Compared with the traditional licensing model in which users purchase and install software, SaaS _____.
offers less expensive upgrades and new releases
Which of the following is NOT a recognized BI and analytics technique?
online transaction processing
What type of software is distributed, typically for free, with the source code also available so that it can be studied, changed, and improved by its users?
open-source
Julian has chosen to use open-source software to help run his small business. He believes it is often more reliable and secure than commercial software because _____.
open-source software bugs are detected and fixed more quickly
When choosing from various types of flat-panel displays, choose a(n) _______ for the lowest power consumption.
organic light-emitting diode (OLED) display
After entering data into a relational database, users can make all of the following basic data manipulations except:
organizing
Letitia's open-source word processing application is not working properly. How can she get help troubleshooting the problem?
post her question to an Internet discussion area
A(n) _____ is a characteristic or set of characteristics in a record that uniquely identifies the record.
primary key
Some people are alarmed that big data applications allow organizations to develop extensive profiles of individuals without their knowledge or consent. This represents which type of concern related to big data?
privacy
Completing an instruction involves two phases—instruction and execution—which are each broken down into two steps for a total of four steps. Which of the following is NOT one of the four steps?
process data
CPU clock speed is the predetermined rate at which the processor _____.
produces a series of electronic pulses
The set of instructions that signal the CPU to perform circuit-switching operations is known as _____.
program code
Mark creates a new relational database table that includes only five of the seven columns in the existing products table. This action is known as ____.
projecting
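Mark's action could look roughly like the following sqlite3 sketch. The seven column names and the sample row are invented for illustration, since the card does not say which columns the products table holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical seven-column products table.
conn.execute("""
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY, name TEXT, category TEXT,
        price REAL, cost REAL, supplier TEXT, weight REAL
    )
""")
conn.execute(
    "INSERT INTO products VALUES (1, 'Shirt', 'Apparel', 19.99, 8.5, 'Acme', 0.2)"
)

# Projecting: keep five of the seven columns, dropping cost and supplier.
conn.execute("""
    CREATE TABLE product_summary AS
    SELECT product_id, name, category, price, weight FROM products
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(product_summary)")]
print(columns)
conn.close()
```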
All of the following are examples of activities performed by an operating system EXCEPT ________.
providing word processing capabilities to users
Because electronic devices are composed of materials such as beryllium, cadmium, lead, mercury, BFRs, selenium, and polyvinyl chloride, one of the three goals of green computing is to _____.
reduce the use of hazardous material
Which of the goals of green computing, if implemented successfully, would be the BEST practice for reducing the hazards posed by e-waste in the United States?
reduce the use of hazardous material
The key challenges associated with big data include the difficulty of locating and deriving value from _____.
relevant data to make decisions
The class of computer systems used by multiple concurrent users offers businesses the potential to increase their processing capability to handle more users, more data, or more transactions in a given period, which is known as _____.
scalability
Which of the following DBMS elements can be represented in a visual diagram or defined using a DDL?
schema
Much of the popular open-source software available is protected by the GNU General Public License. Which of the following is NOT permitted by this type of license?
selling a modified version of the program
Hardware utilization can be improved by logically dividing the resources of a single physical server to create multiple logical servers. This approach is known as _____.
server virtualization
To identify and make predictions about various alternative scenarios, a manager would use _______.
simulation techniques
Which type of end user license allows the program to be installed and used on one CPU that is not accessed by other users over a network?
single-user
Which class of general-purpose computer systems is the least expensive option that meets a wide range of personal computing needs, from data entry to computer-aided design or engineering, and from accessing Internet applications to software development?
single-user nonportable computers
When a business wishes to move away from hosting its own applications, a solution that offers many advantages is to use ______.
software as a service
_____ application software includes a wide range of built-in functions for statistical, financial, logical, database, graphics, and date and time calculations.
spreadsheet
_____ are the most powerful computers with the fastest processing speed and highest performance.
supercomputers
Big data veracity is a measure of _____.
the accuracy, completeness, and currency of the data
During the load phase of the ETL process, _____.
the data is checked against the constraints defined in the database schema
One of the advantages of off-the-shelf software is that ________________.
the initial cost is lower because the software firm can spread the development costs over many customers
One advantage of proprietary software versus off-the-shelf software is that _____.
the software provides a company with a competitive advantage by solving problems in a unique manner
Traditional data centers require huge amounts of energy because _____.
they constantly run powerful air-conditioning systems
A _____ is a low-cost, centrally managed computer with limited capabilities and no internal or external attached drives for data storage.
thin client
Which type of data center offers the highest and most predictable level of performance through redundant hardware, power-related devices, and alternate power sources?
tier 4
3D printing is ________.
used to make solid objects from filaments or powder
One key characteristic of big data is that it is being generated at a rate of 2.5 quintillion bytes per day. This is known as big data's _____.
velocity
_____ are used to support engineering and technical users who perform heavy mathematical computing, computer-assisted design (CAD), video editing, and other applications requiring a high-end processor.
workstations
You would use Query by Example if _____.
you wish to use a visual approach to query building