SIM Chapter 6 (2nd One)


14) List at least three conditions that contribute to data redundancy and inconsistency. Answer: Data redundancy occurs when different divisions, functional areas, and groups in an organization independently collect the same piece of information. Because it is collected and maintained in so many different places, the same data item may have:

1. different meanings in different parts of the organization, 2. different names for the same item, and 3. different descriptions for the same condition. In addition, the fields into which the data are gathered may have different field names, different attributes, or different constraints.

16) A DBMS makes the: A) physical database available for different logical views. B) relational database available for different logical views. C) physical database available for different analytic views. D) relational database available for different analytic views. E) logical database available for different analytic views.

A

30) Which of the following is an automated or manual file that stores information about data elements and data characteristics such as usage, physical representation, ownership, authorization, and security? A) Data dictionary B) Data definition diagram C) Entity-relationship diagram D) Relationship dictionary E) Data table

A

34) The process of streamlining data to minimize redundancy and awkward many-to-many relationships is called: A) normalization. B) data scrubbing. C) data cleansing. D) data defining. E) optimization.

A

50) The term big data refers to all of the following except: A) datasets with fewer than a billion records. B) datasets with unstructured data. C) machine-generated data (e.g., from sensors). D) data created by social media (e.g., tweets, Facebook Likes). E) data from Web traffic.

A

56) All of the following are technologies used to analyze and manage big data except: A) cloud computing. B) NoSQL. C) in-memory computing. D) analytic platforms. E) Hadoop.

A

59) OLAP enables: A) users to obtain online answers to ad-hoc questions in a rapid amount of time. B) users to view both logical and physical views of data. C) programmers to quickly diagram data relationships. D) programmers to normalize data. E) users to quickly generate summary reports.

A

6) Data ________ occurs when the same data is duplicated in multiple files of a database. A) redundancy B) repetition C) independence D) partitions E) discrepancy

A

63) Which of the following enables you to create a script that allows a web server to communicate with a back-end database? A) CGI B) HTML C) Java D) SQL E) NoSQL

A

69) Which of the following would you use to find patterns in user interaction data recorded by a web server? A) Web usage mining B) Web server mining C) Web structure mining D) Web content mining E) Web protocol mining

A

90) An organization's rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying information is called a(n): A) information policy. B) data definition file. C) data quality audit. D) data governance policy. E) data policy.

A

95) In a large organization, which of the following functions would be responsible for policies and procedures for managing internal data resources? A) Data administration B) Database administration C) Information policy administration D) Data auditing E) Database management

A

48) List and describe three main capabilities or tools of a DBMS.

A DBMS includes capabilities and tools for organizing, managing, and accessing the data in the database. Its most important capabilities and tools are data definition, a data dictionary, and a data manipulation language. The data definition capability enables a user to specify the structure of the content of the database. This capability is used to create database tables and to define the characteristics of the fields in each table. The data dictionary is used to store definitions of data elements and their characteristics in the database. In large corporate databases, the data dictionary may capture additional information, such as usage; ownership; authorization; security; and the individuals, business functions, programs, and reports that use each data element. A data manipulation language, such as SQL, is used to add, change, delete, and retrieve the data in the database. This language contains commands that permit end users and programming specialists to extract data from the database to satisfy information requests and develop applications.
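The three capabilities described above can be tried out in a small sketch using Python's built-in sqlite3 module. The supplier table, its fields, and its rows are hypothetical, and SQLite's `PRAGMA table_info` is used only as a rough stand-in for a data dictionary; a corporate data dictionary would hold far more (ownership, authorization, usage).

```python
import sqlite3

# In-memory SQLite database; the supplier table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")

# Data definition: specify the structure of the table and its fields.
conn.execute(
    """
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,
        supplier_name   TEXT NOT NULL,
        supplier_city   TEXT
    )
    """
)

# Data dictionary (approximation): SQLite reports field definitions through
# PRAGMA table_info. Each row is (cid, name, type, notnull, dflt_value, pk).
field_definitions = [
    (name, col_type, bool(pk))
    for _cid, name, col_type, _notnull, _default, pk in conn.execute(
        "PRAGMA table_info(supplier)"
    )
]

# Data manipulation: add, change, delete, and retrieve data with SQL.
conn.execute("INSERT INTO supplier VALUES (8259, 'CBM Inc.', 'Dayton')")
conn.execute("INSERT INTO supplier VALUES (8261, 'B. R. Molds', 'Cleveland')")
conn.execute(
    "UPDATE supplier SET supplier_city = 'Columbus' WHERE supplier_number = 8259"
)
conn.execute("DELETE FROM supplier WHERE supplier_number = 8261")
remaining = conn.execute(
    "SELECT supplier_number, supplier_name, supplier_city FROM supplier"
).fetchall()
```

After the script runs, `field_definitions` lists each field with its type and whether it is the primary key, and `remaining` holds the single row left after the update and delete.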

88) What are the similarities and differences between a data warehouse and a data mart?

A data warehouse stores current and historical data of potential interest throughout a company. Data warehouses gather data from multiple operational systems inside the organization. Data warehouses make data available, but do not allow that information to be altered. Data marts are subsets of data warehouses, in which a highly focused portion of an organization's data is placed in a separate database for specific users. Data marts are decentralized, whereas data warehouses are enterprise-wide, central locations for data.

40) Every record in a file should contain at least one key field. A) TRUE B) FALSE

A) TRUE

43) Complicated groupings of data in a relational database need to be adjusted to eliminate awkward many-to-many relationships. A) TRUE B) FALSE

A) TRUE

44) A physical view shows data as it is actually organized and structured on the data storage media. A) TRUE B) FALSE

A) TRUE

45) DBMS have a data definition capability to specify the structure of the content of the database. A) TRUE B) FALSE

A) TRUE

70) Spanner is an example of a distributed database. A) TRUE B) FALSE

A) TRUE

71) Legacy systems are used to populate and update data warehouses. A) TRUE B) FALSE

A) TRUE

73) You can use OLAP to perform multidimensional data analysis. A) TRUE B) FALSE

A) TRUE

74) OLAP can manage and handle queries with very large sets of data. A) TRUE B) FALSE

A) TRUE

76) Middleware is an application that transfers information from an organization's internal database to a web server for delivery to a user as part of a web page. A) TRUE B) FALSE

A) TRUE

78) You can manipulate data on a web server by using a CGI script. A) TRUE B) FALSE

A) TRUE

79) You can use text mining tools to analyze unstructured data, such as memos and legal cases. A) TRUE B) FALSE

A) TRUE

82) High-speed analytic platforms use both relational and non-relational tools to analyze large datasets. A) TRUE B) FALSE

A) TRUE

96) The term data governance refers to the policies and processes for managing the integrity and security of data in a firm. A) TRUE B) FALSE

A) TRUE

83) You have been hired by a furniture leasing company to implement its first business intelligence systems and infrastructure. To prepare for your initial report, describe the types of data the firm can use to support business intelligence and the systems that you will implement to support both power users and casual users, and explain how these systems or tools work together.

All types of data can be used for their business intelligence systems, including operational, historical, machine-generated, Web/social data, audio and video data, and external data. The large datasets can be collected in a Hadoop cluster and used by an analytic platform to support power user queries, data mining, OLAP, etc. A data warehouse can be used to house all data, including smaller data sets and operational data, and be used to support casual use, for queries, reports, and digital dashboards, as well as support the analytic platforms. Smaller data marts can be created from the data warehouse to enable faster querying and typical queries from casual users.

49) Identify and describe three basic operations used to extract useful sets of data from a relational database.

Answer: The select operation creates a subset consisting of all records (rows) in the table that meet stated criteria. The join operation combines relational tables to provide the user with more information than is available in individual tables. The project operation creates a subset consisting of columns in a table, permitting the user to create new tables that contain only the information required.
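As a minimal sketch, the three operations map onto SQL roughly as follows: select is a WHERE filter on rows, project is a column list, and join combines tables on a shared field. The part and supplier tables and their contents are hypothetical, and Python's sqlite3 module is used only for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: each part points at the supplier that provides it.
conn.executescript(
    """
    CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,
        part_name       TEXT,
        supplier_number INTEGER
    );
    INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Bolt Co');
    INSERT INTO part VALUES
        (137, 'Door latch', 1), (150, 'Door seal', 2), (152, 'Compressor', 1);
    """
)

# Select: keep only the rows (records) that meet the stated criteria.
selected = conn.execute(
    "SELECT * FROM part WHERE part_number IN (137, 150) ORDER BY part_number"
).fetchall()

# Project: keep only the columns that are needed.
projected = conn.execute(
    "SELECT part_name FROM part ORDER BY part_number"
).fetchall()

# Join: combine the two tables on their shared supplier_number field.
joined = conn.execute(
    """
    SELECT p.part_number, p.part_name, s.supplier_name
    FROM part AS p
    JOIN supplier AS s ON s.supplier_number = p.supplier_number
    ORDER BY p.part_number
    """
).fetchall()
```

In practice a single query usually mixes all three: it joins tables, filters rows, and lists only the columns required.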

24) A field identified in a record as holding the unique identifier for that record is called the: A) primary key. B) key field. C) primary field. D) unique ID. E) key attribute.

B

28) The project operation: A) combines relational tables to provide the user with more information than is otherwise available. B) creates a subset consisting of columns in a table. C) organizes elements into segments. D) identifies the table from which the columns will be selected. E) creates a subset consisting of rows in a table.

B

31) Which of the following is a specialized language that programmers use to add and change data in the database? A) Data access language B) Data manipulation language C) Structured query language D) Data definition language

B

33) DBMSs typically include report generating tools in order to: A) retrieve and display data. B) display data in a more structured and polished format than would be possible just by querying. C) display data in graphs. D) perform predictive analysis. E) analyze the database's performance.

B

5) ________ creates confusion that hampers the creation of information systems that integrate data from different sources. A) Batch processing B) Data redundancy C) Data independence D) Online processing E) Data quality

B

55) A data lake is composed of: A) historical data from legacy systems. B) unstructured and structured data that has not been analyzed. C) internal and external data sources. D) historic and current internal data. E) historic external data.

B

57) A household appliances manufacturer has hired you to help analyze its social media datasets to determine which of its refrigerators are seen as the most reliable. Which of the following tools would you use to analyze this data? A) Text mining tools B) Sentiment analysis software C) Web mining technologies D) Data mining software E) Data governance software

B

60) Data mining allows users to: A) quickly compare transaction data gathered over many years. B) find hidden relationships in data. C) obtain online answers to ad-hoc questions in a rapid amount of time. D) summarize massive amounts of data into much smaller, traditional reports. E) access the vast amounts of data in a data warehouse.

B

64) Which of the following is software that handles all application operations between browser-based computers and a company's back-end business applications or databases? A) Database server software B) Application server software C) Web browser software D) Data mining software E) Web server software

B

67) In data mining, which of the following involves events linked over time? A) Associations B) Sequences C) Classifications D) Clustering E) Forecasting

B

9) The fact that a traditional file system cannot respond to unanticipated information requirements in a timely fashion is an example of which of the following issues with traditional file systems? A) Program-data dependence B) Lack of flexibility C) Poor security D) Lack of data sharing E) Data redundancy

B

91) In a large organization, which of the following functions would be responsible for physical database design and maintenance? A) Data administration B) Database administration C) Information policy administration D) Data auditing E) Database management

B

39) The logical and physical views of data are combined into a single view in a DBMS. A) TRUE B) FALSE

B) FALSE

41) NoSQL technologies are used to manage sets of data that don't require the flexibility of tables and relations. A) TRUE B) FALSE

B) FALSE

42) CGI is a DBMS programming language that end users and programmers use to manipulate data in the database. A) TRUE B) FALSE

B) FALSE

46) Relational DBMSs use key field rules to ensure that relationships between coupled tables remain consistent. A) TRUE B) FALSE

B) FALSE
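The rules that keep relationships between coupled tables consistent are usually called referential integrity rules (the term also appears in question 15). A minimal sketch of how a relational DBMS might enforce them, assuming SQLite through Python's sqlite3 module and hypothetical supplier and part tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only after this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT)"
)
conn.execute(
    """
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,
        part_name       TEXT,
        supplier_number INTEGER NOT NULL REFERENCES supplier(supplier_number)
    )
    """
)
conn.execute("INSERT INTO supplier VALUES (1, 'Acme')")
conn.execute("INSERT INTO part VALUES (137, 'Door latch', 1)")


def try_insert_orphan_part():
    """Try to add a part whose supplier does not exist; report whether it was accepted."""
    try:
        conn.execute("INSERT INTO part VALUES (150, 'Door seal', 99)")
    except sqlite3.IntegrityError:
        return False  # rejected: supplier 99 is not in the supplier table
    return True


orphan_accepted = try_insert_orphan_part()
part_count = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]
```

The insert that points at a nonexistent supplier is refused, so the part table never holds a row without a matching supplier.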

80) In a client/server environment, a DBMS is located on a dedicated computer called a web server. A) TRUE B) FALSE

B) FALSE

81) Associations are occurrences linked to multiple events. A) TRUE B) FALSE

B) FALSE

10) Which common database challenge is illustrated by a person receiving multiple copies of an L.L. Bean catalog, each addressed to a slightly different variation of his or her full name? A) Data normalization B) Data accuracy C) Data redundancy D) Data inconsistency E) Data duplication

D

20) Microsoft SQL Server is a(n): A) DBMS for both desktops and mobile devices. B) Internet DBMS. C) desktop relational DBMS. D) DBMS for midrange computers. E) DBMS for mobile devices.

D

26) The select operation: A) combines relational tables to provide the user with more information than is otherwise available. B) creates a subset consisting of columns in a table. C) identifies the table from which the columns will be selected. D) creates a subset consisting of all records in the file that meet stated criteria. E) creates a subset consisting of rows in a table.

D

38) Which of the following is not one of the benefits of a blockchain database? A) It enables the ability to use relational databases. B) It prevents data from being altered retroactively. C) It allows administrators to manage data more effectively. D) It enables firms to create and verify translations on a network very rapidly. E) It provides users with an integrated view of the data.

D

54) You work for a car rental agency and want to determine what characteristics are shared among your most loyal customers. To do this, you will want to use the data mining software you are using to do which of the following? A) Identify associations B) Identify clusters C) Identify sequences D) Classify data E) Create a forecast

D

86) What makes data mining an important business tool? What types of information does data mining produce? In what type of circumstance would you advise a company to use data mining?

Data mining is one of the data analysis tools that helps users make better business decisions and is one of the key tools of business intelligence. Data mining allows users to analyze large amounts of data and find hidden relationships between data that otherwise would not be discovered. For example, data mining might find that a customer who buys product X is ten times more likely to buy product Y than other customers. Data mining finds information such as:
• Associations, or occurrences that are linked to a single event.
• Sequences, events that are linked over time.
• Classification, patterns that describe the group to which an item belongs, found by examining existing items that have been classified and by inferring a set of rules.
• Clusters, unclassified but related groups.
I would advise a company to use data mining when they are looking for new products and services, or when they are looking for new marketing techniques or new markets. Data mining might also be helpful when trying to analyze unanticipated problems with sales whose causes are difficult to identify.
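The "buyers of X are more likely to buy Y" example can be made concrete with a small calculation. The sketch below is not a data mining algorithm, only the ratio such a tool might report for one association. The baskets are invented and `likelihood_ratio` is a hypothetical helper name.

```python
# Hypothetical purchase data: each set holds the products one customer bought.
baskets = [
    {"X", "Y"},
    {"X", "Y"},
    {"X"},
    {"Y"},
    {"Z"},
    {"Z"},
    {"Z"},
    {"Z"},
]


def likelihood_ratio(baskets, antecedent, consequent):
    """How many times more likely buyers of `antecedent` are to buy
    `consequent` than customers who did not buy `antecedent`."""
    buyers = [b for b in baskets if antecedent in b]
    others = [b for b in baskets if antecedent not in b]
    if not buyers or not others:
        raise ValueError("need customers both with and without the antecedent")
    rate_buyers = sum(consequent in b for b in buyers) / len(buyers)
    rate_others = sum(consequent in b for b in others) / len(others)
    if rate_others == 0:
        raise ValueError("no other customer bought the consequent")
    return rate_buyers / rate_others


# 2 of 3 buyers of X also bought Y; 1 of 5 other customers bought Y.
ratio = likelihood_ratio(baskets, "X", "Y")
```

With this toy data the ratio is (2/3) / (1/5), so buyers of X are a little over three times as likely to buy Y as other customers.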

15) Which of the following enables a DBMS to reduce data redundancy and inconsistency? A) Ability to enforce referential integrity B) Ability to couple program and data C) Use of a data dictionary D) Ability to create two-dimensional tables E) Ability to minimize isolated files with repeated data

E

17) The logical view of a database: A) displays the organization and structure of data on the physical storage media. B) includes a digital dashboard. C) allows the creation of supplementary reports. D) enables users to manipulate the logical structure of the database. E) presents data as they would be perceived by end users.

E

29) Microsoft Access's data dictionary displays all of the following information about a field except the: A) size of the field. B) format of the field. C) description of the field. D) type of the field. E) the organization within the organization that is responsible for maintaining the data.

E

93) Data cleansing not only corrects errors but also: A) establishes logical relationships between data. B) structures data. C) normalizes data. D) removes duplicate data. E) enforces consistency among different sets of data.

E

85) Describe the ways in which database technologies could be used by a toy manufacturer to achieve product differentiation.

Product databases could be made available to customers for greater convenience and ordering online. Databases could be used to track customer preferences and to help anticipate customer desires. Sales databases could also help clients such as toy stores anticipate when they would need to re-supply, providing an additional service. Data mining, Web mining, and sentiment analysis of big data could help anticipate trends in sales or other factors to help determine new services and products to sell to clients.

84) Describe the ways in which database technologies could be used by an office stationery supply company to achieve low-cost leadership.

Sales databases could be used to make the supply chain more efficient and minimize warehousing and transportation costs. You can also use sales databases, as well as text mining and sentiment analysis, to determine what supplies are in demand by which customers and whether needs are different in different geographical areas. Business intelligence databases could be used to predict future trends in office supply needs, to help anticipate demand, and to determine the most efficient methods of transportation and delivery.

47) The small publishing company you work for wants to create a new database for storing information about all of their author contracts. What factors will influence how you design the database?

Student answers will vary, but should include some assessment of data quality, business processes and user needs, and relationship to existing IT systems. Key points to include are: ensuring data accuracy when the new data is input; establishing a good data model; determining which data is important and anticipating possible uses for the data beyond looking up contract information; technical difficulties in linking this system to existing systems; new business processes for data input, data handling, and contracts management; determining how end users will use the data; making data definitions consistent with other databases; and deciding what methods to use to cleanse the data.

99) Distributors for a furniture manufacturer are complaining that the bills for goods they order are frequently incorrect and are sometimes sent to the wrong e-mail and postal addresses. What steps would you take to improve the quality of data in the manufacturer's databases?

The first step is to perform a data quality audit, a survey of the accuracy and level of completeness in all the firm's major databases. Once issues are identified, initiate a program for data cleansing to correct data that is incomplete, improperly formatted, redundant, or just plain wrong.
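The two steps, audit then cleanse, might look like the sketch below. The distributor records, the field names, and the deliberately simple e-mail pattern are all assumptions made for illustration. A real audit would cover many more checks and would survey whole files or samples from them.

```python
import re

# Hypothetical distributor records pulled from a billing database.
records = [
    {"name": "Oak & Pine Ltd", "email": "orders@oakpine.example", "postal_code": "10001"},
    {"name": "Oak & Pine Ltd", "email": "orders@oakpine.example", "postal_code": "10001"},
    {"name": "Maple Home", "email": "sales(at)maple.example", "postal_code": "60614"},
    {"name": "Birch Depot", "email": "", "postal_code": "94105"},
]

# Deliberately simple pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def audit(records):
    """Count incomplete, improperly formatted, and redundant records."""
    report = {"incomplete": 0, "bad_format": 0, "redundant": 0}
    seen = set()
    for record in records:
        key = tuple(sorted(record.items()))
        if key in seen:
            report["redundant"] += 1
            continue
        seen.add(key)
        if any(not value for value in record.values()):
            report["incomplete"] += 1
        elif not EMAIL_PATTERN.match(record["email"]):
            report["bad_format"] += 1
    return report


def cleanse(records):
    """Drop exact duplicates and set aside records that need manual correction."""
    clean, needs_review, seen = [], [], set()
    for record in records:
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        if all(record.values()) and EMAIL_PATTERN.match(record["email"]):
            clean.append(record)
        else:
            needs_review.append(record)
    return clean, needs_review


audit_report = audit(records)
clean_records, review_queue = cleanse(records)
```

The audit only measures the problem. The cleansing step removes the duplicate and routes the incomplete and malformed records to someone who can correct them.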

13) You have been asked to design a new contracts database for a small publishing company. What fields do you anticipate needing? Which of these fields might be in use in other databases used by the company?

Author first name, author last name, author address, agent name and address, title of book, book ISBN, date of contract, amount of money, payment schedule, date contract ends. Other databases might be an author database (author names, address, and agent details), a book title database (title and ISBN of book), and financial database (payments made).
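One possible layout for these fields, sketched with Python's sqlite3 module, keeps author and book details in their own tables so the contracts table can refer to them instead of repeating them. The table names, column names, and sample row are hypothetical, and a real design would depend on the company's existing databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the references below

conn.executescript(
    """
    -- Fields likely shared with an existing author database.
    CREATE TABLE author (
        author_id     INTEGER PRIMARY KEY,
        first_name    TEXT NOT NULL,
        last_name     TEXT NOT NULL,
        address       TEXT,
        agent_name    TEXT,
        agent_address TEXT
    );
    -- Fields likely shared with an existing book title database.
    CREATE TABLE book (
        isbn  TEXT PRIMARY KEY,
        title TEXT NOT NULL
    );
    -- Fields specific to the contract itself.
    CREATE TABLE contract (
        contract_id      INTEGER PRIMARY KEY,
        author_id        INTEGER NOT NULL REFERENCES author(author_id),
        isbn             TEXT NOT NULL REFERENCES book(isbn),
        contract_date    TEXT NOT NULL,
        amount           REAL NOT NULL,
        payment_schedule TEXT,
        end_date         TEXT
    );
    """
)

# Hypothetical sample data.
conn.execute(
    "INSERT INTO author VALUES (1, 'Ana', 'Rivera', '1 Main St', 'J. Agent', '2 Side St')"
)
conn.execute("INSERT INTO book VALUES ('000-0-00-000000-0', 'Sample Title')")
conn.execute(
    "INSERT INTO contract VALUES "
    "(1, 1, '000-0-00-000000-0', '2024-01-15', 5000.0, 'quarterly', '2027-01-15')"
)

# Looking up a contract pulls the shared details in through joins.
contract_view = conn.execute(
    """
    SELECT a.last_name, b.title, c.amount
    FROM contract AS c
    JOIN author AS a ON a.author_id = c.author_id
    JOIN book   AS b ON b.isbn = c.isbn
    """
).fetchall()
```

Storing the author's name and the book's title once, and referring to them by key, avoids the redundancy that arises when several databases each keep their own copy.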

1) Which of the following best illustrates the relationship between entities and attributes? A) The entity CUSTOMER with the attribute PRODUCT B) The entity CUSTOMER with the attribute PURCHASE C) The entity PRODUCT with the attribute PURCHASE D) The entity PRODUCT with the attribute CUSTOMER E) The entity PURCHASE with the attribute CUSTOMER

B

2) All of the following are issues with the traditional file environment except: A) data inconsistency. B) inability to develop specialized applications for functional areas. C) lack of flexibility in creating ad-hoc reports. D) poor security. E) data sharing.

B

62) ________ tools are used to analyze large unstructured data sets, such as e-mail, memos, and survey responses to discover patterns and relationships. A) OLAP B) Text mining C) In-memory D) Clustering E) Classification

B

97) Data scrubbing is a more intensive corrective process than data cleansing. A) TRUE B) FALSE

B) FALSE

92) Detecting and correcting data in a database or file that are incorrect, incomplete, improperly formatted, or redundant is called: A) data auditing. B) defragmentation. C) data scrubbing. D) data optimization. E) data normalization.

C

94) Which of the following is not a method for performing a data quality audit? A) Surveying entire data files B) Surveying samples from data files C) Surveying data definition and query files D) Surveying end users about their perceptions of data quality E) Surveying managers about their perceptions of data quality

C

87) What are the differences between data mining and OLAP? When would you advise a company to use OLAP?

Data mining uncovers hidden relationships and is used when you are trying to discover data and new relationships. It is used to answer questions such as: Are there any product sales that are related in time to other product sales? In contrast, OLAP is used to analyze multiple dimensions of data and is used to find answers to complex, but known, questions, such as: What were sales of a product—broken down by month and geographical region, and how did those sales compare to sales forecasts?
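The OLAP question above (sales of a product by month and region, compared with forecast) can be sketched as a small aggregation. This only illustrates the idea of viewing the same facts along different dimensions. The sales figures and the helper names `sales_cube` and `variance_by` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, month, region, actual units, forecast units).
sales_facts = [
    ("bolt", "Jan", "East", 120, 100),
    ("bolt", "Jan", "West", 80, 90),
    ("bolt", "Feb", "East", 130, 125),
    ("bolt", "Feb", "West", 70, 95),
    ("nut", "Jan", "East", 50, 60),
]


def sales_cube(facts, product):
    """Aggregate actual and forecast sales for one product by (month, region)."""
    cube = defaultdict(lambda: {"actual": 0, "forecast": 0})
    for fact_product, month, region, actual, forecast in facts:
        if fact_product == product:
            cell = cube[(month, region)]
            cell["actual"] += actual
            cell["forecast"] += forecast
    return dict(cube)


def variance_by(cube, dimension):
    """Roll the cube up along one dimension (0 = month, 1 = region)
    and return actual minus forecast for each value of that dimension."""
    totals = defaultdict(int)
    for key, cell in cube.items():
        totals[key[dimension]] += cell["actual"] - cell["forecast"]
    return dict(totals)


bolt_cube = sales_cube(sales_facts, "bolt")
by_month = variance_by(bolt_cube, 0)
by_region = variance_by(bolt_cube, 1)
```

The question being asked is known in advance. The tool only makes it quick to slice the same numbers by month, by region, or by both.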

98) List three ways that a business's data can become redundant or inconsistent.

Data redundancy and inconsistency can occur because of (1) employing different names and descriptions for the same entities or attributes; (2) multiple systems feeding a data warehouse; (3) incorrect data entry.

100) What is an information policy and why is it needed in a firm?

An information policy specifies the organization's rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying information. Information policy lays out specific procedures and accountabilities, identifying which users and organizational units can share information, where information can be distributed, and who is responsible for updating and maintaining the information. An information policy is needed in firms because data are an important resource, and you don't want people doing whatever they want with them. You need to have rules on how the data are to be organized and maintained and who is allowed to view the data or change them.

72) Multiple data marts are combined and streamlined to create a data warehouse. A) TRUE B) FALSE

B) FALSE

75) In-memory computing relies primarily on a computer's disk drive for data storage. A) TRUE B) FALSE

B) FALSE

77) Implementing a web interface for an organization's internal database usually requires substantial changes to be made to the database. A) TRUE B) FALSE

B) FALSE

66) In data mining, which of the following involves recognizing patterns that describe the group to which an item belongs by examining existing items and inferring a set of rules? A) Associations B) Sequences C) Classifications D) Clustering E) Forecasting

C

19) A(n) ________ represents data as two-dimensional tables. A) non-relational DBMS B) mobile DBMS C) relational DBMS D) hierarchical DBMS E) object-oriented DBMS

C

25) In a relational database, the three basic operations used to develop useful sets of data are: A) select, project, and where. B) select, join, and where. C) select, project, and join. D) where, from, and join. E) where, find, and select.

C

32) Which of the following is the most prominent data manipulation language today? A) Access B) DB2 C) SQL D) Crystal Reports E) NoSQL

C

35) A schematic of the entire database that describes the relationships in a database is called a(n): A) data dictionary. B) intersection relationship diagram. C) entity-relationship diagram. D) data definition diagram. E) data analysis table.

C

37) You are creating a database to store temperature and wind data from various airports. Which of the following fields is the most likely candidate to use as the basis for a primary key in the Airport table? A) Address B) City C) Airport code D) State E) Day

C

4) A database ___________ describes a database entity. A) byte B) field C) record D) value E) file

C

52) Which of the following is not one of the techniques used in web mining? A) Content mining B) Structure mining C) Server mining D) Usage mining E) Data mining

C

53) You work for a retail clothing chain whose primary outlets are in shopping malls and are conducting an analysis of your customers and their preferences. You wish to find out if there are any particular activities that your customers engage in, or the types of purchases made in the month before or after purchasing select items from your store. To do this, you will want to use the data mining software you are using to do which of the following? A) Identify associations B) Identify clusters C) Identify sequences D) Classify data E) Create a forecast

C

58) A distributed database: A) uses predictive analysis. B) uses SQL. C) is a database that is stored in multiple physical locations. D) is a database that is distributed across many business firms. E) uses Hadoop to process information.

C

61) In the context of data relationships, the term associations refers to: A) events linked over time. B) patterns that describe a group to which an item belongs. C) occurrences linked to a single event. D) undiscovered groupings. E) relationships between different customers.

C

68) MongoDB and SimpleDB are both examples of: A) open source databases. B) SQL databases. C) NoSQL databases. D) cloud databases. E) big data databases.

C

18) Which of the following is a DBMS for desktop computers? A) DB2 B) Oracle Database C) Microsoft SQL Server D) Microsoft Access E) Microsoft Exchange

D

3) A characteristic or quality that describes a particular database entity is called a(n): A) field. B) tuple. C) key field. D) attribute. E) relationship.

D

8) Which of the following is a grouping of characters into a word, a group of words, or a complete number? A) File B) Table C) Entity D) Field E) Tuple

D

89) Explain what the term big data refers to. What benefits does it have, and what challenges does it pose?

The term big data is used to describe datasets with volumes so huge that they are beyond the ability of typical DBMS to capture, store, and analyze. Big data is created by the explosion of data coming from the Web, such as Web traffic, e-mail, Twitter, and Facebook, as well as information from other electronic and networked devices such as sensors and meters. Businesses are interested in big data because it contains more patterns and interesting anomalies than smaller data sets, with the potential to provide new insights into customer behavior, weather patterns, financial market activity, or other phenomena. However, to derive business value from big data, organizations need new technologies and tools capable of managing and analyzing non-traditional data along with their traditional enterprise data. They also need to know what questions to ask of the data and the limitations of big data. Capturing, storing, and analyzing big data can be expensive, and information from big data may not necessarily help decision-makers.

22) In a relational database, a record is referred to in technical terms as a(n): A) tuple. B) table. C) entity. D) field. E) key.

A

23) A field identified in a table as holding the unique identifier of the table's records is called the: A) primary key. B) key field. C) primary field. D) unique ID. E) primary entity.

A

27) The join operation: A) combines relational tables to provide the user with more information than is otherwise available. B) identifies the table from which the columns will be selected. C) creates a subset consisting of columns in a table. D) organizes elements into segments. E) creates a subset consisting of rows in a table.

A

21) In a table for customers, the information about a single customer resides in a single: A) field. B) row. C) column. D) table. E) entity.

B

51) Which of the following technologies would you use to analyze the social media data collected by a major online retailer? A) OLAP B) Data warehouse C) Data mart D) Hadoop E) DBMS

D

36) A one-to-many relationship between two entities is symbolized in a diagram by a line that ends with: A) one short mark. B) two short marks. C) three short marks. D) a crow's foot. E) a crow's foot topped by a short mark.

E

65) In data mining, which of the following involves using a series of existing values to determine what other future values will be? A) Associations B) Sequences C) Classifications D) Clustering E) Forecasting

E

7) Which of the following occurs when the same attribute in related data files has different values? A) Data redundancy B) Data duplication C) Data dependence D) Data discrepancy E) Data inconsistency

E

11) A record is a group of related fields. A) TRUE B) FALSE

A) TRUE

12) Program-data dependence refers to the coupling of data stored in files and the specific programs required to update and maintain those files such that changes in programs require changes to the data. A) TRUE B) FALSE

A) TRUE

