Chapter 6

List and describe the components of a contemporary business intelligence infrastructure.

(p. 261) - Data warehouses and data marts - Hadoop - In-memory computing - Analytic platforms

Name and briefly describe the capabilities of a DBMS.

- The DBMS acts as an interface between application programs and the physical data files. - The DBMS relieves the programmer or end user from the task of understanding where and how the data are actually stored by separating the logical and physical views of the data. - The database management software makes the physical database available for different logical views required by users.
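
The separation of logical and physical views can be sketched with Python's built-in sqlite3 module. This is only an illustration: the employee table, the two view names, and the sample rows are made up and do not come from the textbook.

```python
import sqlite3

# Hypothetical data: one physical table serving two logical views.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ana", "Sales", 52000.0), (2, "Ben", "HR", 48000.0)],
)

# Logical view for a staff directory: salary is not exposed.
conn.execute("CREATE VIEW directory AS SELECT name, dept FROM employee")
# Logical view for payroll: only the fields payroll needs.
conn.execute("CREATE VIEW payroll AS SELECT id, salary FROM employee")

# Neither query needs to know how or where the rows are physically stored.
directory_rows = conn.execute("SELECT * FROM directory ORDER BY name").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll ORDER BY id").fetchall()
print(directory_rows)
print(payroll_rows)
```

Both views read the same stored rows, so a change to the employee table shows up in each view without any change to the programs that query them.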

Describe the roles of information policy and data administration in information management.

An information policy specifies the organization's rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying information. Information policy lays out specific procedures and accountabilities, identifying which users and organizational units can share information, where information can be distributed, and who is responsible for updating and maintaining the information. Data administration is responsible for the specific policies and procedures through which data can be managed as an organizational resource. These responsibilities include developing an information policy, planning for data, overseeing logical database design and data dictionary development, and monitoring how information systems specialists and end-user groups use data.

Define big data and describe the technologies for managing and analyzing it.

Big data: Data sets with volumes so huge that they are beyond the ability of a typical relational DBMS to capture, store, and analyze. The data are often unstructured or semi-structured. New technologies, such as Hadoop, in-memory computing, and analytic platforms, are needed to manage and analyze this non-traditional data alongside existing typical data.

Define data mining, describing how it differs from OLAP and the types of information it provides.

Data mining: more discovery-driven than OLAP. It provides insights into corporate data that cannot be obtained with OLAP by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behavior. The types of information that can be obtained include associations, sequences, classifications, clusters, and forecasts.
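
As a rough illustration of the "associations" type of information, the sketch below counts how often items appear together in made-up shopping baskets and derives a simple rule confidence. The basket contents and the confidence helper are assumptions for this example; real data mining tools use far more elaborate techniques.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))

def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]

chips_to_cola = confidence("chips", "cola")
print(f"chips -> cola confidence: {chips_to_cola:.2f}")
```

A rule such as "customers who buy chips often buy cola" is inferred from the data rather than specified in advance by the user, which is the discovery-driven character described above.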

Describe how users can access information from a company's internal databases through the web.

Using Web browser software on a client PC, a user can access a corporate Web site over the Internet. The Web browser software requests data from the organization's database using HTML commands to communicate with the Web server. Because many back-end databases cannot interpret commands written in HTML, the Web server passes these requests for data to special middleware software that then translates HTML commands into SQL so that they can be processed by the DBMS working with the database. The DBMS receives the SQL requests and provides the required data. The middleware transfers information from the organization's internal database back to the Web server for delivery in the form of a Web page to the user. The software working between the Web server and the DBMS can be an application server, a custom program, or a series of software scripts.
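
The middleware step can be sketched as a small function that turns a web request into SQL and returns a page. This is a minimal sketch under stated assumptions: the product table, the sku parameter, the example URL, and handle_request are all hypothetical, and a real deployment would sit behind an actual Web server or application server.

```python
import html
import sqlite3
from urllib.parse import parse_qs, urlparse

# Hypothetical internal database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE product (sku TEXT PRIMARY KEY, name TEXT, price REAL)")
db.execute("INSERT INTO product VALUES ('A1', 'Desk lamp', 19.5)")

def handle_request(url):
    """Play the middleware role: translate a web request into SQL, return HTML."""
    params = parse_qs(urlparse(url).query)
    sku = params.get("sku", [""])[0]
    # The DBMS receives a parameterized SQL request and returns the data.
    row = db.execute(
        "SELECT name, price FROM product WHERE sku = ?", (sku,)
    ).fetchone()
    if row is None:
        return "<html><body>Not found</body></html>"
    name, price = row
    # The result goes back to the user as a Web page.
    return f"<html><body>{html.escape(name)}: ${price:.2f}</body></html>"

page = handle_request("http://example.com/product?sku=A1")
print(page)
```

The browser never issues SQL itself; only the function standing between the request and the database does.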

Explain why data quality audits and data cleansing are essential.

Data residing in any database that is not accurate or timely or that does not contain relevant information adds little if any value to an organization. In fact, data lacking these essential qualities would more than likely do the organization more harm than good. For example, the firm would not be able to provide its customers with good service, which could cost it their business. Organizations need to identify and correct faulty data and establish routines to edit and update data once a database becomes operational. Analyzing the quality of the data involves a data quality audit: a structured survey of the accuracy and level of completeness of the data in the information system. Data cleansing consists of activities for detecting and correcting data in a database or file that are incorrect, incomplete, improperly formatted, or redundant. Data cleansing not only corrects data but also enforces consistency among different sets of data that originated in separate information systems.
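
The detection half of data cleansing might look like the sketch below, which flags incomplete, improperly formatted, and redundant records. The customer records, the five-digit ZIP rule, and the audit function are assumptions made up for this example.

```python
import re

# Hypothetical customer records with typical quality problems.
customers = [
    {"id": 1, "email": "ana@example.com", "zip": "45402"},
    {"id": 2, "email": "", "zip": "45402"},                  # incomplete
    {"id": 3, "email": "ben@example.com", "zip": "4540"},    # improperly formatted
    {"id": 4, "email": "ANA@example.com ", "zip": "45402"},  # redundant once normalized
]

def audit(rows):
    """Return (record id, problem) pairs found in the rows."""
    issues = []
    seen_emails = set()
    for row in rows:
        email = row["email"].strip().lower()
        if not email:
            issues.append((row["id"], "incomplete"))
        elif email in seen_emails:
            issues.append((row["id"], "redundant"))
        else:
            seen_emails.add(email)
        if not re.fullmatch(r"\d{5}", row["zip"]):
            issues.append((row["id"], "bad format"))
    return issues

issues = audit(customers)
print(issues)
```

A cleansing routine would then correct or merge the flagged records; this sketch stops at identifying them.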

Define a database and a database management system.

Database: a group of related files. Database management system: special software to create and maintain a database and enable individual business applications to extract the data they need without having to create separate files or data definitions in their computer programs.

Define and describe an entity-relationship diagram and explain its role in database design.

Entity-relationship diagram: a methodology for documenting databases that illustrates the relationships between various entities in the database. The boxes represent entities. The lines connecting the boxes represent relationships. A line connecting two entities that ends in two short marks designates a one-to-one relationship. A line connecting two entities that ends with a crow's foot topped by a short mark indicates a one-to-many relationship. It can't be emphasized enough: If the business doesn't get its data model right, the system won't be able to serve the business well.

Define and explain the significance of entities, attributes, and key fields.

Entity: a person, place, thing, or event about which information must be kept. Attribute: a piece of information describing a particular entity. Key field: a field in a record that uniquely identifies instances of that record so that it can be retrieved, updated, or sorted.
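
A small sqlite3 sketch can make the three terms concrete. The supplier table and its sample values are hypothetical: SUPPLIER is the entity, name and city are attributes, and supplier_id is the key field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Entity: SUPPLIER. Attributes: name, city. Key field: supplier_id.
conn.execute(
    "CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute("INSERT INTO supplier VALUES (8259, 'CBM Inc.', 'Dayton')")

# The key field retrieves exactly one record.
record = conn.execute(
    "SELECT name, city FROM supplier WHERE supplier_id = ?", (8259,)
).fetchone()

# A second record with the same key value is rejected, so the key stays unique.
try:
    conn.execute("INSERT INTO supplier VALUES (8259, 'Other Co.', 'Akron')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(record, duplicate_rejected)
```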

List and describe each of the components in the data hierarchy.

Figure 6-1 in the textbook shows the data hierarchy. The data hierarchy includes bits, bytes, fields, records, files, and databases. Data are organized in a hierarchy that starts with the bit, which is represented by either a 0 (off) or a 1 (on). Bits can be grouped to form a byte to represent one character, number, or symbol. Bytes can be grouped to form a field, such as a name or date, and related fields can be grouped to form a record. Related records can be collected to form files, and related files can be organized into a database.
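
The same hierarchy can be walked through in a few lines of Python. The student data and variable names below are invented for illustration and are not taken from Figure 6-1.

```python
# Bit -> byte -> field -> record -> file -> database, using made-up data.
byte_value = ord("A")             # one character stored as one byte (value 65)
bits = format(byte_value, "08b")  # the eight bits that make up that byte

field = "Ana".encode("ascii")     # a field is a group of bytes
record = {"name": "Ana", "enrolled": "2024-09-01"}                  # related fields
student_file = [record, {"name": "Ben", "enrolled": "2024-09-02"}]  # related records
database = {"students": student_file, "courses": []}                # related files

print(bits, len(field), len(student_file), sorted(database))
```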

Explain how text mining and web mining differ from conventional data mining.

Text mining: tools that help businesses analyze unstructured text data (such as email, memos, and survey responses) by extracting key elements from big data sets, discovering patterns and relationships, and summarizing the information. Web mining: the discovery and analysis of useful patterns and information from the World Wide Web. Both differ from conventional data mining, which works on structured data in databases and files, because they operate on largely unstructured text and web content.

List and describe the problems of the traditional file environment.

1) data redundancy and inconsistency. Data redundancy is the presence of duplicate data in multiple data files. In this situation, inconsistencies arise because the data can have different meanings in different files. 2) program-data dependence. Program-data dependence is the tight relationship between data stored in files and the specific programs required to update and maintain those files. This dependency is very inefficient, resulting in the need to make changes in many programs when a common piece of data, such as the postal code structure, changes. 3) lack of flexibility. Lack of flexibility refers to the fact that it is very difficult to create new reports from data when needed. Ad-hoc reports are impossible to generate; a new report could require several weeks of work by more than one programmer and the creation of intermediate files to combine data from disparate files. 4) poor security. Poor security results from the lack of control over the data because the data are so widespread. 5) lack of data sharing and availability. Data sharing can be virtually impossible if data are distributed in so many different files around the organization.

Explain why non-relational databases are useful.

Non-relational database management systems use a more flexible data model and are designed for managing large data sets across many distributed machines and for easily scaling up or down. They are useful for accelerating simple queries against large volumes of structured and unstructured data, including web, social media, graphics, and other forms of data that are difficult to analyze with traditional SQL-based tools. Non-relational databases are ideal for storing data that may be changed frequently or for applications that handle many different kinds of data. They can support rapidly developing applications requiring a dynamic database able to change quickly and to accommodate large amounts of complex, unstructured data.

Define and describe normalization and referential integrity and explain how they contribute to a well-designed relational database.

Normalization: the process of creating small, stable data structures from complex groups of data when designing a relational database; it minimizes redundant data. Referential integrity: rules to ensure that relationships between coupled database tables remain consistent.
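
Both ideas can be sketched with sqlite3. In this hypothetical schema, supplier details live in one table instead of being repeated on every part (normalization), and a foreign key keeps the two tables consistent (referential integrity). The table names, rows, and the violates_integrity helper are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Normalized design: supplier data is stored once and referenced by key.
conn.execute("CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE part ("
    " part_id INTEGER PRIMARY KEY,"
    " description TEXT,"
    " supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id))"
)
conn.execute("INSERT INTO supplier VALUES (1, 'Acme')")
conn.execute("INSERT INTO part VALUES (100, 'Door latch', 1)")

def violates_integrity(sql):
    """Return True if the DBMS rejects the statement as an integrity violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A part may not point to a supplier that does not exist.
orphan_part_rejected = violates_integrity("INSERT INTO part VALUES (101, 'Hinge', 99)")
# A supplier may not be deleted while parts still refer to it.
delete_in_use_rejected = violates_integrity("DELETE FROM supplier WHERE supplier_id = 1")

print(orphan_part_rejected, delete_in_use_rejected)
```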

Describe the capabilities of online analytical processing (OLAP).

Online analytical processing (OLAP) supports the manipulation and analysis of large volumes of data from many perspectives, or dimensions; for example, sales by item, by department, by store, and by region, in order to find patterns in the data. This type of pattern is difficult to find with normal database queries, which is why OLAP is usually run against a data warehouse and paired with data mining. Data mining uses a variety of techniques to find hidden patterns and relationships in large pools of data and infer rules from them that can be used to predict future behavior and guide decision making.
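
Viewing the same data along different dimensions can be sketched in plain Python. The sales rows and the total_by helper are made up for this example; an actual OLAP tool would work on far larger, pre-aggregated data.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure.
sales = [
    {"item": "bolt", "store": "S1", "region": "East", "amount": 100},
    {"item": "bolt", "store": "S2", "region": "West", "amount": 150},
    {"item": "nut", "store": "S1", "region": "East", "amount": 40},
    {"item": "nut", "store": "S2", "region": "West", "amount": 60},
]

def total_by(rows, *dimensions):
    """Sum the amount measure, grouped by any combination of dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["amount"]
    return dict(totals)

by_item = total_by(sales, "item")
by_region = total_by(sales, "region")
by_item_region = total_by(sales, "item", "region")
print(by_item, by_region, by_item_region, sep="\n")
```

The same four rows answer three different questions depending on which dimensions are chosen, which is the multidimensional view the answer describes.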

Define a relational DBMS and explain how it organizes data.

Relational DBMS: a type of logical database model that treats data as if they were stored in two-dimensional tables. It can relate data stored in one table to data in another as long as the two tables share a common data element.

List and describe the three operations of a relational DBMS.

Select: creates a subset consisting of all records in the file that meet stated criteria. Join: combines relational tables to provide the user with more information than is available in individual tables. Project: creates a subset consisting of columns in a table, permitting the user to create new tables that contain only the information required.
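
The three operations map onto ordinary SQL queries, sketched here with sqlite3. The part and supplier tables and their rows are sample data invented for this illustration; the two tables share the supplier_no element, which is what makes the join possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part (part_no INTEGER PRIMARY KEY, part_name TEXT, supplier_no INTEGER)"
)
conn.execute(
    "CREATE TABLE supplier (supplier_no INTEGER PRIMARY KEY, supplier_name TEXT)"
)
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(137, "Door latch", 1), (150, "Door seal", 2), (152, "Compressor", 2)],
)
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?)", [(1, "North Co."), (2, "South Co.")]
)

# Select: only the records (rows) that meet the stated criteria.
selected = conn.execute(
    "SELECT * FROM part WHERE part_no IN (137, 150) ORDER BY part_no"
).fetchall()

# Join: combine the two tables through their common data element.
joined = conn.execute(
    "SELECT p.part_no, p.part_name, s.supplier_name "
    "FROM part AS p JOIN supplier AS s ON p.supplier_no = s.supplier_no "
    "ORDER BY p.part_no"
).fetchall()

# Project: only the columns that are required.
projected = conn.execute(
    "SELECT part_no, part_name FROM part ORDER BY part_no"
).fetchall()

print(selected, joined, projected, sep="\n")
```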

