Data Resource Management Chapter 5
Application Development
(1) Conceiving, designing, and implementing a system. (2) Developing information systems by a process of investigation, analysis, design, implementation, and maintenance. Also called the systems development life cycle (SDLC), information systems development, or application development.
Hypermedia Databases
A Web site stores such information in a hypermedia database consisting of hyperlinked pages of multimedia (text, graphic and photographic images, video clips, audio segments, and so on). That is, from a database management point of view, the set of interconnected multimedia pages on a Web site is a database of interrelated hypermedia page elements, rather than interrelated data records.
Data Planning
A corporate planning and analysis function that focuses on data resource management. It includes the responsibility for developing an overall information policy and data architecture for the firm's data resources.
Entity Relationship Diagram (ERD)
A data planning and systems development diagramming tool that models the relationships among the entities in a business process.
External Databases
Access to a wealth of information from external databases is available for a fee from commercial online services and with or without charge from many sources on the World Wide Web. Web sites provide an endless variety of hyperlinked pages of multimedia documents in hypermedia databases for you to access. Data are available in the form of statistics on economic and demographic activity from statistical databanks, or you can view or download abstracts or complete copies of hundreds of newspapers, magazines, newsletters, research papers, and other published material and periodicals from bibliographic and full-text databases. Whenever you use a search engine like Google or Yahoo to look up something on the Internet, you are using an external database—a very, very large one!
Database management approach
An approach to the storage and processing of data in which independent files are consolidated into a common pool, or database, of records available to different application programs and end users for processing and data retrieval.
Data integration
Having data in independent files made it difficult to provide end users with information for ad hoc requests that required accessing data stored in several different files. Special computer programs had to be written to retrieve data from each independent file. This retrieval was so difficult, time-consuming, and costly for some organizations that it was impossible to provide end users or management with such information. End users had to extract the required information manually from the various reports produced by each separate application and then prepare customized reports for management.
Data integrity
In file processing systems, it was easy for data elements such as stock numbers and customer addresses to be defined differently by different end users and applications. This divergence caused serious inconsistency problems in the development of programs to access such data. In addition, the integrity (i.e., the accuracy and completeness) of the data was suspect because there was no control over their use and maintenance by authorized end users. Thus, a lack of standards caused major problems in application program development and maintenance, as well as in the security and integrity of the data files needed by the organization.
Types of databases
Operational, data warehouse, distributed, external, hypermedia
File Processing
Organizing data into specialized files of data records designed for processing only by specific application programs. Contrast with Database Management Approach.
Operational Databases
Store detailed data needed to support the business processes and operations of a company. They are also called subject area databases (SADB), transaction databases, and production databases. Examples are a customer database, human resource database, inventory database, and other databases containing data generated by business operations.
Database Maintenance
The activity of keeping a database up to date by adding, changing, or deleting data. The database maintenance process is accomplished by transaction processing systems and other end-user applications, with the support of the DBMS. End users and information specialists can also employ various utilities provided by a DBMS for database maintenance. The databases of an organization need to be updated continually to reflect new business transactions (e.g., sales made, products produced, inventory shipped) and other events. Other miscellaneous changes also must be made to update and correct data (e.g., customer or employee name and address changes) to ensure the accuracy of the data in the databases.
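The three maintenance actions named above (adding, changing, and deleting data) can be sketched with Python's built-in sqlite3 module. The customer table and its contents are hypothetical and used only for illustration.

```python
import sqlite3

# Hypothetical customer table in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")

# Add: new business events create new records.
conn.execute("INSERT INTO customer VALUES (1, 'Ann Lee', '12 Oak St')")
conn.execute("INSERT INTO customer VALUES (2, 'Bo Chen', '9 Elm Ave')")

# Change: correct an address to keep the data accurate.
conn.execute("UPDATE customer SET address = '40 Pine Rd' WHERE id = 1")

# Delete: remove a record that is no longer needed.
conn.execute("DELETE FROM customer WHERE id = 2")
conn.commit()

rows = conn.execute("SELECT id, name, address FROM customer").fetchall()
print(rows)
```

In practice these statements would usually be issued by a transaction processing system rather than typed by hand.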
Distributed Databases
The concept of distributing databases or portions of a database at remote sites where the data are most frequently referenced. Sharing of data is made possible through a network that interconnects the distributed databases.
Network structure
The network structure can represent more complex logical relationships and is still used by some mainframe DBMS packages. It allows many-to-many relationships among records; that is, the network model can access a data element by following one of several paths because any data element or record can be related to any number of other data elements. It should be noted that neither the hierarchical nor the network data structures are commonly found in the modern organization.
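A minimal sketch of a many-to-many relationship, assuming hypothetical employee and project names. It only illustrates that a record can be reached along more than one path; it is not how a network DBMS stores its data.

```python
from collections import defaultdict

# Hypothetical links: an employee may work on many projects,
# and a project may have many employees.
assignments = [("Ann", "Payroll"), ("Ann", "Website"), ("Bo", "Website")]

projects_of = defaultdict(set)
employees_on = defaultdict(set)
for employee, project in assignments:
    projects_of[employee].add(project)
    employees_on[project].add(employee)

# The Website project can be reached from either Ann or Bo.
print(sorted(projects_of["Ann"]))
print(sorted(employees_on["Website"]))
```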
Field
The next higher level of data is the field, or data item. A field consists of a grouping of related characters. For example, the grouping of alphabetic characters in a person's name may form a name field (or typically, last name, first name, and middle initial fields), and the grouping of numbers in a sales amount forms a sales amount field.
Duplication
One of two ways to update a distributed database. The duplication process, in contrast to replication, is much less complicated. It identifies one database as the master and then duplicates that database at a prescribed time after hours so that each distributed location has the same data. One drawback is that changes can be made only to the master database; a change made at any other location would be overwritten during the next duplication. Nonetheless, properly used, duplication and replication can keep all distributed locations current with the latest data.
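A rough sketch of duplication, assuming each database can be modeled as a Python dictionary. The site names and the duplicate function are hypothetical. Note how the local change at one site is lost, which is the drawback described above.

```python
import copy

def duplicate(master, sites):
    """Overwrite every site's copy with an independent copy of the master."""
    for name in sites:
        sites[name] = copy.deepcopy(master)

master = {"P100": {"qty": 40}}
sites = {
    "denver": {"P100": {"qty": 35}},  # stale local copy
    "boston": {"P100": {"qty": 99}},  # local change, overwritten below
}

duplicate(master, sites)
print(sites)
```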
Replication
Updating a distributed database using replication involves using a specialized software application that looks at each distributed database and then finds the changes made to it. Once these changes have been identified, the replication process makes all of the distributed databases look the same by making the appropriate changes to each one. The replication process is very complex and, depending on the number and size of the distributed databases, can consume a lot of time and computer resources.
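A simplified sketch of replication, under loud assumptions: each database is a dictionary, a baseline snapshot from the last synchronization is available, no two sites change the same key, and deletions are ignored. Real replication software must handle all of those cases, which is part of why the process is complex.

```python
def replicate(baseline, sites):
    """Collect each site's changes since the baseline, then apply them everywhere."""
    changes = {}
    for data in sites.values():
        for key, value in data.items():
            if baseline.get(key) != value:
                changes[key] = value  # assumption: no conflicting edits
    merged = {**baseline, **changes}
    for name in sites:
        sites[name] = dict(merged)  # every site now looks the same
    return merged

baseline = {"P100": 40, "P200": 10}
sites = {
    "denver": {"P100": 38, "P200": 10},             # changed P100
    "boston": {"P100": 40, "P200": 10, "P300": 5},  # added P300
}

merged = replicate(baseline, sites)
print(merged)
```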
Data Mining
Using special purpose software to analyze data from a data warehouse to find hidden patterns and trends.
Record
A record represents a collection of attributes that describe a single instance of an entity. An example is a person's payroll record, which consists of data fields describing attributes such as the person's name, Social Security number, and rate of pay. Fixed-length records contain a fixed number of fixed-length data fields. Variable-length records contain a variable number of fields and field lengths. For example, each record in an employee file describes one specific employee.
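A fixed-length record can be sketched with Python's struct module. The layout below (20-byte name, 9-byte identification number, 8-byte pay rate) is an assumption chosen for illustration, as are the sample values.

```python
import struct

# Hypothetical layout: 20-byte name, 9-byte ID number, 8-byte float pay rate.
# "<" requests standard sizes with no padding between fields.
PAYROLL = struct.Struct("<20s9sd")

def pack_record(name, id_number, rate):
    """Encode one payroll record; short strings are padded with null bytes."""
    return PAYROLL.pack(name.encode("ascii"), id_number.encode("ascii"), rate)

def unpack_record(raw):
    """Decode one payroll record and strip the null padding from the name."""
    name, id_number, rate = PAYROLL.unpack(raw)
    return name.rstrip(b"\x00").decode("ascii"), id_number.decode("ascii"), rate

raw = pack_record("Ann Lee", "000112222", 31.5)
print(len(raw), unpack_record(raw))
```

Every record packed this way occupies the same number of bytes, regardless of how long the name is.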
Multidimensional model
A variation of the relational model that uses multidimensional structures to organize data and express the relationships between data. You can visualize multidimensional structures as cubes of data and cubes within cubes of data.
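One way to picture a data cube is as a set of cells addressed by one value per dimension. The sketch below assumes three hypothetical dimensions (product, region, quarter) and a made-up units-sold measure; roll_up is an illustrative helper, not a feature of any particular DBMS.

```python
from collections import defaultdict

# Hypothetical sales cube: (product, region, quarter) -> units sold.
DIMENSIONS = ("product", "region", "quarter")
cube = {
    ("widget", "east", "Q1"): 10,
    ("widget", "west", "Q1"): 7,
    ("gadget", "east", "Q1"): 4,
    ("widget", "east", "Q2"): 12,
}

def roll_up(cube, dimension):
    """Total the measure along one dimension, summing over the others."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for cell, units in cube.items():
        totals[cell[axis]] += units
    return dict(totals)

by_product = roll_up(cube, "product")
by_region = roll_up(cube, "region")
print(by_product, by_region)
```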
Primary Key
A unique identifier for a record. Typically, the first field in a record is used to store this identifier.
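The uniqueness of a primary key can be demonstrated with Python's built-in sqlite3 module. The employee table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee VALUES (101, 'Ann Lee')")

# A second record with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (101, 'Bo Chen')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_rejected, count)
```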
Attribute
A characteristic or quality of some entity. For example, an employee's salary is an attribute that is a typical data field used to describe an entity who is an employee of a business.
Entity
An object, person, place, or event about which data are kept. For example, an employee of a business is an entity, and the employee's salary is an attribute (a typical data field) that describes that entity.
Data Warehouse
A data warehouse stores data that have been extracted from the various operational, external, and other databases of an organization. It is a central source of the data that have been cleaned, transformed, and cataloged so that they can be used by managers and other business professionals for data mining, online analytical processing, and other forms of business analysis, market research, and decision support. Data warehouses may be subdivided into data marts, which hold subsets of data from the warehouse that focus on specific aspects of a company, such as a department or a business process.
Report Generator
A feature of database management system packages that allows an end user to quickly specify a report format for the display of information retrieved from a database.
File
A group of related records is a data file (sometimes referred to as a table or flat file). When it is independent of any other files related to it, a single table may be referred to as a flat file. As a point of accuracy, the term flat file may be defined either narrowly or more broadly. Strictly speaking, a flat file database should consist of nothing but data and delimiters. Regardless of the name used, any grouping of related records in tabular (row-and-column) form is called a file. Thus, an employee file would contain the records of the employees of a firm. Files are frequently classified by the application for which they are primarily used, such as a payroll file or an inventory file, or the type of data they contain, such as a document file or a graphical image file. Files are also classified by their permanence, for example, a payroll master file versus a payroll weekly transaction file. A transaction file, therefore, would contain records of all transactions occurring during a period and might be used periodically to update the permanent records contained in a master file. A history file is an obsolete transaction or master file retained for backup purposes or for long-term historical storage, called archival storage.
Query Language
A high-level, humanlike language provided by a database management system that enables users to easily extract data and information from a database. The query language feature lets you easily obtain immediate responses to ad hoc data requests: You merely key in a few short inquiries—in some cases, using common sentence structures just like you would use to ask a question.
Data Resource Management
A managerial activity that applies information systems technology and management tools to the task of managing an organization's data resources. Its three major components are database administration, data administration, and data planning.
Data Modeling
A process in which the relationships between data elements are identified and defined to develop data models. Each data model defines the logical relationships among the data elements needed to support a basic business process. For example, can a supplier provide more than one type of product to us? Can a customer have more than one type of account with us? Can an employee have several pay rates or be assigned to several project workgroups? Answering such questions will identify data relationships that must be represented in a data model that supports business processes of an organization. These data models then serve as logical design frameworks (called schema and subschema). These frameworks determine the physical design of databases and the development of application programs to support the business processes of the organization. A schema is an overall logical view of the relationships among the data elements in a database, whereas the subschema is a logical view of the data relationships needed to support specific end-user application programs that will access that database.
Structured Query Language (SQL)
A query language that has become the standard for relational database management system packages. A query's basic form is SELECT ... FROM ... WHERE.
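A small SELECT ... FROM ... WHERE query, run here through Python's built-in sqlite3 module. The product table and its rows are hypothetical, and the ORDER BY clause is an addition beyond the basic form so that the result order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("hammer", "tools", 12.5), ("saw", "tools", 18.0), ("lamp", "lighting", 30.0)],
)

# SELECT names the fields, FROM names the table, WHERE states the condition.
result = conn.execute(
    "SELECT name, price FROM product WHERE category = ? ORDER BY name",
    ("tools",),
).fetchall()
print(result)
```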
Database management system (DBMS)
A set of computer programs that controls the creation, maintenance, and utilization of the databases of an organization.
Data Dictionary
A software module and database containing descriptions and definitions concerning the structure, data elements, interrelationships, and other characteristics of a database. A database management catalog or directory containing metadata (i.e., data about data). A data dictionary relies on a specialized database software component to manage a database of data definitions, which is metadata about the structure, data elements, and other characteristics of an organization's databases.
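As a rough illustration of metadata, SQLite keeps a catalog of its own tables and columns that can be queried like any other data. Treating that catalog as a stand-in for a full data dictionary is an assumption made for this sketch; the employee table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, pay_rate REAL)"
)

# Data about data: which tables exist in this database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]
print(tables, columns)
```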
Database Administrators (DBAs)
Specialists responsible for maintaining standards for the development, maintenance, and security of an organization's databases.
Character
Consists of a single alphabetic, numeric, or other symbol. You might argue that the bit or byte is a more elementary data element, but remember that those terms refer to the physical storage elements provided by the computer hardware. Using that understanding, one way to think of a character is that it is a byte used to represent a particular character. From a user's point of view (i.e., from a logical as opposed to a physical or hardware view of data), a character is the most basic element of data that can be observed and manipulated.
Metadata
Data about data; data describing the structure, data elements, interrelationships, and other characteristics of a database.
Logical data elements
Data elements that are independent of the physical data media on which they are recorded. A conceptual framework of several levels of data has been devised that differentiates among different groupings, or elements, of data. Thus, data may be logically organized into characters, fields, records, files, and databases, just as writing can be organized into letters, words, sentences, paragraphs, and documents.
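The levels can be pictured with ordinary Python values, each level built from the one below it. All names and values here are hypothetical, and the nesting is only an analogy for the logical levels, not a storage format.

```python
# Characters group into a field.
last_name = "".join(["L", "e", "e"])

# Fields group into a record describing one employee.
record = {"last_name": last_name, "first_name": "Ann", "pay_rate": 31.5}

# Records group into a file.
employee_file = [
    record,
    {"last_name": "Chen", "first_name": "Bo", "pay_rate": 28.0},
]

# Related files group into a database.
database = {
    "employee": employee_file,
    "department": [{"dept_name": "Sales"}],
}

print(database["employee"][0]["last_name"])
```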
Database structures
Database management system (DBMS) packages are designed to use a specific data structure to provide end users with quick, easy access to information stored in databases. Five fundamental database structures are the hierarchical, network, relational, object-oriented, and multidimensional models.
Hierarchical structure
Early mainframe DBMS packages used the hierarchical structure, in which the relationships between records form a hierarchy or treelike structure. In the traditional hierarchical model, all records are dependent and arranged in multilevel structures, consisting of one root record and any number of subordinate levels. Thus, all of the relationships among records are one-to-many because each data element is related to only one element above it. The data element or record at the highest level of the hierarchy (e.g., a department data element) is called the root element. Any data element can be accessed by moving progressively downward from a root and along the branches of the tree until the desired record (e.g., the employee data element) is located. It should be noted that neither the hierarchical nor the network data structures are commonly found in the modern organization.
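A minimal sketch of locating a record by moving downward from the root. The department, project, and employee names are hypothetical, and find_path is an illustrative helper rather than part of any DBMS.

```python
# Hypothetical hierarchy: one root, each record has exactly one parent.
tree = {"name": "Department", "children": [
    {"name": "Project A", "children": [
        {"name": "Employee 1", "children": []},
    ]},
    {"name": "Project B", "children": [
        {"name": "Employee 2", "children": []},
    ]},
]}

def find_path(node, target, path=()):
    """Return the list of names from the root down to target, or None."""
    path = path + (node["name"],)
    if node["name"] == target:
        return list(path)
    for child in node["children"]:
        found = find_path(child, target, path)
        if found:
            return found
    return None

path_to_employee = find_path(tree, "Employee 2")
print(path_to_employee)
```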
Data Dependence
In file processing systems, major components of a system—the organization of files, their physical locations on storage hardware, and the application software used to access those files—depended on one another in significant ways. For example, application programs typically contained references to the specific format of the data stored in the files they used. Thus, changes in the format and structure of data and records in a file required that changes be made to all of the programs that used that file. This program maintenance effort was a major burden of file processing systems. It proved difficult to do properly, and it resulted in a lot of inconsistency in the data files.
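A small sketch of why format changes forced program changes. The record layouts and the read_name_v1 function are hypothetical; the point is only that a program with a hard-coded layout silently returns wrong data once the file format changes.

```python
def read_name_v1(line):
    """Read the name field, assuming it occupies the first 10 columns."""
    return line[0:10].strip()

# Original layout: 10-column name followed by the pay rate.
old_line = "Ann Lee   0031.50"

# Changed layout: a 5-digit employee number now precedes the name.
new_line = "00417Ann Lee   0031.50"

print(read_name_v1(old_line))  # correct under the old layout
print(read_name_v1(new_line))  # wrong until the program is rewritten
```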
Data Redundancy
Independent data files included a lot of duplicated data; the same data (such as a customer's name and address) were recorded and stored in several files.
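A brief sketch of the risk that redundancy creates, assuming two hypothetical independent files modeled as dictionaries. Updating one copy without the other leaves the files in disagreement.

```python
# The same customer data stored in two independent files.
billing_file = {"C1": {"name": "Ann Lee", "address": "12 Oak St"}}
shipping_file = {"C1": {"name": "Ann Lee", "address": "12 Oak St"}}

# The billing application records a move; the shipping file is not updated.
billing_file["C1"]["address"] = "40 Pine Rd"

consistent = billing_file["C1"]["address"] == shipping_file["C1"]["address"]
print(consistent)
```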
Database interrogation
The primary use of a database by end users involves employing the database interrogation capabilities of a DBMS to access the data in a database to selectively retrieve and display information and produce reports, forms, and other documents. A database interrogation capability is a major benefit of the database management approach. End users can use a DBMS by asking for information from a database using a query feature or a report generator. They can receive an immediate response in the form of video displays or printed reports. No difficult programming is required.
Relational Model
The relational model is the most widely used of the database structures. It is used by most microcomputer DBMS packages, as well as by most midrange and mainframe systems. In the relational model, all data elements within the database are viewed as being stored in the form of simple two-dimensional tables, sometimes referred to as relations. The tables in a relational database are flat files that have rows and columns. Each row represents a single record in the file, and each column represents a field. The major difference between a flat file and a database is that a flat file can only have data attributes specified for one file. In contrast, a database can specify data attributes for multiple files simultaneously and can relate the various data elements in one file to those in one or more other files.
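Relating data elements in one table to those in another can be sketched with a join, run here through Python's built-in sqlite3 module. The department and employee tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Finance');
INSERT INTO employee VALUES (101, 'Ann Lee', 1), (102, 'Bo Chen', 2), (103, 'Cy Diaz', 1);
""")

# Each row is a record and each column a field; dept_id relates the two tables.
rows = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()
print(rows)
```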
Database
An integrated collection of logically related data elements. A database consolidates records previously stored in separate files into a common pool of data elements that provides data for many applications. The data stored in a database are independent of the application programs using them and of the type of storage devices on which they are stored. Thus, databases contain data elements describing entities and relationships among entities.
Object-oriented model
In the object-oriented model, an object consists of data values describing the attributes of an entity, plus the operations that can be performed upon the data. This encapsulation capability allows the object-oriented model to handle complex types of data (graphics, pictures, voice, and text) more easily than other database structures.
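Encapsulation can be sketched with a small Python class that bundles attribute values with the operations allowed on them. The SavingsAccount class and its values are hypothetical and show only the general idea, not how an object-oriented DBMS stores objects.

```python
from dataclasses import dataclass

@dataclass
class SavingsAccount:
    # Data values describing the attributes of the entity.
    owner: str
    balance: int = 0  # whole currency units, to keep the arithmetic exact

    # Operations that can be performed on the data.
    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

account = SavingsAccount("Ann Lee", 100)
account.deposit(50)
account.withdraw(30)
print(account.balance)
```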