V369 Exam 2
Server Operating Systems
(network operating system) has the task of creating a single system image for all services on the network, so that the system is transparent to users and even application programmers. Examples include variations of Microsoft Windows Server, several variations of UNIX, and Linux.
Types of Intra-Organizational Systems
1. Enterprise Systems 2. Managerial Support Systems
Architectures for MDM
1. Identity Registry 2. Integration Hub 3. Persistent Approach
Data Management Process
1. Plan 2. Source 3. Acquire & Maintain 4. Define/Describe & Inventory 5. Organize & Make Accessible 6. Control Quality & Integrity 7. Protect & Secure 8. Account for Use 9. Recover/Restore & Upgrade 10. Determine Retention & Dispose 11. Train & Consult for Effective Use
View Integration
A bottom-up approach to detailing an organization's data requirements. It analyzes each report, screen, form, and document in the organization and combines each of these views into one consolidated and consistent picture of all organizational data.
Data Stewards
A business manager responsible for the quality of data in a particular subject or process area, such as customer, product, or billing. They are given a defined role to manage specific kinds of data, like customer, product, or employee subject area data.
Transport Stack Software
Allows communications employing TCP/IP to be sent across the network. Some of the elements needed for this process are contained in the server operating system but some middleware may also be required.
Database Programming (3 GL)
Can be specified in either procedural programs written in a 3 GL or via special-purpose languages developed for data processing. In a regular 3 GL program, instructions to write new data to the customer record, its indexes, the order record, and its indexes would have to be provided individually. With the commands available through the special enhancements to the language provided by the DBMS, only one instruction is needed in the program, and all the associated indexes and records are automatically updated, which makes the programming task more productive and less error prone.
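The point about a single instruction updating both records and indexes can be sketched with Python's built-in sqlite3 module standing in for the DBMS. This is only an illustration: the table, column, and index names are hypothetical and not taken from the course material.

```python
import sqlite3

# In-memory database standing in for the DBMS (hypothetical table and index names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

# One instruction writes the record; the DBMS maintains the index on its own.
conn.execute(
    "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
    (1, "Acme", "Bloomington"),
)
conn.commit()

# The new row is immediately reachable through the indexed column.
rows = conn.execute(
    "SELECT name FROM customer WHERE city = ?", ("Bloomington",)
).fetchall()
```

The program never touches idx_customer_city directly, which is the productivity gain the definition describes.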
Data Resources
Consists of the facts and information an organization gathers while conducting business and in order to conduct business at all levels of the organization. Can include numeric, text, audio, video, and graphical data collected both within the organization and from sources external to it, as well as the metadata, which describe the business and technical characteristics of the data resource.
Integration Hub Approach
Data changes are broadcast through a central service to all subscribing databases. Redundant data are kept, but there are some mechanisms to ensure consistency, yet each application does not have to collect and maintain all of the data it needs.
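A minimal sketch of the broadcast idea, assuming each subscribing database can be modeled as a plain dictionary. The class name IntegrationHub and the sample keys are hypothetical and only illustrate the pattern.

```python
class IntegrationHub:
    """Central service that pushes each data change to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, database):
        # `database` is a dict standing in for a subscribing application's database.
        self._subscribers.append(database)

    def publish(self, key, field, value):
        # Broadcast the change so every redundant copy stays consistent.
        for database in self._subscribers:
            database.setdefault(key, {})[field] = value


billing_db = {}
shipping_db = {}

hub = IntegrationHub()
hub.subscribe(billing_db)
hub.subscribe(shipping_db)

# A single change is captured once and broadcast to both copies.
hub.publish("cust-42", "address", "12 Main St")
```

Both copies are redundant, but neither application had to collect the address itself.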
Prepackaged Corporate Data Model Advantages
Data models can be developed using proven components developed from cumulative experiences. Projects take less time and cost less because the essential components and structures are already defined and only need to be quickly customized. Prepackaged models are developed from best practices, so your data model is easier to evolve as additional data requirements are identified for the given situation. Adaptation of a data model from your DBMS vendor usually means that your data model will easily work with other applications from the same vendor or their software partners. A prepackaged model provides a starting point for asking requirements questions that will help to surface unspoken requirements. It may be easier to share data for inter-organizational systems.
Strategic Databases
Data warehouses, used to support decision making and business intelligence, are typically subsets, summaries, or aggregations of operational databases, with key external data as supplements.
Organize & Make Accessible
Databases need to be designed so that data can be retrieved and reported efficiently and in the format that business managers require. What data are required? How do the data need to be selected?
Source
Decisions must be made about the timeliest and highest quality source for each element. A master data management program often drives decisions about data sources.
Define/Describe & Inventory
Defines what is being managed. Each data entity, data element, and relationship must be defined, a format for storage and reporting established, and the organization of the data described so users know how to access the data. All definitions and descriptions are kept, volume statistics on data are maintained and other data about data are stored.
Enterprise Systems
Designed to support the entire organization or large portions of it
Protect & Secure
The rights each manager has to each type of data must be defined. Security should be considered when databases and application systems are originally built and not developed as an afterthought.
Naming
Distinct and meaningful names must be given to each kind of data retained in organizational databases. Many organizations develop a naming scheme or template for constructing these names, with common terms to be used for different elements of the scheme.
Definition
Each data entity and element is given a description that clarifies its meaning.
Physical
Exist in databases and other file systems. Critical issues: computer performance, accessibility, integrity and security.
Account for Use
Frequently the organizational unit responsible for acquiring data is not the primary user of the data. Usage is shared because data are not consumed by usage. Develop a fair charging scheme that promotes good management of data but does not deter beneficial use.
Managerial Issues in Managing Data
How to plan for data, to control data integrity, to secure access to and use data, and to make data accessible are important to the business manager.
Data Modeling
Involves both a methodology and a notation. The methodology includes the steps that are followed to identify and describe organizational data entities, and the notation is a way to show these findings, usually graphically. Managers must be integrally involved in these methodologies to ensure that the data they need are planned for inclusion in organizational databases and that the data captured and stored have the required business meaning.
Plan
Involves developing a blueprint for data and the relationships among data across business units and functions. There will be a macro-level data plan, called an enterprise data model, to identify entities and relationships among the entities, and more detailed plans to define schedules for the implementation of databases for different parts of this blueprint. The plan identifies which data are required, where they are used in the business, how they will be used, and how much data is expected.
Entity Relationship Diagram (ERD)
Is the most commonly accepted notation for representing the data needs in an organization. It consists of entities (the things about which data are collected), attributes (the elements of data that are to be collected), and relationships. Because of its nontechnical nature, it is a very useful tool for facilitating communication between end users who need the data and database designers and developers who will create and maintain the database. On its own, it is not sufficient for documenting data needs.
Train & Consult for Effective Use
Just because data is available does not mean that it will be effectively used.
Identity Registry Approach
Master data remain in their source systems, and applications refer to the registry to determine where the agreed-upon source of the data resides. The registry helps each system match its master record with corresponding master records in other source systems.
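A minimal sketch of a registry lookup, assuming two source systems modeled as dictionaries. The system names, keys, and the lookup helper are all hypothetical and exist only to illustrate the idea.

```python
# Source systems keep their own master records under their own local keys.
crm = {"C-001": {"name": "Acme Corp", "phone": "555-0100"}}
billing = {"B-9": {"name": "ACME Corporation", "address": "12 Main St"}}
systems = {"crm": crm, "billing": billing}

# The registry stores no master data itself. It records how the same customer
# is keyed in each system and which system is the agreed-upon source per attribute.
registry = {
    "cust-42": {
        "keys": {"crm": "C-001", "billing": "B-9"},
        "source_of": {"phone": "crm", "address": "billing"},
    }
}


def lookup(global_id, attribute):
    """Follow the registry to the agreed-upon source and read the value there."""
    entry = registry[global_id]
    system = entry["source_of"][attribute]
    local_key = entry["keys"][system]
    return systems[system][local_key][attribute]
```

Calling lookup("cust-42", "address") reads from the billing system, while lookup("cust-42", "phone") reads from the CRM system.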
Data Transfer (Data Transfer and Integration Applications)
Move data from one database to another or otherwise bring together data from various databases to meet some processing need. These applications are often called bridges or interfaces because they connect related databases. This kind of application extracts and summarizes the data as well as distributes copies of original data.
Objectives of Data Modeling
Objective: the modeling effort must be justified by some clear overriding need. Scope: the coverage for a data model must be carefully considered; the broader the scope, the more difficult the project is to manage. Outcome: choices here include a subject area database definition and identification of common data capture systems to be shared by several departments. The more uncertain the outcome, the lower the chances for success. Timing: few organizations can put all systems on hold while a complete data model is developed. An evolutionary approach might be more practical, but it must be done within the context of an initial overall, general enterprise data model. Represents a radical change to the more traditional approach of making short-term fixes to systems. Not an issue of centralized versus decentralized control. The data administration approach emphasizes placing decision-making power in the hands of those most knowledgeable about the data.
Acquire & Maintain
Once the best sources are identified and selected, data capture systems must be built to acquire and maintain these data. Changes in data need to be broadcast to all databases that store these data. Appropriate applications systems need to be built to track data acquisition and transfer.
Persistent Approach
One consolidated record is maintained, and all applications draw on that one record for common data. Data captured in each application must be pushed to the persistent record so that it contains the most recent values, and each system goes to the persistent record when it needs common data.
Data Model
Overall map for business data which shows rules by which the organization operates.
Recover/Restore & Upgrade
Procedures must be in place to restore the database to a clean and uncontaminated version.
Transaction Processing System
Processes the thousands of transactions that occur every day in most organizations. It also produces the documents and updated records that result from the transactions. Might be mainframe or midrange based, or a two-tier or three-tier client/server system, or might involve the use of SOA.
Online Processing
Processes each transaction as it happens, meaning that updates happen instantly. A fully implemented system is also called an interactive system because the user is directly interacting with the computer.
Data Analysis and Presentation (Data Analysis and Presentation Applications)
Provide data and information to authorized persons. Data might be input to a decision support system or executive information system. Data and the way they are presented should be independent, and those who determine the format for presentation should not necessarily control the location and format for capture and storage of data.
Virtualization
1. Server Virtualization 2. Desktop Virtualization
Poor systems development productivity is frequently due to a lack of data management and some methods, such as prototyping, cannot work unless the source of data is clear and the data is available.
Systems development time is greatly enhanced by the reuse of data and programs as new applications are designed and built. Unless data are cataloged, named in standard ways, protected but accessible to those with a need to know, and maintained with high quality, the data and the programs that capture and maintain them cannot be reused.
Determine Retention & Dispose
The business must decide, on legal and other grounds, how much data history needs to be kept. Keeping data too long is not only costly in terms of storage space, but the use of out-of-date data can bias forecasts and other analyses.
Meta Data Repository or Data Dictionary/Directory (DD/D)
The central repository of data about data helps users learn more about organizational databases. Database management systems should also use this model to access and authorize use of data.
Control Quality & Integrity
The concept of application independence implies that such controls must be stored as part of the data definitions and enforced during data capture and maintenance. The more data are used to support organizational operation, the cleaner the data should be. The quality of the data has a direct relationship to the quality of the processes performed by these systems.
Shared
The data that is exchanged between different user groups, and hence there must be agreements on the definition, format, and timing for the exchange of data among those sharing. It exists because of a dependency between different organizational units or functions.
Integrity Rules
The permissible range or set of values must be clear for each data element. These rules add to the meaning of the data conveyed by data definitions and names. A central and single standard for valid values can be used by those developing all data capture applications to detect mistakes.
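A minimal sketch of a central, single standard for valid values that any data capture application could call. The element names, ranges, and the violations helper are hypothetical examples.

```python
# Hypothetical central standard: one rule per data element, shared by all
# data capture applications instead of being re-coded in each one.
INTEGRITY_RULES = {
    "quantity": lambda v: isinstance(v, int) and 1 <= v <= 1000,  # permissible range
    "state": lambda v: v in {"IN", "IL", "OH"},                   # permissible set
}


def violations(record):
    """Return the sorted names of elements in `record` whose values break a rule."""
    return sorted(
        name
        for name, value in record.items()
        if name in INTEGRITY_RULES and not INTEGRITY_RULES[name](value)
    )
```

For instance, violations({"quantity": 0, "state": "TX"}) flags both elements, while a record with quantity 5 and state "IN" passes cleanly.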
Database Administrator (DBA)
The primary person responsible for the management of computer databases. They might be placed in a technical unit that supports various system software and hardware.
Data Administrator
Their job is to lead the efforts in data management. The role often reports as a staff unit to the IS director, although other structures are possible.
Usage Rights
These standards prescribe who can do what and when to each type of data.
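One way to picture such a standard is a lookup table keyed by role and data type. The roles, data types, and actions below are hypothetical, and the "when" dimension is left out to keep the sketch short.

```python
# Hypothetical usage-rights table: (role, data type) -> permitted actions.
USAGE_RIGHTS = {
    ("payroll_clerk", "salary"): {"read", "update"},
    ("sales_manager", "salary"): set(),
    ("sales_manager", "customer"): {"read", "update"},
}


def is_allowed(role, data_type, action):
    """Return True only if the role was explicitly granted the action."""
    # Anything not listed in the table is denied by default.
    return action in USAGE_RIGHTS.get((role, data_type), set())
```

Denying by default means a new role or data type has no access until a right is granted explicitly.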
Enterprise Modeling
Top-down approach that involves describing the organization and its data requirements at a very high level, independent of particular reports, screens, or detailed descriptions of data processing requirements. The work of the organization is divided into its major functions. Each of these functions is then further divided into processes and each process into activities, and each activity is described at a very high level. The lists of entities are then checked to make sure that consistent names are used and the meaning of each entity is clear. Based on general business policies and rules of operation, relationships between the entities are identified and a corporate model is drawn. It is not biased by a lot of details, current databases and files, or how the business actually operates today. It is future oriented and should identify a comprehensive set of generic data requirements. On the other hand, it can be incomplete or inaccurate because it might ignore some important details.
Batch Processing
Transactions are accumulated over the day then processed together at one time. The major problem with this is that the master file is only up to date at the beginning of the day.
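A minimal sketch of why the master file goes stale under batch processing, assuming an inventory master file modeled as a dictionary. All names are hypothetical.

```python
master = {"widget": 10}   # master file: units on hand
pending = []              # transactions accumulated during the day


def record_sale(item, qty):
    # Batch processing: the transaction is only queued, not applied.
    pending.append((item, qty))


def run_batch():
    # At one time (for example, overnight), apply everything that accumulated.
    for item, qty in pending:
        master[item] -= qty
    pending.clear()


record_sale("widget", 3)
record_sale("widget", 2)
stale = master["widget"]   # master file does not yet reflect today's sales
run_batch()
fresh = master["widget"]   # master file is current again after the batch run
```

An online system would instead subtract the quantity inside record_sale, so the master file would never lag behind the transactions.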
Service-Specific Software
Used to carry out a specific service, such as email or the World Wide Web's HTTP.
Identifier
a characteristic of a business object or event that uniquely distinguishes one instance of this entity from every other instance. It must be unique and stable for a long time.
Corporate Data Model
a chart that describes all of the data requirements in a given organization. This chart shows what data entities and relationships between the entities are important for the organization. Priorities are set for what parts of the chart are in need of the greatest improvement, and more detailed work assignments are defined to describe these more clearly and to revise databases accordingly.
Data Standards
a clear and useful way to uniquely identify every instance of data and to give unambiguous business meaning to all data.
Web Services
a collection of technologies built around the XML standard of communications, used interchangeably with SOA
Data Architecture
a map or blueprint for organizational data. A data model shows the data entities and relationships that are important to an organization.
Database Programming (4 GL)
a nonprocedural language, such as SQL (Structured Query Language), which is standardized by the International Organization for Standardization (ISO). Adopting such an international standard for writing queries means that no or minimal changes will need to be made in the program if you change DBMS.
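To illustrate what "nonprocedural" means, the sketch below computes the same total two ways using Python's built-in sqlite3 module. The orders table and both function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", 100.0), (2, "Baker", 40.0), (3, "Acme", 60.0)],
)


def total_procedural(customer):
    """Procedural style: the program spells out HOW to find the answer."""
    total = 0.0
    for _order_id, name, amount in conn.execute(
        "SELECT order_id, customer, amount FROM orders"
    ):
        if name == customer:
            total += amount
    return total


def total_declarative(customer):
    """Nonprocedural style: the query states WHAT is wanted; the DBMS decides how."""
    row = conn.execute(
        "SELECT SUM(amount) FROM orders WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0]
```

Because the second version relies only on standard SQL, the query text is the part most likely to carry over unchanged to a different DBMS.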
Data Warehousing Appliance
a packaged solution consisting of hardware and software, where the software has been specifically pre-installed and pre-optimized for data warehousing.
Server Virtualization
a physical server is split into multiple virtual servers. Each virtual server can run its own full-fledged operating system, and these systems can be different from one virtual server to the next. The physical server typically runs a hypervisor program to create the virtual servers and manage the resources of the various operating systems. Each virtual server can be employed as if it were a stand alone physical server, reducing the number of physical servers needed.
Database/Database Administration
a special management unit, that provides overall organizational leadership in the data management function.
Enterprise Resource Planning (ERP)
a subset of transaction processing systems. A set of integrated business applications, or modules, that carry out common business functions such as general ledger accounting, accounts payable, accounts receivable, etc. Usually purchased from a software vendor. These modules have been designed to reflect a particular way of doing business-- a particular set of business processes. Systems are based on a value chain view of the business in which functional departments coordinate their work.
Vertically Integrated Systems
a system that serves more than one vertical level in an organization or an industry, this is an important characteristic of applications.
Column Store Database
an alternate way of setting up a database where all of the values for the first attribute (column) are stored together, followed by all of the values for the second attribute, and so on, rather than storing all of the attributes of one record together. Operational databases, which are used primarily for transaction processing, are almost always row-store, while an increasing number of data warehouses, which are used for a variety of purposes including querying, are using this method.
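The difference between the two layouts can be sketched with ordinary Python structures. The sample records and variable names are hypothetical, and real column-store engines add compression and indexing on top of this basic idea.

```python
records = [
    {"id": 1, "region": "East", "sales": 100},
    {"id": 2, "region": "West", "sales": 250},
    {"id": 3, "region": "East", "sales": 75},
]

# Row store: all values of one record sit together (good for transactions
# that read or write a whole record at a time).
row_store = [tuple(r.values()) for r in records]

# Column store: all values of one attribute sit together (good for queries
# that scan a single attribute across many records).
column_store = {name: [r[name] for r in records] for name in records[0]}

# An analytical query touches only the one column it needs.
total_sales = sum(column_store["sales"])
```

Summing sales from row_store would require visiting every field of every record, which is why warehouses that mostly run such queries lean toward the column layout.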
Service Oriented Architecture (SOA)
an application architecture based on a collection of functions, or services, where these services can communicate with one another. Used interchangeably with Web Services
Three Tiers
an application server that is separate from the database server is employed. The user interface is housed on the client, usually a PC (tier 1); the processing is performed on a midrange system or high-end PC operating as the application server (tier 2); and the data are stored on a large machine (mainframe or midrange computer) that operates as the database server (tier 3). Programmers use C and C++ to develop the tier 1 and tier 3 components of the system. With a Citrix approach, applications execute on a server and are merely displayed on the client, with the client acting as a "dumb" terminal.
Disposable Applications
an application that can be discarded when it becomes obsolete without affecting the operation of any other application; this is made possible by application independence.
Functional Information Systems
an information system, usually composed of multiple interrelated subsystems, that provides the information necessary to accomplish various tasks within a specific functional area of the business, such as production, marketing, accounting, personnel, or engineering.
Core
are those that require an organization-wide definition and sourcing. If there are multiple copies, the creation of these copies is carefully planned and managed.
Customer Relationship Management (CRM)
attempts to provide an integrated approach to all aspects of interaction a company has with its customers, including marketing, sales, and support. The goal of this system is to use technology to forge a strong relationship between a business and its customers. The business is seeking to better manage its own enterprise around customer behaviors. Software packages enable organizations to market to, sell to, and service customers across multiple channels, including the web, call centers, field representatives, business partners, and retail and dealer networks.
Two Tiers
client and server. Fat client/thin server: most of the processing is done on the client. Thin client/fat server: most of the processing is done on the server.
Metadata
data about data needed to unambiguously describe data for the enterprise. Documents the meaning and all the business rules that govern data. These rules come from the nature of the organization, so business managers are typically the source of the knowledge to develop these rules. You can purchase business rules and the repository software to help with management.
Managerial Support Systems
designed to support a specific manager or a small group of managers
Transborder Data Flows
electronic movements of data that cross a country's national boundary for processing, storage, or retrieval of that data in a foreign country. All data is subject to the exporting country's laws.
Desktop Virtualization
everything the user sees and uses on a PC desktop is separated from the physical desktop machine and accessed through a client/server computing model. This virtualized desktop environment is stored on a server rather than on the local storage of the desktop device; all programs, applications, and data are kept on the server, and all programs and applications are run on the server. The server does almost all of the work, so a thin client model is appropriate.
Middleware in Client/Server Systems
functions as the "/" in client/server systems. It falls into one of three categories: 1. Server Operating Systems 2. Transport Stack Software 3. Service-Specific Software
Data warehouse
is created when a firm pulls data from its operational systems-- the transaction processing systems-- and puts the data in a separate warehouse so that many users may access and analyze the data without endangering the operational systems. Thus it is the establishment and maintenance of a large data storage facility containing data on all aspects of the enterprise. It must be accurate, current, and stored in a usable form. It must also have easy-to-use data access and analysis tools for managers and other users to encourage full use of the data.
Service
is a function that is well defined and self-contained, and that does not depend on the context or state of other services.
Warehouse Construction Software
is required to extract relevant data from the operational databases, make sure the data are clean, and load the data into the data warehouse.
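The extract, clean, and load steps can be sketched as three small functions. The sample rows, the cleaning rules, and the function names are hypothetical; real warehouse construction tools are far more elaborate.

```python
# Rows as they might sit in an operational database (hypothetical data).
operational = [
    {"customer": " acme ", "amount": "100.0"},
    {"customer": "Baker", "amount": "n/a"},   # dirty: amount is not numeric
    {"customer": "ACME", "amount": "60"},
]


def extract(rows):
    """Pull the relevant rows out of the operational source."""
    return list(rows)


def clean(rows):
    """Drop rows with unusable amounts and standardize customer names."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows that cannot be repaired
        cleaned.append({"customer": row["customer"].strip().title(), "amount": amount})
    return cleaned


def load(rows, warehouse):
    """Load cleaned rows into the warehouse, summarized by customer."""
    for row in rows:
        warehouse[row["customer"]] = warehouse.get(row["customer"], 0.0) + row["amount"]


warehouse = {}
load(clean(extract(operational)), warehouse)
```

After the run, the two differently spelled Acme rows are merged under one name and the unusable Baker row is left out.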
Corporate Information Policy
is the foundation for managing ownership of data
Data Governance
is the organizational process for establishing strategy, objectives, and policies for organizational data-- that is to oversee data stewardship, even overseeing local data stewards responsible for similar activities for specific data subject areas or business units.
Logical
is the view or understanding of data needed to use data. The data is presented to users via query languages and applications.
Warehouse Operation Software
is used to store the data and manage the data warehouse.
Data Pyramid
although new data can enter at any level, most new data are captured at the base in operational databases. These databases contain the business transaction history of customer orders, purchases from suppliers, internal work orders, changes to the general ledger, personnel transfers, and other day-to-day business activities.
Warehouse Access and Analysis Software
permits the user to produce customized reports from the data warehouse, perhaps on a regular basis, as well as query the data warehouse to answer specific questions.
Client/Server Systems
processing power is distributed between a central server computer and any number of client computers. The split in responsibilities between the server and client varies considerably from application to application, but the client usually provides the GUI, accepts data entry, and displays the immediate output, while the server maintains the database against which new data are processed. The processing of the transaction may occur on either the client or a server.
Inline Processing
provides for online data entry, but the actual processing of the transaction is deferred until a batch has been accumulated.
Master Data Management (MDM)
refers to the disciplines, technologies, and methods to ensure the currency, meaning, and quality of reference data within and across various subject areas. This ensures that everyone knows the current description of a product, the current salary of an employee, and the current billing address of a customer. It does not address sharing transactional data, such as customer purchases. It determines the best source for each piece of data and makes sure that all applications reference the same virtual "golden record."
Distributed Systems/Distributed Data Processing
refers to the mode of delivery rather than a traditional class of applications like transaction processing or decision support systems. The processing power is distributed to multiple sites, which are then tied together via LANs, WANs and telecommunication lines. Systems in which computers of some size are located at various physical sites at which the organization does business and in which the computers are linked by telecommunication lines of some sort in order to support some business process.
Application Independence
separation or decoupling of data from application systems.
Business Processes
the chain of activities required to achieve an outcome such as order fulfillment or materials acquisition. A focus on this makes it easier to recognize where formerly distinct information systems are related, and thus where they should be integrated.
Semantic
the meta data that describe organization data. Critical issues: clarity, consistency, and sharability.
Normalization
the process of creating simple data structures from more complex ones. It consists of a set of rules that yields a data structure that is very stable and very useful across many different requirements. It is used as a tool to rid data of troublesome anomalies associated with inserting, deleting, and updating data. Using this method, a database can evolve with very few changes to the parts that have already been developed and populated. After the process is completed, each user view is integrated into one comprehensive description. The enterprise model and view integrated model are reconciled, and the final data model is developed.
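A small sketch of the update anomaly that normalization removes, using plain Python structures. The order and customer data are hypothetical, and this shows only one informal step rather than the full set of normal-form rules.

```python
# Unnormalized: the customer's city is repeated on every order row, so a
# change of city would have to be applied to many rows (update anomaly).
orders_flat = [
    {"order_id": 1, "customer_id": "C1", "customer_city": "Gary", "amount": 100},
    {"order_id": 2, "customer_id": "C1", "customer_city": "Gary", "amount": 40},
    {"order_id": 3, "customer_id": "C2", "customer_city": "Muncie", "amount": 60},
]

# Normalized: customer facts live in one structure, order facts in another.
customers = {}
orders = []
for row in orders_flat:
    customers[row["customer_id"]] = {"city": row["customer_city"]}
    orders.append(
        {"order_id": row["order_id"], "customer_id": row["customer_id"], "amount": row["amount"]}
    )

# The city is now updated in exactly one place...
customers["C1"]["city"] = "Hammond"

# ...and every order sees the new value through its customer_id.
cities = [customers[o["customer_id"]]["city"] for o in orders]
```

In the flat structure the same change would have required touching two rows, with the risk of updating only one of them.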
Hosted/ On Demand Solutions (SaaS)
the software runs on vendor hardware, and the customer pays a subscription fee on a per user, per month basis to use the application.
Data Capture
these applications gather data and populate the database. They store and maintain data in the data pyramid. Each datum is captured once and fully tested for accuracy and completeness. Localized applications are developed for data with an isolated use or data for which coordination across units is not required. An inventory of data elements must be maintained of all database contents.
Local
those that have relevance to only a single user or small group of organization members. They do not need extensive control and do not need to follow organizational standards. They may have a limited life and use, and it is acceptable that they may be inconsistent and duplicated across the organization.
XML (eXtensible markup language)
used to describe the structure of data and to label data being exchanged between computer programs. It has essentially become the standard for e-commerce data exchange because neither system needs to know anything about the database technology each is using. As long as the different organizations agree on schema for data and what labels to use for different pieces of data, data can be exchanged. It is also the basis for web services.
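A minimal sketch of two programs exchanging labeled data, using Python's standard xml.etree.ElementTree module. The order, customer, and amount labels are a hypothetical schema that the two parties are assumed to have agreed on.

```python
import xml.etree.ElementTree as ET

# Sender: build a message using the agreed labels (hypothetical schema).
order = ET.Element("order", attrib={"id": "1001"})
ET.SubElement(order, "customer").text = "Acme Corp"
ET.SubElement(order, "amount", attrib={"currency": "USD"}).text = "160.00"
message = ET.tostring(order, encoding="unicode")

# Receiver: parse the text and read values by label. The receiver needs no
# knowledge of the database technology the sender used to produce the data.
parsed = ET.fromstring(message)
customer = parsed.findtext("customer")
amount = float(parsed.findtext("amount"))
currency = parsed.find("amount").get("currency")
```

Only the labels and structure are shared; each side is free to store the data however it likes.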
Middleware
will allow SQL in programs on one computer to access data on another computer when software on both computers understands SQL, even if the SQL processors on the different computers come from different vendors.