Data Management CORE exam prep
Regulatory Compliance scope activities
Communicating, monitoring, enforcing.
Business Intelligence Program Manager
Coordinates BI requirements and initiatives across the corporation and integrates them into a cohesive prioritized program and roadmap.
Government and Regulatory Bodies
Data Management rules of engagement in the market are specified and enforced by various government and regulatory bodies. Privacy, confidentiality, and the handling of proprietary data and information are key areas.
Recurring Themes: Data Integration
Every data management function contributes to and benefits from data integration techniques, managing data assets through minimizing redundancy, consolidating data from multiple sources, and ensuring consistency across controlled redundant data with a "golden version".
Corporate Information Factory
Inmon, along with Claudia Imhoff and Ryan Sousa, identified and wrote about components of a corporate data architecture for DW-BIM, coining what term?
Data Security Management Component Function
Ensuring privacy, confidentiality and appropriate access.
Meta-data Management Component Function
Integrating, controlling and providing meta-data.
Document and Content Management Component Function
Managing data found outside of databases.
Reference and Master Data Management Component Function
Managing golden versions and replicas.
Data Governance Office (DGO)
A staff organization in larger enterprises supporting the efforts of the Data Governance Council, Data Stewardship Steering Committees, and Data Stewardship Teams.
Controllability (1 of 6 characteristics of reasonable data quality metrics)
This characteristic of reasonable metrics defined in the context of the types of data quality dimensions expresses that any measurable characteristic of information that is suitable as a metric should reflect some controllable aspect of the business. In other words, the assessment of the data quality metric's value within an undesirable range should trigger some action to improve the data being measured.
Accountability / Stewardship (1 of 6 characteristics of reasonable data quality metrics)
This characteristic of reasonable metrics defined in the context of the types of data quality dimensions is associated with defined roles indicating notification of the appropriate individuals when the measurement for the metric indicates that the quality does not meet expectations. The business process owner is essentially the one who is accountable, while a data steward may be tasked with taking appropriate corrective action.
Business Relevance (1 of 6 characteristics of reasonable data quality metrics)
This characteristic of reasonable metrics defined in the context of the types of data quality dimensions expresses that a metric is of limited value if it cannot be related to some aspect of business operations or performance. Therefore, every data quality metric should demonstrate how meeting its acceptability threshold correlates with business expectations.
Data Stewardship Steering Committee(s)
One or more cross-functional groups of coordinating data stewards responsible for support and oversight of a particular data management initiative launched by the Data Governance Council, such as Enterprise Data Architecture, Master Data Management, or Meta-data Management.
Data Stewardship Team(s)
One or more temporary or permanent focused groups of business data stewards collaborating on data modeling, data definition, data quality requirement specification and data quality improvement, reference and master data management, and meta-data management, typically within an assigned subject area, led by a coordinating data steward in partnership with a data architect and a data stewardship facilitator.
Data Management Service Organization(s)
One or more units of data management professionals responsible for data management within the IT organization. A centralized organization is sometimes known as an Enterprise Information Management (EIM) Center of Excellence (COE).
Data Lifecycle Stages
Plan, Specify, Enable, Create, Use, Archive, Purge
Data Operations Management Component Function
Planning, control, and support for structured data assets across the data lifecycle, from creation and acquisition through archival and purge.
Data Security Management Component Function
Planning, development, and execution of security policies and procedures to provide proper authentication, authorization, access, and auditing of data and information.
Data Quality Management Component Function
Planning, implementation, and control activities that apply quality management techniques to measure, assess, improve, and ensure the fitness of data for use.
Meta-data Management Component Function
Planning, implementation, and control activities to enable easy access to high quality, integrated meta-data.
Reference and Master Data Management Component Function
Planning, implementation, and control activities to ensure consistency with a "golden version" of contextual data values.
Document and Content Management Component Function
Planning, implementation, and control activities to store, protect, and access data found within electronic files and physical records (including text, graphics, images, audio, and video).
Data Warehousing and Business Intelligence Management Component Function
Planning, implementation, and control processes to provide decision support data and support for knowledge workers engaged in reporting, query and analysis.
Data Governance Component Function
Planning, supervision and control over data management and use.
Communication scope activities
Promoting, building awareness and appreciation
Architectural Framework
Provides the "architecture for architecture" and a way of thinking about and understanding architecture, and the structures or systems requiring architecture.
Data operations management Component Function
Providing support from data acquisition to purging.
Measurability (1 of 6 characteristics of reasonable data quality metrics)
This characteristic of reasonable metrics defined in the context of the types of data quality dimensions requires that the metric be measurable and quantifiable within a discrete range. Note that while many things are measurable, not all translate into useful metrics, implying the need for business relevance.
For General Audiences (1 of 5 data security confidentiality classification levels)
This confidentiality level is described as information available to anyone, including the general public.
Internal Use Only (1 of 5 data security confidentiality classification levels)
This confidentiality level is described as information limited to employees or members, but with minimal risk if shared. It may be shown or discussed, but not copied outside the organization.
Restricted Confidential (1 of 5 data security confidentiality classification levels)
This confidentiality level is described as information limited to individuals performing certain roles with a "need to know", and may require individuals to qualify through clearance.
Registered Confidential (1 of 5 data security confidentiality classification levels)
This confidentiality level is described as information so confidential that anyone accessing it must sign a legal agreement to access the data and assume responsibility for its secrecy.
Confidential (1 of 5 data security confidentiality classification levels)
This confidentiality level is described as information which should not be shared outside the organization. Client information may not be shared with other clients.
Build and test information products
Tasks that data professionals including DBAs should collaborate with software developers on: · Implementing mechanisms for integrating data from multiple sources, along with the appropriate meta-data to ensure meaningful integration of the data. · Implementing mechanisms for reporting and analyzing the data, including online and web-based reporting, ad-hoc querying, BI scorecards, OLAP, portals, and the like. · Implementing mechanisms for replication of the data, if network latency or other concerns make it impractical to service all users from a single data source. Software developers are responsible for coding and testing programs, including database access calls. Software developers are also responsible for creating, testing, and maintaining information products, including screens and reports. Testing includes unit, integration, and performance testing.
A Data Management Scope Statement (1 of 3 data strategy deliverables)
The goals and objectives for some planning horizon (usually 3 years), and the roles, organizations, and individual leaders accountable for achieving these objectives.
A Data Management Program Charter (1 of 3 data strategy deliverables)
The overall vision, business case, goals, guiding principles, measures of success, critical success factors, recognized risks, etc.
Data Operations Management Function
This data management function can be defined as the development, maintenance, and support of structured data to maximize the value of the data resources to the enterprise and includes two sub-functions: database support and data technology management.
Data Governance Council
The primary and highest authority organization for data governance in an organization. Includes senior managers serving as executive data stewards, along with the DM Leader and the CIO.
The primary purpose of Data Marts
The primary purpose of this term is to provide data for analysis to knowledge workers.
Normalization
The process of applying rules to organize business complexity into stable data structures with a goal to keep each data element in only one place.
The purpose of a Data Warehouse
The purpose of this term is to integrate data from multiple sources and then serve up that integrated data for BI purposes.
Abstraction
The redefinition of data entities, elements, and relationships by removing details to broaden the applicability of data structures to a wider class of situations, often by implementing super-types rather than sub-types. Using the generic Party Role super-type to represent the Customer, Employee, and Supplier sub-types is an example of applying this term.
Data
The representation of facts as text, numbers, graphics, images, sound or video.
Enterprise Data Architect
The senior data architect responsible for developing, maintaining, and leveraging the enterprise data model.
Service Accounts
Designed as a convenience for administrators, these accounts often come with enhanced privileges and are untraceable to any particular user or administrator. They also raise the risk of data security breaches.
Detailed data design activities
These activities include: · Detailed physical database design, including views, functions, triggers, and stored procedures. · Other supporting data structures, such as XML schemas and object classes. · Information products, such as the use of data in screens and reports. · Data access solutions, including data access objects, integration services, and reporting and analysis services.
Data Security Administrators
These administrators must assess the administrative requirements of software tools, application packages, and IT systems used by the enterprise.
Natural Key Advantages
These are advantages to using what type of key? · Lower overhead: The key fields are already present, not requiring any additional modeling to create or processing to populate. · Ease of change: In RDBMS where the concept of a domain exists, it is easy to make global changes due to changes on the source system. · Performance advantage: Using the values in the unique keys may eliminate some joins entirely, improving performance. · Data lineage: Easier to track across systems, especially where the data travels through more than two systems.
Surrogate Key Advantages
These are advantages to using what type of key? · Performance: Numeric fields sometimes search faster than other types of fields. · Isolation: It is a buffer from business key field changes. The key may not need changing if a field type or length changes on the source system. · Integration: Enables combinations of data from different sources. The identifying key on the source systems usually does not have the same structure as other systems' keys. · Enhancement: Values, such as "Unknown" or "Not Applicable", have their own specific key value in addition to all of the keys for valid rows. · Interoperability: Some data access libraries and GUI functions work better with these keys, because they do not need additional knowledge about the underlying system to function properly. · Versioning: Enables multiple instances of the same dimension value, which is necessary for tracking changes over time. · De-bugging: Supports load issue analysis and re-run capability.
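A minimal Python sketch (hypothetical table and field names) of the isolation and enhancement points above: a load routine assigns surrogate keys to a customer dimension, reserving special key values for "Unknown" and "Not Applicable", so facts never depend directly on the source system's natural key.
```python
# Hypothetical sketch: assigning surrogate keys while loading a customer dimension.
# The natural key comes from the source system; the surrogate key is generated here.
from itertools import count

surrogate_sequence = count(start=1)          # simple generator standing in for a DB sequence
dimension = {}                               # natural_key -> dimension row with surrogate key

# Reserved rows such as "Unknown" / "Not Applicable" get their own surrogate key values.
SPECIAL_ROWS = {"-1": "Unknown", "-2": "Not Applicable"}
for natural_key, name in SPECIAL_ROWS.items():
    dimension[natural_key] = {"customer_sk": next(surrogate_sequence),
                              "source_customer_id": natural_key,
                              "customer_name": name}

def get_or_create_surrogate(source_row):
    """Return the surrogate key for a source record, creating a dimension row if needed."""
    natural_key = source_row["customer_id"]
    if natural_key not in dimension:
        dimension[natural_key] = {"customer_sk": next(surrogate_sequence),
                                  "source_customer_id": natural_key,
                                  "customer_name": source_row["name"]}
    return dimension[natural_key]["customer_sk"]

# Usage: facts reference the surrogate key, not the source system's identifier.
fact_sk = get_or_create_surrogate({"customer_id": "C-1001", "name": "Acme Corp"})
print(fact_sk)   # stable even if the source system later changes its key format
```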
Meta-Data Benefits
These are benefits of what type of data? This data provides the descriptive tags or context on the data (the content) in a managed data environment, and shows business and technical users where to find information in data repositories. It also provides details on where the data came from, how it got there, any transformations, and its level of quality, and it helps explain what the data really means and how to interpret it. 1. Increase the value of strategic information (e.g. data warehousing, CRM, SCM, etc.) by providing context for the data, thus aiding analysts in making more effective decisions. 2. Reduce training costs and lower the impact of staff turnover through thorough documentation of data context, history, and origin. 3. Reduce data-oriented research time by assisting business analysts in finding the information they need in a timely manner. 4. Improve communication by bridging the gap between business users and IT professionals, leveraging work done by other teams, and increasing confidence in IT system data. 5. Increase speed of system development's time-to-market by reducing system development life-cycle time. 6. Reduce risk of project failure through better impact analysis at various levels during change management. 7. Identify and reduce redundant data and processes, thereby reducing rework and use of redundant, out-of-date, or incorrect data.
Data Security Management Function
This data management function involves the planning, development, and execution of security policies and procedures to provide proper authentication, authorization, access, and auditing of data and information assets, and establishes judicious governance mechanisms that are easy enough for all stakeholders to abide by on a daily operational basis.
Reporting services
These services give business users the ability to execute both canned and ad-hoc reports, and have the data made available to them in a number of different ways, such as delivered (published) via email or RSS feed, accessible via web browser or portal, extracted to an Excel spreadsheet, and so on.
Derive data from stored data
This denormalization technique is used to reduce calculation expense at query time, especially for calculations that require data from multiple tables: pre-calculate columns and store the results in a table, either a new table or one of the participants in the calculation.
Data modeling and database design standards
These standards serve as the guiding principles to effectively meet business data needs, conform to data architecture, and ensure data quality. Data architects, data analysts, and database administrators must jointly develop these standards. They must complement and not conflict with related IT standards.
Data Quality Management Governance
These tasks are included in what data management function governance? · Engaging business partners who will work with the data quality team and champion the DQM program. · Identifying data ownership roles and responsibilities, including data governance board members and data stewards. · Assigning accountability and responsibility for critical data elements and DQM. · Identifying key data quality areas to address and directives to the organization around these key areas. · Synchronizing data elements used across the lines of business and providing clear, unambiguous definitions, use of value domains, and data quality rules. · Continuously reporting on the measured levels of data quality. · Introducing the concepts of data requirements analysis as part of the overall system development life cycle. · Tying high quality data to individual performance objectives.
Acceptability (1 of 6 characteristics of reasonable data quality metrics)
This characteristic of reasonable metrics defined in the context of the types of data quality dimensions frames the business requirements for data quality; quantifying quality measurements along the identified dimension provides hard evidence of data quality levels. Base the determination of whether the quality of data meets business expectations on specified acceptability thresholds. If the score is equal to or exceeds the acceptability threshold, the quality of the data meets business expectations. If the score is below the acceptability threshold, notify the appropriate data steward and take some action.
Combine and pre-join tables
This denormalization technique is used to reduce joins: where two tables are joined in a significant number of queries, consider creating a table that already holds the result set of the join of both tables.
Collapse hierarchies (roll-up)
This denormalization technique is used to reduce joins: combine direct-path parent / child relationships into one table, repeating the parent columns in each row.
Horizontally split
This denormalization technique is used to reduce query sets: create subset tables using the value of a column as the differentiator. For example, create regional customer tables that contain only customers in a specific region.
Definitional Conformance (1 of 10 data quality business rule templates)
This business rule template indicates that the same data definitions are understood and used properly in processes across the organization. Confirmation includes algorithmic agreement on calculated fields, including any time or local constraints, and rollup rules.
Mapping conformance (1 of 10 data quality business rule templates)
This business rule template indicates that the value assigned to a data element must correspond to one selected from a value domain that maps to other equivalent corresponding value domain(s). The STATE data domain again provides a good example, since state values may be represented using different value domains (USPS Postal codes, FIPS 2-digit codes, full names), and these types of rules validate that "AL" and "01" both map to "Alabama."
Value domain membership (1 of 10 data quality business rule templates)
This business rule template specifies that a data element's assigned value is selected from among those enumerated in a defined data value domain, such as 2-Character United States Postal Codes for a STATE field.
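A minimal Python sketch (illustrative values only) showing how the value domain membership and mapping conformance templates above might be checked for a STATE data element.
```python
# Illustrative value domains for a STATE data element and the mapping between them.
USPS_CODES = {"AL", "AK", "AZ"}                       # 2-character USPS postal codes
FIPS_TO_NAME = {"01": "Alabama", "02": "Alaska"}      # FIPS 2-digit codes -> full names
USPS_TO_FIPS = {"AL": "01", "AK": "02"}               # equivalence mapping between domains

def check_domain_membership(value: str) -> bool:
    """Value domain membership: the assigned value must come from the defined domain."""
    return value in USPS_CODES

def check_mapping_conformance(usps: str, fips: str) -> bool:
    """Mapping conformance: 'AL' and '01' must map to the same underlying state."""
    return USPS_TO_FIPS.get(usps) == fips

print(check_domain_membership("AL"))          # True
print(check_domain_membership("ZZ"))          # False -> value domain violation
print(check_mapping_conformance("AL", "01"))  # True  -> both map to Alabama
```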
Historical
This characteristic of a data warehouse contrasts with operational systems, which rightfully focus on current-valued data: a hallmark of data warehouses is that they contain a vast amount of historical data (typically 5 to 10 years' worth). The bulk of this data is typically at a summarized level; the older the data, the more summarized it usually is.
Summarized and Detailed Data
This characteristic of a data warehouse requires that it contain detailed data, representing the atomic-level transactions of the enterprise, as well as summarized data.
Time Variant
This characteristic of a data warehouse refers to how every record in the data warehouse is accurate relative to a moment in time, and often shows up as an element of time in the key structure.
Non-Volatile
This characteristic of a data warehouse refers to the fact that updates to records during normal processing do not occur, and if updates occur at all, they occur on an exception basis.
Integrated
This characteristic of a data warehouse refers to the unification and cohesiveness of the data stored in the data warehouse, and covers many aspects, including key structures, encoding and decoding of structures, definitions of the data, naming conventions, and so on.
Subject Orientation
This characteristic of a data warehouse refers to the organization of data along the lines of the major entities of the corporation.
Trackability (1 of 6 characteristics of reasonable data quality metrics)
This characteristic of reasonable metrics defined in the context of the types of data quality dimensions enables an organization to measure data quality improvement over time. Tracking helps data stewards monitor activities within the scope of data quality SLAs, and demonstrates the effectiveness of improvement activities. Once an information process is stable, tracking enables instituting statistical control processes to ensure predictability with respect to continuous data quality.
Vertically split
This denormalization technique is used to reduce query sets: create subset tables that contain subsets of columns. For example, split a customer table into two based on whether the fields are mostly static or mostly volatile (to improve load / index performance), or based on whether the fields are commonly or uncommonly included in queries (to improve table scan performance).
Divide hierarchies (push down)
This denormalization technique is used to reduce query sets: divide parent tables into multiple child tables by type. For example, create customer tables that each contain a different type of customer, such as checking, mortgage, investment, etc.
Repeat columns in one row
This denormalization technique is used to reduce row counts or to enable row-to-row comparisons: create a table with repeated columns. For example, rather than 12 rows for 12 months, have 12 columns, one for each month.
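The denormalization cards above trade controlled redundancy for fewer joins or fewer rows at query time. A small, hypothetical Python sketch of two of the techniques: combining and pre-joining tables, and repeating monthly values as columns in one row.
```python
# Hypothetical normalized inputs.
customers = {1: {"customer_id": 1, "name": "Acme"}}
orders = [{"order_id": 10, "customer_id": 1, "amount": 250.0}]
monthly_sales = [{"customer_id": 1, "month": 1, "total": 100.0},
                 {"customer_id": 1, "month": 2, "total": 150.0}]

# Combine and pre-join: store the result of the customer/order join as one wide row.
pre_joined = [{**o, "customer_name": customers[o["customer_id"]]["name"]} for o in orders]

# Repeat columns in one row: one row per customer with month_1 .. month_12 columns
# instead of up to 12 rows, which simplifies month-to-month comparison.
wide = {}
for row in monthly_sales:
    cust = wide.setdefault(row["customer_id"], {f"month_{m}": 0.0 for m in range(1, 13)})
    cust[f"month_{row['month']}"] = row["total"]

print(pre_joined[0]["customer_name"])   # "Acme" -- no join needed at query time
print(wide[1]["month_2"])               # 150.0
```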
Performance and Ease of Use
This design principle ensures quick and easy access to data by approved users in a usable and business-relevant form, maximizing the business value of both applications and data.
Integrity
This design principle ensures that the data should always have a valid business meaning and value, regardless of context, and should always reflect a valid state of the business.
Security
This design principle ensures that true and accurate data should always be immediately available to authorized users, but only to authorized users. The privacy concerns of all stakeholders, including customers, business partners, and government regulators, must be met.
Reusability
This design principle ensures that, where appropriate, multiple applications would be able to use the data. The database structure should also ensure that multiple business purposes, (such as business analysis, quality improvement, strategic planning, customer relationship management, and process improvement) could use the data.
Maintainability
This design principle ensures that all data work is performed at a cost that yields value, so that the cost of creating, storing, maintaining, using, and disposing of data does not exceed its value to the organization, and that the organization can respond as quickly as possible to changes in business processes and new business requirements.
Type 1 Overwrite (1 of 5 Dimension Attribute Types)
This dimension attribute type has no need for any historical records at all. The only interest is in the current value, so any updates completely overwrite the prior value in the field in that row.
Type 4 New Table (1 of 5 Dimension Attribute Types)
This dimension attribute type initiates a move of the expired row into a 'history' table, and the row in the 'current' table is updated with the current information.
Type 2 New Row (1 of 5 Dimension Attribute Types)
This dimension attribute type needs all historical records. Every time one of these Type fields changes, a new row with the current information is appended to the table, and the previously current row's expiration date field is updated to expire it.
Type 3 New Column (1 of 5 Dimension Attribute Types)
This dimension attribute type needs only a selected, known portion of history. Multiple fields in the same row contain the historical values. When an update occurs, the current value is moved to the next appropriate field, and the last, no longer necessary, value drops off.
Type 6 1+2+3 (1 of 5 Dimension Attribute Types)
This dimension attribute type treats the dimension table as a Type 2, where any change to any value creates a new row, but the key value (surrogate or natural) does not change. One way to implement this type attribute is to add three fields to each row—effective date, expiration date, and a current row indicator.
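A minimal Python sketch (hypothetical field names) contrasting the Type 1 overwrite with the Type 2 new-row approach described above:
```python
from datetime import date

# A single dimension row; Type 2 rows carry effective/expiration dates and a current flag.
dimension_rows = [{"customer_sk": 1, "customer_id": "C-1", "city": "Austin",
                   "effective_date": date(2020, 1, 1), "expiration_date": None,
                   "is_current": True}]

def type1_overwrite(rows, customer_id, new_city):
    """Type 1: no history kept -- simply overwrite the value in place."""
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["city"] = new_city

def type2_new_row(rows, customer_id, new_city, change_date):
    """Type 2: expire the current row and append a new row holding the new value."""
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["expiration_date"] = change_date
            row["is_current"] = False
    rows.append({"customer_sk": len(rows) + 1, "customer_id": customer_id,
                 "city": new_city, "effective_date": change_date,
                 "expiration_date": None, "is_current": True})

type1_overwrite(dimension_rows, "C-1", "Round Rock")          # prior value "Austin" is lost
type2_new_row(dimension_rows, "C-1", "Dallas", date(2024, 6, 1))
print(len(dimension_rows))   # 2 rows: "Round Rock" (expired) and "Dallas" (current)
```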
Document Management System
This type of management system is an application used to track and store electronic documents and electronic images of paper documents. It commonly provides storage, versioning, security, meta-data management, content indexing, and retrieval capabilities. It is used to collect, organize, index, and retrieve information content, storing the content either as components or whole documents while maintaining links between components.
Source-to-target mapping
This type of mapping is the documentation activity that defines data type details and transformation rules for all required entities and data elements, and from each individual source to each individual target.
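A small, hypothetical Python sketch of the idea: the mapping is documented as data (source element, data type details, and transformation rule per target element) and then applied to build target records.
```python
# Hypothetical source-to-target mapping documented as data.
mapping = [
    {"target": "customer_name", "source": "CUST_NM", "type": "VARCHAR(100)",
     "rule": lambda v: v.strip().title()},
    {"target": "state_code", "source": "ST", "type": "CHAR(2)",
     "rule": lambda v: v.upper()},
]

def apply_mapping(source_record: dict) -> dict:
    """Build a target record by applying each documented transformation rule."""
    return {m["target"]: m["rule"](source_record[m["source"]]) for m in mapping}

print(apply_mapping({"CUST_NM": "  acme corp ", "ST": "tx"}))
# {'customer_name': 'Acme Corp', 'state_code': 'TX'}
```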
Matching
This type of master data management attempts to remove redundancy, improve data quality, and provide information that is more comprehensive, and is performed by applying inference rules.
Bi-Directional Meta-data Architecture
This type of meta-data architecture allows meta-data to change in any part of the architecture (source, ETL, user interface) and then feed back from the repository into its original source. The repository is a broker for all updates.
Data Warehousing and BI Management Activities
Understand Business Intelligence Information Needs · Define and Maintain the DW / BI Architecture (same as 2.6) · Implement Data Warehouses and Data Marts · Implement BI Tools and User Interfaces · Process Data for Business Intelligence · Monitor and Tune Data Warehousing Processes · Monitor and Tune BI Activity and Performance
Data Security Management Activities
Understand Data Security Needs and Regulatory Requirements · Define Data Security Policy · Define Data Security Standards · Define Data Security Controls and Procedures · Manage Users, Passwords, and Group Membership · Manage Data Access Views and Permissions · Monitor User Authentication and Access Behavior · Classify Information Confidentiality · Audit Data Security
Subject Area Data Model (SAM)
This data model is a list of major subject areas that collectively express the essential scope of the enterprise. This list is one form of the "scope" view of data (Row 1, Column 1) in the Zachman Framework. At a more detailed level, business entities and object classes can also be depicted as lists.
Completeness (1 of 11 Dimensions of Data Quality)
This dimension of data quality indicates that certain attributes always have assigned values in a data set. Another expectation of this dimension is that all appropriate rows in a dataset are present. Assign completeness rules to a data set in varying levels of constraint: mandatory attributes that require a value, data elements with conditionally optional values, and inapplicable attribute values.
Referential Integrity (1 of 11 Dimensions of Data Quality)
This dimension of data quality is the condition that exists when all intended references from data in one column of a table to data in another column of the same or a different table are valid. The expectations of this dimension include specifying that when a unique identifier appears as a foreign key, the record to which that key refers actually exists. The rules of this dimension also manifest as constraints against duplication, to ensure that each entity occurs once, and only once.
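A minimal Python sketch (illustrative tables) of the two expectations above: foreign key values must exist in the referenced table, and identifiers must not be duplicated.
```python
# Illustrative referential integrity checks.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [{"order_id": 10, "customer_id": 1}, {"order_id": 11, "customer_id": 99}]

customer_ids = {c["customer_id"] for c in customers}

orphans = [o for o in orders if o["customer_id"] not in customer_ids]
has_duplicates = len(customer_ids) != len(customers)

print(orphans)          # [{'order_id': 11, 'customer_id': 99}] -> broken reference
print(has_duplicates)   # False -> each customer identifier occurs once, and only once
```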
Reasonableness (1 of 11 Dimensions of Data Quality)
This dimension of data quality is used to consider consistency expectations relevant within specific operational contexts. For example, one might expect that the number of transactions each day does not exceed 105% of the running average number of transactions for the previous 30 days.
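A small Python sketch of the example above, flagging a daily transaction count that exceeds 105% of the 30-day running average (illustrative numbers):
```python
# Illustrative reasonableness rule: today's transaction count should not exceed
# 105% of the running average of the previous 30 days.
previous_30_days = [1000, 1020, 980, 1010] * 7 + [990, 1005]   # 30 daily counts
today = 1100

threshold = 1.05 * (sum(previous_30_days) / len(previous_30_days))
is_reasonable = today <= threshold
print(round(threshold, 1), is_reasonable)   # flags today's volume if it breaches the threshold
```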
Consistency (1 of 11 Dimensions of Data Quality)
This dimension of data quality refers to ensuring that data values in one data set are consistent with values in another data set. The concept of this dimension is relatively broad; it can include an expectation that two data values drawn from separate data sets must not conflict with each other, or define consistency with a set of predefined constraints.
Accuracy (1 of 11 Dimensions of Data Quality)
This dimension of data quality refers to the degree that data correctly represents the "real-life" entities they model. In many cases, measure this dimension by how the values agree with an identified reference source of correct information, such as comparing values against a database of record or a similar corroborative set of data values from another table.
Currency (1 of 11 Dimensions of Data Quality)
This dimension of data quality refers to the degree to which information is current with the world that it models. This dimension measures how "fresh" the data is, as well as correctness in the face of possible time-related changes. Measure this dimension as a function of the expected frequency rate at which different data elements refresh, as well as verify that the data is up to date. The rules of this dimension define the "lifetime" of a data value before it expires or needs updating.
Precision (1 of 11 Dimensions of Data Quality)
This dimension of data quality refers to the level of detail of the data element. Numeric data may need accuracy to several significant digits. For example, rounding and truncating may introduce errors where exact precision is necessary.
Privacy (1 of 11 Dimensions of Data Quality)
This dimension of data quality refers to the need for access control and usage monitoring. Some data elements require limits of usage or access.
Timeliness (1 of 11 Dimensions of Data Quality)
This dimension of data quality refers to the time expectation for accessibility and availability of information. As an example, measure one aspect of this dimension as the time between when information is expected and when it is readily available for use.
RACI (Responsible, Accountable, Consulted, and Informed)
This matrix helps to establish clear accountability and ownership among the parties involved in the outsourcing engagement, leading to support of the overall data security policies and their implementation.
DW-Bus Matrix
This matrix is a tabular way of showing the intersection of data marts, data processes, or data subject areas with the shared conformed dimensions. The opportunity for conformed dimensions appears where a data mart is marked as using multiple dimensions (the row). The DW-bus appears where multiple data marts use the same dimensions (the column).
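A small, hypothetical sketch of a bus matrix held as data in Python: rows are data marts, columns are dimensions, and the conformed dimensions are the columns shared by more than one mart.
```python
from collections import Counter

# Hypothetical DW-bus matrix: which data marts use which dimensions.
bus_matrix = {
    "Sales":     {"Date", "Product", "Customer"},
    "Inventory": {"Date", "Product", "Warehouse"},
    "Finance":   {"Date", "Account"},
}

# Conformed dimensions are those shared across multiple data marts (the "bus").
usage = Counter(dim for dims in bus_matrix.values() for dim in dims)
conformed = sorted(dim for dim, n in usage.items() if n > 1)
print(conformed)   # ['Date', 'Product']
```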
Assess Existing Meta-data Sources and Information Architecture (1 of 5 Meta-Data Strategy Implementation Phases)
This meta-data strategy implementation phase determines the relative degree of difficulty in solving the meta-data and systems issues identified in the interviews and documentation review. During this stage, conduct detailed interviews of key IT staff and review documentation of the system architectures, data models, etc.
Conduct Key Stakeholder Interviews (1 of 5 Meta-Data Strategy Implementation Phases)
This meta-data strategy implementation phase involves stakeholder interviews to provide a foundation of knowledge for the meta-data strategy. Stakeholders would usually include both business and technical stakeholders.
Meta-data Strategy Initiation and Planning (1 of 5 Meta-Data Strategy Implementation Phases)
This meta-data strategy implementation phase prepares the meta-data strategy team and various participants for the upcoming effort to facilitate the process and improve results. It outlines the charter and organization of the meta-data strategy, including alignment with the data governance efforts, and establishes the communication of these objectives to all parties.
Develop Future Meta-data Architecture (1 of 5 Meta-Data Strategy Implementation Phases)
This meta-data strategy implementation phase refines and confirms the future vision and develops the long-term target architecture for the managed meta-data environment. This phase includes all of the strategy components, such as organization structure, including data governance and stewardship alignment recommendations; managed meta-data architecture; meta-data delivery architecture; technical architecture; and security architecture.
Develop Phased MME Implementation Strategy and Plan (1 of 5 Meta-Data Strategy Implementation Phases)
This meta-data strategy implementation phase reviews, validates, integrates, prioritizes, and agrees to the findings from the interviews and data analyses, and develops the meta-data strategy, incorporating a phased implementation approach that takes the organization from the current environment to the future managed meta-data environment.
Enterprise Application Integration (EAI)
This method enables data to be easily passed from application to application across disparate platforms.
Semantic Modeling
This modeling is a type of knowledge modeling. It consists of a network of concepts (ideas or topics of concern) and their relationships.
First normal form (1NF)
This normalization form ensures each entity has a valid primary key and that every data element depends on the primary key; it removes repeating groups and ensures each data element is atomic (not multi-valued).
Third normal form (3NF)
This normalization form ensures each entity has no hidden primary keys and that each data element depends on no data element outside the key ("the key, the whole key and nothing but the key").
Second normal form (2NF)
This normalization form ensures each entity has the minimal primary key and that every data element depends on the complete primary key.
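A minimal, hypothetical Python sketch of the progression through these normal forms: a repeating group is removed so each element is atomic, and non-key attributes move to the entity whose key they fully depend on.
```python
# Unnormalized: repeating group of line items, and a customer_city that depends on
# customer_id rather than on the order key.
raw_order = {"order_id": 10, "customer_id": 1, "customer_city": "Austin",
             "items": [("SKU-1", 2), ("SKU-2", 1)]}

# 1NF: remove the repeating group -- one atomic row per order line, keyed by (order_id, sku).
order_lines = [{"order_id": raw_order["order_id"], "sku": sku, "qty": qty}
               for sku, qty in raw_order["items"]]

# 2NF / 3NF: non-key attributes move to the entity whose key they fully depend on.
orders = {10: {"order_id": 10, "customer_id": 1}}       # depends on order_id
customers = {1: {"customer_id": 1, "city": "Austin"}}   # city depends on customer_id only

print(order_lines)
print(customers[orders[10]["customer_id"]]["city"])     # city stored in exactly one place
```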
Ralph Kimball
This person defined the data warehouse as "a copy of transaction data specifically structured for query and analysis."
Bill Inmon
This person defined the data warehouse as "a subject oriented, integrated, time variant, and non-volatile collection of summary and detailed historical data used to support the strategic decision-making processes for the corporation."
Data Security Policy
This policy, based on data security requirements, is a collaborative effort involving IT security administrators, data stewards, internal and external audit teams, and the legal department. This policy is also more granular in nature and takes a very data-centric approach.
Data Mining
This predictive analysis tool is a particular kind of analysis that reveals patterns in data using various algorithms.
Data Warehousing Process
This process focuses on enabling an integrated and historical business context on operational data by enforcing business rules and maintaining appropriate business data relationships.
One-to-one relationship
This relationship says that a parent entity may have one and only one child entity.
Monitor Stage (1 of 4 data quality management cycle stages)
This stage of the data quality management lifecycle is for actively monitoring the quality of data as measured against the defined business rules. As long as data quality meets defined thresholds for acceptability, the processes are in control and the level of data quality meets the business requirements. However, if the data quality falls below acceptability thresholds, notify data stewards so they can take action during the next stage.
Act Stage (1 of 4 data quality management cycle stages)
This stage of the data quality management lifecycle is for taking action to address and resolve emerging data quality issues.
The implementer perspective (Component Assemblies)
This stakeholder perspective in the Zachman Framework entails a technology-specific, out-of-context view of how components are assembled and operate, configured by Technicians as Implementers.
The participant perspective (Operations Classes)
This stakeholder perspective in the Zachman Framework entails actual functioning system instances used by Workers as Participants.
The designer perspective (System Logic)
This stakeholder perspective in the Zachman Framework entails logical models detailing system requirements and unconstrained design, represented by Architects as Designers.
The builder perspective (Technology Physics)
This stakeholder perspective in the Zachman Framework entails physical models optimizing the design for implementation for specific use under the constraints of specific technology, people, costs, and timeframes, specified by Engineers as Builders.
The owner perspective (Business Concepts)
This stakeholder perspective in the Zachman Framework entails semantic models of the business relationships between business elements, defined by Executive Leaders as Owners.
The planner perspective (Scope Contexts)
This stakeholder perspective in the Zachman Framework lists business elements defining scopes, identified by Strategists as Theorists.
Information Content Architecture
This term is the process of creating a structure for a body of information or content. It identifies the links and relationships between documents and content, specifies document requirements and attributes, and defines the structure of content in a document or content management system. When creating this architecture, taxonomy meta-data (along with other meta-data) is used.
C.R.U.D
This term is used to depict which roles have responsibility for creating, updating, deleting, and using data about which business entities.
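A small, hypothetical Python sketch of a CRUD matrix held as data, mapping roles to their responsibilities for each business entity:
```python
# Hypothetical CRUD matrix: which roles create, read, update, delete, or use data
# about which business entities.
crud_matrix = {
    "Customer": {"Sales Rep": "CRU", "Billing Clerk": "RU", "Data Steward": "R"},
    "Invoice":  {"Sales Rep": "R", "Billing Clerk": "CRUD"},
}

def can(role: str, action: str, entity: str) -> bool:
    """Check whether a role holds a given responsibility (C, R, U, or D) for an entity."""
    return action in crud_matrix.get(entity, {}).get(role, "")

print(can("Billing Clerk", "D", "Invoice"))   # True
print(can("Sales Rep", "D", "Customer"))      # False
```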
Data Warehousing
This term is used to describe the operational extract, cleansing, transformation, and load processes—and associated control processes—that maintain the data contained within a data warehouse.
Authorization (1 of 4 data security requirements categories)
This term means to identify the right individuals and grant them the right privileges to specific, appropriate views of data.
Authentication (1 of 4 data security requirements categories)
This term means to validate users are who they say they are.
ANSI / NISO Z39.19-2005
This term provides the Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.
Digital Asset Management
This term refers to the management of digital assets such as audio, video, music, and digital photographs. Tasks involve cataloging, storage, and retrieval of digital assets.
Glossary
This term typically provides guidance for use of terms, and a thesaurus can direct the user through structural choices involving three kinds of relationships: equivalence, hierarchy, and association.
Scorecards
This tool is a specialized type of analytics display that indicates scores or calculated evaluations of performance.
Dashboards
This tool is a type of user interface designed to display a wide array of analytics indicators, such as charts and graphs, efficiently. The user can "drill down" through these indicators to view the data beneath.
Data Profiling Tools
This tool is mostly used to define data rules for validation, assessing frequency distributions and corresponding measurements, and then applying the defined rules against the data sets.
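A minimal Python sketch of the profiling loop described above (illustrative data): compute a frequency distribution for a column, derive a validation rule from it, and apply the rule to measure the failure rate.
```python
from collections import Counter

# Illustrative data set to profile.
rows = [{"state": "AL"}, {"state": "AL"}, {"state": "TX"}, {"state": "zz"}, {"state": None}]

# Frequency distribution for the column of interest.
frequency = Counter(r["state"] for r in rows)
print(frequency)   # Counter({'AL': 2, 'TX': 1, 'zz': 1, None: 1})

# Data rule defined for validation, then applied against the data set.
valid_states = {"AL", "TX"}
failures = [r for r in rows if r["state"] not in valid_states]
print(len(failures) / len(rows))   # 0.4 -> share of rows failing the rule
```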
Strategic BI
This type of BI involves providing metrics to executives, often in conjunction with some formal method of business performance management, to help them determine if the corporation is on target for meeting its goals. Use this BI to support long-term corporate goals and objectives.
Tactical BI
This type of BI is defined as the application of BI tools to analyze business trends by comparing a metric to the same metric from a previous month or year, etc. or to analyze historical data in order to discover trends that need attention. Use this BI to support short-term business decisions.
Master Data Management
This type of data management is explained as control over master data values to enable consistent, shared, contextual use across systems, of the most accurate, timely, and relevant version of truth about essential business entities.
Document and Content Management
This type of data management is the control over capture, storage, access, and use of data and information stored outside relational databases, and focuses on integrity and access.
Dimensional Data Modeling
This type of data modeling is the preferred modeling technique for designing data marts; in general, it focuses on making it simple for the end user to understand and access the data. This type of data modeling is a subset of entity relationship data modeling, and has the basic building blocks of entities, attributes, and relationships. The entities come in two basic types: facts, which provide the measurements; and dimensions, which provide the context.
Cross Column Analysis
This type of data profiling analysis can expose embedded value dependencies.
Inter-Table Analysis
This type of data profiling analysis explores overlapping value sets that may represent foreign key relationships between entities.
Location Master Data
This type of data provides the ability to track and share reference information about different geographies, and create hierarchical relationships or territories based on geographic information to support other processes.
Unstructured Data
This type of data references any document, file, graphic, image, text, report, form, video, or sound recording that has not been tagged or otherwise structured into rows and columns or records.
Passive Monitoring
This type of data security monitoring tracks changes over time by taking snapshots of the current state of a system at regular intervals and comparing trends against a benchmark or defined set of criteria, and is considered an assessment mechanism.
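A small, hypothetical Python sketch of the approach: periodic snapshots of security-relevant metrics are compared against a benchmark rather than inspecting individual transactions.
```python
# Sketch of passive monitoring: compare snapshot trends against a defined benchmark.
benchmark = {"privileged_accounts": 5, "failed_logins_per_day": 20}

snapshots = [
    {"taken": "2024-06-01", "privileged_accounts": 5, "failed_logins_per_day": 18},
    {"taken": "2024-07-01", "privileged_accounts": 7, "failed_logins_per_day": 55},
]

def exceptions(snapshot: dict, baseline: dict) -> list:
    """Return the metrics in a snapshot that drift above the benchmark."""
    return [k for k, limit in baseline.items() if snapshot[k] > limit]

for snap in snapshots:
    print(snap["taken"], exceptions(snap, benchmark))
# 2024-06-01 []
# 2024-07-01 ['privileged_accounts', 'failed_logins_per_day']
```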
Process Meta-Data (1 of 4 types of meta-data)
This type of meta-data defines and describes the characteristics of other system elements (processes, business rules, programs, jobs, tools, etc.).
Value Domain
This type of domain is defined as the set of allowable data values.
Vocabulary Management
This type of management is described as defining, sourcing, importing, and maintaining a vocabulary and its associated reference data.
Affiliation Management
This type of management is the establishment and maintenance of relationships between master data records. Examples include ownership affiliations (such as Company X is a subsidiary of Company Y, a parent-child relationship) or other associations (such as Person XYZ works at Company X).
Document / Record Management
This type of management is the lifecycle management of the designated significant documents of the organization.
Meta-Data Management
This type of management is the set of processes that ensure proper creation, storage, integration, and control to support associated usage of meta-data.
Document Management
This type of management is the storage, inventory, and control of electronic and paper documents. And encompasses the processes, techniques, and technologies for controlling and organizing documents and records, whether stored electronically or on paper.
Records Management
This type of management manages paper and microfiche / film records from their creation or receipt through processing, distribution, organization, and retrieval, to their ultimate disposition.
Content Management
This type of management refers to the processes, techniques, and technologies for organizing, categorizing, and structuring access to information content, resulting in effective retrieval and reuse.
What are other ways to create meaningful information by interpreting the context around data?
1. The business meaning of data elements and related terms. 2. The format in which the data is presented. 3. The timeframe represented by the data. 4. The relevance of the data to a given usage.
Data Management Function Goals
1. To understand the information needs of the enterprise and all its stakeholders. 2. To capture, store, protect, and ensure the integrity of data assets. 3. To continually improve the quality of data and information, including: o Data accuracy. o Data integrity. o Data integration. o The timeliness of data capture and presentation. o The relevance and usefulness of data. o The clarity and shared acceptance of data definitions. 4. To ensure privacy and confidentiality, and to prevent unauthorized or inappropriate use of data and information. 5. To maximize the effective use and value of data and information assets. Other non-strategic goals of data management include: 6. To control the cost of data management. 7. To promote a wider and deeper understanding of the value of data assets. 8. To manage information consistently across the enterprise. 9. To align data management efforts and technology with business needs.
Data Integration Specialist
A software designer and developer responsible for implementing systems to integrate (replicate, extract, transform, load) data assets in batch or near real time.
Data Stewardship Facilitator
A business analyst responsible for coordinating data governance and stewardship activities.
Analytics / Report Developer
A software developer responsible for creating reporting and analytical application solutions.
Coordinating Data Steward
A business data steward with additional responsibilities, who will: 1. Provide business leadership for a Data Stewardship Team. 2. Participate on a Data Stewardship Steering Committee. 3. Identify business data steward candidates. 4. Review and approve changes to reference data values and meanings. 5. Review and approve logical data models. 6. Ensure application data requirements are met. 7. Review data quality analysis and audits.
Data Entity
A collection of data about something that the business deems important and worthy of capture.
Data Warehouse Architect
A data architect responsible for data warehouses, data marts, and associated data integration processes.
Data Strategy
A data management program strategy-a plan for maintaining and improving data quality, integrity, security, and access.
Strategic Plan
A high-level course of action to achieve high-level goals.
Composite Key
A key containing two or more attributes where one of the candidate keys becomes the primary key.
Surrogate Key
A key that contains an artificially generated (system-assigned) value uniquely assigned to an entity instance.
Foreign Key
A key that is an attribute providing a link to another entity; it appears in both entities in a relationship and partially or fully identifies either one or both of the entities.
Business Data Steward
A knowledge worker and business leader recognized as a subject matter expert who is assigned accountability for the data specifications and data quality of specifically assigned business entities, subject areas or databases.
John Zachman Framework
A logical structure for identifying and organizing descriptive representations (models) used to manage enterprises and develop systems.
Executive Data Steward
A role held by a senior manager sitting on the Data Governance Council, who will: 1. Serve as an active Data Governance Council member. 2. Represent departmental and enterprise data interests. 3. Appoint coordinating and business data stewards. 4. Review and approve data policies, standards, metrics, and procedures. 5. Review and approve data architecture, data models, and specifications. 6. Resolve data issues. 7. Sponsor and oversee data management projects and services. 8. Review and approve estimates of data asset value. 9. Communicate and promote the value of information. 10. Monitor and enforce data policies and practices within a department.
Business Intelligence Architect
A senior business intelligence analyst responsible for the design of the business intelligence user environment.
Data Architect
A senior data analyst responsible for data architecture and data integration.
Data Integration Architect
A senior data integration developer responsible for designing technology to integrate and improve the quality of enterprise data assets.
Strategy
A set of choices and decisions that together chart a high-level course of action to achieve high-level goals.
Data Model
A set of data specifications and related diagrams that reflect data requirements and designs. Also thought of as a diagram that uses text and symbols to represent data elements and relationships between them. More formally this is the integrated collection of specifications and related diagrams that represent data requirements and designs.
Data Policy
A short statement of management intent and fundamental rules governing the creation, acquisition, integrity, security, quality, and use of data and information.
What should each subject area in the subject area data model have?
A short, one or two word name and a brief definition.
Data implementation data management activities
Activities performed include: · Database implementation and change management in the development and test environments. · Test data creation, including any security procedures, such as obfuscation. · Development of data migration and conversion programs, both for project development through the SDLC and for business situations like consolidations or divestitures. · Validation of data quality requirements. · Creation and delivery of user training. · Contribution to the development of effective documentation.
Recurring Themes: Cultural Change Leadership
Adopting the principles and practices of data management within an organization requires leadership from change agents at all levels.
The information value chain analysis (1 of 3 enterprise data architecture design components)
Aligns data model components (subject areas and / or business entities) with business processes and other enterprise architecture components, which may include organizations, roles, applications, goals, strategies, projects, and / or technology platforms.
The information value chain analysis
Aligns data with business processes and other enterprise architecture components.
Data Analyst/ Data Modeler
An IT professional responsible for capturing and modeling data requirements, data definitions, business rules, data quality requirements, and logical and physical data models.
Index
An alternate path for accessing data in the database to optimize query (data retrieval) performance.
Reference and Master Data Hub Operational Data Store (ODS)
An alternative implementation of the basic "hub and spokes" design is to have each database of record provide its authoritative reference and master data into a master data operational data store (ODS) that serves as the hub for all reference and master data for all OLTP applications. Some applications may even use the ODS as their driving database, while other applications have their own specialized application databases with replicated data supplied from the ODS data hub through a "subscribe and publish" approach.
Data Modeling
An analysis and design method used to: 1. Define and analyze data requirements, and 2. Design logical and physical data structures that support these requirements.
Attributive / characteristic entity
An entity that depends on only one other parent entity, such as Employee Beneficiary depending on Employee.
Associative / mapping entity
An entity that depends on two or more entities, such as Registration depending on a particular Student and Course.
Category / sub-type or super-type entity
An entity that is "a kind of" another entity and are examples of generalization and inheritance.
Enterprise Architecture (Separate from Enterprise Data Architecture)
An integrated set of business and IT specification models and artifacts reflecting enterprise integration and standardization requirements.
Data Architecture
An integrated set of specification artifacts used to define data requirements, guide integration and control of data assets, and align data investments with business strategy.
Enterprise Data Model (EDM)
An integrated, subject-oriented data model defining the essential data produced and consumed across an entire organization.
IT Auditor
An internal or external auditor of IT responsibilities, including data quality and / or data security.
Information value-chain analysis
An output of data architecture, and the glue binding together the various forms of "primitive models" in enterprise architecture.
Data development Component Function
Analysis, design, implementation, testing, deployment, maintenance.
Hot Backups
Backups taken while applications are running are known as what?
Staging Area benefits
Benefits of this area are: · Improving performance on the source system by allowing limited history to be stored there. · Pro-active capture of a full set of data, allowing for future needs. · Minimizing the time and performance impact on the source system by having a single extract. · Pro-active creation of a data store that is not subject to transactional system limitations.
Knowledge Workers
Business analysts and other consumers of data and information who add value to the data for the organization.
Data Governance Activities
Data Management Planning: Understand Strategic Enterprise Data Needs · Develop and Maintain the Data Strategy · Establish Data Professional Roles and Organizations · Identify and Appoint Data Stewards · Establish Data Governance and Stewardship Organizations · Develop and Approve Data Policies, Standards, and Procedures · Review and Approve Data Architecture · Plan and Sponsor Data Management Projects and Services · Estimate Data Asset Value and Associated Costs
Data Management Control: Supervise Data Professional Organizations and Staff · Coordinate Data Governance Activities · Manage and Resolve Data Related Issues · Monitor and Ensure Regulatory Compliance · Monitor and Enforce Conformance with Data Policies, Standards and Architecture · Oversee Data Management Projects and Services · Communicate and Promote the Value of Data Assets
Data Development Activities
Data Modeling, Analysis, and Solution Design: Analyze Information Requirements · Develop and Maintain Conceptual Data Models · Develop and Maintain Logical Data Models · Develop and Maintain Physical Data Models
Detailed Data Design: Design Physical Databases · Design Information Products · Design Data Access Services · Design Data Integration Services
Data Model and Design Quality Management: Develop Data Modeling and Design Standards · Review Data Model and Database Design Quality · Manage Data Model Versioning and Integration
Data Implementation: Implement Development / Test Database Changes · Create and Maintain Test Data · Migrate and Convert Data · Build and Test Information Products · Build and Test Data Access Services · Validate Information Requirements · Prepare for Data Deployment
Data Management Recurring Themes List
Data Stewardship · Data Quality · Data Integration · Enterprise Perspective · Cultural Change Leadership
Information
Data in context.
6 Data Quality Principle Tools
Data profiling, parsing and standardization, data transformation, identity resolution and matching, enhancement, and reporting.
The issues adjudicated by data governance organizations
Data security issues, data access issues, data quality issues, regulatory compliance issues, policy and standards conformance issues, name and definition conflicts, and data governance procedural issues.
Meta-Data
Data that defines and describes the characteristics of other data used to improve both business and technical understanding of data and data related processes.
Physical data model design decisions
Decisions made when designing this data model include: · The technical name of each table and column (relational databases), or file and field (non-relational databases), or schema and element (XML databases). · The logical domain, physical data type, length, and nullability of each column or field. · Any default values for columns or fields, especially for NOT NULL constraints. · Primary and alternate unique keys and indexes, including how to assign keys. · Implementation of small reference data value sets in the logical model, such as a) separate code tables, b) a master shared code table, or c) simply as rules or constraints. · Implementation of minor supertype / subtype logical model entities in the physical database design, where the sub-type entities' attributes are merged into a table representing the super-type entity as nullable columns, or the super-type entity's attributes are collapsed into a table for each sub-type.
Data Architecture Management Component Function
Defining the blueprint for managing data assets.
Data Architecture Component Function
Defining the data needs of the enterprise, and designing the master blueprints to meet those needs.
Data Strategy and Policies scope activities
Defining, communicating, monitoring.
Data Quality Management Component Function
Defining, monitoring and improving data quality.
Data Lifecycle
Describes the processes performed to manage data assets.
Data Development Component Function
Designing, implementing, and maintaining solutions to meet the data needs of the enterprise. The data-focused activities within the system development lifecycle (SDLC), including data modeling, data requirements analysis, and design, implementation, and maintenance of databases' data-related solution components.
Data Quality Management Activities
Develop and Promote Data Quality Awareness; Define Data Quality Requirements; Profile, Analyze, and Assess Data Quality; Define Data Quality Metrics; Define Data Quality Business Rules; Test and Validate Data Quality Requirements; Set and Evaluate Data Quality Service Levels; Continuously Measure and Monitor Data Quality; Manage Data Quality Issues; Clean and Correct Data Quality Defects; Design and Implement Operational DQM Procedures; Monitor Operational DQM Procedures and Performance.
Implementing Development / Test Database Changes
Developer roles and responsibilities when implementing development / test database changes include: · Developers may have the ability to create and update database objects directly, such as views, functions, and stored procedures, and then notify the DBAs and data modelers so they can review the changes and update the data model. · The development team may have their own "developer DBA" who is given permission to make schema changes, with the proviso that these changes be reviewed with the DBA and data modeler. · Developers may work with the data modelers, who make the change to the model in the data modeling tool, and then generate "change DDL" for the DBAs to review and implement. · Developers may work with the data modelers, who interactively 'push' changes to the development environment, using functionality in the data-modeling tool, after review and approval by the DBAs.
Document and Content Management Activities
Documents / Records Management: Plan for Managing Documents / Records; Implement Documents / Records Management Systems for Acquisition, Storage, Access, and Security Controls; Backup and Recover Documents / Records; Retain and Dispose of Documents / Records; Audit Documents / Records Management.
Content Management: Define and Maintain Enterprise Taxonomies; Document / Index Information Content Meta-data; Provide Content Access and Retrieval; Govern for Quality Content.
Activities Environmental Elements
Each function is composed of lower level activities. Some activities are grouped into sub-activities. Activities are further decomposed into tasks and steps.
Data Warehousing and Business Intelligence Management Component Function
Enabling reporting and analysis.
What is the most important and beneficial aspect of enterprise data architecture?
Establishing a common business vocabulary of business entities and the data attributes (characteristics) that matter about these entities.
Data Asset Valuation scope activities
Estimating, approving, monitoring.
Recurring Themes: Data Quality
Every data management function contributes in part to improving the quality of data assets.
Factors influencing physical database design
Factors influencing this kind of database design include: · Purchase and licensing requirements, including the DBMS, the database server, and any client-side data access and reporting tools. · Auditing and privacy requirements (e.g., Sarbanes-Oxley, PCI, HIPAA, etc.). · Application requirements; for example, whether the database must support a web application or web service, or a particular analysis or reporting tool. · Database service level agreements (SLAs).
Enterprise data model (1 of 3 enterprise data architecture design components)
Identifies subject areas, business entities, the business rules governing the relationships between business entities, and at least some of the essential business data attributes.
Business Value Chain
Identifies the functions of an organization that contribute directly and indirectly to the organization's ultimate purpose, such as commercial profit, education, etc., and arranges the directly contributing functions from left to right in a diagram based on their dependencies and event sequence.
Different approaches to estimate the value of enterprise data assets
Identifying the direct and indirect business benefits derived from use of the data; identifying the cost of lost data or information; identifying the impacts of not having the current amount and quality level of data.
A Data Management Implementation Roadmap (1 of 3 data strategy deliverables)
Identifying specific programs, projects, task assignments, and delivery milestones.
Issue Management scope activities
Identifying, defining, escalating, resolving.
Data Operations Management Activities
Database Support: Implement and Control Database Environments; Acquire Externally Sourced Data; Plan for Data Recovery; Backup and Recover Data; Set Database Performance Service Levels; Monitor and Tune Database Performance; Plan for Data Retention; Archive, Retain, and Purge Data; Support Specialized Databases.
Data Technology Management: Understand Data Technology Requirements; Define the Data Technology Architecture; Evaluate Data Technology; Install and Administer Data Technology; Inventory and Track Data Technology Licenses; Support Data Technology Usage and Issues.
Database replication scheme
In this scheme data moves to another database on a remote server. In the event of database failure, applications can then "fail over" to the remote database and continue processing.
Deploy Stage (1 of 4 data quality management cycle stages)
In this stage of the data quality management lifecycle, the data quality team profiles the data and institutes inspections and monitors to identify data issues when they occur. During this stage, the data quality team can arrange for fixing flawed processes that are the root cause of data errors or, as a last resort, for correcting errors downstream. When it is not possible to correct errors at their source, correct errors at the earliest point in the data flow.
Plan Stage (1 of 4 data quality management cycle stages)
In this stage of the data quality management lifecycle, the data quality team assesses the scope of known issues, which involves determining the cost and impact of the issues and evaluating alternatives for addressing them. Metric development also occurs during this stage.
Reference data management
In this type of data management business data stewards maintain lists of valid data values (codes, and so on) and their business meanings, through internal definition or external sourcing. Business data stewards also manage the relationships between reference data values, particularly in hierarchies.
Related Data delivery architecture
Includes data technology architecture, data integration architecture, data warehousing / business intelligence architecture, enterprise taxonomies for content management, and meta-data architecture.
Related Data delivery architecture
Includes database architecture, data integration architecture, data warehousing / business intelligence architecture, document content architecture, and meta-data architecture.
Data standards and guidelines
Includes naming standards, requirement specification standards, data modeling standards, database design standards, architecture standards, and procedural standards for each data management function.
Recursive Relationships
Relates instances of an entity to other instances of the same entity.
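A minimal sketch of a recursive relationship, assuming an invented EMPLOYEE entity whose instances relate to other instances of the same entity (employee to manager) through a self-referencing foreign key; SQLite syntax via Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE EMPLOYEE (
        EMPLOYEE_ID INTEGER PRIMARY KEY,
        NAME        TEXT NOT NULL,
        MANAGER_ID  INTEGER REFERENCES EMPLOYEE (EMPLOYEE_ID)   -- self-reference
    );
    INSERT INTO EMPLOYEE VALUES (1, 'Ada',   NULL);   -- top of the hierarchy
    INSERT INTO EMPLOYEE VALUES (2, 'Bob',   1);
    INSERT INTO EMPLOYEE VALUES (3, 'Carol', 1);
""")

-- each employee row relates to another row of the same entity
query = """
    SELECT e.NAME, m.NAME
    FROM EMPLOYEE e LEFT JOIN EMPLOYEE m ON e.MANAGER_ID = m.EMPLOYEE_ID
"""
for name, manager in conn.execute(query):
    print(name, "reports to", manager)
```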
Asset
Resources with recognized value under the control of an individual or organization.
Data Model Administrator
Responsible for data model version control and change control.
Data Quality Analyst
Responsible for determining the fitness of data for use.
Data Security Administrator
Responsible for ensuring controlled access to classified data.
Help Desk Administrator
Responsible for handling, tracking, and resolving issues related to use of information, information systems, or the IT infrastructure.
Meta-data Specialist
Responsible for integration, control, and delivery of meta-data, including administration of meta-data repositories.
Business Intelligence Analyst / Administrator
Responsible for supporting effective use of business intelligence data by business professionals.
Database Administrator
Responsible for the design, implementation, and support of structured data assets.
Business Process Analyst
Responsible for understanding and optimizing business processes.
Data Standards and Architecture scope activites
Reviewing, approving, monitoring.
Enterprise Process Architect
Senior business process analyst responsible for overall quality of the enterprise process model and enterprise business model.
Application Architect
Senior developer responsible for integrating application systems.
Technical Engineer
Senior technical analyst responsible for researching, implementing, administering, and supporting a portion of the information technology infrastructure.
Technical Architect
Senior technical engineer responsible for coordinating and integrating the IT infrastructure and the IT technology portfolio.
Data Stewards
Serve as the appointed trustees for data assets. Data management professionals serve as the expert curators and technical custodians of these data assets.
Recurring Themes: Data Stewardship
Shared partnership for data management requires the ongoing participation of business data stewards in every function.
Business Entity
Something of interest to the organization, an object, or an event.
Data Management Projects scope activities
Sponsoring, overseeing.
Data Brokers
Suppliers of data and meta-data often by subscription for use in an organization.
Collaborators
Suppliers or consortium participants of an organization. These may engage in data sharing agreements.
Document / Record Management Lifecycle activities
The activities in this Lifecycle include: · Identification of existing and newly created documents / records. · Creation, Approval, and Enforcement of documents / records policies. · Classification of documents / records. · Documents / Records Retention Policy. · Storage: Short and long term storage of physical and electronic documents / records. · Retrieval and Circulation: Allowing access and circulation of documents / records in accordance with policies, security and control standards, and legal requirements. · Preservation and Disposal: Archiving and destroying documents / records according to organizational needs, statutes, and regulations.
Meta-Data Industry Standards
These are which type of industry standards?
1. OMG specifications: OMG is a nonprofit consortium of computer industry leaders dedicated to the definition, promotion, and maintenance of industry standards for interoperable enterprise applications. Companies such as Oracle, IBM, Unisys, NCR, and others support OMG. OMG is the creator of the CORBA middleware standard and has defined the following meta-data related standards:
o Common Warehouse Meta-data (CWM): Specifies the interchange of meta-data among data warehousing, BI, KM, and portal technologies. CWM is based on UML and depends on it to represent object-oriented data constructs. The CWM has many components, illustrated in Figure 11.3.
o Information Management Metamodel (IMM): The next iteration of CWM, now under OMG direction and development, expected to be published in 2009. It promises to bridge the gap between OO, data, and XML while incorporating CWM, and aims to provide traceability from requirements to class diagrams, including logical / physical models, DDL, and XML schemas.
o MDC Open Information Model (OIM): A vendor-neutral and technology-independent specification of core meta-data types found in operational, data warehousing, and knowledge management environments.
o Extensible Markup Language (XML): The standard format for the interchange of meta-data using the MDC OIM.
o Unified Modeling Language (UML): The formal specification language for OIM.
o Structured Query Language (SQL): The query language for OIM.
o XML Metadata Interchange (XMI): Eases the interchange of meta-data between tools and repositories. The XMI specification consists of rules for generating the XML document containing the actual meta-data and the XML DTD.
o Ontology Definition Metamodel (ODM): A specification for formal representation, management, interoperability, and application of business semantics in support of the OMG vision of model-driven architecture (MDA).
Roles and Responsibilities Environmental Elements
The business and IT roles involved in performing and supervising the function, and the specific responsibilities of each role in that function. Many roles will participate in multiple functions.
Domain
The complete set of all possible values for an attribute.
Business Entities
The concepts and classes of things, people, and places that are familiar and of interest to the enterprise.
Denormalization
The deliberate transformation of a normalized logical data model into tables with redundant data.
Information Gaps
The difference between the information that is needed and the information that is currently available; these gaps represent business liabilities.
Goals and Principles Environmental Elements
The directional business goals of each function and the fundamental principles that guide performance of each function.
Knowledge Management
The discipline that fosters organizational learning and the management of intellectual capital as an enterprise resource.
Data management procedures
The documented methods, techniques, and steps followed to accomplish a specific activity or task.
Chief Knowledge Officer (CKO)
The executive with overall responsibility for knowledge management, including protection and control of intellectual property, enablement of professional development, collaboration, mentoring, and organizational learning.
Data Governance Component Function
The exercise of authority and control (planning, monitoring, and enforcement) over the management of data assets.
Data Security Auditing
The goal of this activity is to provide management and the data governance council with objective, unbiased assessments, and rational, practical recommendations.
The enterprise data model
The heart and soul of the enterprise data architecture
Taxonomy
The hierarchical structure used for outlining topics.
Data Management Executive
The highest-level manager of Data Management Services organizations in an IT department. Reports to the CIO and is the manager most directly responsible for data management, including coordinating data governance and data stewardship activities, overseeing data management projects, and supervising data management professionals.
Primary Deliverables Environmental Elements
The information and physical databases and documents created as interim and final outputs of each function. Some deliverables are essential, some are generally recommended, and others are optional depending on circumstances.
The rules defined by data governance organizations
The overall data strategy, data policies, data standards, data management procedures, data management metrics, the business data names, business definitions and business rules found in the enterprise data model, additional data requirement specifications, and data quality business rules.
Data Management Function Definition
The planning and execution of policies, practices, and projects that acquire, control, protect, deliver, and enhance the value of data and information assets.
Components of the Kimball's Data Warehouse Chess Pieces view of DW / BI architecture
These are components of which DW and BI architecture?
Operational Source Systems: Operational / transactional applications of the enterprise. These provide the source data to be integrated into the ODS and DW components. Equivalent to the application systems in the CIF diagram.
Data Staging Area: Kimball artfully uses the analogy of a "kitchen" to refer to this area as one where the data is prepared behind the scenes for presentation. He refers to it as the comprehensive set of all storage and ETL processes that stand between the source systems and the data presentation area. The key difference in the architectural approach here is that Kimball's focus has always been on the efficient end-delivery of the analytical data. With that scope, smaller than Inmon's corporate management of data, the data staging area becomes a potentially eclectic set of processes needed to integrate and transform data for presentation. Similar to combining two CIF components, such as Integration and Transformation, and DW. Note: In recent years, Kimball has acknowledged that an enterprise DW can fit into the architecture inside his Data Staging Area.
Data Presentation Area: Similar to the Data Marts in the CIF picture, with the key architectural difference being an integrating paradigm of a "DW Bus", such as shared or conformed dimensions unifying the multiple data marts.
Data Access Tools: Focus on the needs and requirements for the end customers / consumers of the data has been a hallmark of Kimball's approach. These needs translate into selection criteria from a broad range of data access tools to the right tools for the right task. In the CIF model, the access tools are outside of the DW architecture.
Data Security Stakeholder Concerns
These concerns require organizations to recognize the privacy and confidentiality needs of their stakeholders, including clients, patients, students, citizens, suppliers, and business partners. Stakeholders are the ultimate owners of the data about them, and everyone in the organization must be a responsible trustee of this data.
Data Enhancement Examples
These are examples of how to enhance data: · Time / Date stamps: One way to improve data is to document the time and date that data items are created, modified, or retired, which can help to track historical data events. · Auditing Information: Auditing can document data lineage, which also is important for historical tracking as well as validation. · Contextual Information: Business contexts such as location, environment, and access methods are all examples of context that can augment data. Contextual enhancement also includes tagging data records for downstream review and analysis. · Geographic Information: There are a number of geographic enhancements possible, such as address standardization and geocoding, which includes regional coding, municipality, neighborhood mapping, latitude / longitude pairs, or other kinds of location-based data. · Demographic Information: For customer data, there are many ways to add demographic enhancements such as customer age, marital status, gender, income, ethnic coding; or for business entities, annual revenue, number of employees, size of occupied space, etc. · Psychographic Information: Use these kinds of enhancements to segment the target population by specified behaviors, such as product and brand preferences, organization memberships, leisure activities, vacation preferences, commuting transportation style, shopping time preferences, etc.
Data Operations Management Function Goals
These are the goals of which data management function? 1. Protect and ensure the integrity of structured data assets. 2. Manage the availability of data throughout its lifecycle. 3. Optimize performance of database transactions.
DBA detailed database design responsibilities
These are items the DBA is responsible for when creating a detailed database design: · Ensuring the design meets data integrity requirements. · Determining the most appropriate physical structure to house and organize the data, such as relational or other type of DBMS, files, OLAP cubes, XML, etc. · Determining database resource requirements, such as server size and location, disk space requirements, CPU and memory requirements, and network requirements. · Creating detailed design specifications for data structures, such as relational database tables, indexes, views, OLAP data cubes, XML schemas, etc. · Ensuring performance requirements are met, including batch and online response time requirements for queries, inserts, updates, and deletes. · Designing for backup, recovery, archiving, and purge processing, ensuring availability requirements are met, and database maintenance operations can be performed within the window(s) of time available (see Chapter 6). · Designing data security implementation, including authentication, encryption needs, application roles, and the data access and update permissions they should be assigned. The general rule is never to grant permissions on database objects to individual users, only to roles. Users can then be moved into and out of roles as needed; this greatly reduces maintenance and enhances data security (see Chapter 7). · Determining partitioning and hashing schemes, where appropriate. · Requiring SQL code review to ensure that the code meets coding standards and will run efficiently.
4 Data Quality Tool Category Activities
These are the categories used to distinguish activities in what data management function? Analysis, Cleansing, Enhancement, and Monitoring
The three tenets of Business Dimensional Lifecycle
These are the characteristics of which Lifecycle: · Business Focus: Both immediate business requirements and more long-term broad data integration and consistency. · Atomic Dimensional Data Models: Both for ease of business user understanding and query performance. · Iterative Evolution Management: Manage changes and enhancements to the data warehouse as individual, finite projects, even though there never is an end to the number of these projects.
Cardinality rules
These rules define the quantity of each entity instance that can participate in a relationship between two entities. For example, "Each company can employ many persons."
Referential integrity rules
These rules ensure valid values. For example, "A person can exist without working for a company, but a company cannot exist unless at least one person is employed by the company."
Analysis services
These services give business users the ability to "slice and dice" data across multiple business dimensions, such as to analyze sales trends for products or product categories across multiple geographic areas and / or dates / times. This also includes "predictive analytics", which is the analysis of data to identify future trends and potential business opportunities.
Create reporting copies
This denormalization technique is used to improve report performance: create a table that contains all the elements needed for reporting, already calculated and joined, and update it periodically.
Basic components of Corporate Information Factory
These are the components of which corporate data architecture?
Raw Detailed Data: Operational / transactional application data of the enterprise. The raw detailed data provides the source data to be integrated into the Operational Data Store (ODS) and DW components. It can be in database or other storage or file formats.
Integration and Transformation: This layer of the architecture is where the un-integrated data from the various application source stores is combined / integrated and transformed into the corporate representation in the DW.
Reference Data: Reference data was a precursor to what is currently referred to as Master Data Management. The purpose was to allow common storage of and access to important and frequently used common data. Focus and shared understanding on data upstream of the Data Warehouse simplifies the integration task in the DW.
Historical Reference Data: When current-valued reference data is necessary for transactional applications, and at the same time it is critical to have accurate integration and presentation of historical data, it is necessary to capture the reference data that was in place at any point in time. For more discussion on reference data, see Chapter 8, Master and Reference Data Management.
Operational Data Store (ODS): The focus of data integration is meeting operating and classically operational reporting needs that require data from multiple operational systems. The main distinguishing data characteristics of an ODS compared to a DW include current-valued vs. DW historical data and volatile vs. DW non-volatile data. Note: the ODS is an optional portion of the overall CIF architecture, dependent upon specific operational needs, and acknowledged as a component that many businesses omit.
Operational Data Mart (Oper-Mart): A data mart focused on tactical decision support. Distinguishing characteristics include current-valued vs. DW historical data, tactical vs. DW strategic analysis, and sourcing of data from an ODS rather than just the DW. The Oper-Mart was a later addition to the CIF architecture.
Data Warehouse (DW): The DW is a large, comprehensive corporate resource, whose primary purpose is to provide a single integration point for corporate data in order to serve management decision making, and strategic analysis and planning. The data flows into a DW from the application systems and ODS, and flows out to the data marts, usually in one direction only. Data that needs correction is rejected, corrected at its source, and re-fed through the system.
Data Marts (DM): The purpose of the data marts is to provide for DSS / information processing and access that is customized and tailored for the needs of a particular department or common analytic need.
The two most common drivers for Reference and Master Data Management
These are the two most common drivers for which Data Management function? · Improving data quality and integration across data sources, applications, and technologies · Providing a consolidated, 360-degree view of information about important business parties, roles and products, particularly for more effective reporting and analytics.
Data Security Proprietary Business Concerns
These concerns address protecting the competitive advantage provided by intellectual property and intimate knowledge of customer needs and business partner relationships, which is a cornerstone of any business plan.
Conceptual data model and logical data model design reviews
These data model design reviews should ensure: 1. Business data requirements are completely captured and clearly expressed in the model, including the business rules governing entity relationships. 2. Business (logical) names and business definitions for entities and attributes (business semantics) are clear, practical, consistent, and complementary. The same term must be used in both names and descriptions. 3. Data modeling standards, including naming standards, have been followed. 4. The conceptual and logical data models have been validated.
Physical names
These database names must conform to the maximum length allowed by the DBMS, use abbreviations where necessary, and use underscores as separators between words.
Logical names
These database names should be meaningful to business users, using full words as much as possible and avoiding all but the most familiar abbreviations, and should use blank spaces as separators between words.
Production DBAs primary deliverables
These deliverables are to be completed by which role? 1. A production database environment, including an instance of the DBMS and its supporting server, of a sufficient size and capacity to ensure adequate performance, configured for the appropriate level of security, reliability, and availability. Database System Administration is responsible for the DBMS environment. 2. Mechanisms and processes for controlled implementation and changes to databases into the production environment. 3. Appropriate mechanisms for ensuring the availability, integrity, and recoverability of the data in response to all possible circumstances that could result in loss or corruption of data. 4. Appropriate mechanisms for detecting and reporting any error that occurs in the database, the DBMS, or the data server. 5. Database availability, recovery, and performance in accordance with service level agreements.
Conformed facts
These facts use standardized definitions of terms across individual marts.
Measuring Data Quality Requirements
These measures are used to create what type of requirements? 1. Identifying key data components associated with business policies. 2. Determining how identified data assertions affect the business. 3. Evaluating how data errors are categorized within a set of data quality dimensions. 4. Specifying the business rules that measure the occurrence of data errors. 5. Providing a means for implementing measurement processes that assess conformance to those business rules.
Data Security Legitimate Access Needs
These needs require individuals in certain roles to take responsibility for access to and maintenance of certain data.
Master Data Management focus areas
These primary data management focus areas are: 1. Identification of duplicate records within and across data sources to build and maintain global IDs and associated cross-references to enable information integration. 2. Reconciliation across data sources and providing the "golden record" or the best version of the truth. These consolidated records provide a merged view of the information across systems and seek to address name and address inconsistencies. 3. Provision of access to the golden data across applications, either through direct reads, or by replication feeds to OLTP and DW / BI databases.
Data Security Government Regulations
These regulations protect some of the stakeholder security interests. Some regulations restrict access to information, while other regulations ensure openness, transparency, and accountability.
Production DBAs data operation management responsibilities
These responsibilities are performed by which role? · Ensuring the performance and reliability of the database, including performance tuning, monitoring, and error reporting. · Implementing appropriate backup and recovery mechanisms to guarantee the recoverability of the data in any circumstance. · Implementing mechanisms for clustering and failover of the database, if continual data availability is a requirement. · Implementing mechanisms for archiving data.
Action Rules
These rules are instructions on what to do when data elements contain certain values.
Data Rules
These rules constrain how data relates to other data and are the focus of data models.
Business Rules
These rules define constraints on what can and cannot be done.
Techniques used when a physical data model transforms a logical data model
These techniques are used when transforming a logical data model into a physical data model (see the sketch below): · Denormalization: Selectively and justifiably violating normalization rules, re-introducing redundancy into the data model to reduce retrieval time, potentially at the expense of additional space, additional insert / update time, and reduced data quality. · Surrogate keys: Substitute keys not visible to the business. · Indexing: Create additional index files to optimize specific types of queries. · Partitioning: Break a table or file vertically (separating groups of columns) or horizontally (separating groups of rows). · Views: Virtual tables used to simplify queries, control data access, and rename columns, without the redundancy and loss of referential integrity due to denormalization. · Dimensionality: Creation of fact tables with associated dimension tables, structured as star schemas and snowflake schemas, for business intelligence (see Chapter 9).
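The sketch below (pure Python, with invented customer and order data) illustrates two of these techniques in miniature: assigning surrogate keys and denormalizing a parent / child join into redundant reporting rows.

```python
from itertools import count

surrogate_key = count(1)  # surrogate keys: system-assigned, not visible to the business

customers = {"C01": {"name": "Acme Ltd", "region": "EMEA"}}   # normalized parent entity
orders = [{"customer": "C01", "amount": 250.0},               # normalized child entities
          {"customer": "C01", "amount": 75.5}]

# Denormalization: repeat customer attributes on every order row so reports avoid a join,
# at the cost of redundant data.
reporting_rows = [
    {"order_sk": next(surrogate_key),
     "customer_code": order["customer"],
     "customer_name": customers[order["customer"]]["name"],      # redundant copy
     "customer_region": customers[order["customer"]]["region"],  # redundant copy
     "amount": order["amount"]}
    for order in orders
]
print(reporting_rows)
```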
Master Data Management Challenges
These types of challenges are: 1. To determine the most accurate, golden data values from among potentially conflicting data values. 2. To use the golden values instead of other less accurate data.
Customer Relationship Management CRM
These types of systems perform MDM for customer data, in addition to other business functions. Such a system attempts to provide the most complete and accurate information about each and every customer and compares customer data from multiple sources. An essential aspect of this system is identifying duplicate, redundant, and conflicting data about the same customer.
Fact Tables
These types of tables represent and contain important business measures. Each row of this table corresponds to a particular measurement; measurements are numeric, such as amounts, quantities, or counts. Some measurements are the results of algorithms, so meta-data becomes critical to proper understanding and usage.
Why are subject areas important tools for data stewardship and governance?
They define the scope of responsibilities for subject area-oriented data stewardship teams.
Business Entities
Things important enough to the business that data about these things is necessary to run the business.
CRUD (Create, Read, Update, Delete)
This data-to-process and data-to-role relationship matrix helps map data access needs and guide definition of data security role groups, parameters, and permissions.
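A minimal sketch, with invented role and subject-area names, of how a CRUD matrix can be represented and queried to check whether a role group holds a given permission.

```python
# Invented roles, subject areas, and permissions for illustration only.
CRUD_MATRIX = {
    ("Order Entry Clerk", "Customer Order"): "CRU",   # create, read, update
    ("Customer Service",  "Customer Order"): "RU",    # read, update
    ("Data Steward",      "Customer Order"): "R",     # read only
    ("DBA",               "Customer Order"): "CRUD",  # full control, including delete
}

def allowed(role: str, subject: str, operation: str) -> bool:
    """True if the role group holds the given C/R/U/D permission on the subject area."""
    return operation in CRUD_MATRIX.get((role, subject), "")

print(allowed("Customer Service", "Customer Order", "D"))   # False: no delete permission
print(allowed("DBA", "Customer Order", "D"))                 # True
```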
Online Analytical Processing (OLAP)
This multi-dimensional analysis refers to an approach to providing fast performance for multi-dimensional analytic queries. The typical output of this type of processing is in a matrix format: the dimensions form the rows and columns of the matrix, and the factors, or measures, are the values inside the matrix.
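A minimal, pure-Python sketch of the matrix output described above, using invented sales figures: two dimensions (region and quarter) form the rows and columns, and the rolled-up measure fills the cells.

```python
from collections import defaultdict

# Invented facts: (region, quarter, sales amount)
facts = [
    ("East", "Q1", 100), ("East", "Q2", 120),
    ("West", "Q1", 90),  ("West", "Q2", 150), ("West", "Q2", 10),
]

# Roll the measure up by both dimensions
matrix = defaultdict(float)
for region, quarter, sales in facts:
    matrix[(region, quarter)] += sales

# Dimensions form the rows and columns; the measure fills the cells
quarters = ["Q1", "Q2"]
print("region   " + "   ".join(quarters))
for region in ("East", "West"):
    print(region.ljust(8), "  ".join(f"{matrix[(region, q)]:>5.0f}" for q in quarters))
```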
Roll-Up (1 of 5 OLAP Operations)
This OLAP operation involves computing all of the data relationships for one or more dimensions. To do this, define a computational relationship or formula.
Dice (1 of 5 OLAP Operations)
This OLAP operation is a slice on more than two dimensions of a data cube, or more than two consecutive slices.
Drill Down/Up (1 of 5 OLAP Operations)
This OLAP operation is a specific analytical technique whereby the user navigates among levels of data, ranging from the most summarized (up) to the most detailed (down).
Slice (1 of 5 OLAP Operations)
This OLAP operation is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset.
Pivot (1 of 5 OLAP Operations)
This OLAP operation means to change the dimensional orientation of a report or page display.
Clinger-Cohen Act (CCA)
This act requires all U.S. federal agencies to have and use formal enterprise architecture
Auditing Data Security
This activity is a recurring control activity with responsibility to analyze, validate, counsel, and recommend policies, standards.
Diagnosis and evaluation of remediation alternatives (1 of 4 activities the data quality operations team members are responsible)
In this activity, the data quality operations team members review the symptoms exhibited by the data quality incident, trace the lineage of the incorrect data, diagnose the type of problem and where it originated, and pinpoint any potential root causes for the problem.
Reporting (1 of 4 activities the data quality operations team members are responsible)
This activity provides transparency for the DQM process through periodic reports on the performance status of DQM. The data quality operations team develops and populates these reports, which include: o A data quality scorecard, which provides a high-level view of the scores associated with various metrics, reported to different levels of the organization. o Data quality trends, which show over time how the quality of data is measured, and whether the quality indicator levels are trending up or down. o Data quality performance, which monitors how well the operational data quality staff is responding to data quality incidents for diagnosis and timely resolution. These reports should align to the metrics and measures in the data quality SLA as much as possible, so that the areas important to the achievement of the data quality SLA appear, at some level, in internal team reports.
Inspection and monitoring (1 of 4 activities the data quality operations team members are responsible)
In this activity, the data quality operations team members, either through an automated process or a manually invoked process, subject the data sets to measurement of conformance to the data quality rules, based on full-scan or sampling methods. Use data profiling tools, data analyzers, and data standardization and identity resolution tools to provide the inspection services. Accumulate the results and then make them available to the data quality operations analyst.
Resolving the Issue (1 of 4 activities the data quality operations team members are responsible)
In this activity, the data quality operations team members provide a number of alternatives for resolving the issue; the data quality team must then confer with the business data owners to select one of the alternatives.
SLAs
These agreements set availability expectations, allowing time for database maintenance and backup, and set recovery time expectations for different recovery scenarios, including potential disasters. They should also include an agreement with the data owners as to how frequently to make backups, and will typically identify an expected timeframe of database availability and a select few application transactions (a mix of complex queries and updates), each with a specified maximum allowable execution time during identified availability periods.
Requirements Analysis
This analysis includes the elicitation, organization, documentation, review, refinement, approval, and change control of business requirements.
Top-Down Approach (1 of 2 data quality assessment techniques)
This approach of assessing existing data quality issues involves engaging business users to document their business processes and the corresponding critical data dependencies. This approach involves understanding how their processes consume data, and which data elements are critical to the success of the business application. By reviewing the types of reported, documented, and diagnosed data flaws, the data quality analyst can assess the kinds of business impacts that are associated with data issues.
Bottom-Up Approach (1 of 2 data quality assessment techniques)
This approach of assessing existing data quality issues involves inspection and evaluation of the data sets themselves. Direct data analysis will reveal potential data anomalies that should be brought to the attention of subject matter experts for validation and analysis. This approach highlights potential issues based on the results of automated processes, such as frequency analysis, duplicate analysis, cross-data set dependency, 'orphan child' data rows, and redundancy analysis.
Deterministic matching (1 of 2 approaches to data quality matching)
This approach to matching, like parsing and standardization, relies on defined patterns and rules for assigning weights and scores to determine similarity. The algorithms of this approach are predictable in that the patterns matched and the rules applied will always yield the same matching determination. Performance is tied to the variety, number, and order of the matching rules. This type of matching works out of the box with relatively good performance, but it is only as good as the situations anticipated by the rule developers.
Probabilistic matching (1 of 2 approaches to data quality matching)
This approach to matching relies on statistical techniques for assessing the probability that any pair of records represents the same entity. It relies on the ability to take data samples for training purposes, looking at the expected results for a subset of the records and tuning the matcher to self-adjust based on statistical analysis. These matchers are not reliant on rules, so the results may be nondeterministic. However, because the probabilities can be refined based on experience, probabilistic matchers are able to improve their matching precision as more data is analyzed.
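A minimal sketch of the deterministic style of matching described above, under assumed fields, weights, and thresholds: fixed rules always yield the same determination for the same pair of records.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Simple string similarity between 0 and 1."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def is_match(rec1: dict, rec2: dict) -> bool:
    # Rule 1: identical tax IDs are an automatic match
    if rec1.get("tax_id") and rec1.get("tax_id") == rec2.get("tax_id"):
        return True
    # Rule 2: weighted name and postal-code similarity above a fixed threshold
    score = (0.7 * similarity(rec1["name"], rec2["name"])
             + 0.3 * similarity(rec1["postal_code"], rec2["postal_code"]))
    return score >= 0.8          # assumed threshold; tuning it trades precision for recall

a = {"name": "Acme Ltd.",    "postal_code": "10115", "tax_id": None}
b = {"name": "ACME Limited", "postal_code": "10115", "tax_id": None}
print(is_match(a, b))            # the same inputs always yield the same determination
```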
Data Integration Architecture
This architecture defines how data flows through all systems from beginning to end.
Business Intelligence Architecture
This architecture defines how decision support makes data available, including the selection and use of specific (name of the architecture) tools.
Data Technology Architecture
This architecture defines standard tool categories, preferred tools in each category, and technology standards and protocols for technology integration.
Data Delivery Architecture
This architecture defines the master blueprint for how data flows across databases and applications and ensures data quality and integrity to support both transactional business processes and business intelligence reporting and analysis.
Data Warehouse Architecture
This architecture focuses on how data changes and snapshots are stored in specific (name of the architecture) systems for maximum usefulness and performance.
Data Integration Architecture
This architecture shows how data moves from source systems through staging databases into data warehouses and data marts.
Cold Backup
This backup is taken when the database is off-line.
Data Quality Oversight Board
This board has a reporting hierarchy associated with the different data governance roles and is accountable for the policies and procedures for oversight of the data quality community.
Accuracy verification (1 of 10 data quality business rule templates)
This business rule template compares a data value against a corresponding value in a system of record to verify that the values match.
Timeliness validation (1 of 10 data quality business rule templates)
This business rule template implements rules that indicate the characteristics associated with expectations for accessibility and availability of data.
Uniqueness verification (1 of 10 data quality business rule templates)
This business rule template implements rules that specify which entities must have a unique representation and verify that one and only one record exists for each represented real world object.
Consistency rules (1 of 10 data quality business rule templates)
This business rule template covers conditional assertions that refer to maintaining a relationship between two (or more) attributes based on the actual values of those attributes.
Range conformance (1 of 10 data quality business rule templates)
This business rule template indicates that a data element's assigned value must be within a defined numeric, lexicographic, or time range, such as greater than 0 and less than 100 for a numeric range.
Format compliance (1 of 10 data quality business rule templates)
This business rule template indicates that one or more patterns specify values assigned to a data element, such as the different ways to specify telephone numbers.
Value presence and record completeness (1 of 10 data quality business rule templates)
This business rule template defines the conditions under which missing values are unacceptable.
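A minimal, hypothetical sketch showing how a few of the rule templates above (value presence, range conformance, format compliance, and uniqueness verification) could be expressed as executable checks against a small invented record set.

```python
import re

# Invented records; the third row violates the uniqueness rule on "id"
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": "",              "age": 150},
    {"id": 2, "email": "b@example.com", "age": 28},
]

def value_present(rec):   return bool(rec["email"])                      # value presence
def range_conforms(rec):  return 0 <= rec["age"] <= 120                  # range conformance
def format_complies(rec): return re.fullmatch(r"[^@\s]+@[^@\s]+", rec["email"]) is not None

violations, seen_ids = [], set()
for rec in records:
    if not value_present(rec):
        violations.append((rec["id"], "missing email"))
    elif not format_complies(rec):
        violations.append((rec["id"], "email format"))
    if not range_conforms(rec):
        violations.append((rec["id"], "age out of range"))
    if rec["id"] in seen_ids:                                            # uniqueness verification
        violations.append((rec["id"], "duplicate key"))
    seen_ids.add(rec["id"])

print(violations)   # [(2, 'missing email'), (2, 'age out of range'), (2, 'duplicate key')]
```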
Data Quality Management (DQM)
This data management function is a critical support process in organizational change management. Changing business focus, corporate business integration strategies, and mergers, acquisitions, and partnering can mandate that the IT function blend data sources, create gold data copies, retrospectively populate data, or integrate data. It involves analyzing the quality of data, identifying data anomalies, and defining business requirements and corresponding business rules for asserting the required data quality. It also involves instituting inspection and control processes to monitor conformance with defined data quality rules, as well as instituting data parsing, standardization, cleansing, and consolidation when necessary.
Data Warehousing and Business Intelligence Management (DW-BIM)
This data management function is defined as the collection, integration, and presentation of data to knowledge workers for the purpose of business analysis and decision-making. It is composed of activities supporting all phases of the decision support lifecycle that provide context, move and transform data from sources to a common target data store, and then provide knowledge workers various means of access, manipulation, and reporting of the integrated target data.
Conceptual Data Model
This data model contains only the basic and critical business entities within a given realm and function, with a description of each entity and the relationships between entities.
Logical Data Model
This data model identifies the data needed about each instance of a business entity; for some enterprise data models, there is a diagram for each subject area. It is a detailed representation of data requirements and the business rules that govern data quality, usually in support of a specific usage context.
Physical data model
This data model optimizes the implementation of detailed data requirements and business rules in light of technology constraints, application usage, performance requirements, and modeling standards.
Object Role Modeling (ORM)
This is an alternate data modeling style with a syntax that enables very detailed specification of business data relationships and rules. The diagrams in this style present so much information that effective consumption usually requires smaller subject area views, with fewer business entities on a single diagram. It is not widely used, but its proponents strongly advocate its benefits, and it is particularly useful for modeling complex business relationships.
Unified Modeling Language (UML)
This data modeling style is an integrated set of diagramming conventions for several different forms of modeling.
IDEF1X
This data modeling style uses circles (some darkened, some empty) and lines (some solid, some dotted) instead of "crow's feet" to communicate similar meanings.
Information Engineering (IE)
This data modeling style uses tridents or "crow's feet", along with other symbols, to depict cardinality.
Identity Resolution and Matching
This data quality tool employs record linkage and matching in identity recognition and resolution, and incorporates approaches used to evaluate "similarity" of records for use in duplicate analysis and elimination, merge / purge, householding, data enhancement, cleansing, and strategic initiatives such as customer data integration or master data management.
Data Parsing Tools
This data quality tool enables the data analyst to define sets of patterns that feed into a rules engine used to distinguish between valid and invalid data values. Actions are triggered upon matching a specific pattern, extracting and rearranging the separate components (commonly referred to as "tokens") into a standard representation.
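A minimal sketch of parsing and standardization, with invented patterns: regular expressions recognize valid telephone number shapes, split them into tokens, and rearrange the tokens into a single standard representation.

```python
import re

# Invented patterns recognizing two common North American phone number shapes
PATTERNS = [
    re.compile(r"^\((\d{3})\)\s*(\d{3})-(\d{4})$"),      # (212) 555-0100
    re.compile(r"^(\d{3})[.\s-](\d{3})[.\s-](\d{4})$"),  # 212-555-0100 or 212.555.0100
]

def standardize_phone(raw: str):
    """Parse a raw value into tokens and rearrange them into one standard representation."""
    for pattern in PATTERNS:
        match = pattern.match(raw.strip())
        if match:
            area, exchange, line = match.groups()        # the parsed "tokens"
            return f"+1-{area}-{exchange}-{line}"        # standard representation
    return None                                          # no pattern matched: invalid value

print(standardize_phone("(212) 555-0100"))   # +1-212-555-0100
print(standardize_phone("not a number"))     # None
```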
Data Transformation
This data quality tool identifies data errors and triggers data rules to transform the flawed data into a format that is acceptable to the target architecture.
Data Enhancement
This data quality tool is defined as a method for adding value to information by accumulating additional information about a base set of entities and then merging all the sets of information to provide a focused view of the data. It is a process of intelligently adding data from alternate sources as a byproduct of knowledge inferred from applying other data quality techniques, such as parsing, identity resolution, and data cleansing.
Physical database design review
This database design review should ensure: 1. The design meets business, technology, usage, and performance requirements. 2. Database design standards, including naming and abbreviation standards, have been followed. 3. Availability, recovery, archiving, and purging procedures are defined according to standards. 4. Meta-data quality expectations and requirements are met in order to properly update any meta-data repository. 5. The physical data model has been validated. All concerned stakeholders, including the DBA group, the data analyst / architect, the business data owners and / or stewards, the application developers, and the project managers, should review and approve this database design document.
Server clustering
With this database protection option, databases on a shared disk array can fail over from one physical server to another. A related option is server virtualization, where the failover occurs between virtual server instances residing on two or more physical machines.
Star Schema
This database schema is the representation of a dimensional data model with a single fact table in the center connecting to a number of surrounding dimension tables.
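A minimal sketch of a star schema, with invented table and column names (SQLite syntax via Python's sqlite3 module): a central fact table holding the numeric measure joins to surrounding dimension tables, and a simple aggregate query slices the measure by the dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DIM_DATE    (DATE_KEY INTEGER PRIMARY KEY, CAL_QUARTER TEXT);
    CREATE TABLE DIM_PRODUCT (PRODUCT_KEY INTEGER PRIMARY KEY, CATEGORY TEXT);
    CREATE TABLE FACT_SALES (                              -- fact table in the center
        DATE_KEY    INTEGER REFERENCES DIM_DATE (DATE_KEY),
        PRODUCT_KEY INTEGER REFERENCES DIM_PRODUCT (PRODUCT_KEY),
        SALES_AMT   REAL                                   -- numeric business measure
    );
    INSERT INTO DIM_DATE VALUES (1, '2024-Q1'), (2, '2024-Q2');
    INSERT INTO DIM_PRODUCT VALUES (10, 'Widgets'), (11, 'Gadgets');
    INSERT INTO FACT_SALES VALUES (1, 10, 100.0), (2, 10, 120.0), (2, 11, 80.0);
""")

# Aggregate the central measure by the surrounding dimensions
for row in conn.execute("""
        SELECT d.CAL_QUARTER, p.CATEGORY, SUM(f.SALES_AMT)
        FROM FACT_SALES f
        JOIN DIM_DATE d    ON f.DATE_KEY = d.DATE_KEY
        JOIN DIM_PRODUCT p ON f.PRODUCT_KEY = p.PRODUCT_KEY
        GROUP BY d.CAL_QUARTER, p.CATEGORY
        """):
    print(row)
```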
Database of Record
This database serves as a reference data "hub" supplying reference data to other "spoke" applications and databases. Some applications can read reference and master data directly from this database. Other applications subscribe to published, replicated data from this database.
Meta-Data Delivery Layer
This delivery layer is responsible for the delivery of the meta-data from the repository to the end users and to any applications or tools that require meta-data feeds to them.
Create duplicates (mirrors)
This denormalization technique is used to improve performance where certain data sets are frequently used and are often in contention: create duplicate versions for separate user groups, or for loading vs. querying.
Validity (1 of 11 Dimensions of Data Quality)
This dimension of data quality refers to whether data instances are stored, exchanged, or presented in a format that is consistent with the domain of values, as well as consistent with other similar attribute values. This dimension ensures that data values conform to numerous attributes associated with the data element: its data type, precision, format patterns, use of a predefined enumeration of values, domain ranges, underlying storage formats, and so on.
Uniqueness (1 of 11 Dimensions of Data Quality)
This dimension of data quality states that no entity exists more than once within the data set and that a key value relates to each unique entity, and only that specific entity, within the data set. Many organizations prefer a level of controlled redundancy in their data as a more achievable target.
Physical database design document
This document guides implementation and maintenance. It is reviewable to catch and correct errors in the design before creating or updating the database. It is modifiable for ease of implementation of future iterations of the design.
Reuse File (1 of 4 files created from meta-data repository scanning)
This file created from meta-data repository scanning contains the rules for managing reuse of process loads.
Control File (1 of 4 files created from meta-data repository scanning)
This file created from meta-data repository scanning contains the source structure of the data model.
Log Files (1 of 4 files created from meta-data repository scanning)
This file created from meta-data repository scanning is produced during each phase of the process, one for each scan / extract and one for each load cycle.
Temporary/ Back-Up Files (1 of 4 files created from meta-data repository scanning)
This file created from meta-data repository scanning is used during the process or for traceability.
Data Quality Framework
This framework includes defining the requirements, inspection policies, measures, and monitors that reflect changes in data quality and performance. These requirements reflect three aspects of business data expectations: a manner to record the expectation in business rules, a way to measure the quality of data within that dimension, and an acceptability threshold.
Classification Frameworks
This framework organizes the structure and views that encompass enterprise architecture. Frameworks define the standard syntax for the artifacts describing these views and the relationships between these views. Most artifacts are diagrams, tables, and matrices.
Process Frameworks
This framework specifies methods for business and systems planning, analysis, and design processes. Some IT planning and software development lifecycle (SDLC) methods include their own composite classifications.
Data Development
This function is the analysis, design, implementation, deployment, and maintenance of data solutions to maximize the value of the data resources to the enterprise.
Ontology
This is a type of model that represents a set of concepts and their relationships within a domain. Both declarative statements and diagrams using data modeling techniques can describe these concepts and relationships. An ontology mostly describes individuals (instances), classes (concepts), attributes, and relations, and can be a collection of taxonomies and thesauri of common vocabulary for knowledge representation and exchange of information.
Project
This is an organized effort to accomplish something.
Snowflaking
This is the term given to de-normalizing the flat, single-table, dimensional structure in a star schema into the respective component hierarchical or network structures.
Extensible Markup Language (XML)
This language facilitates the sharing of data across different information systems and the Internet. It puts tags on data elements to identify the meaning of the data rather than its format (e.g., HTML). Simple nesting and references provide the relationships between data elements. XML uses meta-data to describe the content, structure, and business rules of any document or database.
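A minimal sketch, with invented document content, of how XML tags identify the meaning of data elements and how simple nesting expresses the relationships between them (using Python's standard xml.etree.ElementTree module).

```python
import xml.etree.ElementTree as ET

doc = """
<customer id="C01">
    <name>Acme Ltd</name>
    <orders>
        <order number="1001"><amount currency="USD">250.00</amount></order>
        <order number="1002"><amount currency="USD">75.50</amount></order>
    </orders>
</customer>
"""

root = ET.fromstring(doc.strip())
print(root.get("id"), root.find("name").text)     # tags describe meaning, not presentation
for order in root.findall("./orders/order"):      # nesting expresses the relationship
    print(order.get("number"), order.find("amount").text)
```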
Data Definition Language (DDL)
This language is a subset of Structured Query Language (SQL) used to create tables, indexes, views, and other physical database objects.
Revision (1 of 3 ANSI Standard 859 levels of control)
This level of control is less rigid than formal control but more formal than custody, notifying stakeholders and incrementing versions when a change is required.
Custody (1 of 3 ANSI Standard 859 levels of control)
This level of control is the least rigid and is the least formal, merely requiring safe storage and a means of retrieval.
Formal (1 of 3 ANSI Standard 859 levels of control)
This level of control is the most rigid and requires formal change initiation, thorough change evaluation for impact, decision by a change authority, and full status accounting of implementation and validation to stakeholders.
Business Dimensional Lifecycle
This lifecycle advocates using conformed dimensions and facts design. The conformation process enforces an enterprise taxonomy and consistent business rules so that the parts of the data warehouse become re-usable components that are already integrated.
Duplicate identification match rule (1 of 3 match rules)
This match rule focuses on a specific set of fields that uniquely identify an entity and identifies merge opportunities without taking automatic action. Business data stewards can review these occurrences and decide to take action on a case-by-case basis.
Match-link rule (1 of 3 match rules)
This match rule identifies and cross-references records that appear to relate to a master record without updating the content of the cross-referenced record.
Match-merge rule (1 of 3 match rules)
This match rule matches records and merges the data from these records into a single, unified, reconciled, and comprehensive record. If the rules apply across data sources, create a single unique and comprehensive record in each database.
Meta-Data Documentation Quality (1 of 7 standard meta-data metrics)
This standard meta-data metric assesses the quality of meta-data documentation through both automatic and manual methods. Automatic methods include performing collision logic on two sources, measuring how much they match, and the trend over time. Another metric would measure the percentage of attributes that have definitions, trending over time. Manual methods include random or complete surveys, based on enterprise definitions of quality.
Meta-data Repository Completeness (1 of 7 standard meta-data metrics)
This standard meta-data metric compares ideal coverage of the enterprise meta-data (all artifacts and all instances within scope) to actual coverage. Reference the Strategy for scope definitions.
Meta-data Repository Availability (1 of 7 standard meta-data metrics)
This standard meta-data metric focuses on uptime, processing time (batch and query).
Meta-data Management Maturity (1 of 7 standard meta-data metrics)
This standard meta-data metric is developed to judge the meta-data maturity of the enterprise, based on the Capability Maturity Model (CMM) approach to maturity assessment.
Meta-data Usage / Reference (1 of 7 standard meta-data metrics)
This standard meta-data metric measures user uptake of the meta-data repository; usage can be measured by simple login measures. Reference to meta-data by users in business practice is a more difficult measure to track.
Steward Representation / Coverage (1 of 7 standard meta-data metrics)
This standard meta-data metric represents organizational commitment to meta-data as assessed by the appointment of stewards, coverage across the enterprise for stewardship, and documentation of the roles in job descriptions.
Master Data Service Data Compliance (1 of 7 standard meta-data metrics)
This standard meta-data metric shows the reuse of data in SOA solutions. Meta-data on the data services assists developers in deciding when new development could use an existing service.
Code Management System
This system is the system of record for many reference data sets, and its database would be the database of record.
Human Resource Management
This system manages master data about employees and applicants.
System of Record
This system stores the "official" version of a data attribute and provides the definitive data about an instance.
Matrices
This term can define the relationships to other aspects of the enterprise architecture besides business processes.
Data Profiling
This term defines the process of examining the data available in an existing data source (e.g., a database or a file) and collecting statistics and information about that data, for example to find out whether existing data can easily be used for other purposes. It is a set of algorithms serving two purposes: · Statistical analysis and assessment of the quality of data values within a data set. · Exploring relationships that exist between value collections within and across data sets.
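A minimal column-profiling sketch over an in-memory data set; the field names, sample rows, and chosen statistics are illustrative assumptions.
```python
from collections import Counter

rows = [
    {"customer_id": "1", "country": "US"},
    {"customer_id": "2", "country": "us"},
    {"customer_id": "2", "country": None},
]

def profile_column(rows, column):
    """Collect simple statistics about the values in one column."""
    values = [r[column] for r in rows]
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "null_pct": 100 * (len(values) - len(non_null)) / len(values),
        "distinct": len(set(non_null)),
        "top_values": Counter(non_null).most_common(3),
    }

for col in ("customer_id", "country"):
    print(col, profile_column(rows, col))
```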
Data Transformation
This term focuses on activities that provide organizational context between data elements, entities, and subject areas. Organizational context includes cross-referencing, reference and master data management (see Chapter 8), and complete and correct relationships.
Data Cleansing
This term focuses on the activities that correct and enhance the domain values of individual data elements, including enforcement of standards.
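A minimal sketch of cleansing an individual data element against a standard domain; the country-code mapping and function name are hypothetical.
```python
# Hypothetical standard: two-letter country codes as the approved domain values.
COUNTRY_STANDARD = {"US": "US", "USA": "US", "UNITED STATES": "US", "U.S.": "US"}

def cleanse_country(raw):
    """Correct and enhance a single data element to its standard domain value."""
    if raw is None:
        return None
    key = raw.strip().upper()
    return COUNTRY_STANDARD.get(key, key)   # unknown values pass through for steward review

assert cleanse_country(" united states ") == "US"
```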
Access (1 of 4 data security requirements categories)
This term involves enabling individuals and their privileges in a timely manner.
Audit (1 of 4 data security requirements categories)
This term involves reviewing security actions and user activity to ensure compliance with regulations and conformance with policy and standards.
Attribute
This term is a property of an entity; a type of fact important to the business whose values help identify or describe an entity instance.
Enterprise Data Warehouse
This term is defined as a centralized data warehouse designed to service the business intelligence needs of the entire organization. It adheres to an enterprise data model to ensure consistency of decision support activities across the enterprise.
Vocabulary
This term is defined as a collection of terms / concepts and their relationships.
Data Warehouse
This term is defined as a combination of two primary components. The first is an integrated decision support database. The second is the related software programs used to collect, cleanse, transform, and store data from a variety of operational and external sources.
Directory
This term is defined as a type of meta-data store that limits the meta-data to the location or source of data in the enterprise. Sources may be tagged as the system of record (it may be useful to use symbols such as "gold") or at another level of quality.
Control of Repositories
This term is defined as control of meta-data movement and repository updates performed by the meta-data specialist. These activities are administrative in nature and involve monitoring and responding to reports, warnings, job logs, and resolving various issues in the implemented repository environment.
Conformed Dimension
This term is defined as a dimension that has the same meaning to every fact with which it relates. These dimensions allow facts and measures to be categorized and described in the same way across multiple facts and / or data marts, ensuring consistent reporting across the enterprise.
Dependent Data Marts
This term is defined as subset copies of a data warehouse database.
Grain
This term is defined as the meaning or description of a single row of data in a fact table. Or, put another way, it refers to the atomic level of the data for a transaction.
Taxonomy
This term is defined as the science or technique of classification.
Availability
This term is described as the percentage of time that a system or database can be used for productive work.
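A minimal sketch of the underlying calculation; the scheduled-hours and downtime figures are illustrative.
```python
def availability_pct(scheduled_hours, downtime_hours):
    """Percentage of time the system or database could be used for productive work."""
    return 100 * (scheduled_hours - downtime_hours) / scheduled_hours

print(availability_pct(720, 4))   # e.g. 4 hours of outage in a 720-hour month -> ~99.44%
```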
Operational BI
This type of BI provides BI to the front lines of the business, where analytical capabilities guide operational decisions. Use this BI to manage and optimize business operations. It entails the coupling of BI applications with operational functions and processes, with a requirement for very low tolerance for latency (near real-time data capture and data delivery). Therefore, more architectural approaches such as service-oriented architecture (SOA) become necessary to support this BI fully.
Application DBA
This type of DBA is responsible for one or more databases in all environments (development / test, QA, and production), as opposed to database systems administration for any of these environments.
Procedural DBA
This type of DBA specializes in development and support of procedural logic controlled and executed by the DBMS: stored procedures, triggers, and user-defined functions (UDFs). This DBA ensures that procedural logic is planned, implemented, tested, and shared (reused). This DBA leads the review and administration of procedural database objects.
Hybrid Online Analytical Processing (HOLAP) (1 of 3 implementation approaches supporting Online Analytical Processing)
This type of Online Analytical Processing is a combination of ROLAP and MOLAP. Implementations allow part of the data to be stored in MOLAP form and another part of the data to be stored in ROLAP form.
Database Online Analytical Processing (DOLAP)
This type of Online Analytical Processing is a virtual OLAP cube available as a special proprietary function of a classic relational database.
Multi-dimensional Online Analytical Processing (MOLAP) (1 of 3 implementation approaches supporting Online Analytical Processing)
This type of Online Analytical Processing supports OLAP by using proprietary and specialized multi-dimensional database technology.
Relational Online Analytical Processing (ROLAP) (1 of 3 implementation approaches supporting Online Analytical Processing)
This type of Online Analytical Processing supports OLAP by using techniques that implement multi-dimensionality in the two-dimensional tables of relational database management systems (RDBMS). Star schema joins are a common database design technique used in these environments.
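A minimal sketch of a star-schema join in a relational database, run through Python's sqlite3; the table and column names are hypothetical.
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, fiscal_year INTEGER);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales  (date_key INTEGER, product_key INTEGER, amount REAL);
""")
# A typical ROLAP query: join the fact table to its dimensions and aggregate a measure.
cursor = conn.execute("""
    SELECT d.fiscal_year, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date d    ON d.date_key = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY d.fiscal_year, p.category
""")
print(cursor.fetchall())   # empty here; no rows were loaded
conn.close()
```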
Analytical Applications
This type of application includes the logic and processes to extract data from well-known source systems, such as vendor ERP systems, a data model for the data mart, and pre-built reports and dashboards.
Staging Area
This type of area is the intermediate data store between an original data source and the centralized data repository. All required cleansing, transformation, reconciliation, and relationships happen in this area.
Data Stewardship Meta-Data (1 of 4 types of meta-data)
This type of meta-data describes data stewards, stewardship processes, and responsibility assignments.
Term and Abbreviation Standardization
This type of data cleansing activity ensures that certain terms and short forms of those terms appear consistently in the standardized data set.
Product Master Data
This type of data focuses on an organization's internal products and services, or on the entire industry including competitor products and services, and may exist in structured or unstructured formats.
Financial Master Data
This type of data includes data about business units, cost centers, profit centers, general ledger accounts, budgets, projections, and projects. Typically, an Enterprise Resource Planning (ERP) system serves as the central hub for this type of data.
Party Master Data
This type of data includes data about individuals, organizations, and the roles they play in business relationships. In the commercial environment, this includes customer, employee, vendor, partner, and competitor data. In the public sector, the focus is on data about citizens. In law enforcement, the focus is on suspects, witnesses, and victims. In not-for-profit organizations, the focus is on members and donors. In healthcare, the focus is on patients and providers, while in education, the focus is on students and faculty. The challenges of this type of data are: · The complexity of roles and relationships played by individuals and organizations. · Difficulties in unique identification. · The high number of data sources. · The business importance and potential impact of the data.
Technical Meta-Data (1 of 4 types of meta-data)
This type of data includes physical database table and column names, column properties, other database object properties, and data storage.
Business Meta-Data (1 of 4 types of meta-data)
This type of data includes the business names and definitions of subject and concept areas, entities, and attributes; attribute data types and other attribute properties; range descriptions; calculations; algorithms and business rules; and valid domain values and their definitions.
Master Data
This type of data is data about the business entities that provide context for business transactions. It is the authoritative, most accurate data available about key business entities, used to establish the context for transactional data. The data values are considered golden.
Operational Meta-Data (1 of 4 types of meta-data)
This type of data is targeted at IT operations users' needs, including information about data movement, source and target systems, batch programs, job frequency, schedule anomalies, recovery and backup information, archive rules, and usage.
Reference Data
This type of data is used to classify or categorize other data.
Master data management
This type of data management function requires identifying and / or developing a "golden" record of truth for each product, place, person, or organization. It is the process of defining and maintaining how master data will be created, integrated, maintained, and used throughout the enterprise. It can be implemented through data integration tools (such as ETL), data cleansing tools, and operational data stores (ODS) that serve as master data hubs.
Reference Data Management
This type of data management is explained as control over defined domain values (also known as vocabularies), including control over standardized terms, code values and other unique identifiers, business definitions for each value, business relationships within and across domain value lists, and the consistent, shared use of accurate, timely and relevant reference data values to classify and categorize data.
Centralized Meta-data Architecture
This type of meta-data architecture consists of a single meta-data repository that contains copies of the live meta-data from the various sources. Organizations with limited IT resources, or those seeking to automate as much as possible, may choose to avoid this architecture option. Organizations that prioritize a high degree of consistency and uniformity within the common meta-data repository can benefit from a centralized architecture.
Hybrid Meta-data Architecture
This type of meta-data architecture is where meta-data still moves directly from the source systems into a repository. However, the repository design only accounts for the user-added meta-data, the critical standardized items, and the additions from manual sources.
Distributed Meta-data Architecture
This type of meta-data architecture maintains a single access point. The meta-data retrieval engine responds to user requests by retrieving data from source systems in real time; there is no persistent repository. In this architecture, the meta-data management environment maintains the necessary source system catalogs and lookup information needed to process user queries and searches effectively.
Proprietary interface (1 of 2 meta-data repository scanning methods)
This type of meta-data repository scanning is defined as a single-step scan-and-load process: a scanner collects the meta-data from a source system, then directly calls the format-specific loader component to load the meta-data into the repository. There is no format-specific file output; the collection and loading of meta-data occur in a single step.
Semi-Proprietary interface (1 of 2 meta-data repository scanning methods)
This type of meta-data repository scanning is defined as a two-step process: a scanner collects the meta-data from a source system and outputs it into a format-specific data file. The scanner produces only a data file, which the receiving repository must be able to read and load appropriately. This interface is a more open architecture, as the file is readable by many methods.
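A minimal Python sketch of the two-step pattern: the scanner writes a format-specific file, and a separate loader reads it into the repository. The file format, function names, and in-memory "repository" are hypothetical; a proprietary (single-step) interface would call the loader directly instead of writing a file.
```python
import json

def scan_source(source_tables):
    """Step 1: collect meta-data from a source system into a format-specific file."""
    with open("scan_output.json", "w") as f:
        json.dump([{"table": t} for t in source_tables], f)

def load_repository(path, repository):
    """Step 2: a separate loader reads the file and loads the repository."""
    with open(path) as f:
        repository.extend(json.load(f))

repository = []                         # stand-in for the meta-data repository
scan_source(["customer", "orders"])
load_repository("scan_output.json", repository)
print(repository)
```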
Business Continuity Plan
This type of plan contains written policies, procedures, and information designed to mitigate the impact of threats to all media of an organization's documents / records, and to recover them in the event of a disaster in a minimum amount of time and with a minimum amount of disruption.
Document / records retention and disposition program
This type of program defines the period of time during which documents / records for operational, legal, fiscal or historical value must be maintained. It defines when the documents / records are not active anymore and can be transferred to a secondary storage facility, such as off-site storage. The program specifies the processes for compliance, and the methods and schedules for the disposition of documents / records.
Vital Records Program
This type of program provides the organization with access to the records necessary to conduct its business during a disaster, and to resume normal business afterward.
Production Reporting
This type of reporting crosses the DW-BIM boundary and often queries transactional systems to produce operational items such as invoices or bank statements. The developers of production reports tend to be IT personnel.
Data quality incident reporting system
This type of reporting system can log the evaluation, initial diagnosis, and subsequent actions associated with data quality events.
Meta-Data Repository
This type of repository refers to the physical tables in which the meta-data are stored.
Meta-Data Strategy
This type of strategy is a statement of direction in meta-data management by the enterprise. It is a statement of intent and acts as a reference framework for the development teams. Its purpose is to gain an understanding of and consensus on the organization's key business drivers, issues, and information requirements for the enterprise meta-data program. The objective is to understand how well the current environment meets these requirements, both now and in the future.
Bridge Tables
This type of table is formed in two situations. The first is when a many-to-many relationship between two dimensions is not, or cannot be, resolved through a fact table relationship. One example is a bank account with shared owners. This table captures the list of owners in an 'owner group' table. The second is when normalizing variable-depth or ragged hierarchies. This table can capture each parent-child relationship in the hierarchy, enabling more efficient traversal.
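A minimal DDL sketch (again via sqlite3; the names are hypothetical) of an owner-group bridge table resolving the many-to-many relationship between accounts and owners.
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_account (account_key INTEGER PRIMARY KEY, owner_group_key INTEGER);
    CREATE TABLE dim_owner   (owner_key INTEGER PRIMARY KEY, owner_name TEXT);
    -- The bridge table captures the list of owners in each owner group.
    CREATE TABLE bridge_owner_group (
        owner_group_key INTEGER,
        owner_key       INTEGER,
        PRIMARY KEY (owner_group_key, owner_key)
    );
""")
conn.close()
```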
Snowflake Tables
This type of table is formed when a hierarchy is resolved into level tables. For example: a daily Period Dimension table resolves into the detail table for Date, and another table for Month or Year that is linked directly to the Date table.
Outrigger Tables
This type of table is formed when attributes in one dimension table link to rows in another dimension table. For example, a date field in one dimension (such as Employee Hire Date) links to the Period Dimension table to facilitate queries that want to sort Employees by Hire Date Fiscal Year.
Dimension Tables
This type of table represents the important objects of the business and contains textual descriptions of the business. These tables serve as the primary source for "query by" or "report by" constraints. The two main approaches to identifying keys for these tables are surrogate keys and natural keys.
Flat Taxonomy (1 of 4 Taxonomy Types)
This type of taxonomy has no relationship among the controlled set of categories as the categories are equal. An example is a list of countries.
Hierarchical Taxonomy (1 of 4 Taxonomy Types)
This type of taxonomy is a tree structure of at least two levels and is bi-directional. An example is geography, from continent down to address.
Facet Taxonomy (1 of 4 Taxonomy Types)
This type of taxonomy looks like a star where each node is associated with the center node.
Network Taxonomy (1 of 4 Taxonomy Types)
This type of taxonomy organizes content into both hierarchical and facet categories. Any two nodes in this taxonomy link based on their associations. An example is a recommender engine (...if you liked that, you might also like this...). Another example is a thesaurus.
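A minimal Python sketch of a hierarchical taxonomy as a child-to-parent mapping, with lookups in both directions to reflect that the hierarchy is bi-directional; the geography values are illustrative.
```python
# Child -> parent; a tree of at least two levels.
PARENT = {
    "North America": None,
    "United States": "North America",
    "California": "United States",
}

def ancestors(node):
    """Walk upward (child to parent) through the hierarchy."""
    while PARENT.get(node) is not None:
        node = PARENT[node]
        yield node

def children(node):
    """Walk downward (parent to child): the other direction of the bi-directional taxonomy."""
    return [child for child, parent in PARENT.items() if parent == node]

print(list(ancestors("California")))   # ['United States', 'North America']
print(children("North America"))       # ['United States']
```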
Data Management Function Mission
To meet and exceed the information needs of all the stakeholders in the enterprise in terms of information availability, security, and quality.
Data Architecture Management Activities
· Understand Enterprise Information Needs. · Develop and Maintain the Enterprise Data Model. · Analyze and Align With Other Business Models. · Define and Maintain the Database Architecture (same as 4.2.2). · Define and Maintain the Data Integration Architecture (same as 6.3). · Define and Maintain the DW / BI Architecture (same as 7.2). · Define and Maintain Enterprise Taxonomies and Namespaces (same as 8.2.1). · Define and Maintain the Meta-data Architecture (same as 9.2).
Meta-Data Management Activities
· Understand Meta-data Requirements. · Define the Meta-data Architecture (same as 2.8). · Develop and Maintain Meta-data Standards. · Implement a Managed Meta-data Environment. · Create and Maintain Meta-data. · Integrate Meta-data. · Manage Meta-data Repositories. · Distribute and Deliver Meta-data. · Query, Report, and Analyze Meta-data.
Reference and Master Data Management Activities
· Understand Reference and Master Data Integration Needs. · Identify Master and Reference Data Sources and Contributors. · Define and Maintain the Data Integration Architecture (same as 2.5). · Implement Reference and Master Data Management Solutions. · Define and Maintain Match Rules. · Establish "Golden" Records. · Define and Maintain Hierarchies and Affiliations. · Plan and Implement Integration of New Data Sources. · Replicate and Distribute Reference and Master Data. · Manage Changes to Reference and Master Data.
Knowledge
Understanding; awareness, cognizance, and the recognition of a situation and familiarity with its complexity.
Essential Attributes
What are data attributes without which the enterprise cannot function?
Systems architecture
What type of architecture artifacts are these? Applications, software components, interfaces, projects.
Information architecture
What type of architecture artifacts are these? Business entities, relationships, attributes, definitions, reference values.
Process architecture
What type of architecture artifacts are these? Functions, activities, workflow, events, cycles, products, procedures.
Business architecture
What type of architecture artifacts are these? Goals, strategies, roles, organization structures, locations.
Information value chain analysis artifacts
What type of architecture artifacts are these? Mapping the relationships between data, process, business, systems, and technology.
Technology architecture
What type of architecture artifacts are these? Networks, hardware, software platforms, standards, protocols.
Recurring Themes: Enterprise Perspective
Whenever possible, manage data assets consistently across the enterprise. Enterprise Information Management (EIM) is a best practice for data management.
SDLC systems development lifecycle
Where are these specification and implementation activities included? · Project Planning, including scope definition and business case justification. · Requirements Analysis. · Solution Design. · Detailed Design. · Component Building. · Testing, including unit, integration, system, performance, and acceptance testing. · Deployment Preparation, including documentation development and training. · Installation and Deployment, including piloting and rollout.
Data Quality Management Cycle
Which data management cycle involves the following? · Planning for the assessment of the current state and identification of key metrics for measuring data quality. · Deploying processes for measuring and improving the quality of data. · Monitoring and measuring the levels in relation to the defined business expectations. · Acting to resolve any identified issues to improve data quality and better meet business expectations. This cycle begins by identifying the data issues that are critical to the achievement of business objectives, defining business requirements for data quality, identifying key data quality dimensions, and defining the business rules critical to ensuring high quality data.
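A minimal sketch of the monitor-and-act portion of the cycle: measure a data quality metric against its defined business expectation and trigger action when it falls outside the acceptable range. The metric, threshold, and action shown are hypothetical.
```python
def completeness_pct(values):
    """Example data quality metric: percentage of non-null values."""
    return 100 * sum(v is not None for v in values) / len(values)

THRESHOLD = 98.0                         # acceptability threshold set by business expectations

measured = completeness_pct(["a", None, "c", "d"])   # 75.0
if measured < THRESHOLD:
    # Act: in practice this would notify the accountable steward / log a DQ incident.
    print(f"Completeness {measured:.1f}% below threshold {THRESHOLD}% - open DQ incident")
```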
S.M.A.R.T
specific, measurable, achievable, realistic, and timely
Data Strategy Components
· A compelling vision for data management. · A summary business case for data management, with selected examples. · Guiding principles, values, and management perspectives. · The mission and long-term directional goals of data management. · Management measures of data management success. · Short-term (12-24 months) SMART (specific / measurable / actionable / realistic / time-bound) data management program objectives. · Descriptions of data management roles and organizations, along with a summary of their responsibilities and decision rights. · Descriptions of data management program components and initiatives. · An outline of the data management implementation roadmap (projects and action items). · Scope boundaries and decisions to postpone investments and table certain issues.
Physical database design document elements
· An introductory description of the business function of the database design; for example, what aspect or subset of the business data does this database design encompass? · A graphical model of the design, done in ER format for a relational design, or in UML for an object-oriented design. · Database-language specification statements. In Structured Query Language (SQL), these are the Data Definition Language (DDL) specifications for all database objects (tablespaces, tables, indexes, indexspaces, views, sequences, etc., and XML Namespaces). · Documentation of the technical meta-data, including data type, length, domain, source, and usage of each column, and the structure of keys and indexes related to each table. · Use cases or sample data, showing what the actual data will look like.
The purpose of a data model is to facilitate?
· Communication: A data model is a bridge to understanding data between people with different levels and types of experience. Data models help us understand a business area, an existing application, or the impact of modifying an existing structure. Data models may also facilitate training new business and / or technical staff. · Formalization: A data model documents a single, precise definition of data requirements and data-related business rules. · Scope: A data model can help explain the data context and scope of purchased application packages.
Related Data Governance Frameworks
· Corporate Governance (COSO ERM). · IT Governance (COBIT). · Enterprise Architecture (Zachman Framework, TOGAF). · System Development Lifecycle (Rational Unified Process, for example). · System Development Process Improvement (SEI CMMI). · Project Management (PRINCE II, PMI PMBOK).
Technology Architecture component categories
· Current: Products currently supported and used. · Deployment Period: Products deployed for use in the next 1-2 years. · Strategic Period: Products expected to be available for use in the next 2+ years. · Retirement: Products the organization has retired or intends to retire this year. · Preferred: Products preferred for use by most applications. · Containment: Products limited to use by certain applications. · Emerging: Products being researched and piloted for possible future deployment.
Topics covered in the data policy
· Data modeling and other data development activities within the SDLC. · Development and use of data architecture. · Data quality expectations, roles, and responsibilities (including meta-data quality). · Data security, including confidentiality classification policies, intellectual property policies, personal data privacy policies, general data access and usage policies, and data access by external parties. · Database recovery and data retention. · Access and use of externally sourced data. · Sharing data internally and externally. · Data warehousing and business intelligence policies. · Unstructured data policies (electronic files and physical records).
Technology categories in the data technology architecture
· Database management systems (DBMS). · Database management utilities. · Data modeling and model management tools. · Business intelligence software for reporting and analysis. · Extract-transform-load (ETL), changed data capture (CDC), and other data integration tools. · Data quality analysis and data cleansing tools. · Meta-data management software, including meta-data repositories.
Things that enterprise architecture can help achieve
· Enable integration of data, processes, technologies, and efforts. · Align information systems with business strategy. · Enable effective use and coordination of resources. · Improve communication and understanding across the organization. · Reduce the cost of managing the IT infrastructure. · Guide business process improvement. · Enable organizations to respond effectively to changing market opportunities, industry challenges, and technological advances. Enterprise architecture helps evaluate business risk, manage change, and improve business effectiveness, agility, and accountability.
Other data management terms
· Information Management (IM). · Enterprise Information Management (EIM). · Enterprise Data Management (EDM). · Data Resource Management (DRM). · Information Resource Management (IRM). · Information Asset Management (IAM).
Four related factors affecting database availability
· Manageability: The ability to create and maintain an effective environment. · Recoverability: The ability to reestablish service after interruption, and correct errors caused by unforeseen events or component failures. · Reliability: The ability to deliver service at specified levels for a stated period. · Serviceability: The ability to determine the existence of problems, diagnose their causes, and repair / solve the problems.
Data models that include the same data may differ by?
· Scope: Expressing a perspective about data in terms of function (business view or application view), realm (process, department, division, enterprise, or industry view), and time (current state, short-term future, long-term future). · Focus: Basic and critical concepts (conceptual view), detailed but independent of context (logical view), or optimized for a specific technology and use (physical view).