MIS 3376 - Key Terms
Report writer
A DBMS tool that enables you to set up reports on the screen to specify how items will be displayed or calculated. Most of these tasks are performed by dragging data onto the screen.
GROUP BY
A SQL SELECT clause that computes an aggregate value for each item in a group. For example, SELECT Department, SUM(Salary) FROM Employee GROUP BY Department; computes and lists the total employee salaries for each department.
HAVING
A SQL clause used with the GROUP BY statement. It restricts the output to only those groups that meet the specified condition.
BETWEEN
A SQL comparison operator that determines whether an item falls between two values. Often useful for dates.
Collaboration diagram
A UML diagram to display interactions among objects. It does not show time as a separate dimension. It is used to model processes.
Derived class
A class that is created as an extension of another class. The programmer need only define the new attributes and methods. All others are inherited from the higher-level classes. See inheritance.
Table
A collection of data for one class or entity. It consists of columns for each attribute and a row of data for each specific entity or object.
Foreign key
A column in one table that is a primary key in a second table. It does not need to be a key in the first table. For example, in an Order table, CustomerID is a foreign key because it is a primary key in the Customer table.
Primary key
A column or set of columns that identify a particular row in a table. In nonrelational DBMSs, the primary key defines how the data will be stored and retrieved.
Master-detail
A common one-to-many relationship often found on business forms, where the main form (e.g., Order) displays data for the master component, and a subform (e.g., Order Items) displays detail data. Sometimes called a parent-child relationship.
Data repository
A complete listing of all terms used in a database design, including column names and tables. See data dictionary.
Pseudo column
A computed column in a table that can only display data, not accept updates. Used to predefine row-wise calculations that are commonly used, such as Value=price*quantity.
Binary large object (BLOB)
A data domain for undefined, large chunks of data. A BLOB (or simple object) type can hold any type of data, but the programmer is often responsible for displaying, manipulating, and searching the data.
Referential integrity
A data integrity constraint where data can be entered into a foreign key column only if the data value already exists in the base table. For example, clerks should not be able to enter an Order for CustomerID 1173 if CustomerID 1173 is not in the Customer table.
Object-oriented database management system (OODBMS)
A database system that holds objects, including properties and methods. It supports links between objects, including inheritance.
Hidden dependency
A dependency specified by business rules that is not shown in the table structure. It generally indicates that the table needs to be normalized further and is an issue with Boyce-Codd or fourth normal form.
Class
A descriptor for a set of objects with similar structure, behavior, and relationships. That is, a class is the model description of the business entity. A business model might have an Employee class, where one specific employee is an object in that class.
Attribute
A feature or characteristic of an entity. An attribute will become a column in a data table. Employee attributes might include name, address, date hired, and phone.
Query by example (QBE)
A fill-in-the-form approach to designing queries. You select tables and columns from a list and fill in blanks for conditions and sorting. It is relatively easy to use, requires minimal typing skills, generally comes with a Help system, and is useful for beginners.
Method
A function or operation that a class can perform. For example, a Customer class would generally have an AddNew method that is called whenever a new customer object is added to the database.
Class diagram
A graph of classes connected through relationships. It is designed to show the static structure of the model. Similar to the entity-relationship diagram.
Class hierarchy
A graph that highlights the inheritance relationships between classes.
Globally-unique identifier (GUID)
A large number that can be created on one computer and be different from all other numbers created. Often used for generated keys in a replicated database.
Null
A missing (or currently unassigned) value.
Intranet
A network internal to a company that uses Internet technologies to share data.
Composite key
A primary key that consists of more than one column. Indicates a many-to-many relationship between the columns.
Feasibility study
A quick examination of the problems, goals, and expected costs of a proposed system. The objective is to determine whether the problem can reasonably be solved with a computer system.
Generalization association
A relationship among classes that begins with a generic class. More detailed classes are derived from it and inherit the properties and methods of the higher level classes.
Reflexive association
A relationship from one class back to itself. Most commonly seen in business in an Employee class, where some employees are managers over other employees.
Composition association
A relationship in which an object is composed of a collection of other objects. For example, a bicycle is built from components. In UML, it is indicated with a small filled diamond on the association end.
Report services
A server-based tool that processes reports and enables users to browse for reports and generate them on demand.
Data definition language (DDL)
A set of commands that are used to define data, such as CREATE TABLE. Graphical interfaces are often easier to use, but the data definition commands are useful for creating new tables with a program.
Data manipulation language (DML)
A set of commands used to alter the data. See INSERT, DELETE, and UPDATE.
XQuery
A standardized language to retrieve individual elements or groups of data from an XML file or text group.
Unified Modeling Language (UML)
A standardized modeling language for designing and documenting computer and business systems.
Rapid application development (RAD)
A systems design methodology that attempts to reduce development time through efficiency and overlapping stages.
First normal form (1NF)
A table is in 1NF when there are no repeating groups within it. Each cell can contain only one value. For example, how may items can be placed in one Order table? The items repeat, so they must be split into a separate table.
Second normal form (2NF)
A table is in 2NF if every nonkey column depends on the entire key (not just part of it). This issue arises only if there is a concatenated key (with multiple columns).
Third normal form (3NF)
A table is in third normal form (3NF) if each nonkey column depends on the whole key and nothing but the key.
Extensible markup language (XML)
A tagbased notation system that is used to assign names and structure to data. It was mainly designed for transferring data among diverse systems.
Alias
A temporary name for a table or a column. Often used when you need to refer to the same table more than once, as in a self-join.
Autonumber
A type of data domain where the DBMS automatically assigns a unique identification number for each new row of data. Useful for generating primary keys.
Data type
A type of data that can be held by a column. Each DBMS has predefined system domains (integer, float, string, etc.). Some systems support user-defined domains that are named combinations of other data types.
Boyce-Codd normal form (BCNF)
All dependencies must be explicitly shown through keys. There cannot be a hidden dependency between nonkey and key columns.
DISTINCT
An SQL keyword used in the SELECT statement to remove duplicate rows from the output.
DeMorgan's law
An algebraic law that states: To negate a condition that contains an AND or an OR connector, you negate each of the two clauses and switch the connector. An AND becomes an OR and vice versa.
N-ary association
An association among three or more classes. It is drawn as a diamond on a UML class diagram. The term comes from extending English terms: unary means one, binary means two, ternary means four; so N-ary means many.
Relationship
An association between two or more entities. See association.
Property
An attribute or feature of an entity that we wish to track. The term is often applied in an object-oriented context. See attribute.
Network database
An older DBMS type that expanded the hierarchical database by supporting multiple connections between entities. A network database is characterized by the requirement that all connections had to be supported by an index.
Hierarchical database
An older DBMS type that organizes data in hierarchies that can be rapidly searched from top to bottom, e.g., Customer - Order - OrderItem.
Cross join
Arises when you do not specify a join condition for two tables. It matches every row in the first table with every row in the second table. Also known as the Cartesian product. It should be avoided.
Association
Connections between classes or entities. Generally, they represent business rules. For example, an order can be placed by one customer. It is important to identify whether the association is one-to-one, one-to-many, or many-to-many.
Boolean algebra
Creating and manipulating logic queries connected with AND, OR, and NOT conditions.
Metadata
Data about data, or the description of the data tables and columns. Usually held in the data dictionary. For example, table definitions and column domains are metadata.
Database
Database A collection of data stored in a standardized format, designed to be shared by multiple users. A collection of tables for a particular business situation.
Database engine
Database engine The heart of the DBMS. It is responsible for storing, retrieving, and updating the data.
Subtable
In SQL 1999 a subtable inherits all of the columns from a base table. It provides inheritance similar to that of the abstract data types; however, all data is stored in separate columns.
Abstract data types
In SQL 1999 the ability to define more complex data domains that support inheritance for storing objects.
Persistent stored modules (PSM)
In SQL99 a proposed method for storing methods associated with objects. The module code would be stored and retrieved automatically by the DBMS.
Association role
In UML the point where an association attaches to a class. It can be named, and generally shows multiplicity, aggregation, or composition.
Polymorphism
In a class hierarchy each new class inherits methods from the prior classes. Through polymorphism you can override those definitions and assign a new method (with the same name) to the new class.
Inheritance
In object-oriented design, the ability to define new classes that are derived from higher-level classes. New classes inherit all prior properties and methods, so the programmer only needs to define new properties and methods.
Persistent objects
In object-oriented programming, the ability to store objects (in a file or database) so that they can be retrieved at a later date.
Dependence
An issue in data normalization. An attribute A depends on another attribute B if the values of A change in response to changes in B. For example, a customer's name depends on the CustomerID (each employee has a specific name). On the other hand, a customer's name does not depend on the OrderID. Customers do not change their names each time they place an order.
Entity
An item in the real world that we wish to identify and track.
Default values
Values that are displayed and entered automatically. Used to save time at data entry.
JOIN
When data is retrieved from more than one table, the tables must be joined by some column or columns of data. See INNER JOIN and LEFT JOIN.
Cascading delete
When tables are linked by data, if you delete a row in a higher level table, matching rows in other tables are deleted automatically. For example, if you delete Customer 1173, all orders placed by that customer are also deleted.
Data dictionary
Holds the definitions of all of the data tables and describes the type of data that is being stored.
Repeating groups
Groups of data that repeat, such as items being ordered by a customer, multiple phone numbers for a client, and tasks assigned to a worker.
Encapsulation
In object-oriented programming, the technique of defining attributes and methods within a common class. For example, all features and capabilities of an Employee class would be located together. Other code objects can use the properties and methods but only by referencing the Employee object.
Surrogate keys
Internally generated keys used to identify objects. They are often better keys than keys created by external sources because they are easier to guarantee uniqueness. See AutoNumber.
Data integrity
Keeping accurate data, which means few errors and means that the data reflects the true state of the business. A DBMS enables you to specify constraints or rules that help maintain integrity, such as prices must always be greater than 0.
Common language runtime (CLR)
Microsoft's base programming language. It is embedded into versions of SQL Server from 2005, so database procedures can be written in CLR langugages such as Visual Basic and C# including access to all of their functions. Needed to use RegEx within SQL Server. Note that CLR support is turned off by default.
Deletion anomaly
Problems that arise when you delete data from a table that is not in third normal form. For example, if all customer data is stored with each order, when you delete an order, you could lose all associated customer data.
Insertion anomaly
Problems that arise when you try to insert data into a table that is not in third normal form. For example, if you find yourself repeatedly entering the same data (e.g., a customer's address), the table probably needs to be redefined.
Data mining
Searching databases for unknown patterns and information. Tools include statistical analysis, pattern-matching techniques, and data segmentation analysis, classification analysis, association rules, and cluster analysis.
Data independence
Separates the data from the programs, which often enables the data definition to be changed without altering the program.
Database management system (DBMS)
Software that defines a database, stores the data, supports a query language, produces reports, and creates data entry screens.
FROM
The SQL SELECT clause that signifies the tables from which the query will retrieve data. Used in conjunction with the JOIN or INNER JOIN statement.
NOT
The SQL negation operator. Used in the WHERE clause to reverse the truth value of a statement. See DeMorgan's law
LIKE
The SQL pattern-matching operator used to compare string values. The standard uses percent (%) to match any number of characters, and underscore (_) to match a single character. Some systems (e.g., Access) use an asterisk (*) and a question mark (?) instead.
Multiplicity
The UML term for signifying the quantities involved in an association. It is displayed on an association line with a minimum value, an ellipses (...), and a maximum value or asterisk (*) for many. For example, a customer can place from zero to many orders, so the multiplicity is (0...*).
ORDER BY
The clause in the SQL SELECT statement that lists the columns to sort the output. The modifiers ASC and DESC are used to specify ascending and descending sort orders.
FETCH
The command used in SQL cursor programming to retrieve the next row of data into memory.
Data volume
The estimated size of the database. Computed for each table by multiplying the estimated number of rows times the average data length of each row
Aggregation
The generic name for several SQL functions that operate across the selected rows. Common examples include SUM, COUNT, and AVERAGE.
DESC
The modifier in the SQL SELECT ... ORDER BY statement that specifies a descending sort (e.g., Z ... A). ASC can be used for ascending, but it is the default, so it is not necessary.
Relational database
The most popular type of DBMS. All data is stored in tables (sometimes called relations). Tables are logically connected by the data they hold (e.g., through the key values). Relational databases should be designed through data normalization
Relational database
The most popular type of DBMS. All data is stored in tables (sometimes called relations). Tables are logically connected by the data they hold (e.g., through the key values). Relational databases should be designed through data normalization.
Data normalization
The process of creating a well-behaved set of tables to efficiently store data, minimize redundancy, and ensure data integrity. See first, second, and third normal form.
Forms development
The process of designing and creating input forms to collect data and store it in the database.
Atomic
The smallest single-valued form of a data element. A table cannot be in first normal form if the cells contain non-atomic data. The definition is subjective depending on the application. For instance, an address such as 123 Main Street is usually considered to be atomic even though it refers to both a house number and a street.
Domain-key normal form (DKNF)
The ultimate goal in designing a database. Each table represents one topic, and all of the business rules are expressed in terms of domain constraints and key relationships. That is, all of the business rules are explicitly described by the table rules.
Online analytical processing (OLAP)
The use of a database for data analysis. The focus is on retrieval of the data. The primary goals are to provide acceptable response times, maintain security, and make it easy for users to find the data they need.
Fourth normal form (4NF)
There cannot be hidden dependencies between key columns. A multi-valued dependency exists when a key determines two separate but independent attributes. Split the table to make the two dependencies explicit.