BCIS Exam 1 Review
5 major constraints
1. Not Null (NN) 2. Unique (U) 3. Primary Key (PK) 4. Foreign Key (FK) 5. Check (CHK)
Purpose of constraints
A constraint is a rule placed on a table or column that restricts operations and data values allowed▪ Enforces rules at the table level▪ Enforces integrity▪ Prevents deletion of tables, if dependencies exist▪ Can be defined during or after table creation
Database management system
A database system is an information or computer system that manages the collection, storage, retrieval of data
File system tasks
Allows different file system tasks (e.g., copy, move, delete file and folders)
PowerBI Filters
Allows you to further edit and streamline your report by adding columns to the filters
A RDBMS processes transactions using the ACID property. What does ACID stand for
Atomicity Consistency Isolation Durability
A in ACID
Atomicity. All tasks performed or none at all
T-SQL Predicates
BETWEEN• EXISTS• IN• LIKE• CONTAINS• IS [NOT] NULL
Database design process
Conceptual design, Logical database design, Physical database design
Fact Tables
Contain measures related to the subject of analysis and the foreign keys (associating fact tables with dimension tables) record the measures of business events (e.g., sales) occurring in an enterprise
SSIS Package
Control Flow(workflow) + Data Flow (Data Transformation) + Connection (Connection to source,destination) = Package
Logical database design
Converting ERDs from previous step to logical relational tables; Normalizing and designing tables
You are designing a table to store demographic information for a genealogy company in the US. The table must store information for the past 250 years. What data types should you use for the [BirthDate] field?
DATETIME
A SQL statement that rolls back a transaction is generally referred to as a ________.
DCL
A SQL statement that alters a table is generally called
DDL
What is the difference between delete and drop
Drop is ddl and delete is dml
ER or Dim models are used for operational databases?
ER
The design of operational data is based on...
ER Model
First Normal Form
Every column is atomic, single valued: the column cannot be further divided into sub-columns All attributes are functionally dependent on all or part of the key
Control Flow tasks
Executes SQL Statements, stored procedures Different from Execute T-SQL Statement
E in ETL
Extract. Get data from several different sources (e.g., Excel, SQL Server, Oracle, text files)
T-SQL clauses
FROM • WHERE• GROUP BY• HAVING• ORDER BY
Logical Design for Many to Many relationship
First, a new junction table is created Second, the primary keys from each related table is placed as a concatenated primary key in the new junction table Third, related attributes are added to the junction table
Physical database design
Implementing or physically creating the database, the relational tables, using T-SQL
What is a data warehouse?
In general, a DW is a part of the overall business intelligence system
What data type is the primary key?
Integer
Exact Numerics
Integer, SmallInt, TinyInt, decimal, numeric
I in ACID
Isolation. Result only visible once committed
To create a simple Database in SQL Server 2019 involves the following steps.
Open SSMS, Right-click on Database, Click New Database, Enter Database Name, OK
SSIS Architecture. Slide 12 Lec 8
Package: Control Flow Task Data flow
Which of the following is the correct processing order of the DW Architecture?
Sources, ETL, DW
3 SSIS constraints
Success Complete Failure
Data Visualization
The process of transforming data and metrics into visual charts, graphs, that make is easier for users to understand, make sense of, and gain new insights
Primary key constraint
This constraint ensures that the column is the PK for the table and will not have duplicate values
A relational database is a database structure that allows objects to have relationships with other objects in the database.
True
T/F decimal and numeric are synonyms?
True
Changing Attribute SCD
Updates column is the source value changes
Normalization
a process of creating database schema. the process to eliminate data redundancy and enhance data integrity in the table. Normalization also helps to organize the data in the database.
the design of data warehouse is based on...
dimensional model
D in ACID
durability. When committed change permanent.
ER or Dim model have normalized tables
er
Which of the following can be used to maintain referential entity integrity?
foreign key
Surrogate Key
have no meaning or purpose except to give each dimension a new column that serves as a primary key within the dimensional model instead of the operational key
What kind of data is Acct #
integer
Slowly Changing Dimension
is a dimension that stores and manages both current and historical data over time in a data warehouse
In a star schema, a fact table is on the _____ side of the relationship
many
3 types of cardinalities
one to one, one to many, many to many
To convert a 1:1 relationship:
place the primary key of the one table into the second table as a FK
SQL
structured query language
Which of the following is not a database object?
transaction
What is business intelligence?
umbrella term that combines architectures, tools, databases, analytical tools, applications and methodologies used for decision making
When to use CHAR
when the sizes of the column data entries are consistent (e.g., EUID)
T-SQL Functions
• COUNT• MIN• MAX• SUM• AVG
Features of data warehouse
Subject oriented Integrated Non volatile Time variant
when to use VarChar?
Used when the sizes of the column data entries varies (e.g., FirstName)
What kind of data is FirstName
Var char
what kind of data is address?
Var char
Nonclustered Index
can be defined on a table and the data rows are not in any particular order. Uniqueness is a property of nonclustered indexes
what kind of data is phone number
char
Three main types of SQL statements used for database operations are ____, ____ and ____?
data definition, data manipulation, data control
What values does Bit hold?
0, 1, null
Steps for dimensional modeling
1. Op database 2. ER modeling 3. DW database 4. dim model
What does Decimal (5,2) look like?
123.00
Completed SSIS
A linked task executes if the prior task completes, irrespective of whether task succeeds or fails (BLUE)
Success SSIS constraints
A linked task executes only if the prior task completes successfully (GREEN)
Failure SSIS
A linked task executes only if the prior task fails to complete. Typically used for notifications (RED)
Historical Attribute SCD
A new record (row) is created when a source value changes and the old record is preserved
Which of the following best describes an orphaned record.
A record whose foreign key points to a non-existing primary key value
Scorecard business report
A report that shows business goals versus performance measures. Example, financial, customer, and growth indicators Provides managers with information on how well they are doing (performance)
WHAT IS A BUSINESS REPORT?
A written document that presents business information in a structured format It is used for business decision making, gets its data from a DW or other sources, its structure of the report consists of tables, charts, dials, graphs
Second Normal Form
Achieve 1NF ▪ All non key attributes are functionally dependent on the key. This eliminates any attribute that lacks complete dependency to PK
Third Normal Form
Achieve 1NF, and 2NF ▪ Non key attributes must not depend on another non key attribute
Visualizations
Allows you to select visualizations, add values to the visuals, and add columns to the axis
Which of the following best describes an operational database?
An operational database is a system that captures all elements of the day-to-day activities of the business
Slowly Changing Dimension 1
Changes the value in the dimension's record ▪ Old values are lost ▪ Simple to implement ▪ No history is preserved
what kind of data is SS#
Char fixed
Dimension Tables
Columns in dimension tables contain descriptive information that is often textual Dimensions represent the who, what, where, and when of these business events
Data Control Language (DCL)
Commands that control a database, including administering privileges and committing data. Grant revoke callback
Data definition language
Commands that define a database, including creating, altering, and dropping tables and establishing constraints. Create Alter Drop
Data Manipulation Language (DML)
Commands that maintain and query a database. Select Insert, Update, Delete
DML CRUD operations
Create, Read, Update, Delete
Operational data is known for being ________; while Data warehouse data is known for being ________________.
Current: granular
Which of the following correctly describe the relationship between Data warehouse and Business intelligence
DW is subset of BI
The term "non-volatile" describes which aspect of a data warehouse?
Data cannot be deleted
In the DW/BI Architecture, in which of the following stages or components does dimensional modeling occur?
Data warehousing
Article Data Warehouse Design for E-Commerce Environment Talks about?
Data warehousing and e-commerce are two of the most rapidly expanding fields. Paper goes into detail about how to build DW for e-commerce environments
The set of processes that is used to populate the data warehouse tables with the appropriate relevant data retrieved from the operational databases is known as:
ETL
One of the following is NOT an objective of business intelligence solution.
Easy access to data that is current detailed and always modifiable
Unique key constraint
Enforces unique constraint on a column. This means that there can be no duplicate values for this column of data
Data warehouses do not reveal past trends that can help organization predict and identify new business areas.
False
Microsoft Access is a multi-user RDBMS
False
T/F Views store data
False. They create a layer of abstraction between the table and the user
One of the following is NOT a type of index.
Foreign Index
Dashboard
Graphical presentation of several performance indicators in a single page Provides managers with information on what they are doing (progress) A combination of several report types
Conceptual database design
Identifying data elements; planning the metadata, structure; creating Entity Relationship Diagrams (ERD); conceptually connecting the data elements
Logical Design for 1 to 1 relationship
If all attributes of an entity do not logically belong to one entity, consider creating a separate table that corresponds to the second entity Then place the primary key of the one table into the second table
You work as a data analyst at an IT company with several divisions ( e.g, Servers, Desktops and laptops, Accessories). Each division has its own operational system comprising of marketing, sales, and revenue. The CEO and top leaders of the organization want to see last year's monthly sales of all the organization's divisions. Which of the following solutions is best suited to allow this type of report?
Implementing a data warehouse system that combines the data sources from all the divisions
L in ETL
Load the data into a target location
Fixed Attribute SCD
No changes are made. ETL ignores changes or generates an error
Anna was just hired as a consultant to design and build a data warehouse system. She makes an assumption that the organization has an operational database system. Is Anna wrong in making this assumption?
No she is not wrong. You need an operational database to build a data warehouse
Difference between primary and unique key
PK does not allow NULL Unique key can allow a NULL• Table has only 1 PK Tables can include multiple Unique Keys
PowerBI Fields
Provides the list of tables that were loaded at the onset. Allows you to see and select table fields/columns for report design
PowerBI Benefits
Quick and simple to create reports Data Visualization Report Interactivity Various Data Source Connections Excel (demo)
What shape are entities
Rectangles
PowerBI Views
Report view - This is where you use data and/or queries to build visualizations Data view - This is where you see the data in your report in table format. You can add new measures and create new columns. Relationships view - This shows a graphical representation of the relationships and cardinalities in the data model
SSIS
SQL Server Integration Services Runs in the background
SSIS vs ETL
SSIS is an ETL tool and ETL is an technology in this we Extract the data and transform as per business need and then load it into a destination.
Which of the following is not the correct way to create a database in SQL Server.
Script from a SQL Agent
datawarehouse/bi architecture
Sources(operational) -> extraction, transformation&loading -> data warehouse -> BI/ analytics applications -> BI analysts
Not Null key constraint
This constraint is an unknown value. It restricts a column from being an unknown value. Please note that a NULL value is not blank, zero, empty.
Foreign key constraint
This defines a column as a foreign key (reference). It references to the primary key column of another table. Ensures referential integrity
Operational Datawarehouse
Transactional, day-to-day operations data ▪ Known as Online Transaction Processing (OLTP) Fulfils ACID
T in ETL
Transform. Clean and perform calculation on data into a desired format
A data warehouse implementation is a recognition that there is potential business value in organizational data
True
When a 1:M relationship is mapped, the foreign key must appear in the relation mapped from the M side
True
You are designing a database that requires users to write their own SELECT queries against the database from different applications. However, management has determined that users must not directly access the database. As a database engineer, you are tasked to create database objects that are based on tables that will allow the users to write their queries. What database objects should you use?
Views
You are designing a database that requires users to write their own SELECT queries against the database from different applications. However, management has determined that users must not directly access the database. As a database engineer, you are tasked to create database objects that are based on tables that will allow the users to write their queries. What database objects should you use?
Views
Logical Design for 1 to Many relationship
We place the primary key from the One side as a foreign key to the Many side.
To convert a 1: Many relationship:
We place the primary key from the One side as a foreign key to the Many side.
Regional database management system
a database structure that allows database objects to have relationships with other objects in the database
Entity
anything that captures data
What shape are attributes
circles
C in ACID
consistency. transaction must always keep database in a stable state
A function that returns the number of items in a group, usually in integer is known as _______.
count
Date V DateTime
date time has 24hr clock
You are designing a [Grades] table in a school database. A few of the attributes (fields) in this table include: GradeID, CourseID, SemesterID, StudentID, GradeNumber.... Which of the following data types is better to use for the attribute, GradeNumber?
decimal
Grain
determines lowest level of detail of each activity. determines fact measures is atomic EX one row is sales transaction one is product item
What shape are relationships
diamonds
ER or Dim model have denormalized tables
dim
ER or Dim models are used for data warehousing
dim
Slowly Changing Dimension 2
▪ Creates an new additional record using a new value for the surrogate key every time a value in a dimension record changes ▪ Preserves current and previous value ▪ Used in cases where history should be preserved
To convert a M:N relationship
▪ First, a new junction table is created▪ Second, the primary keys from each related table is placed as a concatenated primary key in the new junction table▪ Third, related attributes are added to the junction table
Slowly Changing Dimension 3
▪ Involves creating a "previous" and "current" column in the dimension table for each column where changes are anticipated (or version number) ▪ Applicable in cases in which there is a fixed number of changes possible per column of a dimension, or in cases when only a limited history is recorded.
You are designing a database for a healthcare software in Canada. The healthcare software has a field [AdmitDate] that will be used to store the date and time that a patient is admitted to a hospital. What data types should you use for this field? Choose all that apply. Hint: there are 2 correct answers
SMALLDATE and DATETIME would work
Destination Connections
Specifies the location of the destination storage in the Data Flow Task OLE DB Destination (Default for SQL)▪ ADO.Net Destination▪ SQL Server Destination▪ Excel Destination▪ ODBC Destination▪ Raw File Destination
Source Connections
Specifies the location of the source data in the Data Flow Task OLE DB Source (Default for SQL)▪ Excel Source▪ Flat file Source▪ ADO Net Source▪ XML Source▪ Raw File Source
A regional databse consists of what 3 things
Structure(Defined database objects used for storing and accessing the database) Operations(Defined actions that allow users to manipulate the data and the data structures) Rules (Rules that govern the operations performed on the data and data structure)
CamelCase vs PascalCase
pascal case requires the first letter of the first word to also be capitalized so every word starts with an uppercase letter. ie MiloCat
Clustered Index
sorts and stores the data rows of the table in order based on the clustered index key. Uniqueness is a property of clustered indexes
The dimensional schema is often referred to as the:
star schema
Datawarehouse compared to operational dw
▪ Analytical, stores historical data ▪ Known as Online Analytical Processing (OLAP) Used for decision making