Database and data storage

Pataasin ang iyong marka sa homework at exams ngayon gamit ang Quizwiz!

What is Data Warehousing

A system used for reporting as well as data analysis. Data comes from multiple heterogeneous sources. Historical data.

Smart Business Applications

Embedded in the users workflow the application can alert when criteria set by business are crossed. Automate analysis results and distribution

What is a fact table

Fact table is central table found in star schema or snowflakes schema which is surrounded by dimension tables.Fact table contains numeric values that are known as measurements.Fact table has two types of columns: 1.Facts 2.Foreign key of dimension tables.

What is granularity

Granularity in a table represents the level of information stored. High granularity is at a transaction level, low granularity describes data not at a detailed level.

What is a snowflake schema

Snowflake schema is a form of dimensional modeling where dimensions are stored with multiple dimension tables. In a Snowflake schema dimensions are normalized. Normalization splits up data in to additional tables.

Why is NoSQL faster than RDBMS

The DB doesn't have to do all of these complex operations and locking across much of the dataset, so it's really easy to partition the thing across many servers/disks/whatever and have it work really fast.

What are the measurement types of a fact table

The measures in a fact table are of three types : 1.Additive : Measures that can be added across any dimension 2.Non-additive: Measures that can not be added across any dimension 3.Semi-additive: Measures that can be added across some dimensions.

What does RDBMS offer that NoSQL does not

Things like automatic enforcement of referential integrity, transactions, etc. ( ACID )

Federated database system

This is a type of meta-database management system (DBMS), which transparently maps multiple autonomous database systems into a single federated database. The constituent databases are interconnected via a computer network and may be geographically decentralized.

What is universe in Business Analytics

Universe is the interface layer between the client and the data warehouse. Universe defines the relationship between various tables in the data warehouse. Hides the tables structure from the user

Describe OLAP ( Online Analytical Processing)

.OLAP is used for complex calculations,Trends Analysis,sophisticated data modeling. Key purpose is to decrease query time and increase effectiveness

Four steps of Dimensional Modeling

1) Choose business process -- describe the business process 2) Declare grain -- a sentence describing the focus of the model. ex an individual line item from customer slip 3) Identify the dimensions -- nouns ( date, product, store, time ) 4) Identify the facts -- the numeric facts / measures that the users will need

What are 4 rules in dimensional modeling

1) Load detailed atomic data - better to have than not. aggregated/summerized data can be present for performance 2)structure dimensional models around business processes. Processes are measurable events. Find these and record at a fine grain. 3) Ensure each fact table has a date dimension 4) Ensure all fact tables have same grain / level of detail

Example of aggregates

1.Average 2.Count 3.MAX 4.MIN 5.Median 6.SUM

Stages of BI

1.Data Source 2.Data Analysis 3.Decision making support 4.Situation Awareness 5.Risk Management

What are benefits BI offers

1.Improving Decision Making 2.Accelerate Decision Making 3.Optimizing internal business process 4.Increase Operational Efficiency 5.Driving new revenues 6.Gaining competitive advantages over competitors

What is a schema?

Schema is logical description of whole database.Database schema is a skeleton or structure of the database which represents database logically.

The acronym ACID stands for ?

Atomicity, Consistency, Isolation, and Durability − commonly known as ACID properties − in order to ensure accuracy, completeness, and data integrity.

The acronym ACID stands for ?

Atomicity, Consistency, Isolation, and Durability − commonly known as ACID properties − in order to ensure accuracy, completeness, and data integrity. Atomicity : each transaction is single unit Consistency: a transaction will bring a DB to a valid state Isolation: State of DB is whether a trans in a group or not. Durability: states are saved even if system failure. Write to non-volatile disk

The acronym BASE stands for what

BASE (Basically Available, Soft State, Eventual consistency) principles.

Describe what scale up is

Buy a bigger box. problem: Pay four to five times what you paid for your previous solution, and only get twice the performance.

What are different Data Warehouse systems

Data Mart Online Analytical Processing (OLAP) Predictive Analysis

What is a Data Mart

Data Mart is a simplest set of Data warehouse which is used to focus on single functional area of the business

Describe Dimensional Modeling

Design method to create DW

What is a dimension table?

Dimension table is table which describes the business entities of an enterprise -- they describe the the objects in a fact table. Dimension table has primary key which uniquely identifies each dimension row.

What is a star schema

In a Star schema there is a Fact table at the center with dimension tables surrounding it .

Describe scale out

In a scale-out architecture a distributed set of nodes is used as the basic architecture. It provides highly elastic scaling capability, enabling you to add nodes to handle load on the fly.

Slowly changing dimensions

Is the technique used to manage attribute changes in a dimension over time. SCD type 1: Overwriting. New overwrites current. SCD type 2:Create another dimension row. Retains history of values. SCD type 3:holds both current and last value. SCD type 4:history table created of dimension SCD type 6: combine 1,2 & 3 ( 1+2+3=6). "overwrite" by marking the current row, add another row to history, thus curr and most recent are represented

Drawbacks of sharding

Need logical key to shard on. If cannot distribute your traffic evenly. Application logic that needs to operate across shards or maintain referential integrity has poor performance.

What is Business Intelligence

Provides the user with data and tools to answer questions which are important to run business or part of a business.

What is sharding

Scale database past the single-instance by sharding, or partitioning data across multiple servers.

What is immutable

doesn't change

describe ranking functions in SQL queries (rank, dense rank, ntitle, row_number)

functions that rank results using following syntax : SELECT Names, Grade, ROW_NUMBER () OVER (ORDER BY Grade DESC) as ROW_NUMBER, RANK () OVER (ORDER BY Grade DESC) as RANK, DENSE_RANK () OVER (ORDER BY Grade DESC) as DENSE_RANK, NTILE(3) OVER(ORDER BY Grade desc) AS NTILE FROM dbo.Grades row_number will list (1..n); rank = (1,2,2,4...); dense rank =(1,2,2,3...); ntitle puts into three groups ( 1,1,2,2,3...)

What is embedded analytics

the integration of analytic content and capabilities within business process applications. It provides relevant information and analytical tools designed for the task at hand so users can work smarter and more efficiently in the applications they use every day.


Kaugnay na mga set ng pag-aaral

ARRT Score Report: Procedures TREATMENT VOLUME LOCALIZATION (18)

View Set

Acute 2 - Cardio - Quiz Questions - Blood Disorders

View Set