Databases

RDBMS provide

data dictionaries and metadata collections to assist in data handling. These help support well-defined data structures and relationships. An additional feature of these types of databases is data storage.

Data flows into a

data warehouse from a variety of transactional systems (point-of-sale, online transactions, etc.), databases, and other data-generating sources.

Relational database management systems (RDBMS)

have a variety of tools that can be used to execute/run queries. Many RDBMS include query tools such as crosstab queries, action queries, parameter queries, and SQL-specific queries.

Database management systems (DBMS)

help to overcome many of the issues associated with traditional file management systems. Here are some of the issues DBMS help to overcome.

Site visibility

includes how/when the site surfaces when queries are executed in search engines. Search engine optimization (SEO) can be enhanced through the information gained by web mining. This information can also assist marketers in online ad placement and search engine advertising.

Before the use of computers and database management systems (DBMS),

manual file systems were used to maintain an organization's records and files.

Usability

refers to how easily website users/visitors can interact with the site. Data gained from web mining can help web designers to optimize website navigation and structure of website information.

Information is generated using a

specific query language. Structured Query Language (SQL) is one of the most popular query languages. Queries can filter based on specific criteria, calculate/summarize data, and automate data management tasks.
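
The filtering and summarizing described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the Enrollments table, its columns, and the rows are hypothetical and do not come from the study set.

```python
import sqlite3

# Hypothetical gym data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Enrollments (Member TEXT, Class TEXT, Fee REAL)")
conn.executemany(
    "INSERT INTO Enrollments VALUES (?, ?, ?)",
    [("Ana", "Yoga", 20.0), ("Ana", "Spin", 25.0), ("Ben", "Yoga", 20.0)],
)

# Filter: return only the rows that meet a specific criterion.
yoga_members = [
    row[0]
    for row in conn.execute(
        "SELECT Member FROM Enrollments WHERE Class = 'Yoga' ORDER BY Member"
    )
]

# Calculate/summarize: total fees per member.
totals = dict(
    conn.execute("SELECT Member, SUM(Fee) FROM Enrollments GROUP BY Member")
)

print(yoga_members)
print(totals)
```

WHERE handles the filtering and GROUP BY with SUM handles the summary; both are standard SQL.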

Database Foreign Key The ERD illustrates

the use of a Foreign Key. In a relational database, a Foreign Key is a field in one table that refers to the Primary Key of another table, giving the two tables a common field. For example: What is the Foreign Key in the Songs Table? AlbumID is the Foreign Key in the Songs Table. What is the Foreign Key in the RentInvoices Table? FKDoctorID is the Foreign Key in the RentInvoices Table. Notice that this Foreign Key is the Primary Key in the Physicians Table.
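
A minimal sketch of the Songs/AlbumID case, using Python's built-in sqlite3 module. The Albums table, the Title columns, and the sample rows are assumptions made for illustration; only the Songs table, SongID, and AlbumID are named in the cards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Albums (
    AlbumID INTEGER PRIMARY KEY,   -- primary key of the parent table
    Title   TEXT
);
CREATE TABLE Songs (
    SongID  INTEGER PRIMARY KEY,
    Title   TEXT,
    AlbumID INTEGER REFERENCES Albums(AlbumID)   -- foreign key
);
INSERT INTO Albums VALUES (1, 'First Album');
INSERT INTO Songs VALUES (10, 'Opening Track', 1), (11, 'Closing Track', 1);
""")

# The common field (AlbumID) lets a query combine rows from both tables.
rows = conn.execute("""
    SELECT Songs.Title, Albums.Title
    FROM Songs JOIN Albums ON Songs.AlbumID = Albums.AlbumID
    ORDER BY Songs.SongID
""").fetchall()
print(rows)
```

Each song stores only the AlbumID, so the album title is kept in one place and looked up through the relationship.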

You should only be able to delete a primary key if

there are no records associated with that primary key. If you delete a record that other records still reference, you'll end up with orphaned records.

SQL statements are

used to perform a variety of database tasks, including the retrieval of data (query) and database updates.

The use of web mining by organizations can lead to improved website

visibility, usability, and accessibility.

Database

A database is a collection of tables, relationships, and metadata. A DBMS helps to organize the data found in a database.

Reduction of Data Redundancy

Data redundancy is the duplication of data.

A database management system (DBMS)

is a computer program that is used to create, process, and administer a database.

query

is a question—a request for information from a database.

For many years, relational databases have been the most popular choice for businesses. Due to increasing volumes of data, the increased use of web services, and the need for data storage, alternatives to relational databases are starting to emerge. Instead of focusing on relational databases, some companies are turning to

nonrelational databases (NoSQL).

Reports

offer a way to view, format, and summarize the information in a database. Reports can be used to display or distribute a summary of data and archive snapshots of the data. Reports can also be used to provide details about individual records and to create labels.

Relational databases

organize data into tables based on structured data groupings. Relational databases use links, called relationships, between tables.

Microsoft Access is a relational database. A relational database

organizes data into various tables based on logical groupings. In Design View, select the Design tab, then select Relationships. A relationship is a link between tables that defines how the data are related. A common field between the two tables is used to create the link. A relationship takes one of three forms.

RAM storage and parallel processing are

two of the main components of in-memory computing.

Data mining

(also referred to as knowledge discovery in data, or KDD) is the searching of large data stores and sets to uncover patterns and trends that cannot be found using OLAP or simple analysis techniques. Data mining is executed using mathematical algorithms that segment the data and evaluate the likelihood of future events. It is a multidisciplinary skill that uses machine learning, statistics, AI, and database technology.

An ERD involves the use of different symbols and connectors that help to visualize two different types of information:

The entities within the system, and the interrelationships among these entities.

Data marts

are designed to collect and measure data from specific operational areas of a business and are used by individual departments or groups.

Data is analyzed using

business intelligence (BI) tools, Structured Query Language (SQL) clients, and a variety of analytics applications designed to interpret the data. The output created from data warehouses includes reports, dashboards, and queries. Data and the analytics provided from the analysis of data allow organizations to create/maintain a competitive advantage.

Data normalization

is a method of organizing various types of data in the database. Normalization is an organized approach of breaking down/simplifying tables to eliminate data redundancy and undesirable data characteristics.
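
As a rough illustration of breaking a table down to remove redundancy, the sketch below splits one flat list of rows into two smaller structures in plain Python. The gym-style data and all names (flat_rows, customers, enrollments) are hypothetical.

```python
# Hypothetical unnormalized rows: customer details repeat for every class.
flat_rows = [
    (1, "Ana", "12 Oak St", "Yoga"),
    (1, "Ana", "12 Oak St", "Spin"),
    (2, "Ben", "9 Elm Ave", "Yoga"),
]

customers = {}     # customer_id -> (name, address), stored once per customer
enrollments = []   # (customer_id, class_name), one row per enrollment

for customer_id, name, address, class_name in flat_rows:
    customers[customer_id] = (name, address)
    enrollments.append((customer_id, class_name))

print(customers)
print(enrollments)
```

After the split, an address change touches exactly one entry in customers, while enrollments refers to the customer only by ID.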

A data warehouse

is a repository of data and information that organizations analyze to make informed business and operational decisions.

A relational database management system (RDBMS)

is a type of database management system (DBMS) with a row- and column-based table structure that connects data via relationships. RDBMS elements include features that maintain data security, accuracy, integrity, and consistency.

DBMS

A DBMS is a software program designed to organize and administer a database.

The creation of relationships between data is achieved using:

A primary key value (in the primary or parent table) and foreign keys (in associated tables). Because of this, we need to ensure that data on both sides of the relationship remain intact.

Independent Data Mart

A stand-alone system that is created separate from a data warehouse and focuses on specific organizational functions.

cloud database features

Ability for enterprise users to host databases without having to buy and maintain dedicated hardware. Can be self-managed or maintained and managed by a provider. Support SQL and NoSQL databases. Accessed through the web or a vendor-provided API (application programming interface).

Variety: Different forms of data

Data comes from many structured and unstructured sources. These sources include social media platforms, email, photos, videos, and point-of-sale interactions. YouTube estimates that over 1 billion hours of video content is viewed each day. We're now at a point at which over 2,400 exabytes of healthcare data have been generated and stored.

Decreased Data Inconsistency

Data redundancy often leads to inconsistent data.

Entity relationship diagrams

ERDs are a method of structurally representing database design via the use of diagrams.

Allows for delivery of reports and analytics

Newly developed OLAP systems maintain a constant connection with back-end systems and allow for the delivery of reports and analytics in Microsoft Excel and other front-end tools used to collect data.

Allows users to perform multidimensional analysis

OLAP software allows users to perform multidimensional analysis of a wide range of business data, complex calculations, and trend analysis, as well as data modeling. Whereas relational databases store information in rows and columns (similar to a worksheet), OLAP stores information in a multidimensional database structure using data cubes. Using OLAP, data analysts can execute multiple views of the data (known as slices) to create a user-friendly view of the data.
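
The idea of a data cube and a slice can be sketched in plain Python. This is a toy model, not how OLAP products store data: the dimensions (region, product, quarter), the figures, and the helper names slice_cube and roll_up_by_region are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts keyed by three dimensions:
# (region, product, quarter) -> amount
cube = {
    ("East", "Bikes", "Q1"): 100,
    ("East", "Helmets", "Q1"): 40,
    ("West", "Bikes", "Q1"): 80,
    ("West", "Bikes", "Q2"): 90,
}


def slice_cube(cells, quarter):
    """Fix the quarter dimension and keep the other two dimensions."""
    return {(r, p): v for (r, p, q), v in cells.items() if q == quarter}


def roll_up_by_region(cells):
    """Summarize a (region, product) slice down to totals per region."""
    totals = defaultdict(int)
    for (region, _product), amount in cells.items():
        totals[region] += amount
    return dict(totals)


q1 = slice_cube(cube, "Q1")
q1_by_region = roll_up_by_region(q1)
print(q1)
print(q1_by_region)
```

Fixing one dimension gives a two-dimensional view of the cube, which can then be summarized further along either remaining dimension.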

Structured Query Language (SQL)

Used for human interface and communication with relational databases. Considered the standard language.

Information flows into

a data warehouse at regular intervals and is stored for later processing. A variety of people within an organization have access to the data warehouse, including data scientists, key decision makers (KDMs), and data specialists.

data warehouse use

Data warehouses contain large amounts of historical data (data is stored in a series of snapshots) that represent data points at a specific time. This gives organizations the ability to compare different time periods to make more informed business decisions. One of the advantages of data warehouses is their ability to provide access and analysis of information from a variety of subject areas.

Big data

encompasses all of the analysis tools and processes related to applying and managing large volumes of data. It was conceived out of the need of organizations to better understand trends, patterns, and preferences that emerge from the interaction with different systems and databases.

The Data mining Process

includes: problem definition; data gathering and preparation; model building and evaluation; and knowledge deployment.

NoSQL Databases example

According to Amazon, DynamoDB can handle more than 10 trillion requests per day and can support peaks of more than 20 million requests per second. Many of the world's fastest growing businesses, such as Lyft, Airbnb, and Redfin, as well as enterprises such as Samsung, Toyota, and Capital One, depend on the scale and performance of DynamoDB to support their mission-critical workloads

Web usage mining (WUM)

Also called log mining. Includes analysis of web access logs: when, how, and how often a website is accessed.

Analytics platforms purpose

Analytics platforms provide information about a variety of business and operational areas, including customer analytics, sales and marketing analytics, social media analytics, cybersecurity, and plant and facilities data.

Dependent Data Mart

Constructed from an existing data warehouse. Uses a top-down approach in which organizational data is stored in a centralized location, then specific data is extracted when analysis is needed.

Cross tab

Cross tab queries calculate the sum, average, or other aggregate functions, then group the results by two sets of values. The two groupings include one set of data/results on the side of a datasheet and the other set across the top of the sheet.
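
One common way to get a crosstab-style result in plain SQL is conditional aggregation, sketched below with Python's built-in sqlite3 module. The Sales table and its rows are hypothetical, and the CASE-based pivot shown here stands in for the dedicated crosstab feature some RDBMS provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Region TEXT, Quarter TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [("East", "Q1", 100), ("East", "Q2", 150), ("West", "Q1", 80), ("West", "Q1", 20)],
)

# Regions run down the side; quarters run across the top.
crosstab = conn.execute("""
    SELECT Region,
           SUM(CASE WHEN Quarter = 'Q1' THEN Amount ELSE 0 END) AS Q1,
           SUM(CASE WHEN Quarter = 'Q2' THEN Amount ELSE 0 END) AS Q2
    FROM Sales
    GROUP BY Region
    ORDER BY Region
""").fetchall()

for row in crosstab:
    print(row)
```

Each output row is one value from the first grouping (Region), and each extra column is one value from the second grouping (Quarter).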

Database management systems (DBMS) are developed to allow for the creation, reading, updating, and deletion of data in a database.

DBMS include security features and access controls. They also allow managers and database administrators to manipulate, query, and store information in a database.

Using this traditional file system:

Data and information were stored and processed using a traditional file system (paper, files, and documents). Each file was independent of the others, which led to data redundancy, inconsistency, and file management issues.

DBMS software vendors

Due to the complexities surrounding the DBMS development process, most organizations do not develop their own DBMS. Instead, they use DBMS created by software vendors including Oracle, Microsoft, and IBM. It is important to note the differences between a database and a DBMS.

Common web mining applications used by today's organizations include

Google Analytics, Data Miner, and Tableau.

Web structure mining (WSM)

Includes analysis of hyperlinks, nodes, and related web pages.

Web content mining (WCM)

Includes extraction of information from web pages/documents, including text, images, videos, and interactives.

One of the popular analytics platforms is IBM's Integrated Analytics System. According to IBM, "The IBM Integrated Analytics System":

Is a unified hybrid data management analytics solution providing massively parallel processing (MPP). Comprises a high-performance hardware platform, optimized database query engine software, and networking capabilities that work together to support various data analysis and business-reporting capabilities.

Assists businesses in a wide range of areas

OLAP is used to assist businesses in a wide range of areas, including performance management, financial reporting, simulation models, and data warehouse reporting. Using OLAP, end-users can perform analysis in multiple dimensions (looking at data in different ways), which leads to insights and understanding of how to make better business decisions.

Databases are designed to maintain data and information about various types of data objects including:

Objects (items in stock/inventory); events (transactions and item returns/exchanges); people (customers, employees, vendors); and places (procurement centers and wholesalers).

Parameter

Parameter queries prompt the user for values in order to run/execute the query. When a value is supplied, the parameter query applies the field as a criterion. If a value is not supplied, it is interpreted as an empty string.
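
A parameter query can be approximated in code with a placeholder that is filled in when the query runs. The sketch below uses Python's built-in sqlite3 module and its ? placeholder. The function name members_in_state and the sample rows are hypothetical, and the value is passed as a function argument rather than typed into a prompt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, LastName TEXT, State TEXT)"
)
conn.executemany(
    "INSERT INTO Members VALUES (?, ?, ?)",
    [(1, "Lopez", "TX"), (2, "Chen", "CA"), (3, "Okafor", "TX")],
)


def members_in_state(state):
    """Run the same query with whatever value the caller supplies."""
    cursor = conn.execute(
        "SELECT LastName FROM Members WHERE State = ? ORDER BY LastName",
        (state,),
    )
    return [row[0] for row in cursor]


print(members_in_state("TX"))
print(members_in_state(""))   # an empty string matches no rows in this data
```

The supplied value is applied as the criterion on the State field; the query text itself never changes.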

Database Primary Key

Select SongID and set it as the Primary Key for the Songs table. Create the Primary Key by clicking the Primary Key button. A primary key is a special relational database field designated to uniquely identify all records in a table. A primary key must contain a unique value. It cannot contain a null value. Almost all individuals deal with primary keys frequently but may not realize it. Common Primary Keys include: student ID numbers, Social Security numbers, and customer ID numbers.
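
The uniqueness rule can be seen in a small sketch with Python's built-in sqlite3 module, where a second row with the same SongID is rejected. The Title column and the sample rows are assumptions for illustration; SQLite stands in here for whichever RDBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Songs (SongID INTEGER PRIMARY KEY, Title TEXT)")
conn.execute("INSERT INTO Songs VALUES (1, 'Opening Track')")

try:
    # Same SongID as the existing record, so the insert should fail.
    conn.execute("INSERT INTO Songs VALUES (1, 'Duplicate ID')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

song_count = conn.execute("SELECT COUNT(*) FROM Songs").fetchone()[0]
print(duplicate_rejected, song_count)
```

Because the database refuses the duplicate, every SongID continues to identify exactly one record.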

Relational databases use links, called relationships, between tables.

Tables are used to hold information about the objects to be represented in the database. Information in tables is stored in rows called records or objects, and columns called fields. These relationships define how the data in the tables are related. A common field that is included in both tables is used to create the relationship. Rows among multiple tables can be made related using foreign keys. Data can be accessed in many different ways without reorganizing the database tables themselves.

Velocity: Analysis of streaming data

The pace at which data is generated is mind-blowing. Ninety percent of the data on the Internet has been created since 2016. Every minute on Facebook, 510,000 comments are posted, 293,000 statuses are updated, and 136,000 photos are uploaded. Over 3.5 billion Google searches are conducted worldwide each day. That is more than a trillion searches per year worldwide.

Increased Data Security

The use of a DBMS makes it easier to secure data and information. A DBMS allows for the creation of access constraints so that only authorized users are able to access the data. Users are assigned different sets of access rules; this helps to protect data and users from identity theft, data leaks, and the misuse of data.

NoSQL Databases description

These databases are designed to manage large data sets across many platforms and have the ability to analyze structured and nonstructured data. They are also useful for creating queries from the data created from social media platforms, web apps, and other emerging forms of digital content. A variety of NoSQL databases are available on the market today.

Veracity: Uncertainty of data

With all of the data being generated and stored, it is important to ensure that data is meaningful and useful. Poor quality data is estimated to cost the U.S. economy over $3.3 trillion per year. A recent survey found that one in three business leaders do not trust the information they use to make decisions.

A Cloud database

is a type of database that is built and accessed via a Cloud platform.

Analytics platforms

are designed to assist large data-driven companies in the analysis and interpretation of organizational data. A variety of database software providers have developed high-speed platforms that are used in relational and nonrelational database technologies that enable the analysis of large data sets.

Entities in entity relationship diagrams (ERD)

are the various business objects that make up the database. These objects include roles/people (employees and customers), tangible business objects (different types of products or services), and intangible business objects (logs, system information). Relationships refer to how these various entities relate to each other within the database/system.

Basic RDBMS functions

Allow the user to create, read, update, and delete data. These functions are collectively referred to as CRUD.
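
The four CRUD functions map onto four SQL statements. Here is a minimal sketch using Python's built-in sqlite3 module; the Customers table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT, City TEXT)"
)

# Create: add a new record.
conn.execute("INSERT INTO Customers VALUES (1, 'Ana', 'Austin')")

# Read: retrieve the record.
after_create = conn.execute(
    "SELECT Name, City FROM Customers WHERE CustomerID = 1"
).fetchone()

# Update: change a field in the existing record.
conn.execute("UPDATE Customers SET City = 'Dallas' WHERE CustomerID = 1")
after_update = conn.execute(
    "SELECT City FROM Customers WHERE CustomerID = 1"
).fetchone()[0]

# Delete: remove the record.
conn.execute("DELETE FROM Customers WHERE CustomerID = 1")
remaining = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]

print(after_create, after_update, remaining)
```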

Hybrid Data Mart

Assimilates data from a data warehouse as well as other data collection systems. It incorporates a top-down approach, end-user inputs, and enterprise-level integration.

Data Mining Key Properties

Automatic discovery of patterns. Prediction of likely outcomes. Creation of actionable information. Focus on large data sets and databases.

How does Big Data help?

Big data allows organizations to use analytics to help uncover a variety of predictive behaviors to help create new offerings.

Silver is produced by silver mining. It is extracted from silver ore, yielding the final outcome of a precious metal. ________ takes data and refines it down into precious information. Data mining is a means of analyzing data that can help organizations find patterns and relationships within data sets.

Data Mining

Reduction of Data Redundancy example

If you are managing the data of a gym where a customer is enrolled in multiple workout classes, the same customer details will be stored twice, taking up storage and causing data redundancy. Data redundancy can lead to higher storage fees and inefficient access times.

Decreased Data Inconsistency example

If you are managing the data of a gym, let's say the customer needs to change their address. In a file management system, customer data is stored twice. If all the address data is not changed in each record, data inconsistency will occur.

Structured Query Language (SQL)

Structured Query Language (SQL)-specific queries use specific SQL statements to execute the query. SQL statements are translated by the RDBMS to create output.

Action

Many RDBMS include four types of action queries: append, delete, update, and make-table queries. These queries (except for make-table queries, which create new tables) change or move the data in the tables (records) they are based on.
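
In plain SQL, the four action query types correspond roughly to the statements sketched below, run here through Python's built-in sqlite3 module. The Orders, NewOrders, and ClosedOrders tables are hypothetical, and the mapping to Access-style query names is an approximation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, Status TEXT);
CREATE TABLE NewOrders (OrderID INTEGER PRIMARY KEY, Status TEXT);
INSERT INTO Orders VALUES (1, 'shipped'), (2, 'open');
INSERT INTO NewOrders VALUES (3, 'open');

-- Append: copy rows from one table onto the end of another.
INSERT INTO Orders SELECT OrderID, Status FROM NewOrders;

-- Update: change values in existing rows.
UPDATE Orders SET Status = 'closed' WHERE Status = 'shipped';

-- Make-table: create a new table from the result of a query.
CREATE TABLE ClosedOrders AS SELECT * FROM Orders WHERE Status = 'closed';

-- Delete: remove rows that match a criterion.
DELETE FROM Orders WHERE Status = 'closed';
""")

orders = conn.execute(
    "SELECT OrderID, Status FROM Orders ORDER BY OrderID"
).fetchall()
closed = conn.execute("SELECT OrderID, Status FROM ClosedOrders").fetchall()
print(orders)
print(closed)
```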

Key ideas about data mining

While data mining is a powerful tool, it does not replace the need to have an intimate knowledge of the organization, the data that is produced, and the analytical methods employed to turn data into information. Data mining assists businesses in uncovering information that may be hidden in data sets, but it does not tell an organization why this information may be valuable. That step is usually carried out by managers and data analysts who take reports and interpret the data. Predictive information and relationships that are produced from data mining are not causal relationships. Data mining yields probabilities, not exact answers.

Forms

are used to control how data are entered into a database. Forms structure data input to ensure data integrity. Data are entered into the blank areas of the form. Forms turn data into information. Forms are created using database management system (DBMS) software. Navigate to other records that have forms using the buttons at the bottom of the Form window.

Data warehouses

help to create a decision support system (DSS) environment that allows businesses to gauge the performance of an enterprise over measurable periods of time.

A Cloud platform

includes the hardware and operating environment of servers in an Internet-based datacenter. Cloud databases have many of the same functions of traditional databases, but with added features supported by the Cloud.

Business intelligence (BI)

includes the technologies, computer applications, and procedures for the collection, analysis, and presentation of business information to help support decision making. Fundamentally, business intelligence systems are data-driven decision support systems (DSS) that aid businesses to make better strategic decisions. BI systems provide businesses a picture of historic, current, and future views of operations. BI systems use information stored in data warehouses, data marts, in-memory computing, and other analytic platforms to create information output.

A record

is a collection of related fields in a data file. Records are a collection of characteristics that describe the identity of an entity. A record is also referred to as a row in a table. Select a record by moving the cursor to the left of the record you want to select, then click the left button of the mouse. You can also select a record by using the buttons at the bottom of the table.

A field

is a group of related characters in a database table. A field is a column in a table that represents a characteristic of someone or something. Field Names in the Members table include Member ID, First Name, Last Name, Address, City, State, Zip, E-mail, and Cell Phone. Move from field to field by using the left and right arrow keys on the keyboard or by left-clicking with the mouse. Switch to Design View to edit Field Names and Field Properties.

data mart

is a subsection of a data warehouse that is designed and built specifically for individual departments or business functions. There are three types of data marts:

Online analytical processing (OLAP)

is included in many business intelligence (BI) software applications and is used for a variety of data discovery activities. The activities include report creation and analysis, analytical calculations, forecasting, budgeting, planning, and what-if predictive analysis.

data marts use

Data marts are used to track inventories, purchase transactions, and the supply chain. Data marts assist with the analysis of what data a user needs rather than focusing on existing data.

SQL uses

user-generated lines of code (statements) to answer questions against the database.

Web mining

uses the principles of data mining to uncover and extract information from websites, social media sites, e-commerce platforms, and web services. Organizations use web mining to assist them in gaining a better understanding of consumer behavior, website efficacy, and usage patterns, as well as for the analysis of web searches.

The Four Vs of Big Data include

volume, variety, veracity, and velocity.

Referential integrity requires that,

whenever a foreign key value is used, it must reference a valid existing primary key in the parent table. For example, if you were to delete record 1556 in a primary table, you need to be sure that there is not a foreign key in any related table with a value of 1556.
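
The record 1556 scenario can be sketched with Python's built-in sqlite3 module. Assumptions: the Physicians and Invoices tables and their columns are hypothetical, and SQLite only enforces foreign keys once PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE Physicians (DoctorID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Invoices (
    InvoiceID INTEGER PRIMARY KEY,
    DoctorID  INTEGER NOT NULL REFERENCES Physicians(DoctorID)
);
INSERT INTO Physicians VALUES (1556, 'Dr. Example');
INSERT INTO Invoices VALUES (1, 1556);
""")

# Deleting the parent while a child still references it is refused.
try:
    conn.execute("DELETE FROM Physicians WHERE DoctorID = 1556")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# Once no foreign key holds the value 1556, the delete goes through.
conn.execute("DELETE FROM Invoices WHERE DoctorID = 1556")
conn.execute("DELETE FROM Physicians WHERE DoctorID = 1556")
physicians_left = conn.execute("SELECT COUNT(*) FROM Physicians").fetchone()[0]

print(delete_blocked, physicians_left)
```

Blocking the first delete is what prevents an orphaned invoice that points at a physician who no longer exists.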

Volume: Scale of data

Enormous amounts of data are created every day. Most companies have over 120 terabytes of information stored (that's 120,000 gigabytes!). It is estimated that 2.3 trillion gigabytes of information are created each day.

Middleware

is software that provides processing capabilities outside what is offered by the user's operating system. This allows for the processing of data in parallel, which leads to enhanced data processing and accessing speeds.

In-memory computing

is the use of middleware software to assist in the storage of data in random-access memory (RAM) across a group of different computers.

Design View

lets you change Field Names and Field Properties. Data Type specifies the data in the field, such as text, numeric, or date/time. Setting Data Types for each field decreases data input mistakes. As you move from one field to the next, the Field Properties change. Notice the wide array of field properties you can adjust.

Data integrity

means the database is reliable, accurate, and aligned to the goals of the organization. Data centralization is critical in increasing data integrity.

Data mining example

Data mining might determine that females with incomes between $75,000 and $100,000 who subscribe to certain female-targeted magazines may be more likely to purchase various products. It is important that analysts not assume that the population identified through data mining buys the product because they belong to the identified population.

databases enable

more efficient data maintenance.

Referential integrity

is the accuracy and consistency of data within a table relationship in a database. In relational databases, two or more tables can be linked using a relationship.

Function: In-memory computing

is used by businesses to help unite transactional and analytical processing to provide real-time insights and analytics. This creates an environment that increases the amount and speed at which data can be ingested and analyzed.

When data is centralized,

it means it is stored in only one place. When multiple lists and data sources are maintained, information can become inconsistent leading to decreased data integrity.

Most relational databases use SQL, but

most also have proprietary extensions that allow for customized interactivity. Standard SQL commands include select, insert, update, delete, create, and drop.

