ISTM 210 Phinney, Ch 15-17, Exam 3 TAMU
relational database
A database that represents data as a collection of tables in which all data relationships are represented by common values in related tables
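A minimal sketch of "common values in related tables", using Python's built-in sqlite3 module. The table names, columns, and rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical tables and values, not taken from the course material.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course TEXT);
    INSERT INTO student VALUES (1, 'Ned'), (2, 'Alice');
    INSERT INTO enrollment VALUES (1, 'ACCT'), (1, 'ISTM'), (2, 'ISTM');
""")

# The shared student_id value is what relates the two tables.
rows = conn.execute("""
    SELECT s.name, e.course
    FROM student AS s
    JOIN enrollment AS e ON e.student_id = s.student_id
    ORDER BY s.name, e.course
""").fetchall()
print(rows)
```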
dashboard
A graphical user interface that organizes and summarizes information vital to the user's role and the decisions that user makes.
data validation rules
A set of rules that determine what users can enter in a specific cell or range
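As a rough illustration, a validation rule can be thought of as a check that runs before a value is accepted into a cell. The rule below (whole numbers from 0 to 100) and the function name are assumptions made for this sketch.

```python
def is_valid_score(value):
    """Hypothetical rule: accept only whole numbers from 0 to 100."""
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value <= 100


for candidate in (85, 101, "A"):
    print(candidate, is_valid_score(candidate))
```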
Predictive Analytics
Attempts to reveal future patterns in a marketplace, essentially trying to predict the future by looking for correlations in the data between one thing and any other things that pertain to it.
OpenOffice database
Base
OpenOffice spreadsheet
Calc
CSP
Cloud Service Provider
iDrives
Stands for Internet Drives. The original form of the Cloud: off-site file storage, which companies like GoDaddy.com have offered for years.
Hadoop
Created in 2005 by Doug Cutting and Mike Cafarella. It was originally developed to support distribution for a search engine and is now an open-source program, supported by the Apache Software Foundation, that manages thousands of computers and implements Map Reduce. One issued query searches multiple servers.
structured data
Data that: (1) are typically tables consisting of records and fields (numeric or categorical) (2) can be organized and formatted in a way that is easy for computers to read, organize, and understand (recognizable patterns) (3) can be inserted into a database in a seamless fashion.
DBMS
Database Management System. A software system that stores data in files called tables and tables are connected to other tables with related information.
Decision Analytics
Decision Analytics looks at an organization's internal data, analyzes external conditions like supply abundance, and then endorses a best course of action. Builds on Predictive Analytics to make decisions about future industries and marketplaces.
If Ned drops Acct?
Delete his info from the Student x Course table. There should be no blank cells in the relational database example.
Descriptive Analytics
Descriptive Analytics defines past data you already have that can be grouped into significant pieces, like a department's sales results, and also starts to reveal trends. It is the baseline on which other types of analytics are built.
OpenOffice graphics editor
Draw
ETL
Extract, Transform, and Load tools that are used to standardize data across systems, allowing it to be queried
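The three ETL steps can be sketched roughly as follows. The source rows, field names, and cleaning rules are all hypothetical; real ETL tools are far more elaborate.

```python
import sqlite3

# Extract: rows as they might arrive from two hypothetical source systems.
source_a = [{"name": "alice simpson", "sales": "1,200"}]
source_b = [{"name": "NED JONES ", "sales": "950"}]


def transform(row):
    """Standardize name casing and convert the sales text to an integer."""
    return (row["name"].strip().title(), int(row["sales"].replace(",", "")))


cleaned = [transform(r) for r in source_a + source_b]

# Load: write the standardized rows into one queryable table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (name TEXT, amount INTEGER)")
warehouse.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
total = warehouse.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(cleaned, total)
```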
delete anomaly
If Ned withdraws from all his classes and you eliminate all three of his rows from the table, then you will no longer have a record of Ned. If Ned is planning to take classes next semester, then you probably didn't really want to delete all records of him.
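A small sketch of the delete anomaly, assuming a single flat table that mixes student facts with enrollment facts. The phone number and course names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One flat table holds both who Ned is and what he is enrolled in (hypothetical data).
conn.execute("CREATE TABLE student_course (student TEXT, phone TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO student_course VALUES (?, ?, ?)",
    [("Ned", "555-0100", "ACCT"), ("Ned", "555-0100", "ISTM"), ("Ned", "555-0100", "MGMT")],
)

# Ned withdraws from every class, so all three rows are deleted.
conn.execute("DELETE FROM student_course WHERE student = 'Ned'")

# His phone number went with them: no record of Ned remains.
remaining = conn.execute(
    "SELECT COUNT(*) FROM student_course WHERE student = 'Ned'"
).fetchone()[0]
print(remaining)
```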
OpenOffice presentation software
Impress
IaaS
Infrastructure as a Service. An on-demand, pay-as-you-go cloud computing model useful for heavily utilized systems and networks. Organizations can limit their hardware footprint and personnel costs by renting access to hardware such as servers. Compare to PaaS and SaaS. Examples: Windows Azure, Google Compute Engine.
data validation
Ensuring a field is populated with usable data
disadvantage of iDrives
Internet was too slow
Data Veracity
Is the data your organization collected any good, trustworthy, or valuable? Lots of data sources have data that is not "clean," meaning it may be too fragmented to be valuable or usable, or that it was simply collected poorly.
How do you decide how much data is enough?
Jeopardy, ask the right questions
PaaS
Platform as a Service. Provides cloud customers with an easy-to-configure operating system and on-demand computing capabilities (Windows). Compare to IaaS and SaaS. Examples: Amazon's Elastic Beanstalk, Force.com, Google's App Engine
REO
Real Estate Owned
Data Volume
Refers to the amount of data collected by an organization. How much data does your business need, and further, where do you keep it once you've collected it?
SQL statements
SELECT, FROM, WHERE, ORDER BY, INSERT INTO (append), UPDATE, DELETE
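A minimal sketch of these statements in use, run through Python's built-in sqlite3 module. The table and values are hypothetical. Note that standard SQL spells the sort clause ORDER BY and adds rows with INSERT INTO.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, gpa REAL)")  # hypothetical table

# INSERT INTO adds (appends) rows to a table.
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ned", 3.1), ("Alice", 3.8), ("Bob", 2.4)],
)

# UPDATE changes existing rows; DELETE removes them.
conn.execute("UPDATE student SET gpa = 3.3 WHERE name = 'Ned'")
conn.execute("DELETE FROM student WHERE name = 'Bob'")

# SELECT picks columns, FROM names the table, WHERE filters, ORDER BY sorts.
result = conn.execute(
    "SELECT name, gpa FROM student WHERE gpa >= 3.0 ORDER BY gpa DESC"
).fetchall()
print(result)
```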
SaaS
Software as a Service. A vendor hosts the software online, and the user accesses and uses it over the Internet. Examples: Google Apps, Apache OpenOffice, QuickBooks Online, SalesForce.com, Microsoft Office 365
StaaS
Storage as a Service. Renting storage space from a CSP. Example: Dropbox
update anomaly
Suppose Alice Simpson changes her phone number. You need to make the change in three places. If you fail to change it in all three places or change it incorrectly in one place, then the records for Alice will be inconsistent.
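The update anomaly can be reproduced in a few lines, assuming Alice's phone number is repeated on each of her three enrollment rows. The phone numbers and course names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_course (student TEXT, phone TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO student_course VALUES (?, ?, ?)",
    [("Alice Simpson", "555-0101", c) for c in ("ACCT", "ISTM", "MGMT")],
)

# The update mistakenly touches only one of Alice's three rows.
conn.execute(
    "UPDATE student_course SET phone = '555-0199' "
    "WHERE student = 'Alice Simpson' AND course = 'ACCT'"
)

# Two different phone numbers now exist for the same person.
phones = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT phone FROM student_course WHERE student = 'Alice Simpson'"
    )
)
print(phones)
```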
TOS
Terms of Service
Data Visualization
The graphic display of the results of data mining, analytics and BI in general, typically in real time
entity relationship (ER) diagram
Used by database designers to construct databases. A data model that uses basic graphical symbols to show the organization of and relationships between data.
Datawarehouse
Used to consolidate very large amounts of disparate data. A large collection of data that contains and organizes in one place all the data from an organization's multiple databases
Four V's of Big Data
Volume, Velocity, Variety, Veracity
insert anomaly
What happens if you have a new student to add, but he hasn't signed up for any courses yet? Or what if there is a new class to add, but there are no students enrolled in it yet? In either case, the record will be partially blank.
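A short sketch of the insert anomaly, again assuming one flat table for students and courses. The student name and phone number are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_course (student TEXT, phone TEXT, course TEXT)")

# A new student with no courses yet: the course cell has nothing to hold.
conn.execute("INSERT INTO student_course VALUES ('Maya', '555-0102', NULL)")

blank_rows = conn.execute(
    "SELECT student FROM student_course WHERE course IS NULL"
).fetchall()
print(blank_rows)
```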
queries
What produces subsets of information?
table
Where a database holds data
OpenOffice word processor
Writer
cardinality
a "One to Many" relationship is a type of?
semi-structured data
a hybrid data format that consists of structured and unstructured data; it can possibly be converted into structured data
attributes
a name of a field
entity
anything about which an organization wants to collect and store information. Example: students are entities
Structured Query Language (SQL)
asks users to write lines of code to answer questions against a database
business analytics
attempts to make connections between data so organizations can try to predict future trends. Business analytics can also uncover computer system inadequacies within an organization
Decision Support System (DSS)
computer-based systems that use and process data to support decision-making activities. Example: verifying credit for a loan
Database Management Systems help avoid what?
data redundancy
DSS
decision support system
Three forms of business analytics
descriptive, predictive and decision
unstructured data
disorganized data that cannot be easily read or processed by a computer because it is not stored in rows and columns like traditional data tables. 80% of data is unstructured
reports
display database information on screen or on paper
Cloud Drive advantage
doesn't need to be backed up
referential integrity rule
ensures there will be no update anomaly problem in foreign keys
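One way to see a referential integrity rule at work is to let the database reject a foreign key value that has no matching row. This sketch uses SQLite through Python's sqlite3 module; the tables are hypothetical, and SQLite only enforces foreign keys once the pragma shown is turned on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(student_id),
        course TEXT
    );
    INSERT INTO student VALUES (1, 'Ned');
""")

try:
    # Student 99 does not exist, so this foreign key value has no match.
    conn.execute("INSERT INTO enrollment VALUES (99, 'ACCT')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```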
schema
is a "map" of database tables and their relationships to one another
entity relationship modeling (ERM)
is a database-modeling method used to construct a theoretical and conceptual representation of data to produce a schema
forms
overlay data tables and queries for more specific views of data
field mask
restricts data input
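A field mask behaves much like a pattern that every entry must fit. The sketch below approximates one with a regular expression; the phone format and the names PHONE_MASK and fits_mask are assumptions for illustration.

```python
import re

# Hypothetical mask: (000) 000-0000, where each 0 must be a digit.
PHONE_MASK = re.compile(r"\(\d{3}\) \d{3}-\d{4}")


def fits_mask(text):
    """Return True only if the whole entry matches the mask."""
    return PHONE_MASK.fullmatch(text) is not None


print(fits_mask("(979) 555-0100"), fits_mask("979-555-0100"))
```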
records & fields
rows & columns
data mining
sometimes called Data Discovery; the examination of huge sets of data to find patterns and connections, and to identify outliers
text analysis
sometimes called Text-mining; hunts through unstructured text data to look for useful patterns, like whether an organization's customers on Facebook are unsatisfied with its products or service.
filtering
the ability to add a filter to a data table that would display specific information
Map Reduce
the processing arm or engine of Hadoop. Allows data to be queried and processed directly on the server where it lives, instead of moving the data across the network to be analyzed on the computer. Only the query is transported
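The idea can be imitated on one machine with a word count, a common teaching example. This is only a toy sketch in plain Python, not Hadoop's actual API: each inner list stands in for data stored on a separate server, and all names and data are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical data: each list stands in for text stored on a separate server.
servers = [
    ["big data", "data mining"],
    ["data warehouse"],
    ["big data analytics"],
]


def map_count(lines):
    """Map step: runs where the data lives and returns a small summary."""
    return Counter(word for line in lines for word in line.split())


def merge(left, right):
    """Reduce step: combine two partial summaries into one."""
    return left + right


partials = [map_count(lines) for lines in servers]  # only summaries travel
totals = reduce(merge, partials, Counter())
print(totals["data"], totals["big"])
```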
Business Intelligence (BI)
the set of techniques and tools for the transformation of raw data into meaningful and useful information for business analysis purposes
Load
transferring data into a Datawarehouse or Datamart
topic analysis
tries to catalog phrases of an organization's customer feedback into relevant topics. For example, if a customer said, "the barista was friendly", that would be categorized under the topic "Employee Friendliness."
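A very simple keyword-matching sketch of the barista example. The keyword lists and the second topic are hypothetical, and real topic analysis tools are usually more sophisticated than this.

```python
# Hypothetical keyword lists, invented for this sketch.
TOPIC_KEYWORDS = {
    "Employee Friendliness": ("friendly", "rude", "helpful"),
    "Wait Time": ("slow", "fast", "line"),
}


def topics_for(comment):
    """Return every topic whose keywords appear in the comment."""
    words = comment.lower().split()
    return [
        topic
        for topic, keys in TOPIC_KEYWORDS.items()
        if any(k in words for k in keys)
    ]


print(topics_for("The barista was friendly"))
```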
query files
used to find specific populations (subsets) of information
Normalizing
your data is typically organized into the fields and records of a relational database; attributes appear multiple times only when they function as foreign keys
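A rough sketch of normalizing a flat list of rows into two related collections, so that student facts are stored once and only the key repeats. The rows and column layout are hypothetical.

```python
# Hypothetical flat rows: (student_id, name, phone, course)
flat = [
    (1, "Alice Simpson", "555-0101", "ACCT"),
    (1, "Alice Simpson", "555-0101", "ISTM"),
    (2, "Ned", "555-0100", "ISTM"),
]

# Student facts are stored once per student.
students = {sid: (name, phone) for sid, name, phone, _ in flat}

# Enrollments repeat only student_id, which acts as the foreign key.
enrollments = [(sid, course) for sid, _, _, course in flat]

print(students)
print(enrollments)
```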
a relationship
"One to Many" is an example of?
Data Velocity
How fast can you collect data, and more importantly, how quickly can you analyze it?
another name for SaaS
On-Demand
Datamart
smaller, more focused data warehouse
Big Data
the huge and complex data sets generated by today's sophisticated information generation, collection, storage, and analysis technologies
Does Amazon have the right to access what you upload?
yes