cis 2010 test 3
prescriptive analytics
The set of analytical techniques that yield a best course of action.
data element
The smallest or basic unit of information
Predictive Analytics
extracts information from data and uses it to predict future trends and identify behavioral patterns
data model
logical data structures that detail the relationships among data elements using graphics or pictures
computer-aided software engineering (CASE) tools
software suites that automate systems analysis, design, and development
data dictionary
compiles all of the metadata about the data elements in the data model
primary key
A field (or group of fields) that uniquely identifies a given entity in a table
Double-Spend
A scenario in the Bitcoin network in which someone tries to send the same bitcoin to two different recipients at the same time. Once a transaction is confirmed, double spending it becomes nearly impossible, and each additional confirmation makes a double spend harder still.
Data mining process
1) Business understanding 2) Data understanding 3) Data Preparation 4) Data Modeling 5) Evaluation 6) Deployment
Data mining analysis techniques
1) Estimation analysis 2) Affinity Grouping Analysis 3) Cluster Analysis 4) Classification Analysis
Digital Ledger
A bookkeeping list of assets (money, property, ideas...), identified ownership, and transactions that record the transfer of ownership among participants
Management Information Systems
A business function, like accounting and human resources, which moves information about people, products, and processes across the company to facilitate decision-making and problem-solving
Peer-to-peer
A computer network that simply connects computers to each other or to a shared device such as a printer, without requiring a server
Cryptocurrency
A digital currency in which encryption techniques are used to regulate the generation of units of currency and verify the transfer of funds.
project plan
A formal, approved document that manages and controls project execution
System
A group of parts that work together as a whole
data warehouse
A logical collection of information - gathered from many different operational databases - that supports business analysis activities and decision-making tasks
foreign key
A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
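A minimal sketch of how a primary key and a foreign key link two tables, using Python's built-in sqlite3 module. The customers and orders tables and their columns are hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,  -- primary key: unique per customer
        name        TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),  -- foreign key
        total       REAL
    )
""")

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")  # valid: customer 1 exists


def insert_orphan_order(connection):
    """Try to insert an order whose customer_id matches no customer."""
    try:
        connection.execute("INSERT INTO orders VALUES (101, 99, 10.0)")
        return True
    except sqlite3.IntegrityError:
        # The foreign key constraint rejects the row.
        return False


orphan_accepted = insert_orphan_order(conn)
print("Orphan order accepted?", orphan_accepted)
```

The rejected insert shows the foreign key acting as an integrity constraint: an order cannot point at a customer that does not exist.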
information scrubbing
A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
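A small sketch of information scrubbing in Python. The records, field names, and cleaning rules (required email, age between 0 and 120, one record per email) are assumptions made up for this example, not rules from the course.

```python
# Hypothetical raw records with typical dirty-data problems.
raw_records = [
    {"name": " Ada Lovelace ", "email": "ADA@EXAMPLE.COM", "age": "36"},        # inconsistent format
    {"name": "Alan Turing", "email": "", "age": "41"},                          # incomplete
    {"name": "Grace Hopper", "email": "grace@example.com", "age": "-5"},        # incorrect
    {"name": "ada lovelace", "email": "ada@example.com", "age": "36"},          # duplicate
    {"name": "linus  torvalds", "email": " linus@example.com ", "age": "54"},   # inconsistent format
]


def scrub(records):
    """Fix what can be fixed and discard what cannot."""
    cleaned, seen_emails = [], set()
    for rec in records:
        # Fix inconsistent formatting.
        name = " ".join(rec["name"].split()).title()
        email = rec["email"].strip().lower()
        # Discard incomplete records.
        if not email:
            continue
        # Discard records with an incorrect age.
        try:
            age = int(rec["age"])
        except ValueError:
            continue
        if age < 0 or age > 120:
            continue
        # Discard duplicates (same email seen before).
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email, "age": age})
    return cleaned


cleaned = scrub(raw_records)
for row in cleaned:
    print(row)
```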
boolean
A single value of either TRUE or FALSE
project
A temporary endeavor undertaken to create a unique product, service, or result.
Blockchain
A type of distributed ledger technology consisting of data structure blocks that may contain data or programs, with each block holding batches of individual transactions and the results of any executables. Each block contains a time stamp and a link to a previous block.
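A toy sketch of the block structure described above: each block holds a batch of transactions, a time stamp, and a link (here, a SHA-256 hash) to the previous block. The function names and transaction fields are hypothetical, and the sketch leaves out mining, networking, and consensus entirely.

```python
import hashlib
import json
import time


def block_hash(block):
    """Hash a block's full contents with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_block(transactions, previous_hash):
    """Build a block with a time stamp and a link to the previous block."""
    return {
        "timestamp": time.time(),
        "transactions": transactions,
        "previous_hash": previous_hash,
    }


def is_chain_valid(chain):
    """Check that every block still points at the hash of the block before it."""
    for prev, curr in zip(chain, chain[1:]):
        if curr["previous_hash"] != block_hash(prev):
            return False
    return True


genesis = make_block([], previous_hash="0" * 64)
block1 = make_block([{"from": "alice", "to": "bob", "amount": 5}], block_hash(genesis))
block2 = make_block([{"from": "bob", "to": "carol", "amount": 2}], block_hash(block1))
chain = [genesis, block1, block2]

valid_before = is_chain_valid(chain)

# Tampering with an earlier block breaks the link stored in the next block.
block1["transactions"][0]["amount"] = 500
valid_after = is_chain_valid(chain)

print(valid_before, valid_after)
```

Because each block stores the hash of its predecessor, changing an old transaction invalidates every later link, which is why tampering is easy to detect.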
table
An arrangement of data made up of horizontal rows and vertical columns.
project manager
An individual who is an expert in project planning and management, defines and develops the project plan, and tracks the plan to ensure the project is completed on time and on budget
multitasking
An operating system feature that allows more than one application to run at a time.
array
Arrays contain a specific number of elements of a particular type.
structured data
Data already in a database or a spreadsheet
attributes
Data associated with an entity.
Input-Process-Output
Describes the structure of an information processing program or other process. It is the most basic structure for describing a process.
data analytics
The science of examining raw data with the purpose of drawing conclusions about that information
Decentralization
Each node in the participating computer network has a full copy of the digital ledger. This avoids the need to have a centralized database managed by a trusted party. Transactions are broadcast to the network. Network nodes can validate transactions and add them to their copy, then broadcast those additions to other nodes.
Entity relationship diagram
a technique for documenting the entities and relationships in a database environment
Dirty data problems
Incomplete, outdated, or otherwise inaccurate data
Relational databases (advantages)
Increased flexibility; increased scalability and performance; reduced information redundancy; increased information integrity; increased information security
Business Intelligence
Information collected from multiple sources such as suppliers, customers, competitors, partners, and industries that analyzes patterns, trends, and relationships for strategic decision making
barriers
Obstacles that interfere with the understanding of a message
common sql commands
SELECT, FROM, WHERE, ORDER BY
Software distribution methods
Software updates and upgrades
AND
The AND operator displays a record if all the conditions separated by AND are TRUE.
WHERE
The WHERE clause can be combined with AND, OR, and NOT operators.
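A short sketch of SELECT, FROM, WHERE, AND, and ORDER BY working together, run through Python's built-in sqlite3 module. The products table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("Notebook", "office", 4.5),
        ("SQL Basics", "book", 18.0),
        ("Data Mining", "book", 45.0),
        ("Intro to MIS", "book", 12.0),
    ],
)

# SELECT picks the columns, FROM names the table, WHERE filters rows,
# AND requires both conditions to be true, ORDER BY sorts the result.
rows = conn.execute(
    """
    SELECT name, price
    FROM products
    WHERE category = 'book' AND price < 20
    ORDER BY price
    """
).fetchall()

print(rows)  # [('Intro to MIS', 12.0), ('SQL Basics', 18.0)]
```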
Normalization
The process of applying rules to a database design to ensure that information is divided into the appropriate tables.
project scope
The work performed to deliver a product, service, or result with the specified features and functions.
Miners
Transactions are authenticated by a network of 'miners' who complete complex mathematical problems. When all miners arrive at the same unique solution, the transaction is verified and recorded as a new 'block'.
Big data
a broad term for datasets so large or complex that traditional data processing applications are inadequate.
Field
a characteristic of a table
information integrity
a measure of the quality of information
extraction, transformation & load
a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse.
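A minimal sketch of the three ETL steps in Python. The two source lists, their field names, and the sales_fact table are hypothetical; the "common set of enterprise definitions" is assumed here to mean upper-case SKUs and amounts stored as whole cents.

```python
import sqlite3

# Extract: two hypothetical sources that describe sales differently.
store_sales = [
    {"sku": "A1", "amount_usd": "19.99"},
    {"sku": "B2", "amount_usd": "5.00"},
]
web_sales = [
    {"product_code": "a1", "total_cents": 2500},
]


# Transform: map both sources onto one common definition
# (sku in upper case, amount as integer cents, plus the source name).
def transform_store(row):
    return (row["sku"].upper(), round(float(row["amount_usd"]) * 100), "store")


def transform_web(row):
    return (row["product_code"].upper(), int(row["total_cents"]), "web")


transformed = [transform_store(r) for r in store_sales] + [transform_web(r) for r in web_sales]

# Load: write the unified rows into a stand-in data warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_fact (sku TEXT, amount_cents INTEGER, source TEXT)")
warehouse.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", transformed)

totals = warehouse.execute(
    "SELECT sku, SUM(amount_cents) FROM sales_fact GROUP BY sku ORDER BY sku"
).fetchall()
print(totals)  # [('A1', 4499), ('B2', 500)]
```

Once both sources share one definition, a single query can total sales across them, which is the point of loading into a warehouse.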
data quality audit
a structured survey of the accuracy and level of completeness of the data in an information system
cluster analysis
a technique used to divide an information set into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible
DBMS tool (SQL)
a tool that asks users to write lines of code to answer questions against a database.
Bitcoin
a type of digital currency in which encryption techniques are used to regulate the generation of units of currency and verify the transfer of funds, operating independently of a central bank.
Systems thinking
a way of monitoring the entire system by viewing multiple inputs being processed or transformed to produce outputs while continuously gathering feedback on each part
System Restore
allows your computer's configuration settings to be reset to those of another earlier time
Legacy system
an old system that is fast approaching or beyond the end of its useful life within an organization
strings
any combination of letters, numbers, and other characters enclosed in quotation marks
One-to-Many relationship
a relationship between two entities in which an instance of one entity can be related to many instances of a related entity
One-to-One relationship
a relationship between two entities in which an instance of one entity can be related to only one instance of a related entity
Many-to-many relationship
a relationship between two entities in which an instance of one entity can be related to many instances of another, and one instance of the other can be related to many instances of the first entity
records
collection of related data
information cube
common term for the representation of multidimensional information
Application Software
computer software created to allow the user to perform a specific job or task
data mart
contains a subset of data warehouse information
System Software
controls how the various technology tools work together along with the application software
Operating systems software
controls the application software and manages how the hardware devices work together
bugs
defects in the code of an information system
data visualization
describes technologies that allow users to see or visualize data to transform information into a business perspective
estimation analysis
determines values for an unknown continuous variable, such as a behavior or an estimated future value
dirty data
erroneous or flawed data
data scientist
extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information
process modeling
involves graphically representing the processes that capture, manipulate, store, and distribute information between a system and its environment
JOIN
combines rows from two or more tables based on a related column between them
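A brief sketch of a JOIN, run with Python's built-in sqlite3 module. The employees and departments tables are hypothetical; dept_id plays the role of the shared column that links them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)")

conn.executemany("INSERT INTO departments VALUES (?, ?)", [(10, "Engineering"), (20, "Research")])
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", 10), (2, "Alan", 20), (3, "Grace", 10)],
)

# JOIN matches each employee row to the department row with the same dept_id.
joined = conn.execute(
    """
    SELECT e.name, d.dept_name
    FROM employees AS e
    JOIN departments AS d ON e.dept_id = d.dept_id
    ORDER BY e.name
    """
).fetchall()

print(joined)  # [('Ada', 'Engineering'), ('Alan', 'Research'), ('Grace', 'Engineering')]
```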
database
maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)
unstructured data
nonnumeric information that is typically formatted in a way that is meant for human eyes and not easily understood by computers
software updates
occur when the software vendor releases updates to software to fix problems or enhance features
software upgrades
occurs when the software vendor releases a new version of the software, making significant changes to the program
Open Systems
organizations that are affected by, and that affect, their environment
Hash
a short, fixed-length code produced by transforming plaintext
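A quick sketch of hashing with Python's built-in hashlib module. SHA-256 is used as one common hash function (an assumption; the card does not name one), and the helper name short_code and the sample messages are made up for illustration.

```python
import hashlib


def short_code(plaintext):
    """Transform plaintext into a fixed-length SHA-256 code (64 hex characters)."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


digest_a = short_code("pay bob 5 BTC")
digest_b = short_code("pay bob 6 BTC")  # one character changed

print(len(digest_a), len(digest_b))  # both codes have the same length
print(digest_a == digest_b)          # a tiny change in the input gives a different code
```

The same input always produces the same code, while even a one-character change produces a different one, which is what makes hashes useful for detecting altered data.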
distributed computing
processes and manages algorithms across many machines in a computing environment
Utility Software
provides additional functionality to the operating system
affinity grouping analysis
reveals the relationship between variables along with the nature and frequency of the relationships
integrity constraints
rules that help ensure the quality of information
SELECT
selects data from a database
Subsystems
smaller systems that operate within the context of a larger system
FROM
specifies which table (or tables) to retrieve the data from
entities
stores information about a person, place, thing, transaction, or event
off-the-shelf application
supports general business processes and does not require any specific software customization to meet the organization's needs
Feedback loops
system structure that causes output from one node to eventually influence input to that same node.
feed forward loop
term describing an element or pathway within a system that passes a controlling signal from a source in its external environment to a load elsewhere in its external environment.
information redundancy
the duplication of data, or the storage of the same data in multiple places
data mining
the process of analyzing data to extract information not offered by the raw data alone
classification analysis
the process of organizing data into categories or groups for its most effective and efficient use
conversion
the process of transferring information from a legacy system to a new system
business requirement
the specific business requests the system must meet to be successful
tools
things people use to help them do a job
DBMS (Database Management System)
this creates, reads, updates, and deletes data in a database while controlling access and security.
business rule
this defines how a company performs certain aspects of its business and typically results in either a yes/no or true/false answer
DBMS tool (query by example)
this is a tool that helps users graphically design the answer to a question against a database.
Metadata
this provides details about data. For example, for an image this could be size, resolution, and date created.
relational databases
this stores information in the form of logically related two-dimensional tables
common characteristics of big data
variety, veracity, volume, velocity