Intro to info systems exam 3 study guide
data element
The smallest or basic unit of information
Utility software
provides additional functionality to the operating system (antivirus software, screen savers, and anti-spam software)
primary key
A field (or group of fields) that uniquely identifies a given entity in a table
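A minimal sketch of how a primary key enforces uniqueness, assuming Python's built-in sqlite3 module. The student table and its columns are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Ana')")

# A second row that reuses the same primary key value is rejected.
try:
    conn.execute("INSERT INTO student VALUES (1, 'Ben')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Because student_id must be unique, only the first insert is stored.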
ETL
(Extraction, transformation, and loading) a process that extracts information from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a warehouse.
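The three ETL steps can be sketched in a few lines of Python. The two source lists, their field names, and the "common definition" (title-case names, amounts as decimals) are assumptions made up for this example, not part of any real system.

```python
# Hypothetical source records: two systems store the same facts in different formats.
sales_db = [{"customer": "ana lopez", "amount": "19.50"}]
billing_db = [{"CUSTOMER": "Ben Wu", "AMT": 5}]

def extract():
    # Extraction: pull the raw records out of each source.
    return sales_db, billing_db

def transform(sales, billing):
    # Transformation: apply one common definition (title-case name, float amount).
    rows = [{"customer": r["customer"].title(), "amount": float(r["amount"])} for r in sales]
    rows += [{"customer": r["CUSTOMER"].title(), "amount": float(r["AMT"])} for r in billing]
    return rows

def load(rows, warehouse):
    # Loading: append the standardized rows to the warehouse.
    warehouse.extend(rows)
    return warehouse

warehouse = load(transform(*extract()), [])
```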
advantages and features of relational databases
- Increased flexibility - Increased scalability and performance - Reduced information redundancy - Increased information integrity - Increased information security
Double-Spend
- A scenario in the Bitcoin network where someone tries to spend the same bitcoin with two different recipients at the same time. - Once a bitcoin transaction is confirmed, it becomes nearly impossible to double spend it. - The more confirmations a transaction has, the harder it becomes to double spend the bitcoins.
Examples of metadata
- metadata for an image could be its size, resolution, and date created - metadata for text could contain document length, date created, author's name, and summary.
central components of blockchain
-Digital Ledger -Hash and Digital Signature -Miners -Decentralized
characteristics of redundant data
-Due to the unorganized structure present within file systems -updates made to one data file may not be carried to the corresponding data file in a different location.
advantages of blockchain
-Transparency: Users can verify and track transactions in the public, decentralized ledger. -Decentralized: Instead of being stored at any single point, a blockchain system is entirely decentralized, which means that no overarching authority can advance its own agenda and control the network. -Security/immutability: Once a block is sealed cryptographically, it is extremely difficult to copy, delete, or edit, which keeps the digital ledger immutable.
external databases
-competitor information -industry information -mailing lists -stock market analysis
Dirty data problems
-inaccurate data -duplicate data -misleading data -incorrect data -non-formatted data -data that violates business rules -nonintegrated data
internal databases
-marketing -sales -billing -inventory
unstructured data
-satellite images -photographic data -video data -social media data -text message -voice mail data
structured data
-sensor data -weblog data -financial data -click stream data -point of sale data -accounting data
types of attributes
-simple vs. composite (a simple attribute cannot be broken down into smaller components; a composite attribute can) -single-valued vs. multivalued (single-valued means having only one value for each attribute; multivalued means having the potential to contain more than one value for an attribute) -stored vs. derived (a derived attribute can be calculated from the value of another attribute; a stored attribute is the one used to derive it) -null-valued (assigned to an attribute when no other value applies or when a value is unknown)
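A small sketch of the stored vs. derived distinction, assuming a stored birth date from which an age is derived. The function name age_on and the example dates are hypothetical.

```python
from datetime import date

def age_on(birth_date, today):
    # Derived attribute: age is calculated from the stored birth date.
    years = today.year - birth_date.year
    # Subtract one if the birthday has not happened yet this year.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

stored_birth_date = date(2000, 6, 15)                   # stored attribute
derived_age = age_on(stored_birth_date, date(2024, 6, 14))  # derived attribute
```

Only the birth date needs to be stored; the age can always be recalculated.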
licensing methods
-single user license (restricts the use of the software to one user at a time) -network user license (enables anyone on the network to install and use the software) -site license (enables any qualified users within the organization to install the software) -application service provider license (specialty software paid for on a license basis or on a per-use, usage-based basis)
common characteristics of big data
-variety -veracity -volume -velocity
data artist
A business analytics specialist who uses visual tools to help people understand complex data
Peer-to-peer
A computer network that relies on the computing power and bandwidth of the participants in the network rather than a centralized server
Cryptocurrency
A digital currency in which encryption techniques are used to regulate the generation of units of currency and verify the transfer of funds.
Foreign Key
A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
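A minimal sketch of a foreign key linking two tables, assuming Python's built-in sqlite3 module. The customer and orders tables are hypothetical. Note that SQLite only enforces foreign keys once the foreign_keys pragma is turned on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customer(customer_id))"  # foreign key
)
conn.execute("INSERT INTO customer VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders VALUES (100, 1)")  # valid: customer 1 exists

# An order that points at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (101, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```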
Advantages of Bitcoin
A prohibitively high cost to attempt to rewrite or alter transaction history. Any well-connected node in the Bitcoin blockchain can determine, with certainty, whether a transaction does or does not exist in the data set.
Subsystems
A unit or device that is part of a larger system
Hash and Digital Signature ( central components of blockchain )
Computer Science and advanced mathematics (in the form of cryptographic hash functions) protect the blockchain's integrity and anonymity. Each transaction has a digital hash calculated and attached. The hash includes digital signatures from the existing blockchain as well as the new transaction.
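A simplified sketch of why chained hashes protect integrity, assuming SHA-256 from Python's hashlib. It leaves out digital signatures entirely, and the transaction strings and block_hash helper are hypothetical, so it illustrates the idea rather than how any real blockchain is implemented.

```python
import hashlib

def block_hash(previous_hash, transaction):
    # Hash the previous block's hash together with the new transaction text.
    return hashlib.sha256((previous_hash + transaction).encode("utf-8")).hexdigest()

h1 = block_hash("0" * 64, "Ana pays Ben 1")
h2 = block_hash(h1, "Ben pays Cy 1")

# Altering the first transaction changes its hash, and so every hash after it.
tampered_h1 = block_hash("0" * 64, "Ana pays Ben 100")
tampered_h2 = block_hash(tampered_h1, "Ben pays Cy 1")
```

Because each hash depends on the one before it, a change to an old transaction no longer matches the hashes recorded later in the chain.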
Data Visualization
Describes technologies that allow users to "see" or visualize data to transform information into a business perspective
Decentralization
Each node in the participating computer network has a full copy of the digital ledger. This avoids the need to have a centralized database managed by a trusted party. Transactions are broadcast to the network. Network nodes can validate transactions and add them to their copy, then broadcast those additions to other nodes.
data scientist
Extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information
barriers
Obstacles that interfere with the understanding of a message
Miners ( central components of blockchain )
People who authenticate transactions by completing complex mathematical problems.
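One way to picture the "complex mathematical problem" is a hash puzzle: keep trying numbers until the hash of the block data starts with a required number of zeros. This is a toy sketch under that assumption. The mine function, the difficulty of 3, and the block text are hypothetical and far simpler than the rules of any real network.

```python
import hashlib

def mine(block_data, difficulty=3):
    # Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zeros.
    nonce = 0
    prefix = "0" * difficulty
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

nonce, digest = mine("Ana pays Ben 1")
```

Finding the nonce takes many attempts, but anyone can check the answer with a single hash.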
distributed computing
Processes and manages algorithms across many machines in a computing environment
Fast data
The application of big data analytics to smaller data sets in near-real time or real time in order to solve a problem or create business value
information cube
The common term for the representation of multidimensional information
information redundancy
The duplication of data, or the storage of the same data in multiple places
Business Intelligence dashboard
Tracks corporate metrics such as critical success factors and key performance indicators and includes advanced capabilities such as interactive controls, allowing users to manipulate data for analysis
application software
Used for specific information processing needs, including payroll, customer relationship management, project management, training, and many others.
record
a collection of related data elements
Open Systems
a computer system that combines portability and interoperability, and makes use of open software standards
Management Information Systems
a computerized information-processing system designed to support the activities of company or organizational management
blockchain
a distributed ledger that provides a way for information to be recorded and shared by a community
data warehouses
a logical collection of information, gathered from many different operational databases, that supports business analysis activities and decision making tasks
information integrity
a measure of the quality of information
information scrubbing
a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
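A minimal sketch of scrubbing, assuming a small list of contact records. The records, the scrub function, and the choice to treat a missing email as "incomplete" are all hypothetical.

```python
raw = [
    {"name": "Ana Lopez", "email": "ana@example.com"},
    {"name": "ana lopez ", "email": "ANA@example.com"},  # duplicate, inconsistent formatting
    {"name": "Ben Wu", "email": ""},                     # incomplete: no email
]

def scrub(records):
    seen, clean = set(), []
    for r in records:
        name = r["name"].strip().title()    # fix inconsistent formatting
        email = r["email"].strip().lower()
        if not email:                       # discard incomplete records
            continue
        if email in seen:                   # discard duplicates
            continue
        seen.add(email)
        clean.append({"name": name, "email": email})
    return clean

clean = scrub(raw)
```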
Systems Thinking
a way of monitoring the entire system by viewing multiple inputs being processed or transformed to produce outputs while continuously gathering feedback on each part
multitasking
allowing a user to perform more than one computer task (such as the operation of an application program) at a time
One-to-Many relationship
between two entities in which an instance of one entity can be related to many instances of a related entity
Many-to-many relationship
between two entities in which an instance of one entity can be related to many instances of another and one instance of the other can be related to many instances of the first entity
One-to-One relationship
between two entities in which an instance of one entity can be related to only one instance of a related entity (employee -> manages -> store)
Field
characteristic of a table
Big data
collection of large, complex data sets, including structured and unstructured data
System
collection of parts that link to achieve a common purpose
records
collection of related data elements
JOIN
combines records from two tables
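A short sketch of a JOIN, assuming Python's built-in sqlite3 module and two hypothetical tables (customer and orders) that share a customer_id column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO orders VALUES (100, 1, 19.5), (101, 1, 5.0), (102, 2, 7.25);
""")

# Combine each order with the matching customer record.
rows = conn.execute(
    "SELECT customer.name, orders.total "
    "FROM customer JOIN orders ON customer.customer_id = orders.customer_id "
    "ORDER BY orders.order_id"
).fetchall()
```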
data dictionary
compiles all of the metadata about the data elements in the data model
tables
composed of rows and columns that represent an entity
data marts
contains a subset of data warehouse information
system software
controls how the various technology tools work together along with the application software. Includes both operating system software and utility software.
Operating system software
controls the application software and manages how the hardware devices work together
virtualization
creates multiple virtual machines on a single computing device
Database Management System (DBMS)
creates, reads, updates, and deletes data in a database while controlling access and security
attributes
data elements associated with an entity
business rule
defines how a company performs certain aspects of its business and typically results in either a yes/no or true/false answer
Metadata
details about data
AND
displays a record if all the conditions separated by AND are TRUE.
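A small sketch of AND in a query, assuming Python's built-in sqlite3 module. The student table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (name TEXT, major TEXT, gpa REAL);
INSERT INTO student VALUES ('Ana', 'MIS', 3.8), ('Ben', 'MIS', 2.9), ('Cy', 'Art', 3.9);
""")

# A record is returned only if BOTH conditions are true.
rows = conn.execute(
    "SELECT name FROM student WHERE major = 'MIS' AND gpa >= 3.5"
).fetchall()
```

Ben fails the GPA condition and Cy fails the major condition, so only Ana is returned.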
redundant data
duplicate information
feed forward loop
element or pathway within a control system that passes a controlling signal from a source in its external environment to a load elsewhere in its external environment
Cardinality
expresses the specific number of instances in an entity
Conditions
features of a programming language, which perform different computations or actions depending on whether a programmer-specified boolean condition evaluates to true or false
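A minimal sketch of a condition in Python. The free-shipping rule and the dollar amounts are made up for illustration.

```python
def shipping_cost(order_total):
    # Hypothetical rule: orders of $50 or more ship free.
    if order_total >= 50:   # Boolean condition evaluates to True or False
        return 0.0
    else:
        return 5.99

free = shipping_cost(50)
paid = shipping_cost(49.99)
```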
Decentralized ( central components of blockchain )
control is spread among the nodes in the network rather than held by a single central authority
query by example (QBE)
helps users graphically design the answer to a question against a database
Hash
includes digital signatures from the existing blockchain as well as the new transaction.
three aspects of systems: Inputs, Process & Output
input: lettuce, tomatoes, patty, bun, ketchup; process: cook the patty and put the ingredients together; output: hamburger
Digital Ledger ( central components of blockchain )
list of assets (money, property, ideas...), identified ownership, and transactions that record the transfer of ownership among participants.
data model
logical data structures that detail the relationships among data elements using graphics or pictures
Database
maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)
format
may include what needs to be done to enable the business rule to be implemented
Distinct types of data including numbers, strings, Boolean and lists (arrays or vectors)
numbers: probably the simplest type; consists of numeric values such as integers and floating-point numbers. strings: a data type used in programming, like integers and floating-point numbers, but used to represent text rather than numbers (the word "hamburger" and the phrase "I ate 3 hamburgers" are both strings). Boolean: a data type that has one of two possible values (usually denoted true and false). lists: an ordered set of components stored in a 1D vector (a vector is the most common and basic data structure in R and is the workhorse of R; an array is a homogeneous collection of elements of the same data type).
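The same four kinds of data can be sketched in Python. This is an analogy only: the definitions above describe R, and Python's list is not the same structure as an R vector. The variable names are hypothetical.

```python
count = 3                                    # number (integer)
price = 4.99                                 # number (floating point)
phrase = "I ate 3 hamburgers"                # string: text, even though it contains a digit
is_hungry = False                            # Boolean: True or False
toppings = ["lettuce", "tomato", "ketchup"]  # list: ordered collection

# Ask Python for the type name of each value.
type_names = [type(v).__name__ for v in (count, price, phrase, is_hungry, toppings)]
```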
software upgrades
occurs when the software vendor releases a new version of the software, making significant changes to the program
software updates
occurs when the software vendor releases updates to software to fix problems or enhance features
types of entity relationships
one to one one to many many to many
primary key vs. foreign key
primary: makes it possible to uniquely identify every record in a table foreign: primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
software distribution
process of making software available to the end user from the developer
integrity constraints
rules that help ensure the quality of information
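One common kind of integrity constraint can be sketched with a CHECK rule, assuming Python's built-in sqlite3 module. The product table and the "price cannot be negative" rule are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " product_id INTEGER PRIMARY KEY,"
    " price REAL NOT NULL CHECK (price >= 0))"  # rule: price must not be negative
)
conn.execute("INSERT INTO product VALUES (1, 9.99)")

# A row that breaks the rule is rejected by the database.
try:
    conn.execute("INSERT INTO product VALUES (2, -5)")
    negative_rejected = False
except sqlite3.IntegrityError:
    negative_rejected = True
```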
Analytics
science of fact-based decision making
Assignment
sets and/or re-sets the value stored in the storage location(s) denoted by a variable name; in other words, it copies a value into the variable
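A tiny sketch of setting and re-setting a variable, written in Python, where = plays the role that <- plays in R. The variable names are arbitrary.

```python
x = 1        # set: copy the value 1 into x
x = x + 4    # re-set: x now holds a new value
z = "hello"  # a variable can also hold text
```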
business rule
statements (or conditions) that tell a person whether they can perform a specific action that relates to how the business operates
Entity
stores information about a person, place, thing, action, or event.
entities
stores information about a person, place, thing, transaction, or event
feedback loops
take the system output into consideration, which enables the system to adjust its performance to meet a desired output response
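A minimal sketch of a feedback loop, assuming a thermostat-style system that compares its output with a target and corrects half of the gap on each step. The function, the gain of 0.5, and the temperatures are hypothetical.

```python
def run_with_feedback(target, output, gain=0.5, steps=10):
    history = []
    for _ in range(steps):
        error = target - output   # feedback: compare output with the desired response
        output += gain * error    # adjust performance based on that feedback
        history.append(output)
    return history

history = run_with_feedback(target=70.0, output=60.0)
```

Each pass uses the latest output to decide the next adjustment, so the output moves steadily toward the target.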
Prescriptive Analytics
takes predictive analytics one step further by offering specific and actionable next steps for how to solve the issues brought up in the predictive data analysis
Entity relationship diagram
technique for documenting the entities and relationships in a database environment
Data aggregation
the collection of data from various sources for the purpose of data processing
attribute
the data elements associated with an entity (also called columns or fields).
Bitcoin
the most popular and fastest-growing digital currency
tools
things people use to help them do a job
Predictive analytics
uses data to forecast what is likely to happen, for example to detect problems before they occur
FROM
used to list the tables and any joins required for the SQL statement.
SELECT
used to select a collection of records from a table, optionally based on some condition. Ex: SELECT * FROM student gets all the records of the student table.
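The student example can be run as a sketch with Python's built-in sqlite3 module. The columns and sample rows are hypothetical, and the ORDER BY is added only so the result order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
""")

# SELECT * returns every column of every record in the table.
all_rows = conn.execute("SELECT * FROM student ORDER BY student_id").fetchall()

# Naming a column and adding a condition narrows the result.
names = conn.execute("SELECT name FROM student WHERE student_id = 2").fetchall()
```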
SQL (Structured Query Language)
users write lines of code to answer questions against a database
Appropriate methods for assigning variables in R
x <- 1; z <- "hello"