Understanding Data Types, Databases, and Big Data Concepts in Information Systems
Primary Key
A unique identifier for each record in a table. Example: Customer ID.
Database Schema
A visual structure of the database showing tables, fields, and relationships between tables.
Data located centrally
All data stored in one place.
Complete
All necessary data is included.
Data manipulation component
Allows users to query and update data.
When are data visualization tools most useful?
when people need to quickly analyze large amounts of data and communicate insights clearly. Visuals help decision-makers see important information faster than reading raw numbers in tables. Example: A company might use a bar chart or dashboard to quickly see which product is selling the most each month.
Processing
Analyzing large datasets requires powerful computing tools.
When do we use the sensemaking approach?
when you don't know what's going on and need to make an unstructured problem more structured.
How AI works
Generative AI like ChatGPT is trained on large datasets and uses machine learning to generate new text by predicting likely word patterns based on a user's input.
GB
Gigabyte
#3 Model
Goal-seeking analysis works backwards. Instead of asking what will happen, it asks: 👉 What value do we need to reach a specific goal? Example: A company wants $50,000 profit. The model calculates what price they must charge to reach that goal. Key idea: Start with the desired result, then find the input needed.
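A minimal Python sketch of goal seeking. Only the $50,000 target comes from the example above; the profit model (10,000 units sold, $3 unit cost, $20,000 fixed costs) and the function names are assumptions made up for illustration.

```python
def profit(price, units=10_000, unit_cost=3.0, fixed_cost=20_000.0):
    """Hypothetical profit model: revenue minus variable and fixed costs."""
    return units * (price - unit_cost) - fixed_cost


def goal_seek(target, low, high):
    """Bisection search for the price whose profit equals the target.

    Assumes profit rises as price rises between low and high.
    """
    while high - low > 1e-9:
        mid = (low + high) / 2
        if profit(mid) < target:
            low = mid   # price too low to reach the goal
        else:
            high = mid  # price meets or exceeds the goal
    return (low + high) / 2


# Start from the desired result and work back to the input.
required_price = goal_seek(target=50_000, low=0.0, high=100.0)
print(f"Price needed for $50,000 profit: ${required_price:.2f}")
```

Under these assumed numbers the search settles on a price of about $10, since 10,000 × ($10 − $3) − $20,000 = $50,000.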
Data Storage Units (Smallest to Largest)
KB < MB < GB < TB < PB
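The ordering can be sketched as a small Python helper that picks the largest unit that fits. It assumes each unit is 1,000 times the previous one; some systems use 1,024 instead, which is why `step` is a parameter. The function name is hypothetical.

```python
UNITS = ["KB", "MB", "GB", "TB", "PB"]  # smallest to largest


def describe_size(num_bytes, step=1000):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    unit = "bytes"
    for next_unit in UNITS:
        if value < step:
            break
        value /= step
        unit = next_unit
    return f"{value:.1f} {unit}"


print(describe_size(2_500_000))   # a few megabytes
print(describe_size(3 * 10**15))  # a few petabytes
```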
KB
Kilobyte
#4 Model
Optimization analysis finds the best possible solution given certain limits. It looks for the maximum or minimum result. Example: A factory wants to maximize profit while limited by labor hours, materials, and budget. The model finds the best production combination. Key idea: Find the best possible outcome under constraints.
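A minimal Python sketch of the factory example, using brute-force search over every production combination. All numbers (two products, their profits, and the three limits) are hypothetical; real optimization tools use faster methods such as linear programming.

```python
# Hypothetical limits for the three constraints named in the example.
LABOR_HOURS = 100
MATERIAL_UNITS = 60
BUDGET = 1_100

best = None  # (profit, units of product A, units of product B)
for a in range(LABOR_HOURS + 1):
    for b in range(MATERIAL_UNITS + 1):
        feasible = (
            2 * a + 1 * b <= LABOR_HOURS         # A needs 2 hours, B needs 1
            and a + b <= MATERIAL_UNITS          # each unit uses 1 material
            and 10 * a + 20 * b <= BUDGET        # A costs $10, B costs $20
        )
        if not feasible:
            continue
        total_profit = 40 * a + 30 * b           # A earns $40, B earns $30
        if best is None or total_profit > best[0]:
            best = (total_profit, a, b)

print(f"Best plan: {best[1]} of A and {best[2]} of B, profit ${best[0]:,}")
```

With these assumed figures the best plan is 40 units of A and 20 units of B, where both the labor and materials limits are fully used.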
Value
The usefulness of data for decision making. Example: customer data used to improve marketing.
Paradox
a seemingly absurd or self-contradictory statement or proposition that, when investigated or explained, may prove to be well founded or true. Outcome: a change in how you see reality, which leads to different or opposite actions. Examples: "the less you study, the better your grades become"; "if you use fewer words, you sound more intelligent"; "the slower you swing the golf club, the farther the ball goes".
Dilemma
a situation in which a difficult choice has to be made between two or more alternatives, especially equally undesirable ones. Outcome: choose the best alternative, which may not turn out well anyway. Examples: "if I don't pay a bribe, I might not get the business. But if I do pay a bribe, I might get fired"; "To reduce my stress, I can cut down on the number of courses I take, but then I'll graduate later, which will increase my stress".
what kind of app is DSS?
business application that analyzes data to help people make decisions.
Unstructured Data
data that does not follow a specific format or structure. It is harder for computers to analyze because it is not organized in rows and columns. Examples of unstructured data: emails, videos, images, social media posts, and audio recordings.
Action
do something and see what happens. Ex: Doctors prod to see where the pain is.
Bad Model
inaccurate, misleading, or based on poor data or assumptions.
Deliberation
reflect, ponder, muse. Ex: You simply need time to let things work through in your mind.
Contextualization
relate to something we do know. Ex: A cost accountant comparing July's surprising sales to last July's sales.
Joint Application Development (JAD)
A process where users and IT staff work together to define database requirements.
Affiliation
share our "sense" with others. Ex: crowdsourcing.
Triangulation
take different readings from different sources. Ex: different people/roles, facts, impressions/opinions.
expert system
A programmer writes a program that uses the same rules as human experts (gathered through interviews and observations). Ex: doctor diagnosis, oil drilling locations, financial investments. Why it is AI: it imitates the decision-making ability of a human expert.
Record
A row in a table representing one entry. Example: One customer.
Opportunity
A set of circumstances that makes it possible to do something different. Outcome: a realization of a possible action. Examples: "they are making lots of money selling X, can I?"; "no one has thought to do X this way, so I can"; "if I do this now, and not wait, I'll get rewarded".
Database
A structured collection of related data.
Sensemaking approach
...is a process of creating meaning when there is no single meaning available. ...is about interpretation and negotiation with others to form an agreed-upon reality. ...happens when we are confused, when there are multiple conflicting interpretations of what's going on. ...happens when we are surprised that we are surprised. ...leads one to decide if a situation is a problem, a decision, a predicament, an opportunity, a dilemma, etc.
Foreign Key
A field that links one table to another using the primary key from another table.
Volume
The massive amount of data being generated. Example: billions of social media posts.
business decision making model
A business decision-making model is a tool or system that uses data and calculations to help managers analyze situations and choose the best decision. These models let businesses test different scenarios and see what results might happen before making a real decision.
File/Table
A collection of related records.
Database
A collection of related tables.
Field
A column in a table representing a type of data. Example: Customer Name.
Semi-structured
A decision for which some parts are structured and some parts are unstructured. Some info is known, some is not. A known process will answer some of it, but not all of it. Examples: (1) What price should we give our new product? (unknown: market elasticity) (2) Should we merge with Company X? (unknown: X's real value) (3) What is the best route to deliver pizzas? (unknown: traffic)
Unstructured Decision
A decision that is novel and therefore has no agreed-upon, well-understood procedure for making the decision. We don't know what info we need and we don't know what procedure to use. Often, we don't know what we don't know. Examples: (1) Which products should our company design that will make at least $1 billion? (2) What career will I be successful in? (3) How do we solve the problem of homelessness?
Structured Decision
A decision that is routine and repetitive, and often has well-defined procedures for making the decision. We know what info we need and we know what procedure to use to make the decision. Examples: (1) Comparing actual spending to the estimated budget. (2) Deciding which of three advertising plans to pay for. (3) Solving a math problem for which there is one right answer.
SQL (Structured Query Language)
A language used to retrieve and manipulate data in a database.
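A minimal sketch of these ideas using Python's built-in `sqlite3` module. The table names, column names, and sample rows are hypothetical; the point is to show a primary key, a foreign key, and one SQL query that retrieves data across two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")         # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when asked

# Each table has a primary key; orders.customer_id is a foreign key.
conn.execute("""
    CREATE TABLE customers (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        total       REAL NOT NULL
    )""")

# Populate the database: each tuple is one record (row).
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(101, 1, 25.0), (102, 1, 40.0), (103, 2, 15.0)])

# Retrieve data: total spending per customer, linked through the foreign key.
rows = conn.execute("""
    SELECT c.customer_name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id""").fetchall()
print(rows)

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (104, 99, 5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("Invalid order rejected:", rejected)
```

Storing each customer once and linking orders by `customer_id` is also how a database reduces redundancy: the name is never copied into the orders table.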
Problem
A matter or situation regarded as unwelcome or harmful and needing to be dealt with and overcome. It has a quality of urgency. Outcome: a process that leads to a different situation. Examples: "we're losing sales and we need to change that"; "my life-long friend is having trouble, which is messing up my life"; "I'm having trouble figuring out how to get a job".
intelligent agent
An application that performs specific tasks on behalf of its users. Ex: shopping, stock picking, or spamming. Ex: Siri, Bixby, Alexa, Youper, etc. Why it is AI: it uses data and decision rules to act intelligently and automatically.
Data visualization tools
Are software or features that turn data into visual formats such as charts, graphs, dashboards, or maps. These visuals make it easier for people to understand patterns, trends, and relationships in the data.
Data accessible
Authorized users can access data easily.
Accessible
Authorized users can easily obtain the data.
RS #7
Check that the solution solves the problem
Timely
Data is available when needed.
Accurate
Data is correct and free from errors.
Consistent
Data is the same across the system.
Reason 2: Improve data consistency
Data stays accurate across the system.
Data isolation
Data stored in separate systems that cannot communicate.
Reason 3: Better security
Databases allow controlled access to information.
Reason 1: Reduce data redundancy
Databases prevent duplicate data.
DSS
Decision Support System. Models: 1. What-if analysis: checks the impact of a change in an assumption on the proposed solution. 2. Goal-seeking analysis: finds the inputs necessary to achieve a goal, such as a desired level of output. 3. Sensitivity analysis: studies the impact that changes in one (or more) parts of the model have on other parts of the model. 4. Optimization analysis: an extension of goal-seeking analysis that finds the optimum (best) value for a target variable by repeatedly changing other variables to see which scenario produces that optimum value.
RS #4
Define decision criteria
RS #2
Define the requirements and goals of the decision.
Data definition component
Defines the database structure.
Data inconsistency
Different versions of the same data exist.
Weighted Decision Matrix
Each criterion is given a weight based on how important it is. The scores are multiplied by the weights and summed to calculate a final score. Example: if price and quality matter more than convenience, they get larger weights and so affect the final decision more. Key idea: Some criteria matter more than others, so they are weighted.
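A minimal Python sketch of a weighted decision matrix. The three criteria come from the card above; the weights, the two options, and their 1-to-5 scores are made-up numbers for illustration.

```python
# Hypothetical weights: price and quality count twice as much as convenience.
weights = {"price": 4, "quality": 4, "convenience": 2}

# Hypothetical scores for each option on a 1-5 scale (higher is better).
scores = {
    "Option A": {"price": 3, "quality": 3, "convenience": 5},
    "Option B": {"price": 4, "quality": 4, "convenience": 2},
}


def weighted_total(option_scores):
    """Multiply each score by its criterion weight and add the results."""
    return sum(weights[c] * option_scores[c] for c in weights)


totals = {name: weighted_total(s) for name, s in scores.items()}
best = max(totals, key=totals.get)
print(totals, "->", best)
```

With these numbers Option B wins 36 to 34 even though Option A scores much higher on convenience, because convenience carries the smallest weight.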
Data quality controlled
Ensures accuracy and consistency.
Data quality
Ensuring data is accurate and reliable.
Populating a Database
Entering data into a database.
RS #6
Evaluate alternative solutions using criteria
Big Data
Extremely large and complex datasets that are difficult to process using traditional data tools.
Storage
Huge amounts of data require large storage systems.
RS #3
Identify alternative solutions
RS #1
Identify and define/describe the problem
Decision
Make a decision; solve a problem; exploit an opportunity; resolve a paradox; resolve a dilemma; decide what the decision is; infer an instinct (book: Blink, by Malcolm Gladwell); commence an action; design something.
MB
Megabyte
Reason 4: Easier data sharing
Multiple users can access the same database.
PB
Petabyte
Security
Protecting large amounts of sensitive data.
Data redundancy
Same data stored in multiple places.
RS #5
Select appropriate decision making process and tools.
#2 Model
Sensitivity analysis looks at how much the result changes as one variable is changed repeatedly. It tests many values of a single variable to see how strongly that variable affects the outcome. Example: A company tests profits if the price is $8, $9, $10, $11, or $12. This helps see how sensitive profit is to price changes. Key idea: Change one variable multiple times to see how results react.
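A minimal Python sketch of the price test. The prices $8 to $12 come from the example; the profit model (demand falling by 100 units per $1 of price, $4 unit cost) is an assumption invented for illustration.

```python
def profit(price, unit_cost=4):
    """Hypothetical model: higher prices earn more per unit but sell fewer units."""
    units_sold = 2000 - 100 * price
    return (price - unit_cost) * units_sold


prices = [8, 9, 10, 11, 12]
profits = [profit(p) for p in prices]

# How much profit moves for each $1 step in price.
changes = [later - earlier for earlier, later in zip(profits, profits[1:])]

for p, value in zip(prices, profits):
    print(f"price ${p}: profit ${value:,}")
print("change per $1 step:", changes)
```

Under this assumed model each extra dollar adds less profit than the one before (700, 500, 300, then 100), so profit is most sensitive to price at the low end of the range.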
Database Management System (DBMS)
Software that allows users to create, store, manage, and retrieve data from a database.
Data dictionary
Stores metadata about the database.
Which is easier for businesses to process?
Structured data, because it is organized and easily searchable in databases.
TB
Terabyte
Artificial Intelligence
The ability of computers or software to perform tasks that normally require human intelligence, such as learning, reasoning, problem-solving, and making decisions.
Data value
The actual piece of data stored in a cell.
Variety
The different types of data. Example: text, video, images, sensor data.
Veracity
The reliability or accuracy of data. Example: inaccurate social media data.
Velocity
The speed at which data is generated and processed. Example: real-time stock market data.
Data insecurity
Unauthorized access or weak protection of data.
Data easier to maintain
Updates and changes are easier to manage.
Rational Strategy
Use when the decision is mostly structured. 1. Identify and define/describe the problem. 2. Define the requirements and goals of the decision. 3. Identify alternative solutions. 4. Define decision criteria. 5. Select appropriate decision-making process and tools. 6. Evaluate alternative solutions using criteria. 7. Check that the solution solves the problem.
Decision matrix
Used to compare different options based on several criteria. Each option is rated on the criteria, and the scores are compared to help decide which option is best. Each criterion is treated as equally important. Key idea: All criteria have the same importance.
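A minimal Python sketch of an unweighted decision matrix. The options, criteria, and 1-to-5 scores are made-up numbers for illustration; because every criterion counts equally, the final score is a plain sum.

```python
# Hypothetical scores for each option on a 1-5 scale (higher is better).
scores = {
    "Option A": {"price": 3, "quality": 3, "convenience": 5},
    "Option B": {"price": 4, "quality": 4, "convenience": 2},
}

# Every criterion has the same importance, so just add the scores.
totals = {name: sum(ratings.values()) for name, ratings in scores.items()}
best = max(totals, key=totals.get)
print(totals, "->", best)
```

Here Option A totals 11 and Option B totals 10, so Option A is chosen.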
supervised learning in machine learning applications
We give the ML algorithm known and labeled data to learn from. This is "supervised". Then using that knowledge, it looks at new, unknown data and uses what it learned to identify it.
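A minimal Python sketch of supervised learning. It uses a nearest-average classifier, which is only one simple technique among many; the fruit weights and labels are hypothetical training data.

```python
# Labeled training data: (weight in grams, known label). Values are made up.
training = [
    (120, "apple"), (130, "apple"), (150, "apple"),
    (900, "melon"), (1000, "melon"), (1100, "melon"),
]


def train(examples):
    """Learn the average weight for each label from the labeled examples."""
    totals = {}
    for weight, label in examples:
        total, count = totals.get(label, (0, 0))
        totals[label] = (total + weight, count + 1)
    return {label: total / count for label, (total, count) in totals.items()}


def predict(model, weight):
    """Identify new data by picking the label with the closest average."""
    return min(model, key=lambda label: abs(model[label] - weight))


model = train(training)
print(predict(model, 140))  # new, unlabeled item
print(predict(model, 950))
```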
unsupervised learning in machine learning applications
We give the ML algorithm unknown and unlabeled data to learn from. This is "unsupervised". It picks out patterns on its own (such as groups of similar items), then uses those patterns to categorize new, unknown data.
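A minimal Python sketch of unsupervised learning. It uses a simplified two-group version of k-means clustering, which is one common technique rather than the only one; the weights are hypothetical and carry no labels.

```python
# Unlabeled data: weights in grams with no names attached. Values are made up.
values = [120, 1000, 130, 900, 150, 1100]


def assign(value, centers):
    """Return the index of the group center closest to the value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def two_means(data, iterations=10):
    """Split the data into two groups by repeatedly re-averaging each group."""
    centers = [min(data), max(data)]  # simple deterministic starting guess
    for _ in range(iterations):
        groups = [[], []]
        for v in data:
            groups[assign(v, centers)].append(v)
        # Move each center to the average of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


centers, groups = two_means(values)
print(groups)                # the pattern the algorithm found on its own
print(assign(140, centers))  # which group a new, unknown item falls into
```

The algorithm never sees a label; it only discovers that the numbers fall into a light group and a heavy group.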
#1 Model
What-if analysis asks: "What happens if we change something?" It tests different scenarios to see how results change. Example: A company asks: What happens to profit if we raise the price from $10 to $12? The model shows the new profit. Key idea: Change a variable → see what happens.
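A minimal Python sketch of the price question. The $10 and $12 prices come from the example; the demand curve and the $4 unit cost are assumptions invented so the model has something to calculate.

```python
def units_sold(price):
    """Hypothetical demand curve: 100 fewer units sold per $1 price increase."""
    return 2000 - 100 * price


def profit(price, unit_cost=4):
    """Profit = margin per unit times units sold (assumed model)."""
    return (price - unit_cost) * units_sold(price)


baseline = profit(10)  # current price
scenario = profit(12)  # the "what if" price

print(f"Profit at $10: ${baseline:,}")
print(f"Profit at $12: ${scenario:,}")
print(f"Change: ${scenario - baseline:,}")
```

Under these assumptions, raising the price lifts profit from $6,000 to $6,400: fewer units are sold, but each one earns more.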
Good Model
is accurate, useful, and realistic
Structured data
is data that is organized in a clear format, usually in rows and columns in a database or spreadsheet. Because it is organized, computers can easily search, sort, and analyze it. Examples of structured data: customer names and IDs in a database, sales numbers in a spreadsheet, product prices, dates, and transaction records.
