MIS Final Take 2
Volume, Velocity, Variety
3 Characteristics of Big Data
CUSTOMER & ORDER
A CUSTOMER may have zero but can have many ORDERs. An ORDER must have one and only one CUSTOMER.
Orders
A CUSTOMER may have zero or many _____________?
Orders
A PRODUCT can have zero but can have many ___________?
Distributor
A ______________ may have zero but can have many ORDERS
Normalization
A method for analyzing and reducing a relational database to its most streamlined form to ensure minimum redundancy, maximum data integrity, and optimal processing performance
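A rough illustration of the redundancy that normalization removes: the sketch below splits a flat order list into two related tables. The rows and field names (customer_id, city, and so on) are hypothetical and chosen only for the example.

```python
# Hypothetical flat file: customer details repeat on every order row.
flat_orders = [
    {"order_id": 1, "customer_id": "C1", "customer_name": "Ana", "city": "Austin", "total": 40.0},
    {"order_id": 2, "customer_id": "C1", "customer_name": "Ana", "city": "Austin", "total": 15.5},
    {"order_id": 3, "customer_id": "C2", "customer_name": "Raj", "city": "Boston", "total": 22.0},
]

# Normalized form: each customer fact is stored once, keyed by customer_id.
customers = {}
orders = []
for row in flat_orders:
    customers[row["customer_id"]] = {"name": row["customer_name"], "city": row["city"]}
    # The order keeps only the key that links it back to its customer.
    orders.append(
        {"order_id": row["order_id"], "customer_id": row["customer_id"], "total": row["total"]}
    )

# Changing a customer's city is now a single update instead of one per order.
customers["C1"]["city"] = "Dallas"
```

With the customer data held in one place, the copies can no longer disagree with each other, which is the link between redundancy and inconsistency.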
Data Warehouse
A repository of historical data that are organized by subject to support decision makers in the organization
Web Browser
Allows users to access the world wide web (WWW)
Difficulties of Managing Data
Amount of data increases exponentially with time
ORDER & DISTRIBUTOR
An ORDER must have one and only one DISTRIBUTOR. A DISTRIBUTOR may have zero, but can have many ORDERs.
Customer
An ORDER must have one and only one ______________?
Products
An ORDER must have one but can have many ________?
ORDER & PRODUCT
An ORDER must have one, but could have many PRODUCTs. A PRODUCT may have zero, but could have many ORDERs.
Order
An ___________ must have one and only one DISTRIBUTOR.
Data Governance
An approach to managing information across an entire organization. Involves a formal set of business processes and policies that are designed to ensure that data are handled in a certain, well-defined fashion
Intellectual Capital/Assets
Another term for knowledge
Data independence
Applications and data are independent of one another, that is, applications and data are not linked to each other, so all applications are able to access the same data
Data isolation
Applications cannot access data associated with other applications
Data security
Because data are "put in one place" in databases, there is a risk of losing a lot of data at one time. Therefore, databases must have extremely high security measures in place to minimize mistakes and deter attacks
Bit
Binary digit that represents the smallest unit of data a computer can process. The term binary means that a bit can consist only of a 0 or a 1.
E-Commerce
Buying and selling of goods and services over the internet
Data Hierarchy
Data is organized in a hierarchy that begins with bits and proceeds all the way to databases
Difficulties of Managing Data
Data are scattered throughout organizations, collected by many individuals using various methods and devices, and usually stored on multiple servers and in multiple locations
Data integrity
Data meet certain constraints; for example, there are no alphabetic characters in a Social Security number field
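One way a database can enforce this kind of constraint is sketched below with Python's built-in sqlite3 module. The student table and the nine-digit rule are assumptions made for the example, not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK constraint allows exactly nine digits in ssn.
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        ssn TEXT NOT NULL
            CHECK (length(ssn) = 9 AND ssn NOT GLOB '*[^0-9]*')
    )
    """
)

conn.execute("INSERT INTO student (ssn) VALUES ('123456789')")  # accepted

try:
    conn.execute("INSERT INTO student (ssn) VALUES ('12345678A')")  # contains a letter
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The second insert fails the CHECK, so only the valid record is stored.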
Difficulties of Managing Data
Data rot
Clickstream Data
Data that visitors and customers produce when they visit a website and click on hyperlinks
Database
For example, the student course file could be grouped with files on students' personal histories and financial backgrounds to create a student database
Difficulties of Managing Data
Generated from multiple sources (internal, personal, and external, web: clickstream data)
E-Business
Includes E-commerce along with all activities related to internal and external business operations
Origin of the Internet
The Internet began as an emergency military communications system, then became a communications tool for scientists, and then a platform for business
Knowledge, Create, Capture, Refine, Store, Manage, Disseminate
KMS Cycle
Four Cardinality Symbols
Mandatory Single, Optional Single, Mandatory Many, Optional Many
Unstructured decisions
Operational Control: ---; Management Control: Negotiating, recruiting an executive, buying hardware, lobbying; Strategic Planning: New technology development, product R&D, social responsibility planning; IS Support: Decision support systems, expert systems, enterprise resource planning, neural networks, business intelligence, big data
Structured decisions
Operational Control: Accounts receivable, order entry; Management Control: Budget analysis, short-term forecasting, personnel reports, make-or-buy analysis; Strategic Planning: ---; IS Support: MIS, statistical models (management science, financial, etc.)
Semi-structured decisions
Operational Control: Production scheduling, inventory control; Management Control: Credit evaluation, budget preparation, plant layout, project scheduling, reward systems design; Strategic Planning: Building a new plant, mergers and acquisitions, planning; IS Support: Decision support systems, business intelligence
Difficulties of Managing Data
Over time, organizations have developed IT systems for specific business purposes, such as transaction processing, SCM, and customer relationship management, which can cause repetition and conflicts across the organization
World Wide Web (WWW)
Provides access to internet information through documents including text, graphics, audio, and video files that use a special formatting language called Hypertext Markup Language (HTML)
Data Rot
Refers primarily to problems with the media on which the data are stored. Over time, temperature, humidity, and exposure to light can cause physical problems with storage media, making the data difficult to access
Key Performance Indicators (KPIs)
Revenue, return on investment, overhead, and operational costs
Reintermediation
Steps are added to the value chain as new players find ways to add value to the business process
HTTP (Hypertext Transfer Protocol)
The Internet protocol that web browsers use to request and display web pages identified by a Uniform Resource Locator (URL)
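To make the request step concrete, here is a minimal sketch of the text a browser might send for a URL. The helper name build_get_request and the example.com address are illustrative assumptions; real browsers send many more headers.

```python
from urllib.parse import urlsplit

def build_get_request(url: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )

request_text = build_get_request("http://www.example.com/index.html")
```

The URL supplies both the server to contact (the Host header) and the page to ask for (the path on the request line).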
Granularity of the Data
The level of depth represented by the data in a fact or dimension table in a data warehouse
Data redundancy
The same data stored in multiple locations
Data inconsistency
Various copies of that data do not agree
Innovate
What was the name of the system McDonald's was trying to develop?
What-If Analysis
•(Think of Excel) A model builder must make predictions and assumptions regarding the input data, many of which are based on the assessment of uncertain futures •Results depend on the accuracy of these assumptions •Attempts to predict the impact of a change in the assumptions (inputs) on the proposed solution
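A small sketch of a what-if question, much like changing one cell in a spreadsheet. The profit model and every figure in it are hypothetical.

```python
def profit(units_sold: float, price: float, unit_cost: float, fixed_costs: float) -> float:
    """Simple profit model: revenue minus variable and fixed costs."""
    return units_sold * (price - unit_cost) - fixed_costs

# Baseline assumptions (all hypothetical figures).
baseline = {"units_sold": 1000, "price": 20.0, "unit_cost": 12.0, "fixed_costs": 5000.0}

# What if demand comes in 10% lower than assumed?
scenario = dict(baseline, units_sold=900)

baseline_profit = profit(**baseline)   # 1000 * 8 - 5000
scenario_profit = profit(**scenario)   # 900 * 8 - 5000
impact = scenario_profit - baseline_profit
```

Here the lower demand assumption cuts profit from 3000 to 2200, so the answer is only as good as the demand estimate behind it.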
Business Intelligence
•A broad category of applications, technologies, and processes for gathering, storing, accessing, and analyzing data to help business users to make better decisions •Enable decision makers to quickly ascertain the status of a business enterprise by examining key information
Data File
•A collection of logically related records •In a file management environment, each application has a specific data file related to it that contains all the data records the application requires
Foreign Key
•A field or group of fields in one table that uniquely identifies a row of another table •Used to establish and enforce a link between the two tables
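The link between two tables can be sketched with Python's built-in sqlite3 module, reusing the CUSTOMER and ORDER rule from these cards. The column names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        -- NOT NULL + REFERENCES: every order has one and only one customer
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
    )
    """
)

conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders (order_id, customer_id) VALUES (100, 1)")  # valid link

try:
    conn.execute("INSERT INTO orders (order_id, customer_id) VALUES (101, 999)")  # no such customer
    link_enforced = False
except sqlite3.IntegrityError:
    link_enforced = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

A customer row with no orders is still allowed, which matches "a CUSTOMER may have zero but can have many ORDERs".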
Byte
•A group of 8 bits •Represents a single character •Can be a letter, a number, or a symbol
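A quick sketch of one character stored as one byte of 8 bits, assuming ASCII encoding (other encodings can use more than one byte per character).

```python
char = "A"
code = ord(char)                 # numeric code for the character
bits = format(code, "08b")       # the same value written as 8 binary digits
as_bytes = char.encode("ascii")  # one ASCII character occupies one byte
```

Here bits is "01000001", which is the number 65 written in binary.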
Record
•A logical group of related fields, such as a student's name, courses taken, the date, and the grade •For example: in the iTunes store, a song is a field in the record, with other fields containing the song's title, its price, and the album on which it appears
Table or data file
•A logical grouping of related records •Also called a data file •For example: a grouping of records from a particular course, consisting of course number, professors, and students' grades
Data Mart
•A low-cost, scaled-down version of a data warehouse that is designed for the end-user needs in a strategic business unit (SBU) or individual department •Can be implemented more quickly than data warehouses, often in less than 90 days •Supports local rather than central control
OLAP (online analytical processing)
•A set of capabilities for slicing and dicing data using dimensions and measures associated with the data •Involves the analysis of accumulated data by end users
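A loose sketch of the idea in plain Python: the sales figure is the measure, and region, product, and quarter are the dimensions. The rows are invented for illustration, and real OLAP tools work on far larger stores.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions (region, product, quarter) and one measure (sales).
facts = [
    {"region": "East", "product": "Nuts", "quarter": "Q1", "sales": 50},
    {"region": "East", "product": "Bolts", "quarter": "Q1", "sales": 40},
    {"region": "West", "product": "Nuts", "quarter": "Q1", "sales": 60},
    {"region": "West", "product": "Nuts", "quarter": "Q2", "sales": 70},
]

def total_by(rows, dimension):
    """Sum the sales measure for each value of one dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row["sales"]
    return dict(totals)

# Slice: fix one dimension (quarter) to a single value.
q1_slice = [row for row in facts if row["quarter"] == "Q1"]

# Summarize the same slice along two different dimensions.
q1_by_region = total_by(q1_slice, "region")
q1_by_product = total_by(q1_slice, "product")
```

The same Q1 data answers two different questions depending on which dimension is used to group it.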
Metadata
•A set of data that describes and gives information about other data •Important to maintain •Used by IT personnel and users •Users' needs include data definitions, report/query tools, report distribution information, and contact information
Data Visualization
•After data are processed, they can be presented to users in visual formats such as text, graphics, and tables •Becoming increasingly popular on the web for decision support •Applications: geographic information systems and reality mining
Intermediaries
•Agents, software, or businesses that provide a trading infrastructure to bring buyers and sellers together •(Disintermediation, reintermediation, cybermediation)
Predicting Airline Arrivals More Accurately
•Air travelers are accustomed to long flight delays and cancellations for any number of reasons •Few airlines can accurately predict flight arrivals •Sponsored by Alaska Airlines and General Electric: the Flight Quest contest aimed at developing an algorithm that could help airlines better predict flight arrivals •A team from Singapore won, making arrival estimates 40% more accurate, which helped reduce congestion, manage crews more efficiently, and save travelers up to five minutes at the gate •Each minute saved in a flight saves $1.2 million in annual crew costs and $5 million in fuel savings
Secondary Key
•Another field that has some identifying information, but typically doesn't identify the record with complete accuracy •For example: a student major might be a secondary key if a user wanted to identify all students of a certain major
Types of E-Commerce
•B2B: A business sells products and services to customers who are primarily other businesses (where the largest amount of e-commerce money is generated) •B2C: A business sells products and services to customers who are primarily individuals (the glitzy e-commerce like Amazon, iTunes, eBay, etc.)
Relational Data Model
•Based on the concept of 2-dimensional tables •Designed with a number of related tables, each of which contains records (rows) and attributes (columns) •To be valuable, must be organized so that users can retrieve, analyze, and understand the data they need •Key to designing effectively is the data model
Three Broad Business Models
•Bricks-and-mortar •Clicks-and-mortar/Bricks-and-clicks •Pure Play (virtual organization)
Decision Support Systems (DSS)
•Combine models and data to analyze semi-structured problems and some unstructured problems that involve extensive user involvement •Can enhance learning and contribute to all levels of decision making •Have related capabilities of sensitivity analysis, what-if analysis, and goal-seeking analysis
Explicit Knowledge
•Consists of policies, procedural guides, reports, products, strategies, core competencies, and IT infrastructure of the enterprise •The knowledge that has been codified/documented in a form that can be distributed to others or transformed into a process or strategy
Tacit Knowledge
•Cumulative store of subjective or experiential learning •Consists of an organization's experiences, insights, expertise, know-how, trade secrets, skill sets, understanding, and learning •Generally imprecise, costly to transfer, highly personal •Example: a salesperson who has worked with particular customers over time and has come to know their needs quite well
KMS Cycle
•Cyclical because knowledge is dynamically refined over time •Knowledge in an effective KMS is never finalized because the environment changes over time and the knowledge must be updated
Data Warehouse Given Nordea Bank a Single Version of the Truth
•Data warehouse implemented because of new financial regulations and business challenges •Managed by a finance group because they "owned" the data and data-management processes •Objectives: to improve customer service and to comply with all relevant regulations •Nordea was created from the merger of four separate financial institutions, each with its own legacy IS •Data at Nordea were governed by manual processes, stored on spreadsheets, and managed locally, which was inadequate to meet the demands of a modern global banking system (management problems) •As a result of the management problems, the most important principle for the data warehouse was to create a single version of the truth. The finance team built common data definitions and master data to compare variables across geographies and business functions •Based in Stockholm, stores 11 terabytes and 7 billion records •Reporting lead times decreased from 8 to 4 days, which enabled the bank to conduct analyses more quickly, accurately, and cheaply •Can drill down to customer, account, and product in one source •Current financial climate requires banks to focus more carefully on compliance •Now Nordea can meet regulations quickly and accurately
Emergency Department Information Exchange (EDIE)
•Database that contains the records of each patient treated in every hospital ER in the state (Washington) •Allows physicians to track patients' ER visits to multiple hospitals •Previously many tried regional databases, but many hospitals didn't join •State would not reimburse patients for more than 3 hospital visits per year and made a list of 500 medical problems (ex: bronchitis) that couldn't be reimbursed in the ER •EDIE meets federal health privacy laws by allowing only approved staff members to access patient data •When a patient registers in the ER, a fax/email is received with all of the patient's recent ER admissions, diagnoses, and treatments •After patients leave, staff are able to track their care, such as sending paramedics to check on high-risk patients
OLTP (Online Transaction Processing)
•Database that holds detailed and current data; the schema used to store transactional databases is the entity model •Business transactions are processed online as soon as they occur
Entity-Relationship Modeling
•Designers plan and create databases through this process •Uses an entity-relationship diagram •ER diagrams consist of entities, attributes, and relationships •Valuable because it allows database designers to communicate with users throughout the organization to ensure that all entities and the relationships among these entities are represented •To properly identify entities, attributes, and relationships, database designers first identify the business rules
Attribute
•Each characteristic or quality of a particular entity •For example: if our entities were a customer, an employee, and a product, entity attributes would include customer name, employee number, and product color
Primary Key
•Every record in a database must contain at least one field that uniquely identifies that record so it can be retrieved, updated, and sorted •For example: A student record in a US university would use a student ID number as its primary key
Dashboards
•Evolved from executive information systems •Provide easy access to timely information and direct access to management reports •User friendly, supported by graphics, and enable managers to examine exception reports and drill down into detailed data
Google's Knowledge Graph
•Example of a database, with 600 million+ entries and 18 billion+ links •Considered a vast database that enables Google software to connect facts on people, places, and things to one another •Purpose is to enable Google's future products to truly understand the people who use them and the things they care about •Interprets a searcher's query in a much more sophisticated way and directly retrieves relevant information •Is integrated into YouTube, to be used to organize videos by topic and to suggest new videos to users based on previously watched videos •Ex: Can use relationships among entities to enable the Knowledge Graph to determine if two famous people are married
Significance in the evolution of the Internet
•Global Reach •Abundant (almost limitless) information •Create/enhance relationships •Interaction •Entertainment
Why might the internet be called the "ultimate disruptor" to business?
•Internet is one of the biggest forces changing business •Organizations must be able to transform/evolve as markets, economic environments, and technologies change •Focus on the unexpected
Why do Managers Need IT Support
•Key to good decision making is to explore and compare many relevant alternatives; the more alternatives there are, the more the decision maker needs computer-assisted searches and comparisons •Most decisions must be made under time pressure •Decisions are becoming more complex, so it is usually necessary to conduct a sophisticated analysis to make a good decision •Necessary to rapidly access remote information, consult with experts, or conduct a group decision-making session, all without incurring major expenses •To make quality decisions
Knowledge Management
•Knowledge is a vital asset •A process that helps organizations manipulate important knowledge that comprises part of the organization's memory, usually in an unstructured format •To be successful, knowledge, as a form of capital, must exist in a format that can be exchanged among many people, i.e. it must be able to grow •Distinct from data and info •Contextual, and relevant, useful •Knowledge is information in action •Goal is to help an organization make the most productive use of the knowledge it has accumulated
Field
•Logical grouping of characters into a word, small group of words, or an identification number •For example, a student's name in a university's computer files would appear in the name field, and a SSN would be in the social security number field •Can also contain data other than texts and numbers, such as images or other multimedia, such as a driver's license photograph or a voice sample
MetLife Wall (2013)
•MetLife is the largest global provider of insurance, annuities, and employee benefit programs •Inspired by the Facebook "wall" •Relied heavily on humans to integrate systems and databases. Customer service reps and claims researchers had to access multiple applications and use 40+ screens to gather all the data and documents needed to answer customer questions, which decreased worker productivity and customer satisfaction •Integrates different sources of customer data to let representatives review customers' histories, conversations with the company, claims filed and paid, and their various policies, all on a simple timeline •Answers questions more efficiently and quickly, which increased customer satisfaction •In addition to the Wall, MetLife is using MongoDB to store resumes and a mobile app that can upload documents, videos, photos, etc., to be shared with select individuals at some future date, such as a life insurance policy made available to beneficiaries after the policyholder has passed away
Disintermediation
•Occurs when a business sells directly to the customer online and cuts out the intermediary •One of the most pressing EC issues relating to online services and marketing tangible products •Intermediaries/middlemen (two functions: provide info and perform value added services) are eliminated
Big Data
•Organizations must manage huge quantities of data, consisting of structured and unstructured data, which are called big data •Essentially about prediction. The predictions don't come from "teaching" computers to "think" like humans; instead, they come from applying mathematics to huge quantities of data to infer probabilities •For example: the likelihood that an email is spam, or the likelihood that the typed letters "teh" are supposed to be "the" •Diverse, high-volume, high-velocity information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization •Vast data sets that exhibit variety •Include structured, unstructured, and semi-structured data
Entity
•Person, place, thing, or event (such as a customer, employee or product) about which information is maintained •Usually identified in user's work environment •A record generally describes an entity
Data Dictionary
•Provides information on each attribute, such as its name, if it is a key, part of a key, or a non key, the type of data expected (alphanumeric, numeric, alphabetical, dates, etc.) •Can provide information on why the attribute is needed in the database; which business functions, applications, forms, and reports use the attribute, and how often the attribute should be updated
Data Quality
•Quality must meet users' needs. If it doesn't, users won't trust the data or use it •Data in source systems is poor and must be improved before use with data-cleansing software, but it is better to improve it at the source-system level •For example: A hotel was keeping track of zip codes but found many of them were 99999. This occurred because clerks weren't asking customers for zip codes but needed to enter some information.
Mobile Commerce (M-Commerce)
•Refers to e-commerce that is conducted entirely in a wireless environment •Example: using cellphones to shop over the internet
Instance
•Refers to each row in a relational table, which is a specific, unique representation of the entity •For example: your university's student database contains an entity called STUDENT. An instance of the STUDENT entity would be a particular student. For instance, you are an instance of the STUDENT entity in your university's student database
Cardinality
•Refers to the maximum number of times an instance of one entity can be associated with an instance in the related entity •Four symbols: Can be mandatory single, optional single, mandatory many, or optional many
Knowledge Management Systems (KMS)
•Refers to the use of modern IT (the Internet, intranets, extranets, databases) to systematize, enhance, and expedite intrafirm and interfirm knowledge management •Intended to help an organization cope with turnover, rapid change, and downsizing by making the expertise of the organization's human capital widely accessible •Benefits: best practices (the most effective and efficient ways of doing things) are readily available to a wide range of employees, which improves overall performance •Challenges: employees must be willing to share personal knowledge, and the organization must maintain and upgrade its knowledge base, adding new knowledge and getting rid of old
How Much Rent Can You Charge? RentRange
•Renting out homes was largely a mom-and-pop business until the US housing bust (2006-12) •After bust, large firms began investing substantial sums of money in this market •RentRange: provides analyses about purchasing foreclosed and low-priced single family homes then renting them out •Uses data from 12 million properties to estimate how much rent a property is likely to generate •Customers pay $50,000 for 5 years of rent data at many levels (state, county, zip code, etc.) •Accuracy between 1-3%
Goal-Seeking Analysis
•Represents a backward solution approach •Attempts to calculate the value of the inputs necessary to achieve a desired level of output
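The backward approach can be sketched as a search: start from the desired output and hunt for the input that produces it. The profit model, the target, and the bisection method used here are illustrative assumptions; spreadsheet goal-seek tools may work differently.

```python
def profit(units_sold: float) -> float:
    """Hypothetical model: $8 margin per unit and $5,000 of fixed costs."""
    return units_sold * 8.0 - 5000.0

def goal_seek(model, target, low, high, tolerance=1e-6):
    """Bisection search for the input whose output equals target.

    Assumes model is increasing between low and high and that the
    target output lies between model(low) and model(high).
    """
    while high - low > tolerance:
        mid = (low + high) / 2
        if model(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

units_needed = goal_seek(profit, target=3000.0, low=0.0, high=10_000.0)
```

For a target profit of 3000, the search settles on about 1000 units, the input that a forward what-if run would turn back into 3000.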
Database Management Systems (DBMS)
•Set of programs that provide users with tools to create and manage a database •Managing a database refers to the processes of adding, deleting, modifying, and analyzing data stored in a database •Access data by using query and reporting tools within the DBMS •Provide mechanisms for maintaining the integrity of stored data, managing security and user access, and recovering information if the system fails •Must be carefully managed because they are essential to all areas of business
Structured Query Language
•The most popular query language for requesting information from a relational database •Allows people to perform complicated searches by using relatively simple statements or key words (such as select, from, where)
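A minimal sketch of the select, from, and where key words in use, run through Python's built-in sqlite3 module. The student table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student (name, major, gpa) VALUES (?, ?, ?)",
    [("Ana", "MIS", 3.8), ("Raj", "Finance", 3.4), ("Lee", "MIS", 3.1)],
)

# SELECT names the fields, FROM names the table, WHERE filters the records.
rows = conn.execute(
    "SELECT name, gpa FROM student WHERE major = ? ORDER BY gpa DESC",
    ("MIS",),
).fetchall()
```

The query returns only the MIS students, highest GPA first. Here major acts as a secondary key: it identifies a group of records, not a single one.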
Data Mining
•The process of searching for valuable business information in a large database, warehouse, or mart •Performs 2 operations: 1) Predicting trends and behaviors 2) identifying unknown patterns •Helps to explain why it is happening and what will happen
Sensitivity Analysis
•The study of the impact that changes in one or more parts of a decision-making model have on other parts •Most sensitivity analyses examine the impact that changes in input variables have on output variables •Generally performed to determine the impact of environmental variables on the results of the analysis •Valuable because it enables the system to adapt to changing conditions and the varying requirements of different decision-making situations
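One simple way to examine input-to-output impact is to change each input by the same percentage, one at a time, and compare the effect on the output. The model, the figures, and the 10% step below are assumptions for illustration.

```python
def profit(inputs):
    """Hypothetical profit model used only for illustration."""
    return inputs["units_sold"] * (inputs["price"] - inputs["unit_cost"]) - inputs["fixed_costs"]

baseline = {"units_sold": 1000, "price": 20.0, "unit_cost": 12.0, "fixed_costs": 5000.0}
base_profit = profit(baseline)

# Raise each input by 10% in turn, holding the others at baseline.
sensitivity = {}
for name, value in baseline.items():
    changed = dict(baseline, **{name: value * 1.10})
    sensitivity[name] = profit(changed) - base_profit

# Inputs ordered from largest to smallest absolute effect on profit.
ranked = sorted(sensitivity, key=lambda name: abs(sensitivity[name]), reverse=True)
```

In this made-up model, price moves profit the most and fixed costs the least, so the price assumption deserves the closest scrutiny.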
Data Dashboard
•Think of MBS- presents information on sales and daily shipments •Usually dynamically updated and interactive
Data Governance
•To ensure that BI is meeting their needs, organizations must implement governance to plan and control their BI activities •Requires that people, committees, and processes be in place •Effective governance: ensures strategies are aligned, prioritizes projects, and allocates resources
Big Data Variety
•Traditional data formats tend to be structured and relatively well described, and they change slowly. They include financial market data, POS transactions, and more. •In contrast, big data formats change rapidly •For example: satellite imagery, broadcast audio streams, digital music files, web page content, scans of government documents, and comments posted on social networks
ConocoPhillips
•US multinational energy corporation •First step is to find oil and gas, then drill •PLOT: plunger lift surveillance and optimization software tool •PLOT gathers data specifically related to CP's use of plunger lifts, which are installed inside a well to lift fluids that are stopping the flow of gas