BSAD 353: Final Exam
Embedded
executes when an event attached to a control or object occurs
Data mining
used to search data warehouses
Data manipulation subsystem
provides tools for maintaining and analyzing data •Data maintenance •Analysis tools
Data
- Facts or observations about people, places, things, and events - Used to be only numbers, letters, and symbols but now also include audio, music, photographs, and video
Data warehouse:
-Database that stores current and historical data that may be of interest to decision makers -Consolidates and standardizes data from many systems, operational and transactional databases -Data can be accessed but not altered
Program Maintenance
75% of the total lifetime cost of an application is for maintenance. Ensures the program is •Error-free •Effective •Efficient. Two activity categories: 1. Operations •Patches - programming modifications or corrections •Software updates - significant patches 2. Changing needs •Agile development - starts with getting the core functionality working then expands through to customer satisfaction
Phase 2: Systems Analysis
Data is collected about the present system and then analyzed to determine the new requirements 1. Gather data 2. Analyze data 3. Create summary
Batch Processing
Data is collected over a period of time and the processing happens later all at one time
Multidimensional
Data stored in data cubes with three or more dimensions A variation and an extension of the relational model •Data cube: Extension of the two dimensional data model to include additional or multiple dimensions •Good for representing complex relationships •Advantages over relational databases
Record
collection of related fields
Table
collection of related records
Data dictionary or schema
contains a description of the structure of data
Stand-alone
created and used independently of other controls or objects
Logic view
data is organized into groups or categories
Transitive dependency
exists when the value of a non-key field is functionally dependent on the value of another non-key field.
Physical view of data
focuses on the actual format and location of the data
Logical view of data
focuses on the meaning, content, and context of the data
Multidimensional Data Model
is typically used for the design of corporate data warehouses and departmental data marts. This graphic illustrates a data cube composed of three dimensions: actual/projected sales, product, and region. This view shows product versus region. If you rotate the cube 90 degrees, the face that will show is product versus actual and projected sales. If you rotate the cube 90 degrees again, you will see region versus actual and projected sales. Other views are possible. The point is to understand differences between actual and projected sales by looking at region and product. This is an ideal problem for pivot tables in Excel because two of the variables are categorical (product and region) and one is interval.
Data Maintenance
maintaining data
Security
limited access
Phase 5: Systems Implementation
process of changing (converting) from the old system to the new one and training people to use the new system.
Patches
programming modification or corrections
Application generation subsystem
provides tools to create data entry
Data warehouse
storing in a database for special use
Individual
Integrated files used by one person
WHERE
- Sets the criteria for the rows in the results
Periodic evaluation
All systems should be evaluated from time to time
Computer-aided software engineering (CASE)
Automates portions of the development process
Software made up of
DBMS engine Data definition subsystem Data manipulation subsystem Application generation subsystem Data administration subsystem
Input data
•Determine the source of the data
Operational feasibility
asks how it will be received by all users
Economic feasibility
asks will a new system be economical?
Technical feasibility
asks is it technically possible?
Control Navigation
- A navigation form can be created to create a menu of commonly used objects. This is a tabbed menu system that ties the objects in the database together so that the database is easy to use. - An advantage of having a navigation form is that it gives the users a guide and helps make sure they are using the forms and reports that they should be using.
Advantage of an Executable Form of a Database
- An Access database executable (ACCDE) file is made to prohibit users from making design and name changes to forms or reports within the database and prohibits users from creating new forms and reports
Table Analyzer Tool
- Analyzes the tables in a database and makes suggestions to minimize duplication of data. - Might involve splitting tables into smaller, related tables. - The Table Analyzer Wizard can decide for you or you can take control of the process by deciding which fields should be included in new tables.
Database Documenter Tool
- Creates a report that contains detailed information for each selected object in a database. It shows the field names, data types, properties, indexes, and permissions of each selected object. - Can also be used to get information that is not apparent by looking at an object, such as the last time it was modified or when a field in a table was last modified
Identify When to use a Data Macro
- Data macros are used to validate and ensure the accuracy of data in a table. - Data macros can only be associated with table events.
ORDER BY
- Determines how the records will be sorted
Performance Analyzer Tool
- Evaluates each object in a database and makes recommendations for optimizing the database. It lists three kinds of analysis results—recommendations, suggestions, and ideas.
The benefits of normalization are:
- Minimization of data redundancy - Improvement of referential integrity enforcement - Ease of maintaining data (add, update, delete) - Accommodation of future growth of a database
Purpose of a Macro
- Macros can be used to automate repetitive tasks or perform a specific action. - They can group a series of commands and instructions into a single database object to accomplish repetitive or routine tasks simply by executing the macro.
Second Normal Form (2NF)
- Meets first normal form criteria and all non-key fields are functionally dependent on the entire primary key. - A non-key field is a field that is not part of the primary key.
Third Normal Form (3NF)
- Meets second normal form criteria and has no transitive dependencies. - This may mean that new tables must be created that can reference other tables. Sometimes this may require large amounts of tedious data entry, but it would mean there is less chance of error if information is not repeated in several tables.
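A minimal sketch of removing a transitive dependency, assuming a hypothetical orders table in which customer_city is determined by customer_id (a non-key field) rather than by the key order_id. The table names, field names, and values are illustrative only.

```python
# Hypothetical rows: customer_city depends on customer_id, a non-key field,
# so the table contains a transitive dependency and repeats the city.
orders = [
    {"order_id": 1, "customer_id": "C1", "customer_city": "Omaha", "total": 40},
    {"order_id": 2, "customer_id": "C1", "customer_city": "Omaha", "total": 15},
    {"order_id": 3, "customer_id": "C2", "customer_city": "Boise", "total": 22},
]

# Move the dependent field into its own table keyed by customer_id.
customers = {
    row["customer_id"]: {"customer_city": row["customer_city"]} for row in orders
}

# The orders table keeps only the fields that depend on order_id,
# plus customer_id as the link to the new customers table.
orders_3nf = [
    {"order_id": r["order_id"], "customer_id": r["customer_id"], "total": r["total"]}
    for r in orders
]

print(customers)
print(orders_3nf)
```

After the split, each city is stored once per customer, so a change of city is made in one place.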
Phased
- New system is introduced a little at a time - Expensive but least risky, preferred when people in an organization perform different operations
Pilot
- New system is tried by one section of the organization, then another, etc. - Less expensive but riskier, preferred when there are many people performing similar operations
Parallel
- Old and new systems operate side by side until new system is reliable - Low risk but expensive trying to manage two systems at the same time
Finalize the Design
- Once new tables have been created, relationships should be created to complete the normalization process. The tables should be connected and Enforce Referential Integrity options should be checked.
Direct
- Out with the old and in with the new - Risky as the old system is no longer available to fall back on if the new one is wrong
SELECT
- Specifies the fields to include in the query
FROM
- Specifies the table or tables where the fields can be found
Structured Query Language
- The industry-standard language for defining, manipulating, and retrieving data in a database. •You are learning the data-retrieval and -manipulation language of all the industry-leading databases.
Database Splitter Tool
- The main advantage is that it improves reliability, security, and flexibility. - The front-end database contains the queries, forms, and reports, while the back-end database contains the tables of the database. - The tables of the database are stored in a separate location from the queries, forms, and reports. - If the front-end database is corrupted, it will not affect the back-end database, because they are two separate files.
Program Documentation
- Written descriptions and procedures about a program and how to use it - Carried on throughout the programming steps - Important for people who will use and/or support the program •Users need to know how to use the software •Operators need to know what to do about any error messages •Programmers may not even remember all the details •Those taking over the program will need to know details
•Analytical CRM
-Based on data warehouses populated by operational CRM systems and customer touch points -Analyzes customer data (OLAP, data mining, etc.) §Customer lifetime value (CLTV)
•Operational CRM
-Customer-facing applications -Sales force automation -Call center and customer service support -Marketing automation
Information policy
-States organization's rules for organizing, managing, storing, sharing information
Data mart
-Subset of data warehouses that is highly focused and isolated for a specific population of users
Macro
A series of actions that can be programmed to automate tasks. There are two types of macros: stand-alone macros and embedded macros.
System Analysis and Design
A system is defined as a collection of activities and elements organized to accomplish a goal Six-phase problem-solving procedure for examining and improving an information system
Named Data Macro
Access also allows you to create named data macros that can be accessed from anywhere in the database, including running them from within another macro.
Real-Time Processing
Also known as online processing because it happens immediately during the transaction
Pseudocode
An outline of the logic of the program you will write •Summary of the program before it is written
Company
Common operational or commonly used files shared in an organization
Automated design tools
Computer-aided software engineering tools (CASE) These tools relieve the systems analysts of many repetitive tasks, develop clear documentation, and, for larger projects, coordinate team member activities
First Normal Form (1NF)
Contains no repeating groups or repeated columns.
Manual testing with sample data
Correct and incorrect data manually entered, results evaluated
Relational
Data stored in tables consisting of rows and columns A more flexible type where there are no access paths down a hierarchy •Data stored in table called a relation •Tables linked via a common data item •Tables consist of rows and columns
Hierarchical
Data structured in nodes organized like an upside-down tree; each child node can have only one parent Nodes: points connected like branches of an upside-down tree •One parent per node •A parent can have several child nodes •One-to-many relationship (*) A major concern is that if a parent node is deleted then so are all of its subordinate child nodes
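A small sketch of the deletion concern in a hierarchical structure, assuming a hypothetical tree stored as a child-to-parent mapping so that each child has exactly one parent. The node names and the delete_node helper are illustrative only.

```python
# Hypothetical hierarchy: each child node records its single parent.
parent_of = {
    "Sales": "Company",
    "Support": "Company",
    "East": "Sales",
    "West": "Sales",
}

def delete_node(tree, node):
    """Return a copy of the tree without node or any of its descendants."""
    doomed = {node}
    changed = True
    while changed:  # keep sweeping until no new descendants are found
        changed = False
        for child, parent in tree.items():
            if parent in doomed and child not in doomed:
                doomed.add(child)
                changed = True
    return {c: p for c, p in tree.items() if c not in doomed}

remaining = delete_node(parent_of, "Sales")
print(remaining)  # only Support is left under Company
```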
Database administration
Database design and management group responsible for defining and organizing the structure and content of the database, and maintaining the database.
Distributed
Database spread geographically and accessed using database server
Program Test
Debugging: The process of testing and then eliminating errors such as: •Syntax errors •Logic errors •Testing process
An SQL keyword
Defines the purpose and the structure of an SQL statement. There are four basic keywords in SQL: Select, From, Where, Order by
Repetition structure or Loop Structure
Describes a process that may be repeated as long as certain condition remains true
Logic Structures
Enables you to write structured programs, which take much of the guesswork out of programming •Sequential structure •Selection structure •Repetition structure or Loop Structure
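A short sketch showing the three logic structures together in one function. The function name and the task (summing even numbers while counting down) are assumptions chosen only for illustration.

```python
def sum_even_down_from(start):
    total = 0                  # sequential: statements run one after another
    while start > 0:           # repetition: repeats while the condition is true
        if start % 2 == 0:     # selection: a decision is made
            total += start
        start -= 1
    return total

print(sum_even_down_from(4))
```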
Flowcharts
Graphically present the detailed sequence of steps needed to solve a programming problem
Types of Databases
Individual, Company, Distributed, and Commercial
Commercial
Information utilities or data banks available to users on a wide range of topics •ProQuest Dialog •Dow Jones Factiva •LexisNexis
Generations of Programming Languages
Levels or Generations •Coding from machine languages to human or natural languages There are five distinct generations •Lower level is closer to machine language •Higher level is closer to human-like language
Network
Like hierarchical except that each child can have several parents Has a hierarchical node arrangement •Each child node may have more than one parent node •many-to-many relationship (*) •Additional connections between parent and child are pointers •Nodes can be reached through multiple paths
Data Organization
Logic view Key Field or Primary Key
Collection of integrated data
Logically related files and records
Sharing
between departments of an organization
Database model
Defines the rules and standards for data in a database
Functional dependency
Occurs when the value of one field is determined by the value of another.
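One way to check a functional dependency in sample data, sketched under the assumption that records are held as dictionaries. The determines helper and the dept_id and dept_name fields are hypothetical.

```python
def determines(rows, a, b):
    """Return True if every value of field a maps to exactly one value of field b."""
    seen = {}
    for row in rows:
        # setdefault stores the first b seen for this a and returns it afterwards
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True

rows = [
    {"dept_id": "D1", "dept_name": "Sales"},
    {"dept_id": "D1", "dept_name": "Sales"},
    {"dept_id": "D2", "dept_name": "Support"},
]
conflicting = rows + [{"dept_id": "D1", "dept_name": "Finance"}]

print(determines(rows, "dept_id", "dept_name"))
print(determines(conflicting, "dept_id", "dept_name"))
```

A check like this can only disprove a dependency from sample data; whether one truly holds is a design decision about what the fields mean.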
Object-oriented
Organizes data using classes, objects, attributes, and methods Store data as well as instructions to manipulate data •Classes - general definitions •Objects - specific instances of class containing data and instructions to manipulate the data •Attributes - data fields an object possesses •Methods - instructions for retrieving or manipulating attribute values
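A minimal sketch of the four terms, assuming a hypothetical Account class. The class, attribute, and method names are illustrative only.

```python
class Account:                       # class: the general definition
    def __init__(self, owner, balance):
        self.owner = owner           # attribute: a data field of the object
        self.balance = balance       # attribute

    def deposit(self, amount):       # method: instructions that manipulate attributes
        self.balance += amount
        return self.balance

acct = Account("Ana", 100)           # object: a specific instance of the class
acct.deposit(50)
print(acct.balance)
```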
Program Design
Plan a solution using structured programming techniques •Techniques •Top-down design •Pseudocode •Flowcharts •Logic structures Plan -> Document -> CODE
Desk checking or Code review
Printout of program reviewed line by line
Program's objectives
Requires a clear statement of the problem being addressed
Data administration
Responsible for specific policies and procedures through which data can be managed as a resource
Data Analysis Tools - Grid Chart
Show the relationship between input and output documents
Database Management System (DBMS)
Software that enables users to create, modify, and gain access to data
Systems audit
System's performance is compared to the original design specs to determine productivity
Documenting
Systems Analyst Report
Phase 6: Systems Maintenance
Systems maintenance is an ongoing activity
Event-Driven Data Macro
Table events occur naturally as users enter, edit, and delete table data. Data macros that are associated with tables include Before Change and After Delete, and can be programmed to run before or after a table event occurs.
Beta testing
Testing by a select group of potential users; users provide feedback
Testing sample data on the computer
Tests for logic errors
Six-Step Software Development Life Cycle
The six steps are as follows: 1.Program specification 2.Program design 3.Program code 4.Program test 5.Program documentation 6.Program maintenance
Desired output
The end-user should communicate the inputs and outputs
Phase 1: Preliminary Investigation
The preliminary investigation determines the need for a new information system 1. Define the problem 2. Suggest alternatives 3. Prepare report
Syntax errors
are violations of the rules of the programming language
Phase 4: Systems Development
Three steps 1. Acquire the software 2. Acquire the hardware 3. Test the new system
Phase 3: Systems Design
Three tasks •Design the alternatives •Select the best system •Write a systems design report
Top-Down Program Design
To identify the program's processing steps; called modules •Each module is made up of logically related program statements
Key Field or Primary Key
Unique identifier used to create relationships between tables
Top-down analysis method
Used to identify the top-level components of a complex system and each component is broken down into small components making analysis easier
An SQL SELECT statement
Used to retrieve data from tables in a database, just like a select query.
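A small sketch of the four basic keywords working together. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for whichever database the course uses; the products table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.execute("CREATE TABLE products (name TEXT, region TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("Desk", "East", 200.0), ("Lamp", "West", 35.0), ("Chair", "East", 80.0)],
)

rows = conn.execute(
    "SELECT name, price "        # fields to include
    "FROM products "             # table where the fields are found
    "WHERE region = 'East' "     # criteria for the rows
    "ORDER BY price"             # how the results are sorted
).fetchall()
conn.close()

print(rows)
```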
3Vs
Volume, variety, velocity
Coding
Write the program •A programming language uses a collection of symbols, words, and phrases that instruct a computer to perform specific operations
Program Code
Writing the program is called coding Logic -> Language -> Code -> TEST
Attempt at translation
Written program goes through translator program on the computer, must be syntax error free
DBMS engine
a bridge between the logical view of data and the physical view of data
Testing process
involves one or more of several methods
Data integrity
accurate updating of files
Prototyping
building a model of the new system for trial
Less data redundancy
decrease unnecessary duplication
Data definition subsystem
defines the logical structure by using data dictionary or schema that contains a description of the structure of data
DBMS Structure
designed to work with data that is logically structured or arranged in a particular way
Clustering
discovering as yet unclassified groupings
Sequences
events linked over time
Field
group of related characters
Data administration subsystem
helps you manage the overall database environment •Database Administrators (DBAs) administer the database •Processing rights to determine who has access to the databases
Database
integrated collection of logically related tables
Logic error
occurs when the programmer uses an incorrect calculation or leaves out a programming procedure
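A brief sketch contrasting a logic error with a syntax error. The function names and sample values are assumptions made for illustration.

```python
def average_wrong(values):
    # Logic error: runs without complaint but divides by a fixed 2
    # instead of the number of values, so the result is incorrect.
    return sum(values) / 2

def average(values):
    return sum(values) / len(values)

# Syntax error: the statement below is missing its closing parenthesis,
# so the translator rejects it before anything runs.
try:
    compile("print('hello'", "<example>", "exec")
    syntax_ok = True
except SyntaxError:
    syntax_ok = False

print(average_wrong([2, 4, 6]), average([2, 4, 6]), syntax_ok)
```

The translator catches the syntax error on its own, while the logic error only shows up when the output is tested against known sample data.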
Associations
occurrences linked to a single event.
Event
occurs when an action takes place
Classifications
patterns describing a group an item belongs to
Two ways to view data
physical view and logical view
Data integrity
reduce likelihood of inaccurate data
Data redundancy
same information in multiple files (duplicate)
Advantages to having databases
sharing, security, less data redundancy, data integrity
Software updates
significant patches
Character
single letter, number, or special character
Agile development
starts with getting the core functionality working then expands through to customer satisfaction
Rapid applications development (RAD)
the use of powerful development software, small specialized teams, and highly trained personnel; costly, but development time is shorter and quality is better
Analysis Tools
used to view parts of the data Query-by-example (QBE) Structured query language (SQL)
Forecasting
uses series of values to forecast future values
Selection structure
• A decision must be made
Sequential structure
• One program statement follows another
5 Generations
•1st Gen: Machine languages •Data represented in 1s and 0s •2nd Gen: Assembly languages •Uses abbreviations or mnemonics that are automatically converted to the appropriate sequence of 1s and 0s •3rd Gen: High level procedural languages (3GLs) •Designed to express the logic - the procedures - that can solve general problems. Translated into machine language with a compiler or an interpreter •4th Gen: Task-oriented languages (4GLs) •Designed to solve specific problems •5th Gen: Problem and Constraint languages (5GL) •Computer languages that incorporate the concepts of artificial intelligence to allow a person to provide a system with a problem and some constraints and then request a solution
Program
•A list of instructions for the computer to follow to accomplish the task of processing data into information •Statements used in a programming language such as C++, Java, or Python •Programs can be •Prewritten/packaged •Custom-made
Programming or Software Development
•Actually a problem-solving procedure •Follows a six-step process known as the software development life cycle
Program Specification
•Also called program definition or program analysis •Five items must be specified: 1.Program's objectives 2.Desired output 3.Input data required 4.Processing requirements 5.Documentation
Business Intelligence Infrastructure
•Array of tools for obtaining useful information from internal and external systems and big data -Data warehouses -Data marts -Hadoop -In-memory computing -Analytical platforms
Enterprise Software
•Built around thousands of predefined business processes that reflect best practices -Finance and accounting -Human resources -Manufacturing and production -Sales and marketing •To implement, firms: -Select functions of system they wish to use -Map business processes to software processes §Use software's configuration tables for customizing
Business Value of Customer Relationship Management Systems
•Business value of CRM systems -Increased customer satisfaction -Reduced direct-marketing costs -More effective marketing -Lower costs for customer acquisition/retention -Increased sales revenue •Churn rate -Number of customers who stop using or purchasing products or services from a company -Indicator of growth or decline of firm's customer base
•Advantages over relational databases
•Conceptualization provides users with an intuitive model in which complex data and relationships can be conceptualized •Processing speed for analyzing and querying a large multidimensional database is faster
Strategic uses
•Data warehouse •Data mining
Security
•Databases are valuable so protection necessary •Protected by firewalls
Web Mining
•Discovery and analysis of useful patterns and information from the web -E.g. to understand customer behavior, evaluate website, quantify success of marketing •Content mining - mines content of websites •Structure mining - mines website structural elements, such as links •Usage mining - mines user interaction data gathered by web servers
Program specification document
•Document program specifications
Enterprise Systems
•Enterprise resource planning (ERP) systems •Suite of integrated software modules and a common central database •Collects data from many divisions of firm for use in nearly all of firm's internal business activities •Information entered in one process is immediately available for other processes
Next-Generation Enterprise Applications
•Enterprise solutions/suites -Make applications more flexible, web-enabled, integrated with other systems •SOA standards •Open-source applications •On-demand solutions •Cloud-based versions •Functionality for mobile platform •Social CRM -Incorporating social networking technologies -Company social networks -Monitor social media activity; social media analytics -Manage social and web-based campaigns •Business intelligence -Inclusion of BI with enterprise applications -Flexible reporting, ad hoc analysis, "what-if" scenarios, digital dashboards, data visualization
Data Mining
•Finds hidden patterns and relationships in large databases and infers rules from them to predict future behavior •Types of information obtainable from data mining -Associations: occurrences linked to single event -Sequences: events linked over time -Classifications: patterns describing a group an item belongs to -Clustering: discovering as yet unclassified groupings -Forecasting: uses series of values to forecast future values
Databases and the Web
•Firms use the web to make information from their internal databases available to customers and partners. •Middleware and other software make this possible -Web server -Application servers or CGI -Database server •Web interfaces provide familiarity to users and savings over redesigning legacy systems.
Object-oriented software - OOP
•Focuses less on procedures, more on relationships with previously defined procedure •Objects contain both the data and the processing operations needed to perform a task
Global Supply Chains and the Internet
•Global supply chain issues -Greater geographical distances, time differences -Participants from different countries §Different performance standards §Different legal requirements •Internet helps manage global complexities -Warehouse management -Transportation management -Logistics Outsourcing
Enterprise Application Challenges
•Highly expensive to purchase and implement enterprise applications •Technology changes •Business process changes •Organizational learning, changes •Switching costs, dependence on software vendors •Data standardization, management, cleansing
Business Value of Enterprise Systems
•Increase operational efficiency •Provide firm-wide information to support decision making •Enable rapid responses to customer requests for information or products •Include analytical tools to evaluate overall organizational performance and improve decision-making
Supply Chain Management
•Inefficiencies cut into a company's operating costs -Can waste up to 25 percent of operating expenses •Just-in-time strategy -Components arrive as they are needed -Finished goods shipped after leaving assembly line •Safety stock: buffer for lack of flexibility in supply chain •Bullwhip effect -Information about product demand gets distorted as it passes from one entity to next across supply chain
Customer Relationship Management
•Knowing the customer •In large businesses, too many customers and too many ways customers interact with firm •CRM systems -Capture and integrate customer data from all over the organization -Consolidate and analyze customer data -Distribute customer information to various systems and customer touch points across enterprise -Provide single enterprise view of customers
The Challenge of Big Data
•Massive quantities of unstructured and semi-structured data from Internet and more -3Vs: Volume, variety, velocity -Petabytes and exabytes •Big datasets offer more patterns and insights than smaller datasets, e.g. -Customer behavior -Weather patterns •Requires new technologies and tools
Business Value of Supply Chain Management Systems
•Match supply to demand •Reduce inventory levels •Improve delivery service •Speed product time to market •Use assets more effectively -Total supply chain costs can be 75 percent of operating budget
The Supply Chain
•Network of organizations and processes for: -Procuring materials -Transforming materials into products -Distributing the products •Upstream supply chain •Downstream supply chain •Internal supply chain
Database Normalization
•Normalization is the process of efficiently organizing data so that the same data is not stored in more than one table and that related data is stored together.
Analytical Tools: Relationships, Patterns, Trends
•Once data is gathered, tools are required for consolidating and analyzing it and for using the insights to improve decision making -Software for database querying and reporting -Multidimensional data analysis (OLAP) -Data mining
Hadoop
•Open-source software framework for big data •Breaks data task into sub-problems and distributes the processing to many inexpensive computer processing nodes •Combines result into smaller data set that is easier to analyze •Key services -Hadoop Distributed File System (HDFS) -MapReduce
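A toy sketch of the map-then-reduce idea in plain Python. It does not use Hadoop or any of its APIs; the word-count task, the function names, and the sample chunks are assumptions chosen for illustration.

```python
from collections import defaultdict

def map_phase(chunk):
    """Sub-problem handled by one node: emit a (word, 1) pair per word."""
    return [(word, 1) for word in chunk.split()]

def reduce_phase(pairs):
    """Combine the pairs from all nodes into one smaller result set."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Each chunk stands in for the slice of data sent to one processing node.
chunks = ["big data big", "data tools"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
result = reduce_phase(mapped)
print(result)
```

In a real cluster the map calls would run in parallel on separate machines; here they run one after another to keep the sketch self-contained.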
Customer Relationship Management Software
•Packages range from niche tools to large-scale enterprise applications •More comprehensive packages have modules for: -Partner relationship management (PRM) §Integrating lead generation, pricing, promotions, order configurations, and availability §Tools to assess partners' performances -Employee relationship management (ERM) §Setting objectives, employee performance management, performance-based compensation, employee training •CRM packages typically include tools for: -Sales force automation (SFA) §Sales prospect and contact information §Sales quote generation capabilities -Customer service §Assigning and managing customer service requests §Web-based self-service capabilities -Marketing §Capturing prospect and customer data, scheduling and tracking direct-marketing mailings or email §Cross-selling
Ensuring Data Quality
•Poor data quality: major obstacle to successful customer relationship management •Data quality problems caused by: -Redundant and inconsistent data produced by multiple systems -Data input errors •Data quality audit •Data cleansing
Analytic Platforms
•Preconfigured hardware-software systems •Designed for query processing and analytics •Use both relational and non-relational technology to analyze large data sets •Include in-memory systems, NoSQL DBMS •E.g. IBM Pure Data System for Analytics -Integrated database, server, storage components •Data lakes
Demand-Driven Supply Chains: From Push to Pull Manufacturing and Efficient Customer Response
•Push-based model (build-to-stock) -Earlier SCM systems -Schedules based on best guesses of demand •Pull-based model (demand-driven) -Web-based -Customer orders trigger events in supply chain •Internet enables move from sequential supply chains to concurrent supply chains -Complex networks of suppliers can adjust immediately
Characteristics of a good program
•Reliable •Produces the correct output •Catches common input errors •Well-documented and understandable •Structured programs - one of the best ways to code effective programs: Using logic structure
In-Memory Computing
•Relies on computer's main memory (RAM) for data storage •Eliminates bottlenecks in retrieving and reading data •Dramatically shortens query response times •Enabled by high-speed processors, multicore processing •Lowers processing costs
Data Analysis Tools - Data Flow Diagram
•Show the data or information flow within an information system •Data is traced from its origin through processing, storage, and output
Supply Chain Management Software
•Supply chain planning systems -Model existing supply chain -Enable demand planning -Optimize sourcing, manufacturing plans -Establish inventory levels -Identify transportation modes •Supply chain execution systems -Manage flow of products through distribution centers and warehouses
Online Analytical Processing (OLAP)
•Supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions -Each aspect of information—product, pricing, cost, region, or time period—represents a different dimension -E.g., comparing sales in East in June versus May and July •Enables users to obtain online answers to ad hoc questions such as these in a fairly rapid amount of time
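A minimal sketch of viewing the same data along different dimensions, assuming a hypothetical three-dimensional cube keyed by (product, region, month). The sales figures and the total helper are invented for illustration.

```python
# Hypothetical cube: each cell is keyed by (product, region, month).
sales = {
    ("Nuts", "East", "May"): 100,
    ("Nuts", "East", "June"): 120,
    ("Nuts", "West", "June"): 90,
    ("Bolts", "East", "June"): 60,
}

def total(cube, product=None, region=None, month=None):
    """Sum the cells that match every dimension that was specified."""
    wanted = (product, region, month)
    return sum(
        value
        for key, value in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    )

# Ad hoc question: sales in East in June versus May.
print(total(sales, region="East", month="June"))
print(total(sales, region="East", month="May"))
# The same data viewed by product instead.
print(total(sales, product="Nuts"))
```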
Data Analysis Tools - System Flowchart
•System flowcharts show the flow of input data to processing and finally to output or distribution of information.
Processing requirements
•Tasks to move input to output
A Look to the Future ~ The Challenge of Keeping Pace
•To stay competitive with today's fast business pace, new technologies must be incorporated •Increased use of RAD and prototyping •Increased use of outside consulting
Text Mining
•Unstructured data (mostly text files) accounts for 80 percent of an organization's useful information. •Text mining allows businesses to extract key elements from, discover patterns in, and summarize large unstructured data sets. •Sentiment analysis -Mines text comments online or in email to measure customer sentiment