IDS 200 Amazon Module Exam Review
n-Tier Architecture
1 ) Advantages Relative to Client-Server
> scales better; easier to add more machines separately to any tier
> avoids direct client access to data (better security in that respect)
2 ) Roles (a-c match tiers a-c below)
a. user interface
b. typically runs most app operations
c. holds data
3 ) Tiers (front to back)
a. Client/Presentation/User
b. Application/Server
c. Data/Database
Strategy
> what you can do vs. the outcome
eg. a bicycle manufacturer wants to increase sales by 10% over the coming year
- revenue = price * # sold
suppose that after careful consideration, the firm decides to improve bike quality & charge 10% more
MARKETING:
- make ads about new and better bikes
- maybe reach out to less recent prior buyers
MANUFACTURING:
- design the new bikes
- change/upgrade production process
FINANCE:
- tell me how much money you need, I'll work on the financing
HR:
- hiring new salespeople
- new training about new bike
- new engineers
- new system designers
IT:
- chances are all this might require some new IT systems
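A quick check of the arithmetic in the bicycle example. This is only a sketch: it assumes "sales" means revenue, and the symbols p (current price), q (units currently sold) and R (revenue) are introduced here, not taken from the notes.

```latex
\begin{align*}
R &= p \cdot q \\
R_{\text{target}} &= 1.10\,R = 1.10\,p\,q \\
p_{\text{new}} &= 1.10\,p \\
q_{\text{needed}} &= \frac{R_{\text{target}}}{p_{\text{new}}}
                  = \frac{1.10\,p\,q}{1.10\,p} = q
\end{align*}
```

Under these assumptions the plan meets the goal as long as the better bikes sell at least as many units as before at the higher price.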
1 ) Alignment
> making sure that the different functional strategies (Operations, Finance, Marketing, HR, Info Tech) all support the overall organizational strategy
2 ) Culture
> can prevent some options from being considered while favoring others
3 ) Development
> identify organizational strengths & weaknesses
> consider options
> pick one (identify what you can do, identify which of those things should have the best outcome, do it)
4 ) External Factors
> state of the economy (product demand)
> new technology (potential substitutes)
> competitors' capabilities
> partners' capabilities
5 ) Functional Strategies
> strategies of individual departments (accounting, HR, marketing)
6 ) Internal Factors
> a business must be aware of its own strengths and weaknesses
>> past performance
>> pilot efforts
>> skill tracking
7 ) Organization Strategies
> big goals that benefit the business as a whole
8 ) SWOT Analysis
> internal:
>> (S)trengths and (W)eaknesses
> external:
>> (O)pportunities and (T)hreats
9 ) Tactics
> a large-scale opportunity that can be exploited by existing strengths
Transaction Processing Systems
1 ) Atomicity (for transactions)
> any transaction involving multiple systems either succeeds on all systems or fails on all of them, not a mix
2 ) Concurrency
> the ability of multiple users to access the same data simultaneously
3 ) Consistency
> any database operation starting from a consistent state will end up in a consistent state
4 ) Isolation
> outside observers don't see inside the database while a transaction is in progress; they see only the start or end state
5 ) Deadlock
> two processes block each other & wait for the other to finish; hence, neither can move forward
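A small sketch of atomicity and consistency on a single in-memory SQLite database. This is a single-system analogue of the multi-system definition above. The `accounts` table, the names, and the `transfer` helper are hypothetical, made up for illustration.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Consistency rule (assumed for this sketch): no negative balances.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

Calling `transfer(conn, "alice", "bob", 500)` fails partway through, and the first update is undone, so the database is left in the consistent state it started from.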
Cloud Attributes
1 ) Backups
> automatic, reliable
2 ) Elastic
> can make abrupt changes in system usage
3 ) Metered
> how else to track usage without long-term contract
4 ) Pooled Resources
> many clients for a large cloud service sharing use of the system, economies of scale & smoother demand profile
5 ) Service-Based
> ideally, rather than rent specific machines, clients use services on virtual servers
Data Mining
1 ) Cluster Analysis
> you map the data and try to partition the space so that clusters of same-type data are separated
2 ) Correlation Measures (Confidence, Lift, Support)
> CONFIDENCE:
--> conditional probability: P(bought B | bought A)
eg. high if most customers who bought a Big Mac also bought fries
> LIFT:
--> measures the increased likelihood that someone who has bought A also buys B
--> will be high if confidence is higher than the baseline probability of buying B
> SUPPORT:
--> high if two products are often bought together, as in many purchases have the same items
eg. like Big Mac & fries at McDs
3 ) Regression Analysis
> you use a set of one or more independent variables to predict the value of the dependent variable
eg. [hot dog sales at baseball game] = [weather] + [teams] + [time] + [playoff y/n]
4 ) Tree Analysis
> rules are applied to separate the data into groups with the same characteristics
> end result is sometimes the same but Tree Analysis is easier for categorical variables
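The three correlation measures can be sketched with the usual market-basket definitions. The `baskets` data below is invented for illustration, and the function names are just labels for this sketch.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, a, b):
    """P(b in basket | a in basket)."""
    return support(transactions, {a, b}) / support(transactions, {a})

def lift(transactions, a, b):
    """Confidence divided by the baseline probability of b."""
    return confidence(transactions, a, b) / support(transactions, {b})

# Hypothetical purchase data (not from the notes).
baskets = [
    {"big mac", "fries"},
    {"big mac", "fries", "cola"},
    {"big mac"},
    {"salad"},
    {"fries"},
]

print("support   :", support(baskets, {"big mac", "fries"}))
print("confidence:", confidence(baskets, "big mac", "fries"))
print("lift      :", lift(baskets, "big mac", "fries"))
```

A lift above 1 means buying A makes B more likely than the baseline, a lift of about 1 means no relationship, and a lift below 1 means A makes B less likely.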
Recommender Systems
1 ) How they work
> deliver links or content based on a set of preferences
> more historical data: product, users, time
> reduces sampling errors
> proprietary trend analysis
2 ) What factors make them work better
> stated preferences ("I like this sort of thing")
> revealed preferences (indicated by behavior)
> assumption by affiliation
eg. star/scale ratings; % liked; ordering & filtering w/out visible ratings
3 ) What they are
> significant defense against new retail competitors
Cloud Service Models
1 ) IaaS
> hardware only
2 ) PaaS
> hardware + operating system + maybe basic apps (like db)
3 ) SaaS
> hardware + operating system + immediately usable apps
Flexibility: IaaS > PaaS > SaaS
Ease of use/management: SaaS > PaaS > IaaS
> true cost should be cheapest for IaaS (but for small scale use, SaaS often free)
Object Description Languages
1 ) JSON
> uses a more compact JavaScript-style syntax that any web browser can read; supports arrays
> smaller, more compact data
> widely used now
2 ) XML
> uses opening & closing tags for each element/attribute
> more bulky
> not directly usable in web application
> doesn't directly support arrays
> less widely used now
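To make the size and array points concrete, here is a sketch that writes the same record both ways using Python's standard library. The `order` record and the XML tag names are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record with a list field.
order = {"customer": "Pat", "items": ["bike", "helmet"]}

# JSON: the list maps directly onto a JSON array.
json_text = json.dumps(order)

# XML: no array type, so the list becomes repeated <item> elements,
# each with its own opening and closing tag.
root = ET.Element("order")
ET.SubElement(root, "customer").text = order["customer"]
items_el = ET.SubElement(root, "items")
for name in order["items"]:
    ET.SubElement(items_el, "item").text = name
xml_text = ET.tostring(root, encoding="unicode")

# Reading the data back.
parsed_json = json.loads(json_text)
parsed_xml_items = [el.text for el in ET.fromstring(xml_text).iter("item")]

print(json_text)
print(xml_text)
print(len(json_text), "characters as JSON vs", len(xml_text), "as XML")
```

For this record the XML text comes out longer because every value is wrapped in a pair of tags.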
Two-Phase Commit Transaction Model
1 ) Phases
> Prepare Phase
> Commit Phase
2 ) Overall Goal
> you have 2+ related databases; after a transaction attempt they should all be consistent
> either all show the transaction went through (if it did) or all show that the attempt failed (if that happened)
3 ) Recovery Logs
> recovery from uncompleted transactions
>> invalid transaction parameters
>> component failure during transaction
>> unexpected logout by user
4 ) Rollback
> maintain a file of all initiated but uncompleted transactions and the initial state of the associated data
> when a failure happens, the system can be returned to its last known state by undoing all the uncompleted transactions, latest first
> MAINTAINING ROLLBACK RECORD INTEGRITY IS VITAL!!!
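A toy, in-memory sketch of the two phases. Real systems add durable logs, timeouts and crash recovery, which are left out here. The `Participant` class, its `can_commit` flag and the `two_phase_commit` function are hypothetical names for illustration only.

```python
class Participant:
    """Stands in for one database taking part in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: simulates a failing component
        self.data = {}
        self.pending = None
        self.log = []  # simplified recovery log

    def prepare(self, key, value):
        # Record the initial state so the change could be undone later.
        self.log.append(("prepare", key, self.data.get(key)))
        if not self.can_commit:
            return False  # vote "no"
        self.pending = (key, value)
        return True  # vote "yes"

    def commit(self):
        key, value = self.pending
        self.data[key] = value
        self.pending = None
        self.log.append(("commit", key, value))

    def rollback(self):
        self.pending = None  # discard the staged change; data was never touched
        self.log.append(("rollback",))


def two_phase_commit(participants, key, value):
    # Phase 1 (prepare): every participant votes.
    votes = [p.prepare(key, value) for p in participants]
    # Phase 2 (commit): apply everywhere only if all voted yes.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in reversed(participants):  # undo latest first
        p.rollback()
    return False
```

For example, with `a = Participant("orders")` and `b = Participant("billing", can_commit=False)`, `two_phase_commit([a, b], "order-1", "paid")` returns `False` and neither `a.data` nor `b.data` changes.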
Cloud
1 ) Pricing Models
> often complicated to compare
>> apples-to-oranges for hardware, software
>> many different variables, often opaque (not transparent)
2 ) Requirements for Viable Cloud Services
> flexible contracts
> easy, fast, reliable communication
3 ) Why it was needed
> operating costs
>> only pay for the capacity you need
>> cloud provider economies of scale bring down costs for all users
> vastly reduced risk of traffic overload
> automatic backup & recovery
> delegating hardware & software upgrades
> better security (probably)
Client-Server Architecture
1 ) Roles
> clients (in this case, customers) directly obtain data residing on servers
> application logic might reside on clients (thick client) or servers (thin client)
2 ) Strengths
> cheaper to build and deploy
> centralized data storage
> single point for patches & upgrades
> easier security monitoring
3 ) Tiers
> standard architecture is three tiers
>> Client/Presentation/User
>> Application/Server
>> Data/Database
4 ) Weaknesses
> security issues:
>> direct data access or tampering with client application logic
> difficult to scale or modify
Business Process Re-Engineering (BPR)
1 ) What is it?
> changes an existing process to a new one, commonly done as a response to:
>> new technology
>> competitors
>> changing internal resources
2 ) BPR Sequence
> map the process (this should already be done)
> identify problem points
> brainstorm or otherwise devise alternatives to problem points
> test, evaluate, revise, repeat
Porter's Five Forces Model
> to measure the potential for getting a sustained competitive advantage in a market
> assesses the "competitive intensity" of a market
>> basically the potential for one operation to get sustained higher profit
Factors:
1 ) threat of new entrants
2 ) customer bargaining power
3 ) supplier bargaining power
4 ) threat of substitute products
>> different products that fulfill the same need
5 ) competitive intensity
>> is the market growing quickly?
Cloud Service Security
> businesses used to be wary of the cloud for security reasons, but not so much these days
> cloud security is generally better than non-cloud security
--> because of survivorship bias & experience
Transaction Fraud Detection
> comparing individual events to some baseline standard
> the baseline can be based on individual history, product type, or order size (# items or $$$)
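One simple way to compare an event to a baseline is to flag order amounts that sit far from a customer's own history. The z-score rule, the 3.0 threshold, and the sample history below are assumptions chosen for illustration; the notes do not say which method is used in practice.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Flag `amount` if it is more than `threshold` standard deviations
    away from the mean of this customer's past order amounts."""
    if len(history) < 2:
        return False  # not enough history to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # every past order was identical
    return abs(amount - mu) / sigma > threshold

# Hypothetical past order totals in dollars for one customer.
past_orders = [20, 25, 22, 30, 18, 27]

print(is_suspicious(past_orders, 26))   # close to the usual range
print(is_suspicious(past_orders, 400))  # far outside the usual range
```

The same check could use a baseline per product type or per order size instead of per customer.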
Data Silos
> different parts of the organization have separate data systems (which might contain conflicting or incompatible data)
Problems:
> errors in running queries across databases
> database operations involving different systems require some translation
Business Intelligence
> extremely broad analytics concept
>> analysis of all kinds of data
>> internal: employee performance, item sales
>> external: supply chain performance, product returns, competing products & prices
eg. Amazon's recommender system
> normally kept internal
>> data flows in
>> business intelligence workers and tools process queries and create reports
>> reports distributed automatically or on request
> conventionally, purchase correlation would be kept internal and used to structure store layout
>> to encourage travel throughout the store
>> to make shopping quicker
> Amazon shared this info with customers
Comparing Cloud Vendors
> first, it's difficult
--> cloud services are complex, their details are hidden, there are many options, and different businesses have different needs
Typical comparison points:
> cost & performance (relatively easy to compare)
> usability
> reliability
> customer service
> security
> deletion of old data (harder to assess)
Service-Oriented Architecture (SOA)
> instead of having big monolithic applications & systems with lots of dependencies in code & hardware, you have:
--> applications that are small & modular
--> easier to flexibly deploy (at home or cloud)
--> needs standardized interfaces to achieve, so that applications can be modular
Value Chains
> the sequence of activities for converting inputs to some salable product
> every stage should add some value; otherwise why do it?
1 ) Primary Activities
> directly involved in creating the product
eg. operating website; acquiring titles; packaging & shipping
2 ) Support Activities
> indirectly involved, but make primary activities work better; thus harder to assess
eg. HR & customer support
Transaction Processing Systems Example: Suppose you have two processes that are running in parallel on one db
Deadlock:
> the two processes block each other, so each is waiting for the other to finish & thus can't go forward
Livelock (NOT ON EXAM):
> the two processes are trying to get around each other but the way they're doing it creates a new block
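The deadlock case can be sketched with two threads standing in for the two processes and two locks standing in for two pieces of data. A timeout is used so the demo ends instead of hanging forever. The function names are made up, and the "same lock order" fix is one common remedy rather than something stated in the notes.

```python
import threading

def run_conflicting_order(timeout=0.2):
    """p1 takes A then wants B; p2 takes B then wants A. Neither gets its second lock."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    barrier = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            barrier.wait()  # both workers now hold their first lock
            ok = second.acquire(timeout=timeout)  # a real deadlock would wait forever
            got_second[name] = ok
            barrier.wait()  # keep holding the first lock until both have tried
            if ok:
                second.release()

    t1 = threading.Thread(target=worker, args=("p1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("p2", lock_b, lock_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return got_second

def run_same_order():
    """Both workers take A before B, so one simply waits its turn and both finish."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    finished = []

    def worker(name):
        with lock_a:
            with lock_b:
                finished.append(name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("p1", "p2")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(finished)

print(run_conflicting_order())
print(run_same_order())
```

In the first run each worker reports that it could not get its second lock, which is the deadlock condition described above. In the second run both workers complete.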
Amazon EC2 & S3 Systems
Elastic Compute Cloud (EC2) & Hosting
> core of Amazon's cloud computing service
>> users rent virtual servers (a virtual machine sold as a service by an internet hosting service)
>> sold in "elastic compute units"
ENABLES:
> deploying applications, like big data analysis and transaction processing
> backup capacity against traffic surges
Simple Storage Service (S3) & Analytics
> massive key-value object storage system
>> estimated to hold 5 trillion objects
>> each object could hold up to 5 terabytes of data (but the avg is much less)
FEATURES:
> cheap
> reliable
> accessible
> analysis capabilities
Coupling: Loose & Tight
Loose:
> the system components are free of dependencies on each other's internals
eg. a web app that can run via any web browser
Tight:
> the system components are custom fitted for each other
eg. it's known that they're running the same OS (Apple)
