Cloud Computing Final - 2nd Half
Supervisory Control and Data Acquisition (SCADA)
data-gathering oriented; remote monitoring and remote telemetry that report back to control systems
Programmable logic controller (PLC) (aggregators)
handles specialized input and output; ensures uninterrupted processing; a computer that performs a specific function you coded: the code is burned into firmware and repeated continuously
EC2 Instance Savings Plan vs. Compute Savings Plan for EC2
1. committing to an hourly spend on a specific instance family 2. flexible and can move between instance types (more expensive per hour, though)
triggering a function
1. schedule-driven (time or frequency) 2. event-driven (e.g., a new file in an S3 bucket)
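A minimal sketch of how one function might tell the two trigger types apart. The function name `handle_trigger` and the returned dictionary are illustrative assumptions. The event fields checked (`source` for scheduled rules, `Records` / `eventSource` for S3 notifications) follow the event shapes AWS documents for those services.

```python
def handle_trigger(event, context=None):
    """Report which kind of trigger invoked the function (illustrative only)."""
    # Schedule-driven: EventBridge scheduled rules send source "aws.events".
    if event.get("source") == "aws.events":
        return {"trigger": "schedule", "time": event.get("time")}

    # Event-driven: S3 notifications arrive as a list under "Records".
    s3_records = [
        r for r in event.get("Records", [])
        if r.get("eventSource") == "aws:s3"
    ]
    if s3_records:
        keys = [r["s3"]["object"]["key"] for r in s3_records]
        return {"trigger": "s3", "keys": keys}

    return {"trigger": "unknown"}
```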
5G
peak data rate of up to 20 gigabits per second (Gbps)
Internet of Things
A global infrastructure for the information society, enabling advanced services by interconnecting (physical and virtual) things based on existing and evolving interoperable information and communication technologies
IoT control software (4)
AWS IoT Core: connects devices and lets them interact with the cloud; AWS IoT Device Defender: security auditor; AWS IoT Device Management: remote management at scale; AWS IoT Things Graph: connects devices and cloud services for IoT applications
API gateway
adds a web interface to Lambda; used here to pull Qualtrics results: Qualtrics triggers the API Gateway, which triggers Lambda
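A rough sketch of the Lambda side of that chain, assuming API Gateway's Lambda proxy integration (request body delivered as a JSON string under `body`, response returned as `statusCode` / `headers` / `body`). The function name `handle_survey_webhook` and the `ResponseID` field are hypothetical placeholders, not the actual Qualtrics payload.

```python
import json


def handle_survey_webhook(event, context=None):
    """Parse a JSON body forwarded by API Gateway and acknowledge it."""
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "invalid JSON"}),
        }

    # "ResponseID" is an assumed field name used only for illustration.
    response_id = payload.get("ResponseID")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"received": response_id}),
    }
```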
Kinesis
allows real-time processing of large amounts of data; assists in data dumping and structuring
AWS Athena
allows you to access the files stored in a data lake using SQL without needing a database
IoT device software (2)
Amazon FreeRTOS: OS for managing microcontrollers; AWS IoT Greengrass: software for local compute, messaging, data caching, and sync
IoT data services (3)
AWS IoT Analytics: sophisticated analytics on massive volumes of data; AWS IoT Events: detects and responds to events from sensors; AWS IoT SiteWise: collects, structures, and searches IoT data
AWS SageMaker
builds and deploys fully functional machine learning models in the cloud; integrated environment for running Jupyter notebooks in AWS
overprovisioned services
capacity exceeds demand; services should be downsized
Cloud computing costs
considered operating expenses
cost optimization
engineering cost considerations into design decisions
Data Warehouse
data stored structurally, optimized for online analytical processing (OLAP)
underprovisioned services
demand exceeds capacity; services should be upsized
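The over- and underprovisioned cards can be turned into a simple check. This is only a sketch: the function name and the 20% `headroom` allowance are assumptions chosen for illustration, not values from the course.

```python
def provisioning_status(capacity, demand, headroom=0.2):
    """Classify a service by comparing capacity with demand.

    headroom is an assumed buffer: capacity up to (1 + headroom) * demand
    is treated as right-sized rather than overprovisioned.
    """
    if demand > capacity:
        return "underprovisioned"  # demand exceeds capacity: upsize
    if capacity > demand * (1 + headroom):
        return "overprovisioned"   # capacity exceeds demand: downsize
    return "right-sized"
```

For example, `provisioning_status(100, 120)` reports an underprovisioned service, while `provisioning_status(100, 50)` reports an overprovisioned one.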
Distributed Control System (DCS)
focuses on controlling processes; uses sensors and feedback systems
Qualtrics survey
has interview logic, branching, data validation
Classification techniques
help assign instances to pre-defined categories
Regression techniques
help predict numeric values
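As one concrete instance of regression, a least-squares straight line can be fit in a few lines. This sketch uses plain Python and made-up function names; real coursework would more likely use a library.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict a numeric value for a new x."""
    return slope * x + intercept
```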
Clustering techniques
help us identify groupings we may not have recognized
artificial intelligence techniques
help us recognize patterns in data using neural networks
Association discovery techniques
help us to identify common pairings
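A minimal sketch of the idea behind association discovery: count how often each pair of items appears together. The function name and the basket data in the usage note are invented for illustration. Real techniques add measures such as support and confidence on top of these counts.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts
```

Calling `pair_counts([["bread", "milk"], ["bread", "milk", "eggs"]])` counts the pair `("bread", "milk")` twice.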
Cost awareness
individuals in the organization have a sense of what things cost
drawbacks of serverless
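limits on resource consumption (3 GB of memory when these notes were written, AWS now allows up to 10 GB; maximum run time of 15 minutes); stateless nature: each function runs independently, with no memory (state) kept between invocations; development and debugging can be more difficult (complex)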
when to use reserved instances
long-duration, unchanging workloads, e.g., running an ERP database
Data Lake
massive amounts of data stored in their original format (e.g., CSV) in an object store; unstructured, so it can be more like a swamp
cost management
monitor and manage cloud costs
Lambda
native support: write code directly within it; can be triggered from most other AWS services, and can also trigger actions in AWS (e.g., rebooting a server); transparent infrastructure (autoscaling); free tier of 1 million requests and 400,000 GB-seconds of compute time per month; near-infinite scalability
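Compute time on Lambda is measured in GB-seconds: memory allocated (in GB) times run time (in seconds), summed over invocations. The sketch below checks a workload against the free-tier figures from this card. The function names and the example workload are assumptions for illustration.

```python
FREE_REQUESTS = 1_000_000     # free requests per month (from the card above)
FREE_GB_SECONDS = 400_000     # free compute per month, in GB-seconds


def gb_seconds(invocations, avg_duration_ms, memory_mb):
    """Total compute consumed: invocations x seconds x GB of memory."""
    return invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)


def within_free_tier(invocations, avg_duration_ms, memory_mb):
    """True if both the request count and compute fit in the free tier."""
    compute = gb_seconds(invocations, avg_duration_ms, memory_mb)
    return invocations <= FREE_REQUESTS and compute <= FREE_GB_SECONDS
```

Under these assumptions, one million 200 ms invocations at 512 MB use about 100,000 GB-seconds, which fits within the free tier.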
prescriptive analytics
provide advice about actions we should or should not take "guiding our actions"
Serverless Computing
runs code on infrastructure you do not manage; a form of PaaS (Function as a Service)
Why cloud analytics? (4)
scalability becomes easy; complexity of the data environment; simplicity of cloud analytics tools; manageable cost structure
IoT Primitives (5)
sensor; aggregator (collecting and summarizing data); communication channel (allowing communication between them); external entity (the cloud); decision trigger (business logic)
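A toy sketch of how the primitives fit together: sensor readings are summarized by an aggregator, and a decision trigger applies a conditional expression to the summary. All names and the temperature threshold are hypothetical. The communication channel and external entity are left out to keep the example self-contained.

```python
from statistics import mean


def aggregate(readings):
    """Aggregator: collect raw sensor readings and summarize them."""
    return {"count": len(readings), "mean": mean(readings), "max": max(readings)}


def decision_trigger(summary, threshold):
    """Decision trigger: a conditional expression that selects an action."""
    return "alert" if summary["max"] > threshold else "ok"
```

For example, `decision_trigger(aggregate([20.0, 22.0, 30.0]), threshold=25)` selects the `"alert"` action.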
Reserved instances (convertible)
set aside capacity at a discount; if convertible, can flex around the instance family
savings plan
set aside capacity at a discount by committing to an hourly spend rate
SNS
simple notification service
benefits of serverless
simplicity of implementation (no servers); tremendous and immediate scalability (infinite); wasteless approach to cloud computing: use only what you need; manageable costs
when to use spot instances
work that doesn't generate new master data; depends on risk tolerance; running reports or functions that aren't entirely essential
When to use on-demand instances
something you've never done before; work with a known duration; autoscaling (reserved for the minimum, on-demand for peaks)
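The autoscaling rule of thumb (reserved for the minimum, on-demand for peaks) can be sketched as a split of hourly demand. The function name and the instance counts in the usage note are made up for illustration.

```python
def split_capacity(hourly_demand, reserved_baseline):
    """Split each hour's demand into (reserved, on_demand) instance counts.

    Demand up to the reserved baseline is served by reserved instances;
    anything above it spills over to on-demand instances.
    """
    return [
        (min(demand, reserved_baseline), max(demand - reserved_baseline, 0))
        for demand in hourly_demand
    ]
```

With a baseline of 3 instances, `split_capacity([2, 5, 3], 3)` gives `[(2, 0), (3, 2), (3, 0)]`: only the peak hour needs on-demand capacity.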
Aggregators
summarize sensor output; without 5G, some of this is done on the device (edge)
Descriptive analytics
summarizing data we have about the past "understanding the past"
tagging
tagging resources allows the allocation of costs
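A small sketch of cost allocation by tag: each billing line item carries its resource's tags, and costs are summed per tag value. The line-item layout, the `team` tag key, and the `"untagged"` bucket are assumptions for illustration, not an AWS billing format.

```python
from collections import defaultdict


def allocate_costs(line_items, tag_key):
    """Sum costs per value of the given tag; untagged resources get their own bucket."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)


items = [
    {"resource": "i-1", "cost": 10.0, "tags": {"team": "analytics"}},
    {"resource": "i-2", "cost": 5.0, "tags": {"team": "web"}},
    {"resource": "i-3", "cost": 2.5, "tags": {}},
    {"resource": "i-4", "cost": 10.0, "tags": {"team": "analytics"}},
]
by_team = allocate_costs(items, "team")
```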
Analytics merges three worlds
technology, business, statistics
Database
transactional data, usually kept in relational form; OLTP (online transaction processing)
communication channel, external entities, decision triggers
communication channel: transmits data; external entities: products/services implemented in hardware or software; decision triggers: conditional expressions that trigger actions
designing serverless
trigger, inputs, calculation, output
predictive analytics
use past data to predict future outcomes "understanding the future"
Spot instances
use excess server capacity at an extreme discount; can be interrupted with 2 minutes' notice; discount and frequency of interruption depend on the region
Reserved instance (RI) Utilization
what % of RIs are being used by active workloads
Reserved Instance (RI) coverage
what % of active workloads are using reserved instances
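RI utilization and RI coverage are both percentages over the same numerator (hours actually served by reserved instances) but with different denominators. A minimal sketch, with assumed function names and example hours:

```python
def ri_utilization(ri_hours_used, ri_hours_purchased):
    """Percent of purchased RI hours actually used by active workloads."""
    return 100 * ri_hours_used / ri_hours_purchased


def ri_coverage(ri_hours_used, total_instance_hours):
    """Percent of all running instance hours that were covered by RIs."""
    return 100 * ri_hours_used / total_instance_hours
```

For example, if 600 of 720 purchased RI hours were used while workloads ran for 1,000 instance hours in total, utilization is about 83% and coverage is 60%.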