Exam 3 ISM3004

Protocol

"rules of the road"; Requirements that make sure that system functions reliably

What is a Data Warehouse? What are its characteristics?

Collection of databases that supports decision making -Many sources -Periodic transfer from operational systems -Historical data -Fast queries -Exploration

Network

Collection of devices connected together via communications devices and transmission media

Business Intelligence

Combining aspects of reporting, data exploration & ad hoc queries, & sophisticated data modeling & analysis

Views

Display relationships by combining them for reporting and display

What is the problem with using the public Internet to connect remote offices and mobile users to an organization's private resources?

Easy to use when WAN does not connect, but not safe. Too risky. Lost functionality.

What are "Rogue APs" and why are they a problem?

Enterprise IT runs the network for your business, but an employee who doesn't like the wireless coverage goes to a store and gets their own wireless router. The instant this is done, all the security and reliability that has been configured for the network is undermined, and the network can be hacked.

ERP

Enterprise Resource Planning system. Paychecks, invoices, and payments are business transactions that produce data to mine for insights.

What is operational data?

Operational Data is exactly what it sounds like - data that is produced by your organization's day to day operations. Things like customer, inventory, and purchase data fall into this category.

What customers are most likely to benefit from a service like Starlink?

People in areas with no high-speed internet service.

Ad-Hoc Reporting Tools

Tools that put users in control so that they can create custom reports on an as-needed basis by selecting fields, ranges, summary conditions, and other parameters.

What is TPS?

Transaction Processing Systems Examples: ATM, retail sales transactions, websites, searches

What is "information abundance"? What are its implications for knowledge workers?

The "treasure" side: the world has changed and jobs have changed, but some people have not caught up to the change. Geek up! Take advantage of the opportunities.

What does an Ethernet Switch do?

What you use to connect a group of nodes together. Switches also differ in speed capacity, management capabilities, and port count. You can buy a variety of port counts based on needs; the port count limits how many devices you can hook up.

How do inconsistent data formats impact a business?

When bringing data together from different repositories, data formats can be inconsistent

PoE

PoE-Power Over Ethernet - Using unused ethernet wires for powering other devices

Benefits of using relational databases

-Combines and simplifies data

Cons of ad hoc reporting tools

-Demanding of user -Potentially steep learning curve -Business knowledge -Understand data schema

What are the 2 valid relationship types?

-Field must be found in 2 separate tables -Must be a key field in 1 of the 2 tables

Dashboards

-Graphic view of what is happening inside the software system -Some customization -A picture is worth a thousand words

Business operations examples

-Health care: patient data -Michigan tags cows at birth -Transportation: a plane engine produces 10 TB of data every 30 min -Swiss Rails: 100 data items a second

Pros of OLAP

-Huge data -Pre-processed + Summarized -User reports fast

Cons of OLAP

-No access to details; user only sees summary

Characteristics of unstructured data

-Not organized-no schema -Ex: Text (email, facebook pages, news stories, etc) -Binary-images, audio, video

OLAP

-Online analytical processing: the manipulation of information to create business intelligence in support of strategic decision making -Used for enormous amounts of data

Characteristics of structured data

-Organized: Rows and columns, structure -Predefined characteristics known as schemas: rules for organizing data (data type, data ranges)

MapReduce

-Programming model in Hadoop -Map: process input data in parallel -Reduce: combine data from Map to create final results
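
The Map and Reduce steps can be sketched in plain Python with a word count. This is only an illustration of the idea, not Hadoop code: the function names `map_phase` and `reduce_phase` and the sample text are made up for this example, and the "mappers" here run one after another instead of in parallel on separate machines.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: turn one chunk of input text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]

def reduce_phase(pairs):
    """Reduce: combine the pairs from every mapper into final counts."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Each chunk stands in for a piece of input that a separate node would process.
chunks = ["big data is big", "data is everywhere"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(mapped)
```

Because each chunk is mapped independently, the Map work could be spread across many machines; only the Reduce step needs to see everyone's output.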

What technology was described that can protect against total server failure?

Clustering-When multiple servers share the same file storage

For legacy WiFi LANs (802.11b and 802.11g), what is the max bandwidth? What radio frequency spectra are used?

802.11b -Max bandwidth: 11 Mbps -Radio frequency: 2.4 GHz; 802.11g -Max bandwidth: 54 Mbps -Radio frequency: 2.4 GHz

How well does WAN address remote office needs? What about mobile users?

A company has WAN for three offices- Florida, Texas, and Wisconsin. If CEO goes to Colorado for vacation, he cannot connect to WAN. Therefore, WAN works for remote office needs, but not mobile users.

Data Aggregator

A company whose sole job is to collect data from a wide variety of sources and organize it, clean it, and connect it to each other and then sell access to it to others. Example: Acxiom

What is a server? (i.e. what does it do?)

A computer whose job is to provide services, and share resources to other nodes on the network

Domain Registrar - what is it? why would you use one?

A person or entity who helps you buy and register a domain. Most registrars offer intuitive tools that help you search available names; you simply fill in the name of your choice and make payment. It is important for a business to register a domain name to protect copyrights and trademarks, build credibility, increase brand awareness, and improve search engine positioning.

What is a Router? What does it do?

A router helps you connect multiple devices to the Internet, and connect the devices to each other. Also, you can use routers to create local networks of devices. These local networks are useful if you want to share files among devices or allow employees to share software tools. Also has security features like a firewall, VPN, access control

RAID 5

A technique that stripes data across three or more drives and uses parity checking, so that if one drive fails, the other drives can re-create the data stored on the failed drive. RAID 5 drives increase performance and provide fault tolerance. Windows calls these drives RAID-5 volumes.
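
How parity lets the surviving drives re-create a failed drive can be sketched with XOR. This is a simplified illustration with two data "drives" and one parity block; real RAID 5 rotates the parity across all drives, and the helper name `xor_blocks` and the sample bytes are assumptions made for this example.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

# Two data blocks, each living on a different drive.
drive_a = b"ACCT"
drive_b = b"DATA"

# The parity block is the XOR of the data blocks and lives on a third drive.
parity = xor_blocks(drive_a, drive_b)

# Suppose drive_b fails: XOR the survivor with the parity to rebuild it.
rebuilt_b = xor_blocks(drive_a, parity)
```

The same trick works whichever single drive fails, which is why one lost drive is survivable but two are not.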

What is Automated Data Tiering? How does it address these problems of data growth?

ADT: match storage performance to access frequency; the most frequently accessed data is the most relevant. Current working data: top-tier storage (fastest, highest quality), SSDs. Recently used data: mid-tier storage, hard drives. Historical data: bottom-tier storage (cheapest/slowest), tape.
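
The tiering rule can be sketched as a small function. The cutoffs of 7 and 90 days and the name `pick_tier` are hypothetical values chosen for illustration; the lecture notes give no specific thresholds, and a real tiering product would use its own policy.

```python
def pick_tier(days_since_last_access, hot_days=7, warm_days=90):
    """Choose a storage tier from how recently the data was used.

    hot_days and warm_days are made-up cutoffs for this sketch.
    """
    if days_since_last_access <= hot_days:
        return "SSD"          # current working data: fastest tier
    if days_since_last_access <= warm_days:
        return "hard drive"   # recently used data: mid tier
    return "tape"             # historical data: cheapest, slowest tier

tiers = [pick_tier(days) for days in (1, 30, 400)]
```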

What is AP?

AP: access point. Provides coverage to multiple users. Scatter many access points throughout the building and connect each one; they are unobtrusive and give good coverage.

Be able to recognize each of the last mile technologies and describe it in very general terms (go back to lecture for this maybe-- this makes no sense)

Analog modems: POTS (standard telephone lines). Cable broadband: digital connection over cable TV lines; shared with neighbors. DSL (Digital Subscriber Line): the telephone companies' solution using existing telephone wires; works over limited distances, so not for rural customers. FTTH: Fiber To The Home. Cellular wireless. Satellite wireless.

What is a "transaction"? What are its two key characteristics?

Any business exchange 1. Standardized-schema 2. Occurs repeatedly

Enterprise Software (3 different kinds)

Applications that address the needs of multiple users throughout an organization or work group.

How does VPN work?

Buy hardware with VPN capability. Connect the VPN hardware to internet service providers. Install VPN software on remote nodes. This connects remote and mobile users.

How do workstation NICs, Ethernet switches, and cables connect together to build a typical Ethernet LAN?

By buying another switch and connecting the two switches together with cable

Cash anonymous

Cash is always anonymous

Workstation

Client PCs that a human being uses to interact with the network and its resources

Client-Server service model

Client requests services, server provides Clear division of labor. Clients consume services and resources. Servers provide services, share resources. Servers are controlled and managed by IT to keep them secured, patched and efficient. Clients are user devices that could be BYOD or corporate supplied etc.

CRM

Customer Relationship Management system. Every sales call, every customer inquiry, every follow-up call = data

Information

Data that has been presented in such a way that it answers questions or supports decision making

What is DeDupe? How does it address these problems?

DeDupe: Oftentimes, we have the same data stored repeatedly in our systems. -Match duplication in unstructured data -Single storage for any data. A deduplication system goes through the storage system, looks for unique items (images, for example), and keeps them. When it discovers a duplicate, it eliminates the duplicate copy and replaces it with a pointer to the single stored copy.
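
The "single copy plus pointers" idea can be sketched with a content hash. The class `DedupeStore`, its method names, and the sample files are all hypothetical; this only illustrates the principle and is not how any particular storage product implements it.

```python
import hashlib

class DedupeStore:
    """Toy store that keeps one copy of each unique piece of content."""

    def __init__(self):
        self.blocks = {}    # fingerprint -> the single stored copy
        self.pointers = {}  # file name -> fingerprint (the "pointer")

    def save(self, name, content):
        fingerprint = hashlib.sha256(content).hexdigest()
        # Store the content only if this fingerprint has not been seen yet.
        self.blocks.setdefault(fingerprint, content)
        self.pointers[name] = fingerprint

    def load(self, name):
        return self.blocks[self.pointers[name]]

store = DedupeStore()
store.save("logo_marketing.png", b"same image bytes")
store.save("logo_sales.png", b"same image bytes")   # duplicate content
store.save("report.txt", b"quarterly numbers")
```

Three files are saved, but only two copies of content are stored; both logo files point at the same stored block.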

What is meant by the term "last mile"? Why should an organization care about the last mile?

Describes the portion of the telecommunications network chain that physically reaches the end user's premises. The core of the internet is fast enough for today's needs, but when you get to the last mile, where it enters a home or business, the speed plummets. Why care? Bandwidth: how much data you can fit through in a certain period of time; a decent amount is measured in megabits/second.

Server

Device attached to the network whose primary purpose is to provide a service to users/workstations.

What is DAS? What problem does it solve?

Distributed Antenna System: used to improve wireless signals in an indoor or outdoor space, essentially anywhere with an obstructed signal. Multiple small antennas are installed inside a building to boost the cellular signal there.

DNS - What does the acronym mean? What does it do for us?

Domain Name System-DNS translates domain names to IP addresses so browsers can load Internet resources. A global distributed system of servers, software, and protocols that enable us to convert the billions of different host names into the appropriate IP numbers so that the computers and routers can get the work done. All we need to know is the name

Dark Data

Dormant data that is spread across servers on incompatible systems where it can not be turned into anything of value

What is the rate of data growth?

Doubling every 6 months... unprecedented, and it will continue regardless of budget constraints.

How do remote offices connect (VPN)?

Each office has a corporate router attached to service provider; router has VPN software built into it. All routers at different offices configured to be single VPN: seems as if they are on the same LAN. Encrypts data before sending it to other offices. Router can also send unencrypted information when going across internet.

Fiber Optic cable is non-conducting - why is that good?

Good for connecting buildings together: during a lightning strike, the voltage will not travel along the cable and ruin the system.

Top CIOs say that data growth is the #1 challenge today. What two problems arise from that challenge?

1. How are we going to handle that explosive growth with constrained budgets? 2. How are we going to exploit that data? Process and present it to managers so they can make decisions.

What is the "backhoe problem"? How do you protect your network against this problem?

If every network is connected by a single cable problems can arise where you lose connectivity. You want redundancy in lines, meaning you want multiple paths of connection out of the building so that if something happens to one cable it does not lose connectivity.

Knowledge

Insight derived from experience and enterprise-savvy information

IP

Internet Protocol. The main delivery system for information over the Internet; enables billions of devices on this planet to communicate at high speed around the world.

Satellite wireless - what is "latency"?

It is the amount of delay, measured in milliseconds (ms), that occurs in a round-trip data transmission. For a satellite about 22,000 miles away in space, the round trip takes about ½ second, even at the speed of light.
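
The ½ second figure can be checked with a little arithmetic. This sketch assumes the satellite is directly overhead and that a round trip crosses the gap four times (request up, request down to a ground station, reply up, reply down); it ignores all processing delay. The geostationary altitude of 35,786 km (about 22,236 miles) is the standard figure, the 547 km value is the 340 miles quoted for Starlink elsewhere in this set, and the function name is made up for the example.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786   # geostationary orbit, about 22,236 miles
LEO_ALTITUDE_KM = 547      # about 340 miles

def round_trip_ms(altitude_km, hops=4):
    """Signal travel time in ms: up and down for the request, then for the reply."""
    return hops * altitude_km / SPEED_OF_LIGHT_KM_S * 1000

geo_ms = round_trip_ms(GEO_ALTITUDE_KM)   # roughly 477 ms
leo_ms = round_trip_ms(LEO_ALTITUDE_KM)   # under 10 ms
```

Distance alone accounts for nearly half a second on a geostationary link, which is why low orbits matter for latency.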

Briefly explain the differences between Low-Earth Orbit (LEO) and Geosynchronous orbit

LEO: satellites are much smaller and their orbits are much closer to Earth (altitude of 160 to 2,000 km), so the rockets needed to launch them are also smaller and cheaper. The downside is that many are needed to cover any specific geographical area. LEO satellites orbit the Earth many times per day. Geosynchronous orbit (GSO): the satellite stays in the same place above Earth as Earth turns, meaning it takes one day to complete one orbit. Not many parking spots in orbit. Although GEO satellites are bigger and more expensive to deploy, the network operator can add to its coverage gradually as its business grows. It takes a long time to get data out there and back: 65x farther away than Starlink.

LAN

Local Area Network; a network that covers a relatively small geographic area such as a building or a small campus, with no more than a mile between computers. Ethernet (protocol) -Physical: wires, radio waves -MAC address -Packet structure -Rules for "speaking" and "listening" -802.11

MAC vs. IP addresses -- what do they do? how are they different?

MAC address (Media Access Control): unique number that identifies the network device. Totally unique, tells nothing about where the computer is located, and does not change (like an SSN). IP address: address assigned uniquely to every device on the internet; LAN side (local network). Tells uniquely who it is and where they are, and changes as you move from place to place (like a mailing address).

For current 802.11ac, what is the max bandwidth and radio frequency spectra?

-Max bandwidth: 433 Mbps to 6.77 Gbps -Radio frequency: 2.4 and 5 GHz

How do loyalty cards generate valuable data?

A membership program in which the company pays you, through bonuses, for data about you that you otherwise would not give them. The company wants to know WHAT was sold to WHOM.

Net Neutrality - what's the basic issue? Who is on each side of the issue and why?

Net neutrality is the idea that internet service providers like Comcast and Verizon should treat all content flowing through their cables and cell towers equally. That means they shouldn't be able to slide some data into "fast lanes" while blocking or otherwise discriminating against other material. In other words, these companies shouldn't be able to block you from accessing a service like Skype, or slow down Netflix or Hulu, in order to encourage you to keep your cable package or buy a different video-streaming service.

What is a table?

Organized collection of data made up of records and fields

Key field

Part of relational database One of the fields in a table, data items are unique

Field

Part of table Column Attribute for data (fixed schema-textual data) Example: address

Record

Part of table Row of data individual observation

What common devices can interfere with WiFi networks? Which radio spectrum is affected?

Phones, Microwaves, etc can interfere because they use 2.4 GHz

Be familiar with the three examples of big data provided in the lecture. How do you see the Three V's in each?

Predictive Policing-Los Angeles Big Data is Cool! Tesco grocery chain Actions speak louder than words

PIG

Programming language of Hadoop

Two basic versions of RAID technology

RAID 1 RAID 5

Velocity

Rapid arrival=too fast. Cannot react fast enough. Feedback Loop-data comes in and we need to get it into a system and process it

Relational Database

Real power when we correlate data from multiple tables and link them together -multiple tables that are related

RAID - what risk does this protect against?

Redundant array of inexpensive disks. Protects against hard drive failure.

What is Cybersquatting?

Registering a domain [URL] that you have no rights to [ex: auburntigers.com]

Canned Reports

Reports that provide regular summaries of information in a predetermined format. -answer specific questions -easy for users -IT overhead

Point of Sale Systems

Retail computer systems that collect sales data and are hooked directly into the store's inventory-control system Scan barcode, transaction happens. Data.

How is a Data Mart different from a Data Warehouse?

Same thing, different scale- Looks at specific problem/unit rather than the enterprise

What can a company do about this problem?

Separate data repository -One for operational data -One for reporting and analytics Combine data from many sources-cleaning it Historical data-builds as months and days go by; used as a resource to see trends Periodic import from operational systems-allows analytical system to be up to date enough to come up with inferences

What are "data silos"? How do they come into being? Why is this a problem?

Silo-implying that data collections are completely separated with no possibility of communication or sharing -company may have some data that is trapped inside of obsolete legacy systems -incompatible systems Problem-Missed opportunities to see patterns, trends, correlations ,develop new insights to answer questions and make decisions.

What is an SSD? How does it address these problems of data growth?

Solid State Drive. Storage that uses flash memory. -Faster than magnetic hard drives -Lower latency (latency is the amount of time you have to wait for the data to spin to a place where you can read it) -Greater throughput -Lower power consumption: less electricity, cheaper, less heat generated = less AC -RAID: use it to link multiple SSDs together, letting them share the workload by spreading data among them; improved performance and capacity -Prices dropping: a viable alternative for many forms of corporate data

VPN (Virtual Private Network)

Solution to when WAN and internet cannot be used for business. All data is encrypted. Network that uses a public telecommunication infrastructure, such as the Internet, to provide remote offices or individual users with secure access to their organization's network. Affordable!

HDFS

Stands for Hadoop Distributed File System and is the way that Hadoop structures its files.

Briefly describe the Starlink service and the "constellation" they are building

Starlink is a plan by SpaceX to put 12,000 satellites (a constellation of satellites) into low Earth orbit (LEO) to offer high-speed, low-latency, cheap internet access to anyone anywhere on the planet. Unlimited data for a cheap price; the beta out right now is known as "Better Than Nothing." The satellites are placed 340 miles up.

What is SQL?

Structured Query Language. Most common language for creating and manipulating databases. Ruling champion database language in the business world.
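
A short sketch of SQL in action, using Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and their contents are invented for this example; the point is that `customer_id` is the key field in one table and appears again in the other, which is what lets the two related tables be linked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database that lives in RAM
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER,
                         total REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Link the two tables on the shared field and total each customer's orders.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
conn.close()
```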

SCM

Supply Chain Management System- Each order for finished goods/raw materials=transactions

Sources of customer-provided data

Surveys -Customer surveys -Product registration cards -Contests. External sources -General info (weather, news) -Public records

What is a "site survey"? Why is it important?

Set up wireless devices in temporary locations, then walk around the site with a sensor to see where signal levels are weak and strong, and tweak the locations until there is good coverage everywhere. Important for business installations and large sites.

Hadoop

Technically: an open source system designed to consume any data you want (unstructured, structured, etc.); a distributed computing platform. Practically: highly scalable, open source, cost-effective, flexible, fault-tolerant.

Bandwidth

The amount of data that can be transmitted over a network in a given amount of time. (bits per second)
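
A quick way to make bandwidth concrete is to work out how long a file takes to move at a given rate. This sketch assumes ideal conditions (no protocol overhead or congestion) and uses decimal units, 1 megabyte = 8,000,000 bits; the function name `transfer_seconds` is made up for the example.

```python
def transfer_seconds(file_megabytes, bandwidth_mbps):
    """Ideal transfer time: file size in bits divided by bits per second."""
    bits = file_megabytes * 8_000_000
    return bits / (bandwidth_mbps * 1_000_000)

# The same 100 MB file at two of the WiFi speeds from this set.
slow = transfer_seconds(100, 11)   # 802.11b at 11 Mbps: a bit over 72 seconds
fast = transfer_seconds(100, 54)   # 802.11g at 54 Mbps: a bit under 15 seconds
```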

For remote office and mobile user VPNs, identify where on the network diagram the data is encrypted.

The data is encrypted immediately after being sent from remote office/mobile user VPN.

"CAT" ratings for Ethernet cables

The number tells you specifically how the cable has been engineered and how fast it can safely transmit data. CAT 5 is the minimum remotely acceptable quality.

Packet

The small unit into which information is broken down before being sent across a network.

What guidance did Mr. Olson offer about WiFi range?

They are radio waves they don't like things that get in their way like walls. Maybe 100 feet indoors. More outdoors

What is "information overload"? What is its alleged impact?

Tumult side 900 billion cost to economy "Drinking from a fire hydrant"

RAID 1 (mirroring)

Two drives are used in unison, and all data is written to both drives, giving you a mirror or extra copy of the data, in the case that one drive fails

URL - what are the component parts and what does each do for you?

Uniform Resource Locator -Tells our web browser (our software) what it is we are looking for -The way we can tell where any resource is on the internet. Components of http://www.nytimes.com/tech/index.html: Application transfer protocol {http}: tells the software what the data is (podcast, webpage, video, etc.). Hostname {www.}: name of the server that has the data. Domain name {nytimes}: tells what organization on the network owns that host; consists of two or three parts. Top-level domain {.com}: fixed. Path {tech}: tells where within the server the piece of data lives (folder in the file system). File {index.html}: specific piece of content (case sensitive).
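
The same breakdown can be done with Python's standard `urllib.parse` module. Note that the library calls the protocol part the "scheme" and treats the full `www.nytimes.com` as the hostname; splitting the hostname on dots into three pieces, as done below, is a shortcut that only fits simple addresses like this one.

```python
from urllib.parse import urlparse

parts = urlparse("http://www.nytimes.com/tech/index.html")

scheme = parts.scheme    # protocol: "http"
host = parts.hostname    # full host: "www.nytimes.com"
path = parts.path        # "/tech/index.html"

# Shortcut for this simple example only: host name, domain, top-level domain.
host_name, domain, top_level = host.split(".")

# Separate the folder from the file at the last slash.
folder, _, file_name = path.rpartition("/")
```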

UPS - what risk does this protect against? how does it work?

Uninterruptible Power Supply -Protects against power outages -A battery with electronics in it, constantly charged from the electricity in the wall. Clean, filtered power is then provided to the servers. If power is gone altogether, the batteries provide the power needed to keep the servers running.

Copper UTP

Unshielded Twisted Pair: the cable contains 4 pairs of 2 wires each; each pair is twisted at a different rate so that the signals going down one pair do not interfere with the signals going down another pair -Star topology: each cable goes to one node -Distance: cables can be up to 90 m in the wall, 10 m to equipment (short runs) -Quality: "CAT rating" tells us how the cable was engineered and how fast it can safely transmit data -Installation: not DIY; professional installation with test results for every single cable

What is the single biggest cause of data loss? How do you protect against that risk?

User error. Back everything up. Data and image backup. Image is a snapshot of the entire contents of the hard drive so that if the server crashes you can restore the server image to a previous image in time. Back up systems- tape, disk, software

Pros of ad hoc reporting tools

Users define their own reports. Powerful/flexible.

Fiber Optic Cabling

Uses glass or plastic fiber to carry information as light pulses -Long runs: connecting buildings together, etc. -Multimode fiber: up to 550 meters -Single-mode fiber: 5 km to 40 km -Made of glass: does not conduct electricity -Network infrastructure: between buildings, closets

Variety

Variety too great; too little consistency: Text...Images...Sound...Video...Human input...sensors...servers...

What three characteristics are necessary for something to be "Big Data"? (three V's)

Volume, Velocity, Variety

How do mobile users connect to a VPN?

We do not trust alternate routers/internet. Install VPN software on mobile devices (CEO's laptop). VPN then encrypts data and then sends to hotel internet and then to corporate router in another state and then decrypts info.

WAN-What is it and how is it constructed?

Wide Area Network- dedicated private data circuits that stretch from state to state to different offices. Redundant, high speed, expensive. Does not work for mobile users, only remote offices.

WAN

Wide Area Network; largest type of network in terms of geographic area; largest WAN is the Internet

Ethernet

a system for connecting a number of computer systems to form a local area network, with protocols to control the passing of information and to avoid simultaneous transmission by two or more systems.

Analytics

a term describing the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions -Base decisions on data and analysis

Peer-to-Peer service model

all machines are equal (considered servers and clients at the same time) Everyone is a client and a server. Every computer could share its resources and provide services with the others. Hard to manage especially as you scale up. Reduced security and reliability.

Node

any device connected to a network

How does the analysis of operational data compete with customers?

delays and lost sales due to significant amount of additional load to the system during business hours (best if we do not query operational data)

Pros of Canned Reports

easy + useful

Cons of Canned Reports

inflexible + IT overhead

Volume

Notion that data is "too big" to be analyzed with traditional methods (hundreds of millions of data items)

Satellite wireless

radio transmission systems in space

Data

raw facts and figures-tells you nothing alone-very valuable, but needs to be turned into information. Data integrity is key and you must understand your data schema

Data Mining

the process of analyzing data to extract information not offered by the raw data alone -Enormous historical datasets -Identify patterns -Build Models -Predict Future

