BCIS unit 3 & 4

Primary Key

a field or set of fields that uniquely identifies the record
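
As a minimal sketch of what "uniquely identifies" means in practice, the example below uses Python's built-in sqlite3 module. The `student` table and its columns are hypothetical names chosen only for illustration; the point is that the DBMS refuses a second record with the same key.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Ada')")

try:
    # Reusing key 1 would break uniqueness, so the DBMS rejects the row.
    conn.execute("INSERT INTO student VALUES (1, 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```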

Hadoop two primary components

- A data processing component (MapReduce) - A distributed file system (Hadoop Distributed File System, HDFS)
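
The MapReduce idea can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps only; it does not use Hadoop or any Hadoop API, and the function names are assumptions made for the example.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each key's values into a single result."""
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data big value", "data lake"]
pairs = [pair for line in lines for pair in map_phase(line)]
word_counts = reduce_phase(shuffle(pairs))
```

In a real cluster the map and reduce steps would run in parallel on many machines, with the input lines read from the distributed file system.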

In-memory database (IMDB)

- A database management system that stores the entire database in random access memory (RAM) - Provides access to data at rates much faster than storing data on some form of secondary storage - Enables the analysis of big data and other challenging data-processing applications - Performs best on multiple multicore CPUs

Private Cloud Computing

- A single-tenant cloud - Organizations often implement one out of concern that their data will not be secure in a public cloud - Can be divided into two types: on-premise and service provider managed

Wi-Fi

- A wireless telecommunications technology brand owned by the Wi-Fi Alliance - The user's device has a wireless adapter that translates data into a radio signal and transmits it using an antenna - A wireless access point (a transmitter with an antenna) receives the signal and decodes it; in the other direction, it translates data into a radio signal and sends it to the device's wireless adapter

Three most common network topologies

- star network - bus network - mesh network

Software Defined Networking (SDN)

- An emerging approach to networking - Allows network administrators to have programmable central control of the network via a controller without requiring physical access to all the network devices - Google is implementing Andromeda, the underlying SDN architecture that will enable Google's cloud computing services to scale better, more cheaply, and more quickly

ACID properties

- Atomicity: all changes to data are performed as if they are a single operation. - Consistency: data is in a consistent state when a transaction starts and when it ends (everything adds up) - Isolation: the intermediate state of a transaction is invisible to other transactions, all transactions are separate. - Durability: After successful transaction, no changes will be undone.
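
Atomicity is the easiest of the four properties to demonstrate. The sketch below uses Python's sqlite3 module with a hypothetical `account` table: a transfer that would overdraw an account fails partway through, and the already-applied credit is rolled back with it. The table, names, and amounts are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the consistency rule: no negative balances.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money as one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 500)  # alice only has 100
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

Even though the credit to bob ran first, the failed debit undoes it, so both balances are unchanged.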

Affiliation IDs and their affiliations

- Biz: business sites - Com: All types of entities including nonprofits, schools, and private individuals (ex: Google's domain name is google.com, so its affiliation ID is com) - Edu: Post-secondary educational sites - Gov: Government sites - Net: Networking sites - Org: Nonprofit organization sites

considerations when building a database

- Content: what data should be collected? cost? - Access: what data should be provided to which users and when? - Logical structure: how should data be arranged so that it makes sense? - Physical organization: where should data be physically located? - Archiving: how long to store? - Security: how can data be protected?

Two broad categories of communications media

- Guided (Wired) transmission media - Wireless

Challenges of Big Data

- How to choose what subset of the data to store - Where and how to store the data - How to find the nuggets of data that are relevant to the decision making at hand - How to derive value from the relevant data - How to identify which data needs to be protected from unauthorized access

Examples of big data

- Retail organizations monitor social networks to engage brand advocates, identify brand adversaries - Advertising and marketing agencies track comments on social media - Hospitals analyze medical data and patient records - Consumer product companies monitor social networks to gain insight into consumer behavior - Financial service organizations use data to identify customers who are likely to be attracted to increasingly targeted and sophisticated offers

Accessing the Internet

- There are several ways, including using a LAN Server, telephone lines, a high-speed service, or a wireless network - Dial-up internet connection uses modem and standard phone line - Other options include cable modem connections, DSL connections, and satellite connections

Guided transmission media types

- Twisted-pair wire - Coaxial cable - Fiber-optic cable

NoSQL database advantages

- ability to spread data over multiple servers so that each server contains only a subset of the total data - do not require a predefined schema - data structures are more flexible and can provide improved access speed and redundancy

Types of IoT applications

- connect and monitor - control and react - predict and adapt - transform and explore

Data management factors

- the need to meet external regulations designed to manage risk associated with financial misstatement - the need to avoid the inadvertent release of sensitive data - the need to ensure that high data quality is available for key decisions

data governance requires business leadership and active participation

- use of a cross-functional team is recommended - team should consist of executives, project managers, line-of-business managers, and data stewards - a data steward is an individual responsible for management of critical data elements

ETL process

- Extract - Transform - Load
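
The three steps (extract, transform, load) can be sketched end to end with the standard library. The CSV text, the `sales` table, and the cleaning rules below are all assumptions made for the example, not part of any particular tool.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system.
raw_csv = "name,amount\n alice ,10.5\nBOB,n/a\ncarol,7\n"

def extract(text):
    """Extract: read the raw rows out of the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize names and drop rows whose amount is not numeric."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        clean.append((row["name"].strip().title(), amount))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(raw_csv)))
loaded = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
```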

How the Internet of Things (IoT) works

1. Sensors gather data 2. Data passes over network 3. Data from across the IoT is gathered and stored, often in the cloud 4. Data is combined with other data from other systems 5. Data is analyzed to gain insights into operation of devices on IoT 6. Alerts are sent to people, enterprise systems, or IoT devices based on these insights
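
The steps above can be sketched as a tiny in-memory pipeline. The device names, readings, and temperature threshold are hypothetical; a real deployment would gather data over a network and store it in a cloud service rather than a Python dictionary.

```python
from statistics import mean

# Steps 1-2: readings as they might arrive from sensors over the network.
readings = [
    {"device": "pump-1", "temp_c": 71.0},
    {"device": "pump-1", "temp_c": 74.0},
    {"device": "pump-2", "temp_c": 55.0},
]
THRESHOLD_C = 70.0  # assumed alert threshold

def store(readings):
    """Steps 3-4: gather readings per device in one place."""
    by_device = {}
    for reading in readings:
        by_device.setdefault(reading["device"], []).append(reading["temp_c"])
    return by_device

def analyze(by_device):
    """Step 5: derive an insight, here the average temperature per device."""
    return {device: mean(values) for device, values in by_device.items()}

def alerts(averages, threshold):
    """Step 6: decide which devices should trigger an alert."""
    return [device for device, avg in averages.items() if avg > threshold]

alert_devices = alerts(analyze(store(readings)), THRESHOLD_C)
```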

IP address

A number that identifies a computer on the Internet: 32 bits long in IPv4, 128 bits long in IPv6
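
Python's standard ipaddress module makes the "address is just a number" idea concrete: an IPv4 address is a 32-bit integer and an IPv6 address is a 128-bit integer. The addresses below come from ranges reserved for documentation.

```python
import ipaddress

addr = ipaddress.ip_address("192.0.2.1")
as_int = int(addr)                # the same address as a plain integer
v4_bits = addr.max_prefixlen      # address length in bits for IPv4

v6_addr = ipaddress.ip_address("2001:db8::1")
v6_bits = v6_addr.max_prefixlen   # address length in bits for IPv6
```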

Uniform Resource Locator (URL)

A Web address that specifies the exact location of a Web page using letters and words that map to an IP address and a host location
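
A URL can be pulled apart into its pieces with Python's urllib.parse. The URL below is a made-up example on the reserved example.com domain.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://www.example.com/courses/bcis/index.html")

scheme = parts.scheme    # protocol used to reach the page
host = parts.hostname    # the name that maps to an IP address
path = parts.path        # location of the page on that host
```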

HTML Tag

A code that tells the Web browser how to format text (as a heading, as a list, or as body text) and whether images, sound, and other elements should be inserted - ex: <p style="text-align: center"> centers the text of a paragraph

Data Definition Language (DDL)

A collection of instructions and commands used to define and describe data and relationships in a specific database. Allows the database's creator to describe the data and relationships that are to be contained in the schema.
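
In SQL, the DDL statements are the ones such as CREATE TABLE that describe structure rather than contents. A minimal sketch using Python's sqlite3 module, with hypothetical `department` and `employee` tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define two tables and the relationship between them.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER REFERENCES department(dept_id)
);
""")

# SQLite records the resulting schema in its sqlite_master catalog table.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```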

Cloud Computing

A computing environment where software and storage are provided as an Internet service and are accessed with a Web browser. - advantages to businesses: they can save on system design, installation, and maintenance; they can increase efficiency and reduce the costs of new product and service launches; employees can access corporate systems from any Internet-connected computer.

Client

any computer that sends messages requesting services from the servers on the network

data warehouse

A large database that collects business information from many sources in the enterprise, covering all aspects of the company's processes, products, and customers, in support of management decision making. helps relate information in innovative ways.

Bus Network

A network in which all network devices are connected to a common backbone that serves as a shared communications medium. Long backbone with network nodes branching off of it.

Star Network

A network in which all network devices connect to one another through a single central device called the hub node. The hub node sits in the center with network nodes branching out from it.

Internet of Things (IoT)

A network of physical objects or "things" embedded with sensors, processors, software, and network connectivity capability to enable them to exchange data with the manufacturer of the device, device operators, and other connected devices.

Mesh Network

A network that uses multiple access points to link a series of devices that speak to each other to form a network connection across a large area. Lots of communicating network nodes.

Data Lifecycle Management (DLM)

A policy-based approach to managing the flow of an enterprise's data, from its initial acquisition or creation and storage to the time when it becomes outdated and is deleted.

JavaScript

A popular programming language for client-side applications - Used to create Web pages that respond to user actions

Search Engine Optimization (SEO)

A process for driving traffic to a Web site by using techniques that improve the site's ranking in search results.

Public Cloud Computing

A service provider owns and manages the infrastructure, with cloud user organizations (tenants) accessing slices of shared hardware resources via the Internet - Can be faster, cheaper, and more agile than building and managing your own IT infrastructure - Data security is a key concern because when using a public cloud computing service, you are relying on someone else to safeguard your data

Hadoop

An open-source software framework that includes several software modules that provide a means for storing and processing extremely large data sets. Can be used as a staging area for data to be loaded into a data warehouse or data mart.

Data Manipulation Language (DML)

A specific language, provided with a DBMS, which allows users to access and modify the data, to make queries, and to generate reports.
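
In SQL, the DML statements are INSERT, UPDATE, DELETE, and SELECT. The sketch below runs each once against a hypothetical `product` table using Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (sku TEXT PRIMARY KEY, price REAL)")

# Add data.
conn.executemany("INSERT INTO product VALUES (?, ?)", [("A1", 9.99), ("B2", 20.0)])
# Modify data.
conn.execute("UPDATE product SET price = 8.0 WHERE sku = 'A1'")
# Remove data.
conn.execute("DELETE FROM product WHERE sku = 'B2'")
# Query data.
rows = conn.execute("SELECT sku, price FROM product").fetchall()
```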

Bluetooth

A wireless communications specification that describes how cell phones, computers, personal digital assistants, etc., can be interconnected

Other programming languages used for Web development include:

ASP.NET, C, C++, Perl, PHP, and Python

Popular tools for creating Web pages and managing Web sites

Adobe Dreamweaver, RapidWeaver (for Mac developers), and Nvu

data management

An integrated set of functions that defines the processes by which data is obtained, certified fit for use, stored, secured, and processed in such a way as to ensure that the accessibility, reliability, and timeliness of the data meet the needs of the data users within an organization.

Java

An object-oriented programming language from Sun Microsystems based on C++ - Allows small programs (applets) to be embedded within an HTML document - Can be used on any computer

ARPANET (Advanced Research Projects Agency Network)

Ancestor of the Internet - Project started by the U.S. Department of Defense (DoD) in 1969

Internet Service Provider (ISP)

Any organization that provides Internet access to people.

Examples of using sensors and the IoT to monitor and control key operational activities

Asset monitoring, Construction, Agriculture, Manufacturing, Monitoring parking spaces, Predictive maintenance, Retailing, Traffic monitoring

Amazon Web Services (AWS)

Basic infrastructure that Amazon employs to make the contents of its online catalog available to other Web sites or software applications

HTML file

Pulls together a CSS file (fonts, colors, layout) and an XML file (content) to make up a Web page

On-premise private cloud

Cloud infrastructure is deployed by an organization in its own data centers on its premises - Provides complete control over the infrastructure and data - Enables standardization of IT resources, processes, and services

Connecting via LAN server

Connection method of businesses and organizations that manage a local area network (LAN)

The World Wide Web (Web)

Consists of server and client software, the hypertext transfer protocol (http), standards, and markup languages that combine to deliver information and services over the Internet

Sources of an organization's useful data

Documents, Data from business apps, Social media, Sensor data, Media, Machine log data, Public data, and Archives that make up the organization's Big Data.

traditional approach to data management

Each distinct operational system used data files dedicated to that system

Hyperlink

Highlighted text or graphics in a Web document that, when clicked, opens a new Web page containing related content. - Using these, Web users can jump between Web Pages stored on various Web servers, creating the illusion of interacting with one big computer

database approach to data management

Information systems share a pool of related data - Offers the ability to share data and information resources - A database management system (DBMS) is required

instant messaging

The online, real-time communication between two or more people who are connected via the Internet.

wireless connection

Internet service over cellular and Wi-Fi networks has become common

predict and adapt

IoT application - Degree of sensing: External data is used to augment sensor data - Degree of action: Data used to perform predictive analysis and initiate preemptive action

Control and react

IoT application - Degree of sensing: Individual devices each gathering a small amount of data - Degree of action: Automatic monitoring combined with remote control with trend analysis and reporting

Transform and explore

IoT application - Degree of sensing: Sensor and external data used to provide new insights - Degree of action: New business models, products, and services are created

connect and monitor

IoT application - Degree of sensing: individual devices each gathering a small amount of data - Degree of Action: Enables manual monitoring using simple threshold-based exception alerting

mobile device management (MDM) software

Manages and troubleshoots mobile devices remotely, pushing out applications, data, patches, and settings - A central control group can maintain group policies for security, control system settings, ensure malware protection is in place for mobile devices used across the network, and make it mandatory to use passwords to access the network

relational DBMS for individuals and workgroups

Microsoft Access, IBM Lotus Approach, Google Base, OpenOffice Base

Open source relational DBMS

MySQL, PostgreSQL, MariaDB, SQLite; CouchDB is also open source but is a document-oriented NoSQL database rather than a relational one

The Internet and the Web have provided online access to:

News, Education and training, Job information, messaging, Conferencing, blogging, podcasting, vlogging, media and entertainment, music, TV, Games, shopping, and Maps

Internet backbone

One of the Internet's high-speed, long-distance communications links.

relational DBMS for workgroups and enterprise

Oracle, IBM DB2, Sybase Adaptive Server, Teradata, Microsoft SQL Server, Progress OpenEdge

Network Management Software

Protects software from being copied, modified, or downloaded illegally - Locates telecommunications errors and potential network problems

NoSQL database

Provides a means to store and retrieve data that is modeled using some means other than the simple two-dimensional tabular relations used in relational databases

Database Activities

- Providing a user view of the database - Adding and modifying data - Storing and retrieving data - Manipulating the data and generating reports

Database Administrator (DBA)

Skilled and trained IS professionals who - work with users to define their data needs - apply database programming languages to craft a set of databases to meet those needs - test and evaluate databases - implement changes to improve their databases' performance - assure that data is secure from unauthorized access

the Social Web

Social networking Web sites enable users to share information about themselves and to find, meet, and converse with others

Internet Censorship

Some countries try to control Internet content and services

Web 2.0

The Web as a computing platform that supports software applications and the sharing of information among users.

Autonomic computing

The ability of IT systems to manage themselves and adapt to changes in the computing environment, business policies, and operating objectives. - Goal: To create complex systems that run themselves, while keeping the system's complexity invisible to the end user - Addresses four key functions: Self-Configuring, Self-Healing, Self-Optimizing, and Self-Protecting

Hotspot

The area covered by one or more interconnected wireless access points

Database as a Service (DaaS)

- The database is stored on a service provider's servers - The database is accessed by the client over a network, typically the Internet - Database administration is handled by the service provider - ex: Amazon Relational Database Service (Amazon RDS)

The internet size and impact

- The Internet is international in scope, with users on every continent - Internet sites have a profound impact on world politics - The number of worldwide Internet users is expected to continue growing

Extensible Markup Language (XML)

The markup language designed to transport and store data on the Web. - The key to Web services - Used within a Web page to describe and transfer data between Web service applications
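
A short sketch of XML carrying data between applications, parsed with Python's standard xml.etree.ElementTree. The `order` document and its attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML message describing an order.
doc = "<order id='42'><item sku='A1' qty='2'/><item sku='B2' qty='1'/></order>"

root = ET.fromstring(doc)
order_id = root.get("id")
# The tags describe the data, so the receiver can pick out what it needs.
items = [(item.get("sku"), int(item.get("qty"))) for item in root.findall("item")]
```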

channel bandwidth

The rate at which data is exchanged, usually measured in bits per second (bps)
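
Because bandwidth is quoted in bits per second while file sizes are usually quoted in bytes, a transfer-time estimate needs a factor of 8. The helper below is an idealized sketch that ignores protocol overhead and latency; the file size and link speed are example values.

```python
def transfer_seconds(size_bytes, bandwidth_bps):
    """Ideal time to send size_bytes over a link of bandwidth_bps bits per second."""
    return size_bytes * 8 / bandwidth_bps

# A 25 MB file over a 100 Mbps link.
seconds = transfer_seconds(25_000_000, 100_000_000)
```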

network topology

The shape or structure of a network, including the arrangement of the communication links and hardware devices on the network.

Hypertext Markup Language (HTML)

The standard page description language for Web pages. - tells the browser how to display font characteristics, paragraph formatting, page layout, image placement, hyperlinks, and the content of a web page

Storing and retrieving data

- When an application program needs data, it requests the data through the DBMS - Requesting the data through the DBMS is a process called querying - Concurrency control deals with the situation in which two or more users or applications need to access the same record at the same time
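
One common way to handle two users updating the same record is optimistic concurrency control with a version number; this is an illustrative technique chosen for the sketch, not the only approach a DBMS may use. The `seat` table and user names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seat (seat_id TEXT PRIMARY KEY, holder TEXT, version INTEGER)")
conn.execute("INSERT INTO seat VALUES ('12A', NULL, 0)")
conn.commit()

def reserve(conn, seat_id, holder, seen_version):
    """Update the record only if nobody changed it since we read seen_version."""
    cur = conn.execute(
        "UPDATE seat SET holder = ?, version = version + 1 "
        "WHERE seat_id = ? AND version = ?",
        (holder, seat_id, seen_version),
    )
    conn.commit()
    return cur.rowcount == 1

# Both users read the record at version 0, then both try to reserve it.
first = reserve(conn, "12A", "user_a", 0)
second = reserve(conn, "12A", "user_b", 0)  # stale version, so no row matches
holder = conn.execute("SELECT holder FROM seat WHERE seat_id = '12A'").fetchone()[0]
```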

Wireless Technologies

Wireless transmission involves the broadcast of communications in one of three frequency ranges: radio, microwave, or infrared. In some cases, use of wireless communications is regulated, meaning the signal must be broadcast within a specific frequency range to avoid interference with other wireless transmissions.

Radio frequency

Wireless transmission operating in the 3 kHz-300 MHz range - Advantages: Supports mobile users; costs are dropping - Disadvantages: Signal is highly susceptible to interception

microwave (terrestrial and satellite) frequency range

Wireless transmission operating with a high-frequency radio signal (300 MHz-300 GHz) sent through the atmosphere and space (often involves communications satellites) - Advantages: Avoids cost and effort to lay cable or wires; capable of high-speed transmission - Disadvantages: Must have unobstructed line of sight between sender and receiver; signal is highly susceptible to interception - Common forms of satellite communications: geostationary satellite and low earth orbit (LEO) satellite

Infrared frequency range

Wireless transmission that sends signals in the 300 GHz-400 THz frequency range - Advantages: Lets you move, remove, and install devices without expensive wiring - Disadvantages: Must have unobstructed line of sight between sender and receiver; transmission is effective only for short distances

Connection via internet service providers

You must have an account with the service provider along with software and devices that support a connection via TCP/IP

a bit

a binary digit that represents a circuit that is either on or off

attribute

a characteristic of an entity

Web site

a collection of pages on one particular topic, accessed under one Web domain

record

a collection of related data fields

file

a collection of related records

Schema

a description of the entire database. can be part of the database or a separate schema file. DBMS can reference a schema to find where to access the requested data in relation to another piece of data.

data dictionary

a detailed description of all the data used in the database - Can also include a description of data flows, information about the way records are organized, and the data-processing requirements

Sensor

a device that is capable of sensing something about its surroundings such as pressure, temperature, humidity, pH level, motion, vibration, or level of light

data model

a diagram of data entities and their relationships

Cascading Style Sheets (CSS)

a file or portion of an HTML file that defines the visual appearance of content in a Web page - uses special HTML tags to globally define characteristics for a variety of page elements as well as how those elements are laid out on the Web page

Database Management System (DBMS)

a group of programs that manipulate the database and provide an interface between the database and its users and other application programs. can produce a wide variety of documents, reports, and other output that can help organizations achieve their goals.

field

a name, number, or combination of characters that describes an aspect of a business object or activity

Extranet

a network based on Web technologies that links resources of a company's intranet with its customers, suppliers, or other business partners

Data Administrator (DA)

a nontechnical position responsible for defining and implementing consistent principles for a variety of data issues including setting data standards and data definitions that apply across all the databases in an organization. can be a high-level position reporting to top-level managers

entity

a person, place, or thing for which data is collected, stored, and maintained

broadband communications

a relative term; a telecommunications system that can transmit data very quickly

Virtual Private Network (VPN)

a secure connection between two points across the Internet

relational model

a simple but highly useful way to organize data into collections of two-dimensional tables called relations. each row in the table represents an entity and each column represents an attribute of that entity. if relations share at least one common attribute they can be linked to provide useful information.

SQL (Structured Query Language)

a special-purpose programming language for accessing and manipulating data stored in a relational database

data mart

a subset of a data warehouse that is used by small- and medium-sized businesses and departments within large companies to support decision making - a specific area in the data mart might contain more detailed data than the data warehouse

Near Field Communication (NFC)

a very short-range wireless connectivity technology designed for consumer electronics, cell phones, and credit cards

query by example (QBE)

a visual approach to developing database queries or requests

in 1986 SQL was

adopted by ANSI as the standard query language for relational databases.

Intranet

an internal corporate network built using internet and world wide web standards and technologies

Database

an organized collection of data. a collection of integrated and related files.

hierarchy of data

bits, characters, fields, records, files, and databases

Service provider managed private cloud

built and managed by a service provider, which provides guaranteed security of cloud information

manipulating data by joining

combining two or more tables

manipulating data by linking

combining two or more tables through common data attributes to form a new table with only the unique data attributes
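
Combining two tables through a common attribute can be sketched with a SQL JOIN, run here through Python's sqlite3 module. The `employee` and `department` tables and the shared `dept_id` attribute are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER, name TEXT, dept_id INTEGER);
CREATE TABLE department (dept_id INTEGER, dept_name TEXT);
INSERT INTO employee VALUES (1, 'Ada', 10), (2, 'Grace', 20);
INSERT INTO department VALUES (10, 'Research'), (20, 'Systems');
""")

# dept_id is the common attribute; the result keeps only the columns we ask for.
combined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()
```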

satellite transmission

communications satellites are relay stations that receive signals from one earth station and rebroadcast them to another

Hybrid Cloud Computing

composed of both private & public clouds integrated through networking - Organizations typically use the public cloud to run applications with less sensitive security requirements, and run more critical applications on the private portion of the cloud

SQL databases

conform to ACID properties

Local Area Network (LAN)

connects computer systems and devices within a small area such as an office or a home

Wide Area Network (WAN)

connects large geographic regions - Consists of computer equipment owned by the user, plus data communications equipment and telecommunications links provided by various carriers and service providers - Communications may involve transborder data flow

Metropolitan Area Network (MAN)

connects users and their devices in an area that spans a campus or city

Routing messages over the internet

data is transmitted from one host computer to another on the internet

Enterprise data modeling

data modeling done at the level of the entire enterprise. provides a roadmap for building databases and information systems.

entity-relationship (ER) diagrams

data models that use basic graphical symbols to show the organization of and relationships between data. help ensure that the logical structure of application programs is consistent with the data relationships in the database.

Data governance

defines the roles, responsibilities, and processes for ensuring that data can be trusted and used by the entire organization

manipulating data by projecting

eliminating columns in a table

manipulating data by selecting

eliminating rows according to certain criteria
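
Selecting and projecting are easy to see side by side when a table is modeled as a list of dictionaries. The `employees` data and the two helper functions below are assumptions made for this sketch.

```python
employees = [
    {"emp_id": 1, "name": "Ada", "dept": "Research", "salary": 120},
    {"emp_id": 2, "name": "Grace", "dept": "Systems", "salary": 110},
    {"emp_id": 3, "name": "Linus", "dept": "Systems", "salary": 95},
]

def select(rows, predicate):
    """Selecting: keep only rows that meet the criteria (eliminates rows)."""
    return [row for row in rows if predicate(row)]

def project(rows, columns):
    """Projecting: keep only the named columns (eliminates the others)."""
    return [{col: row[col] for col in columns} for row in rows]

systems_staff = select(employees, lambda row: row["dept"] == "Systems")
names_only = project(systems_staff, ["name"])
```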

Internet Protocol (IP)

enables computers to route communications traffic from one network to another

Big Data

extremely large and complex data collections - traditional data management software, hardware, and analysis processes are incapable of dealing with them. Three characteristics: Volume, Velocity, and Variety

client/server architecture

features multiple computer platforms dedicated to special functions, e.g., database management, printing, or communications

DBMS Front-end applications

interact directly with people

coaxial cable

guided transmission media type with inner conductor wire surrounded by insulation - Advantages: cleaner and faster data transmission than twisted-pair wire - Disadvantages: more expensive than twisted-pair wire

fiber-optic cable

guided transmission media type with many extremely thin strands of glass bound together in a sheathing; uses light beams to transmit signals - Advantages: diameter of cable is much smaller than coaxial, less distortion of signal, capable of high transmission rates - Disadvantages: expensive to purchase and install

Twisted-Pair wire

guided transmission media type with twisted pairs of copper wire, shielded or unshielded, used for telephone service - Advantages: widely available - Disadvantages: limitations on transmission speed and distance

Data Validation

identifying bad data and rejecting it at the time of data entry.
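
A minimal sketch of rejecting bad data at entry time. The order fields and the two rules (a non-empty customer ID and a positive whole-number quantity) are hypothetical; real systems would apply whatever rules the data's domain requires.

```python
def validate_order(record):
    """Return a list of problems; an empty list means the record is acceptable."""
    errors = []
    customer_id = record.get("customer_id")
    if not isinstance(customer_id, str) or not customer_id.strip():
        errors.append("customer_id is required")
    quantity = record.get("quantity")
    if not isinstance(quantity, int) or isinstance(quantity, bool) or quantity <= 0:
        errors.append("quantity must be a positive integer")
    return errors

accepted = []

def enter(record):
    """Store the record only if it passes validation; otherwise reject it."""
    errors = validate_order(record)
    if not errors:
        accepted.append(record)
    return errors

good_result = enter({"customer_id": "C7", "quantity": 3})
bad_result = enter({"customer_id": " ", "quantity": 0})
```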

search engine

information on the web is found by specifying keywords - the market is dominated by Google

DBMS Back-end applications

interact with other programs or applications ex- The Library of Congress (LOC) provides a back-end application that allows Web access to its databases, which include references to books and digital media in the LOC collection.

a byte

made up of 8 bits. each one represents a character.
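
The eight bits behind a character can be printed directly in Python. Note the assumption: one byte per character holds for ASCII text, while other encodings such as UTF-8 may use several bytes for a single character.

```python
char = "A"
code = ord(char)                 # the character's numeric code
bits = format(code, "08b")       # that code written as 8 binary digits
ascii_bytes = char.encode("ascii")

# A non-ASCII character needs more than one byte in UTF-8.
accented_bytes = "é".encode("utf-8")
```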

Microsoft's .NET platform

product that allows developers to use various programming languages to create and run programs - many other products make it easy to develop Web content and interconnect Web services as well

domain

range of allowable values for a data attribute

Internet Corporation for Assigned Names and Numbers (ICANN)

responsible for managing IP addresses and Internet domain names - domain names must adhere to strict rules

Database Server

sends only the data that meets a specific query—not the entire file

IP Protocol

set of rules used to pass packets from one host to another

Guided (wired) transmission media

signals are guided along a solid medium

Personal Area Network (PAN)

supports the interconnection of information technology close to one person. Personal and private accounts.

network operating system (NOS)

systems software that controls the computer systems and devices on a network ex- Linux, UNIX, Windows Server, and Mac OS X

Data lake

takes a "store everything" approach to big data, saving all the data in its raw and unaltered form - also called an enterprise data hub - raw data is available when users decide just how they want to use the data - only when the data is accessed for a specific analysis is it extracted from the data lake

Computer Network

the communications media, devices, and software needed to connect two or more computer systems or devices. Organizations can use networks to share hardware, programs, and databases.

Network nodes

the computers and devices on the networks

The internet

the infrastructure on which the Web exists - Made up of computers, network hardware such as routers and fiber-optic cables, software, and the TCP/IP protocols

Tunneling

the process by which VPNs transfer information by encapsulating traffic in IP packets over the Internet

Data Cleansing

the process of detecting and then correcting or deleting incomplete, incorrect, inaccurate, or irrelevant records that reside in a database. also called data cleaning or scrubbing. the cost can be quite high.
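
A small sketch of cleansing records that are already stored: formatting is corrected, and records that are incomplete, invalid, or duplicated are dropped. The record layout and the rules (an email must contain "@", duplicates are matched on email) are simplifying assumptions for illustration.

```python
def cleanse(records):
    """Correct fixable records and delete incomplete, invalid, or duplicate ones."""
    seen_emails = set()
    clean = []
    for record in records:
        email = (record.get("email") or "").strip().lower()
        if "@" not in email:
            continue  # incomplete or incorrect: delete
        if email in seen_emails:
            continue  # duplicate of a record already kept: delete
        seen_emails.add(email)
        # Correct inconsistent spacing and capitalization.
        clean.append({"name": record["name"].strip().title(), "email": email})
    return clean

records = [
    {"name": " ada lovelace", "email": "ADA@example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "Bob", "email": "not-an-email"},
    {"name": "grace hopper", "email": None},
]
cleaned = cleanse(records)
```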

wireless

the signal is broadcast over airwaves as a form of electromagnetic radiation

data item

the specific value of an attribute

Transmission Control Protocol (TCP)

the widely used transport layer protocol that most internet applications use with IP

Web Browser

Web client software used to view Web pages - ex: Internet Explorer, Firefox, Chrome, and Safari

