HSCI 391 Final

data misrepresentation

1. unconscious bias; 2. taking advantage of glitzy software, resulting in an inaccurate visualization; 3. a conscious desire to deceive the viewer of the visualization; 4. a poorly constructed visualization with insufficient attention to detail

relational

two or more interrelated tables; related tables always have a key field in common so that the records in those tables can be related to one another
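
As a minimal sketch of how a shared key field relates two tables, the example below uses Python's built-in `sqlite3` module. The table names, field names, and sample rows are made up for illustration and do not come from the course material.

```python
import sqlite3

# In-memory database with two related tables (names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (visit_id INTEGER PRIMARY KEY, patient_id INTEGER, reason TEXT);
    INSERT INTO patients VALUES (1, 'Ada'), (2, 'Ben');
    INSERT INTO visits VALUES (10, 1, 'checkup'), (11, 1, 'flu shot'), (12, 2, 'x-ray');
""")

# patient_id is the key field common to both tables; the JOIN uses it
# to match each visit record to its patient record.
rows = conn.execute("""
    SELECT p.name, v.reason
    FROM patients AS p
    JOIN visits AS v ON v.patient_id = p.patient_id
    ORDER BY v.visit_id
""").fetchall()

print(rows)
```

Without the common `patient_id` field there would be no way to tell which visit belongs to which patient.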

databases versus spreadsheets

an RDBMS requires the structure of the database to be created before any data is entered into the system; spreadsheets are blank slates

line chart

a type of chart that displays information as a series of data points, called 'markers', connected by straight line segments; among the most versatile graph types; best for illustrating time-series data, with time depicted left to right; depicted without clutter; requires little explanation

cluster column charts

a variation of a column chart that includes more than one category; typically needs a legend

user level security

access control to a file, printer, or other network resource based on username. it provides greater security than share-level security because users are identified individually or within a group

temporal

relating to time; allows us to identify patterns that we would not otherwise see

input masks

allow the database designer to force future end users to enter data in a specified format; they specify how data should be input
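
The idea of forcing a specified format can be sketched in Python with a regular expression. This is only an analogy: an Access input mask guides typing as it happens, whereas this check runs after the value is entered. The phone-number format and the function name `matches_mask` are assumptions chosen for illustration.

```python
import re

# Hypothetical mask: a phone number written as (555) 123-4567.
PHONE_MASK = re.compile(r"\(\d{3}\) \d{3}-\d{4}")

def matches_mask(value: str) -> bool:
    """Return True only if the whole value fits the required format."""
    return PHONE_MASK.fullmatch(value) is not None

print(matches_mask("(555) 123-4567"))  # fits the format
print(matches_mask("5551234567"))      # same digits, wrong format
```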

security methods for DBMS

anonymity; data partitioning/segmentation; encryption

text field

holds any alphanumeric character; sometimes called character or string fields; numbers can be entered, but no mathematical formulas are allowed

consideration when creating a database

pay attention to the initial design procedure to ensure consistency & integrity (hygiene, validation, rules); eliminate data redundancy; protect sensitive data from inappropriate access

data partitioning/segmentation

based on the process of splitting up a database into multiple partitions in order to improve performance or as a security precaution to individually secure the partitions and control database user access for each partition

data partitioning

based on the process of splitting up a database into multiple sections in order to improve performance or as a security precaution to individually secure and control database user access

relational database management systems

basic technology that can be used to support everything from accounting systems to electronic health records to a personal Christmas card list; Access allows users to create customized databases that contain custom-designed data entry screens, queries, reports, and user interface screens

logical fields

Boolean: Y/N, T/F, 1/0; used when a field can have only 2 possible values

column charts

can be used with time-series data; with too many data points there are too many bars and the chart becomes ineffective

memo field

can contain anything: documents, graphics, links

cell address

the combination of a column letter & row number; every cell is assigned a name, its cell address, which identifies its location in the spreadsheet; there are 2 types of cell address: relative & constant

stacked bar or column charts

communicating the relative proportions of 2 variables over time

databases

composed of a collection of one or more tables

tables

composed of records (rows)

range

a specified contiguous group of cells, ex. B2:B247

data validation in access

create a rule that is used to test the data input into a field; an error message pops up if the user tries to enter something that breaks the rule
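
A validation rule of this kind can be sketched with a `CHECK` constraint in SQLite, used here only as a stand-in for an Access validation rule. The table, the `age` field, and the 0 to 120 range are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CHECK clause is the rule that every value entered into age must pass.
conn.execute("""
    CREATE TABLE patients (
        name TEXT,
        age INTEGER CHECK (age BETWEEN 0 AND 120)
    )
""")

conn.execute("INSERT INTO patients VALUES ('Ada', 36)")  # passes the rule

try:
    conn.execute("INSERT INTO patients VALUES ('Ben', 250)")  # breaks the rule
    rejected = False
except sqlite3.IntegrityError as err:
    # The database refuses the record, much like a pop-up error message.
    rejected = True
    print("Rejected:", err)

stored = conn.execute("SELECT COUNT(*) FROM patients").fetchone()[0]
```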

what is the basic concept of spreadsheets?

data values are separated from the formula logic

what are the building blocks of an RDBMS?

database fields

under HIPAA: Safe Harbor Method

de-identifying patient data in fields

name

each field must be named; no 2 fields can have the same name

single-user desktop

easiest DB to deploy; the DB exists on only 1 person's computer

logic errors

errors in formulas that execute properly but don't produce the desired results; difficult to identify; follow the formula's logic step by step to see where the disconnect happens between the expectation & the result
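
A small Python sketch of a logic error, assuming two made-up scores whose average is wanted. It mirrors the spreadsheet mistake of typing =A1+B1/2 when =(A1+B1)/2 was intended: both versions run without any error message, but only one gives the intended result.

```python
first_score = 10
second_score = 20

# Intended: the average of the two scores.
# Logic error: division is evaluated before addition, so only second_score is halved.
wrong_average = first_score + second_score / 2

# Corrected: parentheses force the addition to happen first.
right_average = (first_score + second_score) / 2

print(wrong_average, right_average)
```

Stepping through the arithmetic by hand (20 / 2 = 10, then 10 + 10 = 20) shows where the result departs from the expected 15.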

cells are _____

essentially variables

spreadsheet standard review board

establish recommended guidelines for organizational use of spreadsheets & best practices for spreadsheet modeling

size

a field assigned a particular size will reserve that amount of capacity; stated in terms of how many characters the field will be able to hold

circular references

a formula incorrectly references its own results, or 2 or more formulas reference each other's results

records

info concerning a person, place, thing, or event

shapefiles

files that store information related to where someone or something is physically present; made popular by the Environmental Systems Research Institute (ESRI)

anecdotal data

information that doesn't necessarily have scientific validity

another type of data validation?

input masks

arguments

inputs & instructions needed for the function to do its job

indexing

keeps track of each record in a table & can change the order in which records appear; with an index key, even a massive change takes only about 30 seconds

what is the strongest data encryption?

key-exchange cryptography; not available in Access

best practices for spreadsheets

know the purpose; the purpose should be communicated to end users; use font colors (blue for constants/input cells, black for formulas); assumptions should never be located together w/ outputs; data validation techniques should be used whenever possible; complex formulas should be documented; use effective version control

clusters

larger than expected number of incidences of disease (crimes, car accidents, etc.) related by time & place

spreadsheets

limited # of rows (records); structure can be created while data is being input; can only support flat files; no structured report-writing capabilities; limited ability to create data queries

DBMS

limited only by the amount of disk space & the operating system; structure (logic) must be created first, then data can be input; can support complex relational DB structures; can create complex data queries using Boolean logic; requires planning

types of geography

lines, points, & polygons

pattern recognition

looking at numbers analytically to determine what interesting patterns or trends may exist

projection

mathematical algorithms that adjust coordinates such as latitude & longitude for the earth's curvature

Excel formulas

may be mathematical in nature, or may be used to manipulate text, evaluate logical conditions, or perform other non-mathematical tasks
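
As one concrete illustration, the sketch below implements the spherical Mercator projection, a common way to turn latitude & longitude into flat map coordinates. The choice of Mercator and the radius constant are assumptions made for the example; the flashcard does not name a specific projection.

```python
import math

# Assumed sphere radius in meters (the WGS84 equatorial radius).
EARTH_RADIUS_M = 6_378_137.0

def mercator(lat_deg: float, lon_deg: float) -> tuple[float, float]:
    """Project latitude/longitude in degrees to flat x/y coordinates in meters."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = EARTH_RADIUS_M * lon
    # Latitude is stretched more and more toward the poles.
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

x, y = mercator(45.0, -93.0)
print(round(x), round(y))
```

The formula is undefined at the poles (latitude of exactly 90 or -90 degrees), which is why Mercator maps cut off before reaching them.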

date fields

may be referred to as datetime formats; hold only dates; dates are set up in chronological order

functions

mini computer programs that perform very specific, specialized tasks within another program; input --> output; ex. =SUM()

selection process from preselected values

minimizes the amount of typing that must be done by the data operator; ex. zip code lookup table

syntax errors

mistakes in writing or typing the formulas and commands when creating a spreadsheet; often result in an error message, ex. ####, #DIV/0!, #VALUE!, #REF!, #NAME?, #N/A

required data entry

most DBMSs allow fields to be specified as required for data entry; the data operator cannot skip over these fields
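
In SQL-based systems a required field is commonly declared with `NOT NULL`. The sketch below uses SQLite as a stand-in; the `contacts` table and its fields are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# name is required (NOT NULL); phone is optional.
conn.execute("CREATE TABLE contacts (name TEXT NOT NULL, phone TEXT)")

# Leaving the optional field empty is allowed.
conn.execute("INSERT INTO contacts (name, phone) VALUES ('Ada', NULL)")

try:
    # Skipping the required field is refused by the database.
    conn.execute("INSERT INTO contacts (phone) VALUES ('555-0100')")
    skipped_ok = True
except sqlite3.IntegrityError:
    skipped_ok = False

print("Could the required field be skipped?", skipped_ok)
```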

anatomy of a function

name(arguments): the name describes the function; the parentheses () contain the arguments; commas separate the arguments

do text fields set up numbers in chronological order?

no

does access have user-level security?

no

cell contents include

numbers (dollars, dates, any quantitative value); labels (words & symbols); formulas (mathematical, or evaluating logical conditions)

circular reference

occurs when 1. a formula incorrectly references its own results, or 2. two or more formulas reference each other's results
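
Both cases can be viewed as a loop in the "which cell does this formula reference" graph. The sketch below finds such a loop with a depth-first search. The function name `find_cycle` and the dictionary layout are assumptions for illustration, not how any particular spreadsheet program implements its check.

```python
def find_cycle(deps):
    """deps maps a cell address to the list of cells its formula references.

    Returns one circular chain of references as a list, or None if there is none.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(cell, path):
        state[cell] = IN_PROGRESS
        path.append(cell)
        for ref in deps.get(cell, []):
            if state.get(ref, UNSEEN) == IN_PROGRESS:
                # ref is already on the current chain, so the chain loops back.
                return path[path.index(ref):] + [ref]
            if state.get(ref, UNSEEN) == UNSEEN:
                found = visit(ref, path)
                if found:
                    return found
        path.pop()
        state[cell] = DONE
        return None

    for cell in deps:
        if state.get(cell, UNSEEN) == UNSEEN:
            cycle = visit(cell, [])
            if cycle:
                return cycle
    return None

# Case 1: a formula in A1 that references A1 itself.
print(find_cycle({"A1": ["A1"]}))
# Case 2: A1 references B1, and B1 references A1.
print(find_cycle({"A1": ["B1"], "B1": ["A1"]}))
```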

field or column

piece of info within a record

version control

programmer keeps track of different versions of his or her code that accumulates as repairs & modifications are made to a program

3D pie chart

provides a more skewed view, or visual inaccuracy, when viewing data; pie charts are best for illustrating relative proportions between various components and poor for time-series data; stick to 2D designs; slices must add up to 100%

arguments that include constants must be enclosed w/ ______

quotation marks ex. =function("my name")

data visualization

refers to graphical or pictorial representation of data

attributes

refers to information about geographic features

spatial

relating to space; refers to information related to where someone or something is physically present

spatial query

relating or integrating different tables based on spatial characteristics, & using spatial statistics together with a database query to identify records meeting a criterion
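
One simple kind of spatial query is "find every record within a given distance of a point". The sketch below uses the haversine great-circle distance; the record layout, the function names, and the 6371 km mean earth radius are assumptions for illustration, and real GIS software offers many other spatial criteria.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points."""
    radius_km = 6371.0  # assumed mean earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def within_radius(records, lat, lon, radius_km):
    """Return the records located no farther than radius_km from (lat, lon)."""
    return [r for r in records if haversine_km(lat, lon, r["lat"], r["lon"]) <= radius_km]

# Hypothetical records with coordinates.
clinics = [
    {"name": "Clinic A", "lat": 0.0, "lon": 0.5},
    {"name": "Clinic B", "lat": 0.0, "lon": 2.0},
]
nearby = within_radius(clinics, 0.0, 0.0, 100.0)
print([c["name"] for c in nearby])
```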

anonymity

removing from a dataset all info that could make individuals identifiable (names, addresses, phone numbers); effective anonymity ensures all fields containing personal identifiers are removed from the database before it is distributed
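
A minimal sketch of stripping identifying fields before a dataset is shared. The field names and the sample record are made up, and this is not a complete de-identification procedure; it only shows the mechanical step of removing fields.

```python
# Fields treated as personal identifiers in this example (an assumption).
IDENTIFIER_FIELDS = {"name", "address", "phone"}

def anonymize(records):
    """Return copies of the records with every identifier field removed."""
    return [
        {field: value for field, value in record.items() if field not in IDENTIFIER_FIELDS}
        for record in records
    ]

patients = [
    {"name": "Ada Smith", "address": "1 Main St", "phone": "555-0100", "age": 36, "diagnosis": "flu"},
]
shared = anonymize(patients)
print(shared)
```

The original `patients` list is left unchanged, so the full records stay with the data owner while only `shared` is distributed.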

geographic centroids

represent the geometric center of the region

population centroids

represent the point inside of a region containing the greatest population density

multi-user access through LAN

requires all computers needing access to the DB to be connected by cable or wireless connection to the same router; a closed system with no internet; access to the LAN via the internet is possible through a VPN

multi-user access through WAN

similar to LAN; the server holding the software & database is directly accessible through the internet and is accessed through a web browser; robust system security measures must be applied to ensure the database is accessible to users & protected from unauthorized access

database accessibility

single-user desktop; multi-user access through LAN; multi-user access through WAN

numeric fields

sometimes called numbers or values; can only hold #s

features

specific places on a map; can be roads, regions, cities, zip codes, etc.

since Access doesn't have key-exchange cryptography, what is another way to protect info?

Symantec, PGP, and GPG tools

2 types of errors

syntax & logic

####

syntax error; column too small to display results

#DIV/0!

syntax error; dividing by 0

#REF!

syntax error; invalid cell address; the formula references a cell that no longer exists

#NAME?

syntax error; invalid cell address or name; the cell address was entered incorrectly in the formula

#VALUE!

syntax error; invalid value error; an invalid value is referenced within the formula

#N/A

syntax error; a result cannot be returned for the formula because there is no result

centroid

the coordinates representing the center of a region
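
For a region drawn as a simple polygon, the geometric center can be computed with the standard shoelace-based centroid formula, sketched below. The function name and the rectangle are illustrative assumptions; the sketch treats coordinates as flat x/y values and ignores the earth's curvature.

```python
def polygon_centroid(vertices):
    """Centroid of a simple polygon given as a list of (x, y) vertices in order."""
    twice_area = 0.0
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("polygon has zero area")
    # The usual 1 / (6 * area) factor equals 1 / (3 * twice_area).
    return cx / (3 * twice_area), cy / (3 * twice_area)

# A 4-by-2 rectangle; its center is at (2, 1).
print(polygon_centroid([(0, 0), (4, 0), (4, 2), (0, 2)]))
```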

geocoding

the process of determining the geographic coordinates of a specific location based on a street address or its existence within a known region; always an estimate; included in most GIS software as a batch process

layer

transparent overlays of the map each containing a different type of geography

type

the type of data the field will hold, ex. numeric field

what should you never do in Excel formulas?

use numbers (hard-coded values); always reference the cell address

planning a database

what tables will be needed and how they potentially relate to one another; what fields will be required & assigning the characteristics of each field

do databases require planning?

yes

encryption

a process by which digital information is converted into an unreadable state; based on the science of cryptography

data hygiene

the degree to which a computer database contains errors such as typos, data entry mistakes, transposed numbers, and outdated data elements

data validation

The process of ensuring that a program operates on clean, correct and useful data.

relative cells

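a cell reference that can change when the formula is copied; by default in Excel, all cell references are relative unless specified otherwise

constant cell

a cell reference that never changes, ex. $A$1; the $ locks whatever follows it, so $A1 locks only the column and A$1 locks only the row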
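
The difference shows up when a formula is copied to another cell. The sketch below imitates that behavior: parts of a reference without a $ shift along with the copy, while parts locked by a $ stay put. The function names are hypothetical, and the pattern only handles plain A1-style references (no sheet names, and it could misread a function name that ends in digits).

```python
import re

# Matches references such as A1, $A1, A$1, and $A$1.
REFERENCE = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")

def col_to_num(col):
    """Convert a column letter to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

def num_to_col(n):
    """Convert a column number back to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def copy_formula(formula, d_rows, d_cols):
    """Rewrite a formula as if it were copied d_rows down and d_cols to the right."""
    def shift(match):
        col_lock, col, row_lock, row = match.groups()
        if not col_lock:  # relative column: moves with the copy
            col = num_to_col(col_to_num(col) + d_cols)
        if not row_lock:  # relative row: moves with the copy
            row = str(int(row) + d_rows)
        return f"{col_lock}{col}{row_lock}{row}"
    return REFERENCE.sub(shift, formula)

# Copying one row down: A1 becomes A2, while $B$1 is unchanged.
print(copy_formula("=A1*$B$1", 1, 0))
```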

bar charts

a chart in which the bars are oriented horizontally (a column chart rotated 90 degrees); bars can either be clustered by a grouping variable or stacked; leaves long space for descriptions; inappropriate for time-series data

flat file

a database that consists entirely of 1 table; works best when the database is simple & narrowly defined

fileserver

a device that controls access to separately stored files as part of a multiuser system

thematic map

a map designed as a data visualization tool to describe one or more attributes of the features; a data visualization technique that allows spatial patterns to be communicated easily

open source

a method of licensing based on the premise that software licensed under this method is developed by a global community of programmers and is free for anyone to use

virtual private network (VPN)

a network that is constructed using public wires to connect to a private network, such as a company's internal network
