GIS Final


Root mean square error

Estimates the difference (error) between the measured points and the transformed points for both the x and y coordinates - try to minimize this - more complex transformations are not always the best fit - goal is to fit the transformation with an RMSE of less than 5 meters
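As a rough illustration of the definition above, here is a minimal Python sketch. It assumes control points are given as (x, y) tuples in meters; the function name `rmse` and the sample coordinates are made up for this example and do not come from any GIS package.

```python
import math

def rmse(measured, transformed):
    """Root mean square error over paired (x, y) control points."""
    if len(measured) != len(transformed) or not measured:
        raise ValueError("need two equal-length, non-empty point lists")
    # Squared distance between each measured point and its transformed twin,
    # combining the x and y residuals.
    squared = [
        (mx - tx) ** 2 + (my - ty) ** 2
        for (mx, my), (tx, ty) in zip(measured, transformed)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical control points (meters).
measured = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
transformed = [(3.0, 4.0), (100.0, 0.0), (97.0, 104.0), (0.0, 100.0)]

error = rmse(measured, transformed)
print(round(error, 3))  # 3.536, under the 5 m goal
```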

Vector data sources

Geocommunity, TIGER, Sensors, social media, GIS sites

Unordered attribute table

Records appear in the order they were entered - very inefficient to search

Visual query

Select features by pointing to a feature via the record in a table or a symbol in a map

Ordered attribute table

Table ordered by Last Name field - works great for searching by last name; however, if I search by anything else it's still inefficient

An index can be created for any field and data type. a) True b) False

a

ObjectID in ArcGIS is a good choice for a primary key. a) True b) False

b

Line smoothing and thinning

uses a set of polynomial functions to fit a smoothed line to the points - uses sets of points to delete (thin) redundant points

Types of shapefiles

- .shp = main file, variable-record-length file: each record describes a shape with a list of its vertices. STORES GEOMETRIES
- .dbf = dBASE table, contains FEATURE ATTRIBUTES with one record per feature
- .shx = index file, CONNECTS GEOMETRIES W/ ATTRIBUTES, points to segments that have relationships to others
- must have these three (.shp, .dbf, .shx) or the shapefile will be corrupted
- .prj = projection/coordinate system info
- .sbn and .sbx = spatial index
- .xml = metadata

TIGER data enforcement

- 0-cells are either isolated points or adjacent to one or more 1-cells - all 1-cells end with exactly two 0-cells - each line segment b/n adjacent 0-cells is assigned to exactly one 1-cell - every place on the map b/n noodles is assigned to a single 2-cell

Surfaces

- 3D - length, width, height - adds volume to discrete representations

File formats

- DLG (Digital Line Graph) and TIGER have topology - Open geospatial data - ESRI Geodatabase, shapefiles, coverages

Mixed pixel problem

- Most important: define what is most important in your study - Winner takes all: if one cell has 49% water and 51% grass, then that cell becomes grass - Edges separate: edges become a third category
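The "winner takes all" rule can be sketched in a few lines of Python. This is only an illustration: the function name `winner_takes_all` and the dictionary layout (cover class mapped to its share of the cell) are assumptions made for this example.

```python
def winner_takes_all(cover_fractions):
    """Return the cover class with the largest share of the cell."""
    return max(cover_fractions, key=cover_fractions.get)

# The 49% water / 51% grass cell from the card above.
cell = {"water": 0.49, "grass": 0.51}
assigned = winner_takes_all(cell)
print(assigned)  # grass
```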

Primary key

- a column or combo of columns that has a unique value for each and every record in the table - a key is needed to join two tables together, something that both tables have in common - attributes that make acceptable keys have non-repeating values - ObjectID is NEVER a good primary key because it is assigned arbitrarily

Primary data

- collected or developed by the intended user - advantages: higher quality, more specific, access - disadvantages: more expensive (requires more time, money, and employees) - ex: remote sensing, GPS, traditional ground surveys

Scanning Hardcopy Maps

- converts hardcopy analog media into digital images - places map on glass plate and passes light beam over it - measures reflected light intensity - features can drop out

Secondary data

- data collected for other purposes that can be converted for use in GIS - advantages: can be easier and/or cheaper to acquire - disadvantages: may not be correct resolution and/or format, metadata may not exist, data you want may not be available - ex: hardcopy aerial photos, USGS topographic maps, feature names from atlases, social media

Indexed attribute table

- index files can be created which order specific fields and streamline the search process - search will use appropriate index file to locate record

Common data types

- numeric: short integer, long integer, float, double - text - date - BLOB: binary large object, images and multimedia

Table relationships

- one to one relationship - one to many - many to one - many to many

Modes of digitizing

- point - stream - distance

Attribute table structure

- records/objects/features in rows - fields/attributes in columns

Raster data

- supports gridded data - fundamental unit is a cell (pixel) - all cells in a raster dataset will almost always have the same resolution - conceptually simple and computationally fast - poor at representing points, lines, and areas, but good for surfaces - suffers from mixed pixel problem - often include redundant or missing data

Errors in digitizing

- undershoots and overshoots - invalid polygons - sliver polygons

Rubber sheeting

- create control points that are identifiable in the data to be digitized AND for which geographic or projected coordinates can be located - match created using a mathematical relationship

Raster data formats

.TIF, .JPG/JPG2000, .sid, .ECW, ESRI GRID, DEM, .BIL

Raster Data Compression

1. Full raster encoding 2. Run-length encoding 3. Value point encoding - matrix form is inefficient because of redundancy

What does a GIS dataset consist of?

1. spatial data - geometrical data capturing location and form of a geographical feature 2. attribute data - textual info describing key characteristics of associated geographical feature

Quadtrees

2D version of run length encoding - lossless compression - entire array defined, then recursively sub-divided into quadrants until cells have same values - root represents the entire raster - can describe a cell by its position in quadtree - read from top left, top right, bottom left, bottom right
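The recursive subdivision can be sketched as follows. This is a minimal illustration, assuming a square grid whose side is a power of two; the function name `build_quadtree` and the nested-list representation (a leaf is a single value, an internal node is a list of four children) are choices made for this example.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Recursively subdivide a square grid until each quadrant is uniform.

    Children are ordered top left, top right, bottom left, bottom right.
    """
    if size is None:
        size = len(grid)  # the root represents the entire raster
    first = grid[row][col]
    uniform = all(
        grid[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return first  # leaf: every cell in this quadrant has the same value
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),                # top left
        build_quadtree(grid, row, col + half, half),         # top right
        build_quadtree(grid, row + half, col, half),         # bottom left
        build_quadtree(grid, row + half, col + half, half),  # bottom right
    ]

# Hypothetical 4 x 4 raster.
grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
tree = build_quadtree(grid)
print(tree)  # [1, [0, 0, 0, 1], 2, 3]
```

Only the top-right quadrant mixes values, so it is the only one split further; the other three are stored as single leaves.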

Points

Discrete, zero dimensional, occupies no space (no width or length), focus is on location, density, distribution

Rasterization

How to assign values to cells - presence/absence: good for points and lines - cell centroid: good for polygons - dominant type: good for polygons - percent occurrence: each cell's layer coded separately

Join vs Relate

Join: appends fields from second table with data for each record where a key field match is found. For 1:1 or M:1 only. Relate: allows automatic access to a related table's records; keeps tables physically separate. For 1:M or M:M
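The difference can be mimicked with plain Python data structures. This is a conceptual sketch only, not how ArcGIS implements joins or relates; the parcel and zoning tables, field names, and the helper `related_parcels` are all hypothetical.

```python
# Many parcels point to one zoning record (M:1), so a join fits:
# zoning fields are appended to every parcel whose key matches.
parcels = [
    {"parcel_id": 1, "zone_code": "R1"},
    {"parcel_id": 2, "zone_code": "C2"},
    {"parcel_id": 3, "zone_code": "R1"},
]
zoning = {
    "R1": {"zone_name": "Residential"},
    "C2": {"zone_name": "Commercial"},
}
joined = [{**parcel, **zoning.get(parcel["zone_code"], {})} for parcel in parcels]

# Seen from the zoning side the relationship is 1:M, so a relate fits:
# the tables stay separate and matching records are looked up on demand.
def related_parcels(zone_code):
    return [p["parcel_id"] for p in parcels if p["zone_code"] == zone_code]

print(joined[0])              # {'parcel_id': 1, 'zone_code': 'R1', 'zone_name': 'Residential'}
print(related_parcels("R1"))  # [1, 3]
```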

Run length encoding

Store each run length (start at top, go from left to right) - pairs of (run length, value) are stored instead of individual numbers - ex: row 1 - 99666667 means (2,9) (5,6) (1,7) - lossless compression
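The row from the card can be encoded and decoded with a short Python sketch. The helper names `rle_encode` and `rle_decode` are made up for this example, and a raster row is assumed to be a string of single-character cell values.

```python
from itertools import groupby

def rle_encode(row):
    """Collapse consecutive equal values into (run length, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(row)]

def rle_decode(pairs):
    """Rebuild the original row; nothing is lost, so the compression is lossless."""
    return "".join(value * count for count, value in pairs)

row = "99666667"
pairs = rle_encode(row)
print(pairs)              # [(2, '9'), (5, '6'), (1, '7')]
print(rle_decode(pairs))  # 99666667
```

A row with no repeated neighbors, such as "1234", encodes to four pairs, which takes more space than the row itself. That is why run-length encoding only helps when the raster has long runs.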

TIGER

Topologically Integrated Geographic Encoding and Referencing System
- noodles = lines
- cells
-- 0-cell: a) wherever 2 noodles cross or b) where a noodle terminates (node)
-- 1-cell: each length of noodle b/n 2 consecutive 0-cells (arc)
-- 2-cell: each group of consecutive 1-cells forming an enclosed area that does not contain any 1-cells that are not part of the boundary (polygon)

TIN

Triangular Irregular Network - Raster DEM (digital elevation model) may not be best to represent surface - referred to as DEMs or DTMs (digital terrain models) - preserve topology

Attribute query

Use the aspatial characteristics of a feature as a criterion

Spatial query

Use the spatial relationship between features as a criterion

Both vector and raster data are eventually stored in machine readable format that consists of binary digits. a) True b) False

a

Changing vector to raster is straightforward whereas raster to vector conversion can introduce odd errors such as false or lost connections among spatial features. a) True b) False

a

In relational databases, normalization is needed to reduce the redundancies by splitting the relations into many tables. a) True b) False

a

Which of the following raster encoding produces a large file size? a) Full raster encoding b) Run-length encoding c) Quadtree d) Value point encoding

a

_______________ affects results when point-based measures of spatial phenomena (e.g., population density) are aggregated into districts. The resulting summary values (e.g., totals, rates, proportions) are influenced by the choice of district boundaries. a) The modifiable areal unit problem (MAUP) b) Reclassification c) Buffering d) Dissolve operation

a

_______________ reduces over and undershoots within a specified threshold/tolerance. a) Snapping b) Root Mean Square Error c) Stream mode digitization d) Line smoothing

a

Bit

a binary digit that can have two values: on (1) or off (0) - bits come in sets of eight: 8 bits = 1 byte

Foreign keys

a field in a table that has exactly the same value as the primary key column of a row in another table - a primary key-foreign key pair is needed to join two tables

Candidate key

a minimal super key: a subset of a super key's attributes that still uniquely identifies every record, and from which no attribute can be removed without losing that uniqueness

Which of the following are TRUE for georeferencing? a) Georeferencing involves capturing the map, and sometimes the attributes b) Georeferencing is the conversion of spatial information into digital form c) Georeferencing uses developable surfaces for map projections d) Georeferencing leaves a "stamp" on the data. The method of geocoding can influence the structure and error associated with the spatial information that results e) Can involve address matching (= geocoding)

a, b, d, e

Select the conditions necessary for dissolving polygons. a) Polygons need to be adjacent. b) Polygons need to have a similar size. c) Polygons need to have the same value of an attribute. d) Polygons need to intersect with each other.

a, c

Select the true statements about SQL. a) A powerful language which can be used to define one or more criteria that can consist of attributes, operators, and calculations b) SQL refers to Spatial Query Language c) ArcGIS supports a subset of the functions of standard SQL; it also supports GIS queries that are not covered by standard SQL d) In ArcGIS, SQL is used to select features with the Select by Attributes dialog box

a, c, d

Which of the following are true for Root Mean Square Error (RMSE)? a) Affected by transformation errors of rotation, translation and scale change. b) The objective is to try to maximize RMSE c) More complex transformations do not always provide the best fit, even if they produce a lower RMSE d) Estimates the difference (error) between the measured (known) points and the transformed (fit) points for both the x and y coordinates

a, c, d

Select the true statements. a) Raster data structure is simpler than vector. b) Analysis of continuous data is simpler with vector data model. c) Vector data is often easy to modify due to simple data structure. d) Raster data model is good for representing images and surfaces, but discrete features may show "stairstep" edges.

a, d

Relationships

associations b/n two or more objects in a geodatabase - can exist between spatial objects (features in feature classes), nonspatial objects (rows in tables), or spatial and nonspatial objects

Functional dependency

attributes are functionally dependent if at a given point in time each value of the dependent attribute is determined by a value of another attribute

Data collected for other purposes, which can be converted for use in a GIS, is an example of primary data collection. a) true b) false

b

If GPS picks satellites close together (rather than far apart) in the sky, range of uncertainty decreases, therefore, the error is minimized. a) True b) False

b

Raster data model is better suited to represent discrete features. a) True b) False

b

The query below will run without errors: SELECT FirstName, LastName FROM Student WHERE Instructor = Garrison a) True b) False

b

To join a table, the two fields (primary and foreign key) can be in different data types. a) True b) False

b

Which of the following sources of error can be removed completely? a) Atmosphere b) Selective availability c) Poor geometry d) Multipath

b

Select the queries that will return results without an error. a) SELECT FirstName, LastName FROM Student WHERE Instructor = Garrison b) SELECT FirstName, LastName FROM Student WHERE Instructor = 'Koylu' c) SELECT * FROM Student d) SELECT FirstName, LastName FROM Student WHERE Grade = 'B'

b, c, d
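The four queries can be tried against a throwaway table using Python's built-in `sqlite3` module. This is a sketch under assumptions: the `Student` rows are invented for illustration, and the exact error text is SQLite's; other database systems word it differently. The unquoted `Garrison` in query (a) is parsed as a column name rather than a text value, which is why it fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (FirstName TEXT, LastName TEXT, Instructor TEXT, Grade TEXT)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [
        ("Ada", "Lopez", "Koylu", "A"),
        ("Ben", "Chen", "Garrison", "B"),
        ("Cy", "Diaz", "Koylu", "B"),
    ],
)

queries = {
    "a": "SELECT FirstName, LastName FROM Student WHERE Instructor = Garrison",
    "b": "SELECT FirstName, LastName FROM Student WHERE Instructor = 'Koylu'",
    "c": "SELECT * FROM Student",
    "d": "SELECT FirstName, LastName FROM Student WHERE Grade = 'B'",
}

results = {}
for label, sql in queries.items():
    try:
        results[label] = conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        # Query (a) lands here: text values must be wrapped in single quotes.
        results[label] = f"error: {exc}"

for label, outcome in results.items():
    print(label, outcome)
conn.close()
```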

Select the strategies that can be used in converting a vector feature set to a raster. a) Quadtree b) Presence/Absence c) Cell Centroid d) Dominant Type e) Percent Occurrence

b, c, d, e

Which of the following are examples of secondary data? a) GPS b) Feature names from atlases c) Remote sensing d) Traditional ground surveys e) Hardcopy aerial photos f) USGS topographic maps

b, e, f

What does the Pseudo Random Code (PRC) do in a GPS receiver? a) Reduce range uncertainty. b) Find the best coverage. c) Measure lag time within the signal and allow communication with a GPS satellite. d) Connect to a new satellite.

c

What is the corresponding decimal (base 10) value of the binary code 00011000 ? a) 11,000 b) 2 c) 24 d) 128

c
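The conversion can be checked by summing place values: each bit is multiplied by a power of two, counting from 0 at the rightmost bit. A small Python sketch (the variable names are arbitrary):

```python
bits = "00011000"

# Rightmost bit is 2**0, next is 2**1, and so on.
place_values = [int(bit) * 2 ** power for power, bit in enumerate(reversed(bits))]
value = sum(place_values)  # 8 + 16

print(value)               # 24
print(int(bits, 2))        # 24, using Python's built-in base-2 parser
print(int("11111111", 2))  # 255, the largest value one byte can hold
```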

Which term describes the following process? 1. Create control points that are identifiable in the data to be digitized AND geographic or projected coordinates can be located for those points 2. "Match" created using a mathematical relationship a) Scanning b) On-screen digitizing c) Rubber sheeting d) Smoothing

c

Which of the below represent the disadvantages of primary data? a) Metadata may/may not exist b) May not be correct resolution (spatial/temporal) and/or format c) May be more expensive (time, money, employees)

c

__________ is a classification method that divides polygon features into groups so that the total area of the polygons in each group is approximately the same. a) Equal interval b) Quantile c) Equal area d) Natural Break

c

Which of the below are raster data sources? a) TIGER b) Polygon shapefile of the States c) Satellite imagery d) SPOT Data e) Landsat Data

c, d, e

Georeferencing

capturing data from analog maps and text through scanning and/or digitizing - conversion of spatial info into digital form - involves capturing the map, and sometimes the attributes - can involve address matching (geocoding) - leaves a stamp on the data

Conversion b/n raster and vector

changing vector to raster is straightforward; raster to vector can introduce errors

Primary keys

chosen from the set of candidate keys

Select the methods that help reduce the slivers. a) Check your input layers and redefine boundaries with highest coordinate accuracy, replace/fix these before overlay b) Manually identify and remove them c) Use a snapping tolerance distance during overlay d) All of the above

d

Which of the following correctly identifies the Multipath problem? a) Interference of the signals by the atmosphere b) Orbit error c) Errors caused by geometric arrangement of satellites. d) Errors caused by the bounce and reflection over the Earth's surface.

d

Which of the following is NOT a mode of entering coordinate data for points? a) Distance mode b) Stream mode c) Point mode d) Topology mode

d

Which of the following is NOT true for raster data model? a) Conceptually simple and often computationally fast b) Suffer from the mixed pixel problem c) Correspond to natural data model for scanned or remotely sensed data d) Good for representing discrete features, but poor at representing surfaces

d

Areas

discrete, 2D (length and width), focus is on length, orientation, area, shape

Lines

discrete, one dimensional (length, no width), focus is on length and orientation

Select all the correct choices for the statement: SQL can be used to: a) Create - construct a new data table b) Select - query one or more rows from a table / multiple tables c) Insert, Delete, Update - edit the table or the values in the table d) Drop - discard a data table e) All of the above

e

Select the correct statements about run-length encoding a) Stores each run length starting at top and left to right. b) Run-length encoding is not useful for files that do not have many runs (values are often different in adjacent cells). c) Stores pairs instead of individual numbers. d) It is a lossless compression which allows the original data to be perfectly reconstructed from the compressed data. e) all of the above are correct

e

Which of the following are true for Database Management Systems? a) DBMS allows for centralized control and maintenance b) Must support a diverse user community. c) A DBMS requires conceptualization of the data in the form of a model d) DBMS is a software application designed to provide efficient/effective way to store and retrieve data e) all of the above

e

Spatial and Attribute Data

features and attributes are linked by a unique integer identifier: - shapefile: FID column - feature class: OBJECTID column

Data models

fields and objects = conceptualizations, ways in which we think about geographic phenomena - not designed to deal with limitations of computers

On screen digitizing

if data already exist in digital form AND possess spatial info, it is possible to digitize directly

Binary

machine code that represents decimal (normal) numbers

Topology

maintenance of spatial relationships between geographic features - spatial relationships between adjacent or neighboring features - features can share endpoints, boundaries, segments, and vertices, and can overlap

Data structures

methods of representing data model in digital form

Spaghetti Polygon Chains

no topology - features are spatially independent even if they share a border or vertex - dual line encoding: an internal border can be stored twice, and it is easy for the two copies to differ

Topology enforcement

objects used to describe spatial variation must obey simple rules: 2 areas cannot overlap, every place must be w/n exactly one area, or on a boundary - can build objects out of digitized lines - linear features must begin at a node and end at a node (from and to node, respectively) - info about left and right polygon bounding are stored with lines - nodes occur at all intersections - polygons must close - planar enforcement is applied

Super Key

one or more attributes that may be used to uniquely identify every record (row) for a table

Raster data perimeter calculation

perimeter = sum of (grid cells/edge * resolution) - ex: 8 grid cells/edge * 2 km/side * 4 edges = 64 km perimeter
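The arithmetic in the example can be written out as a short Python sketch. It assumes a square region of 8 cells per edge with 2 km cells, as on the card; the function name `raster_perimeter` is made up for this example.

```python
def raster_perimeter(cells_per_edge, resolution):
    """Sum (cells on the edge * cell size) over every edge of the region."""
    return sum(cells * resolution for cells in cells_per_edge)

# Four edges of 8 cells each, 2 km per cell side.
perimeter_km = raster_perimeter([8, 8, 8, 8], 2)
print(perimeter_km)  # 64
```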

Snapping

reduces over and undershoots within a specified threshold/tolerance

Raster Data sources

satellite imagery, Landsat data, SPOT data, Earth Explorer, DEMs, LiDAR, sensors

Tables

set of data elements (values) that is organized using a model of vertical columns and horizontal rows - all rows in a table have the same columns (fields) - each column has a data type, like integer, decimal, number

Byte

since 1 byte = 8 bits, has integer values of 0 to 255

Error propagation

small digitizing errors can scale up to large errors in GIS data layer

Database Management System (DBMS)

software package that enables us to organize data and retrieve info when needed - ex: Microsoft Access, ArcGIS, NoSQL, Oracle

Vector representations

spatial objects: point, line, polygon - widely used

Shapefiles

store non-topological geometry and attribute info for the spatial features in a dataset - geometry for a feature is stored as a shape comprising a set of vector coordinates

Joins

tables can be joined or combined together using primary keys

Relational Database Management System (R-DBMS)

tables, databases, and DBMS - data are building blocks - data models are design plans - database is construction phase
- advantages: columns, tables, indexes support DBMS, data independence, multiple user views, centralized control and maintenance
- disadvantages: may require specialized training to design, use and maintain; defining relationships can be complex

