CS 4337 CH 6 Data Types
coercion
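automatic (implicit) type conversion performed by the compiler or run-time system, as opposed to an explicit cast written by the programmer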
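A small sketch of the idea in Python, which is dynamically typed, so here the conversion happens at run time rather than in a compiler. The variable names are made up for illustration.

```python
# Implicit conversion (coercion): the int operand is converted to float
# for the addition, with no cast written by the programmer.
mixed_sum = 1 + 2.5

# Explicit conversion (a cast) is requested by the programmer.
explicit = int("42") + 1

# Python does not coerce between str and int, so this is a type error.
try:
    "1" + 2
    coerced = True
except TypeError:
    coerced = False

print(type(mixed_sum).__name__, mixed_sum, explicit, coerced)
```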
descriptor
collection of the attributes of a variable
int * const vs const int * vs int * const *
int * const p (constant pointer: p cannot be re-pointed) -const int * p (pointer to a constant: the int cannot be modified through p) -int * const * pp (pointer to a constant pointer)
test 2
coverage: ch. 4, 5, 6, 7 and the Lisp tutorials (all three Lisp chapters) -multiple choice -open all day -75 minutes
data type
defines a collection of data objects and a set of predefined operations on those objects 1. what values a variable can assume 2. what operations can be done on it
integer
exact reflection of the hardware, so the mapping is trivial
decimal
for business applications (money) -essential to COBOL -C# offers a decimal data type -stores a fixed number of decimal digits in coded form (BCD) -advantage: accuracy -disadvantages: limited range, wastes memory
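The accuracy advantage can be seen with Python's standard decimal module. This is only an illustration: that module is not claimed to use BCD storage internally, and the dime example is made up.

```python
from decimal import Decimal

# Ten dimes added as binary floating-point values drift away from 1.0.
float_total = sum([0.1] * 10)

# The same sum with a decimal type stays exact.
decimal_total = sum([Decimal("0.10")] * 10)

print(float_total == 1.0)                # False
print(decimal_total == Decimal("1.00"))  # True
```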
lexicographic ordering
is str1 > str2 ?
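A minimal sketch of how such a comparison can be carried out character by character. The function name lex_compare is hypothetical, and characters are ordered by their numeric code.

```python
def lex_compare(str1: str, str2: str) -> int:
    """Return -1, 0, or 1 as str1 is less than, equal to, or greater than str2."""
    for ch1, ch2 in zip(str1, str2):
        if ch1 != ch2:
            # The first differing position decides the order.
            return -1 if ord(ch1) < ord(ch2) else 1
    # All shared positions match, so the shorter string orders first.
    if len(str1) == len(str2):
        return 0
    return -1 if len(str1) < len(str2) else 1

print(lex_compare("apple", "apply"))  # -1
print(lex_compare("abc", "ab"))       # 1
```

Python's built-in string comparison operators follow the same ordering.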
compare string type in C, C++, Python, Java... how would you implement dynamic string length?
-Java uses String objects; C and C++ use char arrays -in C and C++, allocate an array of average length and grow it if needed -in Java, use an ArrayList
reference counters
maintain a counter in every cell that stores the number of pointers currently pointing at the cell -disadvantages: space required, execution time required, complications for cells connected circularly -advantage: it is intrinsically incremental, so significant delays in the application execution are avoided
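A toy model of the scheme, assuming each cell has at most one outgoing pointer. The Cell class and helper names are hypothetical. It shows both the incremental reclamation and the problem with circularly connected cells.

```python
class Cell:
    """A heap cell with an explicit reference counter (toy model)."""
    def __init__(self, name):
        self.name = name
        self.ref_count = 0
        self.points_to = None   # at most one outgoing pointer, for simplicity
        self.freed = False

def add_pointer(cell):
    cell.ref_count += 1

def remove_pointer(cell):
    cell.ref_count -= 1
    if cell.ref_count == 0:
        # Reclaim immediately, then drop the pointer this cell held.
        cell.freed = True
        if cell.points_to is not None:
            remove_pointer(cell.points_to)

def link(src, dst):
    src.points_to = dst
    add_pointer(dst)

# Chain: program variable -> a -> b. Dropping the variable frees both.
a, b = Cell("a"), Cell("b")
add_pointer(a)
link(a, b)
remove_pointer(a)

# Cycle: program variable -> c -> d -> c. Dropping the variable frees neither.
c, d = Cell("c"), Cell("d")
add_pointer(c)
link(c, d)
link(d, c)
remove_pointer(c)

print(a.freed, b.freed, c.freed, d.freed)  # True True False False
```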
floating-point
model real numbers, but only as approximations
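A short illustration of the approximation, using Python's float (an IEEE double on typical platforms). Fraction is used only to expose the exact value that is actually stored.

```python
from fractions import Fraction
import math

# The stored double closest to 0.1 is not exactly one tenth.
stored = Fraction(0.1)
print(stored == Fraction(1, 10))       # False

# So exact equality tests on computed floats are unreliable ...
print(0.1 + 0.2 == 0.3)                # False

# ... and comparisons are usually made with a tolerance instead.
print(math.isclose(0.1 + 0.2, 0.3))    # True
```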
NoSQL
not only SQL -supports not only SQL but also higher-level abstractions (like tables) (use jagged arrays)
Implementation of Record Type
an offset address relative to the beginning of the record is associated with each field
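A rough sketch of how those offsets could be computed, assuming each field is aligned to a multiple of its own size. Real compilers follow platform-specific rules and may also add trailing padding. The record layout and the name field_offsets are made up.

```python
def field_offsets(fields):
    """fields: list of (name, size_in_bytes) pairs.
    Returns ({name: offset}, bytes_used), aligning each field to its own size."""
    offsets = {}
    position = 0
    for name, size in fields:
        position += (-position) % size   # padding needed to reach alignment
        offsets[name] = position
        position += size
    return offsets, position

employee = [("id", 4), ("salary", 8), ("grade", 1)]
offsets, used = field_offsets(employee)
print(offsets, used)  # {'id': 0, 'salary': 8, 'grade': 16} 17
```

Because field names are static, each field reference can be turned into a fixed offset like these before run time.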
object
represents an instance of a user-defined (abstract data) type
boolean
simplest type: only two elements, true and false -could be implemented as bits, but often as bytes -advantage: readability
Array Initialization
some languages allow initialization at the time of storage allocation -C-based languages: --int list [] = {1, 2, 5, 7}; --char *names [] = {"mike", "fred", "joe"}; -Python: list comprehensions --list = [x ** 2 for x in range(12) if x % 3 == 0] --puts [0, 9, 36, 81] in list
character (primitive)
stored as numeric codings -ASCII: American Standard Code for Information Interchange -Unicode (UCS-2): includes characters from most natural languages --originally used in Java --also in C# and JavaScript
primitive data type
those not defined in terms of other data types -some are merely reflections of the hardware - others require only a little non-hardware support
Accessing multidimensional arrays
two ways: by rows, called row-major order (most languages), or by columns, called column-major order (Fortran)
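The two storage orders lead to two different address computations. The sketch below assumes zero-based subscripts; the base address, element size, and function names are made up.

```python
def row_major_address(base, i, j, n_cols, elem_size):
    """Address of element [i][j] when rows are stored one after another."""
    return base + (i * n_cols + j) * elem_size

def column_major_address(base, i, j, n_rows, elem_size):
    """Address of element [i][j] when columns are stored one after another."""
    return base + (j * n_rows + i) * elem_size

# A 3 x 4 matrix of 4-byte elements starting at address 1000.
print(row_major_address(1000, 1, 2, n_cols=4, elem_size=4))     # 1024
print(column_major_address(1000, 1, 2, n_rows=3, elem_size=4))  # 1028
```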
character type string operations
typical operations: -assignment and copying -comparison (=, >, etc.) -concatenation (the plus operation on strings) -substring reference -pattern matching
design issue for all data types
what operations are defined and how are they specified?
pointer and reference types
what's the difference? pointers store memory addresses; references are aliases -a pointer type variable has a range of values that consists of memory addresses and a special value, nil -pointers provide the power of indirect addressing -pointers provide a way to manage dynamic memory -a pointer can be used to access a location in the area where storage is dynamically created (usually called a heap) -pointers can point to anything -references can alias heap objects
Discriminated vs. Free Unions
-C and C++ provide union constructs in which there is no language support for type checking; the union in these languages is called a free union -discriminated unions have a type indicator (a tag, or discriminant), which is required for type checking
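A sketch of what the type indicator buys, modeled in Python with a hypothetical TaggedValue class. The tag is checked before the stored value is interpreted, which is the check a free union omits.

```python
from dataclasses import dataclass

@dataclass
class TaggedValue:
    tag: str        # the discriminant: "int" or "str" in this sketch
    value: object

def read_int(u: TaggedValue) -> int:
    """Return the stored value only if the tag says it is an int."""
    if u.tag != "int":
        raise TypeError(f"union currently holds {u.tag}, not int")
    return u.value

print(read_int(TaggedValue("int", 7)))  # 7
try:
    read_int(TaggedValue("str", "seven"))
    misuse_detected = False
except TypeError:
    misuse_detected = True
print(misuse_detected)  # True
```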
Array Operations
-APL provides the most powerful array processing operations for vectors and matrices, as well as unary operators (for example, to reverse column elements) -Python supports array assignments, but they are only reference changes; Python also supports array catenation and element membership operations -Ruby also provides array catenation
enumeration types
-all possible values, which are named constants, are provided in the definition -C# example: enum day {mon, tue, wed, thu, fri, sat, sun}; -good for neater code -design issues --is an enumeration constant allowed to appear in more than one type definition, and if so, how is the type of an occurrence of that constant checked? --are enumeration values coerced to integer? --is any other type coerced to an enumeration type?
reference types
-C++ includes a special kind of pointer type called a reference type that is used primarily for formal parameters --advantages of both pass-by-reference and pass-by-value -Java extends C++ reference variables and allows them to replace pointers entirely --references are references to objects, rather than being addresses -C# includes both the references of Java and the pointers of C++
character string type in certain languages
-C, C++ --not primitive --use char arrays and a library of functions that provide operations -SNOBOL4 (a string manipulation language) --primitive --many operations, including elaborate pattern matching -Fortran and Python --primitive type with assignment and several operations -Java --primitive-like via the String class --java.lang.*; -Perl, JavaScript, Ruby, and PHP --provide built-in pattern matching, using regular expressions
strong typing
-a programming language is strongly typed if type errors are always detected -advantage of strong typing: allows the detection of the misuses of variables that result in type errors -examples: ML, F#, Ada, Pascal (Java and C# almost) -weakly typed: JavaScript, C, C++ -coercion rules strongly affect strong typing: they can weaken it considerably (C++ vs. ML and F#) -although Java has just half the assignment coercions of C++, its strong typing is still less effective than that of Ada
Implementation of Array
-access function maps subscript expressions to an address in the array
evaluation of enumerated type
-aid to readability: no need to code a color as a number -aid to reliability: the compiler can check --operations (don't allow colors to be added) --no enumeration variable can be assigned a value outside its defined range --C# and Java 5.0 provide better support for enumeration than C++ because enumeration type variables in these languages are not coerced into integer types
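The same reliability checks can be observed with Python's standard enum module, where they happen at run time rather than in a compiler. The Color type is a made-up example.

```python
from enum import Enum

class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

# Operations: adding two colors is rejected.
try:
    Color.RED + Color.BLUE
    addition_allowed = True
except TypeError:
    addition_allowed = False

# Range: a value outside the defined set cannot become a Color.
try:
    Color(7)
    out_of_range_allowed = True
except ValueError:
    out_of_range_allowed = False

# No coercion to integer: a plain Enum member is not equal to its number.
equal_to_int = (Color.RED == 1)

print(addition_allowed, out_of_range_allowed, equal_to_int)  # False False False
```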
character string type evaluation
-aid to writability -as a primitive type with static length, they are inexpensive to provide -dynamic length is nice, but is it worth the expense?
immutable
-a string object is created with a fixed length and value, and it cannot be changed afterward -s1 = s1 + "nani" creates a completely new string in memory, because a string object cannot be modified once it is created -the old string is reclaimed by the garbage collector -when you say new String("hello class") ... -benefits: more secure, thread safe
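The note above describes Java strings; Python strings are immutable as well, so the same behavior can be sketched there. The variable names are illustrative.

```python
s1 = "hello"
alias = s1            # a second name for the same string object

s1 = s1 + "nani"      # builds a new string; the original is untouched
print(alias)          # hello
print(s1)             # hellonani

# In-place modification of a string object is rejected.
try:
    alias[0] = "j"
    modified = True
except TypeError:
    modified = False
print(modified)       # False
```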
array design issues
-array indexing -subscript binding and array categories
Heterogeneous Arrays
-collections/databases; functions that take different parameters -elements need not be of the same type -supported by Python, JavaScript
compile and run-time descriptors
-compile-time descriptor for static strings --static string --length --address -run-time descriptor for limited dynamic strings --limited dynamic string --maximum length --current length --address
problems with pointers
-dangling pointers (dangerous) --a pointer that points to a heap-dynamic variable that has been deallocated -lost heap-dynamic variable --an allocated heap-dynamic variable that is no longer accessible to the user program (often called garbage) --the process of losing heap-dynamic variables is called memory leakage
evaluation of pointers
-dangling pointers and dangling objects are problems as is heap management -pointers are like goto's--they widen the range of cells that can be accessed by a variable -pointers or references are necessary for dynamic data structures--so we can't design a language without them
tuple types
-data type that is similar to a record, except that the elements are not named -used in python to allow functions to return multiple values
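A minimal example of the multiple-return-value use. The function min_max is made up for illustration.

```python
def min_max(values):
    """Return the smallest and largest element as one tuple."""
    return min(values), max(values)

result = min_max([4, 9, 1, 7])
lowest, highest = result     # elements are accessed by position, not by name
print(result)                # (1, 9)
print(lowest, highest)       # 1 9
```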
pointers in C and C++
-extremely flexible but must be used with care -pointers can point at any variable regardless of when or where it was allocated -pointer arithmetic is possible -domain type need not be fixed (void *) --void * can point to any type and can be type checked (cannot be dereferenced) -slide 71: dangling pointer
Evaluation of Unions
-free unions are unsafe --do not allow type checking -java and C# do not support unions --reflective of growing concerns for safety in programming language
type checking
-generalize the concept of operands and operators to include subprograms and assignments -type checking: is the activity of ensuring that the operands of an operator are of compatible types -a compatible type: is one that is either legal for the operator or is allowed under the language rules to be implicitly converted (by compiler) to legal type --this automatic conversion is called a coercion -type error: is the application of an operator to an operand of an inappropriate type -if all type bindings are static, nearly all type checking can be static -if type bindings are dynamic, type checking must be dynamic
array types
-homogeneous aggregate of data elements in which an individual element is identified by its position in the aggregate, relative to the first element
array indexing
-indexing (or subscripting) is a mapping from indices to elements -index syntax --Fortran and Ada use parentheses; most other languages use brackets -subscript types --Fortran, C: integer only --Java: integer types only
list types
-lists in Lisp and Scheme are delimited by parentheses and use no commas: (A B C D) -data and code have the same form -the interpreter needs to know which one a list is, so if it is data we quote it with an apostrophe -list operations in Scheme --CAR returns the first element: (CAR '(A B C)) returns A --CDR returns the rest of the list after the first element: (CDR '(A B C)) returns (B C) --CONS puts its first parameter into its second parameter, a list, to make a new list: (CONS 'A '(B C)) returns (A B C) -Python lists --the list data type also serves as Python's arrays --Python's lists are mutable (unlike Lisp and Scheme lists) --elements can be of any type (heterogeneous) --create a list with an assignment: myList = [3, 5.8, "Grape"] -C# and Java support lists through their generic heap-dynamic collection classes, List and ArrayList, respectively
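The three Scheme operations can be imitated on Python lists. This is only an analogy: Python lists are arrays rather than linked cells, and these helpers copy instead of sharing structure.

```python
def car(lst):
    """First element of the list."""
    return lst[0]

def cdr(lst):
    """Everything after the first element."""
    return lst[1:]

def cons(item, lst):
    """New list with item placed in front of lst."""
    return [item] + lst

print(car(["A", "B", "C"]))        # A
print(cdr(["A", "B", "C"]))        # ['B', 'C']
print(cons("A", ["B", "C"]))       # ['A', 'B', 'C']
```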
name type equivalence
-two variables have equivalent types if they are in either the same declaration or in declarations that use the same type name -easy to implement but highly restrictive: --subranges of integer types are not equivalent with integer types --formal parameters must be the same type as their corresponding actual parameters
User-Defined Ordinal Types
-range of possible values can be easily associated with the set of positive integers
record types
-a record is a possibly heterogeneous aggregate of data elements in which the individual elements are identified by names -design issues: --what is the syntactic form of references to the field? --are elliptical references allowed? ---an elliptical reference leaves out enclosing names as long as the reference is unambiguous ---analogy with objects: writing method(); instead of obj.method();
Record Evaluation and Comparison to Arrays
-records are used when the collection of data values is heterogeneous -access to array elements is much slower than access to record fields, because subscripts are dynamic (field names are static) -dynamic subscripts could be used with record field access, but it would disallow type checking and it would be slower
Rectangular and Jagged Arrays
-a rectangular array is an m x n matrix: all rows have the same number of elements and all columns have the same number of elements -a jagged matrix has rows with varying numbers of elements --possible when multi-dimensioned arrays actually appear as arrays of arrays -C, C++, and Java support jagged arrays -F# and C# support both
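Python's nested lists are arrays of arrays, so they can show both shapes. The helper is_rectangular is a hypothetical name.

```python
rectangular = [[1, 2, 3],
               [4, 5, 6]]          # every row has 3 elements

jagged = [[1],
          [2, 3],
          [4, 5, 6]]               # rows have 1, 2, and 3 elements

def is_rectangular(rows):
    """True when every row has the same number of elements."""
    return len({len(row) for row in rows}) <= 1

print(is_rectangular(rectangular))  # True
print(is_rectangular(jagged))       # False
print(jagged[2][1])                 # 5
```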
character string implementation
-static length: compile-time descriptor -limited dynamic length: may need a run-time descriptor for length (but not in C and C++) -dynamic length: need run-time descriptor; allocation/deallocation is the biggest implementation problem
Subscript Binding and Array Categories
-static: subscript ranges are statically bound and storage allocation is static (before run time) --advantage: efficiency (no dynamic allocation) -fixed stack-dynamic: subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution --advantage: space efficiency -fixed heap-dynamic: similar to fixed stack-dynamic; storage binding is dynamic but fixed after allocation (i.e., binding is done when requested and storage is allocated from the heap, not the stack) -heap-dynamic: binding of subscript ranges and storage allocation is dynamic and can change any number of times --advantage: flexibility (arrays can grow or shrink during program execution) -C and C++ arrays that include the static modifier are static -C and C++ arrays without the static modifier are fixed stack-dynamic -C and C++ provide fixed heap-dynamic arrays -C# includes a second array class, ArrayList, that provides fixed heap-dynamic arrays -Perl, JavaScript, Python, and Ruby support heap-dynamic arrays
character string length options
-static: COBOL, Java's String class -limited dynamic length: C and C++ --in these languages, a special character is used to indicate the end of a string's characters, rather than maintaining the length -dynamic (no maximum): SNOBOL4, Perl, JavaScript
pointer operations
-two fundamental operations: assignment and dereferencing -assignment is used to set a pointer variable's value to some useful address -dereferencing yields the value stored at the location represented by the pointer's value: int j = *ptr; -a dereference can be an l-value (left-hand side of an assignment): *p = y; -or an r-value (right-hand side of an assignment): int x = *p;
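Python has no pointer types of its own, but the standard ctypes module exposes C-style pointers, which is enough to sketch both operations. The variable names mirror the C fragments in the card and are otherwise arbitrary.

```python
import ctypes

x = ctypes.c_int(5)

# Assignment: give the pointer a useful address (here, the address of x).
p = ctypes.pointer(x)

# Dereference as an r-value, like: int j = *p;
j = p.contents.value

# Dereference as an l-value, like: *p = 9;
p.contents.value = 9

print(j)        # 5
print(x.value)  # 9, because x was changed through the pointer
```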
structure type equivalence
-two variables have equivalent types if their types have identical structures -more flexible but harder to implement
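A toy comparison of the two rules: equivalence by type name and equivalence by structure. The TypeDesc descriptor and the two temperature types are invented for illustration, and whether field names count as part of the structure is itself a design choice (they do here).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TypeDesc:
    name: str
    fields: tuple   # (field_name, field_type_name) pairs

def name_equivalent(a: TypeDesc, b: TypeDesc) -> bool:
    """Equivalent only if both use the same type name."""
    return a.name == b.name

def structure_equivalent(a: TypeDesc, b: TypeDesc) -> bool:
    """Equivalent if the structures match, whatever the types are called."""
    return a.fields == b.fields

celsius = TypeDesc("Celsius", (("value", "float"),))
fahrenheit = TypeDesc("Fahrenheit", (("value", "float"),))

print(name_equivalent(celsius, fahrenheit))       # False
print(structure_equivalent(celsius, fahrenheit))  # True
```

The example also shows a known drawback of the structural rule: two types meant to be different are treated as interchangeable.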
theory and data types
-type theory is a broad area of study in mathematics, logic, computer science, and philosophy -two branches of type theory in computer science: -- -- -a formal model of a type system is a set of types and a collection of functions that define the type rules
union types
-a type whose variables are allowed to store different type values at different times during execution -design issue: should type checking be required? (nearly impossible) -example use: a table to store constants in a program for compiler use
associative arrays
-unordered collection of data elements that are indexed by an equal number of values called keys (ex. hash tables) --user-defined keys must be stored -design issues: --what is the form of references to elements? --is the size static or dynamic? -built-in type in Perl, Python, Ruby, and Lua --in Lua, they are supported by tables
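Python's built-in dict is one such type. The keys and values below are made up; the sketch shows key-based reference and a size that changes at run time.

```python
salaries = {"ana": 75000, "raj": 57000}

salaries["mei"] = 55750        # insert: the size grows dynamically
salaries["raj"] = 58000        # reference by key, used here to update
del salaries["ana"]            # delete: the size shrinks

print("raj" in salaries)       # True
print("ana" in salaries)       # False
print(len(salaries))           # 2
```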
character string types
-values are sequences of characters -design issues: ---is it a primitive type or just a special kind of array? ---should the length of strings be static or dynamic?
heap management
-very complex run-time process -single-size cells vs. variable-size cells -two approaches to reclaim garbage --reference counters (eager approach): reclamation is gradual --mark-sweep (lazy approach): reclamation occurs when the list of available space becomes empty
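A compact sketch of the mark-sweep idea, assuming the heap is modeled as a dictionary from cell names to the cells they point at. Both the model and the function name are hypothetical.

```python
def mark_sweep(heap, roots):
    """heap: {cell: [cells it points to]}; roots: cells named by program variables.
    Returns the set of cells that are garbage."""
    marked = set()
    stack = list(roots)
    # Mark phase: follow pointers from the roots and mark what is reachable.
    while stack:
        cell = stack.pop()
        if cell in marked:
            continue
        marked.add(cell)
        stack.extend(heap[cell])
    # Sweep phase: every unmarked cell goes back to the available-space list.
    return set(heap) - marked

heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
garbage = mark_sweep(heap, roots=["a"])
print(sorted(garbage))  # ['c', 'd']
```

Cells c and d point at each other but are unreachable from the roots, so the sweep reclaims them.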
insertion and deletion (arrays)
-very expensive
Slices
-a way to reference a subset of an array -useful in languages that have array operations -Python examples: --vector[3:6] //elements at indexes 3, 4, 5 --mat[0][0:2] //elements at indexes 0 and 1 of the first row of mat -Ruby example: --list.slice(2, 2) //returns the third and fourth elements of list; the first number is the starting index, the second is the number of elements
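The Python slices from the card, run on made-up data so the selected elements are visible.

```python
vector = [10, 11, 12, 13, 14, 15, 16]
mat = [[1, 2, 3],
       [4, 5, 6]]

middle = vector[3:6]      # indexes 3, 4, 5 (the end index is excluded)
first_two = mat[0][0:2]   # indexes 0 and 1 of the first row

print(middle)     # [13, 14, 15]
print(first_two)  # [1, 2]
```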
Design Issues of Pointers
-what are the scope and lifetime of a pointer variable? -what is the lifetime of a heap-dynamic variable? -are pointers restricted as to the type of value to which they can point? (e.g., can a pointer to int also point to a double?) -are pointers used for dynamic storage management, indirect addressing, or both? -should the language support pointer types, reference types, or both?
complex
C99, Fortran, and Python support complex numbers