Material on Test #3
Occurrence of a pointer variable in an expression can be interpreted in 2 distinct ways
1) The pointer is interpreted as a reference to the contents of the memory cell to which it is bound (for a pointer, that content is an address) >> This is exactly how a non-pointer variable in an expression would be interpreted, although in that case its value likely wouldn't be an address. >> Normal pointer reference. 2) The pointer is interpreted as a reference to the value in the memory cell pointed to by the memory cell to which the pointer variable is bound. >> The pointer is interpreted as an indirect reference. >> This is the result of dereferencing the pointer.
What is the sequence of operations that creates lost heap-dynamic variables?
1) Pointer p1 is set to point to a newly created heap-dynamic variable 2) p1 is later set to point to another newly created heap-dynamic variable.
Pointers are designed for 2 distinct kinds of uses
1) Pointers provide some of the power of indirect addressing (frequently used in assembly language programming) 2) Pointers provide a way to manage dynamic storage >> A pointer can be used to access a location in the heap (the area where storage is dynamically allocated)
Pointer arithmetic is possible (in restricted forms)
In the expression Ptr_variable + index, the value of index is first scaled by the size of the memory cell (in memory units) to which Ptr_variable is pointing (its base type), and the scaled value is then added to the address.
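The scaling described above can be observed directly in C++. The following is a minimal sketch, not part of the original notes; the helper name byte_offset is hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: reports how many bytes lie between ptr and ptr + index.
// ptr + index must stay inside (or one past the end of) the same array.
template <typename T>
std::ptrdiff_t byte_offset(const T* ptr, std::ptrdiff_t index) {
    assert(index >= 0);
    const char* base = reinterpret_cast<const char*>(ptr);
    const char* target = reinterpret_cast<const char*>(ptr + index);
    return target - base;  // equals index * sizeof(T), not index
}
```

Calling byte_offset on an int array and on a double array with the same index gives different byte distances, because the index is scaled by the size of the base type.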
Decimal
for business applications (money), this data type has the advantage of accuracy but the disadvantages of limited range and wasting memory
Array operation
one that operates on an array as a unit.
Given the declarations below, 1) set ptr to the address of init 2) dereference ptr to produce the value at init (which is then assigned to count)
int *ptr;
int count, init;
ptr = &init; // sets ptr to the address of init
count = *ptr; // dereferences ptr to produce the value at init (which is then assigned to count)
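A small sketch of the same two steps as compilable C++, with init given a value so the result is defined. The function names and the value 42 are assumptions made for illustration.

```cpp
#include <cassert>

// Reads init indirectly: take its address, then dereference the pointer.
int read_through_pointer() {
    int init = 42;     // illustrative value, not from the notes
    int count = 0;
    int* ptr = &init;  // & produces the address of init
    count = *ptr;      // * dereferences ptr, yielding the value stored in init
    return count;
}

// Writes through a pointer: the variable it points to is changed.
void write_through_pointer(int* ptr, int value) {
    assert(ptr != nullptr);
    *ptr = value;
}
```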
Finite mappings
A selection operation can be thought of as a mapping from the array name and the set of subscript values to an element in the aggregate. For this reason, arrays themselves are sometimes called finite mappings.
Descriptor
the collection of the attributes of a variable >> in an implementation, a descriptor is an area of memory that stores the attributes of a variable
Specific elements of an array are referenced by a two-level syntactic mechanism
• Aggregate name • Subscripts/Indices --> dynamic selector consisting of one or more items
Most common array operations
• Assignment • Catenation • Comparison for equality and inequality • Slices >> C-based languages don't provide any array operations, except through the methods of Java, C++, and C#.
Notes about specific languages regarding static and dynamic binding (subscript bindings and array categories)
• C and C++ arrays that include the static modifier are static • C and C++ arrays without the static modifier are fixed stack-dynamic • C and C++ also provide fixed heap-dynamic arrays • C# includes a second array class, ArrayList, that provides heap-dynamic arrays (an ArrayList can grow and shrink) • Perl, JavaScript, Python, and Ruby support heap-dynamic arrays
Range errors
• Common in programs, so requiring range checking is an important factor in the reliability of languages. • Range errors must be implicitly checked *Many contemporary languages don't specify range checking of subscripts (but Java, C#, and ML do)
Heap-dynamic (subscript bindings and array categories)
• binding of subscript ranges and storage allocation are both dynamic and can change any number of times >> Advantage: flexibility (arrays can grow or shrink during program execution)
Fixed stack-dynamic (subscript bindings and array categories)
• subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution >> Advantage: space efficiency
Static (subscript bindings and array categories)
• subscript ranges are typically statically bound and storage allocation is static (before run-time) >> Advantage: efficiency (no dynamic allocation)
Languages that provide pointers for the management of a heap must include an explicit allocation operation (T/F)?
True. • Allocation is sometimes specified with a subprogram (ex in C: malloc)
What is the sequence of operations that creates a dangling pointer in many languages?
1) A new heap-dynamic variable is created and pointer p1 is set to point at it. 2) Pointer p2 is assigned p1's value. 3) The heap-dynamic variable pointed to by p1 is explicitly deallocated (possibly setting p1 to NIL), but p2 is not changed by the operation. p2 is now a dangling pointer. If the deallocation operation didn't change p1, both p1 and p2 would be dangling. 4) Of course, this is a problem of aliasing (p1 and p2 are aliases). Ex in C++:
int * arrayPtr1;
int * arrayPtr2 = new int[100];
arrayPtr1 = arrayPtr2;
delete [] arrayPtr2;
// Now, arrayPtr1 is dangling, because the heap storage to which it was pointing has been deallocated.
2 fundamental pointer operations
1) Assignment → sets a pointer variable's value to some useful address. • If pointer variables are used only to manage dynamic storage, then the allocation mechanism, whether by operator or built-in subprogram, serves to initialize the pointer variable. • If pointers are used for indirect addressing to variables that are not heap-dynamic, then there must be an explicit operator or built-in subprogram for fetching the address of a variable, which can then be assigned to the pointer variable. 2) Dereferencing → takes a reference through one level of indirection • Dereferencing of pointers can be explicit or implicit >> C++ → explicitly specified with the asterisk (*). >> Ex of dereferencing: if ptr is a pointer variable with value 7080 and the cell whose address is 7080 has the value 206, then the assignment j = *ptr sets j to 206.
2 ways a pointer to a record can be used to reference a field in that record (C/C++)
1) If a pointer variable p points to a record with a field named age, (*p).age can be used to refer to that field. 2) The operator ->, when used between a pointer to a struct and a field of that struct, combines dereferencing and field reference: p -> age. (*p).age and p -> age are equivalent.
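The two notations can be compared side by side in a short sketch. The struct name Person and the function names are hypothetical; only the field name age comes from the notes.

```cpp
#include <cassert>

struct Person {
    int age;
};

// Explicit dereference followed by field selection.
int age_via_star(const Person* p) {
    assert(p != nullptr);
    return (*p).age;
}

// The -> operator combines the dereference and the field selection.
int age_via_arrow(const Person* p) {
    assert(p != nullptr);
    return p->age;
}
```

The parentheses in (*p).age are required because the field selector binds more tightly than the dereference operator.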
There are two ways in which multidimensional arrays can be mapped to one dimension:
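1) In row major order → elements of the array that have as their first subscript the lower bound value of that subscript are stored first, followed by the elements with the next value of the first subscript, and so on (matrices are stored by rows) 2) In column major order → the opposite: elements with the lower bound value of the last subscript are stored first (matrices are stored by columns). Column major order is used in Fortran; most other languages with true multidimensional arrays use row major order.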
2 categories of variables for pointers
1) Reference types 2) Value types *Both add writability to a language. Ex: suppose it's necessary to implement a dynamic structure like a binary tree in a language that doesn't have pointers or dynamic storage. >> This would require the programmer to provide and maintain a pool of available tree nodes, which would probably be implemented in parallel arrays. >> It would also be necessary for the programmer to guess the maximum number of required nodes. • Reference variables are closely related to pointers
Why is a dangling pointer dangerous?
1) The location being pointed to may have been reallocated to some new heap-dynamic variable. 2) If the new variable is not the same type as the old one, type checks of uses of the dangling pointer are invalid >> Even if the new dynamic variable is the same type, its new value will have no relationship to the old pointer's dereferenced value. 3) If the dangling pointer is used to change the heap-dynamic variable, the value of the new heap-dynamic variable will be destroyed. 4) It is possible that the location is now being temporarily used by the storage management system, possibly as a pointer in a chain of available blocks of storage, thereby allowing a change to the location to cause the storage manager to fail.
Dangling pointer (also known as a dangling reference)
A pointer that contains the address of a heap-dynamic variable that has been deallocated. *Explicit deallocation of dynamic variables is the cause of dangling pointers!!
What is the notation for the operator for producing the address of a variable?
Denoted as an ampersand (&) Ex: ptr = &init;
What is the notation for the dereferencing operation?
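Denoted as an asterisk (*) Ex: count = *ptr; >> In a declaration such as int *ptr; the asterisk instead marks ptr as a pointer; it does not dereference anything.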
Array element references map the subscripts to a particular element of the array.
Function calls map the actual parameters to the function definition, and eventually, a function value.
General format to locate an element in a multidimensional array:
Location(a[i,j]) = address of a[row_lb, col_lb] + (((i - row_lb) * n) + (j - col_lb)) * element_size >> n is the number of elements in a row (the formula assumes row major order)
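A worked sketch of the formula above: it computes the byte offset that is added to the base address and compares it with where a C++ compiler actually places an element of a two-dimensional array. The function names are hypothetical, and the 3-by-4 array is an arbitrary choice for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of a[i,j] from a[row_lb, col_lb] in row major order.
// n is the number of elements in one row.
std::size_t row_major_offset(long i, long j, long row_lb, long col_lb,
                             std::size_t n, std::size_t element_size) {
    assert(i >= row_lb && j >= col_lb);
    std::size_t rows_before = static_cast<std::size_t>(i - row_lb);
    std::size_t cols_before = static_cast<std::size_t>(j - col_lb);
    return (rows_before * n + cols_before) * element_size;
}

// Checks the formula against the layout of a built-in C++ array
// (lower bounds are 0 in C++).
bool matches_builtin_layout() {
    int a[3][4] = {};
    const char* base = reinterpret_cast<const char*>(&a[0][0]);
    const char* elem = reinterpret_cast<const char*>(&a[2][1]);
    return static_cast<std::size_t>(elem - base) ==
           row_major_offset(2, 1, 0, 0, 4, sizeof(int));
}
```

For example, with 4 elements per row and 4-byte elements, a[2,1] with lower bounds of 0 lies (2 * 4 + 1) * 4 = 36 bytes past the base address.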
Define and initialize an array of references to String objects in Java
String[] names = {"Bob", "Jake", "Darcie"};
Primitive data types
Those not defined in terms of other data types >> some primitive data types are merely reflections of the hardware, while others require only a little non-hardware support for their implementation
Is it true that in languages that support object-oriented programming, allocation of heap objects is often specified with the new operator?
True
Is it true that C++, which doesn't provide implicit deallocation, uses delete as its deallocation operator?
True
Pointers can also be assigned the address value of any variable of the correct domain type or nil (constant zero)? (T/F?)
True
In Java, are character strings primitive? If so, with what?
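Yes, in the sense that strings are supported directly by the language through the String class. >> Note that String is a class, not one of Java's primitive types such as int or char.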
Access function for single-dimensional arrays (because they're stored in contiguous memory)
address(list[k]) = address(list[lower_bound]) + ((k - lower_bound) * element_size)
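Because C++ arrays always start at subscript 0, the lower_bound term is normally invisible. The sketch below uses a hypothetical wrapper, BoundedList, to show the access function with a nonzero lower bound; the fixed size of 5 is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper: 5 ints whose subscripts run from lower_bound
// to lower_bound + 4, stored contiguously.
struct BoundedList {
    int storage[5];
    long lower_bound;

    int& at(long k) {
        // Range check on the subscript.
        assert(k >= lower_bound && k < lower_bound + 5);
        // Pointer arithmetic scales (k - lower_bound) by the element size,
        // which is exactly the access function above.
        return *(storage + (k - lower_bound));
    }
};
```

With lower_bound set to 1, at(1) refers to the first stored element and at(5) to the last.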
Syntax of array references
array_name(subscript_value_list) → element *Using parentheses to enclose subscript expressions is a problem because parentheses are often also used to enclose the parameters in subprogram calls. >> This use makes references to arrays look exactly like those calls.
Arrays of strings in C++ and C can also be initialized with string literals
char *names[] = {"Bob", "Jake", "Darcie"}; • String literals are taken to be pointers to characters (so the array is an array of pointers to characters) Ex: names[0] is a pointer to the letter 'B' in the literal character array that contains the characters 'B', 'o', 'b', and the null character
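A short sketch of how such an array can be inspected. It assumes modern C++, where the element type has to be const char * because string literals are arrays of const char; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Each element is a pointer to the first character of a string literal.
const char* names[] = {"Bob", "Jake", "Darcie"};

// The compiler sets the number of elements from the initializer list.
std::size_t name_count() {
    return sizeof(names) / sizeof(names[0]);
}

// Dereferencing names[i] yields the first character of that literal.
char first_letter(std::size_t i) {
    assert(i < name_count());
    return *names[i];
}
```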
Data type
defines a collection of data values and a set of predefined operations on those values.
Enumeration types
one in which all of the possible values, which are named constants, are provided, or enumerated, in the definition. C# ex: enum days {mon, tue, wed, thu, fri, sat, sun};
Character string type
one in which the values consist of sequences of characters
PROBLEM: PL/I (and similar languages)
Pointers were flexible but could lead to several kinds of programming errors. Java has replaced pointers completely with reference types, which, along with implicit deallocation, minimize the primary problems with pointers. >> A reference type is really only a pointer with restricted operations.
Character
stored in computers as numeric codings, the most commonly used coding of which is ASCII
Complex
supported by some languages such as C99, Fortran, and Python. Each value consists of two floats, the real part and the imaginary part
Evaluation of Enumeration types
• Aid to readability (e.g. no need to code a color as a number) • Aid to reliability... e.g. compiler can check: >> Operations >> No enumeration variable can be assigned a value outside its defined range >> C# and Java 5.0 provide better support for enumeration than C++ because enumeration type variables in these languages are not coerced into integer types
Lost heap-dynamic variables
• An allocated heap-dynamic variable that is no longer accessible to the user program. >> Such variables are often called garbage (because they aren't useful for their original purpose and they can't be reallocated for some new use in the program) *Memory leakage → first heap-dynamic variable is now inaccessible or lost >> Problem in both implicit and explicit deallocation.
Initialize arrays at the time their storage is allocated.
• C, C++, Java, and C# allow initialization of their arrays. int list[] = {4, 5, 7, 83}; >> The compiler sets the length of the array. >> The array list is created and initialized with values 4, 5, 7, 83 >> Letting the compiler set the length removes the possibility that the system could detect some kinds of programmer errors, such as mistakenly leaving a value out of the list.
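The length chosen by the compiler can be recovered with sizeof, as in this minimal sketch. The names list_length and element_at are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// No length is written; the compiler counts the initializers.
int list[] = {4, 5, 7, 83};

// Total bytes divided by bytes per element gives the element count.
constexpr std::size_t list_length = sizeof(list) / sizeof(list[0]);

int element_at(std::size_t i) {
    assert(i < list_length);
    return list[i];
}
```

If a value were accidentally left out of the initializer list, list_length would simply be smaller and nothing would flag the omission, which is the reliability cost noted above.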
Design issues with Enumeration types
• Is an enumeration constant allowed to appear in more than one type definition, and if so, how is the type of an occurrence of that constant checked? • Are enumeration values coerced to integer? • Is any other type coerced to an enumeration type?
Design issues with the character string type
• Is it a primitive type or just a special kind of array? • Should the length of strings be static or dynamic?
Are pointers scalar variables?
• Pointers are different from scalar variables >> they are used to reference some other variable, rather than being used to store data.
Are pointers structured types?
• Pointers are not structured types >> unlike arrays and records, even though they are defined using a type operation (* in C/C++)
Advancements on pointers in C and C++
• Pointers can point at any variable, regardless of where it is allocated. • They can point anywhere in memory, whether there is a variable there or not (one of the dangers of the pointers as well).
Implementation of character string types
• Static length → compile-time descriptor • Limited dynamic length → may need a run-time descriptor for length (but not in C and C++) • Dynamic length → need run-time descriptor; allocation/deallocation is the biggest implementation problem
Selector being Static vs. Dynamic
• Static → if all of the subscripts in a reference are constant • Dynamic → if at least one of the subscripts in a reference is not constant.
Length options:
• Static: COBOL, Java's String class • Limited dynamic length: C and C++ → in these languages, a special character is used (\0) to indicate the end of a string's characters, rather than maintaining the length • Dynamic (no maximum): SNOBOL4, Perl, JavaScript
Access function for a multidimensional array
• The access function for a multidimensional array is the mapping of its base address and a set of index values to the address in memory of the element specified by the index values • In general, the address of an element is the base address of the structure plus the element size times the number of elements that precede it in the structure
2 distinct types are involved in an array type
• The element type • The type of the subscripts (often integer)
In C and C++, all arrays use zero as the lower bound of their subscript ranges, and array names without subscripts always refer to the address of the first element.
• The pointer operations include the same scaling that is used in indexing operations. >> Pointers to arrays can be indexed as if they were array names.
// Declarations
int list[10];
int *ptr;
// Assignment
ptr = list;
/* This assigns the address of list[0] to ptr. Given this assignment, the following are true:
*(ptr + 1) is equivalent to list[1]
*(ptr + index) is equivalent to list[index]
ptr[index] is equivalent to list[index] */
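The listed equivalences can be checked mechanically. This is a small sketch with a hypothetical function name; the array is filled with squares only so that each element has a distinct value.

```cpp
#include <cassert>
#include <cstddef>

// Returns true if all three ways of reaching element `index` agree.
bool pointer_indexing_matches(std::size_t index) {
    assert(index < 10);
    int list[10];
    for (int i = 0; i < 10; ++i) {
        list[i] = i * i;
    }
    int* ptr = list;  // the bare array name yields the address of list[0]
    return ptr == &list[0] &&
           *(ptr + index) == list[index] &&
           ptr[index] == list[index];
}
```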
Character strings in C++ and C are implemented as arrays of char
• These arrays can be initialized to string constants char name[] = "freddie"; >> The array name will have 8 elements (because all strings are terminated with a null character, zero, which is implicitly supplied by the system for string constants) >> String literal being used to initialize the char array name (The literal is taken to be a char array)
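The difference between the array's size and the string's visible length can be seen in a brief sketch. The function names are hypothetical; sizeof and std::strlen are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

char name[] = "freddie";

// Counts every element of the array, including the terminating null.
std::size_t array_elements() {
    return sizeof(name);
}

// std::strlen stops at the null character and does not count it.
std::size_t visible_length() {
    return std::strlen(name);
}

// The last element is the null character supplied by the system.
void check_terminator() {
    assert(name[sizeof(name) - 1] == '\0');
}
```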
Heap-dynamic variables
• Variables that are dynamically allocated from the heap • Often don't have identifiers associated with them. >> This means that they can only be referenced by pointer or reference type variables >>Anonymous variables → variables without names.
Design Issues for pointers
• What are the scope and lifetime of a pointer variable? • What is the lifetime of a heap-dynamic variable (the value a pointer references)? • Are pointers restricted as to the type of value to which they can point? • Are pointers used for dynamic storage management, indirect addressing, or both? • Should the language support pointer types, reference types, or both?
Type evaluation for character string types
• aid to writability • as a primitive type with static length, they are inexpensive to provide, so why not have them? >> Dynamic length is nice, but is it worth the expense?
Integer
• almost always an exact reflection of the hardware so mapping is trivial. There may be as many as eight different integer types; for example, Java's signed integer sizes are byte, short, int, long
Typical Operations (character string types)
• assignment and copying, comparison (=, >, etc.), catenation, substring reference, pattern matching
Implementation of array types
• implementing arrays requires considerably more compile-time effort than does implementing primitive types. • The code to allow accessing of array elements must be generated at compile time, and the code is executed at run time to produce an address (there's no way to precompute the address to be accessed by a reference)
Floating point
• model real numbers, but the representations are only approximations for many real values >> languages for scientific use support at least two floating-point types (e.g. float and double), sometimes more >> Usually exactly like the hardware, but not always >> Precision → accuracy of the fractional part of a value >> Range → a combination of the range of fractions and, more importantly, the range of exponents
User-defined types
• provide improved readability through the use of meaningful names for types. >> Allow type checking of the variables of a special category of use, which would otherwise not be possible. >> By changing a type definition only, the type of a category of variables in a program can be changed
Boolean
• range of values: two elements, one for "true" and one for "false" >> Simplest of all data types >> Could be implemented as bits, but often as bytes
Fixed heap-dynamic (subscript bindings and array categories)
• similar to the fixed stack-dynamic in that storage binding is fixed after allocation, but different in that subscript ranges are dynamically bound (i.e. binding is done when requested and storage is allocated from heap, not stack).
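The array categories can be placed next to each other in one C++ sketch. The function names are hypothetical, and std::vector is used here as an assumed stand-in for a heap-dynamic array; it is a library class, not a built-in array type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Static: storage is bound before run time.
int sum_static() {
    static int table[4] = {1, 2, 3, 4};
    return table[0] + table[1] + table[2] + table[3];
}

// Fixed stack-dynamic: allocated on the stack each time the function runs.
int sum_fixed_stack_dynamic() {
    int local[3] = {1, 2, 3};
    return local[0] + local[1] + local[2];
}

// Fixed heap-dynamic: the size is chosen at run time, then stays fixed.
int sum_fixed_heap_dynamic(std::size_t n) {
    assert(n > 0);
    int* values = new int[n];
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<int>(i) + 1;
        total += values[i];
    }
    delete[] values;  // explicit deallocation is the programmer's job
    return total;
}

// Heap-dynamic: the array can grow and shrink during execution.
std::size_t grow_and_shrink() {
    std::vector<int> values;
    for (int i = 0; i < 5; ++i) {
        values.push_back(i);
    }
    values.pop_back();
    return values.size();
}
```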
Pointer type
• the variables have a range of values that consists of memory addresses and nil (a special value that is used to indicate that a pointer can't currently be used to reference a memory cell. It is not a valid address itself).
In C and C++ character strings are not primitive. What do they use instead?
• they use char arrays and a library of functions that provide operations