Dictionaries, Lists & Tuples
Dictionaries are mutable
Like lists, dictionaries are a mutable data structure: - you can change the object via various operations, such as index assignment Example: my_dict = {'bill': 3, 'rich': 10} print(my_dict['bill']) # prints 3 my_dict['bill'] = 100 # 'bill' is set to 100 print(my_dict['bill']) # prints 100
Tuples
● Tuples are like immutable lists ● They use () instead of [] as the constructor >>> 10, 12 # creates a tuple (10, 12) >>> tup = 2, 3 # assign tuple to variable >>> tup (2, 3) >>> (1) # not a tuple, a grouping 1 >>> (1, ) # comma makes a tuple (1,) >>> x, y = 'a', 3.14159 # multiple assignment >>> x, y # create a tuple ('a', 3.14159)
Indexing
● What does the [ ] mean, a list or an index? [1, 2, 3][1] => 2 ● Context solves the problem. Index always comes at the end of an expression - it is preceded by something (a variable, a sequence): values = [1, 2, 3] values[1] => 2
Key-Value pairs
- The key acts as an index to find the associated value. - Just like a paper dictionary: you look up a word by its spelling to find the associated definition - A dictionary can be searched to locate the value associated with a key - Extremely common data structure in programming languages
Common operators
Like all collections, dictionaries respond to these operations: ● len(my_dict) - returns the number of key: value pairs in the dictionary ● element in my_dict - Boolean, is element a key in the dictionary ● for key in my_dict: - iterates through the keys of a dictionary
Using 'else' with a 'for' loop
Used when you want program to do something ONLY if the loop is completely finished: needle = 'd' haystack = ['a', 'b', 'c'] for letter in haystack: if needle == letter: print('Found!') break else: # if no break occurred print('Not found!') result: Not found!
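With the indentation restored, the loop above can be sketched as a small function. The name find_letter is hypothetical and used only for illustration; note that the else lines up with for, not with if.

```python
def find_letter(needle, haystack):
    """Return 'Found!' if needle is in haystack, otherwise 'Not found!'."""
    for letter in haystack:
        if needle == letter:
            result = 'Found!'
            break
    else:
        # Runs only when the loop finished without hitting break
        result = 'Not found!'
    return result


print(find_letter('d', ['a', 'b', 'c']))  # Not found!
print(find_letter('b', ['a', 'b', 'c']))  # Found!
```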
Commas make a tuple
For tuples, you can think of a comma as the operator that makes a tuple, where the ( ) simply acts as a grouping: myTuple = 1, 2 # creates (1, 2) myTuple = (1,) # creates (1,) myTuple = 1, # creates (1,) myTuple = (1) # creates 1 (integer), not the tuple (1,)
dict comprehensions
Like list comprehensions, you can write shortcuts that generate either a dictionary or a set, with the same control you had with list comprehensions ● both are enclosed with {} (remember, list comprehensions were in []) ● the difference is whether the collected item is a : separated pair (dictionary) or a single value (set) ● the general form is the same as with lists, except braces not brackets: {expression for-clause condition}
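A minimal sketch of both forms, using a made-up list of words (the variable names are illustrative only):

```python
words = ["apple", "kiwi", "fig", "plum"]

# Dict comprehension: the collected item is a key: value pair
lengths = {word: len(word) for word in words}

# Set comprehension: the collected item is a single value,
# and the if part filters what gets collected
long_lengths = {len(word) for word in words if len(word) > 3}

print(lengths)       # {'apple': 5, 'kiwi': 4, 'fig': 3, 'plum': 4}
print(long_lengths)  # a set containing 4 and 5
```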
zip
Function used to loop through two or more sequences (e.g. LISTS) at the same time. x_list = [1, 2, 3] y_list = [2, 4, 6] for x, y in zip(x_list, y_list): print(x, y) result: 1 2 2 4 3 6
'get' dictionary method
To ascertain if a key is in a dictionary, without crashing program if key is NOT in dict: ages = {'Mary' : 31, 'Joe' : 28} age = ages.get('Dick', 'unknown') print('Dick is %s years old' % age) If 'Dick' is in the dict, then it prints Dick's age; if 'Dick' is NOT in the dict, it uses the default value 'unknown' result: Dick is unknown years old
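For comparison, here is a short sketch of the same lookup done two ways: with get and a default, and with index access wrapped in try/except. The variable age_again is introduced only for this illustration.

```python
ages = {'Mary': 31, 'Joe': 28}

# Option 1: get returns the default when the key is missing
age = ages.get('Dick', 'unknown')

# Option 2: index access raises KeyError for a missing key
try:
    age_again = ages['Dick']
except KeyError:
    age_again = 'unknown'

print('Dick is %s years old' % age)  # Dick is unknown years old
print(age == age_again)              # True
```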
modifying what gets collected
[c for c in "Hi There Mom" if c.isupper()] ● The if part of the comprehension controls which of the iterated values is collected at the end. ● Only those values which make the if part true will be collected => ['H', 'T', 'M']
List methods
● Methods are called using the dot ( . ) notation: object . method(arg) my_list = ['a',1,True] my_list.append('z') ● append is a member of list
join method
● NOT a list method --> string method ● The join method takes a list of strings as an argument and concatenates (in order) each of those strings into a new string ● the calling string is used as the separator between each string being concatenated: strings = ["Hi", "Ho", "bye", "zap"] new_str = ":".join(strings) # calling string is colon : print(new_str, type(new_str)) Hi:Ho:bye:zap <class 'str'>
Similarities with strings
● lists, like strings are ordered sequences ● concatenate with + (but only of lists) ● repeat with * ● indexing with the [ ] operator ● slicing with [:] ● membership with the in operator ● length with the len() function
Building dictionaries faster
● zip creates pairs from two parallel sequences: ● list(zip("abc", [1, 2, 3])) yields [('a', 1), ('b', 2), ('c', 3)] (in Python 3, zip itself returns an iterator, so wrap it in list() to see the pairs) ● That's good for building dictionaries. We call the dict function, which takes a sequence of pairs to make a dictionary: ● dict(zip("abc", [1, 2, 3])) yields {'a': 1, 'b': 2, 'c': 3}
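A runnable sketch of the zip-then-dict pattern (the names pairs and letter_to_number are chosen here for illustration):

```python
letters = "abc"
numbers = [1, 2, 3]

# zip returns an iterator in Python 3; list() shows the pairs
pairs = list(zip(letters, numbers))

# dict() accepts any iterable of (key, value) pairs
letter_to_number = dict(zip(letters, numbers))

print(pairs)             # [('a', 1), ('b', 2), ('c', 3)]
print(letter_to_number)  # {'a': 1, 'b': 2, 'c': 3}
```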
Data Structures
MANY programming problems require dealing with data containing multiple parts/elements, and so require data structures like lists and tuples ● Data structures are particular ways of storing data to make some operation easier or more efficient. That is, they are tuned for certain tasks
Tuple unpacking
Method to swap variables in a tuple: x = 10 y = -10 Above can be written as a tuple: x, y = 10, -10 print("Before: x = %d, y = %d" % (x, y)) x, y = y, x print("After: x = %d, y = %d" % (x, y)) result: Before: x = 10, y = -10 After: x = -10, y = 10
Examples of value equality and reference equality
numbers = [10, 20, 30] things = numbers new_things = [10, 20, 30] things == numbers # True things is numbers # True new_things == numbers # True new_things is numbers # False
List Functions
● len(lst): number of elements in list (top level). len([1, [1, 2], 3]) => 3 ● min(lst): smallest element. Elements must be comparable with each other (e.g. all numbers or all strings)! ● max(lst): largest element, again all elements must be comparable ● sum(lst): sum of the elements, numeric only
String into tuple
A string can be split() into a list, and the results stored into a tuple: date_input = input("Enter DOB (d/m/y)") parts = date_input.split("/") # this will be a list of strings my_dob = (int(parts[0]), int(parts[1]), int(parts[2])) print(my_dob) ● Notice how we process each value in parts the same way ● So we can use a list comprehension: parts = date_input.split("/") my_dob = tuple([int(part) for part in parts])
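The same conversion can be sketched as a function that takes the text as a parameter instead of calling input(), which makes it easier to try out. The name parse_dob and the sample date are assumptions made for this example.

```python
def parse_dob(date_text):
    """Convert a 'd/m/y' string into a tuple of three integers."""
    # A generator expression feeds tuple() directly; no intermediate list needed
    return tuple(int(part) for part in date_text.split("/"))


print(parse_dob("25/12/1999"))  # (25, 12, 1999)
```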
Aliasing
Be aware that an assignment statement does not copy a list... ● An assignment statement only creates a new variable that refers to the existing list: numbers = [10, 20, 30] things = numbers numbers.append(40) print(numbers) # [10, 20, 30, 40] print(things) # [10, 20, 30, 40] ● This is called aliasing - things is an alias of numbers ● This can be dangerous... you think it's different, but it's not!
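When an independent list is wanted, make a copy explicitly. A small sketch contrasting an alias with a (shallow) copy; the variable names are illustrative:

```python
numbers = [10, 20, 30]
alias = numbers                   # second name for the same list object
copy_of_numbers = numbers.copy()  # new list, same elements (numbers[:] also works)

numbers.append(40)

print(alias)            # [10, 20, 30, 40]  (changed along with numbers)
print(copy_of_numbers)  # [10, 20, 30]      (unaffected)
print(alias is numbers, copy_of_numbers is numbers)  # True False
```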
Functions and Tuples
Functions that return multiple values actually return tuples: x, y = (1, 2) # (1, 2) is a tuple print(x, type(x)) # 1 <class 'int'> def get_low_high(values): return min(values), max(values) low, high = get_low_high(my_list) print(low, type(low)) # -12 <class 'int'> z = get_low_high(my_list) print(z, type(z)) # (-12, 45) <class 'tuple'>
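A self-contained version of the example above. The contents of my_list are not given in the original, so the sample values below are an assumption chosen to match the -12 and 45 shown in the comments.

```python
def get_low_high(values):
    """Return the smallest and largest value as a (low, high) tuple."""
    return min(values), max(values)


my_list = [3, -12, 45, 7]  # assumed sample data

low, high = get_low_high(my_list)  # unpack the returned tuple
z = get_low_high(my_list)          # or keep it as one tuple

print(low, type(low))  # -12 <class 'int'>
print(z, type(z))      # (-12, 45) <class 'tuple'>
```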
Let's think about how various types of information should be stored...
How could we represent rainfall data for a 12-month period? ● list (or array) How could we store information about a group of people (name, age, phone number, email address)? ● class (but could use a list or tuple) How could we store hexadecimal colour codes like: Red = #FF0000 and BurlyWood = #DEB887 ? ● dictionary --> a mapping of keys to values; list is not useful ∵ there is no order How could we store and sort the top high-scores for a computer game? ● list ∵ need to store the order; could save data to a file so that we can add new scores later
enumerate
Function used to iterate through a LIST; keeps track of the index and item in the LIST. cities = ['Marseille', 'Amsterdam', 'New York', 'London'] for i, city in enumerate(cities): print(i, city) result: 0 Marseille 1 Amsterdam 2 New York 3 London N.B. This is preferable to the accumulator method.
List Comprehensions
One way to construct a list is with a "list comprehension": >>> [n for n in range(1, 5)] [1, 2, 3, 4] ● creates a list by modifying an existing sequence to make a new sequence ● A "list comprehension" is a concise way to iterate & collect values from a collection; values can be either modified or filtered --> replaces a "for loop" [ -> marks beginning of comprehension n -> what we collect; elements in the new list for -> start of for loop range(1,5) -> what we iterate through. Note that we iterate over a set of values and collect some (in this case all) of them n for n in range(1,5) ] -> marks end of comprehension
Dictionary methods
Only a handful of methods in total. Here are some: - my_dict.clear() - empty the dictionary - my_dict.update(yourDict) - for each key in yourDict, updates my_dict with that key/value pair - my_dict.copy() - shallow copy - my_dict.pop(key) - remove key, return value
Kinds of data structures
Roughly two kinds of data structures: ● built-in data structures, data structures that are so common as to be provided by default lists, tuples, string, dictionaries, sets... ● user-defined data structures (classes in object oriented programming) that are designed for a particular task ● a class makes a new type for something that doesn't already exist: >>> class Person: >>> class EmployeeList: ● classes use PascalCase
Value equality vs Reference equality
This is an important distinction you must be aware of... x == y ● performs value equality - Check if the values are the same x is y ● performs reference equality - Check if two variables refer to the same object in memory Suggestion: use id() to test this out
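Following the suggestion to use id(), here is a short sketch. The actual id values differ from run to run, so only the comparisons are shown.

```python
numbers = [10, 20, 30]
things = numbers            # alias: same object
new_things = [10, 20, 30]   # equal values, different object

# id() returns an integer identifying the object; 'is' compares these identities
print(id(things) == id(numbers))      # True
print(id(new_things) == id(numbers))  # False
print(new_things == numbers)          # True (value equality)
```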
Iteration thru a list
You can iterate through the elements of a list like you did with a string: >>> my_list = [1, 3, 4, 8] >>> for element in my_list: print(element, end=' ') 1 3 4 8 ● element is generic --> use specific name when we know what kind of element: for number in numbers for subject in subjects for name in names ● skip the first or last element --> header line in Excel file: for line in lines[1::] for number in numbers[::2] --> every 2nd number
List Operators
[1, 2, 3] + [4] => [1, 2, 3, 4] [1, 2, 3] * 2 => [1, 2, 3, 1, 2, 3] 1 in [1, 2, 3] => True [1, 2, 3] < [1, 2, 4] => True compare index to index, first difference determines the result
modifying what we collect
[n ** 2 for n in range(1, 6)] returns [1, 4, 9, 16, 25] ● Note that we can only change the values we are iterating over, in this case n
multiple collects
[x+y for x in range(1, 4) for y in range (1,4)] It is as if we had done the following: my_list = [ ] for x in range (1, 4): for y in range (1, 4): my_list.append(x + y) => [2, 3, 4, 3, 4, 5, 4, 5, 6]
Views are iterable
for key in my_dict: print(key) ● prints all the keys for key, value in my_dict.items(): print(key, value) ● prints all the key/value pairs for value in my_dict.values(): print(value) ● prints all the values Note: since Python 3.7, a dictionary iterates in the order its keys were inserted; in earlier versions the order was undefined, so do not rely on it there. ● If you need a sorted order, you can get the keys as a list, sort that, then iterate over it and use it to access the values.
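Iterating in sorted key order can be sketched like this; sorted() accepts the dictionary directly and returns its keys as a sorted list. The ages data is made up for the example.

```python
ages = {'Mary': 31, 'Joe': 28, 'Ann': 45}

sorted_names = sorted(ages)  # sorted list of the keys
for name in sorted_names:
    print(name, ages[name])  # use each key to access its value
```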
List of Lists
my_list = ['a', [1, 2, 3], 'z'] What is the second element (index 1) of that list? Another list: my_list[1][0] # apply left to right my_list[1] => [1, 2, 3] [1, 2, 3][0] => 1
Unexpected results
my_list = [4, 7, 1, 2] my_list = my_list.sort() my_list => None # what happened? ● What happened was the sort operation changed the order of the list in place (right side of assignment). ● Then the sort method returned None, which was assigned to the variable. ● The list was lost and None is now the value of the variable.
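Two ways to avoid losing the list, sketched side by side: call sort() without assigning its result, or use the sorted() function, which returns a new list. The names result, original and ordered are illustrative.

```python
my_list = [4, 7, 1, 2]
result = my_list.sort()  # sorts in place; the return value is None
print(result)            # None
print(my_list)           # [1, 2, 4, 7]

original = [4, 7, 1, 2]
ordered = sorted(original)  # returns a new sorted list
print(ordered)              # [1, 2, 4, 7]
print(original)             # [4, 7, 1, 2]  (unchanged)
```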
Using 'else' with a 'try/except' statement
print("Converting!") try: print(int('1')) except: print("Conversion failed!") else: # Only executed if no exception occurred print("Conversion successful!") finally: # Always executed, whether exception or not print("Done") result: Converting! 1 Conversion successful! Done N.B. The 'finally' clause is useful for closing a file, even if an exception occurs!
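To see both paths through try/except/else/finally, the example can be sketched as a function that collects its messages in a list. The name convert is hypothetical, and this version catches ValueError specifically rather than using a bare except.

```python
def convert(text):
    """Try to convert text to an int and return the messages produced."""
    messages = ["Converting!"]
    try:
        number = int(text)
    except ValueError:
        messages.append("Conversion failed!")
    else:
        # Only executed if no exception occurred
        messages.append("Conversion successful: %d" % number)
    finally:
        # Always executed, whether an exception occurred or not
        messages.append("Done")
    return messages


print(convert('1'))    # ['Converting!', 'Conversion successful: 1', 'Done']
print(convert('one'))  # ['Converting!', 'Conversion failed!', 'Done']
```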
Some new methods
● A list is mutable and can change: → my_list[0] = 'a' # index assignment → my_list.append(x) # Add an item to the end of the list. → my_list.extend(iterable) # Extend the list by appending all the items from the iterable, one at a time → my_list.count(x) # Return the number of times x appears in the list. → my_list.pop([i]) # Remove the item at the given position in the list, and return it. If no index is specified, my_list.pop() removes and returns the last item in the list. → my_list.insert(i, x) # Insert an item at a given position. The first argument is the index of the element before which to insert, so my_list.insert(0, x) inserts at the front of the list, and my_list.insert(len(my_list), x) is equivalent to my_list.append(x). → my_list.remove(x) # Remove the first item from the list whose value == x. It is an error if there is no such item values = [1, 2, 3, 2] values.remove(2) # removes the first 2 in list, but leaves any others print(values) [1, 3, 2] → my_list.sort() # elements must be comparable with each other → my_list.reverse() # Reverse the elements of the list in place. → my_list.copy() # Return a shallow copy of the list. Equivalent to my_list[:]. ● methods like insert, remove or sort that only modify the list have no return value printed - they return the default None.
How to access dictionary elements
● Access requires [ ], but the key is the index! Example: my_dict={} - an empty dictionary my_dict['bill'] = 25 - added the pair 'bill':25 print(my_dict['bill']) - prints 25
Python Dictionary
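● Dictionaries are collections but they are not sequences such as lists, strings or tuples: - elements are accessed by key, not by position - before Python 3.7 there was no guaranteed order to the elements of a dictionary; the order (for example, when printed) might change as elements were added or deleted - since Python 3.7 the order in which elements are added is preserved, but there is still no positional indexing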
Lists and Tuples
● Everything that works with a list works with a tuple, except methods that modify the tuple ● Thus indexing, slicing, len, print all work as expected ● However, none of the mutable methods work: append, extend, del, sort, reverse
Dictionary
● In data structure terms, a dictionary is better termed an associative array, associative list or a map. ● A dictionary is a collection, BUT not a sequence --> elements are not accessed by position ● A way to store data where there is a direct connection between the pairs of data --> list of pairs
keys and values
● Key must be immutable: - strings, integers, tuples are fine - lists are NOT ∵ lists are mutable ● Keys must be UNIQUE ● Value can be anything: - a list of lists, strings, integers, floats etc.
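A quick sketch of what happens when a mutable list is used as a key. The locations dictionary and its coordinates are made-up sample data.

```python
locations = {}

# A tuple is immutable, so it works as a key
locations[(51.5, -0.1)] = "London"

# A list is mutable, so Python refuses it with a TypeError
try:
    locations[[51.5, -0.1]] = "London"
except TypeError as error:
    print("Cannot use a list as a key:", error)

print(locations)  # {(51.5, -0.1): 'London'}
```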
split method
● NOT a list method --> string method ● The string method split breaks a string into substrings at a separator (whitespace by default). ● It returns a list: words = 'this is a test'.split() words => ['this', 'is', 'a', 'test'] # string methods almost always return something
Sorting
● Only lists have a built-in sorting method. ● Thus you often convert your data to a list if it needs sorting: my_list = list('xyzabc') my_list -> ['x', 'y', 'z', 'a', 'b', 'c'] my_list.sort() # no return my_list -> ['a', 'b', 'c', 'x', 'y', 'z'] ● sort method returns None ● CAPS come before lowercase in the ASCII table, so 'Z' sorts before 'a'.
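The effect of capitals sorting before lowercase can be sketched as follows. Passing key=str.lower is one common way to get a case-insensitive order; it is shown here as an option, not as part of the original notes, and the names are sample data.

```python
names = ['bob', 'Alice', 'carol', 'Dave']

names.sort()               # default: capitals sort before lowercase
print(names)               # ['Alice', 'Dave', 'bob', 'carol']

names.sort(key=str.lower)  # compare the lowercase form of each name
print(names)               # ['Alice', 'bob', 'carol', 'Dave']
```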
Data Structures and algorithms
● Part of the "science" in computer science is the design and use of data structures and algorithms ● Data structures are suited to solving certain problems, and they are often associated with algorithms. ● Data structures are a type, i.e. they are classes ● As you go on in CS, you will learn more and more about these two areas N.B. Phone number --> a string not a number ∵ it may start with zero which is ignored by int type; also we never use a phone number like a number, e.g. we never say my phone number is even or 1/2 my phone number --> meaningless!
The Python List Data Structure
● a list is an ordered sequence of items. ● lists are collections --> contain multiple elements ● lists commonly store homogeneous types, but may be heterogeneous my_list = ["CP1404", 12, my_name] ● you have seen such a sequence before in a string. A string is a particular kind of sequence (what kind)? numbers = [] # empty list subjects = ["CP1404", "CP2406", "CP5632"] scores = [18, 25, 96]
get method
● get method returns the value associated with a dict key or a default value provided as second argument. ● Below, the default is 0 (the default default is None) sentence = "to be or not to be" words = sentence.split() # returns a list word_count = {} # start with an empty dict for word in words: word_count[word] = word_count.get(word, 0) + 1 print(word_count) # {'to': 2, 'be': 2, 'or': 1, 'not': 1} ● default prevents KeyErrors - if user enters a key that is not in the dict (could also use try-except)
list constructor
● lists are created using a constructor ● Like all data structures, lists have a constructor, named the same as the data structure. It takes a single iterable data structure and adds each item to the list my_list = list((1, 2, 3, 4)) [1, 2, 3, 4] list("Hello") ['H', 'e', 'l', 'l', 'o'] (note: list(1, 2, 3, 4) is an error, because list takes at most one argument) ● It also has a shortcut, the use of square brackets [ ] to indicate explicit items: my_list = [1, 2, 3, 4] [1, 2, 3, 4]
differences between lists and strings
● lists can contain a mixture of any Python object, strings can only hold characters: [1,"bill", 1.2345, True] ● lists are mutable, their values can be changed, while strings are immutable ● lists are designated with [ ], with elements separated by commas, strings use " " or ' '
More about list methods
● most of these methods do not return a value ● This is because lists are mutable, so the methods modify the list directly. No need to return anything. >>> things = [2, "Bob", 34.2, [2,3], True] >>> things.reverse() # blank line >>> things [True, [2, 3], 34.2, 'Bob', 2] ● however, pop() returns the item removed ● string methods almost always return something: >>> s = "Lindsay" >>> s.upper() "LINDSAY" >>> s # s is unchanged "Lindsay" # strings are immutable
Dictionary content methods
● my_dict.items() - all the key/value pairs as tuples --> most useful way to use a dict ● my_dict.keys() - all the keys ● my_dict.values() - all the values These return what is called a dictionary view (not a list; wrap it in list() if you need a list): - They are dynamically updated with changes - They are iterable; they can be used in a 'for' loop - Whatever the order of the key view, the value view will have the same order (insertion order since Python 3.7)
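The "dynamically updated" behaviour of views can be seen in a short sketch. The ages data is made up; the printed order assumes Python 3.7 or later, where insertion order is preserved.

```python
ages = {'Mary': 31, 'Joe': 28}

keys = ages.keys()      # a view, not a copy
values = ages.values()

ages['Ann'] = 45        # change the dict after taking the views

print(list(keys))       # ['Mary', 'Joe', 'Ann']
print(list(values))     # [31, 28, 45]
```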
Create a list
● program that asks the user for their scores and adds them to the list, until they enter a negative score, then prints their highest score. scores = [] score = int(input("Score: ")) while score >= 0: scores.append(score) score = int(input("Score: ")) print("Your highest score is", max(scores)) Note: This would crash if the user entered a negative number first. Why? ∵ scores list is empty --> there is no max value How could we fix it? try-except statement
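The try-except fix can be sketched as a helper that takes the list of scores as a parameter, so it can be tried without typing input. The name highest_score and the choice to return None for an empty list are assumptions made for this example.

```python
def highest_score(scores):
    """Return the highest score, or None if no scores were entered."""
    try:
        return max(scores)
    except ValueError:
        # max() raises ValueError when given an empty sequence
        return None


print(highest_score([18, 25, 96]))  # 96
print(highest_score([]))            # None
```

An alternative with the same effect is to test `if scores:` before calling max.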
Why tuples?
● The real question is, why have an immutable list, a tuple, as a separate type? ● An immutable list gives you a data structure with some integrity/permanency ● You know you cannot accidentally change one. Example: If you want to specify a CONSTANT that is a sequence --> use a tuple ∵ it is immutable
Sorted function returns a value
● To sort a list and return a value --> use the sorted( ) function; n.b. the .sort() method returns None ∵ it modifies the list directly ● The sorted function (not a method) will sort any sequence; it breaks a sequence into elements and sorts them, returning the results as a list: >>> numbers = [27, 56, 4, 18] >>> sorted(numbers) [4, 18, 27, 56] >>> chars = 'hi mom' >>> sorted(chars) [' ', 'h', 'i', 'm', 'm', 'o'] >>> ''.join(sorted(chars)) ' himmo'
Lists are mutable
● Unlike strings, lists are mutable. You can change the object's contents: >>> my_list = [1, 2, 3] >>> my_list[0] = 127 >>> print(my_list) [127, 2, 3]
Create a dictionary
● Use the { } marker to create a dictionary ● Use the : marker to indicate key: value pairs data = {} # create empty dict data[1] = "Jack" # add pair to dict data[2] = "Bill" print(data) {1: 'Jack', 2: 'Bill'} data[2] = "Joe" # modify the value for key 2 print(data) {1: 'Jack', 2: 'Joe'}
What is the best way to name a list variable?
● Same rules as any variable, but... ● Lists and tuples are sequences - they contain multiple elements ● So name them with plural terms ● E.g. a list containing numbers could be... numbers Or if each list item was a Person object, you would call it...people items =["salt", "flour", "sugar"] numbers = [1, 2, 3, 4, 5] names = ["Joe", "Fred", "Tony"] ● if each list item was heterogeneous: things = [1, "Bob", 34.2, [2, 3], True] ● Don't use list as an identifier - it's already taken! or dict, str, int, max, sum, len...
Sort by part of an element?
● Suppose: data = [['Derek', 7], ['Carrie', 8], ['Bob', 6], ['Aaron', 9]] data.sort() # Gives [['Aaron', 9], ['Bob', 6], ['Carrie', 8], ['Derek', 7]] ● Question: How could we sort this by the number (age) instead of the string (name)? ● Answer: use operator.itemgetter from operator import itemgetter data = [['Derek', 7], ['Carrie', 8], ['Bob', 6], ['Aaron', 9]] data.sort(key=itemgetter(1)) # Gives [['Bob', 6], ['Derek', 7], ['Carrie', 8], ['Aaron', 9]] You can pass multiple values to itemgetter, like: data.sort(key=itemgetter(1, 0)) # Sort by 2nd then 1st elements
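A runnable sketch of sorting by the second element. It uses sorted() so the original list is left alone, and also shows an equivalent lambda key as an alternative to itemgetter; the names by_age and by_age_lambda are illustrative.

```python
from operator import itemgetter

data = [['Derek', 7], ['Carrie', 8], ['Bob', 6], ['Aaron', 9]]

# itemgetter(1) builds a function that returns element 1 of its argument
by_age = sorted(data, key=itemgetter(1))

# An equivalent key written as a lambda
by_age_lambda = sorted(data, key=lambda pair: pair[1])

print(by_age)  # [['Bob', 6], ['Derek', 7], ['Carrie', 8], ['Aaron', 9]]
print(by_age == by_age_lambda)  # True
```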
A list of pairs
● The first element of the pair, the key, is used to retrieve the second element, the value. ● Thus we map a key to a value