Chapter 14 (sets, dictionaries, tuples)
How to count the occurrences of words in a file using a dictionary
# Replace punctuation in the line with a space
def replacePunctuations(line):
    for ch in line:
        if ch in "~@#$%^&*()_-+=~<>?/,.;:!{}[]|'\"":
            line = line.replace(ch, " ")
    return line

# Count each word in the line
def processLine(line, wordCounts):
    # Replace punctuation with space
    line = replacePunctuations(line)
    # Get words from each line
    words = line.split()
    for word in words:
        if word in wordCounts:
            wordCounts[word] += 1
        else:
            wordCounts[word] = 1

def main():
    # Prompt the user to enter a file
    filename = input("Enter a filename: ").strip()
    # Open the file
    infile = open(filename, "r")
    # Create an empty dictionary to count words
    wordCounts = {}
    # Process each line using the user-defined function
    for line in infile:
        processLine(line.lower(), wordCounts)
    infile.close()
    # Get pairs from the dictionary
    pairs = list(wordCounts.items())
    # Reverse pairs in the list so that the count comes first
    items = [[x, y] for (y, x) in pairs]
    # Sort pairs in items (by count, then by word)
    items.sort()
    # Display the 10 most common words (fewer if the file has fewer than 10 distinct words)
    for i in range(len(items) - 1, max(len(items) - 11, -1), -1):
        print(items[i][1] + "\t" + str(items[i][0]))

main()
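As an alternative sketch (not the approach used in the card above), the standard library's collections.Counter can do both the counting and the ranking. The function name top_words and the sample sentence are made up for illustration. Punctuation is handled with a regular expression instead of replacePunctuations, so digits are dropped here, which the dictionary version would keep.

```python
import re
from collections import Counter

def top_words(text, n=10):
    """Return the n most common (word, count) pairs in text (hypothetical helper)."""
    # Lowercase, then keep runs of letters only; anything else acts as a separator.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(n)

sample = "The cat saw the dog. The dog saw the bird!"
for word, count in top_words(sample, 3):
    print(word + "\t" + str(count))
```

Counter is a dictionary subclass, so everything said about dictionaries in this chapter still applies to it.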
What does >>> s1 = {(1,2,3)} do?
It creates a set whose only element is the tuple (1, 2, 3):
>>> s1
{(1, 2, 3)}
>>> s1 = {1,2,3}
>>> s2 = {1,2,3,4,5}
>>> s1 | s2
????
>>> s1&s2
????
>>> s1 - s2
????
>>> s2 - s1
????
>>> s1 ^ s2
????
>>> s2 ^ s1
????
>>> s1 | s2
{1, 2, 3, 4, 5}
>>> s1&s2
{1, 2, 3}
>>> s1 - s2
set()
>>> s2 - s1
{4, 5}
>>> s1 ^ s2
{4, 5}
>>> s2 ^ s1
{4, 5}
>>> tuple(1,2,3)
????
>>> list(1,2,3)
????
>>> set(1,2,3)
????
>>> (1,2,3)
????
>>> [1,2,3]
????
>>> {1,2,3}
????
>>> t1 = tuple(1,2,3)
Traceback (most recent call last):
  File "<pyshell#118>", line 1, in <module>
    t1 = tuple(1,2,3)
TypeError: tuple expected at most 1 arguments, got 3
>>> list(1,2,3)
Traceback (most recent call last):
  File "<pyshell#119>", line 1, in <module>
    list(1,2,3)
TypeError: list expected at most 1 arguments, got 3
>>> set(1,2,3)
Traceback (most recent call last):
  File "<pyshell#120>", line 1, in <module>
    set(1,2,3)
TypeError: set expected at most 1 arguments, got 3
>>> (1,2,3)
(1, 2, 3)
>>> [1,2,3]
[1, 2, 3]
>>> {1,2,3}
{1, 2, 3}
What is a dictionary? What is a key? What constraints are there for keys? For values? What forms an item in a dictionary? Why is it called a dictionary?
A dictionary stores values, each paired with a unique key. The keys work like an index operator. In a list, the indexes are integers. In a dictionary, the key must be a hashable object (so a list or a dictionary cannot be a key). A dictionary cannot contain duplicate keys. Each key maps to one value. A value can be of any type. A key and its corresponding value form an item (or entry) stored in a dictionary. The data structure is called a "dictionary" because it resembles a word dictionary, where the words are the keys and the words' definitions are the values.
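A small sketch of the key constraints described above. The dictionary name grades and its contents are invented for illustration.

```python
# Hashable objects such as strings, numbers, and tuples can be keys.
grades = {"alice": 90, 42: "answer", (1, 2): "point"}
print(grades[(1, 2)])

# A list is mutable and unhashable, so using it as a key raises TypeError.
try:
    grades[[1, 2]] = "oops"
    list_key_error = None
except TypeError as err:
    list_key_error = str(err)
print(list_key_error)

# Assigning to an existing key replaces its value; the key is not duplicated.
grades["alice"] = 95
print(len(grades), grades["alice"])
```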
Can a set contain elements of the same type and of mixed types? Can a set contain a list?
A set can contain elements of the same type or of mixed types. For example, s = {1, 2, 3, "one", "two", "three"} is a set that contains numbers and strings. Each element in a set must be hashable, so a set cannot contain a list. An object is hashable if it has a hash value that never changes during its lifetime. Most of Python's immutable built-in objects are hashable, while mutable containers (such as lists or dictionaries) are not. Objects which are instances of user-defined classes are hashable by default.
Set comparisons: What are subsets, supersets, proper subsets, and proper supersets? Comparison operators with sets (5)
A set s1 is a subset of s2 if every element in s1 is also in s2. You can use the s1.issubset(s2) method to determine whether s1 is a subset of s2.
A set s1 is a superset of set s2 if every element in s2 is also in s1. You can use the s1.issuperset(s2) method to determine whether s1 is a superset of s2.
If s1 is a proper subset of s2, every element in s1 is also in s2, and at least one element in s2 is not in s1. If s1 is a proper subset of s2, s2 is a proper superset of s1.
s1 < s2 returns True if s1 is a proper subset of s2
s1 <= s2 returns True if s1 is a subset of s2
s1 > s2 returns True if s1 is a proper superset of s2
s1 >= s2 returns True if s1 is a superset of s2
You can use the == and != operators to test if two sets contain the same elements, NO MATTER THE ORDER.
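The subset and superset tests can be tried out with a short sketch like this one (the sets are arbitrary examples):

```python
s1 = {1, 2, 4}
s2 = {1, 4, 5, 2, 6}

print(s1.issubset(s2))          # every element of s1 is in s2
print(s2.issuperset(s1))        # the same fact seen from s2
print(s1 < s2)                  # proper subset: s2 has extra elements
print(s1 <= s1)                 # a set is a subset of itself...
print(s1 < s1)                  # ...but not a proper subset of itself
print({1, 2, 4} == {4, 2, 1})   # equality ignores order
```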
If you are reading in a file and testing to see if a keyword is in that file, would it be more effective to use a list or a set?
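A set! Testing membership with in on a set is a hash lookup that takes roughly constant time on average, while a list has to be scanned element by element.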
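A minimal sketch of the two membership tests. The sentence and the keyword are made up, and both tests give the same answer; only the way Python finds that answer differs.

```python
words_list = "the quick brown fox jumps over the lazy dog".split()
words_set = set(words_list)   # duplicates such as "the" collapse to one element

keyword = "fox"
found_in_list = keyword in words_list   # scans the list from the front
found_in_set = keyword in words_set     # looks the word up by its hash

print(found_in_list, found_in_set)
print(len(words_list), len(words_set))
```

The difference only becomes noticeable when the collection is large or the test is repeated many times.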
Tuples are not mutable, but can the elements in a tuple be mutable?
A tuple contains a fixed list of elements. An individual element in a tuple may be mutable.
>>> t1 = (1, [1,2,3], 2)
>>> t1[1][2]=0
>>> t1
(1, [1, 2, 0], 2)
>>> s1 = {[1,2,3], 1, 2, 3} What does this output?
An error:
Traceback (most recent call last):
  File "<pyshell#91>", line 1, in <module>
    s1 = {[1,2,3], 1, 2, 3}
TypeError: unhashable type: 'list'
When is a tuple really considered immutable?
If a tuple contains only immutable objects, the tuple is said to be immutable. For example, a tuple of numbers or a tuple of strings is immutable.
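One practical consequence, sketched here with invented names: a tuple of immutable objects can be hashed (and so stored in a set or used as a dictionary key), while a tuple that holds a list cannot.

```python
point = (1, 2, 3)
mixed = (1, [1, 2, 3], 2)

# A tuple of numbers is hashable, so it can be an element of a set.
points = {point}
print(point in points)

# A tuple holding a list cannot be hashed, because the list could change later.
try:
    hash(mixed)
    mixed_is_hashable = True
except TypeError:
    mixed_is_hashable = False
print(mixed_is_hashable)
```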
Empty sets vs empty dictionaries
Python uses curly braces for sets and dictionaries. The syntax {} denotes an empty dictionary. To create an empty set, use set().
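A quick check of the two empty forms (variable names are only illustrative):

```python
empty_dict = {}        # curly braces alone give a dictionary
empty_set = set()      # the only way to write an empty set

print(type(empty_dict).__name__)
print(type(empty_set).__name__)
print(len(empty_dict), len(empty_set))
```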
How are sets similar to and different from lists?
Sets are like lists in that you use them for storing a collection of elements. Unlike lists, however, the elements in a set are nonduplicates and are not placed in any particular order.
Can sets use slicing and indexing?
Since sets are not ordered, there is no indexing or slicing of sets.
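A short sketch of what happens when indexing is attempted. Converting to a sorted list is one possible workaround, shown here as an assumption about what the reader might want rather than something the card prescribes.

```python
s = {30, 10, 20}

# Indexing a set raises TypeError.
try:
    s[0]
    indexing_works = True
except TypeError:
    indexing_works = False
print(indexing_works)

# If a position is needed, build a list from the set first.
ordered = sorted(s)    # a new list in ascending order
first = ordered[0]
print(ordered, first)
```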
Set operations: union, intersection, difference, symmetric difference
The union of two sets is a set that contains all of the elements from both sets.
s1.union(s2) OR s1 | s2
The intersection of two sets is a set that contains the elements that appear in both sets.
s1.intersection(s2) OR s1 & s2
The difference between two sets is a set that contains the elements in the first set that are not in the second set.
s1.difference(s2) OR s1 - s2
The symmetric difference (or exclusive or) between two sets is a set that contains the elements that are in exactly one of the two sets.
s1.symmetric_difference(s2) OR s1 ^ s2
Do the set operations change the sets involved in the operation?
These set methods return a resulting set, but they do not change the elements in the sets.
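This can be checked with a sketch like the following, using two arbitrary sets: each operation produces a new set, and the operands are the same afterwards.

```python
s1 = {1, 2, 3}
s2 = {3, 4, 5}

union = s1 | s2                  # same result as s1.union(s2)
common = s1.intersection(s2)     # same result as s1 & s2
only_in_s1 = s1 - s2
in_exactly_one = s1 ^ s2

print(union, common, only_in_s1, in_exactly_one)
print(s1, s2)                    # the original sets are unchanged
```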
What to remember about sets?
They do not contain duplicates!!
What is a tuple?
Tuples are like lists, but their elements are fixed; that is, once a tuple is created, you cannot add new elements, delete elements, replace elements, or reorder the elements in the tuple.
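A brief sketch of that restriction, with an arbitrary tuple: assigning to an element fails, and "adding" an element really means building a new tuple.

```python
t = (1, 3, 5)

# Replacing an element is not allowed.
try:
    t[0] = 99
    changed = True
except TypeError:
    changed = False
print(changed, t)

# Concatenation creates a new tuple and leaves t as it was.
t_new = t + (7,)
print(t_new, t)
```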
How to split up the tuples that result from dict.items()
Using simultaneous assignment to split up the tuples: for key,value in d2.items():
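A complete loop of this form might look like the sketch below. The contents of d2 are invented for illustration.

```python
d2 = {1: "A", 2: "B", 3: "C"}

lines = []
for key, value in d2.items():
    # Each item is a (key, value) tuple, unpacked into two names here.
    lines.append(str(key) + " : " + value)

print("\n".join(lines))
```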
Can you iterate through a dictionary? How to print each item in a dictionary
Yes, you can iterate through the keys:
>>> for key in d1:
        print(str(key) + " : " + str(d1[key]))

1 : A
2 : B
3 : C
4 : D
5 : E
How to create a dictionary
You can create a dictionary by enclosing the items inside a pair of curly braces ({}). Each item consists of a key, followed by a colon, followed by a value. The items are separated by commas.
students = {"111-34-3434":"John", "132-56-6290":"Peter"}
How to create a set
You can create a set of elements by enclosing the elements inside a pair of curly braces ({}). The elements are separated by commas. You can create an empty set, or you can create a set from a list or a tuple, as shown in the following examples:
1. s1 = set() # Create an empty set
2. s2 = {1, 3, 5} # Create a set with three elements
3. s3 = set((1, 3, 5)) # Create a set from a tuple
>>> s3 = set((1,2,3))
>>> s3
{1, 2, 3}
4. # Create a set from a list
s4 = set([x * 2 for x in range(1, 10)])
Likewise, you can create a list or a tuple from a set by using the syntax list(set) or tuple(set).
5. You can also create a set from a string. Each character in the string becomes an element in the set.
# Create a set from a string
>>> s5 = set("aabbccddeeffgg")
>>> s5
{'c', 'e', 'f', 'b', 'd', 'g', 'a'}
Note that although the character a appears twice in the string, it appears only once in the set because a set does not store duplicate elements.
A set cannot contain an unhashable element such as a list:
>>> s1 = {[1,2,3], 1, 2, 3}
Traceback (most recent call last):
  File "<pyshell#91>", line 1, in <module>
    s1 = {[1,2,3], 1, 2, 3}
TypeError: unhashable type: 'list'
How to create an empty dictionary?
You can create an empty dictionary by using the following syntax:
students = {} # Create an empty dictionary
Uses of the 3 data structures
You can use a tuple for storing a fixed list of elements, a set for storing and quickly accessing nonduplicate elements, and a dictionary for storing key/value pairs and for accessing elements quickly using the keys.
Comparison operators that work with dictionaries
You can use the == and != operators to test whether two dictionaries contain the same items (REGARDLESS OF ORDER). You cannot use the comparison operators (>, >=, <=, and <) to compare dictionaries because the items are not ordered; trying to do so raises a TypeError.
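A sketch of both behaviors, using two made-up dictionaries that hold the same items in a different order:

```python
d1 = {"red": 41, "blue": 3}
d2 = {"blue": 3, "red": 41}

print(d1 == d2)    # same items, so equal even though the order differs
print(d1 != d2)

# Ordering comparisons are not defined for dictionaries.
try:
    d1 < d2
    ordering_supported = True
except TypeError:
    ordering_supported = False
print(ordering_supported)
```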
How to make a tuple (5 ways)
You create a tuple by enclosing its elements inside a pair of parentheses. The elements are separated by commas. You can create an empty tuple, and you can create a tuple from a list, a set, or a string:
1. t1 = () # Create an empty tuple
2. t2 = (1, 3, 5) # Create a tuple with three elements
3. # Create a tuple from a list
t3 = tuple([2 * x for x in range(1, 5)])
4. You can also create a tuple from a set by using the syntax tuple(set).
5. You can also create a tuple from a string. Each character in the string becomes an element in the tuple. For example:
# Create a tuple from a string
t4 = tuple("abac") # t4 is ('a', 'b', 'a', 'c')
What happens if you try to remove an item not in a set?
You get a KeyError:
>>> s1.remove(6)
Traceback (most recent call last):
  File "<pyshell#105>", line 1, in <module>
    s1.remove(6)
KeyError: 6
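Two ways to avoid that error, sketched with an arbitrary set. The discard method is not mentioned in the card above; it is a standard set method that removes an element if present and does nothing otherwise.

```python
s1 = {1, 2, 3}

# Option 1: test for membership before calling remove.
if 6 in s1:
    s1.remove(6)

# Option 2: discard never raises KeyError for a missing element.
s1.discard(6)

# Removing an element that is present works either way.
s1.remove(3)
print(s1)
```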
How to read in a file, test whether some keywords are in that file, and count the number of keywords that occur in the file
import os.path
import sys

def main():
    # Placeholder set: put your own strings here, separated by commas
    keyWords = {"and", "def", "if"}
    filename = input("Enter a Python source code filename: ").strip()
    # Check if file exists
    if not os.path.isfile(filename):
        print("File", filename, "does not exist")
        sys.exit()
    # Open file for input
    infile = open(filename, "r")
    # Read and split words from the file
    text = infile.read().split()
    infile.close()
    count = 0
    for word in text:
        if word in keyWords:
            count += 1
    print("The number of keywords in", filename, "is", count)

main()
Dictionary methods
dict.clear()
Deletes all entries from dict and returns None.
dict.get(key)
Returns the value corresponding to the key. Same as dict[key], EXCEPT that if the key isn't present it returns None instead of raising an error.
dict.pop(key)
Removes the key/value pair and returns the value. Same as del dict[key], EXCEPT that it returns the value.
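The three methods can be tried on a small made-up dictionary. The second argument to get is an optional default that the card does not mention; it is part of the standard dict API.

```python
d = {"a": 1, "b": 2}

missing = d.get("z")        # key absent: None instead of a KeyError
fallback = d.get("z", 0)    # optional default returned when the key is absent
popped = d.pop("a")         # removes "a" and hands back its value
print(missing, fallback, popped, d)

result = d.clear()          # empties the dictionary in place
print(result, d)
```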
Dictionary methods useful for looping
dict.keys()
Returns a sequence (a view object) of the keys
>>> d1
{1: 'A', 2: 'B', 3: 'C', 4: 'D', 5: 'E'}
>>> d1.keys()
dict_keys([1, 2, 3, 4, 5])
dict.values()
Returns a sequence (a view object) of the values
>>> d1.values()
dict_values(['A', 'B', 'C', 'D', 'E'])
dict.items()
Returns a sequence (a view object) of (key, value) tuples
>>> d1.items()
dict_items([(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E')])

for key in d2.keys(): iterates through the keys
for value in d2.values(): iterates through the values
for item in d2.items(): iterates through key/value tuples

You can also convert each result to a list:
>>> list(d1.keys())
[1, 2, 3, 4, 5]
>>> list(d1.values())
['A', 'B', 'C', 'D', 'E']
>>> list(d1.items())
[(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E')]
How to add an item to a dictionary (or replace the value for an already existing key)? How to retrieve a value (non-method way), and what is the issue with this way? How to delete an item from a dictionary? How to find the length of a dictionary? How to find if a key is in a dictionary (using in and not in)?
dictionaryName[key] = value
Adds the item. If the key is already in the dictionary, this statement replaces the value for the key.
dictionaryName[key]
Retrieves a value. If the key is in the dictionary, the value for the key is returned. Otherwise, a KeyError exception is raised (that is the issue with this way).
del dictionaryName["key"]
Deletes the item with the key "key" from the dictionary. If the key is not in the dictionary, a KeyError exception is raised.
len(dict) returns the number of items in dict
x in dict True if x is a key in dict (in checks keys only, not values)
x not in dict True if x is not a key in dict
>>> 1 in d1
True
>>> "A" in d1
False
>>> "A" in d1[1]
True
The last test is True only because d1[1] is the string 'A', so in is checking that string, not the dictionary.
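Put together, these operations might look like the sketch below. The extra key and the names "Susan" and "John Smith" are invented for illustration.

```python
students = {"111-34-3434": "John", "132-56-6290": "Peter"}

students["234-56-9010"] = "Susan"        # new key: adds an item
students["111-34-3434"] = "John Smith"   # existing key: replaces the value
del students["132-56-6290"]              # deletes the item with this key

print(len(students))
print("234-56-9010" in students)         # in looks at keys
print("Susan" in students)               # values are not searched
print("Susan" in students.values())      # search the values explicitly
```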
Functions for sets
len(mySet) returns the number of elements in mySet
min(mySet) returns the smallest element in mySet
max(mySet) returns the largest element in mySet
sum(mySet) returns the sum of all elements in mySet
x in mySet True if x is an element of the set mySet
x not in mySet True if x is not an element of the set mySet
Methods for sets
mySet.add(item) adds an item to mySet
mySet.remove(item) removes an item from mySet
What does the empty set look like?
set()
Common tuple operations
t[i] the element of t at index i
t[i:j] a slice of t consisting of the elements from index i up to, but not including, index j
t1 + t2 concatenates two tuples t1 and t2
t*n, n*t n copies of t concatenated together
len(t) returns the number of elements in t
min(t) returns the smallest element in t
max(t) returns the largest element in t
sum(t) returns the sum of all elements in t
<, <=, >, >=, ==, != used to compare two tuples
x in t True if x is an element of the tuple t
x not in t True if x is not an element of the tuple t
for i in tuple1: traverses the elements of a tuple with a for loop
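Most of these operations fit in one short sketch. The tuples t1 and t2 are arbitrary examples.

```python
t1 = (4, 1, 3)
t2 = (7, 5)

combined = t1 + t2          # concatenation builds a new tuple
middle = combined[1:3]      # indexes 1 and 2; index 3 is excluded
doubled = t2 * 2            # two copies of t2 joined together

print(combined, middle, doubled)
print(len(combined), min(combined), max(combined), sum(combined))
print(t1 < t2)              # compared element by element, starting with 4 < 7
print(3 in t1, 9 not in t1)

for element in t1:
    print(element)
```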
Are tuples or lists more efficient?
Tuples are more efficient, due to Python's implementation.
