Programming for Everybody - Python - Coursera (2nd Module)
Dictionary Tracebacks
It is an error to reference a key that is not in the dictionary. >>> ccc = dict() >>> print(ccc['csev']) Traceback (KeyError: 'csev') We can use the in operator to see if a key is in a dictionary. >>> 'csev' in ccc False
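A minimal sketch of both behaviours, reusing the ccc dictionary from the card. The try/except wrapper and the names lookup_failed and has_key are additions for illustration, so the script keeps running after the error.

```python
ccc = dict()

# Looking up a missing key raises KeyError (this is the traceback).
try:
    print(ccc['csev'])
    lookup_failed = False
except KeyError:
    lookup_failed = True
    print('csev is not a key in ccc')

# The in operator checks for a key without raising an error.
has_key = 'csev' in ccc
print(has_key)
```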
Looking inside lists
Just like strings, we can get at any single element in a list using an index specified in square brackets. friends = ['Joseph', 'Glenn', 'Sally'] print(friends[1]) Output Glenn
List vs Dictionary
List - linear collection of values that stay in order Dictionary - a "bag" of values, each with its own label
List Constants
List constants are surrounded by square brackets and the elements in the list are separated by commas. A list element can be any Python object, even another list, e.g. [1, [5, 6], 'Hello']. A list can be empty: []
Lists are Mutable (unlike strings)
String fruit = 'Banana' fruit[0] = 'b' You will get an error (TypeError) since strings aren't mutable. List lotto = [2, 14, 26, 41, 63] lotto[2] = 28 print(lotto) Output [2, 14, 28, 41, 63] Because lists are mutable, we can change elements of a list using the index operator.
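The contrast above can be sketched as a short script. The try/except and the new_fruit workaround (building a new string with concatenation and a slice) are additions for illustration, not part of the original card.

```python
fruit = 'Banana'

# Strings are immutable: assigning to an index raises TypeError.
try:
    fruit[0] = 'b'
    string_changed = True
except TypeError:
    string_changed = False

# To "change" a string, build a new one instead.
new_fruit = 'b' + fruit[1:]
print(new_fruit)

# Lists are mutable: assigning to an index changes the list in place.
lotto = [2, 14, 26, 41, 63]
lotto[2] = 28
print(lotto)
```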
colon operator - slicing strings
The second number is one beyond the slice, it says "up to but NOT including" s = 'Monty Python' print (s[0:4]) output will be Mont print (s[4:7]) output will be y P print (s[6:20]) output will be Python (no traceback error) print (s[:2]) output will be Mo (if you leave off the first or last number it is assumed to be the beginning or end of string respectively)
Built-in Functions and Lists
There are a number of functions built into Python that take lists as parameters. >>> nums = [3, 41, 12, 9, 74, 15] >>> print(len(nums)) 6 >>> print(max(nums)) 74 >>> print(min(nums)) 3 >>> print(sum(nums)) 154 >>> print(sum(nums)/len(nums)) 25.666666666666668
Opening a File
To read the contents of a file, we have to tell Python which file we are going to work with and what we'll be doing with the file. This is done with the open() function. open() returns a "file handle" - a variable used to perform operations on the file. Similar to "File -> Open" in a word processor.
Lists and Definite Loops
friends = ['Joseph', 'Glenn', 'Sally'] for friend in friends: print('Happy New Year:', friend) Output Happy New Year: Joseph Happy New Year: Glenn Happy New Year: Sally
in as a Logical Operator
fruit = 'banana' 'n' in fruit True 'm' in fruit False 'nan' in fruit True
Looping through strings (2) for
fruit = 'banana' for letter in fruit: print (letter) This will output b a n a n a
Looping through strings (1) while
fruit = 'banana' index = 0 while index < len (fruit) : letter = fruit [index] print (index, letter) index = index + 1
len (built-in function)
gives the length of a string fruit = 'banana' print (len(fruit)) output will be 6 b/c 'banana' has six characters
open() (Built-in Function)
handle = open(filename, mode) returns a handle used to manipulate the file. filename is a string. mode is optional and should be 'r' if we are planning to read the file and 'w' if we are going to write to the file. Example: fhand = open('mbox.txt', 'r')
Concatenating Lists
>>> a = [1, 2, 3] >>> b = [4, 5, 6] >>> c = a + b >>> print(c) [1, 2, 3, 4, 5, 6]
split() continued (specifying a delimiter)
>>> line = 'A lot      of spaces' >>> etc = line.split() >>> print(etc) ['A', 'lot', 'of', 'spaces'] >>> line = 'first;second;third' >>> thing = line.split() >>> print(thing) ['first;second;third'] >>> print(len(thing)) 1 >>> thing = line.split(';') >>> print(thing) ['first', 'second', 'third'] >>> print(len(thing)) 3 When you do not specify a delimiter, multiple spaces are treated like one delimiter.
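A small sketch of why the "no delimiter" case is special: split() with no argument collapses runs of whitespace, while an explicit ' ' delimiter does not and produces empty strings between consecutive spaces. The sample strings and variable names are made up for this illustration.

```python
# Two spaces between 'lot' and 'of'.
line = 'A lot  of spaces'

# No delimiter: runs of whitespace count as a single separator.
default_words = line.split()
print(default_words)

# Explicit single-space delimiter: every space is a separator,
# so the doubled space yields an empty string.
space_words = line.split(' ')
print(space_words)

# An explicit delimiter is what we want for non-space separators.
fields = 'first;second;third'.split(';')
print(fields, len(fields))
```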
dict()
>>> purse = dict() >>> purse['money'] = 12 >>> purse['candy'] = 3 >>> purse['tissues'] = 75 >>> print(purse) {'money': 12, 'candy': 3, 'tissues': 75} >>> print(purse['candy']) 3 >>> purse['candy'] = purse['candy'] + 2 >>> print(purse) {'money': 12, 'candy': 5, 'tissues': 75} 'candy' is the key, purse is the dictionary. Dictionary entries are not accessed by position (since Python 3.7 they are kept in insertion order), so we index things in the dictionary with a "lookup tag".
\n (newline character)
>>> stuff = 'Hello\nWorld!' >>> print(stuff) Hello World! >>> stuff = 'X\nY' >>> print(stuff) X Y >>> len(stuff) 3 *\n is seen as one character, not two*
Slicing Lists
>>> t = [9, 41, 12, 3, 74, 15] >>> t[1:3] [41, 12] Just like in strings, the second number is "up to but NOT including".
List Methods
>>> x = list() >>> type(x) <class 'list'> >>> dir(x) [..., 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort'] >>>
Dictionary Literals (Constants)
>>> jjj = {'chuck': 1, 'fred': 42, 'jan': 100} >>> print(jjj) {'chuck': 1, 'fred': 42, 'jan': 100} >>> ooo = {} >>> print(ooo) {} >>> Dictionaries use curly braces and have a list of key:value pairs. You can make an empty dictionary using empty curly braces.
delimiter
A character that separates data entries from one another. Can be passed to the split() method, e.g. split(',') will split a string wherever commas appear.
Dictionary
A dictionary is a general-purpose data structure for storing a group of objects. A dictionary has a set of keys and each key has a single associated value. ... A dictionary is also called a hash, a map, or a hashmap in different programming languages (and an Object in JavaScript). Often called an associative array.
Lists are in Order
A list can hold many items and keeps those items in order until we do something to change the order. A list can be sorted (i.e. change its order). The sort method (unlike string methods, which return a new string) means "sort yourself" and changes the list in place. >>> friends = ['Joseph', 'Glenn', 'Sally'] >>> friends.sort() >>> print(friends) ['Glenn', 'Joseph', 'Sally'] >>> print(friends[1]) Joseph
replace () (Method)
A string method that replaces one value in a string with another and returns the result as a new string. greet = 'Hello Bob' nstr = greet.replace('Bob', 'Jane') print(nstr) Output will be Hello Jane (all instances of Bob are replaced with Jane).
Collection
A variable or value with several elements, such as a string, tuple, or list. Allows us to put many values in a single "variable". Convenient b/c we can carry many values around in one package.
Algorithms vs Data Structures
Algorithm - set of rules/steps used to solve a problem Data Structures - a particular way of organizing data in a computer
Definite Loops and Dictionaries
Even though dictionaries are not indexed by position, we can write a for loop that goes through all the entries in a dictionary - actually it goes through all of the keys in the dictionary and looks up the values. >>> counts = {'chuck': 1, 'fred': 42, 'jan': 100} >>> for key in counts: print(key, counts[key]) chuck 1 fred 42 jan 100 (since Python 3.7 the keys come out in insertion order; older versions could print them in any order)
Double Split Pattern
>>> line = 'From bob.smith@gmail.com Sat Jan 5 09:14:16' >>> words = line.split() >>> email = words[1] >>> print(email) bob.smith@gmail.com >>> pieces = email.split('@') >>> print(pieces) ['bob.smith', 'gmail.com'] >>> print(pieces[1]) gmail.com That will pull out the domain of the email address. Sometimes we split a line one way, and then grab one of the pieces of the line and split again.
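The double split can be sketched as a runnable script. The sample line is an assumption reconstructed from the pieces shown on the card ('bob.smith' and 'gmail.com'); the variable name domain is added for illustration.

```python
# Sample mail header line (assumed for this sketch).
line = 'From bob.smith@gmail.com Sat Jan 5 09:14:16'

# First split: break the line into whitespace-separated words.
words = line.split()
email = words[1]

# Second split: break the address at the @ sign.
pieces = email.split('@')
domain = pieces[1]

print(email)
print(pieces)
print(domain)
```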
Python 3 Unicode
In Python 3 all strings are Unicode
String Library
Python has a number of string functions which are in the string library. These functions are already built into every string - we invoke them by appending the function to the string variable. These functions do not modify the original string; instead they return a new string that has been altered. .lower() greet = 'Hello Bob' zap = greet.lower() print(zap) output will be hello bob (it will be all lower case letters)
Is Something in a List
Python provides two operators (in and not in) that let you check if an item is in a list. These logical operators return True or False. They do not modify the list. >>> some = [1, 9, 21, 10, 16] >>> 9 in some True >>> 15 in some False >>> 20 not in some True
Dictionary (continued)
Python's most powerful data collection. Allows us to do fast database-like operations in Python. Different names in different languages: Associative Arrays (Perl/PHP); Properties, Map, or HashMap (Java); Property Bag (C#/.NET)
find() (Method)
Searches for a substring within another string. Finds the first occurrence of the substring. fruit = 'banana' pos = fruit.find('na') print(pos) output will be 2; it shows you the position in the string where that substring starts. If the substring is not found, find() returns -1
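A short sketch of the three cases worth remembering: a match, a search that starts from a later position (find() accepts an optional start index as its second argument), and no match. The variable names are chosen for this illustration.

```python
fruit = 'banana'

# First occurrence of 'na' starts at index 2.
first = fruit.find('na')

# Start searching at index 3 to find the next occurrence.
second = fruit.find('na', 3)

# A substring that is not present gives -1 rather than an error.
missing = fruit.find('z')

print(first, second, missing)
```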
Flat Text File
Simplest type of file; the computer writes straight to it without encoding it in computer language or doing any special kind of formatting.
Building a List from Scratch
We can create an empty list and then add elements using the append method. The list stays in order and new elements are added at the end of the list. >>>stuff = list() >>> stuff.append ('book') >>>stuff.append (99) >>>print(stuff) ['book' , 99] >>> stuff.append ('cookie') >>>print (stuff) ['book' , 99 , 'cookie']
Searching through a file
We can put an if statement in our for loop to only print lines that meet some criteria. fhand = open('mbox.txt') for line in fhand: if line.startswith('From:'): print(line) *Caution* each line from the file has a newline at the end and print() adds another newline, so this prints a blank line after every matching line.
Simplified Counting with get()
We can use get() and provide a default value of zero when the key is not yet in the dictionary - and then just add one. counts = dict() names = ['csev' , 'cwen' , 'csev' , 'zqian' , 'cwen'] for name in names: counts [name] = counts.get(name, 0) + 1 print (counts) ---------------------- 0 is the default value for new keys, So if a word isn't in the dictionary it gets assigned 1. If it is it has 1 added to its value each time it appears
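As a rough check that get() really is shorthand for the if/else pattern, the sketch below counts the same names both ways and compares the results. The names counts_ifelse and counts_get are made up for this illustration.

```python
names = ['csev', 'cwen', 'csev', 'zqian', 'cwen']

# Long form: test for the key before updating it.
counts_ifelse = dict()
for name in names:
    if name not in counts_ifelse:
        counts_ifelse[name] = 1
    else:
        counts_ifelse[name] = counts_ifelse[name] + 1

# Short form: get() supplies 0 when the key is not there yet.
counts_get = dict()
for name in names:
    counts_get[name] = counts_get.get(name, 0) + 1

print(counts_get)
print(counts_ifelse == counts_get)
```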
Two Iteration Variables can be used in For Loops
We loop through the key-value pairs in a dictionary using *two* iteration variables. Each iteration, the first variable is the key and the second variable is the corresponding value for the key. book = {'chuck': 1, 'fred': 42, 'jan': 100} for aaa, bbb in book.items(): print(aaa, bbb)
Keys vs Values vs Items
You can get a list of keys, values, or items (both) from a dictionary. >>> book = {'chuck': 1, 'fred': 42, 'jan': 100} LIST >>> print(list(book)) ['chuck', 'fred', 'jan'] KEYS >>> print(list(book.keys())) ['chuck', 'fred', 'jan'] VALUES >>> print(list(book.values())) [1, 42, 100] ITEMS >>> print(list(book.items())) [('chuck', 1), ('fred', 42), ('jan', 100)] (keys, values, and items are methods, so they need parentheses; in Python 3 they return view objects, which list() turns into lists)
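A minimal sketch of the three calls in Python 3, where keys(), values(), and items() return view objects that are wrapped in list() to get plain lists. The order shown assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
book = {'chuck': 1, 'fred': 42, 'jan': 100}

# Iterating a dictionary (or calling list() on it) gives its keys.
keys = list(book.keys())
values = list(book.values())

# items() gives (key, value) tuples.
items = list(book.items())

print(keys)
print(values)
print(items)
```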
lower() (Method)
a "method" called on an object. lower() returns a copy of a string with all of its uppercase characters changed to lowercase; the original string is not modified.
string
a sequence of characters A string literal uses quotes 'Hello' or "Hello" For strings, + means "concatenate" When strings contain numbers, it is still a string We can convert numbers in a string into a number using int()
Counting Pattern
counts = dict() line = input('Enter a line of text: ') words = line.split() print('Words:', words) print('Counting...') for word in words: counts[word] = counts.get(word, 0) + 1 print('Counts', counts)
When We See a New Name
counts = dict() names = ['csev', 'cwen', 'csev', 'zqian', 'cwen'] for name in names: if name not in counts: counts[name] = 1 else: counts[name] = counts[name] + 1 print(counts) {'csev': 2, 'cwen': 2, 'zqian': 1}
Counting lines from a file
doc = open('text.txt') count = 0 for line in doc: count = count + 1 print('Line Count:', count)
Using in to Select Lines
fhand = open ('mbox-short.txt') for line in fhand: line = line.rstrip() if not 'uct.ac.za' in line : continue print (line)
Searching Through a File (fixed)
fhand = open('mbox.txt') for line in fhand: line = line.rstrip() if line.startswith('From:'): print(line) rstrip() strips the whitespace (including the newline) from the right-hand side of each line, so the extra blank lines are gone.
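A self-contained sketch of the same loop. Since mbox.txt may not be at hand, it first writes a tiny sample file to a temporary directory; the file contents, the addresses, and the matches list are all made up for this illustration.

```python
import os
import tempfile

# Create a small stand-in for mbox.txt (contents are invented).
sample = 'From: alice@example.com\nSubject: hello\nFrom: bob@example.com\n'
path = os.path.join(tempfile.mkdtemp(), 'mbox-sample.txt')
with open(path, 'w') as fout:
    fout.write(sample)

matches = []
fhand = open(path)
for line in fhand:
    line = line.rstrip()              # drop the trailing newline
    if line.startswith('From:'):
        matches.append(line)
        print(line)                   # no blank line in between now
fhand.close()
```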
Skipping with Continue
fhand = open('mbox.txt') for line in fhand: line = line.rstrip() if not line.startswith ('From:') : continue print (line)
Bad File Names
fname = input('Enter the file name: ') try: fhand = open(fname) except: print('File cannot be opened:', fname) quit() count = 0 for line in fhand: if line.startswith('Subject:'): count = count + 1 print('There were', count, 'subject lines in', fname)
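One way to make this pattern easier to test is to wrap it in a function, sketched below. The function name count_subject_lines is hypothetical, it returns None instead of calling quit(), and it catches OSError (which covers a missing file) rather than using a bare except; these are choices made for this sketch, not part of the original card.

```python
import os
import tempfile


def count_subject_lines(fname):
    """Return the number of 'Subject:' lines, or None if fname cannot be opened."""
    try:
        fhand = open(fname)
    except OSError:
        print('File cannot be opened:', fname)
        return None
    count = 0
    for line in fhand:
        if line.startswith('Subject:'):
            count = count + 1
    fhand.close()
    return count


# A path inside a fresh temporary directory, so the file cannot exist.
missing_path = os.path.join(tempfile.mkdtemp(), 'missing.txt')
missing_result = count_subject_lines(missing_path)

# A small invented file with two subject lines.
good_path = os.path.join(tempfile.mkdtemp(), 'mail.txt')
with open(good_path, 'w') as fout:
    fout.write('Subject: one\nFrom: someone\nSubject: two\n')
good_result = count_subject_lines(good_path)
print('There were', good_result, 'subject lines in', good_path)
```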
Prompt for File Name
fname = input('Enter the file name: ') fhand = open(fname) count = 0 for line in fhand: if line.startswith('Subject:'): count = count + 1 print('There were', count, 'subject lines in', fname)
in - a deeper look
for letter in 'banana' : print (letter) letter = iteration variable banana = sequence / string print(letter) = block (body) The iteration variable "iterates" through the sequence and the block is executed once for each value in the sequence
get()
if name in counts: x = counts[name] else: x = 0 is the same as x = counts.get(name, 0) Example dictionary: {'csev': 2, 'zqian': 1, 'cwen': 2} ---------------------------------- name = key 0 = default value The pattern of checking to see if a key is already in a dictionary and assuming a default value if the key is not there is so common that there is a method called get(). The method only works on dictionaries, not lists. It works whether the key exists or not, with no traceback error.
input
input only ever gives us back a string if you use input for numbers you must then later convert that string into a number (int (), float () etc...)
How long is a list?
len() function tells us the number of elements of any set or sequence x = [1, 2, 'joe' , 99] print (len(x)) Output 4
startswith (Method)
line = 'Please have a nice day' line.startswith ('Please') True line.startswith ('p') False
Bringing it all together
name = input('Enter file: ') handle = open(name) counts = dict() for line in handle: ____words = line.split() ____for word in words: ________counts[word] = counts.get(word, 0) + 1 bigcount = None bigword = None for word, count in counts.items(): #word, count is a double iteration variable here# ____if bigcount is None or count > bigcount: ________bigword = word ________bigcount = count print(bigword, bigcount)
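The same program can be sketched without a file or a prompt so it runs anywhere. The list of lines below stands in for the file handle and its text is invented for this illustration; the final max() call is an optional cross-check, not part of the original pattern.

```python
# Stand-in for the lines of a file (text is made up).
lines = ['the quick brown fox', 'jumps over the lazy dog', 'the end']

# Step 1: build a histogram of words with get().
counts = dict()
for line in lines:
    words = line.split()
    for word in words:
        counts[word] = counts.get(word, 0) + 1

# Step 2: scan the (word, count) pairs for the largest count.
bigcount = None
bigword = None
for word, count in counts.items():
    if bigcount is None or count > bigcount:
        bigword = word
        bigcount = count

print(bigword, bigcount)

# Cross-check using the built-in max() with counts.get as the key function.
check = max(counts, key=counts.get)
```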
Manipulating Lists getting an average using data structures instead of algorithms
numlist = list() while True: inp = input ('Enter a number: ') if inp == 'done' : break value = float (inp) numlist.append(value) average = sum(numlist) / len(numlist) print ('Average: ' , average)
read() - Reading the Whole File
read() reads the whole file (newlines and all) into a single string. >>> fhand = open('mbox-short.txt') >>> inp = fhand.read() >>> print(len(inp)) 94626 *(this is the length of all the text and spaces in the file) >>> print(inp[:20]) From stephen.marquar *(this is the first 20 characters of the file)
strip(), lstrip(), rstrip() (Methods)
removes white space from the beginning and end of the string - strip(), from the right side of a string - rstrip(), or from the left side of a string - lstrip(). greet = '   Hello Bob  ' greet.lstrip() 'Hello Bob  ' greet.rstrip() '   Hello Bob' greet.strip() 'Hello Bob'
Range function and lists
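range() returns a sequence of numbers that range from zero to one less than the parameter. friends = ['Bob', 'Tom', 'John'] print(len(friends)) print(list(range(len(friends)))) Output 3 [0, 1, 2] (in Python 3, range() returns a range object, so list() is used to display the numbers)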
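A short sketch of how range() and len() combine to loop over a list by index. The indexes and greetings names are added for this illustration.

```python
friends = ['Bob', 'Tom', 'John']

# range(len(friends)) yields the valid index positions 0, 1, 2.
indexes = list(range(len(friends)))
print(indexes)

# Use each index to look up the matching element.
greetings = []
for i in range(len(friends)):
    greetings.append('Happy New Year: ' + friends[i])
    print(i, friends[i])
```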
split()
split () breaks a string into parts and produces a list of strings. We think of these as words. We can access a particular word or loop through all the words. >>>abc = 'With three words' >>> stuff = abc.split() >>> print(stuff) ['With' , 'three' , 'words'] >>> print (len(stuff)) 3 >>>print(stuff[0]) With
objects
variables that have capabilities grafted onto them. A string is an object.
index operator [ ]
we can get at any single character in a string using an index specified in brackets [ ]. The index value must be an integer and starts at 0. The index value can be an expression that is computed. fruit = 'banana' letter = fruit[1] print(letter) this will produce a. Positions: 0=b, 1=a, 2=n, 3=a, 4=n, 5=a
Looping and counting
word = 'banana' count = 0 for letter in word : if letter == 'a' : count = count + 1 print (count) Counts the number of a's in the word banana
File Handle as a Sequence
xfile = open('mbox.txt') for cheese in xfile: print(cheese) xfile is the file handle. A file handle can be treated as a sequence of strings where each line in the file is a string in the sequence. We can use the for statement to iterate through a sequence, so the for statement will iterate through each line of the file.