PY4E Chapter 12: Reading web data from Python


What is the ASCII character that is associated with the decimal value 42?

*

ASCII Code (Decimal)

  0 nul   16 dle   32 sp    48 0     64 @     80 P     96 `    112 p
  1 soh   17 dc1   33 !     49 1     65 A     81 Q     97 a    113 q
  2 stx   18 dc2   34 "     50 2     66 B     82 R     98 b    114 r
  3 etx   19 dc3   35 #     51 3     67 C     83 S     99 c    115 s
  4 eot   20 dc4   36 $     52 4     68 D     84 T    100 d    116 t
  5 enq   21 nak   37 %     53 5     69 E     85 U    101 e    117 u
  6 ack   22 syn   38 &     54 6     70 F     86 V    102 f    118 v
  7 bel   23 etb   39 '     55 7     71 G     87 W    103 g    119 w
  8 bs    24 can   40 (     56 8     72 H     88 X    104 h    120 x
  9 ht    25 em    41 )     57 9     73 I     89 Y    105 i    121 y
 10 nl    26 sub   42 *     58 :     74 J     90 Z    106 j    122 z
 11 vt    27 esc   43 +     59 ;     75 K     91 [    107 k    123 {
 12 np    28 fs    44 ,     60 <     76 L     92 \    108 l    124 |
 13 cr    29 gs    45 -     61 =     77 M     93 ]    109 m    125 }
 14 so    30 rs    46 .     62 >     78 N     94 ^    110 n    126 ~
 15 si    31 us    47 /     63 ?     79 O     95 _    111 o    127 del

What is the decimal (Base-10) numeric value for the upper case letter "G" in the ASCII character set?

71
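Both lookups can be checked without the table using Python's built-in `chr()` and `ord()`. A minimal sketch (the variable names are just for illustration):

```python
# chr() maps a decimal code to its character; ord() goes the other way.
star = chr(42)
g_code = ord("G")
print(star, g_code)
```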

What ends up in the "x" variable in the following code:

    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, 'html.parser')
    x = soup('a')

A list of all the anchor tags (<a ...>) in the HTML from the URL
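BeautifulSoup is a third-party package, so the sketch below shows the same idea with the standard library's `html.parser` instead. It is a stand-in, not BeautifulSoup itself: the `AnchorCollector` class, the sample HTML, and the second URL are assumptions made up for illustration. With BeautifulSoup, the equivalent would be looping over `soup('a')` and calling `tag.get('href')` on each tag.

```python
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Hypothetical helper: records the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))

# Made-up HTML so the example runs without network access.
sample_html = (
    '<p>Please click <a href="http://www.dr-chuck.com">here</a> '
    'or <a href="http://www.py4e.com">there</a></p>'
)
collector = AnchorCollector()
collector.feed(sample_html)
print(collector.hrefs)
```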

Which HTTP header tells the browser the kind of document that is being returned?

Content-Type:
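To see what the header carries, here is a small sketch that parses a header block with the standard library's `email` package. The header values are hypothetical, not copied from a real response.

```python
from email import message_from_string

# Hypothetical header block, as it might appear at the start of an HTTP response.
raw_headers = (
    "Content-Type: text/html; charset=utf-8\n"
    "Content-Length: 167\n"
    "\n"
)
headers = message_from_string(raw_headers)
content_type = headers.get_content_type()   # media type only
charset = headers.get_content_charset()     # charset parameter, if present
print(content_type, charset)
```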

What is the purpose of the BeautifulSoup Python library?

It repairs and parses HTML to make it easier for a program to understand

What should you check before scraping a web site?

That the web site allows scraping (not that the web site returns HTML for all pages)
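One common signal, though not the only one (a site's terms of service may also apply), is its robots.txt file. The sketch below uses `urllib.robotparser` on made-up rules; the `example.com` URLs and the rules themselves are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real one lives at <site>/robots.txt.
robots_lines = [
    "User-agent: *",
    "Disallow: /private/",
]
parser = RobotFileParser()
parser.parse(robots_lines)

allowed_public = parser.can_fetch("*", "http://example.com/index.html")
allowed_private = parser.can_fetch("*", "http://example.com/private/data.html")
print(allowed_public, allowed_private)
```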

What is the most common Unicode encoding when moving data between systems?

UTF-8. It is the most commonly used encoding on web pages; UTF-8, UTF-16, and UTF-32 are all standard Unicode encodings.
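A quick way to compare the three encodings is to encode the same text with each and count the bytes. This is only a sketch; the little-endian variants are chosen here so that no byte-order mark is added to the output.

```python
text = "line"
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")   # little-endian, no byte-order mark
utf32 = text.encode("utf-32-le")
print(len(utf8), len(utf16), len(utf32))
```

For plain ASCII text, UTF-8 uses one byte per character, which is part of why it is compact for most web content.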

When reading data across the network (i.e. from a URL) in Python 3, what string method must be used to convert it to the internal format used by strings?

decode()
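As a minimal sketch, the bytes below stand in for data that arrived from the network. They are a made-up example, not taken from a real response.

```python
# UTF-8 bytes for "café"; the accented letter takes two bytes.
raw = b"caf\xc3\xa9"
text = raw.decode("utf-8")   # decode() with no argument also defaults to UTF-8
print(type(raw).__name__, type(text).__name__, text)
```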

Which of the following Python data structures is most similar to the value returned in this line of Python: x = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')

file handle
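The file-handle comparison can be tried without a network connection by pointing `urlopen()` at a local `file://` URL. In this sketch the temporary file and its sample text are assumptions for illustration. The loop is the same one you would write for a handle returned by `open()`, except that each line arrives as bytes.

```python
import tempfile
import urllib.request
from pathlib import Path

# Write a small local file so the example runs without network access.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "romeo.txt"
    path.write_bytes(b"But soft what light\nthrough yonder window breaks\n")

    lines = []
    with urllib.request.urlopen(path.as_uri()) as fhand:
        for line in fhand:                       # iterate like a file handle
            lines.append(line.decode().strip())  # bytes -> str

print(lines)
```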

What word does the following sequence of numbers represent in ASCII: 108, 105, 110, 101

line
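The sequence can be decoded in Python either one code at a time or as a whole byte string. A short sketch:

```python
codes = [108, 105, 110, 101]
word = "".join(chr(n) for n in codes)      # one character per code
same_word = bytes(codes).decode("ascii")   # treat the codes as ASCII bytes
print(word, same_word)
```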

In this Python code, which line is most like the open() call to read a file:

    import socket
    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    mysock.connect(('data.pr4e.org', 80))
    cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\n\n'.encode()
    mysock.send(cmd)
    while True:
        data = mysock.recv(512)
        if (len(data) < 1):
            break
        print(data.decode())
    mysock.close()

mysock.connect()

In this Python code, which line actually reads the data:

    import socket
    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    mysock.connect(('data.pr4e.org', 80))
    cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\n\n'.encode()
    mysock.send(cmd)
    while True:
        data = mysock.recv(512)
        if (len(data) < 1):
            break
        print(data.decode())
    mysock.close()

mysock.recv()
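The `recv()` loop can be exercised offline with `socket.socketpair()`, which returns two already-connected sockets. This is a sketch under that assumption: the `read_all` helper and the fake response text are hypothetical, and no real web server is involved.

```python
import socket

def read_all(sock, chunk_size=512):
    """Hypothetical helper: call recv() until the peer closes the connection."""
    chunks = []
    while True:
        data = sock.recv(chunk_size)
        if len(data) < 1:          # empty bytes means the other side closed
            break
        chunks.append(data)
    return b"".join(chunks).decode()

# One end plays the server, the other plays the client.
server, client = socket.socketpair()
fake_response = "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nBut soft what light"
server.sendall(fake_response.encode())
server.close()

text = read_all(client)
client.close()
print(text)
```

Collecting the chunks and decoding once at the end avoids decoding a multi-byte character that happens to be split across two `recv()` calls.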

Which of the following regular expressions would extract the URL from this line of HTML: <p>Please click <a href="http://www.dr-chuck.com">here</a></p>

href="(.+)" extracts just the URL; http://.* does not, because it also matches the trailing ">here</a></p>
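The two candidate patterns can be compared directly with `re.findall()` on the line from the question. A small sketch:

```python
import re

line = '<p>Please click <a href="http://www.dr-chuck.com">here</a></p>'

# The parentheses capture only what sits between the quotes.
with_group = re.findall('href="(.+)"', line)

# No capture group and a greedy .* that runs to the end of the line.
greedy = re.findall('http://.*', line)

print(with_group)
print(greedy)
```

The first pattern works on this line because the closing quote after the URL is the last double quote on the line. On a line with more attributes, a non-greedy `(.+?)` would be the safer choice.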

How are strings stored internally in Python 3?

Unicode

