19 - Web and APIs
What is an API? (online slides)
- API stands for Application Programming Interface.
- It is a means to execute code other people wrote in our own programs.
- APIs provide abstractions. You don't have to know how it works, only how to use the API. (Think driving a car versus being a mechanic.)
- Without APIs we would have to write all our code from scratch every time.
- Imagine building a house, but first you must build your own tools, cut your own wood from trees, and make your own nails!
How the web works (online slides)
- Clients make requests to web servers, typically using a browser.
- The client provides a request method and a URL.
- The web server sends a response to the client. The response contains the content for that URL.
- The client renders (draws) the content in the web browser.
TLS - Transport Layer Security (online slides)
- Encrypts traffic over the wire
- Protects against "Man in the Middle" attacks (sniffing data in transmission)
- Organizations acquire a certificate from a Certificate Authority
- Browsers "trust" the Authority and encrypt the traffic
- Clients request https:// instead of http:// to get the TLS-encrypted site
- Moral: Just because a site uses TLS doesn't mean it's "secure"; it only means the traffic between you and the server is encrypted!
Web API Content formats (online slides)
- HTML is not a desired content format for Web APIs because another computer is the recipient of the output (as opposed to a user).
- While HTML **is** machine readable, it mixes data with layout and formatting, making it difficult to find the information we wish to extract from the content.
- In the example, HTML layout is mixed in with the data, making the extraction of data difficult.
HTTP (online slides)
- HTTP, or the Hypertext Transfer Protocol, is the data transfer protocol of the web.
- It consists of requests, which contain a verb and a URL, and responses, which contain a status code and a content type.
- HTTP is a stateless protocol, which means the current request knows nothing of the previous requests.
- The well-known port for HTTP is TCP/80.
The World-Wide Web (online slides)
- Information system on the Internet for displaying content resources.
- The World Wide Web is not the Internet; it is part of it!
- Built upon open standards.
Protecting data (online slides)
- Limit the potential damage.
- Connect to databases with read-only permissions if you are not updating or inserting data.
- Validate form fields: verify the data the user typed before proceeding.
- Run web services with only the minimal level of permissions needed.
- Use logging so that if something does happen, there is a record of it.
- Use change control.
- ASSUME EVERYONE IS A BAD ACTOR!
List at least 3 things required to secure a website. (participation)
- Secure communication with TLS (Transport Layer Security)
- Protect the server by hardening the services on the web server. Only run the services that are required, nothing more.
- Protect the web service itself
- Secure the application running over the web
Web Servers (online slides)
- Serve up static content over HTTP, or execute code and return a response as content. (This is called CGI - Common Gateway Interface)
Popular Web Servers:
• Apache - Open source web server. Most popular.
• IIS - Microsoft's web server
• NGINX ("Engine X") - Open source web server, commonly used for:
  • Load balancing
  • Reverse proxies
Web Services (online slides)
- The most important service in any organization.
- Beyond a company's website, other business processes get "webified": webmail, customer relationship portals, e-commerce.
- To support these same services outside the browser, we "webify" the business logic into an API (Application Programming Interface).
Webmaster vs. Web Administrator (online slides)
- Two major roles in the web
- Generally the same person for small companies
- But NOT the same person for midsized or larger companies.
Webmaster (a very outdated term) --> Person responsible for content, graphics, usability, etc.
- What is classically thought of when creating websites / webpages.
Web Administrator --> Person responsible for administering the web server (machine or VM), creating virtual directories and virtual sites, patching, backups, etc.
- Basic skills required in administering any server
TLS - how it works on the web (online slides)
1. Client request
2. Server response
3. Key exchange
4. Cipher negotiation
5. Client HTTP GET
6. Data transfer
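The handshake steps above are normally carried out by a TLS library rather than by application code. The following is a minimal Python sketch, assuming the standard-library ssl module; the function name describe_tls is made up for illustration, and calling it requires network access to a real HTTPS host.

```python
import socket
import ssl

# The default context loads the system's trusted certificate authorities and
# requires the server to present a certificate matching its host name.
context = ssl.create_default_context()


def describe_tls(host: str, port: int = 443) -> tuple[str, str]:
    """Connect to host, complete the TLS handshake, report what was negotiated."""
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        # wrap_socket performs the handshake (certificate check, key exchange,
        # cipher negotiation) before any HTTP request is sent.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            protocol = tls_sock.version()        # for example "TLSv1.3"
            cipher_name = tls_sock.cipher()[0]   # name of the negotiated cipher
            return protocol, cipher_name
```

Only after wrap_socket returns would the client send its HTTP GET over the encrypted connection.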
HTTP Response Status Codes (online slides)
1xx - Informational
2xx - Success
- 200 - OK
- 201 - Created
- 202 - Accepted
3xx - Redirection
- 301 - Moved Permanently
- 304 - Not Modified
- 307 - Temporary Redirect
4xx - Client Error
- 400 - Bad Request
- 401 - Unauthorized
- 403 - Forbidden
- 404 - Not Found
5xx - Server Error
- 500 - Internal Server Error
- 501 - Not Implemented
- 502 - Bad Gateway
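Because the first digit of a status code determines its class, a program can classify any response with integer division. A small Python sketch, assuming the standard-library http.HTTPStatus enum for the reason phrases; the helper names status_class and describe are illustrative only.

```python
from http import HTTPStatus

# First digit of the status code -> class name used in these notes.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code: int) -> str:
    """Return the class of a status code based on its hundreds digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase, and its class."""
    phrase = HTTPStatus(code).phrase  # raises ValueError for unregistered codes
    return f"{code} {phrase} ({status_class(code)})"


for code in (200, 301, 404, 500):
    print(describe(code))
```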
Which HTTP Status code explains the requestor made the error? (participation)
4xx - Client error (400 level = user request error, e.g. you typed a URL that doesn't exist)
What is an HTTP reverse proxy? Why is it common in cloud environments? (participation)
An HTTP server which retrieves resources from one or more servers on behalf of a client. It is common in cloud environments because it is used to limit exposure of the web application.
HTTP Request Verbs (online slides)
GET - Request a resource. Most common.
POST - Add to a resource. Used when sending data to the website, like submitting a form.
Other Verbs:
PUT - Update a resource
DELETE - Remove a resource
PATCH - Update part of a resource
HEAD - Same as GET, but with no response body
OPTIONS - Ask which request methods and options a resource supports
What is the common HTTP request verb used to request a resource? (quiz)
GET
HTTP Protocol in Action (online slides)
Like SMTP and IMAP, HTTP is a text-based protocol, so you can speak it directly over telnet.
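What a telnet session would type by hand can also be sent over a plain TCP socket. The sketch below is an assumption-laden stand-in for the slide's example: it starts a throwaway local server (HelloHandler, a made-up name) so that it runs without internet access, then sends a hand-written GET request and prints the raw response.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny demo server that answers every GET with a plain-text greeting."""

    def do_GET(self):
        body = b"hello from the server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def raw_http_get(host: str, port: int, path: str = "/") -> str:
    """Send a hand-written HTTP request and return the raw response text."""
    request = (
        f"GET {path} HTTP/1.1\r\n"   # request line: verb, resource, version
        f"Host: {host}\r\n"          # required header in HTTP/1.1
        "Connection: close\r\n"      # ask the server to close when done
        "\r\n"                       # blank line ends the request headers
    )
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:             # empty read: server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")


# Port 0 lets the operating system pick any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
response_text = raw_http_get("127.0.0.1", server.server_port)
server.shutdown()
server.server_close()
print(response_text)
```

The printed response starts with a status line, followed by headers such as Content-Type, a blank line, and then the content.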
Reverse Proxies at work (online slides)
Reverse Proxy Server dispatches requests to the correct server based on configured rules.
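The "configured rules" can be pictured as a table of path prefixes mapped to internal servers. This is a toy Python sketch of that dispatch decision only, not a working proxy; the ROUTES table, the backend addresses, and the function name choose_backend are all hypothetical.

```python
# Hypothetical routing rules: path prefix -> internal backend address.
# Rules are checked in order, so the catch-all "/" rule goes last.
ROUTES = [
    ("/api/", "http://10.0.0.11:8000"),
    ("/images/", "http://10.0.0.12:8080"),
    ("/", "http://10.0.0.10:8080"),
]


def choose_backend(path: str) -> str:
    """Return the backend that should handle a request for this path."""
    for prefix, backend in ROUTES:
        if path.startswith(prefix):
            return backend
    raise LookupError(f"no rule matches {path!r}")


print(choose_backend("/api/orders/7"))   # handled by the API backend
print(choose_backend("/index.html"))     # falls through to the catch-all rule
```

The client only ever sees the proxy's address; the internal addresses in the table stay hidden.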
3 Types Of Web Service Architectures (online slides)
Static, Dynamic CGI / Platform, Dynamic Data-Driven
HTTP Response (online slides)
Status Code - What happened?
Content Type - The actual content
Why Web APIs (online slides)
The Web is transitioning:
• From direct user-based consumption of data
• To indirect user-based consumption of data through devices, and also direct device-to-device consumption.
Examples:
• Do you read news in your browser or on your phone?
• Do you check the weather on weather.com or do you ask Alexa?
HTTP Content Types (online slides)
These are Media Types. They instruct the client (usually a browser) what to do with the content.
- text/plain - plain text
- text/html - HTML text
- image/gif - GIF image format
- image/jpeg - JPEG image format
- application/json - JSON data format
- application/xml - XML data format
- application/javascript - JavaScript
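A client that is not a browser makes the same decision in code: it reads the Content-Type and picks a way to handle the body. A minimal Python sketch; the function name decode_body and the sample bodies are assumptions for illustration.

```python
import json


def decode_body(content_type: str, body: bytes):
    """Choose how to handle a response body based on its media type."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body)       # parse into Python dicts and lists
    if media_type.startswith("text/"):
        return body.decode("utf-8")   # treat as human-readable text
    return body                       # images and other binary data stay as bytes


print(decode_body("application/json; charset=utf-8", b'{"temp": 72}'))
print(decode_body("text/plain", b"hello"))
```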
What is a URL? What Type of namespace is it? (participation)
URL - Uniform Resource Locator. A global (hierarchical) name space which identifies a resource on the web.
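The hierarchy is easy to see when a URL is split into its parts. A short Python sketch using the standard-library urllib.parse module; the URL itself is a made-up example.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL used only to show the parts.
url = "https://www.example.com/weather/syracuse/current?units=metric"

parts = urlsplit(url)
# Each path segment narrows the namespace, like folders in a file system.
path_segments = [segment for segment in parts.path.split("/") if segment]
query = parse_qs(parts.query)

print("scheme:", parts.scheme)
print("host:  ", parts.hostname)
print("path:  ", path_segments)
print("query: ", query)
```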
Web Terminology (online slides)
URL - Uniform Resource Locator. A global name space which identifies a resource on the web.
HTML - Hypertext Markup Language. A markup language for rendering web pages.
Web Server - A computer on the web which hosts resources.
Web Browser - A computer on the web which consumes resources.
Resource - Content at a URL, hosted on a web server and requested by a web browser.
HTTP Request (online slides)
Verb - Nature of the request
URL - The resource to request
TLS Termination Proxy (online slides)
We can configure our reverse proxy for TLS but do not require encryption over our internal network.
What are two machine-readable content formats for web APIs? (participation)
XML, JSON
Which is NOT one of the benefits of using Transport Layer Security with SSL certificates on the web? (quiz)
allows users to trust the service to protect their sensitive data
Why is HTML not a suitable content format for web APIs? (participation)
Because another computer is the recipient of the output (as opposed to a user). HTML also sends a lot of extra data (layout and formatting), which makes the files bigger.
What is the markup language used for rendering web pages? (quiz)
HTML
What does it mean when we say HTTP is a stateless protocol? (participation)
The current request knows nothing of the previous requests.
What is the global name space which identifies resources on the web? (quiz)
URL
Which of the following is a machine readable content format commonly used by web APIs? (quiz)
XML
A Web API (online slides)
• A Web API is an API which is executed over the HTTP or HTTPS protocols.
• This allows us to leverage services in the cloud in our own programs, such as: weather, text to speech, video playback.
• Amazon Alexa is a simple device but seems intelligent because it is simply a voice-activated means to execute Web APIs in the cloud!
HTTP Reverse Proxy (online slides)
• An HTTP Reverse Proxy is an HTTP server which retrieves resources from one or more servers on behalf of a client.
• Used to limit exposure of the web application.
Common methods of attacks (online slides)
• Directory traversal: using ../../ to move up and down a directory structure. Can obtain data that is otherwise unavailable.
• Form field corruption: using a website's forms to enter data or purchase items via hidden data fields. If you know what variables are being used to pass data, you can change the values.
• SQL injection: injecting SQL statements (select * from lastnames) to add, edit, or delete data in a database, or even execute applications on the web server.
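Two of these attacks have well-known defenses that fit in a few lines. The Python sketch below is illustrative only: it uses an in-memory SQLite table named lastnames (borrowing the name from the example above) to contrast pasted-in SQL with a parameterized query, and a hypothetical web root of /var/www/site to show a directory traversal check. All function names are made up.

```python
import sqlite3
from pathlib import Path


def find_unsafe(conn, last_name):
    # VULNERABLE: user input is pasted straight into the SQL text.
    sql = f"SELECT name FROM lastnames WHERE name = '{last_name}'"
    return conn.execute(sql).fetchall()


def find_safe(conn, last_name):
    # Parameterized query: the input is passed as data, never parsed as SQL.
    return conn.execute(
        "SELECT name FROM lastnames WHERE name = ?", (last_name,)
    ).fetchall()


def resolve_under_root(root: Path, requested: str) -> Path:
    """Resolve a requested path and refuse anything outside the web root."""
    root = root.resolve()
    candidate = (root / requested).resolve()  # collapses any ../ segments
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{requested!r} escapes the web root")
    return candidate


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lastnames (name TEXT)")
conn.executemany("INSERT INTO lastnames VALUES (?)", [("Smith",), ("Jones",)])

attack = "x' OR '1'='1"             # classic injection string
leaked = find_unsafe(conn, attack)  # condition is always true: every row returned
blocked = find_safe(conn, attack)   # treated as a literal name: no rows match
print(len(leaked), len(blocked))

web_root = Path("/var/www/site")    # hypothetical document root
try:
    resolve_under_root(web_root, "../../etc/passwd")
except PermissionError as error:
    print("blocked:", error)
```

Both defenses follow the "validate form fields" advice: never trust what the client sends.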
JSON Content Format (online slides)
• JSON - JavaScript Object Notation is a lightweight data interchange format based on how JavaScript data is serialized to text.
• The JSON format is more compact than XML and requires little effort for many programming languages to parse (convert from text back into a workable object).
• In the example, it is trivial for a machine to extract the stock information because the JSON only contains data and its structure.
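The slide's example is not reproduced in these notes, so here is a stand-in Python sketch with a made-up stock quote (the symbol FDGE and its values are invented). One call to json.loads turns the text back into a dictionary.

```python
import json

# Made-up stock quote, standing in for the slide's example.
text = '{"symbol": "FDGE", "price": 12.5, "volume": 1000}'

quote = json.loads(text)   # parse: text -> Python dict
print(quote["symbol"], quote["price"])

# Going the other way (serializing) is just as short.
round_trip = json.dumps(quote)
```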
Web Service Security (online slides)
• Rule #1: ALWAYS assume the worst.
There are many layers of security; use them all:
• Secure communication with TLS (Transport Layer Security)
• Protect the server by hardening the services on the web server. Only run the services that are required, nothing more.
• Protect the web service itself
• Secure the application running over the web
HTTP Dependent Services (online slides)
• TCP/IP networking
• DNS (internal and root DNS servers): resolves names like www.google.com to IP addresses
RESTful APIs (online slides)
• When a web API embraces the HTTP semantics, it is considered a RESTful API.
• REST stands for "Representational State Transfer" and is a design pattern for APIs.
• REST design uses URLs and HTTP verbs to make the intent clear.
• Examples:
- Current weather in Syracuse, NY: GET http://fudgeweather.com/weather/Syracuse,NY/current
- Add item to shopping cart: POST http://fudgeazon.com/cart?productid=1043
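In code, the verb and the URL travel together as one request object. The Python sketch below builds the two example requests with the standard-library urllib.request.Request class and prints them without sending anything, since the host names in the examples are fictional.

```python
from urllib.request import Request

# Build (but do not send) the two example requests from the notes.
current_weather = Request(
    "http://fudgeweather.com/weather/Syracuse,NY/current", method="GET"
)
add_to_cart = Request("http://fudgeazon.com/cart?productid=1043", method="POST")

for request in (current_weather, add_to_cart):
    # The verb states the intent; the URL names the resource it applies to.
    print(request.get_method(), request.full_url)
```

Against a real service, urllib.request.urlopen(request) would send the request and return the response.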
XML Content Format (online slides)
• XML - the Extensible Markup Language is a machine-readable content format similar to HTML.
• XML allows for the design of schemas so that any data format can be represented. Content can then be validated against these schemas to ensure it is valid.
• In the example, it is trivial for a machine to extract the stock information because the XML only contains data and its structure.
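As with JSON, the slide's example is not reproduced here, so this Python sketch uses a made-up stock quote (the element names and the symbol FDGE are invented) and the standard-library xml.etree.ElementTree parser. Schema validation is not shown.

```python
import xml.etree.ElementTree as ET

# Made-up stock quote, standing in for the slide's example.
text = "<quote><symbol>FDGE</symbol><price>12.50</price></quote>"

root = ET.fromstring(text)                 # parse: text -> element tree
symbol = root.findtext("symbol")           # text inside <symbol>
price = float(root.findtext("price"))      # XML values arrive as strings
print(symbol, price)
```

Unlike JSON, every value comes back as text, so the program converts the price to a number itself.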