CS103
What Is a URL?
"Uniform Resource Locator"; It is a unique identifier for a page, image, or other file on the web; it is what you enter in the address bar of your web browser. Example: http://www.bpl.org/general/hours/index.php
Malware from downloading: Trojan Horse
A trojan horse is malware "disguised as useful files or applications that entice you into executing them."
Why are hacking and malware so common?
-According to cybercrime reporter Brian Krebs, it all started w/ spam -Spam is unsolicited email, often of a commercial nature, sent indiscriminately to multiple mailing lists, individuals, or newsgroups; junk email.
Another Trojan Horse Ex
Some web pages use JavaScript to download malware when you visit the page; no clicking required -called a drive-by download -reason to avoid pages w/ suspicious URLs
Add your content to the body
-Add the content of the page b/t the 2 body tags -Use HTML tags to identify the elements of the page -All content should be inside a pair of tags like these: paragraph <p></p> top-level heading <h1></h1> second-level heading <h2></h2> (also third-level through sixth-level headings, <h3></h3> to <h6></h6>) unordered list <ul></ul> ordered list <ol></ol> list item within a list <li></li>
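A minimal sketch of a body that uses the tags listed above. The heading and list text are made up for illustration:

```html
<body>
  <h1>Library Hours</h1>
  <p>The library is open every day of the week.</p>

  <h2>Things to bring</h2>
  <ul>
    <li>Library card</li>
    <li>Photo ID</li>
  </ul>

  <h2>How to renew a book</h2>
  <ol>
    <li>Log in to your account</li>
    <li>Select the book</li>
    <li>Click Renew</li>
  </ol>
</body>
```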
Adding images to web pages
-For each image, you will need an <img> tag w/ 4 attributes: -src (source, specifying location of image file) -alt (alternative text) -height (in pixels) -width (also in pixels)
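A sketch of an <img> tag w/ all 4 attributes. The file name, alt text, and dimensions are placeholders:

```html
<!-- height and width are plain numbers of pixels, written w/o units -->
<img src="images/library.jpg" alt="Front entrance of the library" height="300" width="400">
```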
Worms
A worm travels from computer to computer on a network by exploiting vulnerabilities in software. It does not require human action to spread
Botnets
A botnet is a collection of compromised computers, often referred to as zombies, infected w/ malware that allows an attacker to control them -Botnet owners or herders are able to control the machines in their botnet by means of a covert channel such as IRC, issuing commands to perform malicious activities such as distributed denial-of-service attacks, the sending of spam mail, and info theft
What's in an image credit?
For a CC licensed image: -Title of the image -Link to the original image -Name of the creator -Link to the creator's web page -Name or abbreviation of the CC license -A link to the CC license deed
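One possible way to write a CC image credit in HTML. The image title, creator name, and example.com URLs are hypothetical, and CC BY 2.0 is used only as a sample license:

```html
<p>
  <!-- Title of the image, linked to the original image -->
  <a href="https://www.example.com/photos/12345">Sunset Over the Harbor</a>
  <!-- Name of the creator, linked to the creator's web page -->
  by <a href="https://www.example.com/people/jsmith">Jane Smith</a>,
  <!-- Abbreviation of the license, linked to the license deed -->
  <a href="https://creativecommons.org/licenses/by/2.0/">CC BY 2.0</a>
</p>
```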
What's in an image credit?
For a public domain image: -title of the image -link to the original image -name of the creator -link to the creator's web page -statement that "this work is in the public domain"
Spam-based businesses
Most businesses that rely on spam for marketing sell prescription drugs or counterfeit merchandise -Selling prescription drugs w/o a prescription is illegal in the US
Getting around spam
Most internet service providers block email from known senders of spam, making it harder for the senders of spam to do business -Comps that send spam found a way to keep getting their message through: botnets
No lists or paragraphs inside headings
paragraph element not needed inside heading element
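A quick sketch of the rule, using a made-up heading:

```html
<!-- Not valid: a paragraph nested inside a heading -->
<h1><p>Library Hours</p></h1>

<!-- Valid: the heading holds the text directly -->
<h1>Library Hours</h1>
```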
Why write good HTML?
-Good HTML will work in all web browsers (Browsers will fix some mistakes, but not all, and they don't always fix them the same way) -Good HTML is more accessible b/c assistive devices use headings and lists to help people navigate through pages
Good HTML
-Has all required tags -Uses both opening and closing tags -Nests tags correctly -Spells tags correctly -Provides attributes when required -Doesn't contain formatting-put that in a style sheet instead
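A sketch of correct vs. incorrect nesting. The <strong> tag (strong importance) is not covered elsewhere in these notes and is used here only as an example of an inner tag:

```html
<!-- Incorrect nesting: the tags close in the wrong order -->
<p>This is <strong>important</p></strong>

<!-- Correct nesting: the inner tag closes before the outer tag -->
<p>This is <strong>important</strong></p>
```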
To avoid phishing: Check the URL
-Hover over a link in an email or on a web page to see the URL in your web browser -URL should begin with http:// or https:// -The domain name, including the TLD, should exactly match the website you expect to go to
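A sketch of why hovering matters: the link text can show one address while the href points somewhere else. Both domains below are reserved placeholder domains, not real sites:

```html
<!-- The visible text looks like one site, but the link goes to a different domain -->
<a href="http://www.example.net/login">www.example.com</a>
```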
Add Hyperlinks
-Hyperlinks are what makes the world wide web unique
Formatting images for web use
-Make images only as large as you need for web use
Other Intellectual Property Protections
-Patents and trademarks are 2 forms of intellectual property protection, not related to copyright -Patents protect an inventor's right to sell his/her creation -Trademarks are used to identify and distinguish the goods/services of one seller or provider from those of others, and to indicate the source of the goods/services
Lists can only contain list items
Cannot have paragraphs (or anything else) directly inside <ul> or <ol>; everything in a list goes inside <li> tags
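A sketch of the rule, w/ made-up list text:

```html
<!-- Not valid: a paragraph placed directly inside the list -->
<ul>
  <p>Things to bring</p>
  <li>Library card</li>
</ul>

<!-- Valid: the paragraph comes before the list, and the list holds only list items -->
<p>Things to bring</p>
<ul>
  <li>Library card</li>
</ul>
```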
Protect Yourself against viruses
-Run anti-virus software -keep anti-virus software definitions up to date -Do not open email attachments that you aren't expecting, or from ppl you do not know or trust -If you're not sure, call the person/ comp and ask before opening
Web Image Formats: SVG
-Scalable vector graphics -relatively new -used for drawings and animations, not realistic images -Lossless -Scalable-can use at many different sizes and resolutions w/o sacrificing quality -images are stored as objects (circle, line, path) rather than as pixels -Small file size -Edit w/ a text editor or drawing program
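A small sketch of an SVG image written by hand in a text editor. It shows the image stored as objects (a circle, a line, and a path) rather than pixels; the shapes, sizes, and colors are arbitrary choices for illustration:

```html
<svg xmlns="http://www.w3.org/2000/svg" width="120" height="120" viewBox="0 0 120 120">
  <!-- A circle defined by its center point and radius -->
  <circle cx="60" cy="60" r="50" fill="gold" stroke="black" stroke-width="2" />
  <!-- A straight line defined by its two end points -->
  <line x1="10" y1="110" x2="110" y2="110" stroke="black" stroke-width="2" />
  <!-- A curved path drawn inside the circle -->
  <path d="M 40 70 Q 60 90 80 70" fill="none" stroke="black" stroke-width="2" />
</svg>
```

B/c the shapes are described by coordinates, changing the width and height attributes rescales the drawing w/o loss of quality.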
How to spot a phishing email
-Sense of urgency or threat-something bad will happen if you don't respond right away -The email asks you to send a password, or click a link to view, verify, or reset something-but the URL to which the link takes you is not a trusted website -The "from" email address does not match the "from" name -The "from" name or email address does not match a trusted source -Errors in grammar, spelling, or punctuation
Causes of Password theft
-Someone guessed it using info they know about you -A cybercriminal used a computer program to try thousands of passwords until it found the one that worked -Someone guessed the answers to your security questions and used them to reset your password -Another site was hacked and your password was stolen -You entered your password on a computer infected w/ keylogger malware -Your password was intercepted when you entered it while on public wifi -accidentally gave it away through phishing
Difference b/t phishing and spam
-Spam is unwanted email that is attempting to sell you something -Usually annoying but not dangerous -Phishing is unwanted email that attempts to trick you into surrendering private info
PNG
-Stands for "Portable Network Graphics" -Developed as a patent-free alternative to GIF -Better compression than GIF -Can accommodate many colors -Good for drawings, logos, and charts regardless of number of colors -supports transparency -does not support animation -edit w/ an image editor like Photoshop or GIMP
GIF
-Stands for Graphics Interchange Format -Compression algorithm is patented by Unisys -Can accommodate up to 256 colors -Compression is lossless if 256 colors or fewer are used -Good for drawings, logos, and charts w/ 256 colors or fewer -supports transparency -supports animation -Edit w/ an image editor like Photoshop or GIMP
What is the purpose of copyright?
-encourages creativity by allowing creators to profit from their work -ensures authors are paid fairly for their effort -A creative work is an expression of the personality of its creator, and thus should be protected from being used w/o the creator's permission
How long does copyright last?
-life of author + 70 years -For corporations, 95 years after publication
Improving Performance
-Resize image to the size you will need using an image editor -Do not resize by adjusting the height and width attributes in the HTML -Use highest compression possible w/o loss of quality. Experiment and see what works -Use lighter-weight alternatives to images: -CSS effects like borders and gradients -Regular text instead of images of text -Icon fonts for icons
JPEG
-stands for Joint Photographic Experts Group -Can accommodate 16 million colors -Compression is lossy-info is lost w/ each edit -Best for continuous-tone images like photos -Edit w/ an image editor like Photoshop or GIMP
Protect yourself from password theft
-use great passwords -don't give truthful answers to security questions-treat them just like passwords -use different passwords -don't keep passwords in email -use a password manager (lastpass) to store passwords -try 2 factor authentication if available -don't use public wifi for sensitive transactions
Web image formats
-web browsers will only display images that are saved in these formats: JPEG, GIF, PNG, SVG
Page with the document type declaration for html5
<!DOCTYPE html>
Skeleton looks like this
<!DOCTYPE html> <html> <head> </head> <body> </body> </html>
Creating Hyperlinks
<a href="http://www.bu.edu/">Boston University</a> -href is an attribute of the <a> tag-it provides info necessary for the a tag to work. Here, that info is the destination (or target) of the link. -Note that href must be enclosed in straight quotation marks (curly quotes won't always work) -The href URL must begin with http:// (or https://). -The text b/t the <a> and </a> tags is called the link text
Some other useful HTML tags
<br> Creates a line break (forces text after it to a new line) -Do not use instead of a paragraph, or to add extra space below or above a paragraph -Do use in addresses or stanzas of poetry, when multiple lines are part of the same conceptual unit
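A sketch of <br> used the right way, inside an address that is one conceptual unit. The address is a placeholder:

```html
<p>
  Example Public Library<br>
  123 Main Street<br>
  Anytown, MA 01234
</p>
```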
Standalone tags
<br> and <hr> are standalone tags. Others include <meta>, <link>, and <img> -In XHTML, we write these as "self-closing" tags, w/ a slash before the closing right angle bracket: <br /> -But in HTML 5, the closing slash is not necessary -Cannot make own html tags
Other useful HTML tags
<hr> horizontal rule (horizontal line)
Other useful HTML tags, cont.
<sub> subscript </sub> <sup> superscript </sup> <blockquote> creates an indented block of text. Careful-only use for block quotations, not simply for indenting </blockquote>
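A sketch of these tags in use. The sentences and the quotation are placeholders:

```html
<!-- Subscript: the 2 sits below the baseline -->
<p>Water is H<sub>2</sub>O.</p>

<!-- Superscript: the 2 sits above the baseline -->
<p>The area of a square w/ side s is s<sup>2</sup>.</p>

<!-- Block quotation: use only for text quoted from another source -->
<blockquote>
  <p>Quoted passage goes here.</p>
</blockquote>
```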
What is a markup language?
A markup language is a computer language that uses tags to define elements within a document
More about 2 factor authentication
A password is something you know. -2 factor authentication requires a password plus another factor: either something you have (usually a phone or token) or something you are (fingerprint, retinal scan)
Malware from infected files: virus
A virus is: -a specific type of malware, not a term for all malware -a parasitic application that can self-replicate -requires a carrier, that is, a file that a user wants or needs that it can hide inside of -Two great hiding places: email attachments and Microsoft Office documents
Public domain
A work is in the public domain if it is not protected by copyright -This may occur b/c: the work was something that could not be copyrighted-ie a name, the copyright has expired, the work was dedicated to the public domain by its creator (rare), the copyright was not registered or renewed properly, the work was created by the US federal govt. -If a work is in the public domain, you do not need to get permission to use it
How can you tell a work is in the public domain?
A work is in the public domain if: it is produced by the US govt, identified explicitly as "public domain" -The absence of a copyright statement does not mean the work is in the public domain
Denial of service attack
An attempt to shut down a computer, website, or network by flooding it w/ requests -Motivation can be: commercial (shut down competitor), political (activism), criminal
From IP Address to Domain Name
B/c they consist of numbers and periods, IP addresses are hard for humans to read and remember -Domain names provide a friendly equivalent -Domain name servers translate domain names back into IP addresses
Botnets, cont
Botnets are an incredibly valuable tool to the senders of spam -Email from computers in a botnet is unlikely to be traced or blocked -However, anti-virus and anti-malware software can detect and remove the bot software, taking that computer out of the botnet
CC License Rationale
CC licenses -allow the copyright holder to keep copyright. The work does not become part of the public domain -allow others to reuse the content for free, w/o notifying the copyright owner, as long as they obey the license terms -Are free to obtain and use
More About CC
CC licenses mix and match these attributes: -Attribution (BY)-anyone using your work must provide attribution (give credit) to you -NoDerivs (ND)-Abbreviation for no derivative works; others may use your work as is, but not create a derivative work (collage, translation, mashup) -Share Alike (SA)-Derivative works that others create from your work must have the same CC license as yours -Non-commercial (NC)-commercial (business) use not allowed
HTML Document
Consists of content (words and images) and HTML tags indicating its structure
What is copyright?
Copyright is a legal concept that grants authors and artists control over certain uses of their creations for defined periods of time. It limits who may copy, change, perform, or share those creations.
What does copyright prohibit?
Copyright makes it illegal to do the following w/o the permission of the copyright owner: -reproduce, distribute (sell), perform, publicly display, make into a derivative work
What does copyright protect?
Copyright protection subsists, in accordance w/ this title, in original works of authorship fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or w/ the aid of a machine or device
Exception: Fair Use
Fair use is a provision in US copyright law that allows for limited use of copyrighted works for a few defined purposes: -criticism, comment, news reporting, research, scholarship, teaching
Trojan Horse Example
Fake anti-virus software has become a common Trojan horse technique -Affects both Windows and Mac systems -A pop-up window tells you that your computer is infected "Click here to scan or fix" -Tricks you into purchasing something you do not need (steals your money) -May also install other types of malware
Is it fair use?
Determined on a case-by-case basis. These questions are considered: -Is it being used for commercial or non-commercial purposes? -How much of the copyrighted work is used? -What effect does the use have on the value of the work?
How does malware get onto your computers?
Different types of malware enter by different routes. -Worms: Enter through vulnerabilities in software -Viruses: Enter via infected files (such as email attachments) -Trojan horses: enter through files that we willingly download or open b/c we think they are useful
Also Available: Country Code TLDs
Dreamhost also offers some country code TLDs that are commonly used in the US -.io (British Indian Ocean Territory) -.me (Montenegro) -.co (Colombia) -Each country controls how its ccTLD is used. These countries have chosen to allow people outside the country to use theirs.
How do domain names work?
Each computer on the internet has an "IP (Internet Protocol) address" that distinguishes it from all other devices on the internet -Example: the IP address for www.bu.edu is 128.197.26.4 -If you enter http://128.197.26.4 into your browser's address bar, you will get the BU home page
Sample Markup Languages
HTML -Hypertext markup language, used for web pages, identifies paragraphs, headings, lists XML -extensible markup language -used for data exchange -precise set of rules (the schema) determines which elements are required, which elements are children of one another
What does HTML stand for?
HTML=hypertext markup language
What cannot be copyrighted?
Ideas, facts, names, titles, short phrases or expressions, list of ingredients (recipe)
What to do if you suspect phishing
If anything looks suspicious, get the sender's phone number from an official source (its own website or a directory, not the email) and call to find out if it's legitimate -Do not click on the link -Do not respond to the email
Image Location
Image files must be located on the web server to be visible on the web. -Upload the image to your website at the same time that you upload the web page, or the image will not appear correctly
"New" gTLDs
In 2013, ICANN created a new process to dramatically expand the number of gTLDs -More than 1300 new gTLDs available or in process since then ie. .guru, .gallery, .bike -Requirements for acquiring vary by gTLD
Malware (malicious software)
Malware is software designed to interfere with a computer's normal functioning
Reasons to hyperlink
Navigation-Help the user get b/t pages on your own site Help the user-Define terms, Help the user get to info that you don't "own"...or don't want to provide yourself Do the right thing-Acknowledge your sources, Link to creative commons license, promote other sites that share your goals
What makes you vulnerable to worms?
Not automatically updating the operating system on your computer or phone. -Not applying updates for popular software like Microsoft Office, Safari, Chrome
Spear Phishing
Phishing emails have become much more sophisticated -Spear phishing targets a particular person using publicly available info, such as the name of your comp, school, or supervisor, or domain of your website ie John Podesta
What is phishing?
Phishing is the act of sending email to a user falsely claiming to be an established legitimate enterprise in an attempt to scam the user into surrendering private info that will be used for identity theft
Protect yourself from worms
On your own computer or phone -Keep all your software up to date -Whenever possible, turn on automatic updates -get rid of software you do not need -do not use apps, or software that is no longer being updated
Domain Name
Part of the URL
More about password managers
Password managers are secure applications that store your passwords -You need one strong master password. Password manager stores the rest -Password managers improve security by: 1.Allowing you to use longer, more complicated passwords b/c you don't need to remember them 2.Alerting you to fake websites, b/c they will only autofill your passwords on the real site
Screen readers
People who are blind use screen reader software to use the web -Screen readers turn text into speech
Who can claim copyright?
People-creator of work, creator's heirs Corporations-If work was created "for hire" or by an employee -Copyright does not disappear when the creator dies, although it will not last forever
Copyright Infringement does not equal plagiarism
Plagiarism is representing the work of another person as your own, whether by explicitly identifying yourself as the author, or by failing to credit the author (thereby implying that the work is your own) -Copyright infringement is using another person's copyrighted work w/o their permission, even if you give them credit. -Avoid both.
Internet security has two parts:
Protecting your computers (your own and your web host) and data from malware and intrusions -Protecting your credentials from hackers and cybercriminals
Protect yourself from Trojan Horses
Read reviews from trusted sources (ie CNET, Mac World) before you download any software -Trusted source means something that cannot be faked or manipulated -Never download software from a pop-up window or by clicking a link. Always go to the vendor's website for software -Be suspicious of pop-ups, software offers, warning messages, and email attachments -Never visit a site marked as malicious in search engine results
Restricted gTLDs
Restricted TLDs are meant for a particular purpose. When registering one, you need to agree that you meet the criteria. You may be called upon to prove it later. Ex: .biz-businesses .pro-for licensed professionals (ie. accountant, plumber) .name-for individual people, or fictional characters to which the registrant has rights
Getting Permission
To use copyrighted material, you must get permission from the owner. 1. the copyright owner may pre-emptively give you permission through a terms of use or CC license that grants permission as long as you follow those terms 2. Otherwise, you must get permission individually by writing to the copyright owner and negotiating a license. This may involve payment
Phishing using TLDs
Typosquatting
Separating content and presentation
Some of the tags are deprecated b/c they are used only to affect the way the text appears on the page -when html was first written, this was the only way to control the appearance on the page -Today, Cascading style sheets do this work, so we won't include presentation info in HTML
Not available to you and me: Sponsored Generic TLDs
Sponsored gTLDs are administered by orgs that tightly control their use. Only eligible entities are allowed to register them in the first place. .edu: colleges, universities .gov: US federal government entities .mil: US military .museum: Museums .travel: travel and tourism sites .xxx: "Adult sites"
Building the skeleton
Start by adding essential tags, also called elements, to your page. -The <html> tag opens the document; </html> ends it -Inside <html></html>, the document has two parts: -<head> Info about the web page (metadata) goes here -<body> Content of the web page goes here
When was copyright established?
The US Constitution gives Congress the power to enact laws establishing a system of copyright in the US. Congress enacted the first federal copyright law in May 1790, and the first work was registered within 2 weeks
How does copyright work?
Under current US law: -Copyright is in effect as soon as the work is created -Display of copyright notice not required -you do not have to register w/ the US copyright office unless you want a public record of your copyright or you want to sue for copyright infringement
Generic TLDs
Unrestricted Restricted Sponsored New
Fill in the title element
The content of the <title> tag does not appear on the web page itself, but it does show in the browser frame or tab -Should include both the name of the web page and the name of the website -The title is critical to both usability and findability -If a page has no title, the URL or filename will be displayed in the browser frame instead
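A sketch of a title that names both the page and the site. The page and site names are hypothetical:

```html
<head>
  <!-- Page name first, then the website name -->
  <title>Hours - Example Public Library</title>
</head>
```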
Fair Use Caution
The distinction b/t fair use and infringement may be unclear and not easily defined. There is no specific number of words, lines, or notes that may safely be taken w/o permission... the safest course is always to get permission from the copyright owner before using copyrighted material.
Consequences
The makers of spam need to constantly come up w/ new kinds of malware to add computers to their botnets. -Thus, more malware is created every day -Once botnets were created, they could be rented out for more nefarious purposes: -Distributed Denial of Service attacks, looking for software vulnerabilities, harvesting of usernames, passwords, and email contacts
Can websites be copyrighted? Can domain names be copyrighted?
The original authorship appearing on a website may be protected by copyright. This includes writings, artwork, photographs, and other forms of authorship protected by copyright. Domain names cannot be copyrighted
Top-Level Domain Names
The right most section of the domain name (the suffix or extension) is the TLD -ie. ".edu" -There are two kinds of TLDs: Generic TLDs (gTLDs) indicate the type of website Country code TLDs (ccTLDs) are particular to a geographic location
Unrestricted gTLDs
These gTLDs are unrestricted; they can be used by anyone for any purpose. -However they are usually used for: .com=company .org=organization, usually nonprofit .net=network .info=informational site
HTML: Deprecated Tags
These tags were used in previous versions of HTML, but are "deprecated"; they will not be supported in future versions of HTML -Avoid these when writing new HTML <center> <font> <strike> <u>
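A sketch of replacing deprecated presentation tags w/ a style sheet rule. The class name "notice" and the text are made up for illustration:

```html
<!-- Old, deprecated approach: presentation mixed into the HTML -->
<center><font color="red">Closed Monday</font></center>

<!-- Current approach: the style sheet controls the appearance -->
<style>
  .notice { text-align: center; color: red; }
</style>
<p class="notice">Closed Monday</p>
```

The <style> element normally goes inside <head>; the same rule could also live in a separate style sheet file.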
What is accessibility?
Web accessibility is the inclusive practice of removing barriers that prevent access to websites by people w/ disabilities
More sample markup languages
XHTML -HTML that follows the rules of XML MathML -Mathematical Markup Language -used for formulae within web pages
Deconstructing a URL http://www.bpl.org/general/hours/index.php
http://-is the protocol www-is a hostname or subdomain bpl.org-is the domain name /general/hours/-is the file path index.php-is the page
What is hypertext?
Hypertext is text which contains links to other texts (or other media)
What does copyright protect?
literary works, musical works, dramatic works, pantomimes and choreographic works, pictorial and graphic and sculptural works, motion pictures and other audiovisual works, sound recordings and architectural works, computer software