Topic 4 Computer Science
Definition: Abstraction
Abstraction is the process of removing characteristics from something in order to reduce it to a set of essential characteristics. In computer science, abstraction is a technique for managing the complexity of computer systems. Example: a user does not need to know the internal workings of a mobile phone to operate it.
4.3.13 Intro
Construct algorithms using predefined sub-programmes, one-dimensional arrays and/or collections
4.1.19 Intro
Construct an abstraction from a specified situation
4.2.3 Intro
Discuss an algorithm to solve a specific problem
4.1.17 Intro
Identify examples of abstraction
Real vs abstract
It's all about the details
4.3.5 Intro
Outline the need for a translation process from a higher level language to machine executable code
4.3.1 Intro
State the fundamental operations of a computer
4.2.7 Intro
Suggest suitable algorithms to solve a specific problem
Definition: Variable
• Variables are storage locations for data in a program. • They are a way of naming a memory location for later use (to put a value into it or retrieve a value from it). • Each variable has a name and a data type that is determined at its creation (and cannot be changed).
What affects the run time of an algorithm?
(a) computer used, the hardware platform (b) representation of abstract data types (ADTs) (c) efficiency of compiler (d) competence of programmer (programming skills) (e) complexity of underlying algorithm (f) size of the input
Why do we need high level languages?
- High level language = similar to human language (like English) - Low level language = close to the binary code used to actually process the instruction. • As human needs for computer systems have expanded, it has become necessary to abstract from the basic operations of a computer. • It would take far too long to write the type of systems needed today in machine code.
Advantages to modular design
- Code can be reused - Eases program organization, for both the individual programmer and team members - Makes future maintenance easier - you only have to fix/update a module, not the whole program. Modular programming is an important and beneficial approach to programming problems. Modules make program development easier, and they can also help with future development projects.
Standard collection operations
• .addItem( data ) → adds a data item to the collection • .resetNext() → starts at the beginning • .hasNext() → tells whether there is another item in the list • .getNext() → retrieves the next data item from the collection • .isEmpty() → checks whether the collection is empty
Flow Chart Symbols
1. Arrows (show the flow of the program). 2. Ovals (start and end of the program). 3. Rectangle (processing, such as calculations). 4. Parallelogram (input of data and output from the computer). 5. Diamond (a decision, such as true or false). 6. Rectangle with double side lines (a predefined process or module, such as read, calc, print). 7. Polygon (a loop with a counter). 8. Connectors (a circle connects sections on the same page; a home-plate shape connects different pages).
Fundamental Operations
All CPUs have sets of instructions, also called the fundamental operations, that enable commands to be processed. The four most fundamental operations are: ADD, COMPARE, RETRIEVE (sometimes called LOAD) and STORE (sometimes called SAVE).
4.2.4 Intro
Analyze an algorithm presented as a flow chart
4.2.5 Intro
Analyze an algorithm presented in pseudocode
Binary search
Binary search, also known as half-interval search, is a search algorithm that finds the position of a target value within a sorted array. • It works by comparing the target value to the middle element of the array. • If they are unequal, the lower or upper half of the array is eliminated depending on the result, and the search is repeated in the remaining sub-array until the target is found or the sub-array is empty. • It only applies to SORTED arrays (where there are usually no duplicate values, or duplicates do not matter).
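The halving steps described above can be sketched in Java as follows. This is a minimal illustration, not a prescribed implementation: the class name `BinarySearchSketch`, the method name and the sample data are all assumptions made for the example.

```java
// Minimal sketch of an iterative binary search over a sorted int array.
// Class name, method name and sample data are illustrative assumptions.
class BinarySearchSketch {

    // Returns the index of target in sortedValues, or -1 if it is absent.
    static int binarySearch(int[] sortedValues, int target) {
        int low = 0;
        int high = sortedValues.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // middle of the remaining sub-array
            if (sortedValues[mid] == target) {
                return mid;                   // found the target
            } else if (sortedValues[mid] < target) {
                low = mid + 1;                // eliminate the lower half
            } else {
                high = mid - 1;               // eliminate the upper half
            }
        }
        return -1;                            // sub-array is empty: not found
    }

    public static void main(String[] args) {
        int[] data = {2, 5, 8, 12, 16, 23, 38};
        System.out.println(binarySearch(data, 23)); // prints 5
        System.out.println(binarySearch(data, 7));  // prints -1
    }
}
```

Each pass through the loop halves the part of the array still under consideration, which is where the O(log N) behaviour comes from.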
4.3.11 Intro
Construct algorithms using the access methods of a collection. A collection is like a linked list, but the order of elements is not guaranteed, so you cannot use methods such as .get(x) or .size().
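One way to practise with the access methods in Java is to wrap a LinkedList so that only the collection-style methods are exposed. The wrapper below is a hypothetical sketch: the class `IbCollectionSketch`, the position counter and the helper `countItems` are assumptions for illustration, not an official IB class.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical wrapper that exposes only the IB-style collection methods.
class IbCollectionSketch<T> {
    private final List<T> items = new LinkedList<>();
    private int position = 0; // index of the item getNext() will return

    void addItem(T data) { items.add(data); }
    void resetNext() { position = 0; }
    boolean hasNext() { return position < items.size(); }
    T getNext() { return items.get(position++); }
    boolean isEmpty() { return items.isEmpty(); }

    // Counts the items using only the access methods,
    // because a collection offers no .size() to the caller.
    static <E> int countItems(IbCollectionSketch<E> collection) {
        int count = 0;
        collection.resetNext();
        while (collection.hasNext()) {
            collection.getNext();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        IbCollectionSketch<String> names = new IbCollectionSketch<>();
        names.addItem("Ada");
        names.addItem("Alan");
        names.addItem("Grace");
        System.out.println(countItems(names)); // prints 3
    }
}
```

The resetNext / hasNext / getNext loop is the basic pattern for any algorithm over a collection, such as totals, searches or counts.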
4.2.6 Intro
Construct pseudocode to represent an algorithm
4.2.8 Intro
Deduce the efficiency of an algorithm in the context of its use
4.3.7 Intro
Define the operators: = , ≠, <, <=, >, >=, mod, div
4.3.6 Intro
Define the terms: variable, constant, operator, object
4.3.10 Intro
Describe the characteristics and applications of a collection
4.2.1 Intro
Describe the characteristics of standard algorithms on linear arrays
4.2.9 Intro
Determine the number of times a step in an algorithm will be performed for given input data
4.3.12 Intro
Discuss the need for sub-programmes and collections within programmed solutions
4.1.20 Intro
Distinguish between a real-world entity and its abstraction
4.3.2 Intro
Distinguish between fundamental and compound operations of a computer
4.3.3 Intro
Explain the essential features of a computer language
4.3.4 Intro
Explain the need for higher level languages
4.1.18 Intro
Explain why abstraction is required in the derivation of computational solutions for a specified situation
Essential features of a computer language:
Fixed vocabulary Unambiguous meaning Consistent grammar & syntax
Collections
In 'real life' Java, there are many types of collections. In IB land Java, think of a collection as a linked list with an unknown size/length. A collection is like a linked list, but the order of elements is not guaranteed.
4.2.2 Intro
Outline the standard operations of collections
Types of operators
Simple assignment, unary, arithmetic, relational
Collection methods in Pseudocode are:
• .addItem( new data item ) • .resetNext( ) start at beginning of list • .hasNext( ) checks whether there are still more items in the list • .getNext( ) retrieve the next item in the list • .isEmpty( ) check whether the list is empty
Basic comparisons
• A binary search is faster - O(log N) - but can only be performed on a SORTED list • A sequential search is slower - O(N) - but can be performed whether the list is sorted or not • A bubble sort can "quit early" if no swaps are made in a pass. But it makes lots of swaps. • A selection sort must always perform all of its passes (N - 1 for a list of N items) - it cannot "quit early". But it makes fewer swaps - at most one per pass • Both bubble and selection sort are O(N^2), so they have the same order of complexity
Definition: Operator
• A character/set of characters that represents an action • Types: - Boolean operators (AND, OR, &&, ||) for working out true/false situations - Arithmetic operators (+, -, ++, --, /, %, div, mod) for doing simple mathematical calculations - Assignment operators, which assign a specified value to a variable ( = ) - Relational operators, which compare two values (<, >, >=, <=, ==, !=, .equals() )
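A short Java sketch of several of these operators follows. It assumes that pseudocode div and mod correspond to Java's integer `/` and `%` for positive whole numbers; the class and method names are made up for the example.

```java
// Illustrative sketch of arithmetic, relational and Boolean operators.
// Assumes positive integers, where Java's / and % behave like div and mod.
class OperatorSketch {

    // Integer quotient, e.g. 17 div 5 = 3.
    static int div(int a, int b) { return a / b; }

    // Remainder, e.g. 17 mod 5 = 2.
    static int mod(int a, int b) { return a % b; }

    // Relational (==) combined with arithmetic (%).
    static boolean isEven(int n) { return n % 2 == 0; }

    // Relational (>=, <=) combined with a Boolean operator (&&).
    static boolean inRange(int value, int low, int high) {
        return value >= low && value <= high;
    }

    public static void main(String[] args) {
        System.out.println(div(17, 5));        // prints 3
        System.out.println(mod(17, 5));        // prints 2
        System.out.println(isEven(10));        // prints true
        System.out.println(inRange(5, 1, 10)); // prints true
        System.out.println("IB".equals("IB")); // prints true (compares String contents)
    }
}
```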
Definition: Constant
• A constant is an identifier with an associated value which cannot be altered by the program during normal execution - the value is constant. • This is contrasted with a variable, which is an identifier with a value that can be changed during normal execution - the value is variable.
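In Java, the contrast can be sketched with the `final` keyword. The names `MAX_ATTEMPTS` and `attemptsLeft` are hypothetical, chosen only for this example.

```java
// Illustrative sketch of a constant versus a variable in Java.
class ConstantSketch {

    // Constant: 'final' means the value cannot be reassigned after creation.
    static final int MAX_ATTEMPTS = 3;

    static int attemptsLeft(int attemptsUsed) {
        int remaining = MAX_ATTEMPTS;         // variable: its value may change
        remaining = remaining - attemptsUsed; // reassigning a variable is allowed
        // MAX_ATTEMPTS = 5;                  // would not compile: constants cannot change
        return remaining;
    }

    public static void main(String[] args) {
        System.out.println(attemptsLeft(1)); // prints 2
    }
}
```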
Key difference: complexity
• A fundamental operation could be something like add two numbers, store a number, move a number to another location in RAM etc. • These are operations that do not require the processor to go through a large number of sub operations to reach a result. • A compound operation is an operation that involves a number of stages/other operations. Think of it as a group of operations that combine together to form an operation.
Code reusability
• A program module can be reused in programs. • This is a convenient feature because it reduces redundant code. • Modules can also be reused in future projects. • It is much easier to reuse a module than recreate program logic from scratch.
On a more basic note:
• A single loop that repeats n times takes time proportional to n • A nested loop in which both loops repeat n times takes time proportional to n x n (potentially MUCH longer) • A loop that checks a condition/flag (usually a WHILE loop) only loops while it has to - no unnecessary looping! • So, want a loop to run faster? Try using a flag-based (WHILE) loop that will stop once the item you're searching for is found. The alternative (a FOR loop with no early exit) would check EVERYTHING every time it runs.
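The difference between these loops can be made concrete by counting iterations. The Java sketch below is illustrative only; the class, method names and sample data are assumptions.

```java
// Illustrative sketch that counts how many times each kind of loop body runs.
class LoopCountSketch {

    // A single loop over n items runs its body n times.
    static int singleLoopSteps(int n) {
        int steps = 0;
        for (int i = 0; i < n; i++) {
            steps++;
        }
        return steps;
    }

    // A nested loop (n inside n) runs its inner body n x n times.
    static int nestedLoopSteps(int n) {
        int steps = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                steps++;
            }
        }
        return steps;
    }

    // A flag-based WHILE loop stops as soon as the target is found.
    static int flagSearchSteps(int[] values, int target) {
        int steps = 0;
        int i = 0;
        boolean found = false;
        while (!found && i < values.length) {
            steps++;
            if (values[i] == target) {
                found = true;
            }
            i++;
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] data = {7, 3, 9, 4, 1};
        System.out.println(singleLoopSteps(10));      // prints 10
        System.out.println(nestedLoopSteps(10));      // prints 100
        System.out.println(flagSearchSteps(data, 3)); // prints 2
        System.out.println(flagSearchSteps(data, 6)); // prints 5 (not found)
    }
}
```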
Why use abstraction?
• Abstraction can be viewed both as a process and as an entity. • Abstraction enables a person to concentrate on the essential aspects of the problem at hand, while ignoring details that tend to be distracting. • The real world is complex and presents too many items simultaneously for them all to be dealt with. • To solve problems well, we need to conquer this complexity. • Abstraction is a convenient way to deal with complexity.
Collections?
• As far as the IB is concerned, collections are UNORDERED lists usually of UNKNOWN length or size. • In practice we usually program collections using LinkedLists in Java. • This means that we must remember there are a few things that LinkedLists CAN do that collections CANNOT.
Manageable tasks
• Breaking down a programming project into modules makes it more manageable. • These individual modules are easier to design, implement and test. • Then you can use these modules to construct the overall program.
Bubble sort
• Bubble sort is a simple sorting algorithm that repeatedly steps through the list to be sorted, compares each pair of adjacent items and swaps them if they are in the wrong order. • The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. • The algorithm, which is a comparison sort, is named for the way smaller elements "bubble" to the top of the list. • Although the algorithm is simple, it is too slow and impractical for most problems
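A minimal Java sketch of the passes described above, including the "no swaps means sorted" stopping rule. The class and variable names are illustrative assumptions; shrinking the unsorted region after each pass is a common optional refinement.

```java
import java.util.Arrays;

// Illustrative sketch of bubble sort (ascending order) with an early exit.
class BubbleSortSketch {

    static void bubbleSort(int[] values) {
        boolean swapped = true;
        int unsortedEnd = values.length - 1; // last index still to be checked
        while (swapped) {
            swapped = false;
            for (int i = 0; i < unsortedEnd; i++) {
                if (values[i] > values[i + 1]) {
                    // Adjacent pair in the wrong order: swap them.
                    int temp = values[i];
                    values[i] = values[i + 1];
                    values[i + 1] = temp;
                    swapped = true;
                }
            }
            unsortedEnd--; // the largest remaining item is now in its final place
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 4, 2, 8};
        bubbleSort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 4, 5, 8]
    }
}
```

If a pass makes no swaps, `swapped` stays false and the loop ends, so an already sorted list needs only one pass.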
Key terms
• Compiler - a translator that translates a whole program from a high level language into a lower level language (done in a batch) • Interpreter - a translator that translates a high level language into an intermediate code which is immediately executed (done line by line) • Assembler - a translator that translates assembly language into machine code (mnemonics to binary)
Definition: Complexity
• Complexity of an algorithm is a measure of the amount of time and/or space required by an algorithm for an input of a given size (n). • The time for an algorithm to run, t(n), is characterized by the size of the input. • We usually try to estimate the WORST CASE, and sometimes the BEST CASE, and very rarely the AVERAGE CASE.
Two Types of Languages
• Human languages like English, Arabic, French, Flemish are called natural languages. • Computer languages are either high level languages (like Java, C#, VisualBasic, Python, etc.) or low level (like Assembly or Machine Code).
Definition: Object
• In Object-oriented programming (OOP), an object is an instance of a class. • Objects are an abstraction: they hold both data (states), and ways to manipulate the data (behaviours).
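As a small illustration in Java, the hypothetical class below bundles data (a name and a grade) with behaviour (raising the grade), and each `new StudentSketch(...)` creates one object, that is, one instance of the class. The class and its fields are assumptions made for this example.

```java
// Illustrative sketch: a class whose instances hold state and behaviour.
class StudentSketch {
    // State (data held by each object).
    private final String name;
    private int grade;

    StudentSketch(String name, int grade) {
        this.name = name;
        this.grade = grade;
    }

    // Behaviour (ways to manipulate the data).
    void raiseGrade(int points) { grade = grade + points; }

    String getName() { return name; }
    int getGrade() { return grade; }

    public static void main(String[] args) {
        StudentSketch ada = new StudentSketch("Ada", 5); // an object: an instance of the class
        ada.raiseGrade(2);
        System.out.println(ada.getName() + " " + ada.getGrade()); // prints Ada 7
    }
}
```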
What do we measure?
• In analysing an algorithm, rather than a piece of code, we try to predict the number of times the principal activity of that algorithm is performed. • For example, if we are analysing a sorting algorithm we might count the number of comparisons performed.
Sequential search
• Linear search or sequential search is an algorithm to find an item in a list. • It starts at the first element and compares each element to the one it's looking for until it finds it or reaches the end of the list. • Commonly used with collections (which are unsorted lists of items) and text/csv file reading.
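A minimal Java sketch of a sequential search over an array of Strings follows. The class name, method name and sample data are assumptions; returning -1 for "not found" is a common convention rather than a requirement.

```java
// Illustrative sketch of a sequential (linear) search.
class SequentialSearchSketch {

    // Returns the index of the first match, or -1 if target is not present.
    static int sequentialSearch(String[] items, String target) {
        for (int i = 0; i < items.length; i++) {
            if (items[i].equals(target)) { // compare contents, not references
                return i;                  // stop as soon as the item is found
            }
        }
        return -1;                         // reached the end without a match
    }

    public static void main(String[] args) {
        String[] names = {"Grace", "Ada", "Linus", "Alan"};
        System.out.println(sequentialSearch(names, "Linus")); // prints 2
        System.out.println(sequentialSearch(names, "Tim"));   // prints -1
    }
}
```

The list does not need to be sorted, but in the worst case every element is compared once.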
Distributed development
• Modular programming allows distributed development. • By breaking down the problem into multiple tasks, different developers can work in parallel. • And this will shorten the development time.
Program readability
• Modular programming leads to more readable programs. • Modules can be implemented as user-defined functions. • A program that is divided into well-named functions is straightforward to follow. • But a program with no functions can be very long and hard to follow.
Selection sort
• Selection sort is a sorting algorithm that is inefficient on large lists. • Selection sort is noted for its simplicity, and it has performance advantages over more complicated algorithms in certain situations, particularly where memory is limited. • The algorithm divides the input list into two parts: the sublist of items already sorted, which is built up from left to right at the front (left) of the list, and the sublist of items remaining to be sorted, which occupies the rest of the list. • Initially, the sorted sublist is empty and the unsorted sublist is the entire input list. • The algorithm proceeds by finding the smallest (or largest, depending on sorting order) element in the unsorted sublist, exchanging (swapping) it with the leftmost unsorted element (putting it in sorted order), and moving the sublist boundaries one element to the right.
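The boundary between the sorted and unsorted sublists can be sketched in Java as below. This is an illustrative version for ascending order; the class and variable names are assumptions.

```java
import java.util.Arrays;

// Illustrative sketch of selection sort (ascending order).
class SelectionSortSketch {

    static void selectionSort(int[] values) {
        // Everything to the left of 'boundary' is already sorted.
        for (int boundary = 0; boundary < values.length - 1; boundary++) {
            int smallest = boundary;
            // Find the smallest element in the unsorted sublist.
            for (int i = boundary + 1; i < values.length; i++) {
                if (values[i] < values[smallest]) {
                    smallest = i;
                }
            }
            // Swap it with the leftmost unsorted element (at most one swap per pass).
            if (smallest != boundary) {
                int temp = values[boundary];
                values[boundary] = values[smallest];
                values[smallest] = temp;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {64, 25, 12, 22, 11};
        selectionSort(data);
        System.out.println(Arrays.toString(data)); // prints [11, 12, 22, 25, 64]
    }
}
```

The outer loop always runs to completion, which is why selection sort cannot "quit early" even when the input is already sorted.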
Some applications for lists
• Useful for a group of items when you don't know how many items you'll be needing/using (in contrast to arrays, where the size is set in stone at creation) • Because the collection is only as big as you need it to be, it is an efficient use of RAM (memory) • Can be of any data type (primitive or even your own object)
Best vs Worst vs Average case
• Worst Case - the maximum run time, over all inputs of size n, ignoring effects (a) through (d) above. That is, we only consider the "number of times the principal activity of that algorithm is performed". • Best Case - In this case we look at specific instances of input of size n. For example, we might get the best behaviour from a sorting algorithm if the input to it is already sorted. • Average Case - Arguably, average case is the most useful measure. It might be the case that worst case behaviour is pathological and extremely rare, and that we are more concerned about how the algorithm runs in the general case. Unfortunately this is typically a very difficult thing to measure.