IB Computer Science Case Study 2018

Vehicle-to-vehicle (VTV) protocol

Using V2V communication, vehicles can send warning messages as soon as an emergency event happens. If nearby vehicles receive these messages with little delay, their drivers can be alerted immediately. A driver then has a good chance of avoiding an accident through a prompt reaction, and benefits most from such warnings when visibility is poor or when the driver is not paying enough attention to the surroundings. Thus, vehicle-to-vehicle communication enables cooperative collision warning among vehicles. Even though V2V communication may be beneficial, wireless communication is typically unreliable. Many factors, for example channel fading, packet collisions, and communication obstacles, can prevent messages from being correctly delivered in time. In addition, ad hoc networks formed by nearby vehicles are quite different from traditional ad hoc networks due to the high mobility of vehicles.

Bounding boxes

A bounding box (usually shortened to bbox) is an area defined by two longitudes and two latitudes, where latitude is a decimal number between -90.0 and 90.0, and longitude is a decimal number between -180.0 and 180.0.

Bounding boxes

A bounding box of an object in an image is the smallest rectangle that completely encompasses the entirety of the object.
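The definition above can be sketched in a few lines of Python. This is a minimal illustration, assuming an object is represented as a set of (x, y) points; the axis-aligned bounding box is then the min/max of each coordinate.

```python
def bounding_box(points):
    """Smallest axis-aligned rectangle enclosing all (x, y) points.

    Returns (min_x, min_y, max_x, max_y).
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

For example, `bounding_box([(1, 2), (4, 0), (3, 5)])` returns `(1, 0, 4, 5)`: the smallest rectangle that completely encompasses all three points.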

Brute-force

A brute-force algorithm is one in which every possible combination of solutions is tested and the optimum solution is identified. This is not feasible within human lifetimes for some computationally complex problems, such as the Traveling Salesman problem with its complexity of O(n!).
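A minimal sketch of the brute-force approach applied to the Traveling Salesman problem: every tour is enumerated with `itertools.permutations`, which is exactly why the running time grows as O(n!).

```python
from itertools import permutations

def brute_force_tsp(dist):
    """Try every tour starting at city 0; O(n!) in the number of cities.

    dist is a square matrix: dist[i][j] is the distance from city i to city j.
    """
    n = len(dist)
    best_tour, best_cost = None, float("inf")
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)            # a complete round trip
        cost = sum(dist[tour[i]][tour[i + 1]] for i in range(n))
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour, best_cost
```

With only a handful of cities this is instant, but each additional city multiplies the number of tours to check, which is why brute force is only a baseline for this problem.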

Backpropagation

A common method of training a neural network in which the initial network output is compared to the desired output. The size of the difference between the desired output and the current output is described by the cost function. The weights of individual neural connections are then modified to gradually reduce the cost function, hopefully improving the overall performance of the network. Backpropagation ("backward propagation of errors") is implemented in artificial neural networks (such as CNNs) and deep learning algorithms. This method takes the neural network and an accompanying error function and calculates the gradient of this error.

Backpropagation

A common method of training a neural net in which the initial system output is compared to the desired output, and the system is adjusted until the difference between the two is minimized.
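The idea of "adjusting until the difference is minimized" can be shown on a toy example. This is a sketch, not real backpropagation through a deep network: a single linear neuron y = w * x with a squared-error cost, where each step moves the weight against the gradient of that cost.

```python
def train_step(w, x, target, lr=0.1):
    """One gradient-descent step for a single linear neuron y = w * x.

    Cost is the squared error (y - target)^2; its gradient with
    respect to w is 2 * (y - target) * x.
    """
    y = w * x
    grad = 2 * (y - target) * x
    return w - lr * grad

w = 0.0
for _ in range(50):
    w = train_step(w, x=1.0, target=3.0)
# w converges toward 3.0, driving the cost toward zero
```

In a real network the same gradient computation is applied, layer by layer, to every weight, which is what the "backward propagation of errors" does.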

Cost function

A cost function is a mathematical formula used to chart how production expenses will change at different output levels. In other words, it estimates the total cost of production given a specific quantity produced.

Cost function

A cost function is a measure of "how good" a neural network did with respect to its given training sample and the expected output. It may also depend on variables such as weights and biases. A cost function is a single value, not a vector, because it rates how well the neural network did as a whole.

Cost Function

A cost function is a relationship depicting the accuracy of a neural network. It represents how close the neural network comes to the correct answer.
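A minimal sketch of one common cost function, the mean squared error, matching the definitions above: it reduces the whole set of predictions to a single number, with lower meaning closer to the correct answers.

```python
def mse_cost(predictions, targets):
    """Mean squared error: a single value rating the network's output as a whole."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
```

For example, `mse_cost([1, 2, 3], [1, 2, 5])` gives 4/3: two predictions are exact and the third is off by 2, so the average squared error is 4/3.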

Filters (Kernels)

A filter (or kernel) looks at specific regions called receptive fields. The filter is an array of numbers (called weights or parameters), and the depth of the filter has to be the same as the depth of the input, making sure the math works out. The first position of the filter is the top-left corner, from which it convolves around the input image.

Filters (kernels)

A filter kernel is a matrix of numbers, where each number corresponds to a single pixel in an image. These matrices can be modified to add effects like the ones found in Photoshop, such as blurring, brightening or darkening. As an example, you could have a black-and-white image and a corresponding kernel matrix, where each number in the matrix is responsible for the brightness of a corresponding pixel.

Greedy algorithm

A greedy algorithm is a mathematical process that looks for simple, easy-to-implement solutions to complex, multi-step problems by deciding which next step will provide the most obvious benefit.

Overfitting

A modeling error which occurs when a function is too closely fit to a limited set of data points. Overfitting the model generally takes the form of making an overly complex model to explain idiosyncrasies in the data under study. In reality, the data being studied often has some degree of error or random noise within it. Thus attempting to make the model conform too closely to slightly inaccurate data can infect the model with substantial errors and reduce its predictive power.

MultiLayer Perceptron

A multilayer perceptron (MLP) is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes. Application: learning occurs in the perceptron by changing connection weights after each piece of data is processed, based on the amount of error in the output compared to the expected result. MLPs are useful in research for their ability to solve problems stochastically (i.e. with an element of randomness), which often allows approximate solutions for extremely complex problems like fitness approximation. Analogy: when you learn to read, you first have to recognize individual letters, then combine letters into words, then words into sentences. When you get good at reading, say, English, it is easy to recognize words directly without even thinking about the letters. In fcat, you can eaisly raed jmubled wrods.

Multi-layer perceptron (MLP)

A multilayer perceptron (MLP) is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. MLP utilizes a supervised learning technique called backpropagation for training. Its multiple layers and non-linear activation distinguish MLP from a linear perceptron. It can distinguish data that is not linearly separable. MLPs are fully connected meaning each node in one layer connects with a certain weight to every node in the following layer. This has made them very computationally intensive and impractical for problems that require large input volumes and many hidden layers.

Multi-layer perceptron (MLP)

A multilayer perceptron (MLP) is a feedforward artificial neural network that generates a set of outputs from a set of inputs. An MLP is characterized by several layers of input nodes connected as a directed graph between the input and output layers. MLP uses backpropagation for training the network. MLP is a deep learning method.

Nearest neighbour algorithm

A nearest neighbour algorithm is an example of a greedy algorithm that estimates the shortest route to a destination. This makes it efficient (it finds a route quickly), but it can be inaccurate because the algorithm only ever minimizes the cost of the next step. The nearest neighbour algorithm suits route-planning problems, such as a bus route, where each point in an area must be visited once per cycle.

Point clouds

A point cloud is a collection of data points defined by a given coordinates system. In a 3D coordinates system, for example, a point cloud may define the shape of some real or created physical system. Point clouds are used to create 3D meshes and other models used in 3D modeling for various fields including medical imaging, architecture, 3D printing, manufacturing, 3D gaming and various virtual reality (VR) applications.

Point clouds

In a 3D Cartesian coordinate system, a point is identified by three coordinates that, taken together, correlate to a precise point in space relative to a point of origin. The x, y and z axes each extend in two directions, and the coordinates identify the distance of the point from the intersection of the axes (0) and the direction of divergence, expressed as + or -.

Vehicle-to-infrastructure (V2I) protocol

A protocol that allows autonomous vehicles to communicate with surrounding infrastructure (such as traffic lights, ramps and buildings), which can suggest or impose behaviour on the vehicle. Types of behaviour include optimal velocity and acceleration (with the goal of optimising emissions and fuel consumption). An example of V2I control is ramp metering, which uses sensors to measure traffic density on a highway and actuators to control traffic lights on the ramps.

Receptive Fields

A receptive field is the "field of view" of a sensor. It is the area in which stimuli will trigger the sensor. For example, if your eyes are sensors, then everything you can see is within your receptive fields. In the above diagram, the door is within the receptive field of the eyes. However, whatever is behind the head is outside of the receptive field. In the vehicle, the receptive field is the area in which the camera can see and sensors can react.

Point clouds

A set of data points in a 3D coordinate system. It can be used to represent 3D objects.

Shift invariance (Spatial invariance)

A shift invariant system is the discrete equivalent of a time-invariant system, defined such that if y(n) is the response of the system to x(n), then y(n-k) is the response of the system to x(n-k).
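The x(n)/y(n) relationship above can be demonstrated with a toy discrete system. This is a sketch, assuming a simple two-tap filter y(n) = x(n) + 0.5·x(n-1) with x(n) = 0 for n < 0: shifting the input by one sample shifts the output by exactly one sample.

```python
def respond(x):
    """A shift-invariant system: y(n) = x(n) + 0.5 * x(n-1), with x(n) = 0 for n < 0."""
    return [x[n] + 0.5 * (x[n - 1] if n > 0 else 0) for n in range(len(x))]

x = [1, 2, 3, 0, 0]
shifted = [0] + x[:-1]          # the input delayed by one sample, x(n - 1)
y = respond(x)
y_shifted = respond(shifted)
# y_shifted equals y delayed by one sample: the shift passes straight through
```

The same property is what "spatial invariance" means for a CNN filter, with image position taking the place of time.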

Dijkstra's algorithm

An algorithm to find the shortest paths from a single source vertex to all other vertices in a weighted, directed graph. All weights must be nonnegative.
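A minimal sketch of Dijkstra's algorithm using a priority queue, under the assumption that the graph is given as an adjacency list mapping each vertex to (neighbour, weight) pairs with nonnegative weights.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source to every reachable vertex.

    graph maps each vertex to a list of (neighbour, weight) pairs;
    all weights must be nonnegative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

For example, with `graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2)]}`, `dijkstra(graph, "A")` finds the path A→B→C of cost 3 rather than the direct edge of cost 4.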

Receptive field

An area of the body surface over which a single sensory receptor, or its afferent nerve fibre, is capable of sensing stimuli. In some body areas, e.g. the face, ears and front paws, the sensitive areas are small; over the back they are larger.

Autonomous

An autonomous car (also known as a driverless car, self-driving car or robotic car) is a vehicle that is capable of sensing its environment and navigating without human input, i.e. it has the ability to perform tasks or make decisions without the aid of a human. Many such vehicles are being developed but, as of May 2017, automated cars permitted on public roads are not yet fully autonomous. More generally, an autonomous entity has the freedom to govern itself or control its own affairs.

Filters (Kernels)

An image kernel is a small matrix used to apply effects like the ones you might find in Photoshop or Gimp, such as blurring, sharpening, outlining or embossing. They're also used in machine learning for 'feature extraction', a technique for determining the most important portions of an image. In this context the process is referred to more generally as "convolution".
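The effects described above can be sketched with a plain-Python 2D convolution. This is an illustrative implementation (valid mode, no padding, grayscale image as a list of rows); the box-blur kernel shown is one standard example.

```python
def convolve2d(image, kernel):
    """Slide a small kernel over a 2D grayscale image (valid mode, no padding)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # weighted sum of the pixels under the kernel at this position
            acc = sum(image[i + di][j + dj] * kernel[di][dj]
                      for di in range(kh) for dj in range(kw))
            row.append(acc)
        out.append(row)
    return out

# a 3x3 box-blur kernel: each output pixel is the average of a 3x3 neighbourhood
blur = [[1 / 9] * 3 for _ in range(3)]
```

In machine learning the kernel weights are not hand-chosen as with a blur; they are learned, and each learned kernel produces one feature map.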

Big O Notation

Big O notation describes the efficiency of an algorithm through its running time or the amount of space used, proportional to the size of the input. For example, O(n) is linear time: the running time or memory used is proportional to the input size. As another example, O(log n) means that for an input of size n, the time or memory required to run the algorithm is proportional to log n. Big O notation describes efficiency asymptotically: constants are ignored and only proportionality matters, so an algorithm taking 2n + 3 steps is simply O(n). Big O notation can be used to describe the efficiency of any algorithm that depends on an input, such as searching or sorting algorithms.

BigO notation

Big O notation is the language we use for articulating how long an algorithm takes to run. It's how we compare the efficiency of different approaches to a problem.

BigO notation

Big O notation is used in computer science to describe the performance or complexity of an algorithm. Big O specifically describes the worst-case scenario, and can be used to describe the execution time required or the space used (e.g. in memory or on disk) by an algorithm. We have used it to describe the execution time of algorithms as proportional to the number of items, n; for example, the effective complexity of Dijkstra's algorithm and the nearest neighbour algorithm are both O(n^2).

Brute-force

Brute force (also known as brute force cracking) is a trial and error method used by application programs to decode encrypted data such as passwords or Data Encryption Standard (DES) keys, through exhaustive effort (using brute force) rather than employing intellectual strategies. Just as a criminal might break into, or "crack" a safe by trying many possible combinations, a brute force cracking application proceeds through all possible combinations of legal characters in sequence. Brute force is considered to be an infallible, although time-consuming, approach.

Brute-force

Brute-force is a trial and error method used to decode encrypted data such as passwords, where every possibility of the set is trialled until the correct one is found.

Convolutional Neural Network

Convolutional Neural Networks (CNNs) are a form of artificial, machine-learning network used to analyse visual imagery. The process is complicated and was originally modelled after biological brain functions. These neural networks require little pre-processing, making them more adaptable to new images not previously tested. For example, given a portion of an image, the neural network returns the probability that it represents a particular class of object (e.g. bird, sunset, dog, cat, etc.). These neural networks are used in image, video and language recognition and processing.

Deep learning

Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks.

Deep learning

Deep learning (also known as deep structured learning or hierarchical learning) is the application of artificial neural networks (ANNs) to learning tasks that contain more than one hidden layer. Deep learning is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, partially supervised or unsupervised.

End-to-end learning

In end-to-end learning, the CNN learns appropriate steering movements by analysing the actions performed by human drivers over a variety of different routes. The only input received by the CNN is the feed from the front-facing cameras. In its training mode, the CNN compares the action it would have taken with the action taken by the driver and, through backpropagation, repeatedly adjusts its parameters until the cost function reaches an acceptable value. An extension to these approaches would be to include more data as inputs to the CNNs, such as information from the sensor fusion systems and the vehicle-to-vehicle and vehicle-to-infrastructure communication systems. This would require the data to be collected from many hours of human-driven cars fitted with these systems, to provide enough training data.

End-to-end learning

End-to-end steering describes the driving-related AI task of producing a steering wheel value given an image.

Feature maps (Activation maps)

Feature mapping is an interactive classification process that can be applied to any aerial or satellite multiband imagery, from high-quality hyperspectral to poor-quality air video.

Feature maps (Activation maps)

Feature maps are produced by each convolutional layer in a CNN. After convolving a filter over all locations or regions of the input (in this situation, a 32 by 32 by 3 image), a 28 by 28 by 1 array of numbers is created, which is called an activation map or a feature map (this is the final result).

Feature Maps (activation maps)

Feature maps are the result of applying a convolution to an image: the units within a hidden layer look for the same (or similar) features at different positions of the image, and their outputs together produce a feature map.

Filter stride

The filter stride is the number of pixels by which we slide our filter, horizontally or vertically.

Deep Learning

Given an input, the program provides outputs by processing information using hidden layers. Deep learning works by pulling abstractions out of the input using hidden layers. For example, if you input an image of a cat into a deep learning algorithm, the program will analyse components of the picture, such as shape and colour (e.g. the colour of the eyes and fur). Once each analysis is done, the hidden layers report a probability that the feature is a feature of a cat; using these probabilities, the output layer determines whether the picture really is a cat. Searching by image on Google is an example of deep learning in use.

Greedy algorithm

Greedy algorithms attempt to maximise benefit/minimise cost at every step of the algorithm. This often leads to a locally optimal solution, but not necessarily a globally optimal one. Counting coins example: the problem is to make up a desired value using the fewest possible coins, and the greedy approach forces the algorithm to pick the largest possible coin at each step. If we are provided coins of ₹1, 2, 5 and 10 and are asked to count ₹18, the greedy procedure is: 1) select one ₹10 coin, leaving a remaining count of 8; 2) select one ₹5 coin, leaving 3; 3) select one ₹2 coin, leaving 1; 4) finally, select one ₹1 coin, which solves the problem.
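The coin-counting steps above can be sketched directly in Python: at each step the algorithm commits to the largest coin that still fits and never revisits that choice.

```python
def greedy_coin_count(amount, coins=(10, 5, 2, 1)):
    """Greedy coin counting: always pick the largest coin that fits."""
    chosen = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            chosen.append(coin)
            amount -= coin
    return chosen
```

For the worked example, `greedy_coin_count(18)` returns `[10, 5, 2, 1]`. Note that for some coin systems the greedy choice is not globally optimal, which is exactly the local-versus-global trade-off described above.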

Convolutional neural networks (CNNs)

In machine learning, a convolutional neural network (CNN, or ConvNet) is a class of deep, feed-forward artificial neural network that has successfully been applied to analysing visual imagery.

Nearest neighbour algorithm

In pattern recognition, the k-Nearest Neighbors algorithm (or k-NN for short) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space.

Pooling / Max Pooling

It is common to periodically insert a Pooling layer in-between successive Conv layers in a ConvNet architecture. Its function is to progressively reduce the spatial size of the representation to reduce the amount of parameters and computation in the network, and hence to also control overfitting. The Pooling Layer operates independently on every depth slice of the input and resizes it spatially, using the MAX operation. The most common form is a pooling layer with filters of size 2x2 applied with a stride of 2 downsamples every depth slice in the input by 2 along both width and height, discarding 75% of the activations. Every MAX operation would in this case be taking a max over 4 numbers (little 2x2 region in some depth slice). In addition to max pooling, the pooling units can also perform other functions, such as average pooling or even L2-norm pooling. Average pooling was often used historically but has recently fallen out of favor compared to the max pooling operation, which has been shown to work better in practice.
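The common 2x2-filter, stride-2 case described above can be sketched in plain Python, assuming the input is one depth slice given as a list of rows. Each output value is the MAX over a 2x2 region, so width and height are halved and 75% of the activations are discarded.

```python
def max_pool(fmap, size=2, stride=2):
    """Max pooling over one depth slice: keep the strongest activation per window."""
    out = []
    for i in range(0, len(fmap) - size + 1, stride):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, stride):
            row.append(max(fmap[i + di][j + dj]
                           for di in range(size) for dj in range(size)))
        out.append(row)
    return out
```

For example, pooling a 4x4 slice produces a 2x2 output, each entry the maximum of one 2x2 region.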

Machine learning

Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed. It refers to the study and construction of algorithms that can learn from and make predictions on data. Machine learning is employed in a range of computing tasks where designing and programming explicit algorithms with good performance is difficult or infeasible; example applications include email filtering, detection of network intruders or malicious insiders working towards a data breach, optical character recognition (OCR), learning to rank, and computer vision.

Machine learning

Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves.

Max-pooling / Pooling

Max pooling is a sample-based discretization process. The objective is to down-sample an input representation (image, hidden-layer output matrix, etc.), reducing its dimensionality and allowing for assumptions to be made about features contained in the sub-regions binned. This is done to in part to help reduce the likelihood of over-fitting by providing an abstracted form of the representation. As well, it reduces the computational cost by reducing the number of parameters to learn and provides basic translation invariance to the internal representation.

Classification problem

A classification problem outputs discrete classes. Object detection is an example: it outputs bounding boxes and the classes of the objects they contain.

Regression problem

A regression problem outputs continuous values, such as direction and speed. End-to-end steering is an example.

Overfitting

Overfitting is a problem in machine learning and particularly in artificial neural networks (ANNs) in which the system can perform very well on its training data set but not on new data. In effect the system is fooled by the "noise" in data. By building a very complex model, it's quite easy to perfectly fit our dataset. But when we evaluate such a complex model on new data, it performs very poorly. In other words, the model does not generalize well. In general, the machine learning engineer is always working with a direct trade-off between overfitting and model complexity. If the model isn't complex enough, it may not be powerful enough to capture all of the useful information necessary to solve a problem. However, if our model is very complex (especially if we have a limited amount of data at our disposal), we run the risk of overfitting. Deep learning takes the approach of solving very complex problems with complex models and taking additional countermeasures to prevent overfitting.

Overfitting

Overfitting is what occurs when a model is excessively complex, such as having too many parameters relative to the number of observations.

Society of Automotive Engineers

SAE International, initially established as the Society of Automotive Engineers, is a U.S.-based, globally active professional association and standards developing organization for engineering professionals in various industries.

Society of Automotive Engineers

SAE International places principal emphasis on transport industries such as automotive, aerospace, and commercial vehicles. SAE is commonly used in North America to indicate United States customary units (USCS or USC) measuring systems in automotive and construction tools. SAE is used as a tool marking to indicate that the tools are not metric (SI) based, as the two systems are incompatible.

Sensor Fusion

Sensor fusion is a process by which data from several different sensors are "fused" to compute something more than could be determined by any one sensor alone. An example is computing the orientation of a device in three-dimensional space.

Sensor Fusion

Sensor fusion is the aggregation of data from multiple sensors to gain a more accurate picture of the sensors' subject or environment than can be determined by any one sensor alone.

Sensor fusion

Sensor fusion: the process of merging multiple data sources into one, providing more consistent and accurate data. For example, in a self-driving car, data from multiple cameras or sensors can be merged to give more consistent and accurate data.

Shift Invariance (Spatial Invariance)

Shift invariance (or spatial invariance) means that a shift in the input signal results in an identical shift in the output signal. Mathematically, a system is shift-invariant if, whenever x(n) produces y(n), then x(n + s) produces y(n + s), for any signal and any constant s. Shift invariance is important because it means that the system's behaviour does not change over time (or, for images, across position).

Filter Stride

Stride controls how the filter convolves around the input volume. Stride is normally set in a way so that the output volume is an integer and not a fraction. Stride is how much a filter is shifted on an image with each step. The filter slides over the image, stopping at each stride length, and performs the necessary operations at that step.
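The requirement that "the output volume is an integer and not a fraction" follows from the standard output-size formula (W - F) / S + 1 for input width W, filter size F and stride S (no padding). A minimal sketch:

```python
def conv_output_size(input_size, filter_size, stride):
    """Spatial output size of a convolution: (W - F) / S + 1, with no padding.

    The stride must divide (W - F) evenly, otherwise the filter would
    step past the edge and the output size would be fractional.
    """
    span = input_size - filter_size
    if span % stride != 0:
        raise ValueError("stride does not produce an integer output size")
    return span // stride + 1
```

For example, a 5x5 filter over a 32x32 input with stride 1 gives a 28x28 output, and a 3x3 filter over a 7x7 input with stride 2 gives 3x3.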

Filter stride

The filter stride is a measure of the "distance", in terms of inputs, between the receptive fields of one filter application and the next. Two filter applications looking at an input layer of pixel values whose receptive fields differ by one pixel would be described as having a filter stride of 1.

Greedy Algorithm

The greedy algorithm is an algorithm that makes the best (or the most optimal) choice at each local stage, with the hopes of reaching a global optimal solution. The greedy algorithm selects the highest value first, and then selects other items that are secondary in value until it reaches the goal. A problem needs the greedy property in order to run the algorithm on it. The greedy property means that you can still reach a solution if you make the best possible choice at the first step. You never have to revisit earlier choices.

Nearest Neighbour Algorithm

The nearest neighbour algorithm, an example of a greedy algorithm, selects the smallest path length at every stage until it eventually reaches the destination, without revisiting paths already travelled. It is not as accurate as Dijkstra's algorithm, as it can miss shorter paths that are easily identified by humans.
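A minimal sketch of the nearest neighbour heuristic on a distance matrix (an assumed representation: `dist[i][j]` is the distance from city i to city j). At each stage it greedily moves to the closest city not yet visited.

```python
def nearest_neighbour_tour(dist, start=0):
    """Greedy tour: from each city, go to the closest unvisited city."""
    n = len(dist)
    tour, visited = [start], {start}
    while len(tour) < n:
        here = tour[-1]
        nxt = min((c for c in range(n) if c not in visited),
                  key=lambda c: dist[here][c])
        tour.append(nxt)
        visited.add(nxt)
    return tour
```

Because it commits to the locally closest city and never backtracks, the resulting tour can be longer than the true optimum that an exhaustive (brute-force) search would find.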

Bounding Boxes

The smallest enclosing rectangle (or cuboid) of an object or a set of objects. In autonomous vehicles, bounding boxes may be used to encompass vehicles and avoid collisions; for example, a sphere can be bounded by a cuboid.

Vehicle-to-Vehicle (VTV) protocols

The Vehicle-to-Vehicle (VTV) protocol is a protocol that allows automobiles to communicate with each other, giving each vehicle 360-degree awareness of its own position and the positions of other vehicles.

Vehicle-to-infrastructure (VTI) protocol

Vehicle-to-infrastructure (V2I or v2i) is a communication model that allows vehicles to share information with the components that support a country's highway system. Such components include overhead RFID readers and cameras, traffic lights, lane markers, streetlights, signage and parking meters. V2I communication is typically wireless and bi-directional: data from infrastructure components can be delivered to the vehicle over an ad hoc network and vice versa. Similar to vehicle-to-vehicle (V2V) communication, V2I uses dedicated short range communication (DSRC) frequencies to transfer data.

Vehicle-to-vehicle (VTV) protocol

Vehicle-to-vehicle (V2V) is an automobile technology designed to allow automobiles to "talk" to each other. V2V communications form a wireless ad hoc network on the roads. Such networks are also referred to as vehicular ad hoc networks (VANETs).

Dijkstra's algorithm

What is it? (Shortest-path algorithm) An algorithm run on a weighted graph. It starts with an initial node and a goal node, and finds the least-cost path to the goal node.

Receptive field

What?: A filter with a pre-defined size (parameters), e.g. 5x5. If the receptive field is 5x5, then each neuron in the Conv layer will have weights to a [5x5xV] region, where V is the depth of the input volume, for a total of 5*5*V = 25V weights. The depth stays V because the connectivity along the depth axis must match the depth of the input volume. Why?: When dealing with high-dimensional inputs such as images, it is impractical to connect each neuron to all neurons in the previous volume. Instead, each neuron is connected to only a local region of the input volume.

Shift invariance (Spatial invariance)

What?: When the input shifts, the output also shifts; e.g. if a filter learns a useful feature for detecting a tree during training, it will capture a tree regardless of its location in the image. Why?: So the CNN does not become dependent on a feature being found in a specific area of the image/input.

