AIML Final Test (Wks 7-12)
Explain the main components of a convolutional neural network (CNN).
(see pgs. 130-132 for images) Sharpening Filter: 3x3 filter that 'sharpens' an image by scanning across each pixel one by one and calculating a sharpened value based on it and the 8 surrounding pixels Pooling Layers: Reduces the dimensionality (width and height) of the feature map - Why do we need this: Reduces computational cost, Reduces sensitivity to small changes in pixels. Gets us closer to the size of our output layer - Different types: max pooling, average pooling, ... Fully Connected Layers: Like a standard feedforward neural network - Connects all neurons from the previous layer to every node in the fully connected layer - Allows "flattening" the network to produce a final decision/classification (output layer)
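A minimal sketch of these components, assuming TensorFlow/Keras is available (the layer sizes and input shape are illustrative, not from the notes):

```python
# Hypothetical illustration: convolution -> pooling -> fully connected layers.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    # Convolutional layer: 16 filters of size 3x3 scan across the image
    layers.Conv2D(16, (3, 3), activation="relu", input_shape=(64, 64, 1)),
    # Pooling layer: halves width/height, reducing computation and sensitivity
    layers.MaxPooling2D((2, 2)),
    # Flatten the feature maps so they can feed a standard feedforward layer
    layers.Flatten(),
    # Fully connected layer combining features into a final classification
    layers.Dense(10, activation="softmax"),
])
model.summary()
```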
How are images represented?
- An image is a matrix (2D grid) of pixel values - It has rows (# = height) and columns (# = width) - Most basic image is 0 (black) or 1 (white) - Greyscale pixel values range between 0 (fully black) and 255 (fully white) - Colour images need three values/'channels': RGB, where each channel is still 0-255
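A small NumPy illustration of these representations (values are random, just to show the shapes):

```python
# Illustrative only: representing images as arrays with NumPy.
import numpy as np

binary = np.array([[0, 1], [1, 0]])                             # 2x2 black/white image
grey = np.random.randint(0, 256, (4, 4), dtype=np.uint8)        # greyscale, 0-255
colour = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)   # RGB: 3 channels

print(grey.shape)    # (height, width) -> (4, 4)
print(colour.shape)  # (height, width, channels) -> (4, 4, 3)
```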
What are the ethical issues of computer vision?
- Biased datasets lead to biased predictions - Privacy concerns: surveillance & facial recognition, data collection w/o prior consent - Deepfakes and misinformation; creating images using CV - Accountability or responsibility for use of CV - Impact on jobs
Explain how you can use CLIP to generate images from text.
- CLIP (Contrastive Language-Image Pre-training) is a special OpenAI model trained to link text and images through a shared encoding - Trained on 100s of millions of <caption, image> pairs! - Make the d(T1, I1) (distance from T1 to I1) as small as possible (see pg. 137 for image) - Now that we have trained CLIP, we can use the text encoder to give an encoding that we can generate images from!
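This is not CLIP itself, just a toy NumPy illustration of the idea that matching <caption, image> pairs should end up with the smallest distance:

```python
# Toy illustration: matching <caption, image> encodings should be close together;
# mismatched pairs should be far apart. Encodings here are random placeholders.
import numpy as np

text_enc = np.random.rand(3, 8)    # 3 caption encodings (T1..T3), 8-dim
image_enc = np.random.rand(3, 8)   # 3 image encodings (I1..I3), 8-dim

# Pairwise distances d(Ti, Ij); training pushes the diagonal (matching pairs)
# to be the smallest entry in each row.
dists = np.linalg.norm(text_enc[:, None, :] - image_enc[None, :, :], axis=-1)
print(dists.round(2))
```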
Explain what a diffusion model does.
- Diffusion models: moving from chaos ("noise") to structure - Diffusion models gradually mould a coherent image from abstract noise - Each refinement builds upon the previous step to iteratively improve the image - Strictly speaking, image generation is "reverse diffusion"; "diffusion" is making things more noisy/chaotic... - For T steps, we have an image x_t at each step - With enough steps, we get a smooth/gradual transition from noise to/from image
Explain the connection between pixels and features. Why don't you often want to work with raw pixels, and what is the solution to this?
- Each pixel can be seen as a feature of the image - One image = one instance with N features (pixels) - Often, we don't want to work with "raw pixels" b/c too many of them, and not informative individually - "Feature extraction" techniques create high-level features
Which real-world domains can AI help with?
- Environment - Manufacturing - Gaming - Healthcare - Cybersecurity NZ Primary Industries: - Agriculture - Aquaculture - Mining & Minerals - Horticulture - Forestry
What are the applications of computer vision?
- Healthcare: Tumour detection, X-ray analysis - Autonomous Vehicles: Object detection, route mapping - Agriculture: Disease identification, crop health monitoring - Retail: Automated checkouts, customer behaviour analysis - Social media: Facebook tagging, Snapchat filters - Augmented Reality (AR) and Virtual Reality (VR)
Why use computer vision?
- Humans are good at adapting to different situations, but are slower than computers at image processing - if you have a lot of image data, humans are expensive/get bored/make mistakes - Machine learning is good at rapidly processing data on specific, well-defined tasks ("narrow AI") - Can use CV to automate a lot of manual tasks
What are some ethical concerns with AI generated images?
- Impact on human artists: Risk of hurting artists' livelihoods by putting them out of a job - Reinforcing gender roles and other stereotypes - Reinforcing harmful ideas like the Eurocentric concept of 'beauty' ('beautiful person' produces a light-skinned, young, skinny woman) - Cultural appropriation (ex: 'Māori artwork' outputs may not be accurate or may be created from data from Māori artists that they didn't consent to being used)
Explain the social impacts of AI on public opinion and politics. What regulation measures can be taken to address this?
- Info is increasingly being spread via social media vs traditional media. - Social media platforms run on AI via recommender systems, and these systems impact what information is distributed to which people based on what kind of person they are and what they have engaged with. - AI also manages content moderation (as there is too much to check manually, so AI classifiers identify harmful content) - AI impacts ad targeting via machine learning for optimisation. Regulations: - AI social media legislation (DSA - Digital Services Act) - recommender algorithm research and regulation - bring AI processes into the public domain (like constructing content classifier training data outside of the companies that use them) - banning ad targeting towards children, or targeting based on protected attributes
What is 'transfer learning' in CNNs?
- This is where we have a network that has already been trained by someone else on a big dataset (E.g. ImageNet: 14 million images) - We can then "fine-tune" the network for our specific task/application (Train it a little bit more on our (much smaller) dataset) - Great way to improve your computer vision results
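A hedged Keras sketch of fine-tuning a pretrained network (MobileNetV2 and the head sizes are examples chosen for illustration, not from the notes):

```python
# Transfer learning sketch: reuse a network pretrained on ImageNet,
# freeze it, and train a small new head on our own (much smaller) dataset.
import tensorflow as tf
from tensorflow.keras import layers

base = tf.keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, input_shape=(160, 160, 3))
base.trainable = False  # freeze the pretrained filters

model = tf.keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(2, activation="softmax"),  # new head for our specific task
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(our_small_dataset, epochs=5)  # "fine-tune" on our own data
```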
What happens if Māori data sovereignty is not honoured?
- Failure to establish trust with Māori & therefore failure to accurately represent Māori, resulting in poor quality data for Māori 'Flow-on' effect: Undercounting/misrepresenting Māori leads to imbalanced and/or less accurate Māori data ... which leads to a more imprecise (biased) ML model Example: Māori not participating in the census & therefore poor quality data for that population
How do you incorporate a caption into a diffusion model to get it to produce images based on the caption?
- incorporate the text caption (as a text "embedding") to "condition" each step of generation - the caption guides the diffusion model towards images related to the caption you give it
What are the limitations of text-to-image generation?
- look at 13/9/23 lecture for more info - can struggle to generate a complex scene (ex: a dog wearing a bowtie and eating a cat - most results mess up the 'eating a cat' or don't even include it) - limited semantic understanding (captions can be ambiguous or contain subtle details that are tough to capture accurately in generated images) - limited by training data
Explain the social impacts of AI on culture. What regulation measures can be taken to address this?
- Recommender systems influence the entertainment people consume based on what they previously liked (now artists have to write and market their art in a way that will be picked up and distributed by the recommender algorithm - like making 'TikTokable bits') - AI art via generative AI Regulations: - regulating generative AI tools: companies using generative AI need to make available a reliable detector for it or specify the content is AI generated - prevent generative AI from scraping work from the internet w/o the creators' consent
"Encodings" - Linking Text and Images. How does it work so that you can generate an image from a caption?
1. An "encoder": converts a caption into a text "encoding" 2. A "prior": converts text encoding to an image "encoding" 3. A "decoder": converts the encoding to an image "Encoding": - Represent "high-level" features (if we can represent the caption and the image with matching "features" (encoding), then we can generate an image from just the caption)
Example Question: Describe two ways in which AI can improve productivity in the workplace.
1. Automate repetitive tasks like data entry and handle mundane tasks like basic customer inquiries. This enables human workers to focus on tasks that require more distinctly human skills like creativity. 2. Optimise the Supply Chain: AI can predict customer demand and department needs in order to effectively allocate resources 3. Company Analysis: AI can analyse productivity by department to identify what areas need to be addressed. 4. LLMs can summarise documents so you can extract the information that is relevant to you quickly
What are the six Māori data principles?
1. Rangatiratanga | Authority 2. Whakapapa | Relationships 3. Whanaungatanga | Obligations 4. Kotahitanga | Collective Benefit 5. Manaakitanga | Reciprocity 6. Kaitiakitanga | Guardianship
Explain the process for training a diffusion model. What is the result?
1. take an image 2. add a little bit of noise to it step-by-step, such as 1000 times (diffusion) 3. end up with random noise 4. teach the model to go backwards: remove a little bit of noise step-by-step, such as 1000 times (reverse diffusion) 5. end up with your original image ... repeat If you do diffusion training enough times with enough images, you end up with a model that can give you an "interesting" image from random noise (which is neat but not that useful)
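A toy NumPy sketch of the forward ("noising") half of this process; a real diffusion model also learns the reverse step, and the noise schedule here is made up:

```python
# Forward diffusion only: add a little noise to an image step-by-step.
import numpy as np

T = 1000                      # number of diffusion steps
x = np.random.rand(28, 28)    # stand-in for an image
noise_scale = 0.02

for t in range(T):
    x = x + noise_scale * np.random.randn(28, 28)  # add a little noise each step

# After enough steps, x is essentially random noise; training teaches a network
# to undo one small step at a time (reverse diffusion), ending at an image.
```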
Why is 0-255 the range of values for a pixel?
2^8 = 256, so 8 bits (1 byte) can be used to represent the 256 numbers (0-255)
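A quick check of this, assuming NumPy for the unsigned 8-bit integer type:

```python
# 8 bits give 2**8 = 256 distinct values, which is why uint8 pixels span 0-255.
import numpy as np
print(2 ** 8)                                          # 256
print(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)  # 0 255
```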
Explain the filters and outputs for a CNN.
A CNN has filters: small weight matrices that the network learns - allows recognition of edges, textures, ... Each filter slides across the image to produce a feature map: calculate the value for each pixel by taking its neighbouring pixels into account Many fewer weights for the network to learn - just the filter values! Output nodes: 1 for regression/binary classification, N for N classes in multi-class classification (pick highest)
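An illustrative NumPy version of a single 3x3 filter sliding over a small image to build a feature map (the filter values are a common sharpening kernel, not taken from the notes; no padding, so the output is slightly smaller):

```python
# One filter sliding across an image: each output value depends on a pixel
# and its neighbours, and the only learned weights would be the 9 filter values.
import numpy as np

image = np.random.rand(6, 6)
filt = np.array([[ 0, -1,  0],
                 [-1,  5, -1],
                 [ 0, -1,  0]])   # a common 3x3 sharpening kernel

h, w = image.shape
feature_map = np.zeros((h - 2, w - 2))
for i in range(h - 2):
    for j in range(w - 2):
        patch = image[i:i + 3, j:j + 3]           # pixel + its neighbours
        feature_map[i, j] = np.sum(patch * filt)  # one value per position

print(feature_map.shape)  # (4, 4)
```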
Example Question: Explain how a large language model can make use of web search to inform its response to a prompt
A LLM can use web search to access current information, expand its knowledge base about the topic, fact-check information it intends to put in its response, explore multiple perspectives, gather statistics, cite sources, and obtain location-specific information.
Give three examples of how AI can be applied to horticulture. What problems? What inputs & outputs? What kind of AI? Any potential issues?
AI Robots for Weed Identification and Removal (Classification, Reinforcement Learning) - Input: visuals of plants, sensor data, weed species and crop information - Output: classification of weeds vs crops, control actions for precise weed removal - Type of AI: CNNs for image analysis, reinforcement learning for removal strategies - Potential Issues: accuracy in crop vs weed identification, removal accuracy, adapting to diverse field conditions, impact on weeders' jobs Optimising Crop Management - Input: crop data, weather data, environmental data (soil, water), historical crop performance data - Output: recommendations for planting, irrigation, fertilisation, and harvesting schedules - Type of AI: optimisation algorithms, reinforcement learning - Potential Issues: data quality and accuracy, changing environmental conditions, integrating into existing farming practices Crop Yield Prediction (Regression): - Input: weather data, crop data, soil data, planting and historical yield data - Output: predicted crop yield in bushels or kg - Type of AI: regression models (linear or multi-feature regression, decision trees), NNs - Potential Issues: data quality and availability, complex interactions between factors, climate variability, model scalability, model accuracy by region
Explain the social impacts of AI on jobs and work. What regulation measures can be taken to address this?
AI is impacting all stages of the job cycle: hiring (via creating a shortlist of candidates to interview), management and evaluation (using quantitative info like customer ratings to evaluate workers, using algorithms to connect gig workers with clients, AI systems collaborating w/ workers), job automation (AI may take over jobs and impact labour demands and income in sectors where AI can automate work), creation of new jobs (more jobs for AI people, more jobs for those with distinctively human skills) Regulations: - legislation against AI taking jobs (ex: South Korea 'robot tax' that reduces tax incentives for companies using robots) - making AI transparent, so recruitment AI can be audited for discrimination
Describe Kotahitanga/Collective Benefit and give two examples.
AI/Data ecosystems should enable Māori to have collective and personal benefit Build capacity: development of a Māori workforce - empower Māori and learn by teaching (ako) Connect: connections between Māori and other indigenous peoples enable sharing ideas, strategies Examples: Meaningfully involving Māori from an early stage in developing a large language model for Aotearoa, with benefit for all involved. A tech company establishes a comprehensive training program on AI for Māori. Māori collaborate with other indigenous communities to discuss strategies on how to use AI to advance indigenous sovereignty. A Māori community using AI-driven technology to enhance its collective farming practices and improve resource management & allocation for the benefit of all community members.
Give three examples of how AI can be applied to the environment. What problems? What inputs & outputs? What kind of AI? Any potential issues?
Air Quality Prediction: - Input: air quality monitoring station data, weather data, historical data - Output: prediction of the concentration levels of specific pollutants, health warnings - Type of AI: regression models (linear regression, decision trees); can also apply time series analysis or NNs to capture relationship between air quality, weather, and time - Potential Issues: data quality and accuracy, changing environmental conditions Optimising Renewable Energy Production and Distribution: - Input: energy production and consumption data, weather forecasts - Output: energy production forecasts, demand predictions, energy grid management recommendations - Type of AI: predictive models (NNs, decision trees), reinforcement learning, optimisation algorithm, regression - Potential Issues: grid integration challenges, weather forecast uncertainty, smart grid cybersecurity Wildlife Identification: - Input: visual animal data, audio recordings, geographic info about location of animal sightings - Output: species identification - Type of AI: computer vision models (CNNs, object detection), RNNs for processing audio data, transfer learning from large datasets - Potential Issues: data quality, data imbalance and species variation, image variations, background noise in audio files
Describe Whakapapa/Relationships and give two examples.
All data has a context - a whakapapa. Metadata - where did it come from, why, who from? Data disaggregation: having data separated into smaller categories that prioritise Māori needs Future use: the use (AI!) of data can have long-term consequences - need to protect from future harm Examples: The "Pima Indians" dataset was collected from a long-term study of indigenous peoples in the USA. It became used as an ML "benchmark dataset" without any real permission from the Pima. (failure to honour whakapapa) An app for preserving Te Reo Māori is built via collaborating with Māori scholars. The app details the context of all the information (where it came from, its cultural significance) and the app's data is barred from being used for other purposes without consent of the Māori who provided it. A health initiative that collects Māori data and enables it to be broken down by iwi, region, age, gender, etc. in order to target specific demographic needs in the Māori collective.
Give three examples of how AI can be applied to agriculture. What problems? What inputs & outputs? What kind of AI? Any potential issues?
Animal or Plant Disease Detection (Classification): - Input: visual data, animal medical history, sensor data (temp, humidity, soil moisture), historical disease and environmental data - Output: classification of healthy or diseased, specific disease diagnosis - AI Models: CNNs for image data, machine learning algorithms like decision trees for written data - Potential Issues: data quality and accuracy, limited data for rare diseases, environmental variability impacting disease symptoms, animal welfare ethics Assess Cow Milk Quality (Regression): - Input: cow characteristics (breed, age, health status); milk data (fat & protein content), environmental data - Output: assessment of overall milk quality, prediction of characteristics like fat or protein content - AI Models: regression - Potential Issues: data quality and accuracy, variability due to cow individuality, ethical concerns for animal welfare Water Management (Optimisation): - Input: sensor data, weather forecasts, historical crop data; irrigation system data, water usage demand and cost - Output: optimised irrigation schedules, water-saving recommendations, water consumption and cost estimates - AI Models: optimisation algorithms, reinforcement learning, decision trees - Potential Issues: accurate sensor and weather forecast data, balancing water conservation with crop yield goals, complying with water allocation regulations
Give three examples of how AI can be applied to forestry. What problems? What inputs & outputs? What kind of AI? Any potential issues?
Balancing Economic Productivity with Ecological Sustainability (Optimisation): - Input: economic data (timber prices, operational costs), ecological data (forest makeup, biodiversity), regulatory requirements and conservation goals - Output: optimised forest management strategies - Type of AI: optimisation techniques, decision trees, NNs, regression - Potential Issues: balancing conflicting objectives, accounting for changing regulations, social and ethical considerations Classifying Tree Species: - Input: sensor data, tree characteristics, visual data, historical tree data - Output: classification of trees into specific species - Type of AI: classification models - decision trees, CNNs - Potential Issues: ensuring accurate and representative training data, variability in tree characteristics & imagery Predicting Tree Growth Rate (Regression): - Input: environmental data (climate, soil), historic tree growth data, forest management practices (fertilisation) - Output: predict tree growth rate (height or volume increment) - Type of AI: regression models - Potential Issues: variety in tree species growth, impact on trees from changing environmental conditions Forest Fire Management (Reinforcement Learning): - Input: sensor data from fire monitoring equipment, historical data on past wildfires, forest data - Output: fire management recommendations - Type of AI: reinforcement learning, decision trees, NNs - Potential Issues: making real-time decisions under uncertain and changing conditions, responses that protect both human and ecological assets, complexity of wildfires
Describe Whanaungatanga/Obligations and give two examples.
Balancing rights: individuals' data/privacy rights need to be balanced with those of the group (Sometimes, collective Māori rights prevail over individuals') Accountability: those who collect, use (AI), or store Māori data are accountable to those whose data it is Examples: A Māori-driven research project that uses AI to analyse individuals' genetic data to understand how it relates to specific health conditions - with the benefits returning directly to Māori. The project understands the obligation to the well-being of Māori. (accountability) A Māori cultural organization embarks on a project to digitize and preserve traditional artifacts. They recognize their obligations to the community to safeguard cultural heritage, and do that by consulting with iwi to make collective decisions on how to handle the data. (balancing rights of org. and Māori)
What is computer vision? What is the goal?
Computer vision is a subfield of ML that trains models to interpret and make decisions based on visual data (images, videos, etc) The goal: mimic (or surpass!) human vision capabilities but at scale and without fatigue
Explain standard vs indigenous vs Māori data sovereignty.
Data Sovereignty: data is subject to the laws of the nation within which it is stored Indigenous Data Sovereignty: data is subject to the laws of the nation from which it is collected Māori Data Sovereignty: recognises that Māori data should be subject to Māori governance. Māori data sovereignty supports tribal sovereignty and the realisation of Māori and Iwi aspirations
Why is Māori data sovereignty relevant to the use of AI?
Data is a living tāonga and is of value to Māori. If care isn't taken to respect Māori culture and concerns, AI can utilise Māori data in a way that disadvantages or disrespects them. AI can perpetuate Māori stereotypes, or replicate Māori culture like tā moko in a way that is culturally insensitive. To Māori, data isn't just simply information, but also has an important historical and cultural context.
Give three examples of how AI can be applied to healthcare. What problems? What inputs & outputs? What kind of AI? Any potential issues?
Disease Diagnosis: - Input: historic patient data, medical literature, specific patient data (written or visual) - Output: disease classification for patient - Type of AI: classification models - decision trees, NNs, CNNs for images - Potential Issues: patient data privacy, model interpretability, representation and accuracy for rare diseases Optimising Resource Allocation: - Input: patient flow data, staff availability and qualifications, equipment variety and count - Output: Optimised resource (staff and equipment) allocation scheduling and recommendations - Type of AI: optimisation techniques, regression, NNs, classification - Potential Issues: balancing patient needs with resource constraints, fairness in resource allocation, adapting to dynamic conditions Personalised Treatment Recommendations (Optimisation, Reinforcement Learning): - Input: patient health records, real-time patient monitoring data, medical guidelines, doctor recommendations - Output: optimal treatment recommendations, adaptive recommendations based on evolving patient condition - Type of AI: reinforcement learning, optimisation, regression, NNs, classification - Potential Issues: patient safety and ethics, model interpretability, quality, and accuracy Predict Patient Length of Stay (Regression): - Input: patient characteristics and medical data, historic patient data - Output: predicted length of stay in days - Type of AI: regression models, ML like decision trees or NNs - Potential Issues: patient privacy, adapting to dynamic patient conditions, model interpretability
Explain the social impacts of AI on economics. What regulation measures can be taken to address this?
Drive an increase in economic productivity via AI automation. Drive an increase in economic inequality by enabling bigger tech companies with the capacity to invest in AI to grow bigger and buy out the little ones. Potential economic scenarios: 1. new jobs balance out jobs given to AI 2. job losses, but AI profits remain in NZ and enable redistribution of wealth from companies to those who have lost their jobs ('universal basic income') 3. job losses & AI profits go offshore Regulations: - legislation to combat tech company monopolisation - tax big tech companies - establish global corporate tax to prevent companies from avoiding taxes by registering profits in tax havens
What are the common parameters in neural networks?
Epochs: the number of times the NN trains on the whole dataset (50 epochs = 50 times per image) Batch Size: number of images used for each iteration of training (if 100 images, batch size could be 20, so 5 batches per epoch) Learning Rate: affects how much we adjust the weights when updating them in each batch (higher = faster learning, but may "jump" over good weights) - too low of a learning rate means small updates in weight values, so learning takes forever - too high of a learning rate means big updates in weight values, so optimal weights are overshot - just right learning rate means more moderate weight updates and getting to optimal values efficiently
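A hedged Keras sketch showing where these three parameters appear in practice (the model and data are placeholders):

```python
# Illustration of epochs, batch size, and learning rate in a training call.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

x = np.random.rand(100, 10)            # 100 instances, 10 features
y = np.random.randint(0, 2, size=100)  # binary labels

model = tf.keras.Sequential([layers.Dense(8, activation="relu"),
                             layers.Dense(1, activation="sigmoid")])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="binary_crossentropy")
# 100 instances with batch_size=20 -> 5 batches per epoch, repeated for 50 epochs
model.fit(x, y, epochs=50, batch_size=20, verbose=0)
```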
Give three examples of how AI can be applied to aquaculture. What problems? What inputs & outputs? What kind of AI? Any potential issues?
Fish Identification (Classification): - Input: visuals, fish characteristics (weight, size) - Output: fish species classification - Type of AI: decision trees and CNNs - Potential Issues: fish species variability, visual variability in one species, visual similarities between different species, need for diverse and representative training data Fish Growth Prediction (Regression): - Input: fish species data, environmental data, feeding data - Output: size and weight prediction, time series of growth predictions - Type of AI: regression models, decision trees, NNs, time series analysis techniques - Potential Issues: accurate fish growth data collection, variability within a fish species, changing environmental conditions Fish Feeding Optimisation: - Input: fish data, environmental data, feeding data, economic data (cost of feed ingredients and economic outputs of fishing operation) - Output: optimal feeding schedules and amounts - Type of AI: optimisation algorithms, reinforcement learning, ML models like NNs, decision trees, regression models - Potential Issues: balancing feed nutritional needs with feed amounts and cost, changing environmental conditions and fish behaviour
Explain backpropagation (how you update the weights in a NN).
Five steps: 1. Forward Pass - get the initial prediction by passing the inputs through the NN 2. Calculating the Error - large error = far off target, small error = doing well/on the right track 3. Propagating the Error Backward - distribute the 'blame' (error) to each weight based on the weight's contribution to the overall error - The error is "propagated" backwards, layer by layer. Weights that have a larger contribution to the error are adjusted more 4. Updating the Weights - Each weight in the network is adjusted based on the error signal it received 5. Iterative Learning (i.e. repeat 1 to 4) - can adjust the size of the updates: learning rate - the update process is iterative - with each forward pass, the network adjusts its weights to reduce error - over time, the neural network's weights are refined to make better predictions
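A minimal NumPy sketch of this loop for a single linear neuron with squared error; real networks repeat the same idea layer by layer via the chain rule (the data and target weights here are made up):

```python
# Forward pass -> error -> gradient ("blame") -> weight update, repeated.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((20, 3))                 # 20 instances, 3 features
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w                          # targets generated from a known rule

w = np.zeros(3)
learning_rate = 0.5
for epoch in range(1000):
    pred = X @ w                        # 1. forward pass
    error = pred - y                    # 2. calculate the error
    grad = X.T @ error / len(y)         # 3. propagate error back to each weight
    w -= learning_rate * grad           # 4. update the weights
                                        # 5. repeat (iterative learning)
print(w.round(2))                       # converges toward [1.5, -2.0, 0.5]
```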
How is a Generative Adversarial Network (GAN) trained?
Generator and Discriminator are trained hand-in-hand Training the Discriminator: 1. The discriminator classifies both real data and fake data from the generator 2. Penalise the discriminator for misclassifying a real instance as fake or a fake instance as real (FN/FP) 3. The discriminator updates its weights through backpropagation Training the Generator: 1. Produce generator output from "random noise" 2. Use the discriminator to classify this output as a "Real" or "Fake" image 3. Penalise the generator if it is classified as "Fake" 4. Updates weights through backpropagation Training Both, Together: 1. The discriminator trains for one or more epochs Sometimes the discriminator is trained before working on the generator so that we can truly judge if the generator is good or not right off the bat 2. The generator trains for one or more epochs 3. This continues until the generator creates images so good that the discriminator can't tell they're fake i.e. we're finished training when the discriminator is guessing randomly (50% accuracy)
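A hedged, toy-scale sketch of this alternating loop in TensorFlow/Keras; the "real" data is just samples from a Gaussian and the architecture is illustrative, not a real deepfake GAN:

```python
# Alternating GAN training: discriminator learns real vs fake,
# generator learns to fool the discriminator.
import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 5
generator = tf.keras.Sequential([
    layers.Dense(16, activation="relu", input_shape=(latent_dim,)),
    layers.Dense(1)])                        # outputs a fake "data point"
discriminator = tf.keras.Sequential([
    layers.Dense(16, activation="relu", input_shape=(1,)),
    layers.Dense(1, activation="sigmoid")])  # P(real)

bce = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(1e-3)
d_opt = tf.keras.optimizers.Adam(1e-3)
batch = 32

for step in range(1000):
    noise = tf.random.normal((batch, latent_dim))
    real = tf.random.normal((batch, 1), mean=3.0, stddev=0.5)  # toy "real" data

    # 1. Train the discriminator: real -> 1, generator's fakes -> 0
    with tf.GradientTape() as tape:
        fake = generator(noise)
        d_loss = (bce(tf.ones((batch, 1)), discriminator(real)) +
                  bce(tf.zeros((batch, 1)), discriminator(fake)))
    grads = tape.gradient(d_loss, discriminator.trainable_variables)
    d_opt.apply_gradients(zip(grads, discriminator.trainable_variables))

    # 2. Train the generator: try to get its fakes classified as "real" (1)
    with tf.GradientTape() as tape:
        fake = generator(tf.random.normal((batch, latent_dim)))
        g_loss = bce(tf.ones((batch, 1)), discriminator(fake))
    grads = tape.gradient(g_loss, generator.trainable_variables)
    g_opt.apply_gradients(zip(grads, generator.trainable_variables))

# Training stops when the discriminator is ~50% accurate, i.e. guessing randomly.
```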
Describe Kaitiakitanga/Guardianship and give two examples.
Guardianship: data should be stored/used in a way that reinforces the capacity of Māori to exercise Kaitiakitanga over Māori data Ethics: Tikanga (practices/values), kawa (protocols) and mātauranga (knowledge) underpin the use of Māori data Restrictions: Māori have a right to decide if Māori data is controlled (tapu) or open (noa) access. Examples: A Māori cultural institution is given the control to collect and digitize Māori cultural data (artifacts, historical documents), ensuring they are maintained according to Māori preferences and controlling who gets to access it. A research institute builds an AI nurse that incorporates traditional Māori healing practices. They ensure that the research is conducted in a way that respects Māori knowledge, protocols, and values, giving them the same respect as other health practices.
Example Question: Describe two influences that AI is currently having on human jobs. You can refer to any stage of the work lifecycle: hiring, management/evaluation, or redundancy.
Hiring: AI screens potential applicants and whittles down the list of people to call back for an interview Redundancy: AI can make certain jobs redundant as it can automate those tasks, resulting in humans losing jobs to AI and impacting labour demand in sectors that can be automated
Explain how humans process an image vs computers.
Humans: 1. The retina "captures" an image, and the photoreceptor cells (rods/cones) turn them into an electrical signal 2. Signal sent along the optic nerve to the brain 3. Brain "processes" the signal (in the visual cortex) 4. Frontal lobe adds other senses & memory to make decisions Computers: 1. A camera captures an image, and the image sensor turns it into an electrical signal → pixel values 2. The image is pre-processed into features 3. Algorithms operate on features to make decisions
Explain computer vision vs human vision.
Image Capture: - humans: eyes capture images (rods - brightness, cones - colour) - computer: cameras/sensors capture digital images in RGB, infrared Image Processing: - humans: retina then visual cortex - comp.: pre-processing, feature extraction, high-level processing via algorithms Decision Making: - humans: based on past experiences, context, and instinct - comp.: based on algorithms, pattern matching, and trained models Speed: - humans: limited by biological neural processing speed (<~200Hz) - comp.: can process lots of data quickly, superior in repetitive tasks Accuracy: - humans: accurate for everyday tasks but can be deceived (illusions) - comp.: highly accurate in trained tasks, but misled by poor training data Adaptability: - humans: highly adaptable, can learn and adjust to new visual contexts - comp.: requires re-training or fine-tuning for new tasks or contexts
What are the applications of text-to-image generation?
Image Upscaling (higher resolution) Denoising (cleaning up a grainy image) In/outpainting (filling in parts of image, or adding content around the image) Image Variations (small changes) Image → Caption (caption an image)
What is image generation? Why is image generation so hard?
Image generation is the automated creation of images by a computer rather than a human. Why it is so hard: - Vast array of textures in real life - Lighting, shadows and reflections - Minute details: wrinkles, water droplets, hair... - Realism isn't only about objects, but also their interaction with their environment
What is an analogy for diffusion?
Imagine you're a sculptor: you have a block of marble, and your client describes what sculpture they want Block of marble: random noise Client's description: the text prompt/caption You go through many small steps to convert the block (noise) into the final sculpture (image) based on the description (caption) The principle of starting broad and then refining is a universal concept in creation/intelligence - another analogy: starting with a rough draft and refining it until you get to the final draft
Why are 'normal' NNs not good for Computer Vision?
In a normal NN, the inputs would be the pixel values of the image, and the output is produced by adding up the products of each input value and its corresponding weight. If you 'flatten' an image, you lose spatial patterns - neighbours become far apart and vice versa. Also, simple NNs treat each pixel as a separate input. If you try to make a normal NN out of an image, you need inputs and weights for each pixel, which is computationally expensive and hard (see pg 127-128 of notes for images)
Explain how an AI Text-to-Image generator works.
It is a generative AI that includes: 1. An "encoder": converts a caption into a text "encoding" 2. A "prior": converts text encoding to an image "encoding" 3. A "decoder": converts the encoding to an image The decoder is non-deterministic - can produce many different images for the same input Uses diffusion models ("prior", "decoder") and transformers ("encoder")
Explain how a convolutional neural network (CNN) works in high-level terms.
It works by using specialized layers called convolutional layers to automatically learn and recognize patterns and features within the data. These layers apply filters to small portions of the input data, identifying things like edges, shapes, or textures. As the network goes deeper, it combines these features to recognize more complex patterns and eventually make high-level predictions, like identifying objects in an image.
Give three examples of how AI can be applied to mining. What problems? What inputs & outputs? What kind of AI? Any potential issues?
Locating New Mineral Deposits (Classification): - Input: geological data, satellite imagery and sensor data, historical mining data and exploration reports - Output: classification of likelihood of area containing mineral deposits - Type of AI: CNNs for image analysis, decision trees - Potential Issues: data quality and availability, complex geological factors Predict Likelihood of Mineralisation (Regression): - Input: geological data from target area, historical exploration data from similar regions - Output: continuous value representing the likelihood of mineral content. - Type of AI: regression models, decision trees - Potential Issues: model accuracy, limited historical data, complex geological factors Optimising Mining Equipment Performance (Reinforcement Learning): - Input: real-time machine data, historical performance data, maintenance records - Output: real-time recommendations and adjustments to maximise efficiency and safety - Type of AI: reinforcement learning, optimisation algorithms, regression, neural networks (NNs), and decision trees - Potential Issues: balancing safety and efficiency, handling noise in sensor data, complex equipment dynamics
How can deepfakes be detected?
Look at: - Fine details: blinking, hair movement - Lighting/shadows/reflection inconsistencies - "Artifacts": small errors - Audio analysis - does the audio match the voice? - Metadata analysis (file veracity - authenticity, accuracy, and trustworthiness) Using AI to detect deepfakes: - Intel's FakeCatcher looks for subtle 'blood flow' in video pixels (veins change colour as heart beats) - "PhaseForensics" system extracts motion features from lips in videos - "higher-level" features - recognising camera 'fingerprints'
Give three examples of how AI can be applied to cybersecurity. What problems? What inputs & outputs? What kind of AI? Any potential issues?
Malware Detection (Classification): - Input: computer data, network traffic, system data, known information about malware (signatures, behaviour) - Output: Determining if behaviour is malicious or benign - Type of AI: neural networks, decision trees - Potential Issues: unknown/zero-day threats, unrepresentative training data, detection evasion techniques Assessing Security Threat Level (Classification or Regression): - Input: security logs, historical attack data, computer & system vulnerabilities and behaviour - Output: classification level or threat score - Type of AI: decision trees, regression - Potential Issues: data quality, zero-day threats, output accuracy Optimising Security Resource Allocation: - Input: digital assets and their value, computer & system data, potential threats, budget constraints, available resource information and cost - Output: optimal allocation for resource deployment - Type of AI: optimisation algorithms, regression, classification, NNs - Potential Issues: balancing resources amongst assets, balancing allocation with budget constraints, adapting to changing threats Incident Response (Reinforcement Learning): - Input: network and system data, security logs, real-time threat information - Output: adaptive incident response recommendations - Type of AI: reinforcement learning, NNs, classification - Potential Issues: incomplete visibility to environment, delayed or incorrect threat detection, response fails to preserve critical evidence or assets
What is Māori data sovereignty?
Māori Data Sovereignty: recognises that Māori data should be subject to Māori governance. Māori communities should be the ones making decisions about how their data is collected, used, and shared. Māori data sovereignty supports tribal sovereignty and the realisation of Māori and Iwi aspirations.
What is Māori data?
Māori data is data produced by Māori or about Māori or the environments that Māori have relationships with. Data is a living tāonga and is of value to Māori. Examples: - Data from government agencies, organisations and businesses about Māori - Data about Māori used to describe/compare Māori collectives - Data about Te Ao Māori (The Māori World) that emerges from research
Describe Rangatiratanga/Authority and give two examples.
Māori have a right to control over Māori data How it is collected, used (AI/ML), stored, ... Jurisdiction: it should be stored in a way that enhances control; within Aotearoa, wherever possible Self-determination: data should empower self-determination and self-governance Examples: Using a NZ cloud service to store data about Māori and ensure those whose data is stored have a say in how the data is used and managed. A community-controlled data platform is established by Māori organizations in NZ, ensuring authority over the collection, use, and management of Māori data.
Where are deepfakes used?
Politics - Gabon Deepfake Controversy: potentially deepfaked video contributed to a military coup attempt Hollywood - deepfake used to create young Luke Skywalker for TBoBF - "Performance cloning": one-off payment to actor to scan their likeness to use for AI in perpetuity Positive Uses: - restore people's voices lost to disease - improve foreign-language film dubbing - create deepfakes of famous people for museum exhibits or nonfiction historical films Potential Issues: - misinformation (political polarisation, lobbying) - falsifying evidence in court/defamation (or making judge/jury doubt genuine evidence) - Financial Fraud - Cyberbullying and revenge pornography
Give three examples of how AI can be applied to manufacturing. What problems? What inputs & outputs? What kind of AI? Any potential issues?
Predictive Maintenance (Regression, Reinforcement Learning): - Input: machine data (usage, time since last maintenance) - Output: Predict when equipment might fail or require maintenance - Type of AI: ML algorithms (regression, decision trees), NNs, reinforcement learning - Potential Issues: data quality and availability, integration with existing equipment and systems, ensuring accurate and timely predictions Supply Chain Optimisation: - Input: sales records, inventory records, market trends, transportation data - Output: recommendations for production schedules, transportation routes, optimised inventory levels - Type of AI: time series analysis, reinforcement learning, optimisation algorithms, regression - Potential Issues: data integration and accuracy, changing market conditions, accurately aligning with business goals Identify Defective Products: - Input: visual data of products from cameras or sensors - Output: classification of products as good or defective - Type of AI: computer vision algorithms like CNNs - Potential Issues: dealing with image variations, cost of integrating AI inspection systems
Example Question: Name a principle of Māori data sovereignty and give an example of how it could be violated.
Rangatiratanga/Authority: collecting Māori data w/o informed consent and storing it outside of NZ Whakapapa/Relationships: gathering data from Māori and later using it in a way that they did not consent to. Whanaungatanga/Obligations: gathering data from Māori and using it in a way that exploits them (ex: using the data to develop a lucrative product and not fairly compensating Māori for their contribution) Kotahitanga/Collective Benefit: Using Māori data solely for corporate gain rather than collective benefit of all groups involved. Manaakitanga/Reciprocity: gathering data from Māori but barring them from accessing the findings from that data Kaitiakitanga/Guardianship: making Māori data open access without consulting them first, not allowing them to revoke data access, failing to safeguard the data from misuse
Summarise the six principles of Māori data sovereignty.
Rangatiratanga/Authority: control, jurisdiction, self-determination Whakapapa/Relationships: context, data disaggregation, future use Whanaungatanga/Obligations: balancing rights, accountabilities Kotahitanga/Collective Benefit: benefit, build capacity, connect Manaakitanga/Reciprocity: respect, consent Kaitiakitanga/Guardianship: guardianship, ethics, restrictions
Give three examples of how AI can be applied to gaming. What problems? What inputs & outputs? What kind of AI? Any potential issues?
Realistic NPC Behaviour: - Inputs: game world data (player positions, actions, environment state) - Output: NPC behaviours like dialogue and physical responses to player actions - Type of AI: decision trees, reinforcement learning - Potential Issues: making sure NPCs aren't predictable, don't get unfair advantages from knowledge of game environment (like shortcuts or overpowered abilities) Play Games: - Input: Game state (current position, potential actions) - Output: feedback (reward or penalty), AI learns a policy (state -> action) - Type of AI: reinforcement learning, tree search, using a NN to train a policy - Potential Issues: game complexity, memorising specific scenarios, not having complete information about game state Content Generation: - Input: design rules and specifications (difficulty levels, obstacles) - Output: game content such as game levels, maps, items - Type of AI: GANs, evolutionary algorithms - Potential Issues: generating levels that are too simple or too complex, making sure it adheres to desired specifications
Describe Manaakitanga/Reciprocity and give two examples.
Respect: collection, use (AI), and storage of data should uphold the dignity of Māori; results that stigmatise or blame Māori can cause long-term harm Consent: Free, prior and informed consent (FPIC) should underpin collection and usage (AI) of data Examples: Māori were deeply suspicious of the collection of data in early "Māori censuses" due to the potential for misuse - this forced data collection is still remembered and considered today. - combat by obtaining FPIC before collecting data A group of historians develop an AI chatbot for answering questions about Māori culture and history by consulting Māori communities and historians to ensure an accurate, dignifying representation of the people and their ancestors.
Example Question: Describe one law for big tech companies that is already in force in the European Union (EU).
The Digital Services Act (DSA) that places special obligations on large social media companies about transparency to users and auditors. Noncompliance can result in fines up to 6% of annual revenue.
Example Question: Describe one law that has been proposed for AI systems, but is not yet in force. Briefly discuss the advantages and disadvantages of this proposed law.
The EU AI Act - establishes a model repository & model risk classification Advantages: - holds all parties involved with AI at every step of process accountable - rules apply to places outside EU if AI is meant to be used in EU - % of company's revenue as penalty for noncompliance Disadvantages: - regulation can stifle EU AI research and development - EU AI developers may struggle to compete with those in countries with more relaxed regulations - may not be comprehensive or able to keep up with AI advancements
Explain the social impacts of AI on the way governments run. What regulation measures can be taken to address this?
Using AI for government tasks: - predict school leavers at risk of long-term unemployment - predict risk of criminals reoffending - identify crime hotspots and implement 'predictive policing' (problematic b/c it can reinforce historic biases toward certain areas) Regulations: - 'open government': establishing transparency on government use of AI - catalogue of AI models, how they are used and how they work ("these inputs and these outputs, this is the overall accuracy, this is the accuracy by demographic")
Example Question: Ali designs a simple neural network to classify cats and dogs. His images have a resolution of 100x100 pixels. His network has 10,000 input nodes, 100 hidden nodes, and 2 output nodes. After training his network, he only gets 60% test accuracy. Explain why this network performs so badly.
Using a simple NN means you aren't capturing spatial patterns (as it treats each pixel as a separate input) and the large number of inputs means the network can easily overfit to the training data rather than learn generalised patterns. Simple NNs are not as suited as CNNs to capturing spatial patterns and feature connections.
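A quick count of the weights this simple network has to learn, using the numbers from the question (a rough illustration of why so many parameters invite overfitting):

```python
# 100x100 image -> 10,000 inputs, 100 hidden nodes, 2 outputs (fully connected).
inputs, hidden, outputs = 100 * 100, 100, 2
weights = inputs * hidden + hidden * outputs   # 1,000,200 weights
biases = hidden + outputs                      # 102 biases
print(weights + biases)                        # 1,000,302 parameters to learn
```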
Example Question: Does this web search mechanism help prevent the language model from generating 'hallucinations' in its response? Explain your answer.
While it doesn't entirely eliminate the possibility of hallucinations, it does reduce the likelihood of one being generated. Allowing a LLM web access means it can access diverse sources to gain insights and fact-check information, and include the most relevant and up-to-date information in its response. However, there are potential issues with information quality, confirmation bias, time constraints between prompt submission and response times, and complex language resulting in ambiguity or misinterpretation.
How are deepfakes created?
With Generative Adversarial Networks (GANs) They use two separate networks: 1. Generator (makes fake images) 2. Discriminator (judges if an image is real or fake) - If the discriminator can't tell if it's real or not, it means the generator is doing a good job
Example Question: What sort of deep network could be used to train a Deepfake generator, and how would you train it?
a Generative Adversarial Network (GAN); train a generator and a discriminator in tandem
What is a deepfake?
a video of a person in which their face or body has been digitally altered so that they appear to be someone else, typically used maliciously or to spread false information
How are CNNs trained?
nearly the same process as standard/normal NNs - The human hand-designs the network architecture (number of each layer, number/shape of filters, ...) - We train the weights in the network (filters and standard weights) with backpropagation - Needs a LOT of labelled data ...
Example Question: What is "Māori Data Sovereignty" and how does it differ from the standard definition of "Data Sovereignty"?
Standard data sovereignty is where the data is subject to the laws of the nation within which it is stored, whereas Māori data sovereignty specifies that it is the people the data is collected about (in this case, Māori) who should be in charge of the data.