Generative AI
What does self-attention do?
Finds contextual dependencies between words, reflecting the importance of each word in the sequence with respect to every other word in the sequence
Name a few techniques used to pick the next word from the softmax output.
Greedy sampling (always picking the highest-probability word) and random sampling (picking a word at random, weighted by its probability)
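A minimal Python sketch of both techniques (the four-word vocabulary and its probabilities are invented for illustration):

```python
import numpy as np

# Toy softmax output over a tiny made-up vocabulary.
vocab = ["cake", "donut", "banana", "apple"]
probs = np.array([0.20, 0.10, 0.02, 0.68])

# Greedy sampling: always pick the highest-probability word.
print(vocab[int(np.argmax(probs))])     # always "apple"

# Random sampling: draw a word weighted by its probability.
rng = np.random.default_rng(seed=0)
print(rng.choice(vocab, p=probs))       # varies with the seed
```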
What does a token ID represent?
The token's position in the dictionary (vocabulary) of all possible words that the model can work with
What is the training objective of autoregressive models?
Predict the next token based on the previous sequence of tokens
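A minimal sketch of what that objective looks like in practice (the token IDs are made up): the targets are just the inputs shifted by one position.

```python
# Causal language modeling: at each position, predict the next token.
tokens = [101, 7, 42, 13, 99]           # made-up token IDs

inputs = tokens[:-1]                    # what the model sees
targets = tokens[1:]                    # what it must predict

for x, y in zip(inputs, targets):
    print(f"after ...{x}, predict {y}")
```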
What is in-context learning?
Providing examples or additional data inside the prompt to help the model perform the task
What is few-shot learning?
Providing more than one example in the context
What is zero-shot learning?
Providing no examples in the prompt
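An illustrative set of prompts showing the zero-shot, one-shot (in-context), and few-shot styles for a sentiment task (the review texts are invented):

```python
zero_shot = """Classify this review: I loved this movie!
Sentiment:"""

one_shot = """Classify this review: I loved this movie!
Sentiment: Positive

Classify this review: The plot made no sense.
Sentiment:"""

few_shot = """Classify this review: I loved this movie!
Sentiment: Positive

Classify this review: The plot made no sense.
Sentiment: Negative

Classify this review: What a waste of two hours.
Sentiment:"""
```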
Name a couple of generative algorithms.
Recurrent Neural Networks (RNNs) and Transformers
What are the highlights of the transformer architecture?
1. Scales efficiently to use multi-core GPUs, 2. Can process input data in parallel, 3. Can make use of much larger training datasets, 4. Learns to pay attention to the meaning of the words it is processing
What are the parts of the generative AI project life cycle?
1. Scope the project, 2. Select a model, 3. Adapt and align the model, 4. Integrate with the application
What is generative AI?
A subset of machine learning algorithms that find statistical patterns in massive human-generated datasets
What is a context window?
The word-count-limited span of text through which the user interacts with the model; it caps how much text the model can consider at once
What are autoencoding models good for?
Sentence-level classification tasks such as sentiment analysis, and token-level tasks such as Named Entity Recognition or word classification
What are autoregressive models suited for?
Text generation and zero-shot inference
What is a Prompt?
Text provided as input to the model
What is inference?
The act of using a model to generate an output
What does an embedding do?
Converts each token (token ID) into a vector in a high-dimensional embedding space
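A minimal NumPy sketch of an embedding as a lookup table (the vocabulary size, vector width, and random values stand in for learned weights):

```python
import numpy as np

vocab_size, d_model = 10_000, 512                  # arbitrary example sizes
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, d_model))  # learned in practice

token_ids = [3, 41, 7]                             # IDs produced by a tokenizer
vectors = embedding_table[token_ids]               # one 512-dim vector per token
print(vectors.shape)                               # (3, 512)
```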
What is self-attention?
A mechanism that learns attention weights between every pair of words in the input; strongly related word pairs get high attention weights
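A single-head NumPy sketch, omitting the learned query/key/value projections that real transformers apply first:

```python
import numpy as np

def self_attention(x):
    # x: one embedding vector per token, shape (tokens, dims).
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # pairwise relevance
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = w / w.sum(axis=-1, keepdims=True)      # softmax: attention weights
    return weights @ x                               # each token becomes a
                                                     # weighted mix of all tokens

x = np.random.default_rng(0).normal(size=(4, 8))     # 4 tokens, 8 dims each
print(self_attention(x).shape)                       # (4, 8)
```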
Types of LLMs
Autoencoding, autoregressive, and sequence-to-sequence models
What is a tokenizer and why is it needed?
It converts words to numbers (token IDs), because the model can only process numeric inputs
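A toy sketch of the idea (real tokenizers use learned sub-word schemes such as byte-pair encoding; this vocabulary is invented):

```python
vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3}

def tokenize(text):
    # Map each word to its ID, falling back to the unknown token.
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

print(tokenize("The cat sat"))   # [1, 2, 3]
```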
What is an auto-regressive model?
Decoder-only models pretrained using causal language modeling
What does the top-k config parameter do?
It limits sampling to the k highest-probability words, so the next word is chosen only from those top k candidates
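A small sketch of top-k filtering over a made-up probability distribution:

```python
import numpy as np

def top_k_filter(probs, k):
    # Zero out everything but the k most likely words, then renormalize.
    filtered = np.zeros_like(probs)
    top = np.argsort(probs)[-k:]
    filtered[top] = probs[top]
    return filtered / filtered.sum()

probs = np.array([0.50, 0.25, 0.15, 0.07, 0.03])
print(top_k_filter(probs, k=2))   # only the top 2 words stay eligible
```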
What does the temperature parameter do?
It impacts the randomness of next-word prediction: the higher the temperature, the higher the randomness.
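A sketch of where temperature enters: the logits are divided by it before the softmax (the logit values here are invented):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    z = np.asarray(logits) / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 0.5))  # sharper: more deterministic
print(softmax_with_temperature(logits, 2.0))  # flatter: more random
```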
What does the top p parameter do?
It restricts sampling to the most likely words whose combined (cumulative) probability does not exceed p
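A sketch of top-p filtering over the same kind of made-up distribution:

```python
import numpy as np

def top_p_filter(probs, p):
    # Keep the most likely words while their cumulative probability stays
    # within p (always keeping at least the single most likely word).
    order = np.argsort(probs)[::-1]
    cutoff = max(1, np.searchsorted(np.cumsum(probs[order]), p, side="right"))
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.50, 0.25, 0.15, 0.07, 0.03])
print(top_p_filter(probs, p=0.80))   # keeps 0.50 and 0.25 (cumulative 0.75)
```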
How does the transformer architecture understand context?
It learns attention maps, which contain attention weights from each word to every other word, capturing how relevant the words are to each other
What is completion?
It is the output of the model, comprising the original input (prompt) text plus the newly generated text
What does positional encoding do?
It preserves information about each word's position in the sentence, and thereby its relevance, even though the tokens are processed in parallel
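A sketch of the sinusoidal scheme from the original transformer paper (the sequence length and width here are arbitrary):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]                 # each token's position
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, 2 * (i // 2) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])             # even dims use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])             # odd dims use cosine
    return pe

# Added to the token embeddings so position survives parallel processing.
print(positional_encoding(seq_len=4, d_model=8).shape)   # (4, 8)
```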
What are large language models?
Models trained on trillions of words over weeks or months using large amounts of computing power. They exhibit emergent properties such as understanding language, breaking down complex tasks, and reasoning and problem solving.
What is denoising objective for encoder models?
Models trained with masked language modeling are tasked with predicting the masked tokens in order to reconstruct the original sentence
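An illustrative masked example (the sentence is invented):

```python
# Denoising objective: mask a token, then train the model to reconstruct it.
original = ["the", "teacher", "teaches", "the", "student"]
masked   = ["the", "teacher", "[MASK]",  "the", "student"]
target   = "teaches"   # what the model must predict for the masked position
```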
What is the base concept behind chatbots, text generation, translation between languages or into machine code, and named entity recognition or word classification?
Next word generation
Why did Recurrent Neural Networks fail?
They could not carry enough context, so they struggled with homonyms and syntactic ambiguity
What does pre-training an LLM mean?
The phase where the model builds a deep statistical representation of the language in its training data
What are autoencoder models?
They are encoder-only models pretrained using masked language modeling
What is a vector?
A token or word converted into a point in a high-dimensional space, where the angle between vectors measures the distance between words; this lets the model understand language mathematically
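A sketch of that angle-based distance as cosine similarity (the three vectors are invented, not real embeddings):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two word vectors: 1.0 means same direction.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cat = np.array([0.9, 0.1, 0.3])
dog = np.array([0.8, 0.2, 0.35])
car = np.array([0.1, 0.9, 0.7])

print(cosine_similarity(cat, dog))   # high: related meanings
print(cosine_similarity(cat, car))   # lower: unrelated meanings
```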