Introduction to Large Language Models
parameter-efficient tuning methods (PETM)
-easier and more efficient parameter tuning method -tune an LLM on your own custom data without duplicating the model -the base model itself is not altered -instead, a small number of add-on layers are tuned, which can be swapped in and out at inference time
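The card above does not name a specific method or library, but LoRA adapters via Hugging Face's peft package are one common way this idea looks in practice. The sketch below is illustrative only; the base model, target modules, and hyperparameters are assumptions, not part of the course material.

```python
# Minimal sketch of parameter-efficient tuning with LoRA adapters
# (Hugging Face `transformers` + `peft` are assumed here for illustration).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")    # base model stays frozen

config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"],
                    lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(base, config)    # wraps the base model with small add-on layers

model.print_trainable_parameters()      # only the adapter weights are trainable
# ... train on your custom data; the adapter can be saved on its own with
# model.save_pretrained("my-adapter") and swapped in or out at inference time.
```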
Pathways Language Model (PaLM)
-has 540 billion parameters -leverages the new Pathways system: efficiently trains a single model across multiple TPU v4 Pods -orchestrates distributed computation for accelerators -will handle many tasks at once, learn new tasks quickly, and reflect a better understanding of the world
What is question answering in natural language processing?
-question answering is a subfield of natural language processing -depending on the model used, the answer can be extracted directly from the text or generated from scratch Generative QA: -generates free text directly based on the context -leverages text generation models -no need for domain knowledge
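As a hedged illustration of the extractive vs. generative distinction, the sketch below uses Hugging Face transformers pipelines; the library choice, model names, and example context are assumptions, not part of the course material.

```python
# Sketch: extractive vs. generative question answering with `transformers` pipelines.
from transformers import pipeline

context = "PaLM is a 540-billion-parameter language model trained by Google."

# Extractive QA: the answer is a span copied directly out of the context.
extractive = pipeline("question-answering")
print(extractive(question="How many parameters does PaLM have?", context=context))

# Generative QA: the answer is free text produced by a text-generation model.
generative = pipeline("text2text-generation", model="google/flan-t5-small")
print(generative(f"Answer the question using the context.\n"
                 f"Context: {context}\nQuestion: How many parameters does PaLM have?"))
```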
Vertex AI Search and Conversation creates Generative AI Apps with...
...little or no coding and no prior ML experience
dialog tuned
trained to have a dialog by predicting the next response -the request is typically framed as a question to a chatbot -a specialization of instruction tuning in which requests are expected to come in the context of a back-and-forth conversation; typically works better with natural, question-like prompts
instruction tuned
trained to predict a response to the instructions given in the input
large language models
large, general-purpose language models can be pre-trained (on basic common language problems: text classification, question answering, document summarization, text generation) and then fine-tuned for specific purposes (specific problems in different fields)
chain of thought reasoning
models are better at getting the right answer when they first output text that explains the reason for the answer
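A minimal sketch of what a chain-of-thought prompt can look like next to a direct prompt; the wording and the arithmetic example are illustrative, not taken from the course.

```python
# Direct prompt: asks for the answer immediately.
direct_prompt = (
    "Q: The cafeteria had 23 apples, used 20, and bought 6 more. How many apples now?\n"
    "A:"
)

# Chain-of-thought prompt: shows one worked example with its reasoning written out,
# then asks the model to "think step by step" on the new question.
cot_prompt = (
    "Q: Roger has 5 balls and buys 2 cans of 3 balls each. How many balls now?\n"
    "A: Let's think step by step. Roger starts with 5 balls. 2 cans of 3 balls is "
    "6 balls. 5 + 6 = 11. The answer is 11.\n"
    "Q: The cafeteria had 23 apples, used 20, and bought 6 more. How many apples now?\n"
    "A: Let's think step by step."
)
# Sending cot_prompt to an LLM encourages it to write out the intermediate reasoning
# ("23 - 20 = 3, 3 + 6 = 9") before stating the final answer, which tends to improve accuracy.
```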
generic (or raw) LLM
predict the next word (technically the next token: a part of a word, the atomic unit that LLMs work with) based on the language in the training data
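To make "predict the next token" concrete, here is a small illustrative sketch using GPT-2 through Hugging Face transformers; the library and model are assumptions, chosen only because they are freely available.

```python
# Sketch: a generic LLM scores every token in its vocabulary as the possible
# continuation of the prompt, and the highest-scoring one is the prediction.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tokenizer("The cat sat on the", return_tensors="pt").input_ids
print(tokenizer.convert_ids_to_tokens(ids[0].tolist()))   # tokens, not whole words

with torch.no_grad():
    logits = model(ids).logits            # one score per vocabulary token, per position
next_id = int(logits[0, -1].argmax())     # most likely next token after the prompt
print(tokenizer.decode(next_id))          # e.g. " floor" or " mat"
```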
prompt engineering
the practice of developing and optimizing prompts to efficiently use language models for a variety of applications -incorporates domain-specific knowledge -provides examples of what is asked
tuning
the process of adapting a model to a new domain or set of custom use cases by training the model on new data. ex: we may collect training data and "tune" the LLM specifically for the legal or medical domain
What are some of the benefits of using large language models (LLMs)?
1. LLMs have many benefits, including: 1) They can generate human-quality text. 2) They can be used for a variety of tasks. 3) They can be trained on massive datasets of text and code. 4) They are constantly improved.
2. LLMs have a number of benefits, including: 1) They can generate human-quality text. 2) They can be used for many tasks, such as text summarization and code generation. 3) They can be trained on massive datasets of text, images, and code. 4) They are constantly improving.
3. LLMs have a number of benefits, including: 1) They can generate non-probabilities and human-quality text. 2) They can be used for many tasks, such as text summarization and code generation. 3) They can be trained on massive datasets of text, image, and code. 4) They are constantly improving.
1. LLMs have many benefits, including: 1) They can generate human-quality text. 2) They can be used for a variety of tasks. 3) They can be trained on massive datasets of text and code. 4) They are constantly improved.
What are some of the challenges of using LLMs? Select three options.
1. They can be expensive to train.
2. After being developed, they only change when they are fed new data.
3. They can be biased.
4. They can be used to generate harmful content.
1. They can be expensive to train 3. They can be biased. 4. They can be used to generate harmful content.
benefits of using large language models
1. a single model can be used for different tasks 2. the fine-tuning process requires minimal field data -obtains decent performance even with minimal training data -can be used for few-shot or even zero-shot scenarios 3. the performance of LLMs keeps growing as you add more data and parameters
Transformer model
1. encoding component - processes the input 2. decoding component - generates the output
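A minimal sketch of the two components, using PyTorch's built-in nn.Transformer purely for illustration; the dimensions and sequence lengths are arbitrary.

```python
# Sketch: an encoder-decoder Transformer. The encoder consumes the input sequence,
# the decoder produces the output sequence while attending to the encoder's result.
import torch
import torch.nn as nn

model = nn.Transformer(d_model=64, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)

src = torch.rand(1, 10, 64)   # input to the encoding component
tgt = torch.rand(1, 7, 64)    # output sequence being built by the decoding component
out = model(src, tgt)
print(out.shape)              # torch.Size([1, 7, 64])
```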
what are the 3 main kinds of LLM?
1. generic or raw 2. instruction tuned 3. dialog tuned
3 major features of large language models
1. large: -large training dataset -a large number of parameters: the memories and knowledge the machine learned from training, which define the skill of the model 2. general purpose: the model is sufficient to solve common problems -commonality of human languages -resource restrictions (only a few organizations have the resources to train such models) 3. pre-trained and fine-tuned
What are large language models (LLMs)?
1. Generative AI is a type of artificial intelligence (AI) that only can create new content, such as text, images, audio, and video by learning from new data and then using that knowledge to predict a discrete, supervised learning output.
2. Generative AI is a type of artificial intelligence (AI) that only can create new content, such as text, images, audio, and video by learning from new data and then using that knowledge to predict a classification output.
3. An LLM is a type of artificial intelligence (AI) that can generate human-quality text. LLMs are trained on massive datasets of text and code, and they can be used for many tasks, such as writing, translating, and coding.
3. An LLM is a type of artificial intelligence (AI) that can generate human-quality text. LLMs are trained on massive datasets of text and code, and they can be used for many tasks, such as writing, translating, and coding.
What are some of the applications of LLMs?
1. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating non-creative discrete probabilities
2. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating non-creative discrete classes
3. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating creative content
4. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating non-creative discrete predictions
5. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating non-creative discrete probabilities, classes, and predictions.
3. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating creative content
LLM Development v. Traditional Development
LLM development: -no ML expertise needed -no training examples -no need to train a model -thinks about prompt design
Traditional ML development: -yes ML expertise needed -yes training examples -yes need to train a model -yes compute time + hardware -thinks about minimizing a loss function
PaLM API & MakerSuite Simplifies Generative Development Cycle
PaLM API: a simple entry point for Google's large language models. Provides developer access to models that are optimized for use cases such as summarization, classification, and more. MakerSuite: an approachable way to start prototyping and building generative AI applications; a graphical user interface to the PaLM API. Iterate on prompts, augment your dataset with synthetic data, and tune custom models. Includes: -model training tool -model deployment tool -model monitoring tool (performance)
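A hedged sketch of what calling the PaLM API from Python can look like, assuming the google.generativeai client library and an API key obtained through MakerSuite; the model name, prompt, and parameters are illustrative.

```python
# Sketch: text generation through the PaLM API (google.generativeai client assumed).
import google.generativeai as palm

palm.configure(api_key="YOUR_MAKERSUITE_API_KEY")   # key created in MakerSuite

completion = palm.generate_text(
    model="models/text-bison-001",                  # illustrative model name
    prompt="Summarize in one sentence: large language models are ...",
    temperature=0.2,
    max_output_tokens=128,
)
print(completion.result)
```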
fine tuning
bring your own dataset and retrain the model by tuning every weight in the LLM. This requires a big training job and hosting your own fine-tuned model -expensive and not realistic in many cases
zero-shot
in machine learning, implies that a model can recognize things it has not explicitly been taught during training
few-shot
in machine learning, it refers to training a model with minimal data
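A small illustrative sketch contrasting a zero-shot prompt with a few-shot prompt for the same task; the task and example texts are made up for illustration and are not from the course.

```python
# Zero-shot: the prompt describes the task with no examples; the model relies
# entirely on what it already learned during pre-training.
zero_shot = (
    "Classify the sentiment of this review as positive or negative:\n"
    "'The battery died after two days.'\nSentiment:"
)

# Few-shot: the prompt includes a handful of labeled examples so the model can
# infer the task and the expected output format.
few_shot = (
    "Classify the sentiment of each review as positive or negative.\n"
    "Review: 'Absolutely loved it, would buy again.'\nSentiment: positive\n"
    "Review: 'Arrived broken and support never replied.'\nSentiment: negative\n"
    "Review: 'The battery died after two days.'\nSentiment:"
)
```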
prompt design
is the process of creating prompts that elicit the desired response from a language model
