Introduction to Large Language Models


parameter-efficient tuning methods (PETM)

-an easier and more efficient way to tune an LLM
-tune the LLM on your own custom data without duplicating the model
-the base model itself is not altered
-instead, a small number of add-on layers are tuned, and these can be swapped in and out at inference time
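The idea can be pictured with a toy PyTorch sketch (purely conceptual, not the API of any particular tuning library): the base model's weights are frozen and only a small add-on adapter is trained, so the adapter can be stored and swapped independently of the base model.

```python
# Conceptual parameter-efficient tuning sketch (illustrative only):
# freeze the base model, train only a small add-on adapter layer.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck layer added on top of a frozen base model."""
    def __init__(self, hidden_size: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, x):
        # Residual connection keeps the base model's behavior as the default.
        return x + self.up(torch.relu(self.down(x)))

base_model = nn.TransformerEncoderLayer(d_model=64, nhead=4)  # stand-in for the LLM
for p in base_model.parameters():
    p.requires_grad = False          # the base model itself is not altered

adapter = Adapter(hidden_size=64)    # only these weights are tuned
optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-3)

x = torch.randn(8, 10, 64)           # dummy batch of hidden states
out = adapter(base_model(x))         # frozen base + trainable add-on layer
loss = out.pow(2).mean()             # placeholder loss for the sketch
loss.backward()
optimizer.step()
```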

Pathways Language Model (PaLM)

-has 540 billion parameters
-leverages the new Pathways system: efficiently trains a single model across multiple TPU v4 Pods
-orchestrates distributed computation for accelerators
*will handle many tasks at once, learn new tasks quickly, and reflect a better understanding of the world

What is question answering in natural language processing?

-question answering is a subfield of natural language processing
-depending on the model used, the answer can be directly extracted from the text or generated from scratch

Generative QA
-generates free text directly based on the context
-it leverages text generation models
-no need for domain knowledge
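A minimal generative QA sketch, assuming the Hugging Face transformers library is installed; the flan-t5-small checkpoint and the "question: ... context: ..." prompt format are illustrative choices, not the only way to do this.

```python
# Generative QA sketch: the answer is generated as free text from the prompt
# rather than extracted as a span from the context.
from transformers import pipeline

qa = pipeline("text2text-generation", model="google/flan-t5-small")

context = "PaLM has 540 billion parameters and was trained with the Pathways system."
question = "How many parameters does PaLM have?"

answer = qa(f"question: {question} context: {context}")[0]["generated_text"]
print(answer)
```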

Vertex AI Search and Conversation creates Generative AI Apps with...

...little or no coding and no prior ML experience

dialog tuned

-trained to have a dialog by predicting the next response
-the request is typically framed as a question to a chatbot
-a specialization of instruction tuning: the exchange is expected to happen in the context of a back-and-forth conversation, and it typically works better with natural, question-like phrasing

instruction tuned

trained to predict a response to the instructions given in the input

large language models

large, general-purpose language models that can be pre-trained (on basic common language problems: text classification, question answering, document summarization, text generation) and then fine-tuned for specific problems in different fields and for specific purposes

chain of thought reasoning

models are better at getting the right answer when they first output text that explains the reason for the answer
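For example, a prompt can include a worked example whose answer spells out the reasoning, nudging the model to explain its steps before the final answer. The prompt below is a plain string and would be sent to whichever LLM you are using.

```python
# Chain-of-thought prompting sketch: the worked example shows the model the
# reasoning pattern, so it tends to explain its steps before the final answer.
cot_prompt = """Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each.
How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 balls is 6 more balls. 5 + 6 = 11.
The answer is 11.

Q: The cafeteria had 23 apples. It used 20 to make lunch and bought 6 more.
How many apples does it have now?
A:"""

# With the reasoning example in place, a typical completion works through
# 23 - 20 + 6 = 9 before stating "The answer is 9."
print(cot_prompt)
```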

generic (or raw) LLM

predicts the next word (technically the next token: a part of a word, the atomic unit that LLMs work with) based on the language in the training data
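A greedy next-token decoding sketch, assuming the Hugging Face transformers library; GPT-2 is used only as a small, freely available stand-in for a generic LLM.

```python
# Next-token prediction: repeatedly score every token in the vocabulary and
# append the most likely one to the running sequence.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(5):                             # generate five tokens, one at a time
        logits = model(input_ids).logits           # scores over the whole vocabulary
        next_id = logits[:, -1, :].argmax(dim=-1)  # greedily pick the most likely token
        input_ids = torch.cat([input_ids, next_id.unsqueeze(-1)], dim=-1)

print(tokenizer.decode(input_ids[0]))
```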

prompt engineering

-the practice of developing and optimizing prompts to efficiently use language models for a variety of applications
-can include domain-specific knowledge
-can include examples of what is being asked
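A small illustration of those two ingredients: the prompt below bundles domain-specific knowledge (a policy snippet) and an example of what is being asked. The store policy and wording are invented for the example; the string would be passed to the LLM of your choice.

```python
# Prompt engineering sketch: steer the answer with context and an example,
# without changing any model weights.
policy = (
    "Refunds are available within 30 days of purchase with a receipt; "
    "sale items are final."
)

prompt = f"""You are a customer-support assistant for a retail store.
Use only the policy below to answer.

Policy: {policy}

Example
Customer: Can I return a jacket I bought 10 days ago?
Assistant: Yes, returns are accepted within 30 days with a receipt.

Customer: Can I return a sale item?
Assistant:"""

print(prompt)
```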

tuning

the process of adapting a model to a new domain or set of custom use cases by training the model on new data. ex: we may collect training data and "tune" the LLM specifically for the legal or medical domain

What are some of the benefits of using large language models (LLMs)?

1. LLMs have many benefits, including: 1) They can generate human-quality text. 2) They can be used for a variety of tasks. 3) They can be trained on massive datasets of text and code. 4) They are constantly improved.
2. LLMs have a number of benefits, including: 1) They can generate human-quality text. 2) They can be used for many tasks, such as text summarization and code generation. 3) They can be trained on massive datasets of text, images, and code. 4) They are constantly improving.
3. LLMs have a number of benefits, including: 1) They can generate non-probabilities and human-quality text. 2) They can be used for many tasks, such as text summarization and code generation. 3) They can be trained on massive datasets of text, image, and code. 4) They are constantly improving.

1. LLMs have many benefits, including: 1) They can generate human-quality text. 2) They can be used for a variety of tasks. 3) They can be trained on massive datasets of text and code. 4) They are constantly improved.

What are some of the challenges of using LLMs? Select three options.

1. They can be expensive to train.
2. After being developed, they only change when they are fed new data.
3. They can be biased.
4. They can be used to generate harmful content.

1. They can be expensive to train.
3. They can be biased.
4. They can be used to generate harmful content.

benefits of using large language models

1. a single model can be used for different tasks
2. the fine-tuning process requires minimal field data
-obtains decent performance even with minimal training data
-can be used for few-shot or even zero-shot scenarios
3. the performance of LLMs continues to grow as you add more data and parameters

Transformer model

1. encoding component - processes the input
2. decoding component - produces the output
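A minimal sketch of that two-part structure using PyTorch's built-in nn.Transformer (sizes are illustrative): the encoder consumes the input sequence and the decoder produces the output sequence.

```python
# Encoder-decoder sketch: encoder reads the source, decoder generates the target.
import torch
import torch.nn as nn

model = nn.Transformer(d_model=64, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)

src = torch.randn(12, 1, 64)  # input sequence:  (source length, batch, d_model)
tgt = torch.randn(7, 1, 64)   # output sequence: (target length, batch, d_model)

out = model(src, tgt)         # encoding component -> decoding component
print(out.shape)              # torch.Size([7, 1, 64])
```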

what are the 3 main kinds of LLM?

1. generic or raw 2. instruction tuned 3. dialog tuned
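The same underlying request, framed for each of the three kinds of model. These are illustrative prompt strings only; no specific API is being called.

```python
# How the prompt changes across the three kinds of LLM.
generic_prompt = "The three primary colors are"        # raw LLM just continues the text

instruction_prompt = "List the three primary colors."  # instruction-tuned model follows a directive

dialog_prompt = """User: Hi! What are the three primary colors?
Assistant:"""                                          # dialog-tuned model predicts the next turn

for p in (generic_prompt, instruction_prompt, dialog_prompt):
    print(p, end="\n---\n")
```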

3 major features of large language models

1. large:
-large training dataset
-a large number of parameters: the memory and knowledge machine-learned from training, which define the skills of the model
2. general purpose: the model is sufficient to solve common problems
-because of the commonality of human languages
-because of resource restrictions
3. pre-trained and fine-tuned

What are large language models (LLMs)?

1. Generative AI is a type of artificial intelligence (AI) that only can create new content, such as text, images, audio, and video by learning from new data and then using that knowledge to predict a discrete, supervised learning output.
2. Generative AI is a type of artificial intelligence (AI) that only can create new content, such as text, images, audio, and video by learning from new data and then using that knowledge to predict a classification output.
3. An LLM is a type of artificial intelligence (AI) that can generate human-quality text. LLMs are trained on massive datasets of text and code, and they can be used for many tasks, such as writing, translating, and coding.

3. An LLM is a type of artificial intelligence (AI) that can generate human-quality text. LLMs are trained on massive datasets of text and code, and they can be used for many tasks, such as writing, translating, and coding.

What are some of the applications of LLMs?

1. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating non-creative discrete probabilities
2. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating non-creative discrete classes
3. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating creative content
4. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating non-creative discrete predictions
5. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating non-creative discrete probabilities, classes, and predictions.

3. LLMs can be used for many tasks, including: 1) Writing 2) Translating 3) Coding 4) Answering questions 5) Summarizing text 6) Generating creative content

LLM Development v. Traditional Development

LLM development:
-no ML expertise needed
-no training examples
-no need to train a model
-think about prompt design

Traditional ML development:
-ML expertise needed
-training examples needed
-need to train a model
-compute time + hardware needed
-think about minimizing a loss function

PaLM API & MakerSuite Simplify the Generative Development Cycle

PaLM API: a simple entry point for Google's LLMs. Provides developer access to models that are optimized for use cases such as summarization, classification, and more.

MakerSuite: an approachable way to start prototyping and building generative AI applications. Iterate on prompts, augment your dataset with synthetic data, and tune custom models.
-graphical user interface for the PaLM API
-model training tool
-model deployment tool
-model monitoring tool (performance)
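A hedged sketch of calling the PaLM API from Python. This assumes the google.generativeai SDK and the text-bison model name that were exposed at the time; names and signatures may have changed since, so treat it as illustrative rather than authoritative, and the API key is a placeholder.

```python
# Illustrative PaLM API call (assumed SDK surface; verify against current docs).
import google.generativeai as palm

palm.configure(api_key="YOUR_API_KEY")  # placeholder credential

completion = palm.generate_text(
    model="models/text-bison-001",      # text model suited to summarization/classification
    prompt="Summarize in one sentence: Large language models are pre-trained "
           "on huge text corpora and then tuned for specific tasks.",
)
print(completion.result)
```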

fine tuning

bring your own dataset and retrain the model by tuning every weight in the LLM. This requires a big training job and hosting your own fine-tuned model
-expensive and not realistic in many cases
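Contrast this with the parameter-efficient adapter sketch earlier: in full fine-tuning every weight stays trainable, which is why both the training job and the hosted model are large. A toy PyTorch sketch, with a stand-in model and placeholder loss:

```python
# Full fine-tuning sketch (illustrative): all parameters are updated.
import torch
import torch.nn as nn

model = nn.TransformerEncoderLayer(d_model=64, nhead=4)        # stand-in for the LLM
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)     # every weight is trainable

batch = torch.randn(8, 10, 64)     # dummy domain-specific training batch
target = torch.randn(8, 10, 64)

loss = nn.functional.mse_loss(model(batch), target)  # placeholder objective
loss.backward()
optimizer.step()                   # every weight in the model is changed
```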

zero-shot

in machine learning, implies that a model can recognize things it has not explicitly been taught during training

few-shot

in machine learning, it refers to training a model with minimal data
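The difference between the two is easy to see at the prompt level: a zero-shot prompt states the task with no examples, while a few-shot prompt adds a handful of worked examples. The review texts below are invented for illustration, and no model call is made.

```python
# Zero-shot vs. few-shot prompts for the same sentiment task.
zero_shot = """Classify the sentiment of this review as positive or negative.
Review: "The battery died after two days."
Sentiment:"""

few_shot = """Review: "Absolutely loved it, works perfectly." -> positive
Review: "Broke within a week, waste of money." -> negative
Review: "The battery died after two days." ->"""

print(zero_shot)
print(few_shot)
```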

prompt design

is the process of creating prompts that elicit the desired response from a language model

