DALL-E

DALL-E's visual reasoning ability is sufficient to solve Raven's Matrices (visual tests often administered to humans to measure intelligence).[25][26]

There have been several attempts to create open-source implementations of DALL-E.[51][52] Released in 2022 on Hugging Face's Spaces platform, Craiyon (formerly DALL-E Mini until a name change was requested by OpenAI in June 2022) is an AI model based on the original DALL-E that was trained on unfiltered data from the Internet. It attracted substantial media attention in mid-2022 after its release due to its capacity for producing humorous imagery.[53][54][55]

A concern about DALL-E 2 and similar image generation models is that they could be used to propagate deepfakes and other forms of misinformation.[32][33] To mitigate this, the software rejects prompts involving public figures and uploads containing human faces.[34] Prompts containing potentially objectionable content are blocked, and uploaded images are analyzed to detect offensive material.[35] A disadvantage of prompt-based filtering is that it is easy to bypass using alternative phrases that produce a similar output: for example, the word "blood" is filtered, but "ketchup" and "red liquid" are not.[36][35]

According to MIT Technology Review, one of OpenAI's objectives was to "give language models a better grasp of the everyday concepts that humans use to make sense of things".[16]

DALL-E 2's language understanding has limits. It is sometimes unable to distinguish "A yellow book and a red vase" from "A red book and a yellow vase", or "A panda making latte art" from "Latte art of a panda".[39] It generates images of "an astronaut riding a horse" when presented with the prompt "a horse riding an astronaut".[40] It also fails to generate the correct images in a variety of circumstances: requests involving more than three objects, negation, numbers, or connected sentences may result in mistakes, and object features may appear on the wrong object.[22] Additional limitations include handling text (which, even with legible lettering, almost invariably results in dream-like gibberish) and a limited capacity to address scientific information, such as astronomy or medical imagery.[41]

DALL-E 2's reliance on public datasets influences its results and leads to algorithmic bias in some cases, such as generating higher numbers of men than women for requests that do not mention gender.[29] DALL-E 2's training data was filtered to remove violent and sexual imagery, but this was found to increase bias in some cases, such as reducing the frequency with which women were generated.[30] OpenAI hypothesized that this may be because women were more likely to be sexualized in the training data, which caused the filter to influence the results.[30] In September 2022, OpenAI confirmed to The Verge that DALL-E invisibly inserts phrases into user prompts in order to address bias in results; for instance, "black man" and "Asian woman" are inserted into prompts that do not specify gender or race.[31]

DALL-E can generate imagery in multiple styles, including photorealistic imagery, paintings, and emoji.[1] It can "manipulate and rearrange" objects in its images,[1] and can correctly place design elements in novel compositions without explicit instruction. Thom Dunn, writing for BoingBoing, remarked that "For example, when asked to draw a daikon radish blowing its nose, sipping a latte, or riding a unicycle, DALL-E often draws the handkerchief, hands, and feet in plausible locations."[19] DALL-E showed the ability to "fill in the blanks" and infer appropriate details without specific prompts, such as adding Christmas imagery to prompts commonly associated with the celebration[20] and appropriately placed shadows to images that did not mention them.[21] Furthermore, DALL-E exhibits a broad understanding of visual and design trends.[citation needed]

DALL-E was developed and announced to the public in conjunction with CLIP (Contrastive Language-Image Pre-training).[16] CLIP is a separate model based on zero-shot learning that was trained on 400 million pairs of images with text captions scraped from the Internet.[1][16][18] Its role is to "understand and rank" DALL-E's output by predicting which caption from a list of 32,768 captions randomly selected from the dataset (of which one was the correct answer) is most appropriate for an image. This model is used to filter a larger initial list of images generated by DALL-E to select the most appropriate outputs.[10][16]

ExtremeTech stated "you can ask DALL-E for a picture of a phone or vacuum cleaner from a specified period of time, and it understands how those objects have changed".[20] Engadget also noted its unusual capacity for "understanding how telephones and other objects change over time".[21]

Given an existing image, DALL-E 2 can produce "variations" of the image as unique outputs based on the original, as well as edit the image to modify or expand upon it. DALL-E 2's "inpainting" and "outpainting" use context from an image to fill in missing areas using a medium consistent with the original, following a given prompt. For example, this can be used to insert a new subject into an image, or to expand an image beyond its original borders.[27] According to OpenAI, "Outpainting takes into account the image's existing visual elements — including shadows, reflections, and textures — to maintain the context of the original image."[28]

Most coverage of DALL-E focuses on a small subset of "surreal"[16] or "quirky"[23] outputs. DALL-E's output for "an illustration of a baby daikon radish in a tutu walking a dog" was mentioned in pieces from Input,[42] NBC,[43] Nature,[44] and other publications.[1][45][46] Its output for "an armchair in the shape of an avocado" was also widely covered.[16][24]

The Generative Pre-trained Transformer (GPT) model was initially developed by OpenAI in 2018,[11] using a Transformer architecture. The first iteration, GPT, was scaled up to produce GPT-2 in 2019;[12] in 2020 it was scaled up again to produce GPT-3, with 175 billion parameters.[13][1][14] DALL-E's model is a multimodal implementation of GPT-3[15] with 12 billion parameters[1] which "swaps text for pixels", trained on text-image pairs from the Internet.[16] DALL-E 2 uses 3.5 billion parameters, a smaller number than its predecessor.[17]

The art community has had a negative reaction to DALL-E.[48][49][50] Two arguments are typically presented. The first is that AI art is not art because it is not created by a human with intent. "The juxtaposition of AI-generated images with their own work is degrading and undermines the time and skill that goes into their art. AI-driven image generation tools have been heavily criticized by artists because they are trained on human-made art scraped from the web."[3] The second concerns copyright law and the art used to train the AI. OpenAI has not released information about which dataset(s) were used to train the models, and there is a general concern that artists' work has been used for training without permission; copyright law on this point remains unsettled.[4]

Another concern about DALL-E 2 and similar models is that they could cause technological unemployment for artists, photographers, and graphic designers due to their accuracy and popularity.[37][38]

DALL-E 2 uses a diffusion model conditioned on CLIP image embeddings, which, during inference, are generated from CLIP text embeddings by a prior model.[17]
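
In other words, generation is a two-stage pipeline: a prior maps the CLIP text embedding of the prompt to a plausible CLIP image embedding, and a diffusion decoder then synthesizes pixels conditioned on that image embedding. The sketch below wires up that data flow with tiny stand-in networks on random tensors; the real prior and decoder are large diffusion models, and the dimensions here are illustrative.

```python
# Data-flow sketch of the DALL-E 2 pipeline using toy stand-in networks:
# prompt -> CLIP text embedding -> prior -> CLIP image embedding
#        -> diffusion decoder -> image.
# The modules and shapes below are placeholders, not the real networks.
import torch
import torch.nn as nn

EMB_DIM = 512                              # illustrative CLIP embedding size
text_embedding = torch.randn(1, EMB_DIM)   # stand-in for CLIP's text encoder output

# Prior: predicts a CLIP *image* embedding from the CLIP *text* embedding.
prior = nn.Sequential(nn.Linear(EMB_DIM, EMB_DIM), nn.GELU(),
                      nn.Linear(EMB_DIM, EMB_DIM))

# Decoder: generates pixels conditioned on the predicted image embedding
# (in DALL-E 2 this is an iterative diffusion model; here, a single layer).
decoder = nn.Linear(EMB_DIM, 3 * 64 * 64)

image_embedding = prior(text_embedding)
image = decoder(image_embedding).reshape(1, 3, 64, 64)
print(image.shape)  # torch.Size([1, 3, 64, 64])
```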

DALL-E is able to produce images for a wide variety of arbitrary descriptions from various viewpoints[22] with only rare failures.[10] Mark Riedl, an associate professor at the Georgia Tech School of Interactive Computing, found that DALL-E could blend concepts (described as a key element of human creativity).[23][24]

