INTRO TO
GENERATIVE
AI
Date: 7-Nov-24
by GDG On Campus, IITP
OUR AIM
01 What is Generative AI?
02 What are LLMs? (Open-Source and Closed)
03 Intro to the Hugging Face Library
04 Practical: using an API
What is Generative AI?
01
Generative AI
• Generative AI refers to a class of artificial intelligence techniques and algorithms designed to generate new content, data, or outputs that mimic or resemble the data they were trained on.
• Instead of simply recognizing patterns in data or making predictions based on existing data, generative AI models can create entirely new content that has never been seen before.
Real-World Applications
● Video generation (e.g., Sora)
● Image generation (e.g., DALL·E)
● Text generation (e.g., ChatGPT)
● Music generation (e.g., Soundraw.io)
● Code generation (e.g., GitHub Copilot)
● AI search (e.g., Perplexity)
● Game generation
How do GANs work?
● Initialization: Two neural networks are created: a Generator (G) and a Discriminator (D). G is tasked with creating new data, like images or text, that closely resembles real data. D acts as a critic, trying to distinguish between real data (from a training dataset) and the data generated by G.
● Generator’s First Move: G takes a random noise vector as input. This noise vector contains random values and acts as the starting point for G’s creation process. Using its internal layers and learned patterns, G transforms the noise vector into a new data sample, such as a generated image.
● Discriminator’s Turn: D receives two kinds of input: real data samples from the training dataset, and the samples generated by G in the previous step. D’s job is to analyze each input and decide whether it is real data or something G cooked up. It outputs a probability score between 0 and 1: a score of 1 indicates the data is likely real, and 0 suggests it is fake. (These three steps are sketched in code below.)
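To make these steps concrete, here is a minimal PyTorch sketch of the setup, the generator’s first move, and the discriminator’s scoring. The layer sizes, the 16-dimensional noise vector, and the 784-dimensional output (a flattened 28×28 image) are illustrative assumptions, not something fixed by GANs themselves:

import torch
import torch.nn as nn

# Initialization: two small networks (sizes chosen only for illustration)
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

# Generator's first move: random noise vector -> new data sample
z = torch.randn(1, 16)   # random noise vector
fake = G(z)              # e.g. a generated 28x28 image, flattened

# Discriminator's turn: a probability in [0, 1]; near 1 = "likely real", near 0 = "fake"
print(D(fake).item())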
How do GANs work?
● The Learning Process: Now the adversarial part comes in. If D correctly identifies real data as real (score close to 1) and generated data as fake (score close to 0), both G and D are rewarded to a small degree, because both are doing their jobs well. However, the key is continuous improvement: if D consistently identifies everything correctly, it won’t learn much, so the goal is for G to eventually trick D.
● Generator’s Improvement: When D mistakenly labels G’s creation as real (score close to 1), it’s a sign that G is on the right track. In this case, G receives a significant positive update, while D receives a penalty for being fooled. This feedback helps G improve its generation process to create more realistic data.
● Discriminator’s Adaptation: Conversely, if D correctly identifies G’s fake data (score close to 0), G receives no reward and D is further strengthened in its discrimination abilities. (One full training step is sketched below.)
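Continuing the same toy setup, one adversarial training step might look like the sketch below. The random "real" batch stands in for actual training data; everything else follows the reward/penalty logic described above:

import torch
import torch.nn as nn

# Same toy networks as before (sizes are illustrative only)
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()
real = torch.rand(8, 784)  # stand-in for a batch of real training data

# Discriminator's turn: push D(real) toward 1 and D(fake) toward 0
fake = G(torch.randn(8, 16)).detach()  # detach: don't update G on D's turn
loss_D = bce(D(real), torch.ones(8, 1)) + bce(D(fake), torch.zeros(8, 1))
opt_D.zero_grad(); loss_D.backward(); opt_D.step()

# Generator's improvement: G is rewarded when D scores its fakes as real (near 1)
fake = G(torch.randn(8, 16))
loss_G = bce(D(fake), torch.ones(8, 1))  # the "fool D" objective
opt_G.zero_grad(); loss_G.backward(); opt_G.step()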
What are LLMs?
02
Large Language Model
● A Large Language Model (LLM) is a powerful artificial intelligence model that uses deep learning techniques, typically based on neural networks, to understand and generate human-like text. LLMs are trained on vast amounts of text data and learn to capture the complex patterns and structures of natural language.
● They are capable of performing a variety of language-related tasks, such as text generation, summarization, translation, question answering, and more.
● LLMs have significantly advanced the field of natural language processing (NLP) and have applications in various domains, including content generation, virtual assistants, chatbots, sentiment analysis, and information retrieval.
How they work:
• Encoder-Only Model: An encoder-only model, also known as an encoder or feature extractor, takes an input sequence and produces a fixed-length representation (embedding) that captures the relevant information of the input. It’s commonly used in tasks where understanding the input sequence is essential but generating an output sequence isn’t necessary, for example text classification, NER, etc. Model Example: BERT
• Decoder-Only Model: A decoder-only model generates an output sequence based on a given input or context. It’s typically used in tasks where generating coherent and meaningful output sequences is the primary objective, for example text generation, next-word prediction, etc. Model Example: GPT
• Encoder-Decoder Model: An encoder-decoder model consists of both an encoder and a decoder: the encoder processes the input sequence and generates a representation, which the decoder then uses to generate an output sequence. It’s commonly used in tasks where there’s a clear mapping between an input sequence and an output sequence, such as machine translation. Model Example: T5 (all three are sketched in code below)
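A short sketch of all three architectures, using small public checkpoints from the Hugging Face Hub. The specific checkpoints (bert-base-uncased, gpt2, t5-small) and prompts are our picks for illustration, not ones prescribed by the slides:

from transformers import (AutoTokenizer, AutoModel,
                          AutoModelForCausalLM, AutoModelForSeq2SeqLM)

# Encoder-only (BERT): input sequence -> embedding that represents it
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
emb = encoder(**tok("GDG workshop", return_tensors="pt")).last_hidden_state
print(emb.shape)  # (1, num_tokens, 768): one vector per token

# Decoder-only (GPT-2): context -> continuation, one token at a time
tok = AutoTokenizer.from_pretrained("gpt2")
decoder = AutoModelForCausalLM.from_pretrained("gpt2")
ids = tok("Generative AI is", return_tensors="pt")
print(tok.decode(decoder.generate(**ids, max_new_tokens=10)[0]))

# Encoder-decoder (T5): input sequence -> output sequence (e.g. translation)
tok = AutoTokenizer.from_pretrained("t5-small")
seq2seq = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
ids = tok("translate English to German: Hello, world!", return_tensors="pt")
print(tok.decode(seq2seq.generate(**ids, max_new_tokens=20)[0], skip_special_tokens=True))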
Open-Source vs Closed-Source
● OPEN SOURCE: The code, model weights, and often the training data are available to the public, allowing anyone to access, inspect, and modify the model. Examples include models like Meta’s LLaMA and BLOOM. Users can fine-tune, retrain, and adapt open-source models to their specific needs, allowing greater flexibility. The Hugging Face Hub hosts a vast collection of transformer-based models that can be easily fine-tuned or used for specific applications (a minimal loading sketch follows below).
● CLOSED SOURCE: The model code, weights, and data are proprietary and not available to the public. Users can interact with the model through a paid API or hosted platform but have no direct access to the underlying architecture or weights. Examples: GPT-4, Gemini, Claude, etc.
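Because open weights are public, you can download and run them locally. A minimal sketch, assuming the transformers library and BLOOM’s smallest public checkpoint (bigscience/bloom-560m, chosen here only because it is small):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Open weights download straight from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

inputs = tokenizer("Open-source models let you", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=25)
print(tokenizer.decode(output[0], skip_special_tokens=True))

A closed model, by contrast, is reachable only through its API, as in the Gemini example in section 04.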
HUGGING FACE
03
Notebook Link:
https://colab.research.google.com/drive/1vh-6X6g7JBk9_-A1Jj-4RWBEgvSGnKMu?usp=sharing
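The notebook covers this hands-on; as a taste, the pipeline() helper is the quickest entry point into the library. The model choices and the example output below are illustrative, not from the notebook:

from transformers import pipeline

# Encoder-only model behind the scenes: text classification
classifier = pipeline("sentiment-analysis")
print(classifier("Generative AI workshops are fun!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.999...}]

# Decoder-only model: open-ended text generation
generator = pipeline("text-generation", model="gpt2")
print(generator("Generative AI is", max_new_tokens=20)[0]["generated_text"])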
USING GEMINI API
04
Notebook Link:
https://colab.research.google.com/drive/1FA24GCsyvRm_77j_iYhm1zAWowiM5whU?usp=sharing
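The notebook walks through this in full; here is a minimal sketch of a Gemini call, assuming the google-generativeai package and an API key from Google AI Studio (the model name reflects what was current in late 2024):

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder: paste your own key
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content("Explain GANs in two sentences.")
print(response.text)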
TRY OUT NOW:
• A text summarizer bot using both an open and a closed LLM (a starter sketch follows below)
• An image captioning AI tool using the Gemini API
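A possible starting point for the summarizer exercise, pairing an open model with a closed one. The summarization checkpoint, the Gemini model name, and the placeholder input are our assumptions; swap in your own text and key:

from transformers import pipeline            # open-source route
import google.generativeai as genai          # closed-source route

ARTICLE = "Paste the text you want to summarize here..."  # placeholder input

# Open: a distilled BART checkpoint trained for summarization
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
print(summarizer(ARTICLE, max_length=60, min_length=10)[0]["summary_text"])

# Closed: the same task through the Gemini API
genai.configure(api_key="YOUR_API_KEY")
gemini = genai.GenerativeModel("gemini-1.5-flash")
print(gemini.generate_content("Summarize in one sentence: " + ARTICLE).text)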
THANKS FOR TUNING IN!!
Feel free to ask your doubts!
