INTRO TO
GENERATIVE
AI
Date: 7-Nov-24
by GDG On Campus, IITP
OUR AIM
01 What is Generative AI?
02 What are LLMs? (Open-Source and Closed)
03 Intro to the Hugging Face Library
04 Practical: using an API
What is Generative AI?
01
Generative AI
• Generative AI refers to a class of artificial intelligence techniques and algorithms designed to generate new content, data, or outputs that mimic or resemble the data they were trained on.
• Instead of simply recognizing patterns in data or making predictions based on existing data, generative AI models can create entirely new content that has never been seen before.
Real-World Applications
● Video generation (e.g., Sora)
● Image generation (e.g., DALL·E)
● Text generation (e.g., ChatGPT)
● Music generation (e.g., Soundraw.io)
● Code generation (e.g., GitHub Copilot)
● AI search (e.g., Perplexity)
● Game generation
How do GANs work?
● Initialization: Two neural networks are created: a Generator (G) and a Discriminator (D). G is tasked with creating new data, like images or text, that closely resembles real data. D acts as a critic, trying to distinguish between real data (from a training dataset) and the data generated by G.
● Generator’s First Move: G takes a random noise vector as input. This noise vector contains random values and acts as the starting point for G’s creation process. Using its internal layers and learned patterns, G transforms the noise vector into a new data sample, such as a generated image.
● Discriminator’s Turn: D receives two kinds of input: real data samples from the training dataset, and the samples generated by G in the previous step. D’s job is to analyze each input and decide whether it is real data or something G cooked up. It outputs a probability score between 0 and 1: a score of 1 indicates the data is likely real, and 0 suggests it is fake. (These three steps are sketched in code below.)
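To make these steps concrete, here is a minimal PyTorch sketch of the setup, the generator’s first move, and the discriminator’s scoring. The layer sizes, the 16-dimensional noise vector, and the 784-dimensional output (a flattened 28×28 image) are illustrative assumptions, not something fixed by GANs themselves:

import torch
import torch.nn as nn

# Initialization: two small networks (sizes chosen only for illustration)
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

# Generator's first move: random noise vector -> new data sample
z = torch.randn(1, 16)   # random noise vector
fake = G(z)              # e.g. a generated 28x28 image, flattened

# Discriminator's turn: a probability in [0, 1]; near 1 = "likely real", near 0 = "fake"
print(D(fake).item())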
How do GANs work?
● The Learning Process: Now the adversarial part comes in. If D correctly identifies real data as real (score close to 1) and generated data as fake (score close to 0), both G and D are rewarded to a small degree, because both are doing their jobs well. However, the key is continuous improvement: if D consistently identifies everything correctly, it won’t learn much, so the goal is for G to eventually trick D.
● Generator’s Improvement: When D mistakenly labels G’s creation as real (score close to 1), it’s a sign that G is on the right track. In this case, G receives a significant positive update, while D receives a penalty for being fooled. This feedback helps G improve its generation process to create more realistic data.
● Discriminator’s Adaptation: Conversely, if D correctly identifies G’s fake data (score close to 0), G receives no reward and D is further strengthened in its discrimination abilities. (One full training step is sketched below.)
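Continuing the same toy setup, one adversarial training step might look like the sketch below. The random "real" batch stands in for actual training data; everything else follows the reward/penalty logic described above:

import torch
import torch.nn as nn

# Same toy networks as before (sizes are illustrative only)
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()
real = torch.rand(8, 784)  # stand-in for a batch of real training data

# Discriminator's turn: push D(real) toward 1 and D(fake) toward 0
fake = G(torch.randn(8, 16)).detach()  # detach: don't update G on D's turn
loss_D = bce(D(real), torch.ones(8, 1)) + bce(D(fake), torch.zeros(8, 1))
opt_D.zero_grad(); loss_D.backward(); opt_D.step()

# Generator's improvement: G is rewarded when D scores its fakes as real (near 1)
fake = G(torch.randn(8, 16))
loss_G = bce(D(fake), torch.ones(8, 1))  # the "fool D" objective
opt_G.zero_grad(); loss_G.backward(); opt_G.step()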
What are LLMs?
02
Large Language Model
● A Large Language Model (LLM) is a powerful artificial intelligence model that uses deep learning techniques, typically based on neural networks, to understand and generate human-like text. LLMs are trained on vast amounts of text data and learn to capture the complex patterns and structures of natural language.
● They are capable of performing a variety of language-related tasks, such as text generation, summarization, translation, question answering, and more.
● LLMs have significantly advanced the field of natural language processing (NLP) and have applications in various domains, including content generation, virtual assistants, chatbots, sentiment analysis, and information retrieval.
How they work:
• Encoder-Only Model: An encoder-only model, also known as an encoder or feature extractor, takes an input sequence and produces a fixed-length representation (embedding) that captures the relevant information of the input. It’s commonly used in tasks where understanding the input sequence is essential but generating an output sequence isn’t necessary, for example text classification, NER, etc. Model Example: BERT
• Decoder-Only Model: A decoder-only model generates an output sequence based on a given input or context. It’s typically used in tasks where generating coherent and meaningful output sequences is the primary objective, for example text generation, next-word prediction, etc. Model Example: GPT
• Encoder-Decoder Model: An encoder-decoder model consists of both an encoder and a decoder: the encoder processes the input sequence and generates a representation, which the decoder then uses to generate an output sequence. It’s commonly used in tasks where there’s a clear mapping between an input sequence and an output sequence, such as machine translation. Model Example: T5 (all three are sketched in code below)
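A short sketch of all three architectures, using small public checkpoints from the Hugging Face Hub. The specific checkpoints (bert-base-uncased, gpt2, t5-small) and prompts are our picks for illustration, not ones prescribed by the slides:

from transformers import (AutoTokenizer, AutoModel,
                          AutoModelForCausalLM, AutoModelForSeq2SeqLM)

# Encoder-only (BERT): input sequence -> embedding that represents it
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
emb = encoder(**tok("GDG workshop", return_tensors="pt")).last_hidden_state
print(emb.shape)  # (1, num_tokens, 768): one vector per token

# Decoder-only (GPT-2): context -> continuation, one token at a time
tok = AutoTokenizer.from_pretrained("gpt2")
decoder = AutoModelForCausalLM.from_pretrained("gpt2")
ids = tok("Generative AI is", return_tensors="pt")
print(tok.decode(decoder.generate(**ids, max_new_tokens=10)[0]))

# Encoder-decoder (T5): input sequence -> output sequence (e.g. translation)
tok = AutoTokenizer.from_pretrained("t5-small")
seq2seq = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
ids = tok("translate English to German: Hello, world!", return_tensors="pt")
print(tok.decode(seq2seq.generate(**ids, max_new_tokens=20)[0], skip_special_tokens=True))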
Open-Source vs Closed-Source
● OPEN SOURCE: The code, model weights, and often the training data are available to the public, allowing anyone to access, inspect, and modify the model. Examples include models like Meta’s LLaMA and BLOOM. Users can fine-tune, retrain, and adapt open-source models to their specific needs, allowing greater flexibility. The Hugging Face Hub hosts a vast collection of transformer-based models that can be easily fine-tuned or used for specific applications (a minimal loading sketch follows below).
● CLOSED SOURCE: The model code, weights, and data are proprietary and not available to the public. Users can interact with the model through a paid API or hosted platform but have no direct access to the underlying architecture or weights. Examples: GPT-4, Gemini, Claude, etc.
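Because open weights are public, you can download and run them locally. A minimal sketch, assuming the transformers library and BLOOM’s smallest public checkpoint (bigscience/bloom-560m, chosen here only because it is small):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Open weights download straight from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

inputs = tokenizer("Open-source models let you", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=25)
print(tokenizer.decode(output[0], skip_special_tokens=True))

A closed model, by contrast, is reachable only through its API, as in the Gemini example in section 04.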
HUGGING FACE
03
Notebook Link:
https://colab.research.google.com/drive/1vh-6X6g7JBk9_-A1Jj-4RWBEgvSGnKMu?usp=sharing
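The notebook covers this hands-on; as a taste, the pipeline() helper is the quickest entry point into the library. The model choices and the example output below are illustrative, not from the notebook:

from transformers import pipeline

# Encoder-only model behind the scenes: text classification
classifier = pipeline("sentiment-analysis")
print(classifier("Generative AI workshops are fun!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.999...}]

# Decoder-only model: open-ended text generation
generator = pipeline("text-generation", model="gpt2")
print(generator("Generative AI is", max_new_tokens=20)[0]["generated_text"])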
USING GEMINI API
04
Notebook Link:
https://colab.research.google.com/drive/1FA24GCsyvRm_77j_iYhm1zAWowiM5whU?usp=sharing
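The notebook walks through this in full; here is a minimal sketch of a Gemini call, assuming the google-generativeai package and an API key from Google AI Studio (the model name reflects what was current in late 2024):

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder: paste your own key
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content("Explain GANs in two sentences.")
print(response.text)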
TRY OUT NOW:
• A text summarizer bot using both an open and a closed LLM (a starter sketch follows below)
• An image captioning AI tool using the Gemini API
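A possible starting point for the summarizer exercise, pairing an open model with a closed one. The summarization checkpoint, the Gemini model name, and the placeholder input are our assumptions; swap in your own text and key:

from transformers import pipeline            # open-source route
import google.generativeai as genai          # closed-source route

ARTICLE = "Paste the text you want to summarize here..."  # placeholder input

# Open: a distilled BART checkpoint trained for summarization
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
print(summarizer(ARTICLE, max_length=60, min_length=10)[0]["summary_text"])

# Closed: the same task through the Gemini API
genai.configure(api_key="YOUR_API_KEY")
gemini = genai.GenerativeModel("gemini-1.5-flash")
print(gemini.generate_content("Summarize in one sentence: " + ARTICLE).text)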
THANKS FOR TUNING IN!!
Feel free to ask your doubts!
