Bhashini Tools
BhashaDaan & ULCA
Agenda
● Bhashini Mission
● Need for Digital Infrastructure
● NLTM Architecture
● Datasets & Models
● BhashaDaan
● ULCA
● Contributing Datasets & Models to ULCA
● Roadmap
Bhashini Mission Statement
Create a knowledge-based society by transcending language barriers;
provide content and services to citizens in their own language.
Digital Infrastructure
● Educational eBooks
● Digital Web Content
● Tele Services
● Communication Services
● Knowledge Base
● Search
● Datasets
● AI/ML Models
NLTM Architecture
In simple words…
● Contributors of Datasets
● Development of AI Models
● Development of End user Applications
Datasets & Contributors
Contributors
● Public
- Crowdsourced
- Free
● Dedicated Teams
- Language Experts
- Specific Tasks
- Paid
Datasets
● Parallel
● Monolingual
● ASR
● TTS
● OCR & more…
Data Collection
Help to build an open repository of data to digitally enrich your language
ASR Datasets
TTS Datasets
Parallel Datasets
OCR Datasets
BhashaDaan - A short video
AI Models
Task Types
● Translation
● ASR
● TTS
● Transliteration
● OCR
● and more…
Contributors
● EkStep
● AI4Bharat
● IITs
● IIITs
● CDAC
● and more…
Models
● IndicTrans
● Vakyansh
● IndicXlit
● IndicTTS
● Anuvaad
● and more…
ULCA
ULCA stands for Universal Language Contribution APIs.
ULCA is a standard API and an open, scalable data platform (supporting
various types of datasets) for Indian language datasets and models.
World's largest Indic language data and models platform for open AI
innovation.
ULCA - Components
Open and scalable data platform
● Parallel text corpus in two or more languages
● Monolingual text corpus
● Automatic Speech Recognition (ASR) corpus
● Text to Speech (TTS) corpus
● Optical Character Recognition (OCR) corpus
● Natural Language Understanding (NLU) datasets
Inclusive Indian language Models
● Machine Translation (MT)
● Automatic Speech Recognition (ASR)
● Text to Speech (TTS)
● Optical Character Recognition (OCR)
● Transliteration
Automated Transparent Benchmarking
● Large, diverse and task-specific benchmarks
● Research community approved metric system
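To make the "parallel text corpus" dataset type above more concrete, here is a minimal Python sketch of how a contributor might clean sentence pairs before submitting them. The `ParallelPair` class, its field names, and the filtering rules are assumptions made for illustration; they are not the ULCA schema or its validation logic.

```python
from dataclasses import dataclass


# Hypothetical record layout for one parallel-corpus entry.
# Field names are illustrative only, not the ULCA schema.
@dataclass(frozen=True)
class ParallelPair:
    source_lang: str
    target_lang: str
    source_text: str
    target_text: str


def is_valid(pair: ParallelPair) -> bool:
    """Accept a pair only if both sides are non-empty and the languages differ."""
    return (
        pair.source_lang != pair.target_lang
        and bool(pair.source_text.strip())
        and bool(pair.target_text.strip())
    )


def deduplicate(pairs):
    """Drop invalid pairs and exact duplicates, preserving input order."""
    seen = set()
    kept = []
    for pair in pairs:
        if is_valid(pair) and pair not in seen:
            seen.add(pair)
            kept.append(pair)
    return kept


if __name__ == "__main__":
    sample = [
        ParallelPair("en", "hi", "Hello", "नमस्ते"),
        ParallelPair("en", "hi", "Hello", "नमस्ते"),  # exact duplicate
        ParallelPair("en", "hi", "", "खाली"),          # empty source side
    ]
    print(len(deduplicate(sample)))
```

A frozen dataclass is used so that records are hashable and exact duplicates can be detected with a set.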
ULCA - Current Status
Datasets
● 215 Million Parallel sentences in 13 languages
● 14k Hours of Audio recording in 14 languages
● 2.5 Million Images for OCR in 12 languages
● 10 Million Transliteration pairs in 19 languages
Models
● 240 State of the Art Models in 21 Indian languages across Translation, Speech (ASR/TTS), OCR & Transliteration
Benchmarks
● 135 Open Benchmarks across Translation, ASR & Transliteration in 20 Indian languages
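The slides do not say which metrics the ASR benchmarks report. Word error rate (WER) is a widely used ASR metric, so the sketch below assumes it purely as an illustration of what an automated benchmark score could look like. The function name and the whitespace tokenization are assumptions, not part of ULCA.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Lower values are better, and a value above 1.0 is possible when the hypothesis contains many inserted words.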
ULCA- Actions
Datasets
● Submission
● My Contribution
● Search & Download
● My Searches
Models
● Submission
● My Contribution
● Explore Models
● Try Model
Benchmarking
● Metrics
● Benchmark Dataset
● Explore Models
● Try Model
● Model Feedback
● Model Leaderboard
Contributing Datasets to ULCA
Contributing Models to ULCA
ULCA - Language AI Models Demo
ULCA - Roadmap
Datasets
POS, NER
Multi-lingual Multi-speaker
Mobile APK
Models
POS, NER
Benchmark
OCR Benchmark dataset
User Analytics
Readymade Datasets (Ex: En-Hi Legal)
Realtime Inference for Models
ULCA - Roadmap (Contd.)
ULCA
Automated Ingestion of verified contents from external sources to ULCA
Thank you!
Questions?

