SlideShare a Scribd company logo

HarshithAkkapelli_Presentation.pdf

Data Mining Presentation

1 of 16
Download to read offline
Escape the Big Data Trap: LLMs Liberate Recommenders via
Training-Free Condensation
CMPE 255 - Harshith Akkapelli
Harshith Akkapelli
Course Instructor: Vijay Eranti
SAN JOSE STATE UNVERSITY
Why Training Data Condensation?
Recommendation is a crucial part in today’s large scale systems which really improves
user experience. These days the amount of data is extremely huge, especially text data.
Training the recommendation system with huge data is an extremely cumbersome
process. Therefore condensing the large dataset to a valuable small subset of data is
extremely crucial.
Existing Solutions and their Problems
They are iterative meaning the
condensed dataset have to be
iterated over time to get better
performance
They do not capture the
relationship between users and
items
They cannot generate new text
data which is non continuous
1.
2.
3.
New Approach
It is not an iterative approach
rather a one way or forward process
It is primarily for textual data
which is discrete data
It makes sure to capture the
relationships between user and
items/products
1.
2.
3.
Why LLM’s?
Firstly they mainly focused on textual data as there are no existing methods that deal with it and given
the fact that text data is extremely important in Recommendation systems. Another crucial aspect is
maintaining the relationship between user and item.
Architecture
User Level
Content Level
Two Levels:

Recommended

Multidirectional Product Support System for Decision Making In Textile Indust...
Multidirectional Product Support System for Decision Making In Textile Indust...Multidirectional Product Support System for Decision Making In Textile Indust...
Multidirectional Product Support System for Decision Making In Textile Indust...IOSR Journals
 
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SDiscovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SIRJET Journal
 
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )SBGC
 
Machine learning with Big Data power point presentation
Machine learning with Big Data power point presentationMachine learning with Big Data power point presentation
Machine learning with Big Data power point presentationDavid Raj Kanthi
 
Machine Learning On Big Data: Opportunities And Challenges- Future Research D...
Machine Learning On Big Data: Opportunities And Challenges- Future Research D...Machine Learning On Big Data: Opportunities And Challenges- Future Research D...
Machine Learning On Big Data: Opportunities And Challenges- Future Research D...PhD Assistance
 
Change Management: The Secret to a Successful SAS® Implementation
Change Management:  The Secret to a Successful SAS® ImplementationChange Management:  The Secret to a Successful SAS® Implementation
Change Management: The Secret to a Successful SAS® ImplementationThotWave
 
An Improvised Fuzzy Preference Tree Of CRS For E-Services Using Incremental A...
An Improvised Fuzzy Preference Tree Of CRS For E-Services Using Incremental A...An Improvised Fuzzy Preference Tree Of CRS For E-Services Using Incremental A...
An Improvised Fuzzy Preference Tree Of CRS For E-Services Using Incremental A...IJTET Journal
 

More Related Content

Similar to HarshithAkkapelli_Presentation.pdf

76201910
7620191076201910
76201910IJRAT
 
SentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdfSentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdfDevinSohi
 
Smart Data for Smart Labs
Smart Data for Smart Labs Smart Data for Smart Labs
Smart Data for Smart Labs OSTHUS
 
Summary of: "Automating Data Preparation: Can We? Should We? Must We?"
Summary of: "Automating Data Preparation: Can We? Should We? Must We?"Summary of: "Automating Data Preparation: Can We? Should We? Must We?"
Summary of: "Automating Data Preparation: Can We? Should We? Must We?"SamueleBertollo1
 
Multi Objective Optimization Of Environmental And Energy...
Multi Objective Optimization Of Environmental And Energy...Multi Objective Optimization Of Environmental And Energy...
Multi Objective Optimization Of Environmental And Energy...Julie Webster
 
Final Report
Final ReportFinal Report
Final Reportimu409
 
Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach IJCSIS Research Publications
 
Ieee transactions on 2018 knowledge and data engineering topics with abstract .
Ieee transactions on 2018 knowledge and data engineering topics with abstract .Ieee transactions on 2018 knowledge and data engineering topics with abstract .
Ieee transactions on 2018 knowledge and data engineering topics with abstract .tsysglobalsolutions
 
copy for Gary Chin.
copy for Gary Chin.copy for Gary Chin.
copy for Gary Chin.Teng Xiaolu
 
Crm Hicss 2003 Presentation
Crm Hicss 2003 PresentationCrm Hicss 2003 Presentation
Crm Hicss 2003 PresentationRam Srivastava
 
Business Case for leveraging Machine Learning (ML) to Validate Data Lake.pdf
Business Case for leveraging Machine Learning (ML) to Validate Data Lake.pdfBusiness Case for leveraging Machine Learning (ML) to Validate Data Lake.pdf
Business Case for leveraging Machine Learning (ML) to Validate Data Lake.pdfarifulislam946965
 
Applying systemic methodologies to bridge the gap between a process-oriented ...
Applying systemic methodologies to bridge the gap between a process-oriented ...Applying systemic methodologies to bridge the gap between a process-oriented ...
Applying systemic methodologies to bridge the gap between a process-oriented ...Panagiotis Papaioannou
 
Query aware determinization of uncertain
Query aware determinization of uncertainQuery aware determinization of uncertain
Query aware determinization of uncertainjpstudcorner
 
IEEE Projects 2015 | Query aware determinization of uncertain objects
IEEE Projects 2015 | Query aware determinization of uncertain objectsIEEE Projects 2015 | Query aware determinization of uncertain objects
IEEE Projects 2015 | Query aware determinization of uncertain objects1crore projects
 
Survey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug TriageSurvey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug TriageIRJET Journal
 
Algorithm ExampleFor the following taskUse the random module .docx
Algorithm ExampleFor the following taskUse the random module .docxAlgorithm ExampleFor the following taskUse the random module .docx
Algorithm ExampleFor the following taskUse the random module .docxdaniahendric
 
Cooks Emporium Is A Small Retail Business Operating Out Essay
Cooks Emporium Is A Small Retail Business Operating Out EssayCooks Emporium Is A Small Retail Business Operating Out Essay
Cooks Emporium Is A Small Retail Business Operating Out EssayKelly Byers
 
Advantages And Differences Between Text Mining And Text...
Advantages And Differences Between Text Mining And Text...Advantages And Differences Between Text Mining And Text...
Advantages And Differences Between Text Mining And Text...Sarah Davis
 

Similar to HarshithAkkapelli_Presentation.pdf (20)

76201910
7620191076201910
76201910
 
-linkedin
-linkedin-linkedin
-linkedin
 
SentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdfSentimentAnalysisofTwitterProductReviewsDocument.pdf
SentimentAnalysisofTwitterProductReviewsDocument.pdf
 
Smart Data for Smart Labs
Smart Data for Smart Labs Smart Data for Smart Labs
Smart Data for Smart Labs
 
ST_Short_Story.pptx
ST_Short_Story.pptxST_Short_Story.pptx
ST_Short_Story.pptx
 
Summary of: "Automating Data Preparation: Can We? Should We? Must We?"
Summary of: "Automating Data Preparation: Can We? Should We? Must We?"Summary of: "Automating Data Preparation: Can We? Should We? Must We?"
Summary of: "Automating Data Preparation: Can We? Should We? Must We?"
 
Multi Objective Optimization Of Environmental And Energy...
Multi Objective Optimization Of Environmental And Energy...Multi Objective Optimization Of Environmental And Energy...
Multi Objective Optimization Of Environmental And Energy...
 
Final Report
Final ReportFinal Report
Final Report
 
Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach
 
Ieee transactions on 2018 knowledge and data engineering topics with abstract .
Ieee transactions on 2018 knowledge and data engineering topics with abstract .Ieee transactions on 2018 knowledge and data engineering topics with abstract .
Ieee transactions on 2018 knowledge and data engineering topics with abstract .
 
copy for Gary Chin.
copy for Gary Chin.copy for Gary Chin.
copy for Gary Chin.
 
Crm Hicss 2003 Presentation
Crm Hicss 2003 PresentationCrm Hicss 2003 Presentation
Crm Hicss 2003 Presentation
 
Business Case for leveraging Machine Learning (ML) to Validate Data Lake.pdf
Business Case for leveraging Machine Learning (ML) to Validate Data Lake.pdfBusiness Case for leveraging Machine Learning (ML) to Validate Data Lake.pdf
Business Case for leveraging Machine Learning (ML) to Validate Data Lake.pdf
 
Applying systemic methodologies to bridge the gap between a process-oriented ...
Applying systemic methodologies to bridge the gap between a process-oriented ...Applying systemic methodologies to bridge the gap between a process-oriented ...
Applying systemic methodologies to bridge the gap between a process-oriented ...
 
Query aware determinization of uncertain
Query aware determinization of uncertainQuery aware determinization of uncertain
Query aware determinization of uncertain
 
IEEE Projects 2015 | Query aware determinization of uncertain objects
IEEE Projects 2015 | Query aware determinization of uncertain objectsIEEE Projects 2015 | Query aware determinization of uncertain objects
IEEE Projects 2015 | Query aware determinization of uncertain objects
 
Survey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug TriageSurvey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug Triage
 
Algorithm ExampleFor the following taskUse the random module .docx
Algorithm ExampleFor the following taskUse the random module .docxAlgorithm ExampleFor the following taskUse the random module .docx
Algorithm ExampleFor the following taskUse the random module .docx
 
Cooks Emporium Is A Small Retail Business Operating Out Essay
Cooks Emporium Is A Small Retail Business Operating Out EssayCooks Emporium Is A Small Retail Business Operating Out Essay
Cooks Emporium Is A Small Retail Business Operating Out Essay
 
Advantages And Differences Between Text Mining And Text...
Advantages And Differences Between Text Mining And Text...Advantages And Differences Between Text Mining And Text...
Advantages And Differences Between Text Mining And Text...
 

Recently uploaded

Self scaling Multi cloud nomad workloads
Self scaling Multi cloud nomad workloadsSelf scaling Multi cloud nomad workloads
Self scaling Multi cloud nomad workloadsBram Vogelaar
 
killing camp week 6 problem - maximal matrix.pdf
killing camp week 6 problem - maximal matrix.pdfkilling camp week 6 problem - maximal matrix.pdf
killing camp week 6 problem - maximal matrix.pdfssuser82c38d
 
Getting Started with Trello for Beginners.pptx
Getting Started with Trello for Beginners.pptxGetting Started with Trello for Beginners.pptx
Getting Started with Trello for Beginners.pptxmavinoikein
 
sql ppt for students who preparing for sql
sql ppt for students who preparing for sqlsql ppt for students who preparing for sql
sql ppt for students who preparing for sqlbharatjanadharwarud
 
Steps to Build a PWA with Odoo.pdf
Steps to Build a PWA with Odoo.pdfSteps to Build a PWA with Odoo.pdf
Steps to Build a PWA with Odoo.pdfayushinwizards
 
OpenChain AI Study Group - North America and Europe - 2024-02-20
OpenChain AI Study Group - North America and Europe - 2024-02-20OpenChain AI Study Group - North America and Europe - 2024-02-20
OpenChain AI Study Group - North America and Europe - 2024-02-20Shane Coughlan
 
The Age of AI: Elevating Experiences & Delivering Customer Value!
The Age of AI: Elevating Experiences & Delivering Customer Value!The Age of AI: Elevating Experiences & Delivering Customer Value!
The Age of AI: Elevating Experiences & Delivering Customer Value!ISPMAIndia
 
maximum subarray ppt for killing camp students
maximum subarray ppt for killing camp studentsmaximum subarray ppt for killing camp students
maximum subarray ppt for killing camp studentsssuser82c38d
 
AUTOKEYUNLOCKER-BRANDS-SUPPORT-STANDARD-VERSION.pdf
AUTOKEYUNLOCKER-BRANDS-SUPPORT-STANDARD-VERSION.pdfAUTOKEYUNLOCKER-BRANDS-SUPPORT-STANDARD-VERSION.pdf
AUTOKEYUNLOCKER-BRANDS-SUPPORT-STANDARD-VERSION.pdfAutokey
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkTimothy Spann
 
Embracing Change - The Impact of Generative AI on Strategic Portfolio Management
Embracing Change - The Impact of Generative AI on Strategic Portfolio ManagementEmbracing Change - The Impact of Generative AI on Strategic Portfolio Management
Embracing Change - The Impact of Generative AI on Strategic Portfolio ManagementOnePlan Solutions
 
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ..."Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...ISPMAIndia
 
Manual de la Mezcladora SoundCraft Notepad -12Fx
Manual de la Mezcladora SoundCraft Notepad -12FxManual de la Mezcladora SoundCraft Notepad -12Fx
Manual de la Mezcladora SoundCraft Notepad -12Fxjavierdavidvelasco17
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
 
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A..."Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...ISPMAIndia
 
The Game-Changer_ How Software Development Outsource Can Catapult Your Growth...
The Game-Changer_ How Software Development Outsource Can Catapult Your Growth...The Game-Changer_ How Software Development Outsource Can Catapult Your Growth...
The Game-Changer_ How Software Development Outsource Can Catapult Your Growth...emili denli
 
Sql server types of joins with example.pptx
Sql server types of joins with example.pptxSql server types of joins with example.pptx
Sql server types of joins with example.pptxsameer gaikwad
 
SPM 2024 – Overview of and benefits of AI in Product Management
SPM 2024 – Overview of and benefits of AI in Product ManagementSPM 2024 – Overview of and benefits of AI in Product Management
SPM 2024 – Overview of and benefits of AI in Product ManagementISPMAIndia
 
Essence of Requirements Engineering: Pragmatic Insights for 2024
Essence of Requirements Engineering: Pragmatic Insights for 2024Essence of Requirements Engineering: Pragmatic Insights for 2024
Essence of Requirements Engineering: Pragmatic Insights for 2024Asher Sterkin
 
App Builder - Hierarchical Data Apps.pptx
App Builder - Hierarchical Data Apps.pptxApp Builder - Hierarchical Data Apps.pptx
App Builder - Hierarchical Data Apps.pptxPoojitha B
 

Recently uploaded (20)

Self scaling Multi cloud nomad workloads
Self scaling Multi cloud nomad workloadsSelf scaling Multi cloud nomad workloads
Self scaling Multi cloud nomad workloads
 
killing camp week 6 problem - maximal matrix.pdf
killing camp week 6 problem - maximal matrix.pdfkilling camp week 6 problem - maximal matrix.pdf
killing camp week 6 problem - maximal matrix.pdf
 
Getting Started with Trello for Beginners.pptx
Getting Started with Trello for Beginners.pptxGetting Started with Trello for Beginners.pptx
Getting Started with Trello for Beginners.pptx
 
sql ppt for students who preparing for sql
sql ppt for students who preparing for sqlsql ppt for students who preparing for sql
sql ppt for students who preparing for sql
 
Steps to Build a PWA with Odoo.pdf
Steps to Build a PWA with Odoo.pdfSteps to Build a PWA with Odoo.pdf
Steps to Build a PWA with Odoo.pdf
 
OpenChain AI Study Group - North America and Europe - 2024-02-20
OpenChain AI Study Group - North America and Europe - 2024-02-20OpenChain AI Study Group - North America and Europe - 2024-02-20
OpenChain AI Study Group - North America and Europe - 2024-02-20
 
The Age of AI: Elevating Experiences & Delivering Customer Value!
The Age of AI: Elevating Experiences & Delivering Customer Value!The Age of AI: Elevating Experiences & Delivering Customer Value!
The Age of AI: Elevating Experiences & Delivering Customer Value!
 
maximum subarray ppt for killing camp students
maximum subarray ppt for killing camp studentsmaximum subarray ppt for killing camp students
maximum subarray ppt for killing camp students
 
AUTOKEYUNLOCKER-BRANDS-SUPPORT-STANDARD-VERSION.pdf
AUTOKEYUNLOCKER-BRANDS-SUPPORT-STANDARD-VERSION.pdfAUTOKEYUNLOCKER-BRANDS-SUPPORT-STANDARD-VERSION.pdf
AUTOKEYUNLOCKER-BRANDS-SUPPORT-STANDARD-VERSION.pdf
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
Embracing Change - The Impact of Generative AI on Strategic Portfolio Management
Embracing Change - The Impact of Generative AI on Strategic Portfolio ManagementEmbracing Change - The Impact of Generative AI on Strategic Portfolio Management
Embracing Change - The Impact of Generative AI on Strategic Portfolio Management
 
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ..."Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
 
Manual de la Mezcladora SoundCraft Notepad -12Fx
Manual de la Mezcladora SoundCraft Notepad -12FxManual de la Mezcladora SoundCraft Notepad -12Fx
Manual de la Mezcladora SoundCraft Notepad -12Fx
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A..."Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
 
The Game-Changer_ How Software Development Outsource Can Catapult Your Growth...
The Game-Changer_ How Software Development Outsource Can Catapult Your Growth...The Game-Changer_ How Software Development Outsource Can Catapult Your Growth...
The Game-Changer_ How Software Development Outsource Can Catapult Your Growth...
 
Sql server types of joins with example.pptx
Sql server types of joins with example.pptxSql server types of joins with example.pptx
Sql server types of joins with example.pptx
 
SPM 2024 – Overview of and benefits of AI in Product Management
SPM 2024 – Overview of and benefits of AI in Product ManagementSPM 2024 – Overview of and benefits of AI in Product Management
SPM 2024 – Overview of and benefits of AI in Product Management
 
Essence of Requirements Engineering: Pragmatic Insights for 2024
Essence of Requirements Engineering: Pragmatic Insights for 2024Essence of Requirements Engineering: Pragmatic Insights for 2024
Essence of Requirements Engineering: Pragmatic Insights for 2024
 
App Builder - Hierarchical Data Apps.pptx
App Builder - Hierarchical Data Apps.pptxApp Builder - Hierarchical Data Apps.pptx
App Builder - Hierarchical Data Apps.pptx
 

HarshithAkkapelli_Presentation.pdf

  • 1. Escape the Big Data Trap: LLMs Liberate Recommenders via Training-Free Condensation CMPE 255 - Harshith Akkapelli Harshith Akkapelli Course Instructor: Vijay Eranti SAN JOSE STATE UNVERSITY
  • 2. Why Training Data Condensation? Recommendation is a crucial part in today’s large scale systems which really improves user experience. These days the amount of data is extremely huge, especially text data. Training the recommendation system with huge data is an extremely cumbersome process. Therefore condensing the large dataset to a valuable small subset of data is extremely crucial.
  • 3. Existing Solutions and their Problems They are iterative meaning the condensed dataset have to be iterated over time to get better performance They do not capture the relationship between users and items They cannot generate new text data which is non continuous 1. 2. 3.
  • 4. New Approach It is not an iterative approach rather a one way or forward process It is primarily for textual data which is discrete data It makes sure to capture the relationships between user and items/products 1. 2. 3.
  • 5. Why LLM’s? Firstly they mainly focused on textual data as there are no existing methods that deal with it and given the fact that text data is extremely important in Recommendation systems. Another crucial aspect is maintaining the relationship between user and item.
  • 12. Metrics Used in Paper NDCG@K: Measures the quality of the recommended items by considering both the position of the relevant items in the list and their individual relevance scores. Recall@K: Measures the ability of the system to retrieve a set of relevant items out of all possible relevant items in the dataset. Quality: Measure the performance of the condensed dataset considering the performance on original dataset is 100% Ratio: Dataset compression ratio with respect to original dataset
  • 14. My two cents on paper The paper really presented the novel approach for Dataset condensation which is extremely useful in Recommendation Systems these days. The way they leveraged LLM in solving this complex task is really awesome. Towards the end they used two fundamental techniques: 1)KMeans - clustering and 2) LLM for text condensation which is what most of us know but the greatness comes in the way they leveraged it. The results or metrics they presented are really promising and it is really gonna revolutionize the Recommendation Systems. With just like subset of dataset they achieved nearly same or sometimes even better results 1. 2.
  • 15. Q&A Session Thank you for listening!
  • 16. Leveraging Large Language Models (LLMs) to Empower Training-Free Dataset Condensation for Content-Based Recommendation References