PYTHON VS R
BY: KENNAN DUFFY, DARIA GBOR, CHRIS LUKENS,
JOHN SAVIELLO, & JAMES SCHEUREN
http://project.mis.temple.edu/pythonvsranalytics/final-deliverables/
AGENDA
1. Our Process
2. Use Cases
3. Sentiment Analysis - Python
4. Sentiment Analysis - R
5. Scorecard
6. Recommendation
7. Q & A
OUR PROCESS
RESEARCH
Conduct web research on the use
case and the language
PYTHON
Complete the use case in Python
ANALYZE
Review and analyze the results of
Python & R as a team
R CODE
Complete the use case in R
DEFINE
Define the business purpose of the
use case and completion plan
SCORE
Fill out the scorecard based on previously
defined scoring criteria
SCORECARD
Criteria               Weight (%)
Package Requirement    10%
Lines of Code          5%
Simplicity             10%
Popularity             5%
Development Sources    10%
Data Visualization     15%
Functionality          45%
Total                  100%
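As a rough illustration of how these weights could combine per-criterion points into a single score, here is a minimal Python sketch. It assumes a plain weighted average of 0-10 point values; the slides do not state the exact formula the team used, and the example points are hypothetical, not the team's actual scores.

```python
# Weights copied from the scorecard slide (percent of the total).
WEIGHTS = {
    "Package Requirement": 10,
    "Lines of Code": 5,
    "Simplicity": 10,
    "Popularity": 5,
    "Development Sources": 10,
    "Data Visualization": 15,
    "Functionality": 45,
}


def weighted_score(points):
    """Combine per-criterion points (0-10) into one score out of 10.

    Assumption: a plain weighted average; the deck does not give the formula.
    """
    if set(points) != set(WEIGHTS):
        raise ValueError("points must cover exactly the scorecard criteria")
    total_weight = sum(WEIGHTS.values())  # 100 for the weights above
    return sum(WEIGHTS[name] * points[name] for name in WEIGHTS) / total_weight


# Hypothetical points for illustration only.
example_points = {
    "Package Requirement": 7,
    "Lines of Code": 8,
    "Simplicity": 7,
    "Popularity": 10,
    "Development Sources": 10,
    "Data Visualization": 7,
    "Functionality": 4,
}

if __name__ == "__main__":
    print(f"Weighted score: {weighted_score(example_points):.2f} / 10")
```

Because Functionality carries 45% of the weight, a low Functionality score pulls the total down far more than any other criterion.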
USE CASE #1 - PREDICTIVE ANALYTICS
What
➔ An NFL franchise wants to ensure that the player it selects
in the draft will be a high performer
How
➔ Linear Regression using the NFL combine dataset from 1985-2015
USE CASE #2 - TEXT MINING
What
➔ Justin Trudeau’s campaign team wants to stay updated on
public opinion of him
How
➔ Sentiment analysis using Twitter feed as our dataset
USE CASE #3 - IMAGE ANALYTICS
What
➔ England wants to keep track of what is going on in the
busy streets for security purposes
How
➔ Object detection using a picture of a busy street in England
SENTIMENT ANALYSIS - PYTHON
csv
Allows us to write output to
csv file for analysis
Tweepy
Python library that provides
access to the Twitter API and
its functions
TextBlob
Natural language processor
to get subjectivity and
polarity of tweets
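To show how a polarity value could be turned into the negative / positive / neutral labels scored later in the deck, here is a minimal Python sketch. TextBlob reports polarity in the range -1.0 to 1.0 (for example via `TextBlob(text).sentiment.polarity`); the function name `label_from_polarity` and the zero threshold are assumptions for illustration, since the slides do not say which cutoffs the team used.

```python
def label_from_polarity(polarity, threshold=0.0):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label.

    Hypothetical helper: the cutoff is an assumption, not the team's rule.
    With TextBlob installed, the score would typically come from
    TextBlob(text).sentiment.polarity.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be between -1.0 and 1.0")
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Made-up scores, only to exercise the three branches.
    for score in (0.6, -0.4, 0.0):
        print(score, "->", label_from_polarity(score))
```

Raising `threshold` widens the neutral band, which is one knob to try when a classifier mislabels many neutral tweets.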
DEMONSTRATION - PYTHON
PERFORMANCE - PYTHON
Overall Accuracy: 28%
▰ Negative Accuracy: 52% (11/21)
▰ Positive Accuracy: 27% (7/26)
▰ Neutral Accuracy: 19% (10/53)
SENTIMENT ANALYSIS - R
Syuzhet
Sentiment Analysis
twitteR
Twitter API
SnowballC
Word stemming (concision)
tm
Text Mining
DEMONSTRATION - R
PERFORMANCE - R
Overall Accuracy: 50%
▰ Negative Accuracy: 77% (30/39)
▰ Positive Accuracy: 27% (9/33)
▰ Neutral Accuracy: 39% (11/28)
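The accuracy figures on the two performance slides can be reproduced from the raw counts. The sketch below is in Python, with the counts copied from the slides; the function name `accuracy_report` and rounding to whole percentages are choices made here for illustration.

```python
# (correct, total) per class, copied from the two performance slides.
RESULTS = {
    "Python (TextBlob)": {
        "negative": (11, 21),
        "positive": (7, 26),
        "neutral": (10, 53),
    },
    "R (Syuzhet)": {
        "negative": (30, 39),
        "positive": (9, 33),
        "neutral": (11, 28),
    },
}


def accuracy_report(per_class):
    """Return whole-percent accuracy per class plus the overall accuracy."""
    report = {
        label: round(100 * correct / total)
        for label, (correct, total) in per_class.items()
    }
    all_correct = sum(correct for correct, _ in per_class.values())
    all_total = sum(total for _, total in per_class.values())
    report["overall"] = round(100 * all_correct / all_total)
    return report


if __name__ == "__main__":
    for language, counts in RESULTS.items():
        print(language, accuracy_report(counts))
```

Overall accuracy is the total number of correct labels over all 100 tweets, not the average of the three class percentages, which is why a large, poorly classified class drags the overall figure down.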
SCORECARD
Our Recommendation
- Built for Data Analytics
- Package Accuracy
- Usability
THANK YOU!
Any questions?
APPENDIX
GRADING CRITERIA
1. Package Requirement:
0 packages = 10 points
1 package = 9 points
2 packages = 8 points
3 packages = 7 points
4 packages = 6 points
5 packages = 5 points
6 packages = 4 points
7 packages = 3 points
8 packages = 2 points
9 packages = 1 point
10 packages = 0 points
2. Lines of Code:
0-10 lines = 10 points
11-20 = 9 points
21-30 = 8 points
31-40 = 7 points
41-50 = 6 points
51-60 = 5 points
61-70 = 4 points
71-80 = 3 points
81-90 = 2 points
91-100 = 1 point
101 + = 0 points
3. Simplicity:
Quick, really simple to write, really simple to read = 10
Took a while to complete, but pretty simple, easy to understand = 7
Took so long to complete, not very simple, hard to understand = 4
Hard to write, almost impossible, not able to read = 1
4. Popularity:
Very Popular among the industry = 10
A lot of people use this language = 7
Some people use this language = 4
No one uses it = 1
5. Development Sources:
A lot of help in the online community = 10
Some resources available, decently helpful sources = 7
Not many resources available = 4
No help available online = 1
6. Data Visualization:
Easy to manipulate, cleanliness, visually appealing = 10
Harder to manipulate, messy, not exciting = 7
Harder to manipulate, difficult to read = 4
Unable to manipulate, unreadable = 1
7. Functionality:
Accurate data, does everything it needs to do = 10
Mostly accurate data, does most of what it needs to do = 7
Inaccurate data, barely does what it needs to do = 4
Is not able to complete the task = 1
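The two count-based scales in the grading criteria (Package Requirement and Lines of Code) are simple step functions, sketched here in Python. The function names are hypothetical, and treating more than 10 packages as 0 points is an assumption, since the criteria stop at 10 packages.

```python
def package_points(n_packages):
    """Points for Package Requirement: 10 for 0 packages, one less per package.

    Assumption: more than 10 packages also scores 0 (the scale stops at 10).
    """
    if n_packages < 0:
        raise ValueError("n_packages must be non-negative")
    return max(0, 10 - n_packages)


def lines_of_code_points(n_lines):
    """Points for Lines of Code: 0-10 lines = 10, 11-20 = 9, ..., 101+ = 0."""
    if n_lines < 0:
        raise ValueError("n_lines must be non-negative")
    if n_lines <= 10:
        return 10
    if n_lines > 100:
        return 0
    # Each further block of 10 lines costs one point.
    return 10 - (n_lines - 1) // 10


if __name__ == "__main__":
    # Hypothetical inputs, only to show the calls.
    print(package_points(3), lines_of_code_points(45))
```

The remaining criteria (Simplicity, Popularity, Development Sources, Data Visualization, Functionality) are judgment calls on a 10 / 7 / 4 / 1 scale and are not computed here.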
USE CASE 1 - PYTHON
USE CASE 1 - R
USE CASE 3 - IMAGE ANALYTICS

Python vs R for Data Analytics Final


Editor's Notes

  1. Lines of code: we set standard criteria for this measurement, so 1-10 lines scored a 10, 11-20 lines scored a 9, and so forth. Development sources: how strong the online support community is, and how many helpful sources exist to help us complete the use case and solve problems if issues arise. Functionality: whether it can do what we want and how well it accomplishes that.
  2. TextBlob struggled to identify positive and neutral tweets. Explain how we got accuracy: we retrieved 100 tweets and compared our own judgment (as a team) against the package results to see whether we agreed with the outcome. Neutral = 10/53, Negative = 11/21, Positive = 7/26.
  3. Syuzhet: used for sentiment analysis; it is what reads the tweets. tm: works with SnowballC and twitteR to mine text. twitteR: interacts with the Twitter API to get tweets for analysis. SnowballC: makes words more concise (stemming) so that they are easier for other packages to read. Explain how we got accuracy: we retrieved 100 tweets and compared our own judgment (as a team) against the package results to see whether we agreed with the outcome.
  4. Load packages, Import Twitter API, Scrape, Cleaning, Analyze, Apply
  5. Accuracy: 50% overall; 77% negative (30/39); 27% positive (9/33); 39% neutral (11/28). Mention functionality. Use for lessons learned: we found that it is more accurate with negative tweets. It is not perfect; it picks out certain words to decide whether a tweet is positive or negative. Sarcasm is difficult. We shouldn’t trust its analysis of positive tweets.
  6. R is built for predictive analytics and is ready to run it as is, whereas Python needs to be molded into running the linear regression. The packages we ran for R were much more accurate than the Python packages for sentiment analysis. R offered more functionality than Python for image analytics and was very simple to change: switching between face detection, landmark detection, logo detection, and object detection was a matter of changing only 2 lines of code.