This presentation introduces several Data Analytics concepts, including Data Science, Big Data, Social Network Analysis, Process Mining, Market Basket Analysis, and Pattern Recognition.
Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...Edureka!
Data Analytics for R Course: https://www.edureka.co/r-for-analytics
This Edureka Tutorial on Data Analytics for Beginners will help you learn the various parameters you need to consider while performing data analysis.
The following are the topics covered in this session:
Introduction To Data Analytics
Statistics
Data Cleaning and Manipulation
Data Visualization
Machine Learning
Roles, Responsibilities and Salary of Data Analyst
Need of R
Hands-On
Statistics for Data Science: https://youtu.be/oT87O0VQRi8
Follow us to never miss an update in the future.
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
My class presentation at USC. It introduces data science, machine learning, applications, recommendation systems, and infrastructure.
What Is Data Science? | Introduction to Data Science | Data Science For Begin...Simplilearn
This Data Science presentation will help you understand what Data Science is, why we need it, the prerequisites for learning it, what a Data Scientist does, the Data Science lifecycle (with an example), and career opportunities in the Data Science domain. You will also learn the differences between Data Science and Business Intelligence. The role of data scientist is often described as one of the sexiest jobs of the century. Demand for data scientists is high, and the number of opportunities for certified data scientists is increasing. Companies are looking for more and more skilled data scientists, and studies show an expected continued shortfall in qualified candidates to fill these roles. So, let us dive deep into Data Science and understand what it is all about.
This Data Science Presentation will cover the following topics:
1. Need for Data Science?
2. What is Data Science?
3. Data Science vs Business intelligence
4. Prerequisites for learning Data Science
5. What does a Data scientist do?
6. Data Science life cycle with use case
7. Demand for Data scientists
This Data Science with Python course will establish your mastery of data science and analytics techniques using Python. With this Python for Data Science Course, you’ll learn the essential concepts of Python programming and become an expert in data analytics, machine learning, data visualization, web scraping and natural language processing. Python is a required skill for many data science positions, so jumpstart your career with this interactive, hands-on course.
Why learn Data Science?
Data Scientists are being deployed in all kinds of industries, creating a huge demand for skilled professionals. Data scientist is the pinnacle rank in an analytics organization. Glassdoor ranked data scientist first in its 25 Best Jobs for 2016, and good data scientists are scarce and in great demand. As a data scientist, you will be required to understand the business problem, design the analysis, collect and format the required data, apply algorithms or techniques using the correct tools, and finally make recommendations backed by data.
The Data Science with Python course is recommended for:
1. Analytics professionals who want to work with Python
2. Software professionals looking to get into the field of analytics
3. IT professionals interested in pursuing a career in analytics
4. Graduates looking to build a career in analytics and data science
5. Experienced professionals who would like to harness data science in their fields
My presentation at The Richmond Data Science Community (Jan 2018). The slides are slightly different than what I had presented last year at The Data Intelligence Conference.
This video will give you an idea about Data science for beginners.
It also explains the Data Science process, Data Science job roles, and the stages of a Data Science project.
Data Science is a wonderful technology that has applications in almost every field. Let's learn the basics of this domain on 16th March at (time).
Agenda
1. What is Data Science? How is it different from ML, DL, and AI?
2. Why is this skill in demand?
3. What are some popular applications of Data Science?
4. Popular tools and frameworks used in Data Science
This session describes the roles and skill sets required when building a Data Science team, and starting a data science initiative, including how to develop Data Science capabilities, select suitable organizational models for Data Science teams, and understand the role of executive engagement for enhancing analytical maturity at an organization.
After this session you will be able to:
Objective 1: Understand the knowledge and skills needed for a Data Science team and how to acquire them.
Objective 2: Learn about the different organizational models for forming a Data Science team and how to choose the best for your organization.
Objective 3: Understand the importance of executive support for Data Science initiatives and the role it plays in their successful deployment.
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Edureka!
This Edureka Data Science tutorial will help you understand the ins and outs of Data Science with examples. It is ideal for beginners as well as professionals who want to learn or brush up on their Data Science concepts. Below are the topics covered in this tutorial:
1. Why Data Science?
2. What is Data Science?
3. Who is a Data Scientist?
4. How a Problem is Solved in Data Science?
5. Data Science Components
Data Science Tutorial | What is Data Science? | Data Science For Beginners | ...Edureka!
** Data Science Certification using R: https://www.edureka.co/data-science **
In this Data Science Tutorial PPT, you'll get an in-depth understanding of Data Science and learn how it is used in the real world to solve data-driven problems. The session covers the following topics:
Need for Data Science
Walmart Use case
What is Data Science?
Who is a Data Scientist?
Data Science – Skill set
Data Science Job roles
Data Life cycle
Introduction to Machine Learning
K- Means Use case
K- Means Algorithm
Hands-On
Data Science certification
Blog Series: http://bit.ly/data-science-blogs
Data Science Training Playlist: http://bit.ly/data-science-playlist
A two-hour lecture I gave at the Jyväskylä Summer School. The purpose of the talk is to give a quick, non-technical overview of concepts and methodologies in data science. Topics include a wide overview of both pattern mining and machine learning.
See also Part 2 of the lecture: Industrial Data Science. You can find it in my profile (click the face)
This presentation briefly explains the following topics:
Why is Data Analytics important?
What is Data Analytics?
Top Data Analytics Tools
How to Become a Data Analyst?
Business Intelligence & Data Analytics – An Architected Approach DATAVERSITY
Business intelligence (BI) and data analytics are increasing in popularity as more organizations are looking to become more data-driven. Many tools have powerful visualization techniques that can create dynamic displays of critical information. To ensure that the data displayed on these visualizations is accurate and timely, a strong Data Architecture is needed. Join this webinar to understand how to create a robust Data Architecture for BI and data analytics that takes both business and technology needs into consideration.
Data Science is different from Data Analytics, Data Engineering, and Big Data.
Presentation about Data Science.
What Data Science is, its process, its future, and its scope.
Data Science Presentation By Amit Singh.
"Sexiest job of 21st century"
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...Edureka!
These Edureka Data Science course slides will take you through the basics of Data Science: why Data Science, what Data Science is, use cases, BI vs Data Science, Data Science tools, and the Data Science lifecycle process. They are ideal for beginners getting started with data science.
You can read the blog here: https://goo.gl/OoDCxz
You can also take a complete structured training, check out the details here: https://goo.gl/AfxwBc
MediaWave listens to and analyzes conversations from Twitter, Facebook, blogs, forums, news, YouTube, and images. Our analysis includes brand comparison, mentions, sentiment, gender, geolocation, influencers, and core (reason).
You will know your influencers from Twitter, Facebook, blogs, forums, news, images, and YouTube.
You can engage directly from our online dashboard
User behavior model & recommendation on basis of social networks Shah Alam Sabuj
Social networks now play an important role in expressing people's sentiments and interests in particular fields. Extracting a user's public social network data (what the user shares with friends and relatives, and how the user reacts to others' thoughts) amounts to extracting the user's behavior. If, under some defined hypotheses, a machine can be made to understand human sentiment and interest, it becomes possible to make recommendations matching a user's personal interests on the basis of the machine-analyzed sentiment. Our main approach is to make suggestions targeting a user's specific interests, as anticipated by analyzing the user's public data. This can be extended to business analysis, suggesting products or services of different companies according to the consumer's personal preferences. This automation will also help to choose the right candidates for a questionnaire. The system will also help people learn about themselves and how their behavior may influence others. It is possible to identify different types of people, such as dependable people, people with leadership skills, people with a supportive mentality, and people with a negative mentality.
the near future of tourism services based on digital traces nicolas nova
Digital objects used by tourists, such as mobile phones and cameras, leave a large number of traces. A phone can be geolocated through cell-phone antennas or GPS, and digital cameras take pictures that people can upload to web sharing platforms such as Flickr. All of this enables new applications that make it possible to count tourists or provide them with new kinds of services. Based on existing experiments, the presentation describes how the tourism industry can benefit from these digital traces to obtain new representations of tourists' activities and to build new services based on them.
The document describes a comprehensive approach to developing innovative recommendations for urban sectors, that are potentially transformational. The recommendations would become a part of an Urban Planning effort.
These are the slides from my presentation of the paper entitled "Activity Analysis – Applying Activity Theory to Analyze Complex Work in Hospitals", which was presented at the ACM Conference on Computer Supported Cooperative Work, CSCW 2011, in Hangzhou, China.
The paper is available from my homepage
http://www.itu.dk/people/bardram
For my final-year project, I used data analysis techniques to investigate user behavior pattern recognition with respect to similar interests and culture versus offline geographical location. This was an out-of-the-box topic, which I selected because of my love of data analysis and of Social Network Analysis in the Internet era.
Impact of Urban Logistics of Commercial Vehicles Sandeep Kar
This presentation, made by Sandeep Kar, Global Director, Frost & Sullivan, shows the impact of urbanization and urban logistics on commercial vehicle design philosophies.
An introduction to power law distributions, with a focus on branded markets.
Somewhat text-heavy by today's standards, but presentation was created in late 2007.
Agile Data Science is a lean methodology adapted from Agile Software Development. At its core, it centers on people, interactions, and building minimally viable products that ship fast and often to solicit customer feedback. In this presentation, I describe how this work was done in the past, with examples. Get started today with our help by visiting http://www.alpinenow.com
Big Data Forum at Salt River Fields (the spring training field for the Arizona Diamondbacks). Krishnan Parasuraman discusses how companies are using big data and analytics to transform their business.
Jisc learning analytics MASHEIN Jan 2017 Paul Bailey
Jisc Learning Analytics presentation at the Leading Digital Learning: Key Issues for Small and Specialist Institutions event organised by MASHEIN (Management of Small Higher Education Institutions Network)
Moving Beyond Batch: Transactional Databases for Real-time Data VoltDB
Join guest Forrester speaker, Principal Analyst Mike Gualtieri, and Dennis Duckworth Director of Product Marketing from VoltDB to learn how enterprises can create a real-time, “origin-zero” data architecture within transactional databases to become a real-time enterprise.
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person...Connotate
This presentation will discuss how to collect Web data with precision, transform it and then apply next-generation text analytics to reveal insights about the past activities of persons of interest and/or predict future outcomes. Featured guest speaker Claire Schmidt will discuss results of a project which proved the potential of using automated Web data collection and advanced analytics to identify potential child victims of exploitation.
Advanced Analytics and Data Science Expertise SoftServe
An overview of SoftServe's Data Science service line.
- Data Science Group
- Data Science Offerings for Business
- Machine Learning Overview
- AI & Deep Learning Case Studies
- Big Data & Analytics Case Studies
Visit our website to learn more: http://www.softserveinc.com/en-us/
The Role of Community-Driven Data Curation for Enterprises Edward Curry
With increased utilization of data within their operational and strategic processes, enterprises need to ensure data quality and accuracy. Data curation is a process that can ensure the quality of data and its fitness for use. Traditional approaches to curation are struggling with increased data volumes, and near real-time demands for curated data. In response, curation teams have turned to community crowd-sourcing and semi-automated metadata tools for assistance. This chapter provides an overview of data curation, discusses the business motivations for curating data and investigates the role of community-based data curation, focusing on internal communities and pre-competitive data collaborations. The chapter is supported by case studies from Wikipedia, The New York Times, Thomson Reuters, Protein Data Bank and ChemSpider upon which best practices for both social and technical aspects of community-driven data curation are described.
E. Curry, A. Freitas, and S. O’Riáin, “The Role of Community-Driven Data Curation for Enterprises,” in Linking Enterprise Data, D. Wood, Ed. Boston, MA: Springer US, 2010, pp. 25-47.
Understanding big data and data analytics-Business Intelligence Seta Wicaksana
Faster and more accurate reporting, analysis or planning; better business decisions; improved employee satisfaction and improved data quality top the list. Benefits achieved least frequently include reducing costs, and increasing revenues.
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Simplilearn
This presentation about Big Data Analytics will help you understand why Big Data analytics is required, what Big Data analytics is, the lifecycle of Big Data analytics, the types of Big Data analytics, the tools used in Big Data analytics, and a few Big Data application domains. We'll also see a use case on how Spotify uses Big Data analytics. Big Data analytics is a process for extracting meaningful insights from Big Data, such as hidden patterns, unknown correlations, market trends, and customer preferences. One of the essential benefits of Big Data analytics is its use in product development and innovation. Now, let us get started and understand Big Data Analytics in detail.
The following topics are explained in this Big Data analytics tutorial:
1. Why Big Data analytics?
2. What is Big Data analytics?
3. Lifecycle of Big Data analytics
4. Types of Big Data analytics
5. Tools used in Big Data analytics
6. Big Data application domains
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
This course will enable you to:
1. Understand the different components of the Hadoop ecosystem such as Hadoop 2.7, Yarn, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro Schema, using Avro with Hive and Sqoop, and schema evolution
7. Understand Flume, Flume architecture, sources, flume sinks, channels, and flume configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand resilient distributed datasets (RDDs) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL, creating, transforming, and querying Data frames
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
A short introduction to the influence of Big Data in today's world and how it helps organizations and industries understand their clients and partners.
Discovering Big Data in the Fog: Why Catalogs Matter Eric Kavanagh
The Briefing Room with Dr. Robin Bloor and Waterline Data
Good enterprise data can drive positive business outcomes. But if that data isn’t organized and accessible, information workers are left with an incomplete picture. Knowing the location, lineage and permissions of data across the enterprise can lead to more accurate and insightful searches, and ultimately, knowledge discovery.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he discusses how the success of big data projects relies on understanding your data. He’ll be briefed by Todd Goldman and Mohan Sadashiva of Waterline Data, who will explain how their solution can facilitate discovery via automation and crowd sourcing. They’ll demonstrate how combining the value of tribal knowledge with rationalized data can enable self-service analytics, improve data governance, and reduce data redundancy.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will present on related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
Learn SQL from basic queries to Advance queries manishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, AI, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
Adjusting primitives for graph : SHORT REPORT / NOTES Subhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Adjusting OpenMP PageRank : SHORT REPORT / NOTES Subhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Analysis insight about a Flyball dog competition team's performanceroli9797
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
2. My Bio Data
• Postdoctoral scholar in Social Network Analysis
• PhD in Software Engineering (Recommender Systems)
• University Lecturer for more than 10 years
Professional Certificates:
• Social Network Analysis from University of Michigan
• Mining Massive Datasets from Stanford University
• Pattern Discovery in Data Mining from Illinois University
• Process Mining from Eindhoven University of Technology
• Statistical Analysis using SPSS & SAS from University of Malaya
• MongoDB for DBAs from MongoDB
Data Analytics by Vala Ali Rohani
3. Presentation outline
Data Science & Big Data
Social Network Analysis
Process Mining
Market Basket Analysis
5. Domain Terminology
Data Science & Big Data
• Data Analysis, Data Mining, Machine Learning and Mathematical Modeling are tools: means towards an end.
• Analytics, Business Intelligence, Econometrics and Artificial Intelligence are application areas: domains that use the tools above (and others) to produce results within their subject.
• Statistics is a branch of Mathematics providing theoretical and practical support to the above tools.
• Data Science is a catch-all term for using all of those tools to provide answers in all of those areas (and in others), especially when dealing with Big Data.
http://www.quora.com/What-is-the-difference-between-Data-Analytics-Data-Analysis-Data-Mining-Data-Science-Machine-Learning-and-Big-Data-1
6. Data Science & Big Data
Data is the New Oil!
In the last 10 minutes we generated more data than from prehistoric times until 2003!
7. Data Science & Big Data
A data scientist is able to collect, analyze, and interpret data from a variety of sources (social interaction, business processes, cyber-physical systems).
Turning data into value!
8. Data Science & Big Data
Four generic data science questions:
1. What happened?
2. Why did it happen?
3. What will happen?
4. What is the best that can happen?
10. Data Science & Big Data
Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate.
http://en.wikipedia.org/wiki/Big_data
How Much Data?
1 PB = 1,000,000,000,000,000 B = 10^15 bytes = 1,000 terabytes.
• Google processes 20 PB a day (2008)
• Facebook has 2.5 PB of user data + 15 TB/day (4/2009)
• Each engine of a Boeing 747 generates 20 TB of information per hour
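To put the quoted volumes on a common scale, the arithmetic can be sketched as below. It only uses the decimal unit definition and the 2008/2009 figures from the slide; the variable names are illustrative, and treating the Facebook figures as "stored volume" and "daily growth" is an assumption about how to read the slide.

```python
# Decimal (SI) storage units, as on the slide: 1 PB = 10**15 bytes.
TB = 10**12
PB = 10**15

# Figures quoted on the slide (2008/2009 values).
google_per_day = 20 * PB
facebook_stored = 2.5 * PB
facebook_growth_per_day = 15 * TB

# Average processing rate implied by "20 PB a day", in TB per second.
google_tb_per_second = google_per_day / TB / (24 * 60 * 60)

# Days of growth at 15 TB/day needed to add another 2.5 PB.
days_to_double = facebook_stored / facebook_growth_per_day

print(f"{google_tb_per_second:.2f} TB/s, {days_to_double:.0f} days")
# prints "0.23 TB/s, 167 days"
```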
12. Data Science & Big Data
Some Big Data Theories and Techniques
Map-Reduce
Market Basket Analysis
Pattern Discovery
Social Network Analysis
Process Mining
14. Social Network Analysis (SNA)
Everything is connected
When you sell items …
When you receive customer calls …
When you make a contract …
When you ship orders …
15. Social Network Analysis (SNA)
What are Networks?
• Networks are sets of nodes connected by edges
“Network” ≡ “Graph”
Figure labels: node, edge
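The definition of a network as nodes connected by edges maps directly onto a small data structure. The sketch below is one possible representation (an undirected graph stored as a dictionary of neighbour sets); the helper name `build_graph` and the sample names are made up for illustration.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected graph as a dict mapping each node to its set of neighbours."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return dict(graph)

# Hypothetical friendship edges: each pair is one edge between two nodes.
edges = [("Ana", "Ben"), ("Ben", "Cho"), ("Cho", "Ana"), ("Cho", "Dev")]
graph = build_graph(edges)

print(sorted(graph))         # the nodes
print(sorted(graph["Cho"]))  # the neighbours of one node
```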
16. Social Network Analysis (SNA)
What is SNA?
SNA (Social Network Analysis) is the mapping and measuring of relationships and flows between people, groups, organizations, computers, URLs, and other connected entities.
SNA provides both a visual and a mathematical analysis of human relationships.
17. Social Network Analysis (SNA)
Why do we need Social Network Analysis?
• Are nodes connected through the network?
• How far apart are they?
• Are some nodes more important due to their position in the network?
• What patterns does information diffusion follow?
• Is the network composed of communities?
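The first two questions above (are nodes connected, and how far apart are they) can be answered with a breadth-first search. The following is a minimal sketch on a hypothetical five-node graph; connected components are only the crudest notion of a group and are not a community detection method.

```python
from collections import deque

def bfs_distances(graph, source):
    """Return the hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def connected_components(graph):
    """Group nodes into sets such that nodes in the same set can reach each other."""
    seen, components = set(), []
    for node in graph:
        if node not in seen:
            component = set(bfs_distances(graph, node))
            seen |= component
            components.append(component)
    return components

# Hypothetical undirected graph: a chain A-B-C and a separate pair X-Y.
graph = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B"},
    "X": {"Y"}, "Y": {"X"},
}

distances = bfs_distances(graph, "A")
components = connected_components(graph)
print(distances["C"])        # A and C are two hops apart
print("X" in distances)      # X cannot be reached from A
print(len(components))       # the graph splits into two components
```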
18. Social Network Analysis (SNA)
Now, let’s see some samples of SNA …
19. Social Network Analysis (SNA)
Internet
Structure of the Internet at the level of autonomous systems. Data source: Mark Newman, http://www-personal.umich.edu/~mejn/netdata/
20. Social Network Analysis (SNA)
Political Blogs
2004 United States Presidential Election Network
Liberals
Conservatives
21. Social Network Analysis (SNA)
Facebook Friendship Network
22. Social Network Analysis (SNA)
SNA in Organizations (or ONA)
23. Social Network Analysis (SNA)
SNA Metrics:
Degree
Betweenness
Closeness
24. Social Network Analysis (SNA)
SNA Main Centrality Metrics
Degree
The number of direct connections that a node has
$d_i = \frac{\sum_j a_{ij}}{n - 1}$
where $a_{ij} = 1$ if nodes $i$ and $j$ are connected (0 otherwise) and $n$ is the number of nodes.
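The degree formula above can be sketched in a few lines. This assumes an undirected graph without self-loops, stored as a dictionary of neighbour sets; the star-shaped example graph is hypothetical.

```python
def degree_centrality(graph):
    """Normalized degree: number of neighbours divided by the n - 1 other nodes."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}

# Hypothetical star network: one hub connected to three leaves.
star = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}

centrality = degree_centrality(star)
print(centrality["hub"])  # 1.0: the hub is connected to every other node
```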
25. SNA Main Centrality Metrics
Betweenness
Betweenness centrality identifies an entity's position within a network in terms of its ability to make connections to other pairs or groups in a network.
$C_B(i) = \sum_{j<k} \frac{g_{jk}(i)}{g_{jk}}$
$C_B'(i) = \frac{C_B(i)}{(n-1)(n-2)/2}$
where $g_{jk}$ is the number of shortest paths between $j$ and $k$, and $g_{jk}(i)$ is the number of those paths that pass through $i$.
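A direct, unoptimized reading of the betweenness definition is sketched below: for every pair of nodes, count the shortest paths between them and the share that passes through each other node, then divide by (n − 1)(n − 2)/2. It assumes an undirected, unweighted graph; the function names and the example graph (two triangles joined through a node called "bridge") are hypothetical, and faster algorithms exist for large graphs.

```python
from collections import deque
from itertools import combinations

def shortest_path_counts(graph, source):
    """BFS from source. Returns (dist, sigma): hop distances and number of shortest paths."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                sigma[neighbour] = 0
                queue.append(neighbour)
            if dist[neighbour] == dist[node] + 1:
                sigma[neighbour] += sigma[node]
    return dist, sigma

def betweenness_centrality(graph, normalized=True):
    nodes = list(graph)
    n = len(nodes)
    info = {v: shortest_path_counts(graph, v) for v in nodes}
    score = {v: 0.0 for v in nodes}
    for j, k in combinations(nodes, 2):
        dist_j, sigma_j = info[j]
        dist_k, sigma_k = info[k]
        if k not in dist_j:
            continue  # j and k are not connected
        for i in nodes:
            if i == j or i == k or i not in dist_j or i not in dist_k:
                continue
            # i lies on a shortest j-k path when the two legs add up exactly.
            if dist_j[i] + dist_k[i] == dist_j[k]:
                score[i] += sigma_j[i] * sigma_k[i] / sigma_j[k]
    if normalized and n > 2:
        pairs = (n - 1) * (n - 2) / 2
        score = {v: s / pairs for v, s in score.items()}
    return score

# Hypothetical network: two triangles (a, b, c) and (d, e, f) joined via "bridge".
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "bridge"},
    "bridge": {"c", "d"},
    "d": {"bridge", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}

score = betweenness_centrality(graph)
print(max(score, key=score.get))  # "bridge"
```

In this example the bridge node has only two direct connections, yet every shortest path between the two triangles runs through it, so it has the highest betweenness.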
26. SNA Main Centrality Metrics
Closeness
Closeness centrality measures how quickly an entity can access more entities in a network.
$C_C(i) = \left[ \sum_{j=1}^{N} d(i, j) \right]^{-1}$
$C_C'(i) = (N - 1) \, C_C(i)$
where $d(i, j)$ is the shortest-path distance between $i$ and $j$ and $N$ is the number of nodes.
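Closeness can be sketched with one breadth-first search per node. The version below assumes a connected, undirected, unweighted graph and uses the common normalization in which the inverse of the distance sum is multiplied by N − 1, so a node adjacent to every other node scores 1. The function names and the five-node path graph are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source):
    """Return the hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def closeness_centrality(graph, node, normalized=True):
    """Inverse of the summed distances from node; assumes the graph is connected."""
    total = sum(bfs_distances(graph, node).values())
    if total == 0:
        return 0.0  # single-node graph
    closeness = 1.0 / total
    if normalized:
        closeness *= len(graph) - 1
    return closeness

# Hypothetical path network: A - B - C - D - E.
path = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"},
    "D": {"C", "E"}, "E": {"D"},
}

end_node = closeness_centrality(path, "A")      # distances 1 + 2 + 3 + 4 = 10
middle_node = closeness_centrality(path, "C")   # distances 1 + 1 + 2 + 2 = 6
print(end_node, middle_node)
```

The middle node scores higher than the end node because its total distance to the others is smaller.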
27. Social Network Analysis (SNA)
SNA Tools:
NodeXL
28. Social Network Analysis (SNA)
SNA Tools:
Gephi
29. Social Network Analysis (SNA)
SNA Tools:
UCINET
30. Key nodes in Organization (from ONA view)
32. Find a node that has high betweenness but low degree
33. Find a node that has low betweenness but high degree
35. Process Mining
https://www.coursera.org/course/procmin
36. Process Mining
What is Process Mining?
Process mining is the missing link between model-based process analysis and data-oriented analysis techniques.
Process mining seeks the confrontation between event data (i.e., observed behavior) and process models (hand-made or discovered automatically).
Some example applications include:
• Analyzing treatment processes in hospitals
• Improving customer service processes
• Understanding the browsing behavior of customers using a booking site
• Analyzing failures of a baggage handling system
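One simple way to turn event data into a first process picture is to count which activity directly follows which within each case. The sketch below builds such directly-follows counts; it is an illustration under assumptions, not the method taught in the referenced course. The event log layout (case id, activity, timestamp) and the activity names are hypothetical.

```python
from collections import Counter, defaultdict

def directly_follows(event_log):
    """Count how often activity a is immediately followed by activity b within a case.

    event_log is an iterable of (case_id, activity, timestamp) tuples.
    """
    traces = defaultdict(list)
    # Order events by case and time so each trace lists activities as they occurred.
    for case_id, activity, _timestamp in sorted(event_log, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)
    counts = Counter()
    for activities in traces.values():
        counts.update(zip(activities, activities[1:]))
    return counts

# Hypothetical event log with three cases.
log = [
    ("case1", "register", 1), ("case1", "check", 2), ("case1", "pay", 3),
    ("case2", "register", 1), ("case2", "pay", 2),
    ("case3", "register", 1), ("case3", "check", 2), ("case3", "pay", 4),
]

dfg = directly_follows(log)
for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
```

Comparing these observed counts with a hand-made model would show, for example, that one case skipped the "check" step.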
37. Process Mining
Process mining use cases
• What is the process that people really follow?
• Where are the bottlenecks in the studied process?
• Where do people (or machines) deviate from the expected or idealized process?
• What are the "highways" in my process?
• What factors are influencing a bottleneck?
• Can we predict problems (delay, deviation, risk, etc.) for running cases?
• Can we recommend some improvements for the main process of the organization?
• How to redesign the process / organization / machine?
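The bottleneck question above can be approached, in its simplest form, by measuring the average time between consecutive activities in each case. The following is a rough sketch under assumptions: the event log layout (case id, activity, timestamp in hours), the activity names, and the choice of "largest mean waiting time" as the bottleneck criterion are all illustrative.

```python
from collections import defaultdict

def mean_waiting_times(event_log):
    """Average elapsed time between consecutive activities, per (from, to) step."""
    traces = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        traces[case_id].append((timestamp, activity))
    waits = defaultdict(list)
    for events in traces.values():
        events.sort()  # order each case by timestamp
        for (t0, a0), (t1, a1) in zip(events, events[1:]):
            waits[(a0, a1)].append(t1 - t0)
    return {step: sum(values) / len(values) for step, values in waits.items()}

# Hypothetical event log, timestamps in hours since the case started.
log = [
    ("case1", "register", 0), ("case1", "check", 2), ("case1", "pay", 10),
    ("case2", "register", 0), ("case2", "check", 4), ("case2", "pay", 16),
]

waiting = mean_waiting_times(log)
bottleneck = max(waiting, key=waiting.get)
print(bottleneck, waiting[bottleneck])
```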
43. Process Mining
Some Examples of Real Discovered Processes (figures on slides 43 to 46)
48. Market Basket Analysis
Introduction
Market Basket Analysis (MBA) is a data mining technique which is widely used in the consumer packaged goods (CPG) industry to identify which items are purchased together and, more importantly, how the purchase of one item affects the likelihood of another item being purchased.
Bill Qualls, First Analytics, Raleigh, NC, Introduction to Market Basket Analysis
49. Market Basket Analysis
SALES TRANSACTIONS
Our imaginary store sells the following items: bananas, bologna, bread, buns, butter, cereal, cheese, chips, eggs, hotdogs, mayo, milk, mustard, oranges, pickles, and soda. We have recorded 20 sales transactions as follows:
Bill Qualls, First Analytics, Raleigh, NC, Introduction to Market Basket Analysis
50. Market Basket Analysis
MBA Theories:
Support for itemset I = the number of baskets containing all items in I.
Given a support threshold s, sets of items that appear in at least s baskets are called frequent itemsets.
Association rules are if-then rules about the contents of baskets: the rule {i1,…,ik} → j means that a basket containing all of i1,…,ik is likely to contain j.
The confidence of this association rule is the probability of j given i1,…,ik.
Bill Qualls, First Analytics, Raleigh, NC, Introduction to Market Basket Analysis
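The definitions of support and confidence can be made concrete with a short sketch. The five baskets below are made up for illustration (they are not the 20 transactions from the slides), and the function names are hypothetical; confidence is computed as the support of the full itemset divided by the support of the "if" part.

```python
def support(baskets, itemset):
    """Number of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket)

def confidence(baskets, antecedent, consequent):
    """Estimated probability of consequent given that the basket contains antecedent."""
    antecedent = set(antecedent)
    base = support(baskets, antecedent)
    if base == 0:
        return 0.0  # the rule never applies in this data
    return support(baskets, antecedent | {consequent}) / base

# Hypothetical baskets using items from the imaginary store.
baskets = [
    {"bread", "milk", "eggs"},
    {"hotdogs", "buns", "mustard"},
    {"hotdogs", "buns", "soda"},
    {"bread", "butter", "milk"},
    {"hotdogs", "chips", "soda"},
]

print(support(baskets, {"hotdogs"}))            # 3 baskets contain hotdogs
print(support(baskets, {"hotdogs", "buns"}))    # 2 baskets contain both
print(confidence(baskets, {"hotdogs"}, "buns")) # 2 of those 3 also contain buns
```

With a support threshold of s = 2, the itemset {hotdogs, buns} would count as frequent in this toy data, while {hotdogs, mustard} would not.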
51. Market Basket Analysis
MBA Theories:
Bill Qualls, First Analytics, Raleigh, NC, Introduction to Market Basket Analysis
52. Market Basket Analysis
Market Basket Example
53. Thank you