Big Data Analytics is the process of extracting meaningful insights from big data, such as hidden patterns, unknown correlations, market trends, and customer preferences.
2. BDA
• Big data analytics is the process of examining large and
varied data sets -- i.e., big data -- to uncover
– hidden patterns,
– unknown correlations,
– market trends,
– customer preferences
– Meaningful/actionable information/insights to make
faster and better decisions (informed decisions).
3. BDA
• Data analytics helps slice and dice data to extract
insights that an organization can leverage for a
competitive advantage.
• BDA supports a 360-degree view of the customer
(e.g., clickstream data, which is unstructured).
7. BDA
• Businesses can use advanced analytics techniques such
as
– text analytics,
– machine learning,
– predictive analytics,
– data mining, statistics
– and natural language processing to gain new insights from
previously untapped data sources with existing enterprise data.
8. BDA
• It is technology enabled analytics to process and
analyze big data.
• BDA is about gaining a meaningful, deeper, and
richer insight into the business to steer it in the right
direction: understanding customer demographics to
cross-sell and up-sell, better leveraging the services
of vendors and suppliers, and so on.
10. BDA
• BDA is a tight handshake between three
communities: IT, business users, and data scientists.
• BDA is working with data sets whose volume and
variety exceed the current storage, processing
capabilities and infrastructure of an enterprise.
• BDA is about moving code to data, because programs
for distributed processing are tiny (a few KB) compared
to the data (TB, PB, EB, ZB, and YB).
11. Types of unstructured data (USD) available for
analysis
• RFID
• CRM
• ERP
• POS
• Websites
• Social Media
13. Why the sudden hype around BDA?
1. Data is growing at a 40% compound annual rate,
reaching nearly 74 ZB by 2021. The volume of
business data worldwide is expected to double
every 1.2 years.
– Examples:
– Walmart processes one million customer
transactions per hour.
– Twitter handles 500 million tweets per day.
– Facebook users post 2.7 billion “Likes” and
comments per day.
14. Why the sudden hype around BDA?
2. Cost per gigabyte of storage has hugely dropped.
3. An overwhelming number of user friendly
analytics tools are available in the market today.
4. Three Digital Accelerators : bandwidth, digital
storage, and processing power.
16. First school of thought
• Classified analytics into:
– Basic analytics
– Operationalized analytics
– Advanced analytics
– Monetized analytics.
• (Monetize: to convert into or express in the form of currency.)
18. First school of thought
• Basic analytics:
– Deals with slicing and dicing of data to help with basic
business insights.
– Reporting on historical data, basic visualization, etc..
• Operationalized analytics:
– Focusing on the integration of analytics into business
units in order to take specific action on insights.
– Analytics are integrated into both production
technology systems and business processes.
19. First school of thought
• Advanced analytics:
– Forecasting for the future by predictive and
prescriptive modeling.
• Monetized analytics:
– Used to derive direct business revenue.
– The act of generating measurable economic benefits
from available data sources
20. Six ways to indirectly monetize your
data
1. Reduce costs
2. Enhance your product or service
3. Enter new business sectors or tap into new
types of customers
4. Develop new products, services, or markets.
5. Drive sales and marketing
6. Improve productivity and efficiencies.
25. Analytics 1.0 (1950-2009)
• Descriptive statistics.
• Report on events, occurrences, etc. of the past.
• Key questions:
– What happened?
– Why did it happen?
• Data:
– CRM, ERP, or 3rd party applications
– Small and structured data sources, DW and Data mart
– Internally sourced.
– Relational databases.
26. Analytics 2.0 (2005-2012)
• Descriptive + predictive statistics
• Uses data from the past to make predictions for the
future.
• Key questions:
– What will happen?
– Why will it happen?
• Data:
– Big data (SD, USD and SSD)
– Massive parallel servers running Hadoop
– Externally sourced.
– DB applications, Hadoop clusters, Hadoop environment.
27. Analytics 3.0 (2012 to present)
• Descriptive +predictive + Prescriptive statistics.
• Uses data from the past to make predictions for the future and at
the same time make recommendations to leverage the situations
to one’s advantage.
• Key questions:
– What will happen?
– When will it happen?
– Why will it happen?
– What action should be taken?
• Data:
– Big data (SD, USD and SSD) in real-time.
– Internally + Externally sourced.
– In memory analytics, M/C learning, agile analytical methods.
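The progression above can be made concrete with a toy example in Python. The sales figures, the fitted line, and the stocking rule are all invented for illustration (they are not from the slides): descriptive statistics summarize the past, a least-squares trend predicts the next period, and a prescriptive rule recommends an action.

```python
# Toy monthly sales figures (invented for illustration).
sales = [100, 110, 125, 138, 150, 165]
n = len(sales)

# Analytics 1.0 -- descriptive: what happened?
average = sum(sales) / n

# Analytics 2.0 -- predictive: what will happen?
# Fit a least-squares line y = a*x + b through the six points.
xs = range(n)
x_mean = sum(xs) / n
slope_num = sum((x - x_mean) * (y - average) for x, y in zip(xs, sales))
slope_den = sum((x - x_mean) ** 2 for x in xs)
a = slope_num / slope_den
b = average - a * x_mean
forecast = a * n + b  # predicted sales for the next month

# Analytics 3.0 -- prescriptive: what action should be taken?
action = "increase stock" if forecast > sales[-1] else "hold stock"

print(round(average, 1), round(forecast, 1), action)
```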
28. Challenges that prevent business from
capitalizing on big data
1. Getting the business units to share information across
organizational silos.
2. Finding the right talent (business analysts and data
scientists) who can manage large amounts of SD, SSD,
and USD and create insight from it.
3. The need to address the storage and processing of large
volume, velocity and variety of big data.
4. Deciding whether to use SD or USD, internal or external data to
make business decisions.
5. Choosing the optimal way to report findings and analysis
of big data so that the presentation makes the most sense.
6. Determining what to do with insights created from big data.
29. Top challenges facing big data
1. The practical issues of storing all the data (Scale)
2. Security: lack of authentication and authorization
3. Data schema (No fixed and rigid schema) leads to
dynamic schema.
4. Consistency /Eventual consistency
5. Availability (24 x 7) (Failure transparency)
6. Partition tolerance (Both H/w and S/w)
7. Validating big data (Data quality) : accuracy,
completeness and timeliness.
31. Importance of BDA
• Cost reduction: Cost-effective storage system for
huge data sets. [Hadoop and cloud based
analytics]
• Faster, better decision making: Provides ways to
analyze information quickly and make decisions.
[Hadoop processing speed and in-memory
analytics, combined with the ability to analyze new
sources of data]
32. Importance of BDA
• New products and services: Evaluation of
customer needs and satisfaction through BDA.
–business can create new products to meet
customers’ needs.
–With BDA, more companies are creating new
products, including new revenue opportunities,
more effective marketing, better customer
service.
–Examples: automated cars, healthcare apps,
entertainment apps, banking apps, government apps.
33. Technologies needed to meet challenges
posed by big data
1. Cheap and abundant storage.
2. Faster processor to help with quicker processing of
big data.
3. Affordable open-source, distributed big data
platforms such as Hadoop.
4. Parallel processing, clustering, virtualization, large
grid environments, high connectivity, high
throughputs and low latency.
5. Cloud computing and other flexible resource
allocation arrangements.
36. Data Science
• Data science is an interdisciplinary field that uses
scientific methods, processes, algorithms and
systems to extract knowledge and insights
from structured and unstructured data.
• It employs techniques and theories drawn from
many fields within the broad areas
of mathematics, statistics, information technology
including machine learning, probability
models, classification, cluster analysis, data
mining, databases, pattern recognition
and visualization.
37. Data Science
• Data Science is primarily used to make
predictions and decisions using
predictive, prescriptive analytics and machine
learning.
• Prescriptive analytics is all about providing advice:
it not only predicts but suggests a range of prescribed
actions and associated outcomes.
38. Data science – development of data
product
• A "data product" is a technical asset that: (1)
utilizes data as input, and (2) processes that data
to return algorithmically-generated results.
• The classic example of a data product is a
recommendation engine, which ingests user data,
and makes personalized recommendations based
on that data
39. Examples of data products:
• Amazon's recommendation engines suggest
items for you to buy. Netflix recommends movies
to you. Spotify recommends music to you.
• Gmail's spam filter is a data product – an algorithm
behind the scenes processes incoming mail and
determines whether a message is spam or not.
• Computer vision used for self-driving cars is also a
data product – machine learning algorithms
recognize traffic lights, other cars on the road,
pedestrians, etc.
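As a sketch of how such a data product works, here is a minimal keyword-frequency spam scorer in plain Python. The training messages and the 0.5 threshold are invented for illustration; real filters such as Gmail's use far more sophisticated statistical models.

```python
# A toy "data product": a spam scorer built from labelled messages.
# (Training data and threshold are invented for illustration.)
from collections import Counter

training = [
    ("win free prize now", 1),          # 1 = spam
    ("free money click now", 1),
    ("meeting agenda for monday", 0),   # 0 = not spam
    ("project status report", 0),
]

# Count how often each word appears in spam vs. non-spam messages.
spam_counts, ham_counts = Counter(), Counter()
for text, label in training:
    (spam_counts if label else ham_counts).update(text.split())

def spam_score(text):
    """Fraction of words seen more often in spam than in non-spam."""
    words = text.split()
    spammy = sum(1 for w in words if spam_counts[w] > ham_counts[w])
    return spammy / len(words)

def is_spam(text, threshold=0.5):
    return spam_score(text) > threshold

print(is_spam("win a free prize"))         # matches the spam vocabulary
print(is_spam("status report for monday"))
```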
40. The role of Data Science
• A self-driving car collects live data from sensors,
including radars, cameras, and lasers, to create a map of
its surroundings. Based on this data, it makes decisions
such as when to speed up, when to slow down, when to
overtake, and where to take a turn – making use of
advanced machine learning algorithms.
• Data Science can be used in predictive analytics: data
from ships, aircraft, radars, and satellites can be collected
and analyzed to build models. These models not only
forecast the weather but also help predict the
occurrence of natural calamities.
42. Difference between Data Analysis and
Data Science
• Data Analysis includes descriptive analytics and
prediction to a certain extent.
• On the other hand, Data Science is more about
Predictive Analytics and Machine Learning.
• Data Science is a more forward-looking approach, an
exploratory way with the focus on analyzing the past or
current data and predicting the future outcomes with
the aim of making informed decisions.
43. Use cases for Data Science
• Internet search
• Digital Advertisements
• Recommender Systems
• Image Recognition
• Speech Recognition
• Gaming
• Price Comparison Websites
• Airline Route Planning
• Fraud and Risk Detection
• Medical diagnosis, etc.
45. Data Science process
1. Collecting raw data from multiple disparate data
sources.
2. Processing the data
3. Integrating the data and preparing clean datasets
4. Engaging in exploratory data analysis using models (ML
models) and algorithms.
5. Preparing presentations using data visualization.
6. Communicating the findings to all stakeholders
7. Making faster and better decisions.
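The steps above can be sketched end to end in plain Python. The two "sources", the validation rule, and the city/sales fields are all hypothetical: collect, integrate and clean, explore with a trivial "model" (per-city averages), then report.

```python
# 1-2. Raw data collected from two hypothetical sources (with noise).
source_a = [{"city": "Pune", "sales": "120"}, {"city": "Pune", "sales": "bad"}]
source_b = [{"city": "Delhi", "sales": "200"}, {"city": "Delhi", "sales": "180"}]

# 3. Integrate and clean: merge the sources, drop records failing validation.
records = [r for r in source_a + source_b if r["sales"].isdigit()]

# 4. Explore with a simple model: average sales per city.
totals = {}
for r in records:
    city, amount = r["city"], int(r["sales"])
    count, total = totals.get(city, (0, 0))
    totals[city] = (count + 1, total + amount)
averages = {city: total / count for city, (count, total) in totals.items()}

# 5-7. Visualize/communicate: a plain-text report for stakeholders.
for city, avg in sorted(averages.items()):
    print(f"{city}: average sales {avg:.1f}")
```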
46. Business Acumen (wisdom) skills of
Data Scientist
• Understanding of domain
• Business strategy
• Problem solving
• Communication
• Presentation
• Thirst for knowledge
47. Technology Expertise of Data Scientist
• Good database knowledge such as RDBMS
• Good NoSQL database knowledge such as MongoDB,
Cassandra, HBase, etc.
• Languages such as Java, Python, R, C++, etc.
• Open-source tools such as Hadoop.
• Data warehousing
• Data Mining, Pattern recognition, algorithms
• Excellent understanding of machine learning techniques
and algorithms, such as k-means, regression, kNN, Naive
Bayes, SVM, PCA, and decision trees; visualization tools
such as Tableau, Flare, and Google visualization APIs;
text analytics, DL, NLP, AI, etc.
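Since the list names k-means, here is a minimal one-dimensional k-means sketch in plain Python. The data is a toy set, k = 2, and the starting centroids are hand-picked; a real implementation would also handle empty clusters and convergence checks.

```python
# Minimal 1-D k-means sketch (toy data; k = 2; centroids start at the extremes).
points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]
centroids = [points[0], points[-1]]

for _ in range(10):  # a few refinement iterations suffice for this data
    # Assignment step: attach each point to its nearest centroid.
    clusters = [[], []]
    for p in points:
        nearest = min(range(2), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    # Update step: move each centroid to the mean of its cluster.
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)
```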
49. Data Scientist
• A data scientist is a professional responsible for
collecting, analyzing, and interpreting large
amounts of data to identify ways to help a
business improve operations and gain a
competitive edge over competitors.
• They're part mathematician, part
computer scientist and part trend-spotter.
50. Responsibilities of Data Scientist
1. Prepares and integrates large and varied datasets
and develops relevant data sets for analysis.
2. Thoroughly cleans and prunes data to discard
irrelevant information.
3. Applies business/domain knowledge to provide
context.
4. Employs a blend of analytical techniques to
develop models and algorithms to understand
the data, interpret relationships, spot trends
and unveil patterns.
51. Responsibilities of Data Scientist
5. Communicates or presents findings or results in
the business context in a language that is
understood by the different business
stakeholders.
6. Invents new algorithms to solve problems and
builds new tools to automate work. (IIT)
52. Responsibilities of Data Scientist
7. Employs sophisticated analytics programs,
machine learning, and statistical methods to
prepare data for use in predictive and
prescriptive modeling.
8. Explores and examines data from a variety of
angles to determine hidden weaknesses, trends,
and/or opportunities.
53. Terminologies/Technologies used in big
data environment
1. In-memory analytics
2. In-database processing
3. Symmetric multiprocessor system (SMP)
4. Massively parallel processing
5. Parallel and distributed systems
6. Shared nothing architecture
54. In-memory analytics
• In-memory analytics is an approach to querying data
that resides in a computer's random access
memory (RAM), as opposed to querying data that is
stored on physical disks.
• This results in reduced query response times, allowing
analytic applications to support faster business
decisions.
• In-memory analytics is achieved through adoption of
64-bit architectures, which can handle more memory
and larger files compared to 32-bit.
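A concrete illustration with Python's standard-library sqlite3: connecting to ":memory:" keeps the entire database in RAM, so the query below never touches disk. The table and figures are invented.

```python
# In-memory analytics sketch: a ":memory:" database lives entirely in RAM,
# so queries avoid disk I/O. (Table and figures invented for illustration.)
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("north", 100.0), ("north", 150.0), ("south", 200.0)],
)

# The aggregation runs against RAM-resident data.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(rows)
conn.close()
```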
55. In-database processing
• In-database analytics is a technology that allows data
processing to be conducted within the database by
building analytic logic into the database itself.
• In-database processing, sometimes referred to as
in-database analytics, refers to the integration of
data analytics into data warehousing functionality.
• It eliminates the time and effort required to transform
data and move it back and forth between a database and
a separate analytics application.
• Example: credit card fraud detection, Bank risk
management, etc.
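The difference can be sketched with sqlite3 standing in for a data warehouse (the transactions are invented): the first approach ships every row to the application and aggregates there, while the second pushes the SUM into the database so only a one-row answer moves.

```python
# In-database processing sketch: push the analytic logic (SUM) into the
# database instead of moving every row out. (sqlite3 is a stand-in for a
# warehouse; the data is invented for illustration.)
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (account TEXT, amount REAL)")
conn.executemany("INSERT INTO txn VALUES (?, ?)",
                 [("a", 10.0), ("a", 20.0), ("b", 5.0)])

# Without in-database processing: fetch ALL rows, aggregate in the application.
rows = conn.execute("SELECT account, amount FROM txn").fetchall()
app_total = sum(amount for acct, amount in rows if acct == "a")

# With in-database processing: only the one-row answer leaves the database.
(db_total,) = conn.execute(
    "SELECT SUM(amount) FROM txn WHERE account = 'a'").fetchone()

print(app_total, db_total)  # same answer, very different data movement
conn.close()
```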
56. Symmetric multiprocessor system (SMP)
• SMP (symmetric multiprocessing) is the processing of
programs by multiple processors that share a common
operating system and memory.
• In symmetric (or "tightly coupled") multiprocessing, the
processors share memory and the I/O bus or data path.
• A single copy of the operating system is in charge of all
the processors (homogeneous).
58. Massively parallel processing (MPP)
• MPP (massively parallel processing) is the
coordinated processing of a problem (program) by
multiple processors that work on different parts of the
program in parallel, with each processor having its own
operating system and dedicated memory.
• The MPP processors communicate using message
passing.
• Typically, the setup for MPP is more complicated,
requiring thought about how to partition a common
database among processors and how to assign work
among the processors. An MPP system is also known as
a "loosely coupled" or "shared nothing" system.
59. Massively parallel processing (MPP)
• An MPP system is considered better than a symmetric
multiprocessing (SMP) system for applications that
allow a number of databases to be searched in
parallel. These include decision support systems, data
warehouses, and big data applications.
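A rough simulation of the MPP pattern in Python: each "processor" owns one partition of the data and communicates only by sending its local result over a queue. Threads stand in for separate machines here; a real MPP system shares no memory at all.

```python
# MPP sketch: each "node" works on its own partition and communicates
# only by message passing over the "interconnect" (a queue).
import threading
import queue

data = list(range(1, 101))                   # 1..100; true sum is 5050
partitions = [data[i::4] for i in range(4)]  # one partition per node
results = queue.Queue()                      # simulated interconnect

def node(partition):
    # Each node computes on its own data and sends only its local result.
    results.put(sum(partition))

threads = [threading.Thread(target=node, args=(p,)) for p in partitions]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(results.get() for _ in range(4))
print(total)
```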
64. Shared Nothing architecture (SNA)
• A shared nothing architecture (SN) is a distributed
computing architecture in which each node is independent
and self-sufficient. More specifically, none of the nodes share
memory or disk storage.
• Shared-nothing is often called massively parallel
processing (MPP). Many research prototypes and
commercial products have adopted the shared-nothing
architecture because it has the best scalability.
• In the shared-nothing architecture, each node consists of
a processor, main memory, and disk, and communicates
with other nodes through the interconnection network.
Each node runs its own copy of the operating system.
65. SNA advantages
1. Fault isolation
2. Scalability
3. Absence of single point of failure
4. Self-healing capabilities
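A minimal sketch of shared-nothing data placement: keys are hash-partitioned across independent node-local stores, so every key has exactly one owner and no memory or disk is shared. The node count and keys are invented.

```python
# Shared-nothing sketch: hash-partition keys across independent nodes.
NUM_NODES = 3
nodes = [{} for _ in range(NUM_NODES)]  # each dict = one node's private store

def owner(key):
    # Route a key to exactly one node; no other node ever stores it.
    # (str hashes vary per run, but routing is consistent within a run.)
    return hash(key) % NUM_NODES

def put(key, value):
    nodes[owner(key)][key] = value

def get(key):
    return nodes[owner(key)].get(key)

for k in ["alpha", "beta", "gamma", "delta"]:
    put(k, len(k))

print(get("gamma"), [len(n) for n in nodes])
```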
67. CAP Theorem
• The CAP theorem, also named Brewer's theorem after
computer scientist Eric Brewer, states that it is impossible for
a distributed data store to simultaneously provide more than
two out of the following three guarantees:
1. Consistency: Every read receives the most recent write.
2. Availability: Every request receives a response
(success or failure).
3. Partition tolerance: System continues to work despite
message loss or partial failure or N/W partition.
• Distributed data store: a computer network where
information is stored on more than one node, often in
a replicated fashion. It is also known as a distributed
database.
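A toy simulation of the trade-off: with two replicas and a network partition, an "AP" read stays available but may return a stale value, while a "CP" read refuses rather than risk inconsistency. The Replica class and the mode flags are invented for illustration.

```python
# CAP sketch: during a partition, a two-replica store chooses between
# availability (serve possibly stale data) and consistency (refuse).

class Replica:
    def __init__(self):
        self.data = {}

primary, secondary = Replica(), Replica()
partitioned = False  # becomes True when the replicas cannot talk

def write(key, value):
    primary.data[key] = value
    if not partitioned:
        secondary.data[key] = value  # replication works only when connected

def read(key, mode):
    if partitioned and mode == "CP":
        return None  # consistency chosen: refuse rather than risk staleness
    return secondary.data.get(key)  # availability chosen: may be stale

write("x", 1)
partitioned = True
write("x", 2)             # this update cannot reach the secondary

print(read("x", "AP"))    # stale value served
print(read("x", "CP"))    # request refused
```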
69. Possible combinations of CAP for databases
1. Availability and Partition Tolerance (AP)
2. Consistency and Partition Tolerance (CP)
3. Consistency and Availability (CA)
Note: Google’s BigTable, Amazon’s Dynamo, and
Facebook’s Cassandra each use one of these combinations.
71. BASE
• BASE stands for Basically Available, Soft state, Eventual
consistency, and is used to achieve high availability in
distributed computing.
• Basically available indicates that the system guarantees
availability, in terms of the CAP theorem.
• Soft state indicates that the state of the system may
change over time.
• Eventual consistency indicates that the system will
become consistent over time, i.e., after a certain time all
nodes become consistent.
72. Few top analytics tools
1. MS Excel
2. SAS
3. IBM SPSS Modeler
4. Statistica
5. QlikView
6. Tableau
7. R analytics
8. Weka
9. Apache Spark
10. KNIME, RapidMiner
11. Splunk and so on.