1) Data analytics treats available digital data as a "gold mine" from which tangible outputs are obtained; applying those outputs can improve business efficiency. Machine learning is the tool, in the form y = f(x), that learns the relationship among the parameters in the data and keeps refining it.
2) The document provides an overview of getting started in data science, covering business objectives, statistical analysis, programming tools like R and Python, and problem-solving approaches like supervised and unsupervised learning.
3) It describes the iterative "rule of seven" process for data science projects, including collecting/preparing data, exploring/analyzing it, transforming features, applying models, evaluating performance, and visualizing results.
Data analytics: first steps
1. FIRST STEPS IN DATA SCIENCE
Tips and tools for wannabe data analysts
By Sheshachalam Ratnala
2. Data analytics Aka Machine Learning
Data analytics is an area in which the available digital data is treated as a gold mine from which tangible output is obtained; when applied, that output affects a business and its efficiency.
Machine learning is the tool, in the form y = f(x), that correlates all the parameters in the data to obtain a relation, learns that relation from the parameters, and keeps improving it.
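As a minimal sketch of the y = f(x) idea, assuming f is a straight line, the relation can be learned from data by ordinary least squares. The function name `fit_line` and the four data points are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data lying exactly on y = 2x + 1 (an assumption).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

With more observations the same fit can simply be rerun, which is one simple reading of "keeps improving the relationship".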
3. Data analytics Aka Machine Learning
Data: a set of values of quantitative and qualitative variables; historical information or knowledge represented in usable form.
Population - The entire group
The collection of data that represents the whole of the problem domain.
Sample - A portion of the group
A subset of the population taken for inference; it should be a true representation of the overall population.
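The population/sample distinction can be illustrated with a short sketch. The population below is synthetic (10,000 values drawn from a normal distribution), and the sample size of 500 is an arbitrary assumption; a simple random sample is only one of several sampling schemes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 synthetic measurements.
population = [random.gauss(50.0, 10.0) for _ in range(10_000)]

# Simple random sample without replacement.
sample = random.sample(population, k=500)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)
# A representative sample has a mean close to the population mean.
gap = abs(sample_mean - population_mean)
```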
4. Data analytics – How to start
Data science or data analytics, whatever name you know it by, has essentially 3 areas to cover:
Business
Statistics
Programming
5. Data analytics – How to start
Business – Critical thinking
1. Objective analysis and evaluation of an issue in order to form a judgement
2. This is the stage to build the hypothesis for the problem domain in context
3. The model below could be a way to follow
6. Data analytics – How to start
Statistics – Mathematical Analysis
Data is treated as variables, and the hierarchy is as follows:
Data (Variables)
  Numerical (Quantitative)
    Discrete: whole numbers, e.g. 5, 10
    Continuous: any value within a permitted range, e.g. 5.3, 5.35, 5.45, 6.0
  Categorical (Qualitative)
    Ordinal (logically ordered): e.g. Low; Med; High
    Nominal (unordered): e.g. Male; Female, or different types of 4-wheelers
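The hierarchy above can be expressed as a rough heuristic, shown here only as a sketch. The function name `variable_kind` and its rules are assumptions for illustration: real data sets need domain knowledge to tell, for example, an integer code for a category apart from a true count.

```python
def variable_kind(values, ordered_levels=None):
    """Guess where a column sits in the variable hierarchy (heuristic only)."""
    def is_int(v):
        return isinstance(v, int) and not isinstance(v, bool)

    def is_number(v):
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if all(is_int(v) for v in values):
        return "numerical/discrete"
    if all(is_number(v) for v in values):
        return "numerical/continuous"
    # Categorical: ordinal only if the caller supplies a logical ordering.
    if ordered_levels is not None and set(values) <= set(ordered_levels):
        return "categorical/ordinal"
    return "categorical/nominal"


kinds = [
    variable_kind([5, 10]),
    variable_kind([5.3, 5.35, 5.45, 6.0]),
    variable_kind(["Low", "High"], ordered_levels=["Low", "Med", "High"]),
    variable_kind(["Male", "Female"]),
]
```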
7. Data analytics – How to start
Programming - Execution
R is a widely used tool because of its long history of statistical use and its abundant statistical libraries.
Python, an interpreted language, provides a wide variety of packages for application development as well as statistical libraries.
Data ingestion tools: Spark, Hadoop
8. Data analytics – Problem perspective
Classifying the problem
Solution Hypothesis
  Supervised Learning
    Numerical data (target variable): Regression
      Linear Regression, Time Series
    Categorical data (target variable): Classification
      Decision Trees, Random Forest, K-NN, Logistic Regression
  Demand Forecasting
  Reinforcement learning
  Semi-Supervised
  NLP and AI
  Unsupervised
    Clustering
      K-Means, Hierarchical clustering
    Dimensionality Reduction
    Collaborative filtering
9. Data analytics – Problem Complexity
The solution complexity and data volume increase with the kind of business value being generated.
Credits: odoscope: Overview of analytics methods
10. Data analytics – The execution
Structuring the data
Basic Terminology
• Attribute - Features are quantitative attributes of the samples being observed
• Axis - Features act as axes of their feature space, provided they are linearly independent
• Column - Features are represented as columns in your dataset
• Dimension - A dataset's features, grouped together, can be treated as an n-dimensional coordinate space
• Input - Feature values are the input of data-driven machine learning algorithms
• Predictor - Features used to predict other attributes are called predictors
• View - Each feature conveys a quantitative trait or perspective about the sample being observed
• Independent Variable - Autonomous features used to calculate others are like independent variables in algebraic equations
11. Data analytics – The execution
The rule of Seven
The steps are iterative at any stage
• Data collection (Problem context)
• Data wrangling / data munging (Data cleaning)
• Data exploring / analysis
• Data transforming
• Modelling
• Model evaluation
• Data visualization (Intelligence)
Machine learning models work only on clean, structured data. 5 out of 7 steps relate to pre-processing the data given to the model.
12. Data analytics – The execution
1. Data collection / selection
1. No bias in the data features
2. Relevant data features
3. Techniques to handle
  a) Data collection:
    1. Data from sources related to the problem, i.e. DBs, weblogs, emails, etc.
    2. Any audio, video, sensor data, etc.
    3. The 6 Vs of data: Variety, Velocity, Veracity, Volume, Value, Viability
  b) Data selection:
    1. PCA: unsupervised data
    2. LDA (linear discriminant analysis): supervised data
13. Data analytics – The execution
2. Data cleaning (Garbage in, Garbage Out)
1. The data obtained is not clean and has the issues below:
  1. Outliers
  2. Missing data
  3. Malicious data
  4. Erroneous data
  5. Irrelevant data
  6. Inconsistent data
  7. Needs formatting
2. Techniques to handle
  1. Impute values by mean, median or mode
  2. Treat outliers by deleting the row if it is not at all related; otherwise analyze with more data
  3. Binning
  4. Creating new features from given features
  5. Dummy variables
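Three of the cleaning techniques listed for this step (imputation, binning and dummy variables) can be sketched in plain Python. The helper names, the age values and the bin edges are all hypothetical; in practice a library such as Pandas would usually do this work.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def bin_value(value, edges, labels):
    """Return the label of the first bin whose upper edge is >= value.

    labels must have one more entry than edges; the last label
    catches everything above the final edge.
    """
    for edge, label in zip(edges, labels):
        if value <= edge:
            return label
    return labels[-1]


def dummy_variables(categories):
    """One-hot encode a list of category labels."""
    levels = sorted(set(categories))
    return [{level: int(c == level) for level in levels} for c in categories]


ages = [22, None, 35, 58, None, 41]           # made-up column with gaps
filled = impute_median(ages)                  # gaps become the median, 38.0
groups = [bin_value(a, [30, 50], ["young", "middle", "senior"]) for a in filled]
encoded = dummy_variables(groups)
```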
14. Data analytics – The execution
3. Data Analysis (Data exploring)
1. Find the relevance of the feature set. Apply all the basic statistical exploration, i.e. moments
2. Obtain the statistical relation.
3. Perform basic visualizations to obtain the concrete feature set.
4. Techniques to handle
  1. Univariate analysis (mean, mode, normal distribution, variance, skewness, kurtosis)
  2. Bivariate analysis (scatter plot, box plot, histogram)
  3. Multivariate analysis (probability distribution functions, PDFs)
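The moments mentioned under univariate analysis can be computed directly. This is a minimal sketch using population (not sample-corrected) formulas and reporting excess kurtosis, so a normal distribution scores 0; both choices are assumptions, and the function name `moments` is hypothetical.

```python
def moments(values):
    """Return mean, population variance, skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, m2, skewness, excess_kurtosis


mean, variance, skewness, kurtosis = moments([1, 2, 3, 4, 5])
```

A symmetric column such as the one above has zero skewness; a clearly non-zero value is a hint that a transformation may be worth trying in the next step.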
16. Data analytics – The execution
4. Data Transformation (Data on the same scale)
1. Ensure that the remaining features are informative. Transformation changes the number of features or the feature values; this is also known as feature engineering
2. Dimensionality reduction
3. Curse of dimensionality
4. Techniques to handle
  1. PCA: principal component analysis
  2. Kernel trick
  3. Normalization
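Putting data "on the same scale" is commonly done with min-max scaling or z-score standardization. The sketch below shows both; which one the slides intend by "Normalization" is not stated, so treating these two as the candidates is an assumption.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center values on 0 with unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


scaled = min_max_scale([10, 20, 30])
standardized = z_score([10, 20, 30])
```

Both helpers divide by zero when every value in the column is identical; a constant column carries no information and is usually dropped before scaling.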
17. Data analytics – The execution
6. Machine learning modeling
1. Split the data into train and test sets.
2. Keep some data that is never used for training or testing, or obtain a fresh sample; this is termed “out of sample”.
3. Apply the appropriate ML algorithm to the train data.
4. Check the accuracy with the test data.
5. Observe the bias and variance
  a) Bias is how far the predicted value is from the actual value
  b) Variance is how widely the predicted values are spread
  c) Error = Bias² + Variance + irreducible error
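The train/test split in step 1 can be sketched as follows. The function name, the 25% test fraction and the fixed seed are assumptions made for illustration; scikit-learn offers a similarly named utility, but this sketch does not use it.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))  # stand-in for 20 records
train, test = train_test_split(rows)
```

An "out of sample" set can be carved off the same way by splitting once more before any modelling starts and leaving that portion untouched until the end.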
18. Data analytics – The execution
6.1 Machine learning modeling
2. Apply the appropriate algorithm as described by the solution hypothesis
Ref: cheatsheet
19. Data analytics – The execution
6.2 Machine learning model
1. Model Performance
  1. Model validation
    1. MSE (mean squared error)
    2. Hypothesis testing
    3. Cross-validation
  2. Algorithm tuning
    1. Tuning the coefficient parameters
    2. Increasing the splits
  3. Feature engineering (iterate again for features)
  4. Cross-validation
    1. K-Fold
  5. Ensemble method (combining the ML algorithms)
    1. Voting (selection based on voting on performance)
    2. Bagging (bootstrapping + aggregating)
    3. Boosting (weak learner to strong learner)
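Of the ensemble methods listed, voting is the simplest to sketch. The three threshold "models" below are hypothetical stand-ins for trained classifiers, and plain majority voting is assumed; weighted voting, bagging and boosting are not shown.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models for one sample."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(models, sample):
    """Ask every model for a label and combine the answers by majority."""
    return majority_vote([model(sample) for model in models])


# Three hypothetical weak classifiers that disagree on the threshold.
models = [
    lambda x: "high" if x > 10 else "low",
    lambda x: "high" if x > 20 else "low",
    lambda x: "high" if x > 30 else "low",
]

label_25 = ensemble_predict(models, 25)  # votes: high, high, low
label_15 = ensemble_predict(models, 15)  # votes: high, low, low
```

Using an odd number of models for a two-class problem avoids ties, which this sketch does not otherwise resolve in any principled way.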
20. Data analytics Aka Machine Learning
6.3.1 Machine learning model performance
1. Confusion matrix (hypothesis testing)
Measurement terms
1. Precision
2. Recall
3. Accuracy
4. Specificity
5. False positive rate (fall-out)
6. False negative rate (miss rate)
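The six measurement terms follow directly from the four cells of a binary confusion matrix. The sketch below uses made-up labels, with 1 as the positive class; the function names are hypothetical, and no guard is included for empty denominators.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive and a != positive:
            fp += 1
        elif p != positive and a != positive:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Derive the six measurement terms from the confusion-matrix cells."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp),
        "fall_out": fp / (fp + tn),    # false positive rate
        "miss_rate": fn / (fn + tp),   # false negative rate
    }


actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]     # illustrative ground truth
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]  # illustrative model output
tp, fp, tn, fn = confusion_counts(actual, predicted)
metrics = classification_metrics(tp, fp, tn, fn)
```

Note that fall-out is 1 minus specificity and miss rate is 1 minus recall, so only four of the six terms are independent.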
21. Data analytics Aka Machine Learning
6.3.2 Machine learning model performance
1. Cross-fold validations
• Random division of the data set into subsets
• ML algorithm check for each subset
• Overall efficiency as the final accuracy of the model
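The three bullets of cross-fold validation map onto a short k-fold routine: divide the data randomly, check the algorithm on each subset, and average the scores. In this sketch the score is mean squared error, and the "model" is a hypothetical baseline that always predicts the training mean; all names are assumptions for illustration.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k random folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(xs, ys, k, fit, predict):
    """Return the average of the per-fold mean squared errors."""
    fold_errors = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)


def fit_mean(train_x, train_y):
    """Hypothetical baseline: remember the mean of the training targets."""
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    """Predict the stored mean regardless of the input."""
    return model


xs = list(range(10))
ys = [2.0 * x for x in xs]
score = cross_validate(xs, ys, k=5, fit=fit_mean, predict=predict_mean)
```

Any pair of `fit` and `predict` callables with the same shape can be dropped in, which makes it easy to compare candidate algorithms on identical folds.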
22. Data analytics Aka Machine Learning
7. Data Visualization
1. Telling the story of the data analysis as descriptive, prescriptive or predictive
2. Effective use of visual graphs.
3. Tools like Tableau, D3.js, Matplotlib, Chart.js
23. Data analytics Aka Machine Learning
Tools in practice
Core – Python libraries
NumPy (Mathematical computing functions / N-dimensional array)
Pandas (Data analysis, data munging by in-memory data representation)
Matplotlib (2D visualization library)
Scikit-learn (Machine learning algos)
For a high-level language user, Python is the best tool available to use
24. Data analytics Aka Machine Learning
Tools sources
1. Anaconda
  1. Use IPython universal editor
  2. Python 2.7+ or 3.5
  3. Be careful about the version, because function support differs between versions
  4. A good starting tool
  5. Spyder: interactive editor tool for basic Python learning
2. Enthought Canopy
  1. Interactive environment
3. PyCharm by JetBrains: interactive IDE and debugger tool
25. Data analytics Aka Machine Learning
Tools cheat sheets
Must visit sites
KdNuggets
Kaggle
DatascienceCentral
DataCamp
https://www.class-central.com/
http://analyticsvidhya.com/
https://www.odsc.com/
http://www.pythonlearn.com/
http://datascienceplus.com/
Practice data sets
http://ipython-books.github.io/minibook/
http://learnds.com/
https://vincentarelbundock.github.io/Rdatasets/