3. Understanding Decision Tree Algorithm using R Programming Language

Presented By:
Akshat Sharma #1, Anuj Srivastava #2
# Computer Science Department, Integral University, Lucknow, Uttar Pradesh, India
1 mailidakshat@gmail.com
2 anuj@iul.ac.in
4. Abstract
The decision tree is one of the most efficient techniques for carrying out data mining, and it can be easily implemented using R, a powerful statistical tool used by more than 2 million statisticians and data scientists worldwide. Decision trees can be used in a variety of disciplines, such as predicting which patient characteristics are associated with a high risk of a disease. Used together, they let us find the relevant subset of data within the large volumes stored in Enterprise Data Warehouses (EDWs). This gives organizations new opportunities to derive value from their most valuable and abundantly available asset, information, and to create a competitive edge.
5. Introduction
• The first three phases of the Data Analytics Lifecycle (discovery, data preparation, and model planning) involve various aspects of data exploration. The success of a data analysis project requires a deep understanding of the data; it also requires a tool for mining and presenting that data. We can use the decision tree as a tool for data mining and R for presenting the data.
• Data Mining is an analytic process designed to explore data (usually large amounts of data, also known as "big data") in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. The ultimate goal of data mining is prediction. Predictive data mining is the most common type and the one with the most direct business applications. [1]
6. • Fundamental learning methods that appear in applications related to data mining are:
1. Clustering
2. Association Rules
3. Regression
4. Classification
5. Time Series Analysis
6. Text Analysis
7. Introduction to ‘R’
• The ‘R’ language is a project designed to create a free, open-source language that can be used as a replacement for the ‘S’ language. ‘R’ was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team, of which Chambers is a member. ‘R’ first appeared in 1993.
• ‘R’ is named partly after the first names of its first two authors and partly as a play on the name of ‘S’.
• ‘R’ is an open-source implementation of ‘S’, and differs from S-PLUS largely in its command-line-only format.
• ‘S’ is a statistical programming language developed primarily by John Chambers, Rick Becker, and Allan Wilks of Bell Laboratories.
8. Figure 1. A schematic view of how ‘R’ works. [2]
9. Installing R
• R is freely downloadable from http://cran.r-project.org/ , as is a wide range of documentation.
• Most users should download and install a binary version: a version that has been translated (by compilers) into machine language for execution on a particular type of computer with a particular operating system. Installation on Microsoft Windows is straightforward.
• A binary version is available for Windows 98 or above from the web page http://cran.r-project.org/bin/windows/base .
11. Packages in ‘R’
• The most important single innovation in ‘R’ is the package system, which provides a cross-platform system for distributing and testing code and data.
• The Comprehensive R Archive Network (http://cran.r-project.org) distributes public packages, but packages are also useful for internal distribution.
• A package consists of a directory with a DESCRIPTION file and subdirectories with code, data, documentation, etc. The Writing R Extensions manual documents the package system, and package.skeleton() simplifies package creation. [3]
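As a hedged illustration of the package system described above, the snippet below attaches an installed package and then scaffolds a brand-new package directory with package.skeleton(). The package name "mypkg" and the function f are invented for this example; only library() and package.skeleton() come from the slides.

```r
# Sketch of the package workflow: attach an existing package, then
# scaffold a new one. "mypkg" and f() are illustrative names only.
# install.packages("rpart")   # would fetch a public package from CRAN
library(rpart)                # attach a package already installed locally

f <- function(x) x^2          # an object to place in the new package
pkg_dir <- tempdir()
package.skeleton(name = "mypkg", list = "f", path = pkg_dir)
list.files(file.path(pkg_dir, "mypkg"))
# typically: "DESCRIPTION" "man" "NAMESPACE" "R" "Read-and-delete-me"
```

The generated DESCRIPTION file and R/ and man/ subdirectories are exactly the package layout the slide describes; editing them by hand is the next step the Writing R Extensions manual covers.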
12. Table 1. ‘R’ packages loaded on startup. [4]

Package     Description
base        Base R functions
datasets    Base R datasets
grDevices   Graphics devices for base and grid graphics
graphics    R functions for base graphics
methods     Formally defined methods and classes for R objects
stats       R statistical functions
utils       R utility functions

The packages shown in Table 1 are loaded on ‘R’ startup.
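The startup packages of Table 1 can be confirmed from a live session with search(), which lists everything attached to the search path (the exact entries and their order vary with the R version and startup options, so the output shown is only indicative):

```r
# The packages of Table 1 appear on the search path of a fresh R session.
search()
# e.g. ".GlobalEnv"        "package:stats"    "package:graphics"
#      "package:grDevices" "package:utils"    "package:datasets"
#      "package:methods"   "Autoloads"        "package:base"
```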
13. Applications of ‘R’
1. Clinical Trials
2. Cluster Analysis
3. Computational Physics
4. Differential Equations
5. Econometrics
6. Environmental Studies
7. Experimental Design
8. Finance
9. Genetics
10. Graphical Models
11. Graphics and Visualizations
12. High Performance Computing
13. High Throughput Genomics
14. Machine Learning
15. Medical Imaging
16. Meta Analysis
17. Multivariate Statistics
18. Natural Language Processing
19. Official Statistics
20. Optimization
15. INTRODUCTION TO DECISION TREE
• A decision tree is also called a prediction tree. A decision tree uses a structure to specify sequences of decisions and consequences. Given input X = {X1, X2, ..., Xn}, the goal is to predict a response or output variable Y. [5] Each member of the set {X1, X2, ..., Xn} is called an input variable.
• A decision tree consists of nodes and thus forms a rooted tree: a directed tree with a node called the root. The root node has no incoming edges; every other node in a decision tree has exactly one incoming edge. An internal node, also known as a test node, is a node with one incoming edge and outgoing edges. Nodes with no outgoing edges are known as leaves or terminal nodes.
16. • Figure 2 describes a decision tree that reasons whether to send an alert or a warning when an error is triggered.
• In Figure 2, to depict the tree more clearly, we can represent the internal nodes as circles and the leaves as triangles, to tell them apart at a glance. Each node is labeled with the attribute it tests, and its branches are labeled with the corresponding values. The leaf nodes represent class labels, i.e. the outcome of all the prior decisions.
Figure 2
17. THE GENERAL DECISION TREE ALGORITHM
• The objective of a decision tree algorithm is to construct a tree ‘T’ from a training set ‘S’. If all the records in ‘S’ belong to some class ‘C’, or if ‘S’ is sufficiently pure, then that node is considered a leaf node and assigned the label ‘C’. The purity of a node is defined as its probability of the corresponding class. [6]
• The algorithm constructs subtrees for the subsets of ‘S’ recursively until one of the following criteria is met: [7]
1. All the leaf nodes in the tree satisfy the minimum purity threshold.
2. The tree cannot be further split with the preset minimum purity threshold.
3. Any other stopping criterion is satisfied (such as the maximum depth of the tree).
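The recursive scheme above can be sketched directly in R. This is a toy illustration, not the algorithm of any particular package: grow_tree, min_purity, and the median-threshold split rule are all invented for the example, and only a single numeric predictor x is handled.

```r
# Toy recursive tree grower for a data frame with numeric x and factor y.
# It stops on the purity threshold or maximum depth, per the criteria above.
grow_tree <- function(d, min_purity = 0.9, max_depth = 3, depth = 0) {
  counts <- table(d$y)
  purity <- max(counts) / nrow(d)       # probability of the majority class
  label  <- names(which.max(counts))
  if (purity >= min_purity || depth >= max_depth)
    return(list(leaf = TRUE, label = label))     # criteria 1 and 3
  cut   <- median(d$x)                  # naive split point
  left  <- d[d$x <= cut, ]
  right <- d[d$x >  cut, ]
  if (nrow(left) == 0 || nrow(right) == 0)
    return(list(leaf = TRUE, label = label))     # criterion 2: no usable split
  list(leaf = FALSE, cut = cut,
       left  = grow_tree(left,  min_purity, max_depth, depth + 1),
       right = grow_tree(right, min_purity, max_depth, depth + 1))
}

# Two well-separated classes split cleanly at the median of x:
d <- data.frame(x = c(1, 2, 3, 10, 11, 12),
                y = factor(c("a", "a", "a", "b", "b", "b")))
tree <- grow_tree(d)
```

Real implementations differ mainly in how the split point is chosen (by impurity measures such as the entropy defined later in this deck) rather than in this recursive skeleton.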
18. DECISION TREES IN ‘R’
• In ‘R’, the ‘rpart’ package is used for modelling decision trees, and an optional package ‘rpart.plot’ enables the plotting of a tree. [8] The rpart package can be used for classification by decision trees and can also be used to generate regression trees.
• To grow a tree, use: [9]
rpart(formula, data=, method=, control=)
where formula is the model formula (response ~ predictors), data= names the data frame, method= selects the tree type ("class" for classification, "anova" for regression), and control= takes optional parameters, via rpart.control(), that govern tree growth.
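A minimal end-to-end sketch of the call above, using R's built-in iris data set. The data set, the "class" method, and the control values are chosen here for illustration; they are not prescribed by the slides.

```r
# Grow a classification tree on iris with rpart, then inspect and predict.
library(rpart)

fit <- rpart(Species ~ ., data = iris, method = "class",
             control = rpart.control(minsplit = 20, cp = 0.01))
print(fit)                                  # text view of the splits
pred <- predict(fit, iris, type = "class")  # predicted class labels
mean(pred == iris$Species)                  # training accuracy
# optionally: library(rpart.plot); rpart.plot(fit) draws the tree
```

Note that accuracy on the training data overstates real performance; in practice the tree would be evaluated on held-out data.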
19. • Entropy and information gain are common technical terms/notations used in decision trees.
• Entropy is a measure of the number of random ways in which a system may be arranged. For a data set ‘S’ whose records fall into ‘n’ classes, the information entropy is defined as [10]
H(S) = - sum_{i=1..n} p_i * log2(p_i)
where p_i is the proportion of records in ‘S’ belonging to class i.
• Let H_X be the entropy of an attribute X, defined as [11]
H_X = - sum_x P_x * log2(P_x)
(here P_x is the proportion of ‘S’ for which X takes the value x).
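The entropy definition above is a one-liner in R. The function name entropy() is chosen for this example only:

```r
# Entropy of a vector of class labels, per the formula above:
# H(S) = -sum(p_i * log2(p_i)) over the class proportions p_i.
entropy <- function(y) {
  p <- table(y) / length(y)    # class proportions
  -sum(p * log2(p))
}

entropy(c("yes", "yes", "no", "no"))    # 1 bit: a 50/50 split
entropy(rep("yes", 4))                  # 0 bits: a perfectly pure node
```

Information gain for a candidate split is then simply the parent node's entropy minus the size-weighted average entropy of the child nodes.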
20. Conclusion
• ‘R’ is the best at what it does: letting experts quickly and easily interpret, interact with, and visualize data.
• The use of ‘R’ for implementing decision trees for classification in data mining is highly beneficial over SAS and MATLAB because:
i. ‘R’ is open source,
ii. ‘R’ has large library support,
iii. ‘R’ supports visualization,
iv. all of this, while being inexpensive in comparison to both.
• Thanks to the worldwide community effort, ‘R’ keeps growing, with thousands of user-created packages built and added to enhance ‘R’ functionality.
21. References
[1] Data Mining Techniques. [Online]. Available: http://documents.software.dell.com/Statistics/Textbook/Data-Mining-Techniques [Accessed 21 November 2015].
[2] Emmanuel Paradis, “R for Beginners”, CRAN, Figure 1, pg. 8. [Online]. Available: https://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf [Accessed 22 August 2015].
[3] Thomas Lumley, “R Fundamentals and Programming Techniques,” in R Packages, January 2015, pg. 416. [Online].
[4] Aedin Culhane, Harvard School of Public Health, BIO503, January 2013, “Introduction to Programming and Statistical Modelling in R”. [Online]. Available: http://isites.harvard.edu/fs/docs/icb.topic1202070.files/Bio503_winter13.pdf [Accessed 22 August 2015].
[5] EMC Education Services, “Data Science and Big Data Analytics,” in Advanced Analytical Theory and Methods: Classification, January 2015, pg. 192.
[6] EMC Education Services, “Data Science and Big Data Analytics,” in Advanced Analytical Theory and Methods: Classification, January 2015, pg. 197.
[7] EMC Education Services, “Data Science and Big Data Analytics,” in Advanced Analytical Theory and Methods: Classification, January 2015, pg. 197.
[8] EMC Education Services, “Data Science and Big Data Analytics,” in Advanced Analytical Theory and Methods: Classification, January 2015, pg. 206.
[9] Data Mining Algorithms In R / Classification / Decision Trees. [Online]. Available: https://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Classification/Decision_Trees [Accessed 27 November 2015].
[10] Venkatadri M., Lokanatha C. Reddy. (2010, Apr.-Sept.). A comparative study on decision tree classification algorithms in data mining. International Journal of Computer Application in Engineering, Technology and Sciences (IJ-CA-ETS). [Online]. 1(1), pg. 1. Available: https://www.academia.edu/
[11] Anuj Srivastava, Dr. Vinodini Katiyar, Navjot Singh. (2015, June). A Review of Decision Tree Algorithm: Big Data Analytics. International Journal of Informative & Futuristic Research. ISSN (Online): 2347-1697, pg. 6.