Distributed Logistic Model Trees

•

0 likes•1,779 views

Classification algorithms play an important role in different business areas, such as fraud detection, cross selling or customer behavior. In the business context, interpretability is a very desirable property, sometimes even a hard requirement. However, interpretable algorithms are usually outperformed by other non-interpretable algorithms such as Random Forest. In this talk Antonio Soriano and Mateo Alvarez presented a distributed implementation in Spark of the Logistic Model Tree (LMT) algorithm (Landwehr, et al. (2005). Machine Learning, 59(1-2), 161-205.), which consists of a decision tree with logistic classifiers in the leaves. While being highly interpretable, the LMT consistently performs equally or better than other popular algorithms in several performance metrics such as accuracy, precision/recall or area under the ROC curve.

Data & Analytics

Distributed Logistic Model Trees, Stratio Intelligence
Mateo Álvarez and Antonio Soriano
@StratioBD

Aerospace Engineer, MSc in
Propulsion Systems (UPM), Master
in Data Science (URJC).
Working as data scientist and Big
Data developer at Stratio Big Data in
the data science department
mateo-alvarez

Ph.D. in Telecommunications, MSc in
Electronic Systems Engineering and
Telecommunication Technologies,
Systems and Networks (UPV), and MSc
“Big Data Expert” (UTAD).
Working as data scientist and Big Data
developer at at Stratio Big Data in the
data science department
@Phd_A_Soriano

Why using interpretable algorithms instead of “black boxes”
Logistic Regression
Decision Trees
Variance-Bias tradeoff
Metrics
Demo
Logistic Model Trees
Distributed implementation
Cost function & configuration params
Demo

• Why use interpretable algorithms instead of “black boxes”
• Logistic Regression
• Decision Trees
• Variance-Bias tradeoff
@StratioBD

Accuracy Explainability
VS
Medical Studies Power management Financial environment Criminal activity

Threshold
Probability
Local bad adjust
Feature

Feature 1
Root node
Feature 2 Feature 2
Leaf node Leaf node Leaf node Leaf node

Feature 1
Root node
Feature 2 Feature 2
Leaf node Leaf node Leaf node Leaf node
Local bad adjust

MeanError
Model complexity
Test error
Training error
Model complexity
Variance
Total error
Bias
2
Error
OverfittingUnderfitting

Missing important variables for the problem
to make the predictions
Variance
Bias

Overfitting to the sample/training data
Variance
Bias

Irreducible error on prediction
Variance
Bias

• Logistic Model Trees
• Distributed implementation
• Cost function & configuration parametres
• Demo
@StratioBD

Performance
metrics
Feature 1
Root node
Performance
metrics
Performance
metrics

Feature 1
Root node
Performance
metrics
Performance
metrics
Performance
metrics
Performance
metrics
Performance
metrics
Performance
metrics

DISTRIBUTED IMPLEMENTATION
Spark’s Decision Tree
(distributed implementation of random forests)
Spark’s Logistic Regression / weka’s
Logistic Regression on the nodes

LMT Cost function to fix the logistic regression threshold
• AccuracyCostFunction
• ConfusionMatrix
• PrecisionCostFunction
• PrecisionRecallCostFunction
• RocCostFunction
The same cost function for pruning criteria
Performance
metrics
Performance
metrics
Performance
metrics

Big datasets
Power of spark to distribute building the
tree and logistic regressions
ADVANTAGES OF THIS IMPLEMENTATION
Medium datasets
Distributed tree growth and weka’s
logistic regression
Small datasets
Although it can be slow to
distribute the data for the decision
tree, cost functions can be still
used and specific optimization for
particular cases

Example of DLMT algorithm
in a synthetic dataset

PREDICTION
Positive Negative
TRUE
CONDITION
Positive True Positives False Negatives
Negative False Positives True Negatives
Precission
True Positive Rate (Recall)
False Positive Rate
TPR = TP/(TP+FN) Insensitive to unbalance
FPR = FP/(FP+TN) Insensitive to unbalance
Precision = TP/(TP+FP) Sensitive to unbalance
Accuracy = (TP+TN)/(TP+TN+FP+FN) Sensitive to unbalance

PREDICTION
Positive Negative
TRUE
CONDITION
Positive True Positives False Negatives
Negative False Positives True Negatives
True Positive Rate (Recall)
False Positive Rate
AUROC (AUC): TPR/FPR -> Insensitive to unbalance! TPR
FPR
Best performance

Accuracy ExplainabilityVS
Performance Metrics:
AUROC, AUPRC, ACCURACY
Automatic
Benchmarking
Framework
f
f
1
n
BenchmarkABF
1
2
3 4

THANK YOU
UNITED STATES
Tel: (+1) 408 5998830
EUROPE
Tel: (+34) 91 828 64 73
contact@stratio.com
www.stratio.com
@StratioBD

people@stratio.com
WE ARE HIRING
@StratioBD

What's hot

Big Data Landscape 2016

Josef Adersberger

Big Data Tech Stack

Abdullah Çetin ÇAVDAR

Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric

Cambridge Semantics

Finding the number of unique users out of 10 billion events per day is challenging. At this session, we're going to describe how re-architecting our data infrastructure, relying on Druid and ThetaSketch, enables our customers to obtain these insights in real-time. To put things into context, at NMC (Nielsen Marketing Cloud) we provide our customers (marketers and publishers) real-time analytics tools to profile their target audiences. Specifically, we provide them with the ability to see the number of unique users who meet a given criterion. Historically, we have used Elasticsearch to answer these types of questions, however, we have encountered major scaling and stability issues. In this presentation we will detail the journey of rebuilding our data infrastructure, including researching, benchmarking and productionizing a new technology, Druid, with ThetaSketch, to overcome the limitations we were facing. We will also provide guidelines and best practices with regards to Druid. Topics include : * The need and possible solutions * Intro to Druid and ThetaSketch * How we use Druid * Guidelines and pitfalls

Counting Unique Users in Real-Time: Here's a Challenge for You!

DataWorks Summit

Aggreko are a leading provider of temporary power and temperature control solutions, serving customers across the globe as they work on projects ranging from the Olympics to aiding humanitarian disaster relief. In this talk, Helena and Andy will discuss how the Insights team have developed scalable machine learning solutions to support the business. In particular they will discuss fuel consumption forecasts that have helped Aggreko’s fuel logistics teams improve customer service levels and reduce costs by becoming more proactive and insight driven.

Embedding Insight through Prediction Driven Logistics

Databricks

The Curse of the Data Lake Monster

Thoughtworks

Presentation of my talk given in Phoenix Data Conference 2019. In this we will look at challenges with current Apache Hadoop ecosystem Apache Hadoop is still relevant but way of doing Hadoop and enterprise data architecture has to be re-looked as we enter Cognitive and Cloud Native Era We need  Architecture that is enabled by common run time layer across on premise and cloud  Architecture that can abstract away dependency and version conflicts with tons of open source machine learning out there. Yarn did not scale up in that aspect until one want to deal with multiple conda environment  Architecture that can enable real Hybrid Cloud and Multi Cloud portability  And many more challenges that one has to overcome to keep architecture simple, infrastructure agile and better utilized

Future of Data Platform in Cloud Native world

Srivatsan Srinivasan

The Synapse IoT Stack: Technology Trends in IOT and Big Data

InMobi Technology

Graphs in Telecommunications - Jesus Barrasa, Neo4j

Neo4j

Maintaining networks and servers availability while reducing downtimes to minimum are fundamental missions for IT managers and administrators. But with the growing complexity of infrastructures, the pressure from business strategists to deliver new services or the heterogeneity of data assets, managing networks is often a challenge. Graph technologies like Linkurious offer an intuitive approach to model and investigate data by putting the connections between components at the forefront. Modeling the network into a flexible and unified overview is one of the keys to understand your architecture and reduce risks, costs and time spent on maintenance operations.

Graph-based Network & IT Management.

Linkurious

Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric

Cambridge Semantics

Introducing Databricks Delta

Databricks

Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...

Dataconomy Media

Neo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best Practices

Neo4j

Knowledge Graph for Machine Learning and Data Science

Cambridge Semantics

Privacy-Preserving AI Network - PlatON 2.0

ShiHeng1

From hadoop to spark

steccami

Recently, in the fields Business Intelligence and Data Management, everybody is talking about data science, machine learning, predictive analytics and many other “clever” terms with promises to turn your data into gold. In this slides, we present the big picture of data science and machine learning. First, we define the context for data mining from BI perspective, and try to clarify various buzzwords in this field. Then we give an overview of the machine learning paradigms. After that, we are going to discuss - at a high level - the various data mining tasks, techniques and applications. Next, we will have a quick tour through the Knowledge Discovery Process. Screenshots from demos will be shown, and finally we conclude with some takeaway points.

Data Mining - The Big Picture!

Khalid Salama

The relationships between data sets matter. Discovering, analyzing, and learning those relationships is a central part to expanding our understand, and is a critical step to being able to predict and act upon the data. Unfortunately, these are not always simple or quick tasks. To help the analyst we introduce RAPIDS, a collection of open-source libraries, incubated by NVIDIA and focused on accelerating the complete end-to-end data science ecosystem. Graph analytics is a critical piece of the data science ecosystem for processing linked data, and RAPIDS is pleased to offer cuGraph as our accelerated graph library. Simply accelerating algorithms only addressed a portion of the problem. To address the full problem space, RAPIDS cuGraph strives to be feature-rich, easy to use, and intuitive. Rather than limiting the solution to a single graph technology, cuGraph supports Property Graphs, Knowledge Graphs, Hyper-Graphs, Bipartite graphs, and the basic directed and undirected graph. A Python API allows the data to be manipulated as a DataFrame, similar and compatible with Pandas, with inputs and outputs being shared across the full RAPIDS suite, for example with the RAPIDS machine learning package, cuML. This talk will present an overview of RAPIDS and cuGraph. Discuss and show examples of how to manipulate and analyze bipartite and property graph, plus show how data can be shared with machine learning algorithms. The talk will include some performance and scalability metrics. Then conclude with a preview of upcoming features, like graph query language support, and the general RAPIDS roadmap.

RAPIDS cuGraph – Accelerating all your Graph needs

Connected Data World

It may be that I just became old and grumpy, but it bothers me tremendously when someone starts to implement the tools without a clear understanding of what the initial requirements are. When we talk about monitoring, especially for the web apps, a lot of folks will start thinking something like "I will call some dedicated endpoint and in that way, I will determine the current health state of my application". That is definitely a good starting point, but this is only the beginning. In my session, I will talk more about the observability quality attribute of the system, different monitoring models that can be implemented with different tools, SLA calculation, and a lot of other important stuff that you need to be concern about before your application will go live.

"Application monitoring — from requirements to tools, not the other way aroun...

Fwdays

What's hot (20)

Big Data Landscape 2016

Big Data Tech Stack

Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric

Counting Unique Users in Real-Time: Here's a Challenge for You!

Embedding Insight through Prediction Driven Logistics

The Curse of the Data Lake Monster

Future of Data Platform in Cloud Native world

The Synapse IoT Stack: Technology Trends in IOT and Big Data

Graphs in Telecommunications - Jesus Barrasa, Neo4j

Graph-based Network & IT Management.

Using Cloud Automation Technologies to Deliver an Enterprise Data Fabric

Introducing Databricks Delta

Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...

Neo4j Graph Data Science Training - June 9 & 10 - Slides #7 GDS Best Practices

Knowledge Graph for Machine Learning and Data Science

Privacy-Preserving AI Network - PlatON 2.0

From hadoop to spark

Data Mining - The Big Picture!

RAPIDS cuGraph – Accelerating all your Graph needs

"Application monitoring — from requirements to tools, not the other way aroun...

Viewers also liked

Stratio platform overview v4.1

Stratio

Hoy en día la combinación de modelos de Machine Learning es muy popular. En la mayoría de los concursos de kaggle, los primeros clasificados suelen emplear alguna variante de estas técnicas. Hablamos de qué nos lleva a emplearlas, cuándo y cómo. Nos centramos en las técnicas de resampleado, como bagging y boosting, trabajamos el implementado de un AdaBoost sencillo sobre árboles binarios, y finalmente, vemos una aplicación de Random Forest para selección de variables. More information: www.stratio.com https://www.youtube.com/StratioBD

Lunch&Learn: Combinación de modelos

Stratio

[Strata] Sparkta

Stratio

Los proyectos de Big Data han pasado de sus fases de POC, donde la seguridad ha sido, en el mejor de los casos, un aspecto secundario. Por ello las herramientas Big Data y más concretamente las usadas para procesar datos, deben ponerse al día en seguridad. Las herramientas como Spark, no están pensados para la seguridad. Por eso Abel y Jorge quieren compartir los hacks que han sido necesarios hacer a Spark para poder usar Kerberos para autenticarse contra servicios securizados.

Meetup: Spark + Kerberos

Stratio

Stratio’s Cassandra Lucene Index, derived from Stratio Cassandra, is an open sourced plugin for Apache Cassandra that extends its index functionality to provide near real time search such as ElasticSearch or Solr, including full text search capabilities and free multivariable, geospatial and bitemporal search. It is achieved through an Apache Lucene based implementation of Cassandra secondary indexes, where each node of the cluster indexes its own data. Stratio’s Cassandra indexes are one of the core modules on which Stratio’s BigData platform is based. Andres de la Peña discusses the recently added geospatial search features in Stratio's Cassandra Lucene index using some Nephila Capital use cases. These new features include indexing complex polygons, nearest neighbour search, and the application of chained geometrical transformations such as bounding box, convex hull, centroid, union, intersection, exclusion and distance buffer.

Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016

Stratio

One of the top banks in Europe, needed a system to provide better performance, scaling almost linearly with the increase in information to be analyzed, and allowing to move the processes that were currently being executed in the Host to a Big Data infrastructure. During a year we've worked on a system which is able to provide greater agility, flexibility and simplicity for the user to view information when profiling and is now able to analyze the structure of profile data. It's a powerful way to make online queries to a graph database, which is integrated with Apache Spark and different graph libraries. Basically, we get all the necessary information through Cypher queries which are sent to a Neo4j database. Using the last Big Data technologies like Spark Dataframe, HDFS, Stratio Intelligence or Stratio Crossdata, we have developed a solution which is able to obtain critical information for multiple datasources like text files o graph databases.

Multiplaform Solution for Graph Datasources

Stratio

Stratio CrossData: an efficient distributed datahub with batch and streaming ...

Stratio

A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)

Spark Summit

Apache Spark & Cassandra use case at Telefónica Cbs by Antonio Alcacer

Stratio

Spark Streaming @ Berlin Apache Spark Meetup, March 2015

Stratio

Functional programming in scala

Stratio

Primeros pasos con Spark - Spark Meetup Madrid 30-09-2014

Stratio

Introduction to Asynchronous scala

Stratio

La Unión Bancaria Europea

koball

Presentacion

ComunicacionesPDB

El modelo europeo de reporting y el lenguaje XBRL - Ignacio Boixo

Asociación XBRL España

UNION BANCARIA EN LA UNION EUROPEA

Ramiro Ojeda

Recuperación y Unión Bancaria Europea. Emilio Ontiveros

Universidad de Deusto - Deustuko Unibertsitatea - University of Deusto

Stratio big data spain

Álvaro Agea Herradón

Estándares en Unión Europea: Marco, Desafíos y Oportunidades - Francisco Garc...

Asociación XBRL España

Viewers also liked (20)

Stratio platform overview v4.1

Lunch&Learn: Combinación de modelos

[Strata] Sparkta

Meetup: Spark + Kerberos

Stratio's Cassandra Lucene index: Geospatial use cases - Big Data Spain 2016

Multiplaform Solution for Graph Datasources

Stratio CrossData: an efficient distributed datahub with batch and streaming ...

A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)

Apache Spark & Cassandra use case at Telefónica Cbs by Antonio Alcacer

Spark Streaming @ Berlin Apache Spark Meetup, March 2015

Functional programming in scala

Primeros pasos con Spark - Spark Meetup Madrid 30-09-2014

Introduction to Asynchronous scala

La Unión Bancaria Europea

Presentacion

El modelo europeo de reporting y el lenguaje XBRL - Ignacio Boixo

UNION BANCARIA EN LA UNION EUROPEA

Recuperación y Unión Bancaria Europea. Emilio Ontiveros

Stratio big data spain

Estándares en Unión Europea: Marco, Desafíos y Oportunidades - Francisco Garc...

Similar to Distributed Logistic Model Trees

In this deck from the 2019 Stanford HPC Conference, Srinivasan Parthasarathy from Ohio State University presents: Scalable Machine Learning: The Role of Stratified Data Sharding. "With the increasing popularity of structured data stores, social networks and Web 2.0 and 3.0 applications, complex data formats, such as trees and graphs, are becoming ubiquitous. Managing and learning from such large and complex data stores, on modern computational eco-systems, to realize actionable information efficiently, is daunting. In this talk I will begin with discussing some of these challenges. Subsequently I will discuss a critical element at the heart of this challenge relates to the sharding, placement, storage and access of such tera- and peta- scale data. In this work we develop a novel distributed framework to ease the burden on the programmer and propose an agile and intelligent placement service layer as a flexible yet unified means to address this challenge. Central to our framework is the notion of stratification which seeks to initially group structurally (or semantically) similar entities into strata. Subsequently strata are partitioned within this eco-system according to the needs of the application to maximize locality, balance load, minimize data skew or even take into account energy consumption. Results on several real-world applications validate the efficacy and efficiency of our approach. (Notes: Joint work with Y. Wang (Airbnb) and A. Chakrabarti (MSR))." Srinivasan Parthasarathy, Professor of Computer Science & Engineering, The Ohio State University Srinivasan Parthasarathy is a Professor of Computer Science and Engineering and the director of the data mining research laboratory at Ohio State. His research interests span databases, data mining and high performance computing. He is among a handful of researchers nationwide to have won both the Department of Energy and National Science Foundation Career awards. He and his students have won multiple best paper awards or "best of" nominations from leading forums in the field including: SIAM Data Mining, ACM SIGKDD, VLDB, ISMB, WWW, ICDM, and ACM Bioinformatics. He chairs the SIAM data mining conference steering committee and serves on the action board of ACM TKDD and ACM DMKD --leading journals in the field. Since 2012 he also helped lead the creation of OSU's first-of-a-kind nationwide (USA) undergraduate major in data analytics and serves as one of its founding directors. Watch the video: https://youtu.be/hOJI8e0p-UI Learn more: http://web.cse.ohio-state.edu/~parthasarathy.2/ and http://hpcadvisorycouncil.com/events/2019/stanford-workshop/ Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter

Scalable Machine Learning: The Role of Stratified Data Sharding

inside-BigData.com

Data reduction

GowriLatha1

Polikar10missing

kagupta

Azure Databricks for Data Scientists

Richard Garris

Machine Learning.pdf

BeyaNasr1

Synthesis of analytical methods data driven decision-making

Adam Doyle

Big learning 1.2

Mohit Garg

Machine learning and linear regression programming

Soumya Mukherjee

Parallel KNN for Big Data using Adaptive Indexing

IRJET Journal

IRJET - Fault Detection and Classification in Transmission Line by using KNN ...

IRJET Journal

Among many data clustering approaches available today, mixed data set of numeric and category data poses a significant challenge due to difficulty of an appropriate choice and employment of distance/similarity functions for clustering and its verification. Unsupervised learning models for artificial neural network offers an alternate means for data clustering and analysis. The objective of this study is to highlight an approach and its associated considerations for mixed data set clustering with Adaptive Resonance Theory 2 (ART-2) artificial neural network model and subsequent validation of the clusters with dimensionality reduction using Autoencoder neural network model.

An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...

Happiest Minds Technologies

ML-Unit-4.pdf

AnushaSharma81

In recent years, the static shortest path (SP) problem has been well addressed using intelligent optimization techniques, e.g., artificial neural networks, genetic algorithms (GAs), particle swarm optimization, etc. However, with the advancement in wireless communications, more and more mobile wireless networks appear, e.g., mobile networks [mobile ad hoc networks (MANETs)], wireless sensor networks, etc. One of the most important characteristics in mobile wireless networks is the topology dynamics, i.e., the network topology changes over time due to energy conservation or node mobility. Therefore, the SP routing problem in MANETs turns out to be a dynamic optimization problem. GA's are able to find, if not the shortest, at least an optimal path between source and destination in mobile ad-hoc network nodes. And we obtain the alternative path or backup path to avoid reroute discovery in the case of link failure or nodeex

APPLICATION OF GENETIC ALGORITHM IN DESIGNING A SECURITY MODEL FOR MOBILE ADH...

cscpconf

Survey on classification algorithms for data mining (comparison and evaluation)

Alexander Decker

In the era of IoTs and A.I., distributed and parallel computing is embracing big data driven and algorithm focused applications and services. With rapid progress and development on parallel frameworks, algorithms and accelerated computing capacities, it still remains challenging on deliver an efficient and scalable data analysis solution. This talk shares a research experience on data pattern discovery in domain applications. In particular, the research scrutinizes key factors in analysis workflow design and data parallelism improvement on cloud.

A Tale of Data Pattern Discovery in Parallel

Jenny Liu

A TALE of DATA PATTERN DISCOVERY IN PARALLEL

Jenny Liu

IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES

Vikash Kumar

Hw09 Hadoop Based Data Mining Platform For The Telecom Industry

Cloudera, Inc.

Anomaly detection has been an important research topic in data mining and machine learning. Many real-world applications such as intrusion or credit card fraud detection require an effective and efficient framework to identify deviated data instances. However, most anomaly detection methods are typically implemented in batch mode, and thus cannot be easily extended to large-scale problems without sacrificing computation and memory requirements. In this paper, we propose multidimensional reduction principal component analysis (MdrPCA) algorithm to address this problem, and we aim at detecting the presence of outliers from a large amount of data via an online updating technique. Unlike prior principal component analysis (PCA)-based approaches, we do not store the entire data matrix or covariance matrix, and thus our approach is especially of interest in online or large-scale problems. By using multidimensional reduction PCA the target instance and extracting the principal direction of the data, the proposed MdrPCA allows us to determine the anomaly of the target instance according to the variation of the resulting dominant eigenvector. Since our MdrPCA need not perform eigen analysis explicitly, the proposed framework is favored for online applications which have computation or memory limitations. Compared with the well-known power method for PCA and other popular anomaly detection algorithms

Anomaly Detection using multidimensional reduction Principal Component Analysis

IOSR Journals

Subspace clustering discovers the clusters embedded in multiple, overlapping subspaces of high dimensional data. Many significant subspace clustering algorithms exist, each having different characteristics caused by the use of different techniques, assumptions, heuristics used etc. A comprehensive classification scheme is essential which will consider all such characteristics to divide subspace clustering approaches in various families. The algorithms belonging to same family will satisfy common characteristics. Such a categorization will help future developers to better understand the quality criteria to be used and similar algorithms to be used to compare results with their proposed clustering algorithms. In this paper, we first proposed the concept of SCAF (Subspace Clustering Algorithms’ Family). Characteristics of SCAF will be based on the classes such as cluster orientation, overlap of dimensions etc. As an illustration, we further provided a comprehensive, systematic description and comparison of few significant algorithms belonging to “Axis parallel, overlapping, density based” SCAF.

SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMS

ijdkp

Similar to Distributed Logistic Model Trees (20)

Scalable Machine Learning: The Role of Stratified Data Sharding

Data reduction

Polikar10missing

Azure Databricks for Data Scientists

Machine Learning.pdf

Synthesis of analytical methods data driven decision-making

Big learning 1.2

Machine learning and linear regression programming

Parallel KNN for Big Data using Adaptive Indexing

IRJET - Fault Detection and Classification in Transmission Line by using KNN ...

An Approach to Mixed Dataset Clustering and Validation with ART-2 Artificial ...

ML-Unit-4.pdf

APPLICATION OF GENETIC ALGORITHM IN DESIGNING A SECURITY MODEL FOR MOBILE ADH...

Survey on classification algorithms for data mining (comparison and evaluation)

A Tale of Data Pattern Discovery in Parallel

A TALE of DATA PATTERN DISCOVERY IN PARALLEL

IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES

Hw09 Hadoop Based Data Mining Platform For The Telecom Industry

Anomaly Detection using multidimensional reduction Principal Component Analysis

SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMS

More from Stratio

On November 6th, we got together at Google Campus to talk about Mesos and DC/OS. Ignacio Mulas, Sparta & Spark Product Owner at Stratio, explained how to build an environment that can secure and govern its data for operational and analytical applications on top of DC/OS platform. He showed that analytical and machine learning pipelines can be combined with operational processes maintaining the security and providing governing tools to manage our data. He focused on the architecture and tools needed to achieve an ecosystem like this and we will show a demo of it. He also explained how we can develop our pipelines interactively with auto-discovered data catalogs and explore our results. Find out more: https://www.stratio.com/events/discover-how-to-deploy-a-secure-big-data-pipeline-with-dcos/

Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...

Stratio

Marco Baena, Head of AI at Stratio, presented at Big Data Spain 2018 "Can an intelligent system exist without awareness?" Description: Is a system without awareness a truly intelligent system? Here he showed why it is a mandatory approach to pursue awareness and how an Awareness-Centric model is the key of any competitive modern business. In addition to this, he also showed the technical benefits of such systems and how Stratio proposal is a reference when following this model.

Can an intelligent system exist without awareness? BDS18

Stratio

Kafka and KSQL - Apache Kafka Meetup

Stratio

On July 19th, we got together at Google Campus to talk about how to increase and complete existing data to improve Machine Learning Models. Fernando Velasco, Data Scientist at Stratio, and Raúl de la Fuente, Presales at Stratio, talked about techniques of image processing like Data Augmentation and other more modern techniques that involve the use of Deep Learning models. More info: http://www.stratio.com/blog/events/planet-data-scientist-live-meet-the-wild-data/

Wild Data - The Data Science Meetup

Stratio

Within the use of Machine Learning models for prediction, one of the sets of techniques that stands out is the model combination. We will study these combinations with Fernando Velasco, Data Scientist at Stratio, who will explain what they are, why and when to use them. Two of the main general techniques will be explained: boosting and bagging and, finally, how to make feature selection through ensembles.

Ensemble methods in Machine Learning

Stratio

Looking to build Kappa architectures or SMACK applications or to use distributed technologies such as Spark, Kafka, Elasticsearch and Hadoop? Do you want to build your own data-centric platform? Are you lacking a development team with Scala and Spark skills? Are you managing to move towards full Digital Transformation? Have our questions stressed you out enough? Then you are ready for our latest release! Our next product release will include Sparta 2.0, ready to solve the above-mentioned issues. Stratio clients will be able to build Big Data processes and data pipeline workflows in minutes with an amazing UI, integrated with Spark, Kafka and several Big Data technologies. In this session, we will show how Sparta 2.0 and its workflow jobs are a key piece in a data-centric platform and how it is integrated with a PaaS (DC/OS). We will also show how Stratio has made sure all pieces within the platform, including Sparta 2.0 are secured. By: José Carlos García and Javier Yuste

Stratio Sparta 2.0

Stratio

A data-centric platform integrates multiple Big Data open source technologies. For example, at Stratio we use Spark, Kafka, Elastic search and many more. Most of these technologies do not offer native security. This lack of security, not only leaves companies open to critical risks like data leakage, unsecure communications or DoS attacks but is also a major barrier to complying with different regulations such as LOPD, PCI-DSS or the upcoming GDPR. This talk gives a technical and innovative overview of how companies can face the challenge of protecting the data and services that are in their data-centric platform, focusing on three main aspects: implementing network segmentation, managing AAA and securing data processing. By: Carlos Gómez

Big Data Security: Facing the challenge

Stratio

Digital Transformation starts with data. What if a solution existed that put data at the center, in a single place, serving all applications around it? This training will include a demonstration in a distributed data-centric platform which provides a data intelligence layer, composed of artificial intelligence models able to make use of a whole company’s data. Nowadays, one of the most innovative techniques in the realm of artificial intelligence is Deep Neural Nets. Among the many applications, language modelling, machine translation and image generation are receiving particular attention. Deep nets are also powerful in predictive modelling ambits such as stock pricing and the energy industry. We will address a few case studies modeled with TensorFlow, running on Stratio’s data-centric product in a distributed cluster. By: Fernando Velasco

Artificial Intelligence on Data Centric Platform

Stratio

Introduction to Artificial Neural Networks

Stratio

Apache Spark ya es una realidad en el mundo de la informática y ahora necesitamos no sólo saber de lo que la tecnología es capaz, necesitamos hacerlo productivo. Para ello necesitamos saber cómo poder auditar cada uno de sus procesos de una manera sencilla y sin necesidad de conocimientos destacados de esta tecnología. Jorge López-Malla muestra qué herramientas proporciona el propio framework de Apache Spark para poder monitorizar el rendimiento de los algoritmos y cómo sacarle partido para mejorar los jobs de Apache Spark, tanto streaming como batch, y ver qué magia hace SparkSQL cuando se quiere hacer un simple join.

Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...

Stratio

Advanced search and Top-K queries in Cassandra

Stratio

[Spark meetup] Spark Streaming Overview

Stratio

Why spark by Stratio - v.1.0

Stratio

On-the-fly ETL con EFK: ElasticSearch, Flume, Kibana

Stratio

Spark Summit - Stratio Streaming

Stratio

More from Stratio (15)

Mesos Meetup - Building an enterprise-ready analytics and operational ecosyst...

Can an intelligent system exist without awareness? BDS18

Kafka and KSQL - Apache Kafka Meetup

Wild Data - The Data Science Meetup

Ensemble methods in Machine Learning

Stratio Sparta 2.0

Big Data Security: Facing the challenge

Artificial Intelligence on Data Centric Platform

Introduction to Artificial Neural Networks

Meetup: Cómo monitorizar y optimizar procesos de Spark usando la Spark Web - ...

Advanced search and Top-K queries in Cassandra

[Spark meetup] Spark Streaming Overview

Why spark by Stratio - v.1.0

On-the-fly ETL con EFK: ElasticSearch, Flume, Kibana

Spark Summit - Stratio Streaming

Recently uploaded

(Vivek)Call Us, 8448380779,Call girls in Delhi NCr – We Offer best in class call girls. escort Service At Affordable Price At low Rate with Space Night 8000 We Are One Of The Oldest Escort and Call girls Agencies in Delhi. You Will Find That Our Female Escorts Are Full Of Fun, Sexy And They Would Love Enjoy Your Company. We Have A Fantastic Selection Of Escort Ladies Available For In-Calls As Well As Out-Calls. Our Escorts Are Not Only Beautiful But All Have Great Personalities Making Them The Perfect Companion For Any Occasion. In-Call:- You Can Come At Our Place in Delhi Our place Which Is Very Clean Hygienic 100% safe Accommodation. Out-Call:- You have To Come Pick The Girl From My Place We Are Also Provide Door Step Services (Delhi Ncr, Noida, Gurgaon, Faridabad, Ghaziabad Note:- Pic Collectors Time Passers Bargainers Stay Away As We Respect The Value For Your Money Time And Expect The Same From You Hygienic:- Full Ac room And Clean Rooms Available In Hotel 24 * 7 Hourly In Delhi NCR More Details, With WhatsApp Number, +91-8448380779

BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service

Delhi Call girls

FESE Capital Markets Fact Sheet 2024 Q1.pdf

MarinCaroMartnezBerg

Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore Booking Contact Details :- WhatsApp Chat :- +91-7737669865 2-May-2024(SMW) Call Girls In Model Towh Bangalore +91-7737669865 !! Best Woman Seeking Man Call Girls Service, Escorts Service in Home Hotel in Bangalore NCR 24 Hours Available Service Call Girls, Contact Us +91-7737669865 (Any Time. Any Where) Call Girls in Bangalore, Noida, Gurgaon, Ghaziabad,Sexy Indian Female Escorts Service Bangalore NCRWelcome To Bangalore Escorts Service – An All Over New Bangalore Very Sexy Hot Call Girls Agency Service Escorts In South BangaloreNCRBangalore’s No. 1 High Profile Independent Female Escorts Service. We Provide Good Quality Educated Profile At Very Regnebal Price 100% Safe And Original.We Are Provide Escorts Service All OYO Hotels ,3*,4*,5* Star Hotel And Home Flat, Apartment. Guest-House. Services In -Call And Out – Call Both Are Services Available. 24Hrs. Any Time Any Where. In All Over Bangalore Noida Gurgaon Ghaziabad Faridabad.More Information And Contact Profile Real Pic Visit Our Website City Wise Escorts Service Agency.Good Looking Cheap And Best Models Girls U Can Get Best Click On Link……Night Call Girls Now In Hotel Le Meridien Gurgaon Near Female Escort One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —7737669865 We are available 24*7 all days of the year. Call us — 7737669865 Thank you for Visiting.

Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...

amitlee9823

Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand Booking Contact Details :- WhatsApp Chat :- +91-7737669865 Call Girls In Model Towh +91-7737669865 !! Best Woman Seeking Man Call Girls Service, Escorts Service in Home Hotel in NCR 24 Hours Available Service Call Girls, Contact Us +91-7737669865 (Any Time. Any Where) Call Girls in , Noida, Gurgaon, Ghaziabad,Sexy Indian Female Escorts Service NCRWelcome To Escorts Service – An All Over New Very Sexy Hot Call Girls Agency Service Escorts In South NCR’s No. 1 High Profile Independent Female Escorts Service. We Provide Good Quality Educated Profile At #K09 Very Regnebal Price 100% Safe And Original.We Are Provide Escorts Service All OYO Hotels ,3*,4*,5* Star Hotel And Home Flat, Apartment. Guest-House. Services In -Call And Out – Call Both Are Services Available. 24Hrs. Any Time Any Where. In All Over Noida Gurgaon Ghaziabad Faridabad.More Information And Contact Profile Real Pic Visit Our Website City Wise Escorts Service Agency.Good Looking Cheap And Best Models Girls U Can Get Best Click On Link……Night Call Girls Now In Hotel Le Meridien Gurgaon Near Female Escort One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —7737669865 We are available 24*7 all days of the year. Call us — 7737669865 Thank you for Visiting.

Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand

amitlee9823

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore Escorts Service Booking Contact Details :- WhatsApp Chat :- +91-7737669865 2-May-2024(SMW) Call Girls In Model Towh Bangalore +91-7737669865 !! Best Woman Seeking Man Call Girls Service, Escorts Service in Home Hotel in Bangalore NCR 24 Hours Available Service Call Girls, Contact Us +91-7737669865 (Any Time. Any Where) Call Girls in Bangalore, Noida, Gurgaon, Ghaziabad,Sexy Indian Female Escorts Service Bangalore NCRWelcome To Bangalore Escorts Service – An All Over New Bangalore Very Sexy Hot Call Girls Agency Service Escorts In South BangaloreNCRBangalore’s No. 1 High Profile Independent Female Escorts Service. We Provide Good Quality Educated Profile At Very Regnebal Price 100% Safe And Original.We Are Provide Escorts Service All OYO Hotels ,3*,4*,5* Star Hotel And Home Flat, Apartment. Guest-House. Services In -Call And Out – Call Both Are Services Available. 24Hrs. Any Time Any Where. In All Over Bangalore Noida Gurgaon Ghaziabad Faridabad.More Information And Contact Profile Real Pic Visit Our Website City Wise Escorts Service Agency.Good Looking Cheap And Best Models Girls U Can Get Best Click On Link……Night Call Girls Now In Hotel Le Meridien Gurgaon Near Female Escort One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —7737669865 We are available 24*7 all days of the year. Call us — 7737669865 Thank you for Visiting.

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...

amitlee9823

Saudi Arabia [ Abortion pills) Jeddah/riaydh/dammam/+966572737505☎️] cytotec tablets uses abortion pills 💊💊 How effective is the abortion pill? 💊💊 +966572737505) "Abortion pills in Jeddah" how to get cytotec tablets in Riyadh " Abortion pills in dammam*💊💊 The abortion pill is very effective. If you’re taking mifepristone and misoprostol, it depends on how far along the pregnancy is, and how many doses of medicine you take:💊💊 +966572737505) how to buy cytotec pills At 8 weeks pregnant or less, it works about 94-98% of the time. +966572737505[ 💊💊💊 At 8-9 weeks pregnant, it works about 94-96% of the time. +966572737505) At 9-10 weeks pregnant, it works about 91-93% of the time. +966572737505)💊💊 If you take an extra dose of misoprostol, it works about 99% of the time. At 10-11 weeks pregnant, it works about 87% of the time. +966572737505) If you take an extra dose of misoprostol, it works about 98% of the time. In general, taking both mifepristone and+966572737505 misoprostol works a bit better than taking misoprostol only. +966572737505 Taking misoprostol alone works to end the+966572737505 pregnancy about 85-95% of the time — depending on how far along the+966572737505 pregnancy is and how you take the medicine. +966572737505 The abortion pill usually works, but if it doesn’t, you can take more medicine or have an in-clinic abortion. +966572737505 When can I take the abortion pill?+966572737505 In general, you can have a medication abortion up to 77 days (11 weeks)+966572737505 after the first day of your last period. If it’s been 78 days or more since the first day of your last+966572737505 period, you can have an in-clinic abortion to end your pregnancy.+966572737505 Why do people choose the abortion pill? Which kind of abortion you choose all depends on your personal+966572737505 preference and situation. With+966572737505 medication+966572737505 abortion, some people like that you don’t need to have a procedure in a doctor’s office. You can have your medication abortion on your own+966572737505 schedule, at home or in another comfortable place that you choose.+966572737505 You get to decide who you want to be with during your abortion, or you can go it alone. Because+966572737505 medication abortion is similar to a miscarriage, many people feel like it’s more “natural” and less invasive. And some+966572737505 people may not have an in-clinic abortion provider close by, so abortion pills are more available to+966572737505 them. +966572737505 Your doctor, nurse, or health center staff can help you decide which kind of abortion is best for you. +966572737505 More questions from patients: Saudi Arabia+966572737505 CYTOTEC Misoprostol Tablets. Misoprostol is a medication that can prevent stomach ulcers if you also take NSAID medications. It reduces the amount of acid in your stomach, which protects your stomach lining. The brand name of this medication is Cytotec®.+966573737505) Unwanted Kit is a combination of two medici

Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec

Abortion pills in Riyadh +966572737505 get cytotec

Gen AI on Enterprise Cloud Apache NiFi Milvus Apache Kafka Apache Flink Cloudera Machine Learning Cloudera DataFlow https://medium.com/@tspann/building-a-milvus-connector-for-nifi-34372cb3c7fa https://www.meetup.com/futureofdata-princeton/events/300737266/ https://lu.ma/q7pcfyjn?source=post_page-----34372cb3c7fa--------------------------------&tk=TTyakY If you're interested in working with Generative AI on the cloud, this virtual workshop is for you. Tim Spann from Cloudera and Yujian Tang from Zilliz will cover how you can implement your own GenAI workflows on the cloud at enterprise scale. 9:00 - 9:05: Intro 9:05 - 9:15: What is Milvus 9:15 - 9:25: Cloudera Development Platform 9:25 - 10:00: Demo Location https://www.youtube.com/watch?v=IfWIzKsoHnA https://github.com/tspannhw/SpeakerProfile https://www.linkedin.com/in/yujiantang/

Generative AI on Enterprise Cloud with NiFi and Milvus

Timothy Spann

Discover Why Less is More in B2B Research

michael115558

April 2024 - Crypto Market Report's Analysis

manisha194592

VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K to 25K High Profile Escorts In Pune Booking Now open +91- 8005736733 Why you Choose Us- +91- 8005736733 HOT⇄ 8005736733 Mr ashu ji Call Mr ashu Ji +91- 8005736733 (V030524]N) 𝐇𝐨𝐭𝐞𝐥 𝐑𝐨𝐨𝐦𝐬 𝐈𝐧𝐜𝐥𝐮𝐝𝐢𝐧𝐠 𝐑𝐚𝐭𝐞 𝐒𝐡𝐨𝐭𝐬/𝐇𝐨𝐮𝐫𝐲🆓 .█▬█⓿▀█▀ 𝐈𝐍𝐃𝐄𝐏𝐄𝐍𝐃𝐄𝐍𝐓 𝐆𝐈𝐑𝐋 𝐕𝐈𝐏 𝐄𝐒𝐂𝐎𝐑𝐓 Hello Guys ! High Profiles young Beauties and Good Looking standard Profiles Available , Enquire Now if you are interested in Hifi Service and want to get connect with someone who can understand your needs. Service offers you the most beautiful High Profile sexy independent female Escorts in genuine ✔✔✔ To enjoy with hot and sexy girls ✔✔✔ ★providing:- • Models • vip Models • Russian Models • Foreigner Models • TV Actress and Celebrities • Receptionist • Air Hostess • Call Center Working Girls/Women • Hi-Tech Co. Girls/Women • Housewife

VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...

SUHANI PANDEY

(NEHA) Call Girls Katra Call Now: 8617697112 Katra Escorts Booking Contact Details WhatsApp Chat: +91-8617697112 Katra Escort Service includes providing maximum physical satisfaction to their clients as well as engaging conversation that keeps your time enjoyable and entertaining. Plus, they look fabulously elegant, making an impression. Independent Escorts Katra understands the value of confidentiality and discretion; they will go the extra mile to meet your needs. Simply contact them via text messaging or through their online profiles; they'd be more than delighted to accommodate any request or arrange a romantic date or fun-filled night together. We provide:

(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7

Call Girls in Nagpur High Profile Call Girls

Probability Grade 10 Third Quarter Lessons

JoseMangaJr1

Call Girl In Dwarka ☎92055#41914 ¶¶ Indian,Russian Best Quality full Educated And Full Cooperative Independent Call Girls Escort Services In New Delhi- I Have Extremely Beautiful Broad Minded Cute Sexy & Hot Call Girls and Escorts, We Are Located in 3* 4* 5* Hotels in Delhi. Safe & Secure High Class Services Affordable Rate 100% Satisfaction, Unlimited Enjoyment. Any Time for Model/Teens Escort in Delhi High class luxury and premium escorts agency Indian Russian Call Girls In Delhi Booking Good High Profile Escorts (Call Girls) In Delhi 5 Star Hotel ,Incall Service,OutCall Service, We provide services by Call Girls,College Girls,Modals Get High Profile queens,Well Educated,Good Looking,Full Cooperative Model, Russian Models,Punjabi Girls Kashmeri Girls Services etc… We Provide Hottest Female With Safe And Consensual With Most Limits Respected Complete Satisfaction Guaranteed…Service. Call Me Spacial For Including Incall//outcall Service In New Delhi Indian Russian Escorts Service,

Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night

Delhi Call girls

Mature dropshipping via API with DroFx.pptx

olyaivanovalion

Midocean dropshipping via API with DroFx

olyaivanovalion

Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand Booking Contact Details :- WhatsApp Chat :- +91-7737669865 Call Girls In Model Towh +91-7737669865 !! Best Woman Seeking Man Call Girls Service, Escorts Service in Home Hotel in NCR 24 Hours Available Service Call Girls, Contact Us +91-7737669865 (Any Time. Any Where) Call Girls in , Noida, Gurgaon, Ghaziabad,Sexy Indian Female Escorts Service NCRWelcome To Escorts Service – An All Over New Very Sexy Hot Call Girls Agency Service Escorts In South NCR’s No. 1 High Profile Independent Female Escorts Service. We Provide Good Quality Educated Profile At #K09 Very Regnebal Price 100% Safe And Original.We Are Provide Escorts Service All OYO Hotels ,3*,4*,5* Star Hotel And Home Flat, Apartment. Guest-House. Services In -Call And Out – Call Both Are Services Available. 24Hrs. Any Time Any Where. In All Over Noida Gurgaon Ghaziabad Faridabad.More Information And Contact Profile Real Pic Visit Our Website City Wise Escorts Service Agency.Good Looking Cheap And Best Models Girls U Can Get Best Click On Link……Night Call Girls Now In Hotel Le Meridien Gurgaon Near Female Escort One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —7737669865 We are available 24*7 all days of the year. Call us — 7737669865 Thank you for Visiting.

Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand

amitlee9823

Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand Booking Contact Details :- WhatsApp Chat :- +91-7737669865 Call Girls In Model Towh +91-7737669865 !! Best Woman Seeking Man Call Girls Service, Escorts Service in Home Hotel in NCR 24 Hours Available Service Call Girls, Contact Us +91-7737669865 (Any Time. Any Where) Call Girls in , Noida, Gurgaon, Ghaziabad,Sexy Indian Female Escorts Service NCRWelcome To Escorts Service – An All Over New Very Sexy Hot Call Girls Agency Service Escorts In South NCR’s No. 1 High Profile Independent Female Escorts Service. We Provide Good Quality Educated Profile At #K09 Very Regnebal Price 100% Safe And Original.We Are Provide Escorts Service All OYO Hotels ,3*,4*,5* Star Hotel And Home Flat, Apartment. Guest-House. Services In -Call And Out – Call Both Are Services Available. 24Hrs. Any Time Any Where. In All Over Noida Gurgaon Ghaziabad Faridabad.More Information And Contact Profile Real Pic Visit Our Website City Wise Escorts Service Agency.Good Looking Cheap And Best Models Girls U Can Get Best Click On Link……Night Call Girls Now In Hotel Le Meridien Gurgaon Near Female Escort One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —7737669865 We are available 24*7 all days of the year. Call us — 7737669865 Thank you for Visiting.

Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand

amitlee9823

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore Booking Contact Details :- WhatsApp Chat :- +91-7737669865 2-May-2024(SMW) Call Girls In Model Towh Bangalore +91-7737669865 !! Best Woman Seeking Man Call Girls Service, Escorts Service in Home Hotel in Bangalore NCR 24 Hours Available Service Call Girls, Contact Us +91-7737669865 (Any Time. Any Where) Call Girls in Bangalore, Noida, Gurgaon, Ghaziabad,Sexy Indian Female Escorts Service Bangalore NCRWelcome To Bangalore Escorts Service – An All Over New Bangalore Very Sexy Hot Call Girls Agency Service Escorts In South BangaloreNCRBangalore’s No. 1 High Profile Independent Female Escorts Service. We Provide Good Quality Educated Profile At Very Regnebal Price 100% Safe And Original.We Are Provide Escorts Service All OYO Hotels ,3*,4*,5* Star Hotel And Home Flat, Apartment. Guest-House. Services In -Call And Out – Call Both Are Services Available. 24Hrs. Any Time Any Where. In All Over Bangalore Noida Gurgaon Ghaziabad Faridabad.More Information And Contact Profile Real Pic Visit Our Website City Wise Escorts Service Agency.Good Looking Cheap And Best Models Girls U Can Get Best Click On Link……Night Call Girls Now In Hotel Le Meridien Gurgaon Near Female Escort One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —7737669865 We are available 24*7 all days of the year. Call us — 7737669865 Thank you for Visiting.

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...

amitlee9823

Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service Bangalore Booking Contact Details :- WhatsApp Chat :- +91-9155563397 2-May-2024(SMW) Call Girls In Model Towh Bangalore +91-9155563397 !! Best Woman Seeking Man Call Girls Service, Escorts Service in Home Hotel in Bangalore NCR 24 Hours Available Service Call Girls, Contact Us +91-9155563397 (Any Time. Any Where) Call Girls in Bangalore, Noida, Gurgaon, Ghaziabad,Sexy Indian Female Escorts Service Bangalore NCRWelcome To Bangalore Escorts Service – An All Over New Bangalore Very Sexy Hot Call Girls Agency Service Escorts In South BangaloreNCRBangalore’s No. 1 High Profile Independent Female Escorts Service. We Provide Good Quality Educated Profile At Very Regnebal Price 100% Safe And Original.We Are Provide Escorts Service All OYO Hotels ,3*,4*,5* Star Hotel And Home Flat, Apartment. Guest-House. Services In -Call And Out – Call Both Are Services Available. 24Hrs. Any Time Any Where. In All Over Bangalore Noida Gurgaon Ghaziabad Faridabad.More Information And Contact Profile Real Pic Visit Our Website City Wise Escorts Service Agency.Good Looking Cheap And Best Models Girls U Can Get Best Click On Link……Night Call Girls Now In Hotel Le Meridien Gurgaon Near Female Escort One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —9155563397 We are available 24*7 all days of the year.

Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...

only4webmaster01

Halmar dropshipping via API with DroFx

olyaivanovalion

Recently uploaded (20)

BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service

FESE Capital Markets Fact Sheet 2024 Q1.pdf

Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...

Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...

Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec

Generative AI on Enterprise Cloud with NiFi and Milvus

Discover Why Less is More in B2B Research

April 2024 - Crypto Market Report's Analysis

VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...

(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7

Probability Grade 10 Third Quarter Lessons

Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night

Mature dropshipping via API with DroFx.pptx

Midocean dropshipping via API with DroFx

Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand

Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand

Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...

Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...

Halmar dropshipping via API with DroFx

Distributed Logistic Model Trees

1. Distributed Logistic Model Trees, Stratio Intelligence Mateo Álvarez and Antonio Soriano @StratioBD

2. Aerospace Engineer, MSc in Propulsion Systems (UPM), Master in Data Science (URJC). Working as data scientist and Big Data developer at Stratio Big Data in the data science department mateo-alvarez

3. Ph.D. in Telecommunications, MSc in Electronic Systems Engineering and Telecommunication Technologies, Systems and Networks (UPV), and MSc “Big Data Expert” (UTAD). Working as data scientist and Big Data developer at at Stratio Big Data in the data science department @Phd_A_Soriano

4. Why using interpretable algorithms instead of “black boxes” Logistic Regression Decision Trees Variance-Bias tradeoff Metrics Demo Logistic Model Trees Distributed implementation Cost function & configuration params Demo

5. • Why use interpretable algorithms instead of “black boxes” • Logistic Regression • Decision Trees • Variance-Bias tradeoff @StratioBD

6. Accuracy Explainability VS Medical Studies Power management Financial environment Criminal activity

7. Threshold Probability Feature

8. Threshold Probability Local bad adjust Feature

9. Threshold Probability Feature

10. Feature 1 Root node Leaf node Leaf node

11. Feature 1 Root node Feature 2 Feature 2 Leaf node Leaf node Leaf node Leaf node

12. Feature 1 Root node Feature 2 Feature 2 Leaf node Leaf node Leaf node Leaf node Local bad adjust

13. MeanError Model complexity Test error Training error Model complexity Variance Total error Bias 2 Error OverfittingUnderfitting

14. Variance Bias

15. Missing important variables for the problem to make the predictions Variance Bias

16. Overfitting to the sample/training data Variance Bias

17. Irreducible error on prediction Variance Bias

18. • Logistic Model Trees • Distributed implementation • Cost function & configuration parametres • Demo @StratioBD

19. Root node Performance metrics

20. Performance metrics Feature 1 Root node Performance metrics Performance metrics

21. Feature 1 Root node Performance metrics Performance metrics Performance metrics Performance metrics Performance metrics Performance metrics

22. Feature 1 Root node Feature 2 Feature 2

23. Feature 1 Root node Feature 2

24. Feature 1 Root node Feature 2

25. DISTRIBUTED IMPLEMENTATION Spark’s Decision Tree (distributed implementation of random forests) Spark’s Logistic Regression / weka’s Logistic Regression on the nodes

26. LMT Cost function to fix the logistic regression threshold • AccuracyCostFunction • ConfusionMatrix • PrecisionCostFunction • PrecisionRecallCostFunction • RocCostFunction The same cost function for pruning criteria Performance metrics Performance metrics Performance metrics

27. Big datasets Power of spark to distribute building the tree and logistic regressions ADVANTAGES OF THIS IMPLEMENTATION Medium datasets Distributed tree growth and weka’s logistic regression Small datasets Although it can be slow to distribute the data for the decision tree, cost functions can be still used and specific optimization for particular cases

28. Example of DLMT algorithm in a synthetic dataset

29. • Metrics • Demo @StratioBD

30. PREDICTION Positive Negative TRUE CONDITION Positive True Positives False Negatives Negative False Positives True Negatives Precission True Positive Rate (Recall) False Positive Rate TPR = TP/(TP+FN) Insensitive to unbalance FPR = FP/(FP+TN) Insensitive to unbalance Precision = TP/(TP+FP) Sensitive to unbalance Accuracy = (TP+TN)/(TP+TN+FP+FN) Sensitive to unbalance

31. PREDICTION Positive Negative TRUE CONDITION Positive True Positives False Negatives Negative False Positives True Negatives True Positive Rate (Recall) False Positive Rate AUROC (AUC): TPR/FPR -> Insensitive to unbalance! TPR FPR Best performance

32. PREDICTION Positive Negative TRUE CONDITION Positive True Positives False Negatives Negative False Positives True Negatives True Positive Rate (Recall) AUPRC: Precision/TPR -> Sensitive to unbalance! Precission Precision Recall Best performance

33. f f 1 n Benchmark ABF Data Algorithms

34. @StratioBD

35. Accuracy ExplainabilityVS Performance Metrics: AUROC, AUPRC, ACCURACY Automatic Benchmarking Framework f f 1 n BenchmarkABF 1 2 3 4

36. THANK YOU UNITED STATES Tel: (+1) 408 5998830 EUROPE Tel: (+34) 91 828 64 73 contact@stratio.com www.stratio.com @StratioBD

37. people@stratio.com WE ARE HIRING @StratioBD

Distributed Logistic Model Trees

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to Distributed Logistic Model Trees

Similar to Distributed Logistic Model Trees (20)

More from Stratio

More from Stratio (15)

Recently uploaded

Recently uploaded (20)

Distributed Logistic Model Trees