SlideShare a Scribd company logo
1 of 22
Download to read offline
ALTIC Big Data Stack
Charly Clairmont, ALTIC
@egwada
charly.clairmont@altic.org
http://www.altic.org
smart #OpenSource Software
#BusinessIntelligence

assembler

www.ow2.org

Twitter #ow2con @egwada
Our historical tools

• ETL : Talend
• Reporting : JasperReports, Birt
• OLAP : Mondrian, Palo
• BI platform : SpagoBI

www.ow2.org

Twitter #ow2con @egwada
Smart assembling
Innovation & customers'needs
●

●

●

Identify when applied research
is an opportunity for us, our
solutions and our customers.

➔

Understand the business
process of our customer &
assess the impact of Open IT
on their activities

➔

Offer an approach of the project
both a technical and a operative

➔

➔

➔

Altic projects
Allows our customer to optimize
their business process
Takes the customer job into
account
Offers perennial solutions
Follows the customer present
needs and not the editors'
agenda

www.ow2.org

Twitter #ow2con @egwada
Identify Big Data potential / Hadoop

www.ow2.org

Twitter #ow2con @egwada
Our first Big Data project at Altic
●

eFraudBox project (2010 – 2013)
●

Goal : predict frauds on Internet

●

Context :
–
–
–

●

Customer : GIE carte bancaire
European Research and Development project
Lot of industrial and academic partners

Data :
–
–

Type : Banking transactions
Volume : One GB per day

www.ow2.org

Twitter #ow2con @egwada
How did we start our first BigData project ?

www.ow2.org

Twitter #ow2con @egwada
« In data mining processing is done
line by line »
… [ there's not about a data volume
issue ]

www.ow2.org

Twitter #ow2con @egwada
But we have too much data !

www.ow2.org

Twitter #ow2con @egwada
Let's have a look at Hadoop ?
●

Open Source

●

MPP compute platform
●

●

●

Distributed file system
MapReduce processing

Cost efficient
●

Fault tolerant

●

Infinite scale

●

Enterprise Information System ready

●

Continuous Improvement

●

« Even transactions are possible
on Hadoop - it's inevitable that ALL
kinds of workloads will move there
in the future »

Growing community

Doug CUTTING
Hadoop Creator
Octobre 2013

www.ow2.org

Twitter #ow2con @egwada
How do we query Hadoop ?

Java
● Very optimised
● Very customisable
●

Pig Latin
● Easy syntax
● Support
unstructured data
●

www.ow2.org

SQL like
● Easy development
●

Twitter #ow2con @egwada
How do we query Hadoop ?

Need to code
evertything
●

●

Why not ?

www.ow2.org

We already
know SQL !
●

Twitter #ow2con @egwada
Ok, we have our storage and
computation engine, but how can we
manage data ?
By using our Swiss Army Knife !

www.ow2.org

Twitter #ow2con @egwada
Now our Hadoop / Hive platform is filled
with Big Data,
but It's a little bit too slow to query for
end users...

http://ih2.redbubble.net/image.13088996.5766/sticker,375x360.png

www.ow2.org

Twitter #ow2con @egwada
Aggregate data
Processing data with Hive and store results in
fast databases

www.ow2.org

Twitter #ow2con @egwada
Ok, now we have our fast queryable
datasets, but how can we visualize these ?
To manage users and visualizations

To quickly have a vision of your data

To go deeper in your visualizations

www.ow2.org

Twitter #ow2con @egwada
BigData and Datamining : tMahout

+
+

= tMahout
www.ow2.org

Twitter #ow2con @egwada
BigData and Datamining v2
●

Spark : new InMemory data processing framework
●

Very appropriate for Machine learning

●

MLBase : Machine learning library

●

Spark-clustering : Implementation of SOM algorithm

●

Proof Of Concept : Analysis of mobile
telecommunications

www.ow2.org

Twitter #ow2con @egwada
We have now a Big Data stack !

www.ow2.org

Twitter #ow2con @egwada
BI & Big Data for Altic
●

Eventually, we still do BI as usual
●

Tools evolve :
–
–

●

New storage and processing
We do not change our tools, fortunately THEY progress
for us and we contribute

Fundamental does not really change, only
technologies do
–
–

Hadoop
Spark
www.ow2.org

Twitter #ow2con @egwada
We improve our Big Data stack and its
approach...
And support Big Analytic customer project

Our Big Data Stack

Our Big Data Approach

www.ow2.org

Twitter #ow2con @egwada
Questions ?
Thanks !

Charly CLAIRMONT
CTO at ALTIC
@egwada
charly.clairmont@altic.org
http://altic.org
www.ow2.org

Twitter #ow2con @egwada

More Related Content

What's hot

What's hot (15)

Traveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analyticsTraveloka's journey to no ops streaming analytics
Traveloka's journey to no ops streaming analytics
 
Wizualne budowanie aplikacji na Sparku przy pomocy narzędzia Seahorse
Wizualne budowanie aplikacji na Sparku przy pomocy narzędzia SeahorseWizualne budowanie aplikacji na Sparku przy pomocy narzędzia Seahorse
Wizualne budowanie aplikacji na Sparku przy pomocy narzędzia Seahorse
 
Big Data in the Cloud - Montreal April 2015
Big Data in the Cloud - Montreal April 2015Big Data in the Cloud - Montreal April 2015
Big Data in the Cloud - Montreal April 2015
 
ASPgems - kappa architecture
ASPgems - kappa architectureASPgems - kappa architecture
ASPgems - kappa architecture
 
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...
Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...
Neptune - narzędzie do monitorowania i zarządzania eksperymentami Machine Lea...
 
Converging Big Data and Application Infrastructure by Steven Poutsy
Converging Big Data and Application Infrastructure by Steven PoutsyConverging Big Data and Application Infrastructure by Steven Poutsy
Converging Big Data and Application Infrastructure by Steven Poutsy
 
Webinar - SpagoBI 5: here comes the Social Network analysis
Webinar - SpagoBI 5: here comes the Social Network analysis Webinar - SpagoBI 5: here comes the Social Network analysis
Webinar - SpagoBI 5: here comes the Social Network analysis
 
Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...
Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...
Webinar: SpagoBI 5 - Self-build your interactive cockpits, get instant insigh...
 
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDBMongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
 
Druid meetup 2018-03-13
Druid meetup 2018-03-13Druid meetup 2018-03-13
Druid meetup 2018-03-13
 
What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!
What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!
What's new with SpagoBI 4.0 - Business Intelligence at your fingertips!
 
Openhab Grafana and Influxdb
Openhab Grafana and InfluxdbOpenhab Grafana and Influxdb
Openhab Grafana and Influxdb
 
Webinar: BI Mobile with SpagoBI: be aware everywhere!
Webinar: BI Mobile with SpagoBI: be aware everywhere!Webinar: BI Mobile with SpagoBI: be aware everywhere!
Webinar: BI Mobile with SpagoBI: be aware everywhere!
 

Viewers also liked

Big Data Benchmarking Tutorial
Big Data Benchmarking TutorialBig Data Benchmarking Tutorial
Big Data Benchmarking Tutorial
Tilmann Rabl
 
Jaspersoft Open Source Business Intelligence
Jaspersoft Open Source Business IntelligenceJaspersoft Open Source Business Intelligence
Jaspersoft Open Source Business Intelligence
OW2
 
OS Approach Industrializing Research Tools
OS Approach Industrializing Research ToolsOS Approach Industrializing Research Tools
OS Approach Industrializing Research Tools
OW2
 

Viewers also liked (20)

IEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big data
IEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big dataIEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big data
IEEE 2014 DOTNET DATA MINING PROJECTS Data mining with big data
 
Building k-nn Graphs From Large Text Data
Building k-nn Graphs From Large Text DataBuilding k-nn Graphs From Large Text Data
Building k-nn Graphs From Large Text Data
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementation
 
Big Data Benchmarking Tutorial
Big Data Benchmarking TutorialBig Data Benchmarking Tutorial
Big Data Benchmarking Tutorial
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
BigData - Hadoop -by 侯圣文@secooler
BigData - Hadoop -by 侯圣文@secooler BigData - Hadoop -by 侯圣文@secooler
BigData - Hadoop -by 侯圣文@secooler
 
IBM Bluemix Paris Meetup #20 - 20161214
IBM Bluemix Paris Meetup #20 - 20161214IBM Bluemix Paris Meetup #20 - 20161214
IBM Bluemix Paris Meetup #20 - 20161214
 
Data minig with Big data analysis
Data minig with Big data analysisData minig with Big data analysis
Data minig with Big data analysis
 
Textual Robot programming
Textual Robot programmingTextual Robot programming
Textual Robot programming
 
A data analyst view of Bigdata
A data analyst view of Bigdata A data analyst view of Bigdata
A data analyst view of Bigdata
 
BlueMind : next gen mail and collaboration solution, OW2con'16, Paris.
BlueMind : next gen mail and collaboration solution, OW2con'16, Paris. BlueMind : next gen mail and collaboration solution, OW2con'16, Paris.
BlueMind : next gen mail and collaboration solution, OW2con'16, Paris.
 
Indexing Still and Moving Images
Indexing Still and Moving ImagesIndexing Still and Moving Images
Indexing Still and Moving Images
 
Jaspersoft Open Source Business Intelligence
Jaspersoft Open Source Business IntelligenceJaspersoft Open Source Business Intelligence
Jaspersoft Open Source Business Intelligence
 
Mobile integration
Mobile integrationMobile integration
Mobile integration
 
Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...
Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...
Emerginov, A Telco Web PaaS for African Cloud, Open Cloud Forum at Cloud Expo...
 
Palacio Gobierno del Ecuador
Palacio Gobierno del EcuadorPalacio Gobierno del Ecuador
Palacio Gobierno del Ecuador
 
OS Approach Industrializing Research Tools
OS Approach Industrializing Research ToolsOS Approach Industrializing Research Tools
OS Approach Industrializing Research Tools
 
Logic Circuit Project Final Presentation
Logic Circuit Project Final PresentationLogic Circuit Project Final Presentation
Logic Circuit Project Final Presentation
 
Jasmine Probe, OW2con11, Nov 24-25, Paris
Jasmine Probe, OW2con11, Nov 24-25, ParisJasmine Probe, OW2con11, Nov 24-25, Paris
Jasmine Probe, OW2con11, Nov 24-25, Paris
 
DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris.
DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris. DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris.
DocDokuPLM : Domain Specific PaaS and Business Oriented API, OW2con'16, Paris.
 

Similar to Altic's big analytics stack, Charly Clairmont, Altic.

Data Day Texas 2017: Scaling Data Science at Stitch Fix
Data Day Texas 2017: Scaling Data Science at Stitch FixData Day Texas 2017: Scaling Data Science at Stitch Fix
Data Day Texas 2017: Scaling Data Science at Stitch Fix
Stefan Krawczyk
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
Provectus
 
Data Day Seattle 2017: Scaling Data Science at Stitch Fix
Data Day Seattle 2017: Scaling Data Science at Stitch FixData Day Seattle 2017: Scaling Data Science at Stitch Fix
Data Day Seattle 2017: Scaling Data Science at Stitch Fix
Stefan Krawczyk
 

Similar to Altic's big analytics stack, Charly Clairmont, Altic. (20)

Serverless for High Performance Computing
Serverless for High Performance ComputingServerless for High Performance Computing
Serverless for High Performance Computing
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
 
Satisfaction hadoop meetup presentation
Satisfaction hadoop meetup presentationSatisfaction hadoop meetup presentation
Satisfaction hadoop meetup presentation
 
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
IoT databases - review and challenges - IoT, Hardware & Robotics meetup - onl...
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the Cloud
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the Cloud
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
 
Data Day Texas 2017: Scaling Data Science at Stitch Fix
Data Day Texas 2017: Scaling Data Science at Stitch FixData Day Texas 2017: Scaling Data Science at Stitch Fix
Data Day Texas 2017: Scaling Data Science at Stitch Fix
 
Google BigQuery for Everyday Developer
Google BigQuery for Everyday DeveloperGoogle BigQuery for Everyday Developer
Google BigQuery for Everyday Developer
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
 
When it all GOes right
When it all GOes rightWhen it all GOes right
When it all GOes right
 
Hadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter PointHadoop and the Data Warehouse: Point/Counter Point
Hadoop and the Data Warehouse: Point/Counter Point
 
The Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeThe Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On Time
 
Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016Dirty data? Clean it up! - Datapalooza Denver 2016
Dirty data? Clean it up! - Datapalooza Denver 2016
 
Building Modern Data Pipelines on GCP via a FREE online Bootcamp
Building Modern Data Pipelines on GCP via a FREE online BootcampBuilding Modern Data Pipelines on GCP via a FREE online Bootcamp
Building Modern Data Pipelines on GCP via a FREE online Bootcamp
 
Serverless for High Performance Computing
Serverless for High Performance ComputingServerless for High Performance Computing
Serverless for High Performance Computing
 
Data Day Seattle 2017: Scaling Data Science at Stitch Fix
Data Day Seattle 2017: Scaling Data Science at Stitch FixData Day Seattle 2017: Scaling Data Science at Stitch Fix
Data Day Seattle 2017: Scaling Data Science at Stitch Fix
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the trade
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 

More from OW2

OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...
OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...
OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...
OW2
 
Enabling DevOps for IoT software development, powered by Open Source, OW2onli...
Enabling DevOps for IoT software development, powered by Open Source, OW2onli...Enabling DevOps for IoT software development, powered by Open Source, OW2onli...
Enabling DevOps for IoT software development, powered by Open Source, OW2onli...
OW2
 

More from OW2 (20)

OW2 and RIOS teaming up to boost the open source impact, Nov. 2022 in Roma
OW2 and RIOS teaming up to boost the open source impact, Nov. 2022 in RomaOW2 and RIOS teaming up to boost the open source impact, Nov. 2022 in Roma
OW2 and RIOS teaming up to boost the open source impact, Nov. 2022 in Roma
 
The Open Source Good Governance Initiative presented at RIOS OS Week, Nov. 20...
The Open Source Good Governance Initiative presented at RIOS OS Week, Nov. 20...The Open Source Good Governance Initiative presented at RIOS OS Week, Nov. 20...
The Open Source Good Governance Initiative presented at RIOS OS Week, Nov. 20...
 
GLPi v.10, les fonctionnalités principales et l'offre cloud
GLPi v.10, les fonctionnalités principales et l'offre cloudGLPi v.10, les fonctionnalités principales et l'offre cloud
GLPi v.10, les fonctionnalités principales et l'offre cloud
 
Centreon: superviser le Cloud et le Legacy à partir d'une même plateforme, po...
Centreon: superviser le Cloud et le Legacy à partir d'une même plateforme, po...Centreon: superviser le Cloud et le Legacy à partir d'une même plateforme, po...
Centreon: superviser le Cloud et le Legacy à partir d'une même plateforme, po...
 
FusionIAM : la gestion des identités et des accés open source
FusionIAM : la gestion des identités et des accés open sourceFusionIAM : la gestion des identités et des accés open source
FusionIAM : la gestion des identités et des accés open source
 
OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...
OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...
OW2 Association Européenne aux racines grenobloises, transformer l'industrie ...
 
SFScon'20 Bringing the User into the Equation
SFScon'20 Bringing the User into the EquationSFScon'20 Bringing the User into the Equation
SFScon'20 Bringing the User into the Equation
 
Towards a sustainable solution to open source sustainability, OW2online20, Ju...
Towards a sustainable solution to open source sustainability, OW2online20, Ju...Towards a sustainable solution to open source sustainability, OW2online20, Ju...
Towards a sustainable solution to open source sustainability, OW2online20, Ju...
 
Advanced proactive and polymorphing cloud application adaptation with MORPHEM...
Advanced proactive and polymorphing cloud application adaptation with MORPHEM...Advanced proactive and polymorphing cloud application adaptation with MORPHEM...
Advanced proactive and polymorphing cloud application adaptation with MORPHEM...
 
Open Source governance and the Eclipse Foundation, OW2online, June 2020
Open Source governance and the Eclipse Foundation, OW2online, June 2020Open Source governance and the Eclipse Foundation, OW2online, June 2020
Open Source governance and the Eclipse Foundation, OW2online, June 2020
 
Open source contribution policies, OW2online, June 2020
Open source contribution policies, OW2online, June 2020Open source contribution policies, OW2online, June 2020
Open source contribution policies, OW2online, June 2020
 
Software development at scale, pandemic lockdown and oss ecosystems, OW2onlin...
Software development at scale, pandemic lockdown and oss ecosystems, OW2onlin...Software development at scale, pandemic lockdown and oss ecosystems, OW2onlin...
Software development at scale, pandemic lockdown and oss ecosystems, OW2onlin...
 
Overview of the OpenChain Reference Tooling Work Group, OW2online20, June 2020
Overview of the OpenChain Reference Tooling Work Group, OW2online20, June 2020Overview of the OpenChain Reference Tooling Work Group, OW2online20, June 2020
Overview of the OpenChain Reference Tooling Work Group, OW2online20, June 2020
 
Open Source Compliance at Orange, OW2online, June 2020
Open Source Compliance at Orange, OW2online, June 2020Open Source Compliance at Orange, OW2online, June 2020
Open Source Compliance at Orange, OW2online, June 2020
 
Ideas, methods and tools for OSS Compliance assessment, OW2online, June 2020
Ideas, methods and tools for OSS Compliance assessment, OW2online, June 2020Ideas, methods and tools for OSS Compliance assessment, OW2online, June 2020
Ideas, methods and tools for OSS Compliance assessment, OW2online, June 2020
 
Intelligent package management with FASTEN, OW2online, June 2020
Intelligent package management with FASTEN, OW2online, June 2020Intelligent package management with FASTEN, OW2online, June 2020
Intelligent package management with FASTEN, OW2online, June 2020
 
DECODER, a Smarter Environment for DevOps Teams , OW2online, June 2020
DECODER, a Smarter Environment for DevOps Teams , OW2online, June 2020DECODER, a Smarter Environment for DevOps Teams , OW2online, June 2020
DECODER, a Smarter Environment for DevOps Teams , OW2online, June 2020
 
Enabling DevOps for IoT software development, powered by Open Source, OW2onli...
Enabling DevOps for IoT software development, powered by Open Source, OW2onli...Enabling DevOps for IoT software development, powered by Open Source, OW2onli...
Enabling DevOps for IoT software development, powered by Open Source, OW2onli...
 
Upcoming Challenges in Artificial Intelligence Research and Development, OW2o...
Upcoming Challenges in Artificial Intelligence Research and Development, OW2o...Upcoming Challenges in Artificial Intelligence Research and Development, OW2o...
Upcoming Challenges in Artificial Intelligence Research and Development, OW2o...
 
Cacti and Big Data at Orange France, OW2online, June 2020
Cacti and Big Data at Orange France, OW2online, June 2020Cacti and Big Data at Orange France, OW2online, June 2020
Cacti and Big Data at Orange France, OW2online, June 2020
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Recently uploaded (20)

GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Altic's big analytics stack, Charly Clairmont, Altic.

  • 1. ALTIC Big Data Stack Charly Clairmont, ALTIC @egwada charly.clairmont@altic.org http://www.altic.org
  • 3. Our historical tools • ETL : Talend • Reporting : JasperReports, Birt • OLAP : Mondrian, Palo • BI platform : SpagoBI www.ow2.org Twitter #ow2con @egwada
  • 4. Smart assembling Innovation & customers'needs ● ● ● Identify when applied research is an opportunity for us, our solutions and our customers. ➔ Understand the business process of our customer & assess the impact of Open IT on their activities ➔ Offer an approach of the project both a technical and a operative ➔ ➔ ➔ Altic projects Allows our customer to optimize their business process Takes the customer job into account Offers perennial solutions Follows the customer present needs and not the editors' agenda www.ow2.org Twitter #ow2con @egwada
  • 5. Identify Big Data potential / Hadoop www.ow2.org Twitter #ow2con @egwada
  • 6. Our first Big Data project at Altic ● eFraudBox project (2010 – 2013) ● Goal : predict frauds on Internet ● Context : – – – ● Customer : GIE carte bancaire European Research and Development project Lot of industrial and academic partners Data : – – Type : Banking transactions Volume : One GB per day www.ow2.org Twitter #ow2con @egwada
  • 7. How did we start our first BigData project ? www.ow2.org Twitter #ow2con @egwada
  • 8. « In data mining processing is done line by line » … [ there's not about a data volume issue ] www.ow2.org Twitter #ow2con @egwada
  • 9. But we have too much data ! www.ow2.org Twitter #ow2con @egwada
  • 10. Let's have a look at Hadoop ? ● Open Source ● MPP compute platform ● ● ● Distributed file system MapReduce processing Cost efficient ● Fault tolerant ● Infinite scale ● Enterprise Information System ready ● Continuous Improvement ● « Even transactions are possible on Hadoop - it's inevitable that ALL kinds of workloads will move there in the future » Growing community Doug CUTTING Hadoop Creator Octobre 2013 www.ow2.org Twitter #ow2con @egwada
  • 11. How do we query Hadoop ? Java ● Very optimised ● Very customisable ● Pig Latin ● Easy syntax ● Support unstructured data ● www.ow2.org SQL like ● Easy development ● Twitter #ow2con @egwada
  • 12. How do we query Hadoop ? Need to code evertything ● ● Why not ? www.ow2.org We already know SQL ! ● Twitter #ow2con @egwada
  • 13. Ok, we have our storage and computation engine, but how can we manage data ? By using our Swiss Army Knife ! www.ow2.org Twitter #ow2con @egwada
  • 14. Now our Hadoop / Hive platform is filled with Big Data, but It's a little bit too slow to query for end users... http://ih2.redbubble.net/image.13088996.5766/sticker,375x360.png www.ow2.org Twitter #ow2con @egwada
  • 15. Aggregate data Processing data with Hive and store results in fast databases www.ow2.org Twitter #ow2con @egwada
  • 16. Ok, now we have our fast queryable datasets, but how can we visualize these ? To manage users and visualizations To quickly have a vision of your data To go deeper in your visualizations www.ow2.org Twitter #ow2con @egwada
  • 17. BigData and Datamining : tMahout + + = tMahout www.ow2.org Twitter #ow2con @egwada
  • 18. BigData and Datamining v2 ● Spark : new InMemory data processing framework ● Very appropriate for Machine learning ● MLBase : Machine learning library ● Spark-clustering : Implementation of SOM algorithm ● Proof Of Concept : Analysis of mobile telecommunications www.ow2.org Twitter #ow2con @egwada
  • 19. We have now a Big Data stack ! www.ow2.org Twitter #ow2con @egwada
  • 20. BI & Big Data for Altic ● Eventually, we still do BI as usual ● Tools evolve : – – ● New storage and processing We do not change our tools, fortunately THEY progress for us and we contribute Fundamental does not really change, only technologies do – – Hadoop Spark www.ow2.org Twitter #ow2con @egwada
  • 21. We improve our Big Data stack and its approach... And support Big Analytic customer project Our Big Data Stack Our Big Data Approach www.ow2.org Twitter #ow2con @egwada
  • 22. Questions ? Thanks ! Charly CLAIRMONT CTO at ALTIC @egwada charly.clairmont@altic.org http://altic.org www.ow2.org Twitter #ow2con @egwada