OW2 Big Data Initiative
Charly Clairmont, ALTIC 
@egwada / @altic_buzz 
charly.clairmont@altic.org 
http://www.altic.org
smart #OpenSource Software #BusinessIntelligence assembler
Twitter www.ow2.org #ow2 #sl2014 @Altic_buzz
Altic tools / approach 
• ETL : Talend 
• Big Data : Spark, Hortonworks Data Platform (Hadoop), Elasticsearch
• Data Warehouse : InfiniDB
• Reporting : JasperReports, BIRT
• OLAP : Mondrian, Palo 
• Dashboard : Tableau Software, D3 
• BI platform : SpagoBI 
Biclustering on Big Data
● Tugdual SARAZIN, PhD 
● ALTIC 
● LIPN (Paris 13)
● Biclustering
● A biclustering algorithm for Big Data
● Runs on Spark
● Based on SOM (Self-Organizing Map)
● Available on GitHub : Spark-Clustering
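
To make the SOM idea concrete, here is a minimal single-machine sketch of self-organizing map training in plain Python. It is not the Spark-Clustering implementation: the function names (`train_som`, `best_matching_unit`), the parameters and the toy data are all assumptions made for illustration. It only clusters rows, whereas a biclustering algorithm groups rows and columns together.

```python
import math
import random


def best_matching_unit(protos, row):
    """Return the grid cell whose prototype is closest to `row` (squared Euclidean)."""
    return min(protos, key=lambda c: sum((p - v) ** 2 for p, v in zip(protos[c], row)))


def train_som(rows, grid_w, grid_h, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small SOM; return a dict mapping grid cell (x, y) to its prototype."""
    rng = random.Random(seed)
    dim = len(rows[0])
    # One prototype per grid cell, initialised from randomly chosen input rows.
    protos = {(x, y): list(rng.choice(rows))
              for x in range(grid_w) for y in range(grid_h)}
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius shrink as training progresses.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 0.1)
        for row in rows:
            bmu = best_matching_unit(protos, row)
            for cell, proto in protos.items():
                # Cells near the best matching unit on the grid move the most.
                grid_d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2 * sigma ** 2))
                for i in range(dim):
                    proto[i] += lr * h * (row[i] - proto[i])
    return protos


# Hypothetical toy data: two loose groups of 2-dimensional points.
data = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
som = train_som(data, grid_w=2, grid_h=1)
assignments = [best_matching_unit(som, row) for row in data]
```

Each row ends up assigned to one grid cell. A distributed version would have to split the best-matching-unit search and the prototype updates across partitions, which this sketch does not attempt.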
Integration with SpagoBI 
● Spark Biclustering can be an engine for SpagoBI
● Define a data set as input
● Execute the biclustering with appropriate settings
● Store the result in a defined format
– Databases 
– Big data storage (HDFS) 
– SpagoBI Dataset 
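
The three steps above (input data set, execution with settings, stored result) can be sketched as a small pipeline. This is only an illustration under stated assumptions: it does not use any SpagoBI API, the function names are hypothetical, the data is CSV text held in memory, and `run_engine` is a placeholder that thresholds row means instead of running a real biclustering.

```python
import csv
import io


def load_dataset(csv_text):
    """Step 1: parse a CSV data set into (header, numeric rows)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [[float(v) for v in line] for line in reader]
    return header, rows


def run_engine(rows, settings):
    """Step 2: placeholder engine, NOT a biclustering algorithm.

    Labels each row 0 or 1 by comparing its mean to settings["threshold"].
    """
    return [int(sum(r) / len(r) >= settings["threshold"]) for r in rows]


def store_result(header, rows, labels):
    """Step 3: serialise the rows plus a 'cluster' column back to CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header + ["cluster"])
    for row, label in zip(rows, labels):
        writer.writerow(row + [label])
    return out.getvalue()


# Hypothetical input data set with two numeric columns.
source = "a,b\n0.0,0.2\n0.9,1.0\n"
header, rows = load_dataset(source)
labels = run_engine(rows, {"threshold": 0.5})
result = store_result(header, rows, labels)
```

In a real integration, the last step would write to a database, to HDFS or to a SpagoBI data set rather than to an in-memory string.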
Integration with Talend 
● Spark Biclustering can be a component for Talend Big Data 
● Add new features to existing Talend Big Data components 
– Biclustering 
● Lets you map your data
Thanks 
Charly CLAIRMONT 
@egwada / @altic_buzz 
charly.clairmont@altic.org 
http://www.altic.org

Spark Bi-Clustering - OW2 Big Data Initiative, altic
