This document discusses big data and Talend's goal of democratizing big data through its open source integration platform. It begins by defining big data and explaining the challenges it poses related to volume, velocity, variety and other factors. It then outlines Talend's goal of providing intuitive graphical tools to design and run big data jobs within Hadoop, abstracting away the underlying code generation. The document stresses that data quality is especially important for big data and explains how Talend supports implementing data quality checks either as part of loading data into Hadoop or as a separate job after loading. Finally, it provides an overview of Talend's roadmap to add support for additional Hadoop technologies over time, such as HCatalog, Oozie and more.
Big Data World Forum (BDWF http://www.bigdatawf.com/) is designed specifically for the data-driven decision makers, managers, and data practitioners who are shaping the future of big data.
Master Data Management (MDM) is one of the hot technology areas striving to solve the age-old data quality and data management problems of master data such as Customer, Product, Chart of Accounts (COA), etc. Of late, given the ever-increasing capabilities of hardware, global single instances of packaged applications, and mergers and acquisitions, it has become apparent that the data quality problems associated with master data continue to worsen. It is in this context that MDM solutions address the management of master data with robust data quality capabilities. The Trading Community Architecture (TCA) framework is Oracle's answer to the problem of managing customer data. Of late, TCA has evolved to also manage location data, supplier data, citizen data, etc. The objective of this session is to provide an overview of Master Data Management (MDM) and Oracle's Trading Community Architecture (TCA), and to show how TCA can be used to model customer data in an enterprise. This is an entry-level session; anyone with a keen interest in learning about MDM and TCA can attend. Attendees will:
- Learn the basics of Master Data Management (MDM), MDM for Customer, and Oracle's Trading Community Architecture (TCA)
- Learn about the importance of MDM to an enterprise
- Take a brief look at TCA's logical data model and the power and flexibility of the model for solving customer data problems
Business Intelligence: Optimizing Data Across the Enterprise (Proformative, Inc.)
Financial professionals often have too little and too much information at the same time. What they need is the data to make a great business decision fast. Discover how the finance executive of 2011 sifts through an exponentially growing pile of internal and external data to determine the best way to integrate and channel information to the right decision makers, at the right time, while maintaining appropriate controls over critical enterprise data.
Oracle has recently launched a new MDM hub for tackling the site domain. Many organizations in industries such as retail, utilities, and financials have a huge problem managing site information in their business context, along with all of the information (many attributes) that they need to manage at the site level. Oracle addresses this need with the launch of its Site Hub product.
This presentation covers a few of the use cases for Site Hub and discusses the features of the Site Hub product.
About ActuateOne for Utility Analytics
Water and energy utilities are under tremendous pressure to demonstrate progress in asset optimization, grid optimization and performance gains across traditional business drivers such as customers, revenue protection, utility regulatory compliance and financials. ActuateOne for Utility Analytics provides a comprehensive portfolio of software and utility analytics industry expertise to ensure today's utility leaders and customers always have access to the right information, insight and collaborative capabilities for accurate and informed decisions. Delivered through a single platform, ActuateOne for Utility Analytics jump-starts any utility or grid analytics initiative with integrated asset optimization dashboards, grid optimization dashboards and utility compliance reports, as well as transformer management scorecards, substation and equipment management scorecards and utility KPI dashboards, all of which help today's utilities enhance performance and maximize the grid.
Site/Location Hub is an MDM solution for mastering site/location data in an enterprise. It facilitates the management of an enterprise-wide (single) view of locations and associated information, with key MDM features such as data quality and extensibility built in.
Big Data can drive Big Value. In this presentation, Espen Sletteng-Fagerli, CTO of Avanade Norway, explains how Microsoft technology can best be used in Big Data scenarios, using technologies such as Hadoop on Azure, SQL Server and Microsoft Office. First presented at CIO-Forum in Oslo, Norway, on October 11th, 2012.
The Briefing Room with Mark Madsen and Hortonworks
Slides from the Live Webcast on Oct. 16, 2012
The power of Hadoop cannot be denied, as evidenced by the fact that all the biggest closed-source vendors in the world of data management have embraced this open-source project with virtually open arms. But Hadoop is not a data warehouse, nor is it ever likely to be. Rather, its ideal role for now is to augment traditional data warehousing and business intelligence. As an adjunct, Hadoop provides an amazing mechanism for storing and analyzing Big Data. The key is to manage expectations and move forward carefully.
Check out this episode of The Briefing Room to hear veteran Analyst Mark Madsen of Third Nature, who will explain how, where, when and why to leverage the open-source elephant in the enterprise. He'll be briefed by Jim Walker of Hortonworks who will tout his company's vision for the future of Big Data management. He'll provide details on their data platform and how it can be used to complete the picture of information management. He'll also discuss how the Hortonworks partner network can help companies get big value from Big Data.
Visit: http://www.insideanalysis.com
Intel Developer Forum: Taming the Big Data Tsunami Using Intel® Architecture, by Clive D'Souza, Solutions Architect, Intel Corporation, and Dhruv Bansal, Chief Science Officer, Infochimps
Patrice Bertrand is the chairman of CNLL, the National Council of Free Software. The CNLL gathers the french clusters of enterprises working in free software. Through these clusters, the CNLL represents more than 300 french businesses specialized in free and open source software. The missions of the CNLL are to facilitate and coordinate the actions of the clusters, to represent the branch towards public bodies, to raise awareness towards this job creating industry.
Patrice Bertrand is among the founders of Smile, a french integrator of open source software, which he served as General Manager up until 2013, notably defining and deploying its open source strategy.
He is the author of numerous essays and articles related to free and open source software, in all its aspects, economic, legal, societal as well as technical.
VLC media player is a universal, free and open-source, cross-platform media playback and streaming application. This presentation will give an overview of the VideoLAN project, the history behind VideoLAN and VLC, and the various legal issues we face.
VLC media player is published by VideoLAN, a French non-profit organization, and is mostly developed by volunteers in their spare time. VideoLAN’s additional products include the well-known H.264 encoder x264, specialized media streaming applications and a diverse set of libraries to support DVD and BR playback on all major operating systems. A bit of history of the VideoLAN project will be explained too.
VLC media player supports MS Windows, OS/2, Solaris, GNU/Linux, BSD and Mac OS X on the desktop; and Android, iOS and WinRT mobile operating systems. The application is licensed under the GNU General Public License version 2 and later, based upon a portable library licensed under the GNU Lesser General Public License.
The number of users of VLC is believed to be more than 100 million users worldwide, on all platforms.
Simplifying Big Data Analytics for the Business (Teradata Aster)
Tasso Argyros, Co-Founder & Co-President, Teradata Aster presents at the 2012 Big Analytics Roadshow.
The opportunity exists for organizations in every industry to unlock the power of iterative, big data analysis with new applications such as digital marketing optimization and social network analysis to improve their bottom line. Big data analysis is not just the ability to analyze large volumes of data, but the ability to analyze more varieties of data by performing more complex analysis than is possible with more traditional technologies. This session will demonstrate how to bring the science of data to the art of business by empowering more business users and analysts with operationalized insights that drive results. See how data science is making emerging analytic technologies more accessible to businesses while providing better manageability to enterprise architects across retail, financial services, and media companies.
EDF2013 - Richard Benjamins: Big Data – Big Opportunities – Big Risks? And What About Europe? (European Data Forum)
Keynote talk of Richard Benjamins, Director of Business Intelligence at Telefonica Digital, at the European Data Forum 2013, 9 April 2013 in Dublin, Ireland: Big Data - Big Opportunities - Big Risks? And what about Europe?
Webinar | Using Hadoop Analytics to Gain a Big Data Advantage (Cloudera, Inc.)
Learn about:
Why big data matters to your business: realize revenue, increase customer loyalty, and pinpoint effective strategies
The business and technical challenges of big data solutions
How to leverage big data for competitive advantage
The “must haves” of an effective big data solution
Real-world examples of Cloudera, Pentaho and Dell big data solutions in action
PromptCloud's presentation during the Nasscom Emerge 50 nominations for the 2012 Emerge 50 awards; PromptCloud was among the winners. This presentation gives an overview of our journey and what got us this far.
Infor i: Setting The Scene. Infor is the largest IBM i ISV in the World. (Inforsystemi)
Infor is the third largest provider of enterprise applications and services. Infor helps 70,000 customers in 194 countries improve operations, drive growth, and quickly adapt to changes in business demands.
You're one of the nearly 15,000 Infor manufacturing customers whose enterprise resource planning (ERP) solution leverages the IBM System i platform. You know it's a powerful combination. You'll be pleased to know that it's becoming more powerful every day.
Like IBM, Infor is making a significant investment in its System i capabilities so you can:
• Protect and leverage your current IT investment
• Easily add new capabilities to your ERP solution to meet changing business requirements
• Continue to enjoy the reliability, security and low total cost of ownership delivered by the System i and Infor ERP solutions
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights... (DATAVERSITY)
Do you wonder how to process huge amounts of data in a short amount of time? If so, this session is for you! You will learn why Apache Hadoop and Streams are the core frameworks that enable storing, managing and analyzing vast amounts of data. You will learn the idea behind Hadoop's famous map-reduce algorithm and why it is at the heart of solutions that process massive amounts of data with flexible workloads and software-based scaling. We explore how to go beyond Hadoop with both real-time and batch analytics, usability, and manageability. For practical examples, we will use IBM InfoSphere BigInsights and Streams, which build on top of open source tooling when going beyond the basics and scaling up and out is needed.
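As an aside, the map-reduce idea described above can be illustrated in miniature with plain Python. This is only a conceptual sketch of the three phases (map, shuffle, reduce); a real Hadoop job distributes each phase across a cluster, and none of these function names come from Hadoop itself:

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1

def shuffle_phase(pairs):
    # Shuffle: group all values by key, as Hadoop does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups.items()

def reduce_phase(groups):
    # Reduce: combine the grouped values (here, by summing the counts).
    return {key: sum(values) for key, values in groups}

docs = ["big data big value", "big data analytics"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
```

Running the sketch on two tiny "documents" yields the familiar word-count result, the same computation Hadoop performs at petabyte scale.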
Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris... (Cloudera, Inc.)
This presentation will explore how Hadoop and Big Data are re-inventing enterprise workflows, and the pivotal role of the Data Analyst. It will examine the changing face of analytics and the streamlining of iterative queries through evolved user interfaces. The speaker will cut through the hype around “shorter time to insight” and explain how combining Hadoop and SQL-based analytics helps companies discover emergent trends hidden in unstructured data, without having to retrain data miners or restaff. In particular, it will highlight how this paradigm changes Big Data analysis and illustrate step by step how analysts can now connect to Big Data platforms, assemble working data sets from disparate sources, analyze and mine that data for actionable insight, publish the results as visualizations and as feeds for reporting tools, and operationalize Map-Reduce and Big Data outcomes into company workflows – all without touching the command line.
#OSSPARIS19 : Control your Embedded Linux remotely by using WebSockets - Gian... (Paris Open Source Summit)
Always wanted to control your IoT device without SSH'ing into it? In this talk we will show how WebSockets, MQTT and a set of custom go/js libraries can help in managing remotely your IoT device without knowing its IP address. Learn how you can use the Arduino Create Agent to easily deploy containers, remotely. A journey on Docker client, APT command line, sockets, systemd and much more on Arm and Intel Linux devices.
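As a rough illustration of the pattern behind such broker-based remote control (the device subscribes to a topic and dispatches each command it receives), here is a minimal sketch in Python. The command names, payload format and handlers are all hypothetical; this is not the Arduino Create Agent's actual protocol nor the go/js libraries from the talk, and the transport (MQTT or WebSockets) is omitted so only the dispatch logic is shown:

```python
import json

# Hypothetical command set for an IoT device; a real agent would register
# handlers for container deployment, package installs, systemd units, etc.
COMMANDS = {
    "reboot": lambda args: "rebooting",
    "deploy": lambda args: f"deploying container {args['image']}",
}

def handle_message(payload: str) -> str:
    """Decode one JSON command message, e.g. received on an MQTT topic
    or over a WebSocket, and dispatch it to the matching handler."""
    msg = json.loads(payload)
    handler = COMMANDS.get(msg["cmd"])
    if handler is None:
        return f"unknown command: {msg['cmd']}"
    return handler(msg.get("args", {}))

print(handle_message('{"cmd": "deploy", "args": {"image": "arduino/agent:latest"}}'))
```

Because the device initiates the connection to the broker and merely listens for messages, no inbound SSH access or knowledge of the device's IP address is needed.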
#OSSPARIS19 : RIOT: towards open source, secure DevOps on microcontroller-bas... (Paris Open Source Summit)
Over-the-air firmware updates on microcontrollers have always been an ambitious topic, yet an essential one for securing an IoT application. The RIOT operating system (https://riot-os.org) now provides the software building blocks to perform firmware updates using standard, end-to-end secure protocols.
#OSSPARIS19 : The evolving (IoT) security landscape - Gianluca Varisco, Arduino (Paris Open Source Summit)
IoT is at the peak of the hype cycle - what they call the 'Peak of Inflated Expectations’. The complexity of the cybersecurity landscape is at an all-time high, with security researchers, vendors and even governments all trying to come to a consensus for making the cyber-world a safer place. In this world of lightning-fast development cycles, it may intuitively feel like security gets left behind. The battle over standards is always a struggle. The unresolved problem of software updates and short vendor support cycle combined with the lack of effort into security makes these devices an easy target. Companies not only need to update their technology stack for the evolving security landscape but also their mindset, processes and culture. This talk will shine a light on some of the challenges that today’s executives face in finding and fixing systemic problems in and outside of security through people, tools and understanding.
#OSSPARIS19: Building "secure-by-design" IoT applications - Thomas Gaza... (Paris Open Source Summit)
"Cette présentation a pour but de présenter MirageOS et ses applications à l'écriture d'applications IoT sécurées. En particulier, MirageOS permet de développer des applications d'infrastructure réseau --- firewalls, proxy VPN, serveurs d'emails, etc. --- qui peuvent être déployées sur des processeurs embarqués de type ARMv8, ESP32 ou RISC-V. Nous expliquerons comment nous nous appuierons sur cette couche d'infrastructure entièrement open-source pour développer OSMOSE, une plateforme sécurisée et décentralisée permettant de construire des application IoT centrées sur l'utilisateur et le respect de sa vie privée.
"
#OSSPARIS19 : Detecting time-series anomalies on the fly with Wa... (Paris Open Source Summit)
WarpScript is an open source programming language designed to easily query, manipulate and process time-series data on the fly. Although it is natively compatible with Warp 10, WarpScript can also be connected to other data sources. In this presentation, we will detect anomalies on the fly using WarpScript functions and answer the following questions. What should be defined as an anomaly? Which algorithm matches the type of anomaly we are trying to detect? How do we take into account the possible seasonality of the data?
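The talk works in WarpScript, but the "which algorithm" question can be illustrated language-agnostically. Here is a minimal sketch in Python of one common choice, a rolling z-score detector; the window size and threshold are arbitrary, and this does not address the seasonality question the talk raises:

```python
from collections import deque
from math import sqrt

def rolling_zscore_anomalies(series, window=5, threshold=3.0):
    """Flag indices whose value deviates by more than `threshold` standard
    deviations from the mean of the preceding `window` points."""
    history = deque(maxlen=window)
    anomalies = []
    for i, x in enumerate(series):
        if len(history) == window:
            mean = sum(history) / window
            var = sum((v - mean) ** 2 for v in history) / window
            std = sqrt(var)
            if std > 0 and abs(x - mean) / std > threshold:
                anomalies.append(i)
        history.append(x)
    return anomalies

# A flat series with one spike: only the spike is flagged.
print(rolling_zscore_anomalies([10, 10, 11, 10, 10, 11, 50, 10, 11]))
```

A streaming engine like Warp 10 applies this kind of windowed computation continuously as points arrive, rather than over a finished list.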
#OSSPARIS19 : How ONLYOFFICE helps organize research work ... (Paris Open Source Summit)
ONLYOFFICE, developed by Ascensio System SIA, is an open-source office suite based on the HTML5 Canvas element, offering a full range of online editing tools for text documents, spreadsheets and presentations.
This presentation begins with an overview of the basic principles:
- support for all common formats,
- a rich set of formatting tools,
- identical rendering of content regardless of the browser used,
- resources for extending the editors' functionality,
- advanced co-editing capabilities,
- secure real-time data transfer.
The number of universities and schools opting for open source alternatives to the popular solutions offered by the big brands grows every year. ONLYOFFICE solutions are currently used by more than 30 educational institutions in France, including thirteen Sorbonne universities, the Université de Grenoble, the Université de Nantes, the École Nationale d'Ingénieurs de Brest, the public institution Campus Condorcet, etc.
In this part, Jeremy Maton, Systems and Network Administrator at the Institut de Biologie de Lille, will present how ONLYOFFICE is integrated within their research unit and helps organize the workflow.
#OSSPARIS19 - Understanding Open Source Governance - Gilles Gravier, Wipro Li... (Paris Open Source Summit)
Strategy, the risks associated with adopting open source... How a strong governance model can make your open source journey as effective as possible.
#OSSPARIS19 : Publishing Open Source code in a bank: Mission impossibl... (Paris Open Source Summit)
In a bank that is 200 years old, it may not seem obvious at first that one could make the case for an open source approach. And yet, we did it!
In this talk, we will explain how the idea of publishing open source code was born, and which levers and opportunities we used to convince our various management teams. We will also cover the difficulties we encountered and the choices we made.
If you too work in a bank, an insurance company or an industrial group, and you are looking for keys to starting an open source initiative, come and see us!
#OSSPARIS19 - First-installation tutorial for VITAM, an archiving system ... (Paris Open Source Summit)
#Business #Apps - Track - Document management and collaboration
VITAM is an open source archiving solution used for high volumes of up to billions of documents. It is a distributed system that can be deployed on bare metal as well as on an OpenStack cloud, using from 3 to more than 100 VMs.
It is designed to be efficient and very easy to administer. The main technical operations are fully automated.
This presentation will give you the key information about the VITAM architecture, how to install it on your infrastructure and the classic pitfalls to avoid. It will also give you a chance to meet engineers involved in VITAM's development.
This is a classic diagram that maps how business and data are related. Nothing here is new, and this never changes. In fact, it becomes even more important today.
We accomplish this innovation by offering two editions of our products. Talend Open Studio, at the bottom of this diagram, is a set of free open source products for Data Quality, Data Integration, Master Data Management, Enterprise Service Bus and Business Process Management. When you are ready to deploy, you can purchase a Talend Enterprise commercial license, which includes the features found in world-class integration solutions, such as extreme scalability, high availability and 24x7 mission-critical support, all backed by a large services and partner ecosystem.
Unlike competitors' “non-integrated” integration products, Talend's uniqueness is the unification of our products: they are built on the same unified platform, maximizing your productivity and providing greater software reuse and repeatability. An analogy would be the user experience you see with the integration of the iPod, iPad and iPhone. As shown in this picture, our products leverage the same studio, repository, and deployment, execution and monitoring tools to maximize your productivity. As modular products, you can buy what you need when you need it, or easily combine them to solve more comprehensive integration problems.
For instance, this is a SIMPLE drawing of how the map-reduce features work. It is abstract and does not reflect the complexity of the underlying code. Even so, it is still pretty complex.
Big data has an OPERATIONAL data integration (DI) challenge. This is the core of what Talend was built on and part of our DNA. We simplify the implementation process to speed projects and increase adoption. Note: I am trying to get a recording that can be embedded in the slide that will build an HDFS load as you speak. It is so simple that it was completed in the time it took me to present this slide!
Finally, the entire big data world has been built as an open source ecosystem. This all makes sense… Talend is the open source leader. To this end, we will introduce the first complete set of tools that will democratize big data: Talend Open Studio for Big Data.
However, with big data come significant challenges. For example, poor data quality can be magnified at huge scale. Consider a small company with 100 customers. Assume they had a bad address for three customers and sent a mailer out to their list. Three mailers would be returned and they would have wasted about 5 dollars or so. Now imagine the world of big data, where the number of customers expands across business lines, companies and partners to millions. The costs are big. Even more interesting is the ability not only to use the data but to analyze it. Across your customer base, how could you monitor and analyze every interaction customers have ever had with you (social media, web, stores, etc.)? That is a large amount of data. A small problem with the data can lead to very LARGE issues with analysis, invalidating the entire reason for big data. Data quality is KEY for big data – it is a core tenet of our strategy.
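The bad-address scenario can be sketched as a simple quality gate applied before records are loaded. This is a conceptual illustration in Python, not Talend's implementation; the field names and validation rules are hypothetical, and a real pipeline would use an address-verification service rather than a regex:

```python
import re

# Hypothetical rule: a record needs a non-empty name and a US-style ZIP code.
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")

def is_valid(record):
    return bool(record.get("name")) and bool(ZIP_RE.match(record.get("zip", "")))

def split_for_load(records):
    """Partition records into a clean set to load and a reject set to fix,
    mirroring a quality gate placed before (or after) the load into Hadoop."""
    good = [r for r in records if is_valid(r)]
    bad = [r for r in records if not is_valid(r)]
    return good, bad

customers = [
    {"name": "Acme Corp", "zip": "94107"},
    {"name": "Widget Co", "zip": "9410"},   # malformed ZIP: rejected
    {"name": "", "zip": "10001"},           # missing name: rejected
]
good, bad = split_for_load(customers)
print(len(good), len(bad))  # 1 good record, 2 rejects
```

At 100 customers the reject pile is a nuisance; at millions of records it is exactly the analysis-invalidating problem described above, which is why the gate belongs in the pipeline rather than in a manual cleanup step.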