1
©2015 Talend Inc
Embedding Machine Learning for Actionable
Insight
Introducing Talend 6.1
2
Connecting the Data-Driven Enterprise
Data-Driven companies…
· 23 times greater customer acquisition
· 6 times greater customer retention
· 19 times more profitability
McKinsey’s DataMatics 2013 Survey - Using customer analytics to boost corporate performance
3
• Data-Driven Opportunities and
Challenges
• Introducing Talend 6.1
• Demo
• Next Steps
Agenda
4
Data
Explosion
44
Trillion
Gigabytes
Cloud
Success
$7B
AWS Growing
at 81%
80%
time fixing
data
Self-service
Data
1. 7th Annual IDC Digital Universe Study estimates the digital universe will be 44 Zettabytes by 2020
2. Re:Invent 2015 Keynote - Andy Jassy
3. Recent report by Crowdflower found that data scientists spend 80% of their time wrangling data.
What We Believe: Market Changes
5
INTERNET OF THINGS
Potential OpEx Savings
(15 Year Timeframe)
+276B
SMART ENERGY
(Oil & Gas)
1% Reduction in Capital
Expenditures
+90B
SMART UTILITIES
1% Fuel Savings
+66B
+30B
+27B
SMART HEALTHCARE
1% Improvement in
Operational Efficiency
SMART AVIATION
1% Fuel Savings
CONNECTED
TRANSPORTATION
1% Improvement in
Operational Efficiency
+63B
Source: http://www.ge.com/docs/chapters/Industrial_Internet.pdf
6
• Data Science
• Extracting information from large
volumes of data to improve decisions
• Machine Learning
• A form of artificial intelligence, where
computers can learn and act to make
decisions
• Examples:
• Fraud detection
• Predict machine failure
• Customer recommendations
The Importance of Data Science and Machine Learning
Benefit: Making data applications intelligent
7
• Acquiring skills to implement analytics
• Analysis is often on a sample of data
instead of all your data
• 67% state cleaning and organizing data is
the most time-consuming task
• It takes a long time to put a model into
production (months)
• Analytics models change frequently
requiring system updates
• Many tasks are unproductive hand-coding
New Challenges In Becoming Data-Driven
8
?
How Do You Turn Data Science Into Production?
• Use data science tools
• Program in R, Spark
• Acquire data, model,
analyze
• Create a prediction,
model, score
• Use data integration tools
• Little coding
• Continuous integration –
dev, test, deploy, maintain
• Deploy a prediction,
model, score
Need to move from deploying models in months to days
9
What if there was an
easier way to
operationalize analytics?
10
Faster
deployment of
advanced
analytics
Protect all your
data at the
speed of Spark
Expands your
big data
ecosystem
Improves
usability and
collaboration
Introducing Talend 6.1
Talend Continues Big Data Integration Innovation
11
Benefits: Make decisions faster. Tremendous developer productivity.
• Visually develop jobs that run 100% on Spark
• 5X faster than MapReduce in independent benchmarks
• 10X developer productivity gained over hand-coding Spark
• Over 100 new drag-n-drop Spark components
• HDFS, RDBMS, NoSQL, Cloud Storage,
Transformation, Messaging, In-memory
analytics & machine learning
recommendations, and much more
• In-memory data caching & “windowed”
computations
• Click to enable Spark Streaming for real-time
data processing
Talend 6 Introduced Talend Real-time Big Data
1st Data Integration Platform on Spark
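To make the “windowed” computations bullet concrete, here is a minimal sketch of a time-based sliding window over a stream of readings. It is plain Python, not Spark Streaming and not Talend-generated code; the function name `windowed_averages`, the sample readings and the 3-second window are all illustrative assumptions.

```python
from collections import deque


def windowed_averages(events, window_seconds):
    """Yield (timestamp, mean of the values seen in the last `window_seconds`)."""
    window = deque()  # (timestamp, value) pairs still inside the window
    total = 0.0
    for ts, value in events:
        window.append((ts, value))
        total += value
        # Evict readings that have aged out of the time window.
        while window and window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        yield ts, total / len(window)


# Hypothetical sensor stream: (timestamp in seconds, reading).
readings = [(0, 10.0), (1, 12.0), (2, 14.0), (5, 30.0)]
averages = list(windowed_averages(readings, window_seconds=3))
print(averages)  # [(0, 10.0), (1, 11.0), (2, 12.0), (5, 30.0)]
```

A monitoring job could compare each windowed average against a threshold to flag a sudden change, such as the jump at timestamp 5 in this sample.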
12
New Components to Operationalize Analytics
Talend 6.1 Continues Big Data Innovation
Entity   | Question                        | Model Type              | Talend Components (MLlib)
Customer | Buy / No Buy, Fraud / Not Fraud | Classification          | Random Forest, Naïve Bayes (6.0)
         | Predict Churn, Forecasting      | Regression              | Logistic Regression
         | Gold Customers (Segmentation)   | Clustering              | K-means
Product  | Recommendation                  | Collaborative Filtering | Alternating Least Squares (ALS) (6.0)
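As an illustration of the clustering row (K-means for customer segmentation), the following is a small sketch of Lloyd's algorithm in plain Python. It is not the Talend component and not Spark MLlib; the customer data, the choice of features and the deterministic "first k points" initialization are assumptions made so the example stays short and repeatable.

```python
import math


def kmeans(points, k, iterations=10):
    """Lloyd's algorithm on 2-D points; centroids start at the first k points."""
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical customers as (annual spend in $k, orders per year).
customers = [(1.0, 2.0), (9.0, 20.0), (1.5, 3.0), (8.0, 18.0), (1.2, 2.5), (10.0, 22.0)]
centroids, segments = kmeans(customers, k=2)
print(segments)  # [0, 1, 0, 1, 0, 1]
```

Here segment 1 collects the high-spend, high-frequency customers, which is the kind of "gold customer" grouping the table refers to. A production job would normally use a library implementation with a better initialization strategy.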
13
• Fast data masking performance running on
Hadoop and Spark
Benefit: Meet compliance mandates and prevent data breaches.
Data
Masking
More Secure Data – Now Runs on Spark
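One common way to mask data without breaking format rules is to replace each character with another of the same kind, so a masked phone number still looks like a phone number. The sketch below shows that idea in plain Python. It is an assumption-laden illustration, not a description of how the Talend masking component works; `mask_value` and the key are hypothetical.

```python
import hashlib
import hmac
import string


def mask_value(value, secret):
    """Replace each letter or digit with a keyed pseudo-random one of the same kind."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    for i, ch in enumerate(value):
        byte = digest[i % len(digest)]
        if ch.isdigit():
            masked.append(string.digits[byte % 10])
        elif ch.isalpha():
            alphabet = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            masked.append(alphabet[byte % 26])
        else:
            masked.append(ch)  # keep separators so the format is preserved
    return "".join(masked)


key = b"hypothetical-masking-key"  # assumption: a secret managed outside the job
masked_phone = mask_value("415-555-0123", key)
print(masked_phone)
```

Because the substitution is keyed and deterministic, the same input always masks to the same output, so joins on a masked column still line up. Whether that property is desirable depends on the compliance requirement.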
14
Building Intelligent Data Applications
Talend 6.1 Operationalizes Analytics
Data Integrate Learn Act Value
15
Building Intelligent Data Applications
Talend 6.1 Operationalizes Analytics
Data
Fuel
learning
Apply
learning
Integrate Learn Act Value
• Graphical tools simplify
development
• Continuous integration
speeds delivery
• Deploy on Spark and
Hadoop
• Batch and streaming
• Data cleansing and masking
• Machine learning
• Time-based analysis
• 900+ components
Benefit: Easily move data science into data processing applications
16
Real-Time Analytics Use Cases
Predictive Maintenance
Personalized
Patient Care Smart Cities
Product Recommendation Precision Farming
Dynamic Pricing
17
Demo Time
18
1. Highlight Predictive Analytics scenario
2. Show Data Masking
3. Introduction to the Talend 6 Real-time Big Data Sandbox
Talend 6.1 Demonstration
19
Discover Spark with Talend Sandbox
Create a streaming
data flow
with Kafka
Create a
recommendation
model with Spark
MLlib
Create a real-time
Spark
recommendation
engine
<Sandbox link>
#TalendConnect
20
Talend Real-Time Big Data Sandbox
Big Data Insights
21
Where do I get the Sandbox?
http://www.talend.com/products/real-time-big-data
22
Faster
deployment of
advanced
analytics
Protect all your
data at the
speed of Spark
Expands your
big data
ecosystem
Improves
usability and
collaboration
Introducing Talend 6.1
Talend Continues Big Data Integration Innovation
23
BIG DATA
Cloudera Navigator 2.3 certification
New and updated connectors
• Cassandra 2.2
• Cloudera 5.5
• Hortonworks 2.3
• MarkLogic 8
• Microsoft Azure HDInsight 3.2 on Spark
• MapR 5.0
DATA INTEGRATION AND DATA QUALITY
Git distributed revision control system
24
MDM
Graphical entity / relationship visualizer
Hierarchy search panel
ESB / DATA MAPPER
Continuous delivery for agile development
Routes and routelets improve reuse
XA transaction support for complex
transactions
Easier mapping of SAP IDoc
Application profiles for provisioning service
25
http://www.talend.com/products/real-time-big-data
Learn More
Free Spark Real-Time Big Data Sandbox!
https://info.talend.com/prodevaltpbdrealtimesandbox.html
Talend 6.0 Webinar-on-Demand
http://www.talend.com/resources/webinars/what’s-new-in-talend-6

Talend 6.1 - What's New in Talend?


Editor's Notes

  • #2 Welcome to this webinar, Embedding Machine Learning for Actionable Insight: Introducing Talend 6.1. I am <speaker, title>, and today we have a very exciting session to show how easy it is to start moving your data science projects into production. We are going to present the new features in Talend 6.1 and demonstrate how you can “drag-and-drop” your way, using graphical tools and pre-built Spark machine learning components, to operationalize and embed analytics into your systems. This will not only bring significant new insight into your business, but will also enable you to act on that insight in real time.
  • #3 As a brief introduction, Talend is the leading open source integration software provider focused on enabling organizations to become data-driven enterprises. A recent report from McKinsey Global Institute highlights the impact of making decisions based on data-driven insights. In the end, companies that are data-driven, that can gather, process and analyze data as it flows through the enterprise, make better decisions. This approach results in a 23 times greater likelihood of customer acquisition, a 6 times greater likelihood of customer retention and a 19 times greater likelihood of profitability. Talend 6.1 delivers new capabilities that make companies even more data-driven, and able to turn all their data into decisions.
  • #4 Today we are going to talk about data-driven opportunities and both the organizational and technical challenges companies face in becoming data-driven. Next we will review the new capabilities that Talend 6.1 provides, allowing you to easily get more out of your data than ever before. And it is more than just moving data from point A to point B. Today companies need to create intelligent data applications that can sense and respond to opportunities and threats as they occur. Then we will do a demonstration showing some of the new machine learning and data masking capabilities that Talend 6.1 delivers. Finally, we will wrap up with some next steps that you can take today to get quick wins in your company and be on the path to a truly data-driven enterprise.
  • #5 We are seeing three significant trends that are impacting every company today. First, the amount of data being generated is staggering. IDC estimates that by 2020 the digital universe will reach 44 trillion gigabytes. The amount of data being created and consumed is doubling every 2 years. Second, the cloud is becoming the new frontier for applications, consisting of tens of thousands of applications. As an example, Amazon recently reported record revenue for Amazon Web Services at $7B, which is growing at 81% year over year. That is amazing. Finally, a recent Data Scientist and Analytics report by Crowdflower found that data scientists spend 80% of their time wrangling data. With the appetite for data that everyone in your company has in order to make better decisions, it is necessary that everyone can get to clean data and use it for their analysis. In summary, these 3 trends are having a profound impact on how companies manage their data today. Sources: 1. 7th Annual IDC Digital Universe Study estimates the digital universe will be 44 Zettabytes by 2020. 2. Re:Invent 2015 Keynote - Andy Jassy. 3. Recent report by Crowdflower found that data scientists spend 80% of their time wrangling data.
  • #6 And by having intelligent data applications, instead of ETL pipelines that just push data from point A to point B, you can get tremendous benefits. A recent GE Internet of Things research study shows that a modest 1% improvement in operational efficiency in key industries such as aviation, energy, transportation and logistics, and healthcare could drive a potential $276B in opex savings over the next 15 years. Today Talend customers including GE and Otto are achieving significant benefits through intelligent data applications. You need to ask yourself: what is your 1%? What if you could reduce operational expenses by 1%? Source: http://www.ge.com/docs/chapters/Industrial_Internet.pdf
  • #7 Data Science and Machine Learning are two topics that are getting a lot of interest today, as companies move from batch analytics to predictive analytics to prescriptive analytics. Data is the lifeblood of your business. You need to not only absorb as much as you can, but also analyze it and act on it to make data-driven decisions. But what exactly are Data Science and Machine Learning? Data Science is about extracting information from large volumes of data to improve decisions. It means using data instead of your intuition, and it combines computer science, statistics, mathematics and domain expertise. An example of data science would be finding the locations with the highest car theft in a city. San Francisco did such a study and found them to be near parks that had a number of access points, so the city changed how it monitors for these events. Another example is how the city of Portland, Oregon optimized traffic signals to eliminate 157,000 tons of CO2 emissions in 6 years, the equivalent of taking 30,000 cars off the road for a year. Machine Learning is a form of artificial intelligence and data science where computers can learn and act to make decisions. Examples: Fraud detection, or looking through terabytes of data to find patterns so you can not only stop fraud, but also predict where it will occur. Predicting machine failure: as an example, one of Talend’s customers, GE, can predict if wind turbines are going to fail. Customer recommendations, such as product recommendations on a website based on past purchases. And as we know today, there is a huge demand for data scientists: people who can create these models, so you make decisions based on data, NOT intuition.
  • #8 Amid all this opportunity is the operational question of how you turn data into decisions, or how you move data science into reality. Think of the back office, or an assembly line. You have all these data sources producing thousands or millions of events per day coming into your business, and you need to process them intelligently and quickly. This way you can tell, for example, if there is an opportunity to upsell a customer, or predict if someone is going to abandon their shopping cart. The challenges companies are finding, however: Acquiring skills to implement analytics. Data scientists are expensive and hard to find. Analysis is often on a sample of data instead of all your data. Although the sample size may be good enough, there are inherent risks that it is not. Dirty data means dirty insight. 67% of respondents reported that cleaning and organizing data is the most time-consuming task. This stat is amazing: we want data scientists to model and analyze, not clean data. It takes a long time to put a model into production. A recent TDWI report states that for most companies it takes months to deploy a model into operational use. And once a model is deployed, it can change frequently, requiring updates to the deployed system. Otherwise, out-of-date models mean out-of-date insight. And finally, many tasks are unproductive hand-coding, for both the data scientist and the big data developer. <next slide> --------More ------------ The Four Things Data Scientists Wish You Knew: Get The Most Out Of Your Data Science Investments, by Brian Hopkins, October 26, 2015. A study by McKinsey projects that “by 2018, the U.S. alone may face a 50 percent to 60 percent gap between supply and requisite demand of deep analytic talent.” Data scientists cost $200K/yr and up, or approximately 50% more than a Java developer.
Sometimes, after the initial exploration phase, the work of a data scientist will be “productized,” or extended, hardened (i.e., made fault-tolerant), and tuned to become a production data processing application, which itself is a component of a business application. For example, the initial investigation of a data scientist might lead to the creation of a production recommender system that is integrated into a web application and used to generate product suggestions to users. Often it is a different person or team that leads the process of productizing the work of the data scientists, and that person is often an engineer.
  • #9 The question on everyone's mind is: how do I turn these insightful models into an assembly line of intelligent data that is sensing and responding in the moment? The data scientist will acquire data and build the model, but then the IT developer needs to deploy that model. How do I take the data scientist's work and, as Brian Hopkins from Forrester states, “deploy it in a way that prompts action”? How do you move a model into something that is scalable, is easy to maintain, accesses all your data, and acts in real time based on what is happening, such as when a machine is about to fail? Today, we need data scientists and developers to collaborate more. ---------- Supporting quote: “The output of data science, a prediction, a model, or a score, is useless until the organization deploys it in a way that prompts action.” Brian Hopkins, Forrester
  • #10 What if there was an easier way to operationalize analytics? This is what Talend 6.1 enables….
  • #11 Well, we are very excited to introduce Talend 6.1, which continues Talend’s innovation in big data integration. Talend 6.1 delivers new machine learning components based on Spark MLlib, so you can build intelligent data pipelines and deploy advanced analytics faster. You can easily move data models into production, supporting fast and frequent iterations. It also improves big data design and development collaboration between data scientists and developers. We are not saying that developers can do all the work of the data scientist; what we are offering are tools to help developers build and deploy analytics models. Gartner recently forecast the role of a “citizen data scientist”, someone who can do some of this work, freeing up the data scientist. Talend 6.1 provides new data masking capabilities on Spark, so you can make data private across the data lake and connected systems, reducing associated risks. It expands your big data connectivity options with support for the latest big data technologies from Cloudera, MapR, Hortonworks, Amazon and Microsoft, and new partnerships with MarkLogic and ServiceNow. Talend 6.1 also includes numerous enhancements, such as Git support, to help your development and operations teams be much more productive in supporting fast and frequent iterations of machine learning models, and new ESB and MDM capabilities that increase the productivity of your team. ------stop -------- A recent study showed that the most time-consuming task for data scientists is cleaning and organizing data. Now, with Talend 6.1 support for Spark machine learning libraries (MLlib) and other new features, developers can easily move data science models into production, supporting fast and frequent iterations. With Talend 6.1, developers use pre-built components and drag-and-drop tools to build Spark analytics models for customer segmentation, forecasting, classification, regression analysis and more. Join our live webinar to learn more about Talend 6.1, including: Spark machine learning algorithms for advanced analytics; data masking on Spark; continuous delivery enhancements with Git; new and updated big data components and connectors.
  • #12 For those who might have missed our recent announcement, we just announced Talend Real-Time Big Data, the 1st data integration platform on Spark. It provides native support for Spark, Spark Streaming, and Hadoop, along with enterprise messaging capabilities like Kafka and Kinesis. Using the graphical Talend Studio, you visually build integration jobs that generate native Spark code. That code can then be run inside a Hadoop platform like Cloudera, Hortonworks or MapR, or it can be run standalone; in fact, our new Real-Time Big Data Sandbox runs standalone. It can also run in the cloud, for example in Amazon EMR. Talend 6 with Spark is 5X faster than MapReduce using independent benchmarks, and as a developer you are 10 times more productive using Talend model-driven tooling instead of hand-coding Spark. Talend 6 provides over 100 new drag-and-drop Spark components, so you can immediately connect to traditional data sources, Hadoop, NoSQL and cloud storage. There are components for transformation, for connecting to messaging systems like Apache Kafka and Amazon Kinesis, and for machine learning and caching as well. Another important capability that comes with Spark and Talend 6 is running processes in memory and doing window-based computations for time series analysis, which is extremely important for building intelligent data applications. For example, if you are ingesting a stream of data, you can set a time period over which to analyze that data stream for any changes.
  • #13 Talend 6.1 continues big data innovation with new components to operationalize analytics. When analytics are embedded into business systems, the end result is that analytics become more consumable, which means that more people can make use of analytics output. It also makes analytics actionable. Talend 6.1 adds Classification, Regression and Clustering components to our Talend 6 components, so you can “drag-and-drop” your way to operational analytics with easy iterative development. And best of all, Talend generates native Spark code, so there is no hand-coding, and with Continuous Delivery integration your development team is extremely productive in developing, deploying and maintaining analytics models. This also means that you can offload some analytics tasks to your development team. This table highlights what both Talend 6.0 and 6.1 deliver <step through each row in table ….> • Will an event happen in the future? Classification of fraud, churn, propensity to buy • How much, or what is the forecast? Regression: estimate/predict the potential outcome of actions • What are the different profiles in this population? Clustering customers or sites by similar behavior • Which products are likely to be bought together? Recommendation --------background information -------------------- 6.0 shipped with 2 models for 2 families: Recommendation (tALSModel) and Classification (tNaiveBayes). 6.1 extends it with a component to help featurization, 3 new models in Classification and 1 added model in the Clustering family. It is entirely based on Spark ML and MLlib to take advantage of all the framework's capabilities, especially from Spark 1.4+. What is Spark MLlib? Spark comes with a library containing common machine learning (ML) functionality, called MLlib.
MLlib provides multiple types of machine learning algorithms, including classification, regression, clustering, and collaborative filtering, as well as supporting functionality such as model evaluation and data import. It also provides some lower-level ML primitives, including a generic gradient descent optimization algorithm. All of these methods are designed to scale out across a cluster.
  • #14 In Talend 6.1, we extend Talend Data Masking features to run on Spark, so now you can privatize the information in your data lake. This helps meet compliance mandates or privacy codes of conduct, and protects data against abuse or breaches. This component (tDataMasking) obfuscates data (numbers, strings, dates, personally identifiable information and so on) without impacting the rules that surround that data or allowing other users to see the data. This is very important for all companies, but in particular for healthcare and finance companies, where sensitive data is shared. Customers have been asking for this, and we are delighted to introduce it.
  • #15 In summary, operationalizing and embedding analytics is about integrating actionable insights into the systems and business processes used to make decisions. Streaming analytics is about applying statistical models, algorithms, or other analysis practices to data arriving continuously. By running predictive models on these flows, organizations can monitor events and processes and become more situationally aware, predictive and prescriptive. ------- more -------- This enables you to spot trends and patterns, do correlations, and respond to anomalous events. Organizations can also filter for relevance and enrich the quality of data flowing in real time with information from their other sources.
  • #16 And if we look at what Talend 6.1 delivers, it is all the tools you need to operationalize analytics. .. Read list
  • #17 Here are a few examples of what Talend customers are doing today. Predictive maintenance on wind turbines by GE (and on airplanes by Air France). Product recommendation and dynamic pricing for retailers (Otto). Precision farming: Springg, an agricultural company in the Netherlands, uses sensors to measure soil and other information, like weather, to recommend the right amount of fertilizer to optimize crop yields and reduce costs. Personalized patient care by another Talend customer, which collects Fitbit information from patients that is then exchanged with their physician. The doctor can then adjust the patient's medication based on activity levels and other physical, biological, and health tests. And finally, Talend customer m2ocity, a French telecom operator, is making cities smarter. Talend helps m2ocity seamlessly collect and process up to four million pieces of data per day from its network of millions of smart meters on water, gas meters and other sensors. They are able to provide this information to their clients for behavioral analysis and value-added services. -----------more --------------- Optical character recognition: categorize images of handwritten characters by the letters represented. Face detection: find faces in images (or indicate if a face is present). Spam filtering: identify email messages as spam or non-spam. Topic spotting: categorize news articles (say) as to whether they are about politics, sports, entertainment, etc. Spoken language understanding: within the context of a limited domain, determine the meaning of something uttered by a speaker to the extent that it can be classified into one of a fixed set of categories. Medical diagnosis: diagnose a patient as a sufferer or non-sufferer of some disease. Customer segmentation: predict, for instance, which customers will respond to a particular promotion. Fraud detection: identify credit card transactions (for instance) which may be fraudulent in nature. Weather prediction: predict, for instance, whether or not it will rain tomorrow.
  • #18 Now this is the demonstration part of the webinar. I would like to turn it over to Mark Balkenende.
  • #19 Mark
  • #20 IN
  • #23 Thanks xxxx. In summary, we are very excited to introduce Talend 6.1, which continues Talend’s innovation in big data integration. Talend 6.1 delivers new machine learning algorithms, so you can operationalize analytics and turn data science into production. Talend 6.1 enables you to graphically build analytics projects that include customer segmentation, forecasting, classification, regression analysis and more. The benefit is threefold: You can take on significantly more analytics projects, so your business gets more insight. You can deliver analytics projects and updates faster, so instead of updating models once a month and using out-of-date data, you are using the latest model and getting better insight and results. And finally, with Talend’s rich data integration and data quality functions, you can analyze and cleanse all your data at the speed of Spark. Talend 6.1 provides new data masking capabilities on Spark, so you can make data private across the data lake and connected systems. It expands your big data connectivity options with support for the latest big data technologies and new partnerships with MarkLogic and ServiceNow. And finally, Talend 6.1 includes numerous enhancements, such as Git support to facilitate collaboration, and ESB and MDM enhancements that increase the productivity of your team.
  • #24 And there are numerous other Talend 6.1 enhancements that we did not get a chance to talk about; you can learn more about them at www.talend.com
  • #26 Thank you for attending this on-demand webinar. To learn more about Talend Real-time Big Data, please visit http://www.talend.com/products/real-time-big-data. From that URL or the one shown here, you can also get your FREE Real-Time Big Data Sandbox, so you can be operationalizing analytics in the next few hours! Finally, if you would like to learn more about the new features in Talend 6.0, including Spark and Internet of Things, there is an on-demand webinar located here.