Artificial Intelligence: How Enterprises Can Crush It With Apache Spark: Keyn... (Spark Summit)
Artificial intelligence (AI) is not new. It emerged as a computer science discipline in the 1950s and has been a persistent theme in science fiction. What is new is that enterprises now have the prerequisites needed to create pragmatic AI applications: plenty of data, deep learning frameworks, and blazing-fast distributed compute clusters à la Apache Spark. Forrester Vice President and Principal Analyst Mike Gualtieri will enumerate and demystify nine essential AI technology building blocks that enterprises can use to add a modicum of intelligence to existing and new applications.
Krish Swamy + Balaji Gopalakrishnan, Wells Fargo - Building a World Class Dat... (Sri Ambati)
This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/VAW2eDht7JA
Bio: Krish Swamy is an experienced professional with deep skills in applying analytics and big data capabilities to challenging business problems and driving customer insights. Krish's analytic experience includes marketing and pricing, credit risk, digital analytics, and most recently, big data analytics and data transformation. His key experience lies in banking and financial services and the digital customer experience domain, with a background in management consulting. Other key skills include influencing organizational change towards a data- and analytics-driven culture, and building teams of analysts, statisticians, and data scientists.
Bio: Balaji Gopalakrishnan has over 15 years of experience in the machine learning and data science space. Balaji has led cross-functional data science and engineering teams developing cutting-edge machine learning and cognitive computing capabilities for insurance fraud and underwriting, telematics, multi-asset-class risk, scheduling under uncertainty, and more. He is passionate about driving AI adoption in organizations and strongly believes in the power of cross-functional collaboration for this purpose.
Commercial Analytics at Scale in Pharma: From Hackathon to MVP with Azure Dat... (Databricks)
GSK is a science-led global healthcare company with a special purpose: to help people do more, feel better, live longer.
We have three global businesses that discover, develop and manufacture innovative pharmaceutical medicines, vaccines and consumer healthcare products.
In this talk I will share our experience in the pharmaceutical business delivering commercial analytics, going from hackathon to MVP.
We moved from the initial ideas and business discussions, through delivering a hackathon as an accelerator, to building an MVP, using the Azure cloud platform and Databricks to rapidly ingest data and prototype.
I will touch on the challenges, opportunities and learning points of the process we went through to deliver commercial analytics at scale in Pharma.
Using Apache Spark for Intelligent Services: Keynote at Spark Summit East by ... (Spark Summit)
Salesforce is developing Einstein, an artificial intelligence (AI) capability built into the core of the Salesforce Platform. Einstein helps power the world’s smartest CRM to deliver advanced AI capabilities to sales, services, and marketing teams – helping them discover new insights, predict likely outcomes to power smarter decision making, recommend next steps, and automate workflows so users can focus on building meaningful relationships with every customer.
Salesforce is using Apache Spark (batch, streaming, GraphX and ML) to power the Einstein platform and services. In this keynote and demo, Alexis will highlight how Salesforce is building intelligent Services for Einstein using activity data by leveraging Spark and Databricks to scale data science and engineering.
Wizard Driven AI Anomaly Detection with Databricks in Azure (Databricks)
Fraud is prevalent in every industry and growing at an increasing rate as the volume of transactions increases with automation. The National Health Care Anti-Fraud Association estimates $350B in fraudulent spending. Forbes estimates $25B in spending by US banks on anti-money-laundering compliance. At the same time as fraud and anomaly detection use cases are booming, the skills gap of expert data scientists available to perform fraud detection is widening.
The Kavi Global team will present a cloud native, wizard-driven AI anomaly detection solution, enabling Citizen Data Scientists to easily create anomaly detection models to automatically flag Collective, Contextual, and Point anomalies, at the transaction level, as well as collusion between actors. Unsupervised methods (Distribution, Clustering, Association, Sequencing, Historical Occurrence, Custom Rules) and supervised (Random Forest, Neural Network) models are executed in Apache Spark on Databricks.
An innovative aggregation framework converts probabilistic fraud scores and their probabilities into a meaningful and actionable prioritized list of suspicious (statistically outlying) and potentially fraudulent transactions to be investigated from a business point of view. The AI anomaly detection models improve over time using human-in-the-loop feedback to label data for supervised modeling.
Finally, the Kavi team overviews the anomaly lifecycle: from statistical outlier to validated business fraud, through reclaims and business process changes, to long-term prevention strategies that use proactive audits upstream, at the time of estimate, to prevent revenue leakage. Two client success stories will be presented, across the pharmaceutical (Rx) and transportation industries.
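The score-aggregation idea above can be illustrated on toy data: rescale the scores from two unsupervised detectors onto a common range, average them, and rank transactions by the combined score. This is a minimal sketch, not Kavi Global's actual framework; the z-score and Isolation Forest stand in for the distribution-based and ensemble methods the abstract lists, and the data is synthetic.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic transaction amounts: 500 normal payments plus 3 injected anomalies
amounts = np.concatenate([rng.normal(100, 15, 500), [400.0, 650.0, 20.0]])
X = amounts.reshape(-1, 1)

# Detector 1: distribution-based score (absolute z-score)
z = np.abs((amounts - amounts.mean()) / amounts.std())

# Detector 2: Isolation Forest (negated so that higher = more anomalous)
iso = IsolationForest(random_state=0).fit(X)
iso_score = -iso.score_samples(X)

def rescale(s):
    """Map a score vector onto [0, 1] so no single detector dominates."""
    return (s - s.min()) / (s.max() - s.min())

# Aggregate the detectors into one ranking and surface the top suspects
combined = (rescale(z) + rescale(iso_score)) / 2
top = np.argsort(combined)[::-1][:3]
print(top, amounts[top])  # the injected anomalies should rank highest
```

In a production setting each detector would contribute a calibrated probability rather than a raw score, but the principle of rescaling and combining before prioritization is the same.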
Achieving Massive Concurrency & Sub-second Query Latency on Cloud Warehouses ... (Alluxio, Inc.)
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Achieving Massive Concurrency & Sub-second Query Latency on Cloud Warehouses & Data Lakes with Kyligence Cloud
George Demarest, Head of Marketing, Kyligence
About Alluxio: alluxio.io
Engage with the open source community on Slack: alluxio.io/slack
Helping data scientists escape the seduction of the sandbox - Krish Swamy, We... (Sri Ambati)
This talk was given at H2O World 2018 NYC and can be viewed here: https://youtu.be/xc3j20Om3UM
Description:
Data science is indeed one of the sexy jobs of the 21st century. But it is also a lot of hard work. And the hard work is seldom about the math or the algorithms. It is about building relevant machine learning products for the real world. We will go over some of the must-haves as you take your machine learning model out of the sandbox and make it work in the big, bad world outside.
Speaker's Bio:
Krish Swamy is an experienced professional with deep skills in applying analytics and big data capabilities to challenging business problems and driving customer insights. Krish's analytic experience includes marketing and pricing, credit risk, digital analytics, and most recently, big data analytics and data transformation. His key experience lies in banking and financial services and the digital customer experience domain, with a background in management consulting. Other key skills include influencing organizational change towards a data- and analytics-driven culture, and building teams of analysts, statisticians, and data scientists.
Tom Aliff, Equifax - Configurable Modeling for Maximizing Business Value - H2... (Sri Ambati)
This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/LUwMtXM2q88
In the current world of data science there are many available data sources, big data platforms, and advanced machine learning and AI technologies. It has become easier and easier to derive predictive value in an efficient and streamlined way, and just as easy to lose sight of objectives, especially in the business world. This session will focus on keeping the business context and objective as the navigator for these powerful tools we have at our disposal. Through this discussion, I will review a path towards using tools like explainable AI and Driverless AI to your advantage, versus letting the tools set the direction.
Bio: At Equifax, Tom leads the Data and Analytics consulting practice. Previously, Tom was the US Consumer and Commercial Data Sciences Leader. Tom joined Equifax in July of 2009. He brings several years of analytical experience in leading teams on statistical modeling engagements, analysis and consultation across several verticals including telecommunications, lending, mortgage, automotive, and marketing. Prior to Equifax, Tom was a data science manager at Experian and a Risk Modeler/Manager at American General Finance (now OneMain Financial). Tom holds a Master of Science in Applied Statistics from Purdue University, and a Bachelor of Science degree in Mathematics with a concentration in Statistics, also from Purdue University.
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ... (Sri Ambati)
This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/otq2nQUSV3s
We will talk about the AI transformation journey at Vision Banco - Paraguay, from the early initiatives to future use cases, and how we adopted open source H2O.ai and Driverless AI in our organization.
Bio:
Ruben Diaz
My name is Ruben Diaz, from Asunción, Paraguay. I am married and a father of 3 children. I work as a Data Scientist at Vision Banco.
Luis Armenta:
Luis holds a BSc in Electrical Engineering from the National University of Mexico and an MSc in Electrical Engineering/Computer Science from the University of Waterloo in Canada. He is also currently completing an Executive MBA at the McCombs School of Business at the University of Texas in Austin. Luis has around 14 years of experience, having started his career as a Research Scientist at Intel Labs before being promoted to 2nd-line Engineering Manager, leading the high-speed interconnect hardware design of Intel’s server portfolio. Luis has also held roles as Product Manager of EM simulators at Ansys, Inc. and as a Systems Engineer of 4K and 8K UHDTVs at Macom.
Generalized B2B Machine Learning by Andrew Waage (Data Con LA)
Abstract: In this talk, we propose a generalized machine learning framework for e-commerce businesses. The framework is responsible for over 30 different user-level predictions, including lifetime value, recommendations, churn prediction, engagement, and lead scoring. These predictions provide a vital layer of intelligence for a digital marketer. Kinesis is used to capture browsing information from over 120M users across 100 companies (both in-app and web). A data processing and feature engineering layer is built on Apache Spark. These features provide inputs to predictive models for business applications. Separate models for churn, lifetime value, product recommendation, and search are written on Spark. These models can be plugged into any marketing campaign for any integrated e-commerce company, leading to a generalized system. We finally present a monitoring system for machine learning called RS Sauron. This system provides more than 200 objective metrics measuring the health of predictive models and depicts KPIs for model accuracy in a continual setting.
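The user-level predictions described above boil down to fitting a model on engineered behavioral features and exposing the resulting probability as a score. The talk builds this on Apache Spark; the following is a minimal single-machine sketch with scikit-learn, using hypothetical feature names and made-up data purely for illustration.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical user-level features of the kind such a framework derives
df = pd.DataFrame({
    "sessions_30d":   [12, 1, 25, 0, 7, 3, 18, 2, 30, 1],
    "days_since_buy": [5, 90, 2, 120, 20, 60, 8, 75, 1, 100],
    "avg_cart_value": [45.0, 10.0, 80.0, 0.0, 30.0, 12.0, 60.0, 8.0, 95.0, 5.0],
    "churned":        [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

X, y = df.drop(columns="churned"), df["churned"]
model = LogisticRegression(max_iter=1000).fit(X, y)

# The churn probability doubles as a score a marketing campaign can consume
df["churn_risk"] = model.predict_proba(X)[:, 1]
print(df[["churned", "churn_risk"]])
```

On Spark the same shape holds: the feature engineering layer produces the user-feature table, and `spark.ml` estimators replace the scikit-learn model.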
It is rightly said that data is money in today's world. Along with the transition to an app-based world comes the exponential growth of data.
Orange, Weka, Rattle GUI, Apache Mahout, SCaViS, RapidMiner, R, ML-Flex, Databionic ESOM Tools, Natural Language Toolkit, SenticNet API, ELKI, UIMA, KNIME, Chemicalize.org, Vowpal Wabbit, GNU Octave, CMSR Data Miner, Mlpy, MALLET, Shogun, Scikit-learn, LIBSVM, LIBLINEAR, Lattice Miner, Dlib, Jubatus, KEEL and more
Let's analyze how the world reacts to road traffic by sentiment analysis (Sajeetharan)
In this session you will build a sentiment analysis solution step by step, using the Azure platform. We will talk about sentiment analysis and how you can introduce it into your application. We will run a live demo, extract data from live Twitter feeds, and work together on processing the data and performing sentiment analysis on it. We encourage the audience to come with mobile phones to tweet, but if not, not to worry, as we can use historical data.
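The session uses Azure's cloud services for the actual scoring; to show the shape of the problem, here is a minimal lexicon-based stand-in for a sentiment scorer applied to traffic tweets. The word lists and example tweets are invented for illustration and are no substitute for a trained model or a cloud API.

```python
# Tiny, hand-picked lexicons standing in for a real sentiment model
POSITIVE = {"smooth", "clear", "fast", "great", "good"}
NEGATIVE = {"jam", "blocked", "slow", "accident", "terrible"}

def sentiment(tweet: str) -> str:
    """Classify a tweet by counting positive vs negative lexicon hits."""
    words = {w.strip(".,!?").lower() for w in tweet.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

tweets = [
    "Traffic is smooth and fast on the highway today!",
    "Huge jam near the bridge, totally blocked.",
    "On my way to work.",
]
for t in tweets:
    print(sentiment(t), "-", t)
```

A cloud sentiment API returns a confidence score per document instead of a raw lexicon count, but the pipeline around it (collect tweets, normalize text, score, aggregate) is the same.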
The DataRobot automated machine learning platform is an advanced automated machine learning solution that combines best-in-class automation with ease of use, built on the knowledge, experience, and best practices of top data scientists worldwide. With DataRobot, all users, from business stakeholders to analysts and data scientists, regardless of technical skill level, can build, deploy, and manage highly accurate predictive models far faster than with traditional modeling techniques.
Presentation held during Belgrade AI's opening event in March 2019. It covers terminology and workflows for a typical ML or data science project, key points for selecting an AI project, and a small framework to help you get everyone on the same page about the top-level details of your ML project.
This talk was given at H2O World 2018 NYC and can be viewed here: https://youtu.be/oxLZZMR1lVY
Description:
Driverless AI is H2O.ai's latest flagship product for automatic machine learning. It fully automates some of the most challenging and productive tasks in applied data science such as feature engineering, model tuning, model ensembling and model deployment. Driverless AI turns Kaggle-winning grandmaster recipes into production-ready code, and is specifically designed to avoid common mistakes such as under- or overfitting, data leakage or improper model validation, some of the hardest challenges in data science. Avoiding these pitfalls alone can save weeks or more for each model, and is necessary to achieve high modeling accuracy, especially for time-series problems.
With Driverless AI, data scientists of all proficiency levels can train and deploy modeling pipelines with just a few clicks from the GUI. Advanced users can use the client API from Python. Driverless AI builds hundreds or thousands of models under the hood to select the best feature engineering and modeling pipeline for every specific problem such as churn prediction, fraud detection, real-estate pricing, store sales prediction, marketing ad campaigns and many more.
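The "hundreds or thousands of models under the hood" idea reduces to searching over candidate pipelines and keeping the one with the best validated score. The following is a deliberately tiny stand-in using scikit-learn on a bundled dataset; it is an illustrative sketch of the search pattern, not Driverless AI's actual algorithm or API.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Score a handful of candidate models by cross-validation and keep the best.
# An AutoML system does this over feature transforms and hyperparameters too.
candidates = {
    "logreg": LogisticRegression(max_iter=5000),
    "rf_small": RandomForestClassifier(n_estimators=50, random_state=0),
    "rf_large": RandomForestClassifier(n_estimators=200, random_state=0),
}
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Using cross-validation as the selection criterion, rather than training accuracy, is what guards against the overfitting and leakage pitfalls the paragraph above mentions.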
To speed up training, Driverless AI uses highly optimized C++/CUDA algorithms to take full advantage of the latest compute hardware. For example, Driverless AI runs orders of magnitude faster on the latest Nvidia GPU supercomputers on Intel and IBM platforms, both in the cloud and on premises. Driverless AI is fully supported on all major cloud providers.
There are two more product innovations in Driverless AI: statistically rigorous automatic data visualization and machine learning interpretability with reason codes and explanations in plain English. Both help data scientists and analysts to quickly validate the data and the models.
In this talk, we explain how Driverless AI works and show how easy it is to reach top 5% rankings for several highly competitive Kaggle competitions.
Speaker's Bio:
Arno Candel is the Chief Technology Officer at H2O.ai. He is the main committer of H2O-3 and Driverless AI and has been designing and implementing high-performance machine-learning algorithms since 2012. Previously, he spent a decade in supercomputing at ETH and SLAC and collaborated with CERN on next-generation particle accelerators. Arno holds a PhD and a Master's degree, summa cum laude, in Physics from ETH Zurich, Switzerland. He was named a “2014 Big Data All-Star” by Fortune Magazine and featured by ETH GLOBE in 2015. Follow him on Twitter: @ArnoCandel.
Shift AI 2020: Business benefits of privacy-preserving synthetic data | Sebas... (Shift Conference)
Shift AI was a success, connecting hundreds of professionals that were eager to propel the progress of AI and discuss the newest technologies in data mining, machine learning and neural networks. More at https://ai.shiftconf.co/.
Talk description:
Privacy defines a state in which one is free from public attention and not observed or disturbed by others. Taken in the context of data, privacy is therefore a state in which an individual’s data is used only with their specific consent, and where any person or organization party to that individual’s data guarantees to prevent unauthorized disclosure or misuse of that information.
Therefore, in order to protect individuals' privacy, strict regulations have already been introduced in many regions and countries worldwide, such as the CCPA in California and the GDPR in the EU, and we can expect many more to come. This puts businesses in a position in which they need to find a solution in order to leverage data while preserving privacy. We will address this topic and answer how businesses can benefit from synthetic data and unlock the value of data.
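The core principle behind synthetic data can be shown in a few lines: fit a statistical model to the sensitive records, then release fresh samples from the model instead of the records themselves. This naive Gaussian sketch is only an illustration of the principle; production synthesizers model far richer joint structure and add formal privacy guarantees, and the data here is entirely made up.

```python
import numpy as np

rng = np.random.default_rng(42)
# "Real" sensitive data: age and income for 1000 hypothetical individuals
real = np.column_stack([
    rng.normal(40, 10, 1000),           # age
    rng.normal(55_000, 12_000, 1000),   # income
])

# Naive synthesizer: fit mean and covariance, then sample fresh records.
# No original row ever leaves the model; only samples do.
mean, cov = real.mean(axis=0), np.cov(real, rowvar=False)
synthetic = rng.multivariate_normal(mean, cov, size=1000)

# The synthetic set preserves aggregate statistics of the real data
print(np.round(synthetic.mean(axis=0), 1))
```

Aggregate analyses run on the synthetic set give similar answers to the real set, which is exactly the business value the talk describes: the data stays usable while individual records stay private.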
Achieving Massive Concurrency & Sub-second Query Latency on Cloud Warehouses ...Alluxio, Inc.
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Achieving Massive Concurrency & Sub-second Query Latency on Cloud Warehouses & Data Lakes with Kyligence Cloud
George Demarest, Head of Marketing, Kyligence
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
Using Apache Spark for Intelligent Services by Alexis RoosSpark Summit
Salesforce is developing Einstein which is an artificial intelligence (AI) capability built into the core of the Salesforce Platform. Einstein helps power the world’s smartest CRM to deliver advanced AI capabilities to sales, services, and marketing teams – helping them discover new insights, predict likely outcomes to power smarter decision making, recommend next steps, and automate workflows so users can focus on building meaningful relationships with every customer.
Salesforce is using Apache Spark (batch, streaming, GraphX and ML) to power the Einstein platform and services. In this keynote and demo, Alexis will highlight how Salesforce is building intelligent Services for Einstein using activity data by leveraging Spark and Databricks to scale data science and engineering.
Helping data scientists escape the seduction of the sandbox - Krish Swamy, We...Sri Ambati
This talk was given at H2O World 2018 NYC and can be viewed here: https://youtu.be/xc3j20Om3UM
Description:
Data science is indeed one of the sexy jobs of the 21st century. But it is also a lot of hard work. And the hard work is seldom about the math or the algorithms. It is about building relevant machine learning products for the real world. We will go over some of the must-haves as you take your machine learning model out of the sandbox and make it work in the big, bad world outside.
Speaker's Bio:
Krish Swamy is an experienced professional with deep skills in applying analytics and BigData capabilities to challenging business problems and driving customer insights. Krish's analytic experience includes marketing and pricing, credit risk, digital analytics and most recently, big data analytics and data transformation. His key experiences lie in banking and financial services, the digital customer experience domain, with a background in management consulting. Other key skills include influencing organizational change towards a data and analytics driven culture, and building teams of analysts, statisticians and data scientists.
Tom Aliff, Equifax - Configurable Modeling for Maximizing Business Value - H2...Sri Ambati
This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/LUwMtXM2q88
In the current world of data science there many available data sources, big data platforms, and advanced Machine Learning and AI based technologies available. It has become easier and easier to derive predictive value in an efficient and streamlined way and lose sight of objectives especially in the business world. This session will focus on not losing the business context and objective as the navigator for these powerful tools we have at our disposal. Through this discussion, I will review a path towards how to use the tools like explainable and driverless AI to your advantage versus letting the tools set the direction.
Bio: At Equifax, Tom leads the Data and Analytics consulting practice. Previously, Tom was the US Consumer and Commercial Data Sciences Leader. Tom joined Equifax in July of 2009. He brings several years of analytical experience in leading teams on statistical modeling engagements, analysis and consultation across several verticals including telecommunications, lending, mortgage, automotive, and marketing. Prior to Equifax, Tom was a data science manager at Experian and a Risk Modeler/Manager at American General Finance (now OneMain Financial). Tom holds a Master of Science in Applied Statistics from Purdue University, and a Bachelor of Science degree in Mathematics with a concentration in Statistics, also from Purdue University.
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Sri Ambati
This session was recorded in San Francisco on February 5th, 2019 and can be viewed here: https://youtu.be/otq2nQUSV3s
We will talk about the AI transformation journey at Vision Banco - Paraguay, from the early initiatives to futures use cases, and how we adopted open source H2O.ai and Driverless AI in our organization.
Bio:
Ruben Diaz
My name is Ruben Diaz, from Asunción, Paraguay. I am married and father of 3 children. I work as Data Scientist at Vision Banco
Luis Armenta:
Luis holds a BSc in Electrical Engineering from the National University of Mexico and a MSc in Electrical Engineering/Computer Science from the University of Waterloo in Canada. He is also currently completing an Executive MBA at McCombs School of Business at the University of Texas in Austin. Luis has over ~14 years of experience, having started his career as a Research Scientist at Intel Labs before being promoted to 2nd Line Engineering Manager, leading the high-speed interconnect hardware design of Intel’s server portfolio. Luis also has held roles as Product Manager of EM simulators at Ansys, Inc. and as a Systems Engineer of 4K and 8K UHDTVs at Macom.
Generalized B2B Machine Learning by Andrew WaageData Con LA
Abstract:- In this talk, we propose a generalized machine learning framework for e-commerce businesses. The framework is responsible for over 30 different user-level predictions including lifetime value, recommendations, churn predictions, engagement and lead scoring. These predictions provide a vital layer of intelligence for a digital marketer. Kinesis is used to capture browsing information from over 120M users across 100 companies (both in-app and web). A data processing and feature engineering layer is build on Apache Spark. These features provide inputs to predictive models for business applications. Different models each for Churn, Lifetime value, Product recommendation and search are written on Spark. These models can be plugged into any marketing campaign for any integrated e-commerce company leading to a generalized system. We finally present a monitoring system for machine learning called RS Sauron. This system provides more than 200 objective metrics measuring the health of predictive models, and depicts KPIs for model accuracy in a continual setting.
It is rightfully said that data is money in today's world. Along with the transition to an app-based world comes the exponential growth of data.
Orange, Weka,Rattle GUI, Apache Mahout, SCaViS, RapidMiner, R, ML-Flex, Databionic ESOM Tools, Natural Language Toolkit, SenticNet API , ELKI , UIMA, KNIME, Chemicalize.org , Vowpal Wabbit, GNU Octave, CMSR Data Miner, Mlpy, MALLET, Shogun, Scikit-learn, LIBSVM, LIBLINEAR, Lattice Miner, Dlib, Jubatus, KEEL and more
Let's analyze how world reacts to road traffic by sentiment analysis finalSajeetharan
In this session you will build a sentiment analysis solution step by step, using Azure Platform. We will talk about sentiment analysis and how you can get this introduced in your application. We will run live demo and extract data from live twitter feeds and work together in processing the data and performing sentiment analysis on the data. We encourage audience to come with mobile phones to tweet :P but if not, not to worry as we can use the historical data.
DataRobot 머신러닝 자동화 플랫폼은 전 세계 Top Data Scientist 들의 지식, 경험 및 모범 사례를 바탕으로 최고 수준의 자동화와 사용 편리성을 확보한 가장 진보된 머신러닝 자동화 솔루션 입니다. DataRobot을 통해 비즈니스 관계자, 분석가 및 데이터 과학자 등 기술 수준과 관계 없이 모든 사용자가 기존 모델링 기법에 비해 아주 빠르게, 매우 정확한 예측 모델을 수립하고 구축, 관리할 수 있습니다.
Presentation held during Belgrad AI's opening event in March 2019. Terminology and workflows for a regular ML or data science project, top points from selecting an AI project and a little framework to help you get everything on the same page for the top levels details of your ML project.
This talk was given at H2O World 2018 NYC and can be viewed here: https://youtu.be/oxLZZMR1lVY
Description:
Driverless AI is H2O.ai's latest flagship product for automatic machine learning. It fully automates some of the most challenging and productive tasks in applied data science such as feature engineering, model tuning, model ensembling and model deployment. Driverless AI turns Kaggle-winning grandmaster recipes into production-ready code, and is specifically designed to avoid common mistakes such as under- or overfitting, data leakage or improper model validation, some of the hardest challenges in data science. Avoiding these pitfalls alone can save weeks or more for each model, and is necessary to achieve high modeling accuracy, especially for time-series problems.
With Driverless AI, data scientists of all proficiency levels can train and deploy modeling pipelines with just a few clicks from the GUI. Advanced users can use the client API from Python. Driverless AI builds hundreds or thousands of models under the hood to select the best feature engineering and modeling pipeline for every specific problem such as churn prediction, fraud detection, real-estate pricing, store sales prediction, marketing ad campaigns and many more.
To speed up training, Driverless AI uses highly optimized C++/CUDA algorithms to take full advantage of the latest compute hardware. For example, Driverless AI runs orders of magnitude faster on the latest Nvidia GPU supercomputers on Intel and IBM platforms, both in the cloud and on premises. Driverless AI is fully supported on all major cloud providers.
There are two more product innovations in Driverless AI: statistically rigorous automatic data visualization and machine learning interpretability with reason codes and explanations in plain English. Both help data scientists and analysts to quickly validate the data and the models.
In this talk, we explain how Driverless AI works and show how easy it is to reach top-5% rankings in several highly competitive Kaggle competitions.
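One of the validation pitfalls the talk mentions for time-series problems is look-ahead leakage. A minimal sketch of the standard remedy, a chronological train/validation split, is shown below; this illustrates the general principle only, not Driverless AI's internal validation logic.

```python
# Chronological train/validation split for time-series data.
# A random shuffle here would leak future information into training.
def time_split(records, valid_fraction=0.2):
    """records: list of (timestamp, features, target) tuples."""
    ordered = sorted(records, key=lambda r: r[0])
    cut = int(len(ordered) * (1 - valid_fraction))
    return ordered[:cut], ordered[cut:]

data = [(t, {"x": t}, t % 3) for t in range(10)]
train, valid = time_split(data)
print(len(train), len(valid))                               # 8 2
print(max(r[0] for r in train) < min(r[0] for r in valid))  # True
```

Every validation timestamp strictly follows every training timestamp, which is what makes the evaluation honest for forecasting tasks.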
Speaker's Bio:
Arno Candel is the Chief Technology Officer at H2O.ai. He is the main committer of H2O-3 and Driverless AI and has been designing and implementing high-performance machine-learning algorithms since 2012. Previously, he spent a decade in supercomputing at ETH and SLAC and collaborated with CERN on next-generation particle accelerators. Arno holds a PhD and Masters summa cum laude in Physics from ETH Zurich, Switzerland. He was named “2014 Big Data All-Star” by Fortune Magazine and featured by ETH GLOBE in 2015. Follow him on Twitter: @ArnoCandel.
Shift AI 2020: Business benefits of privacy-preserving synthetic data | Sebas...Shift Conference
Shift AI was a success, connecting hundreds of professionals that were eager to propel the progress of AI and discuss the newest technologies in data mining, machine learning and neural networks. More at https://ai.shiftconf.co/.
Talk description:
Privacy defines a state in which one is free from public attention and not observed or disturbed by others. Taken in the context of data, privacy is therefore a state in which an individual’s data is used only with their specific consent, and where any person or organization party to that individual’s data guarantee to prevent unauthorized disclosures or misuse of that information.
Therefore, in order to protect the individual's privacy, strict regulations have already been introduced in many regions and countries worldwide, such as CCPA in California or GDPR in the EU and we can expect many more to come. This puts businesses in a position in which they need to find a solution in order to leverage data while preserving privacy. We will address this topic and answer how businesses can benefit from synthetic data and unlock the value of data.
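The core idea of synthetic data can be sketched very simply: generate records that look statistically like the real ones without copying any real individual's row. The toy below samples each column independently from its empirical marginal; note that this alone preserves only per-column statistics and carries no formal privacy guarantee, whereas production synthetic-data generators model joint distributions and layer privacy mechanisms on top.

```python
import random

def synthesize(rows, n, seed=0):
    """Draw each column independently from its empirical marginal.
    Toy sketch only: real generators model joint distributions and
    add formal privacy protections; this does neither."""
    rng = random.Random(seed)
    columns = list(zip(*rows))  # column-wise view of the data
    return [tuple(rng.choice(col) for col in columns) for _ in range(n)]

real = [(34, "DE"), (29, "FR"), (41, "DE"), (52, "AT")]
fake = synthesize(real, 3)
print(fake)  # three plausible-looking but non-real records
```

The synthetic rows reuse observed values but break the link between a real person and their full record, which is the intuition behind the business value discussed in the talk.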
How will AI and analytics change life in the next 25 years? In this episode, we look ahead 25 years and share predictions about the technological innovations that will be prevalent then, based on projecting today's AI and analytics forward.
Functionalities in AI Applications and Use Cases (OECD)AnandSRao1962
This presentation was given at the OECD Network of AI Specialists (ONE) held in Paris on February 26 and 27. It covers the methodology for assessing AI use cases by technology, value chain, use, business impact, business value, and effort required.
Keynote presentation from ECBS conference. The talk is about how to use machine learning and AI in improving software engineering. Experiences from our project in Software Center (www.software-center.se).
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...DATAVERSITY
Many data scientists are well grounded in creating accomplishments in the enterprise, but many come from outside: from academia, from PhD programs, and from research. They have the necessary technical skills, but those don't count until their product gets to production and into use. The speaker recently helped a struggling data scientist understand his organization and how to create success in it. That experience turned into this presentation, because many new data scientists struggle with the complexities of an enterprise.
Why Everything You Know About bigdata Is A LieSunil Ranka
As a big data technologist, you can bet that you have heard it all: every crazy claim, myth, and outright lie about what big data is and what it isn't that you can imagine, and probably a few that you can't. If your company has a big data initiative or is considering one, you should be aware of these false statements and the reasons why they are wrong.
How an AI-backed recommendation system can help increase revenue for your onl...Skyl.ai
About the webinar
Picture this: a customer logs onto your e-commerce platform to purchase an item. As soon as they type the product details into the search bar, they are bombarded with a long catalog of items they must painfully sort through. There is a high chance they leave without completing a purchase, unsure of what to pick.
Product recommendation systems must become far better: platforms need to understand the shopper and offer best-fitting, tailored products. This is especially challenging for retailers with vast catalogs or with only slight variations between products. An AI/ML recommendation model generated using Skyl.ai can help e-commerce platforms provide a superior digital shopping experience to their customers.
This webinar will showcase a live demo of how to build such a robust recommendation model in hours.
What you will learn
- How e-commerce companies drive sales through AI-powered product recommendation engines
- Challenges faced in ML automation and how to overcome those using a unified ML platform
- Live Demo: how to create a product recommendation system using the Skyl.ai end-to-end ML automation platform
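The recommendation engines described above are often built on collaborative filtering. The sketch below is a toy item-based variant using cosine similarity over co-ratings; it is an assumption-laden illustration of the general technique, not the model Skyl.ai builds, and the ratings data is invented.

```python
import math

# Toy item-based collaborative filtering: rank items by similarity to
# one the shopper already engaged with (illustrative data and names).
ratings = {                      # user -> {item: rating}
    "ana":  {"shoes": 5, "bag": 3, "hat": 1},
    "ben":  {"shoes": 4, "bag": 4},
    "cara": {"shoes": 1, "bag": 5, "hat": 4},
}

def cosine(item_a, item_b):
    users = [u for u in ratings if item_a in ratings[u] and item_b in ratings[u]]
    if not users:
        return 0.0
    dot = sum(ratings[u][item_a] * ratings[u][item_b] for u in users)
    na = math.sqrt(sum(ratings[u][item_a] ** 2 for u in users))
    nb = math.sqrt(sum(ratings[u][item_b] ** 2 for u in users))
    return dot / (na * nb)

# Items most similar to "shoes", best match first:
others = ["bag", "hat"]
print(sorted(others, key=lambda i: cosine("shoes", i), reverse=True))
```

At production scale the same idea runs over sparse matrices and learned embeddings, but the ranking-by-similarity structure is the same.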
The Sky’s the Limit – The Rise of Machine Learning | Inside Analysis
The Briefing Room with Analyst Dr. Robin Bloor and SkyTree
Live Webcast on June 24, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=1da2b498fc39b8b331a5bbb8dea2660f
With data growing more complex these days, many organizations are looking for ways to make sense of new information sources. The goal? Sprint ahead of the competition by exploiting fast-moving opportunities. The challenge? The data volumes, variety and velocity call for significantly greater horsepower than ever before. That’s where machine learning comes into play, and it’s already fundamentally changing the Big Data Analytics landscape.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he explains how advanced analytics technology can transform the enterprise. He’ll be briefed by Martin Hack, CEO of Skytree, who will tout his company’s machine learning solution for big data. Hack will discuss the critical challenges facing today’s data professionals, and present use cases to show how machine learning can help organizations leverage big data as a capital asset. He’ll specifically address the power of predictive analytics, which can help companies seize opportunities and prevent serious problems.
Visit InsideAnalysis.com for more information.
Power to the People: A Stack to Empower Every User to Make Data-Driven DecisionsLooker
Infectious Media runs on data. But as an ad-tech company that records hundreds of thousands of web events per second, they have to deal with data at a scale most companies never see. You cannot make data-driven decisions when people must hand-write SQL only for queries to take 10-20 minutes to return. Infectious Media switched to Google BigQuery and Looker, and now every member of every team can get the data they need in seconds.
Infectious Media shares:
- Why they chose their current stack
- Why faster data means happier customers
- Advantages and practical implications of storing and processing that much data
Check out the recording at https://info.looker.com/h/i/308848878-power-to-the-people-a-stack-to-empower-every-user-to-make-data-driven-decisions
Future of Ecommerce: How to Improve the Online Shopping Experience Using Mach...Skyl.ai
About the webinar
It’s no secret that a well-organized product catalog is crucial as consumers look for a richer, more consistent online shopping experience. Often, digitizing the catalog for fast-moving, high-volume products becomes daunting due to insufficient, erroneous, and fragmented data.
This leads us to the question: if e-commerce and fashion companies need to be agile and consumer-friendly, why are so many still using product catalog management methods devised years ago? Manual product classification and data attribution only increase the risk of error and time delays, harming brand reputation. They also lead to lost sales opportunities due to incomplete or inaccurate product records that don’t reflect the actual product.
In this webinar, we will discuss how to efficiently manage machine learning projects without tech headaches by plugging in your data and building your models instantly.
What you will learn
- How E-commerce companies are using AI to drive more sales and seamless customer experience
- Know the secret sauce of automating time-intensive, repetitive steps to quickly build models
- Demo: A deeper understanding of the end-to-end machine learning workflow for a fashion product catalog management using Skyl.ai
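The product-classification step mentioned above can be sketched with a tiny bag-of-words classifier: assign a title to the category whose training titles share the most words with it. This is an invented, illustrative baseline, not the trained model the Skyl.ai demo builds.

```python
# Toy product-category classifier from labeled titles (word overlap).
# Illustrative only; production catalog models use trained classifiers.
labeled = [
    ("red cotton t-shirt", "apparel"),
    ("slim fit denim jeans", "apparel"),
    ("stainless steel kettle", "kitchen"),
    ("non-stick frying pan", "kitchen"),
]

vocab = {}  # category -> set of words seen in its training titles
for title, cat in labeled:
    vocab.setdefault(cat, set()).update(title.split())

def classify(title):
    words = set(title.split())
    return max(vocab, key=lambda cat: len(words & vocab[cat]))

print(classify("blue denim jacket"))   # apparel
print(classify("steel saucepan"))      # kitchen
```

Automating exactly this kind of repetitive labeling, with far better models, is what removes the manual-attribution bottleneck described in the webinar.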
How to analyze text data for AI and ML with Named Entity RecognitionSkyl.ai
About the webinar
The Internet is a rich source of data, mainly textual data. But making use of huge quantities of data is a complex and time-consuming task. NLP can help with this problem through the use of Named Entity Recognition (NER) systems. Named entities are terms that refer to names, organizations, locations, values, etc. NER annotates texts, marking where and what type of named entities occur in them. This step significantly simplifies further use of such data: documents can be easily categorized, sentiment analyzed, automatically generated summaries improved, and more.
Further, in many industries the vocabulary keeps changing and growing with new research, abbreviations, and long, complex constructions, which makes it difficult to get accurate results or to use rule-based methods. Named Entity Recognition and Classification can help to effectively extract, tag, index, and manage this fast-growing knowledge.
Through this webinar, we will understand how NER can be used to extract key entities from large volumes of text data.
What you will learn
- How organizations are leveraging Named Entity Recognition across various industries
- Live demo: identify & classify complex terms with NERC (Named Entity Recognition & Categorization)
- Best practices to automate machine learning models in hours, not months
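As a baseline for what NER output looks like, here is a minimal rule-based sketch combining a gazetteer (a fixed list of known organization names) with a regex for money values. The names and patterns are invented for illustration; real NER systems, including the one demoed in the webinar, use trained sequence models rather than rules.

```python
import re

# Minimal rule-based NER: gazetteer lookup for organizations plus a
# regex for money amounts. Illustrative only; trained models do better.
ORGS = {"Acme Corp", "Skyl.ai"}          # hypothetical gazetteer
MONEY = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")

def extract_entities(text):
    entities = [(org, "ORG") for org in ORGS if org in text]
    entities += [(m.group(), "MONEY") for m in MONEY.finditer(text)]
    return entities

doc = "Acme Corp reported revenue of $1,250,000 last quarter."
print(extract_entities(doc))
```

Rule-based extraction breaks down exactly where the webinar says it does: changing vocabulary and novel constructions that no fixed list or pattern anticipates.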
There are patterns for things such as domain-driven design, enterprise architectures, continuous delivery, microservices, and many others.
But where are the data science and data engineering patterns?
Sometimes, data engineering reminds me of cowboy coding - many workarounds, immature technologies and lack of market best practices.
Similar to 2019 CDM CIO Summit AI Driven Development (20)
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
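The core intuition of trimming uninteresting seed bytes can be shown with a toy: greedily drop any byte whose removal leaves the program's observed behavior unchanged. The "coverage" function below is a stand-in for instrumented execution, and this greedy loop only conveys the idea; DIAR's actual analysis is more sophisticated.

```python
# Toy seed trimming: drop bytes whose removal leaves the observed
# behavior ("coverage") unchanged. Stand-in for DIAR's idea only.
def coverage(seed: bytes) -> frozenset:
    # Hypothetical instrumented target: which "paths" does input hit?
    paths = set()
    if seed.startswith(b"<"):
        paths.add("open-tag")
    if b">" in seed:
        paths.add("close-tag")
    return frozenset(paths)

def trim(seed: bytes) -> bytes:
    baseline = coverage(seed)
    i = 0
    while i < len(seed):
        candidate = seed[:i] + seed[i + 1:]
        if coverage(candidate) == baseline:
            seed = candidate          # byte was uninteresting: drop it
        else:
            i += 1                    # byte matters: keep it
    return seed

print(trim(b"<padding-bytes>"))  # b'<>'
```

The trimmed seed exercises the same behavior as the bloated one, so every subsequent mutation cycle spends its budget on bytes that can actually matter.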
DevOps and Testing slides at DASA ConnectKari Kakkonen
My slides and Rik Marselis's slides from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We closed with a lovely workshop in which participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, combined with traditionally slow and manual security checks, has left gaps in continuous security, an important piece of the software supply chain. Today, organizations feel more susceptible to external and internal cyber threats due to the vast attack surface of their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how they work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI with OpenAI's advanced natural language processing capabilities for test automation.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
20240605 QFM017 Machine Intelligence Reading List May 2024
2019 CDM CIO Summit AI Driven Development
1. AI-Driven Development (AutoML) – Mar 26th 2019
Artificial Intelligence → Machine Learning → Deep Learning → What’s next?
Chandra Gundlapalli (@GundlapalliC)
2. Agenda (help accelerate Think Tank discussion)
#01 Big Picture: AI data strategy
#02 Time-to-Market levers
#03 Lessons Learned recap
AI = Prediction + Automation + Optimization
3. #01 Quick look at the AI market opportunity big picture
Key takeaways:
# McKinsey: AI has $3.5–$5.8 trillion in potential, or 40 percent of all analytics value
# >30% revenue increase for the AI front runners
Proven AI in the market today (ROI-driven use cases):
• Sales: Omnichannel, Recommendation, Prescriptive sales
• Finance: Robo-Advisory, Fraud Detection, Billing Exceptions
• Retail: Call routing, Voice auth, Social listening
• Health: Patient analytics, Imaging insights, Drug effectiveness
Examples: Heliograf, Procurement, Smart Grid, Investor services, Zenbo nanny
4. # Building ML on top of existing data tech strategy
Pipeline: Fetch data → Clean data → Transform data → Train model → Evaluate model → Deploy to prod → Monitor
Machine Learning lifecycle: Build, Train, Deploy
• Batch: What happened?
• Real-time: What is happening now?
• Inferences: What should we do?
Key takeaway: Reality takes 2–3X the expected duration.
Foundation: Data Lake, Hub, Analytics
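The fetch → clean → transform → train → evaluate pipeline above can be sketched as a chain of stage functions. The stage names mirror the slide; the data and the trivial "mean predictor" model are invented purely to show how the stages compose.

```python
# The slide's ML pipeline as composable stages (illustrative names,
# toy data, and a deliberately trivial "model" = mean predictor).
def fetch():
    return [("2019-01", 10.0), ("2019-02", None), ("2019-03", 14.0)]

def clean(rows):
    return [(m, v) for m, v in rows if v is not None]   # drop missing values

def transform(rows):
    return [v for _, v in rows]                          # keep numeric feature

def train(xs):
    return sum(xs) / len(xs)                             # fit: the mean

def evaluate(model, xs):
    return max(abs(x - model) for x in xs)               # worst-case error

data = transform(clean(fetch()))
model = train(data)
print(model, evaluate(model, data))  # 12.0 2.0
```

Keeping each stage a pure function makes the "reality takes 2–3X the expected duration" point tangible: most of that time goes into fetch/clean/transform, not into `train`.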
5. #03 ML DNA – accelerate business ROI
Key takeaways:
# Build a hybrid cloud with scalability & “Bring Your Own” flexibility in each of the layers.
# Address hidden tech debt (ML code is only a small fraction).
Layers:
• Services: Vision, Video, Natural Language, Translate, Conversational BOT, ???
• Frameworks: MXNet, TensorFlow, PyTorch, R, Spark, ???
• Platforms: AWS SageMaker, Google AutoML, Kubeflow, AI-ble, Azure ML, ???
• Infrastructure: GPU/TPU, Containers (Docker/Kubernetes), In-memory, Xilinx FPGAs, Serverless, ???
• Skillset: Citizen data scientists, Top coder, Rent-a-data-scientist, Domain SME, DataOps engineers, ???
6. Lessons Learned
#1 Talent
1. Empower Data Scientists (value mindset)
2. Talent gap & prototyping (Top coder) – status quo
3. Trust & transparency vs. fear & misunderstanding
4. Team RACI – SME / Scientist / Engineers
#2 Collaboration
5. Embrace DataOps clarity of purpose (not DevOps)
6. Bias – business impact (ROI) vs. model accuracy
7. Unified data – discovery & access to unaltered raw data
8. Data governance (quality) automation
#3 Deployment
9. Open data format & data virtualization latency
10. On-premise multitenancy for future hybrid cloud
11. Train and run anywhere (cloud/edge) – AWS Neo
12. Model explainability & PROD-like data copies
7. #02 ML Time-to-Market levers (Speed, Quality, Cost)
#1 Zero setup
• Serverless plug and play
• Open source pre-installed
• Data governance pre-built
#2 Optimized algorithms
• Streaming datasets
• Bring Your Own Algorithm
• Very large datasets
#3 Easier training
• Hyperparameterization
• Bring Your Own Container
• Bring Your Own Script
#4 Operationalization
• One-step deploy
• A/B testing
• Bring Your Own Model
Key takeaways:
# BUY & BUILD – solutions exist in the market today, cutting efforts by >50%
# Narrow the gap between 1 billion workers & 1 million data experts
AI: The theory and development of computer systems that are able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
ML: Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to “learn” (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.
DL: Deep learning is a subset of machine learning in Artificial Intelligence (AI) that has networks capable of learning unsupervised from data that is unstructured or unlabeled. Also known as Deep Neural Learning or Deep Neural Network.
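The "learn from data without being explicitly programmed" part of the ML definition can be illustrated with a minimal sketch: the slope below is estimated from example pairs by least squares, rather than the rule y = 2x ever being written into the program.

```python
# Minimal "learning": estimate a slope from data by least squares
# (one feature, no intercept), rather than hard-coding the rule.
def fit_slope(xs, ys):
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]        # generated by y = 2x, but that rule is never coded
w = fit_slope(xs, ys)
print(w)                 # 2.0
print(w * 5)             # prediction for unseen x=5 -> 10.0
```

Giving the same code different data yields a different model, which is exactly the "progressively improve performance on a specific task with data" behavior in the definition above.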