How large-scale image analytics (near-real-time analysis of satellite images, machine learning) could help (re-)insurers anticipate natural catastrophes and estimate damages more precisely
Machine learning, or predictive analytics, has started entering our daily lives. Businesses and enterprises can use predictive analytics to improve efficiency and user experience, as well as to create new business opportunities. This talk will present WSO2 Machine Learner, our experiences of predicting Super Bowl winners, and a few real-life use cases. Furthermore, the talk will discuss open challenges and problems people are working on.
Leveraging Open Source Automated Data Science Tools - Domino Data Lab
The data science process seeks to transform and empower organizations by finding and exploiting market inefficiencies and potentially hidden opportunities, but this is often an expensive, tedious process. However, many steps can be automated to provide a streamlined experience for data scientists. Eduardo Arino de la Rubia explores the tools being created by the open source community to free data scientists from tedium, enabling them to work on the high-value aspects of insight creation and impact validation.
The promise of the automated statistician is almost as old as statistics itself. From the creation of vast tables, which saved the labor of calculation, to modern tools which automatically mine datasets for correlations, there has been a considerable amount of advancement in this field. Eduardo compares and contrasts a number of open source tools, including TPOT and auto-sklearn for automated model generation and scikit-feature for feature generation and other aspects of the data science workflow, evaluates their results, and discusses their place in the modern data science workflow.
Along the way, Eduardo outlines the pitfalls of automated data science and applications of the “no free lunch” theorem and dives into alternate approaches, such as end-to-end deep learning, which seek to leverage massive-scale computing and architectures to handle automatic generation of features and advanced models.
SPARK USE CASE - Distributed Reinforcement Learning for Electricity Market Bi... - Impetus Technologies
SPARK SUMMIT SESSION -
A majority of the electricity in the U.S. is traded in independent system operator (ISO) based wholesale markets. ISO-based markets typically function in a two-step settlement process with day-ahead (DA) financial settlements followed by physical real-time (spot) market settlements for electricity. In this work, we focus on obtaining equilibrium bidding strategies for electricity generators in DA markets. Electricity prices in DA markets are determined by the ISO, which matches competing supply offers from power generators with demand bids from load serving entities. Since there are multiple generators competing with one another to supply power, this can be modeled as a competitive Markov decision problem, which we solve using a reinforcement learning approach. For power networks of realistic sizes, the state-action space could explode, making the RL procedure computationally intensive. This has motivated us to solve the above problem over Spark. The talk provides the following takeaways:
1. Modeling the day-ahead market as a Markov decision process
2. Code sketches to show the Markov decision process solution over Spark and Mahout over Apache Tez
3. Performance results comparing Mahout over Apache Tez and Spark.
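To make the first takeaway concrete, here is a minimal, hedged sketch of a generator's bidding problem cast as a tiny Markov decision process and solved with tabular Q-learning. The states, actions, toy market dynamics and rewards are invented for illustration; this is not the authors' model, and it runs on a single machine rather than over Spark or Mahout.

```python
# Toy day-ahead bidding MDP solved with tabular Q-learning.
# All states, actions, dynamics and rewards are illustrative assumptions.
import random
from collections import defaultdict

states = ["low_demand", "high_demand"]     # simplified market states
actions = [20.0, 30.0, 40.0]               # candidate bid prices ($/MWh)
COST = 25.0                                # generator's marginal cost

def step(state, bid):
    """Toy market: high demand clears higher bids with higher probability."""
    cap = 45.0 if state == "high_demand" else 28.0
    cleared = bid <= cap and random.random() < (0.9 if state == "high_demand" else 0.4)
    reward = (bid - COST) if cleared else 0.0
    return random.choice(states), reward   # demand evolves randomly here

Q = defaultdict(float)
alpha, gamma, eps = 0.1, 0.95, 0.1
state = random.choice(states)
for _ in range(50_000):
    # epsilon-greedy choice of bid price
    action = random.choice(actions) if random.random() < eps \
        else max(actions, key=lambda a: Q[(state, a)])
    next_state, reward = step(state, action)
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

for s in states:
    print(s, "-> best bid:", max(actions, key=lambda a: Q[(s, a)]))
```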
Agile Data Science is a lean methodology adapted from Agile Software Development. At its core it centers around people, interactions, and building minimum viable products to ship fast and often in order to solicit customer feedback. In this presentation, I describe how this work was done in the past, with examples. Get started today with our help by visiting http://www.alpinenow.com
RAPIDS is a suite of open source software libraries and APIs that gives you the ability to execute end-to-end data science and analytics pipelines entirely on GPUs. In this workshop, we will:
1. Introduce Rapids.ai & GPUs
2. Illustrate why GPUs are critical for machine learning and AI applications
3. Demonstrate common machine learning algorithms such as regression, KNN, SGD, etc. using RAPIDS on the QuSandbox
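To give a feel for the hands-on part, the following is a minimal sketch of GPU-resident machine learning with RAPIDS, assuming a CUDA-capable GPU with cuDF and cuML installed; the toy data and column names are invented and this is not the workshop's actual QuSandbox notebook.

```python
# Minimal RAPIDS sketch: keep the data in GPU memory with cuDF and fit
# cuML estimators on it. Toy data; assumes RAPIDS and a CUDA GPU.
import cudf
from cuml.linear_model import LinearRegression
from cuml.neighbors import KNeighborsRegressor

df = cudf.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0],
                     "y": [2.1, 3.9, 6.2, 8.1, 9.8]})
X, y = df[["x"]], df["y"]

lr = LinearRegression().fit(X, y)                      # linear regression on the GPU
print(lr.predict(cudf.DataFrame({"x": [6.0]})))

knn = KNeighborsRegressor(n_neighbors=2).fit(X, y)     # kNN regression on the GPU
print(knn.predict(cudf.DataFrame({"x": [6.0]})))
```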
A final project presentation based on the GDELT database.
Complete Report : https://samvat.github.io/ivmooc-gdelt-project/The GDELT Project - Final Report.pdf
Leveraging NLP and Deep Learning for Document Recommendations in the Cloud - Databricks
Efficient recommender systems are critical for the success of many industries, such as job recommendation, news recommendation, e-commerce, etc. This talk will illustrate how to build an efficient document recommender system by leveraging Natural Language Processing (NLP) and Deep Neural Networks (DNNs). The end-to-end flow of the document recommender system is built on AWS at scale, using Analytics Zoo for Spark and BigDL. The system first processes text-rich documents into embeddings by incorporating Global Vectors (GloVe), then trains a K-means model using native Spark APIs to cluster users into several groups. The system further trains a recommender model for each group and gives an ensemble prediction for each test record. By adopting the end-to-end Analytics Zoo pipeline, we saw about a 10% improvement in mean reciprocal rank and a 6% improvement in precision, respectively, compared to the search recommendations for a job recommendation study.
Speaker: Guoqiong Song
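For readers curious about the clustering step, here is a minimal, hedged sketch that clusters documents represented by (pre-averaged) embedding vectors with Spark MLlib's KMeans. The tiny vectors are invented, and this is only a stand-in for the GloVe-plus-Analytics-Zoo pipeline the talk describes.

```python
# Minimal sketch: cluster document embeddings with Spark MLlib KMeans.
# The tiny vectors are invented; this is not the Analytics Zoo / BigDL pipeline.
from pyspark.sql import SparkSession
from pyspark.ml.clustering import KMeans
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.appName("doc-clustering-sketch").getOrCreate()

# Pretend these are GloVe word vectors already averaged per document
docs = [(0, Vectors.dense([0.90, 0.10])),
        (1, Vectors.dense([0.85, 0.15])),
        (2, Vectors.dense([0.10, 0.90])),
        (3, Vectors.dense([0.05, 0.95]))]
df = spark.createDataFrame(docs, ["doc_id", "features"])

# Group documents (or users) into k clusters; a recommender could then be
# trained per cluster, as the talk describes
model = KMeans(k=2, seed=42).fit(df)
model.transform(df).select("doc_id", "prediction").show()
spark.stop()
```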
"How Pirelli uses Domino and Plotly for Smart Manufacturing" by Alberto Arrig...Data Science Milan
"How Pirelli uses Domino and Plotly for Smart Manufacturing" by Alberto Arrigoni, Senior Data Scientist, Pirelli (pirelli.com)
Abstract:
Pirelli, a global performance tire manufacturer, uses data science in its 20 factories to improve quality and efficiency, and reduce energy consumption. For this “Smart Manufacturing” initiative, Pirelli’s data science team has developed predictive models and analytics tools to monitor processes, machines and materials on the factory floors. In this talk we will show some of the solutions we deploy, demonstrate how we used Domino’s data science platform and Plot.ly to build these solutions, and discuss the next steps in this journey towards predictive maintenance.
Bio:
Alberto Arrigoni is a data scientist at Pirelli, where he processes sensor and telemetry data for IoT, Smart Factory and connected-vehicle applications.
He works closely with all major business units such as R&D, industrial engineering and BI to develop tailored machine learning algorithms and production systems.
He holds a PhD in biostatistics from the University of Milan Bicocca and, prior to joining Pirelli, was a staff data scientist at the National Institute of Molecular Genetics (Milan), as well as a Fulbright student at Santa Clara University and a visiting PhD student at Pacific Biosciences (Menlo Park, CA).
DN18 | Applied Machine Learning in Cybersecurity: Detect malicious DGA Domain... - Dataconomy Media
Abstract of the Presentation:
Malware like the GameOver Zeus and CryptoLocker botnets are a massive threat for organizations. They use domain generation algorithms (DGAs) to create URLs that host malicious websites or command-and-control servers. Traditional approaches fail to detect and stop them early. In this talk you will learn, in a live demo, how to use machine learning to detect malicious domains in your environment, and how to implement a full end-to-end data science use case leveraging the Splunk Machine Learning Toolkit.
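As a flavour of how such a detector can work independently of any particular product, below is a minimal, hedged sketch that derives simple character-level features (length, Shannon entropy, digit ratio) from domain names and trains a scikit-learn classifier. The tiny labelled list is invented, and this is not the Splunk Machine Learning Toolkit workflow shown in the talk.

```python
# Toy DGA-style domain classifier using hand-crafted character features.
# The labelled sample is invented; this is not the Splunk MLTK workflow.
import math
from sklearn.linear_model import LogisticRegression

def features(domain):
    name = domain.split(".")[0]
    counts = {c: name.count(c) for c in set(name)}
    entropy = -sum(n / len(name) * math.log2(n / len(name)) for n in counts.values())
    digit_ratio = sum(c.isdigit() for c in name) / len(name)
    return [len(name), entropy, digit_ratio]

domains = ["google.com", "wikipedia.org", "github.com",
           "xjw8s7dkq2lz.net", "qpowiealskdjf.biz", "a9f3k2x7b1m4.info"]
labels = [0, 0, 0, 1, 1, 1]                    # 0 = benign, 1 = DGA-like

clf = LogisticRegression().fit([features(d) for d in domains], labels)
print(clf.predict([features("zxkq93hd7f2a.com"), features("openai.com")]))
```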
About the Author:
Philipp works as Staff Machine Learning Architect at Splunk. His background is in data science, visualization and analytics, with experience in the automotive, transportation and software industries. He enjoys working with Splunk customers and partners across EMEA.
To be successful as a data science team, we need to continuously deliver data-driven insights and data products that generate business value. Identifying the best opportunities and building solutions that actually get used in production requires very close collaboration with business users and subject matter experts. What can we learn from agile software development methodologies, and how can we apply them to data science projects?
Data Science in the Real World: Making a Difference - Srinath Perera
We use the terms "Big Data" and "Data Science" to refer to the use of data processing to make sense of the world around us. Spanning many fields, Big Data brings together technologies like Distributed Systems, Machine Learning, Statistics, and the Internet of Things. It is a multi-billion-dollar industry with use cases like targeted advertising, fraud detection, product recommendations, and market surveys. With new technologies like the Internet of Things (IoT), these use cases are expanding to scenarios like Smart Cities, Smart Health, and Smart Agriculture.
These use cases rely on basic analytics, advanced statistical methods, and predictive technologies like Machine Learning. However, it is not just about crunching the data. Some use cases, like urban planning, can be slow, and there is enough time to process the data. However, with use cases like traffic, patient monitoring, and surveillance, the value of results degrades much faster with time, and results are needed within milliseconds to seconds. Collecting data from many sources, cleaning it up, processing it using computation clusters, and doing all of this fast is a major challenge.
This talk will discuss the motivation behind big data and data science and how they can make a difference. It will then discuss the challenges, systems, and methodologies for implementing and sustaining a data science pipeline.
• GDPR protects personal information from being exploited by business
• product development and testing without realistic data
• impossible to share data with other researchers and developers, hands-on lab courses, hackathons.
Can AI solve this problem by obscuring personal data?
Mitigating User Experience from 'Breaking Bad': The Twitter Approach [Velocit... - Piyush Kumar
Frequent deployments, a large set of in-flight A/B tests, new product launches, etc. directly impact the profile of application metrics as well as system metrics. Specifically, the above can induce sudden breakouts – which manifest themselves as a mean shift or a ramp-up (these are different from an anomaly) – in the time series of a given metric. Further, the profile of the incoming traffic may also experience a breakout due to a variety of reasons such as, but not limited to, the roll-out of a new feature or the roll-out for a new platform; this in turn results in breakouts in application and/or system metrics.
Breakouts can potentially impact performance of the corresponding service and consequently impact the end user experience. To alleviate the impact of breakouts – in other words, preventing user experience from ‘Breaking Bad’ – we developed statistically rigorous techniques to automatically detect breakouts in a timely fashion. The breakouts detected are used to guide capacity planning. In particular, there are two scenarios:
Positive breakout: Depending on the magnitude, deploy additional capacity
Negative breakout: Depending on the magnitude, scale down the current capacity
We shall walk the audience through how the techniques are being used at Twitter using real data.
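For intuition only, here is a minimal, hedged sketch of a naive mean-shift breakout check that compares the means of two adjacent windows of a metric; it does not reproduce the statistically rigorous techniques the talk describes, and the synthetic series is invented.

```python
# Naive mean-shift breakout check on a synthetic metric. Illustrative only;
# not the statistically rigorous detection used in production at Twitter.
import numpy as np

def naive_breakouts(series, window=20, threshold=3.0):
    """Flag indices where the next window's mean shifts by more than
    `threshold` standard deviations of the previous window."""
    series = np.asarray(series, dtype=float)
    hits = []
    for i in range(window, len(series) - window):
        before, after = series[i - window:i], series[i:i + window]
        if abs(after.mean() - before.mean()) > threshold * (before.std() + 1e-9):
            hits.append(i)
    return hits

rng = np.random.default_rng(0)
metric = np.concatenate([rng.normal(100, 2, 200),   # steady baseline
                         rng.normal(130, 2, 200)])  # injected positive breakout
print(naive_breakouts(metric)[:5])                  # indices near the mean shift
```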
Presented at the Bitkom AK Big Data & Advanced Analytics strategy workshop on 30 June 2021. We point to scalability across data schemata as a major current bottleneck on the road towards building a data-driven organization, and illustrate, using the example of Analyst-2 (https://analyst-2.ai/), how Autonomous Analytics may provide a way forward.
Quoc Le at AI Frontiers: Automated Machine Learning - AI Frontiers
Traditional machine learning systems are hand-designed and tuned by machine learning experts. To scale up the impact of machine learning to many real-world applications, we must figure out a way to automate the designing process of these pipelines. In this talk, I will discuss the use of machine learning to automate the process of designing neural architectures and data augmentation strategies (Neural Architecture Search and AutoAugment).
Agile development of data science projects | Part 1 - Anubhav Dhiman
Broadly, data science encompasses quantitative research, advanced analytics, predictive modelling and machine learning.
How reliably and sustainably can a data science team deliver value for organizations?
Data Science Readiness Levels
How to make collaboration easier across the organization?
Monitoring world geopolitics through Big Data by Tomasa Rodrigo and Álvaro Or... - Big Data Spain
Data from the media allows us to enrich our analysis and to incorporate these insights into our models to capture nonlinear behaviour and feedback effects of human interaction, assessing their global impact on society and enabling us to construct fragility indices and early warning systems.
https://www.bigdataspain.org/2017/talk/monitoring-world-geopolitics-through-big-data
Big Data Spain 2017
16th - 17th November Kinépolis Madrid
A business level introduction to Artificial Intelligence - Louis Dorard @ PAP... - PAPIs.io
Artificial Intelligence and Machine Learning are becoming increasingly accessible. Starting from example use cases, I’ll aim at demystifying how they work and how they improve businesses in 3 areas: increasing the number of customers, serving them better, and serving them more efficiently. I’ll show how machines can use data to automatically learn business rules and make predictions, that can then be used to make better decisions. I’ll introduce the main concepts of ML, its possibilities, its limitations, and I’ll give tips on framing the right problems for your company to tackle.
Louis Dorard is the author of Bootstrapping Machine Learning, a co-founder of PAPIs, and an independent consultant. His goal is to help people use new machine learning technologies to make their apps and businesses smarter. He does this by writing, speaking and teaching.
The Evolution of Data Analysis with Hadoop - StampedeCon 2014 - StampedeCon
At StampedeCon 2014, Tom Wheeler (Cloudera) presented, "The Evolution of Data Analysis with Hadoop."
This session will lead the audience through the evolution of data analysis in Hadoop to illustrate its progression from the original low-level, batch-oriented MapReduce approach to today’s higher-level interactive tools that require very little technical knowledge. We’ll discuss Apache Crunch, Hive, Impala and Solr.
While the nature of this talk is somewhat technical, no prior knowledge of Hadoop or any specific programming language is required. Frequent live demonstrations of the tools discussed will emphasize that analyzing data in Hadoop can be as easy as using a relational database or Internet search engine.
Terabyte-scale image similarity search: experience and best practice - Denis Shestakov
Slides for the talk given at IEEE BigData 2013, Santa Clara, USA on 07.10.2013. Full-text paper is available at http://goo.gl/WTJoxm
To cite please refer to http://dx.doi.org/10.1109/BigData.2013.6691637
The growth of the amount of medical image data produced on a daily basis in modern hospitals forces the adaptation of traditional medical image analysis and indexing approaches towards scalable solutions. In this work, MapReduce is used to speed up and make possible three large-scale medical image processing use cases: (i) parameter optimization for lung texture classification using support vector machines (SVM), (ii) content-based medical image indexing, and (iii) three-dimensional directional wavelet analysis for solid texture classification.
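To illustrate the spirit of the first use case, here is a minimal, hedged sketch of an SVM parameter sweep parallelized across local cores with scikit-learn's GridSearchCV on a toy dataset; the original work distributes the sweep with MapReduce over a cluster, which this single-machine snippet does not reproduce.

```python
# Parallel SVM parameter sweep on a toy dataset. The paper distributes such
# sweeps with MapReduce; this is a single-machine, scikit-learn stand-in.
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-4, 1e-3, 1e-2]}

# n_jobs=-1 fans the (C, gamma) combinations out over all local cores, the
# same map-style parallelism MapReduce applies across cluster nodes
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3, n_jobs=-1)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```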
Don’t bid farewell to your insurance agent just yet. NTT DATA Consulting provides a reality check on AI’s transformative impact on the insurance industry. Download the full report “The AI Revolution in Insurance: A Reality Check” on the NTT DATA website.
Disrupting Internal Processes with Artificial Intelligence APIs - IBM Watson
Get a high-level overview of Watson Developer Cloud, including a deep dive into the four GTM models, different pricing scenarios and an overview of the tools you have to support your choice.
At InsightNG, we’re always thinking about the future. While our innovative platform has the potential to help solve real and complex problems, we’re aiming higher: we’re determined to empower everyone we touch. Our vision is to help people improve their understanding of the complex challenges or subject matter they encounter every day - whether at school, work or at home.
At InsightNG we’re excited at the possibility of bridging the gap between people and technology: bringing together the wealth of human experience and knowledge and the incredible advances of technology to create something altogether new.
We anticipate a future where people and intelligent technology work together to solve specific problems and challenges – your challenges. The technology we’ve developed, Your Smartest Friend, understands your life, your context.
Our ultimate goal is to create the Cognitive Fabric of a Global Brain, that harnesses distributed intelligence to facilitate the expansion of human understanding and the flexibility in the way each of us perceives the challenges we face.
This discussion document is intended to share our thinking and to seek out like-minded individuals or organizations who would like to chat or collaborate around areas of mutual interest.
How Insurers Can Harness Artificial Intelligence - Cognizant
Once science fiction, artificial intelligence now holds vast potential for insurers interested in reinventing their business models and transforming customer experience.
Monitoring and Analysis of Web Information for Various Business Contexts: Co... - Dr. Haxel Consult
A range of business use cases will be presented to illustrate the combined use of several technologies and tools developed by Qwam, including web content crawling and monitoring, advanced information search and retrieval, information analytics, and information delivery.
Use cases presented will cover:
Automated gathering and extraction of corporate web information for competitive intelligence, CRM systems, lead generation, etc.
Building knowledge base with selected web information for marketing intelligence needs
Information analytics of content and usage with knowledge portal applications
Our secure remote connectivity tool provides full video recording of all work our engineers perform on client systems. We have requirements to analyze these video logs to detect suspicious activity and to provide forensic and root-cause analysis capabilities. Some of the obvious use cases include detection of credit card patterns or personally identifiable information (PII), as well as malicious activity like dropping database objects. We need to process hundreds of gigabytes per day, representing thousands of hours of video. Our solution leverages a variety of Hadoop components to perform optical text recognition and indexing, keyboard and mouse movement analysis, as well as integration with a variety of other data sources such as our monitoring, documentation, ticketing and communication systems. We will present our complete architecture, starting from multi-source data ingestion through data processing and analysis up to the end user interface, reporting and integration layer.
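As a concrete illustration of one of the simpler detection steps, here is a minimal, hedged sketch that scans OCR-extracted text for credit-card-like numbers using a regular expression plus a Luhn checksum; the sample text is invented, and the real system described above runs such checks at Hadoop scale over OCR output from video frames.

```python
# Flag credit-card-like numbers in OCR'd text with a regex plus Luhn check.
# Sample text is invented; the real pipeline runs at Hadoop scale.
import re

CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")

def luhn_ok(candidate):
    digits = [int(d) for d in re.sub(r"\D", "", candidate)]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

ocr_text = "agent typed card 4111 1111 1111 1111 then ran DROP TABLE customers;"
for match in CARD_RE.finditer(ocr_text):
    if luhn_ok(match.group()):
        print("possible credit card number:", match.group())
```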
A brief history of artificial intelligence for business - Jack C Crawford
Since the 1960s, Artificial Intelligence has promised us benefits in business and in our personal lives. This presentation takes us from the early days up to machine learning and applications for enterprise businesses that are delivering personalized experiences to customers ... to a "segment of one."
Using Data Integration to Deliver Intelligence to Anyone, Anywhere - Safe Software
Data integration makes it possible to deliver intelligence and keep decision makers, first responders, and civilians informed. For over 20 years, FME has been trusted by federal governments to move data from nearly any source to the target destination, while saving time and budget resources.
With FME, federal governments can deliver open data, improve emergency & disaster response, enhance land management, turn public safety and defense into actionable results, and integrate & deliver location intelligence.
How to Apply Machine Learning with R, H2O, Apache Spark MLlib or PMML to Real... - Kai Wähner
"Big Data" is currently a big hype. Large amounts of historical data are stored in Hadoop or other platforms. Business Intelligence tools and statistical computing are used to draw new knowledge and to find patterns from this data, for example for promotions, cross-selling or fraud detection. The key challenge is how these findings can be integrated from historical data into new transactions in real time to make customers happy, increase revenue or prevent fraud.
"Fast Data" via stream processing is the solution to embed patterns - which were obtained from analyzing historical data - into future transactions in real-time. This session uses several real world success stories to explain the concepts behind stream processing and its relation to Hadoop and other big data platforms. The session discusses how patterns and statistical models of R, Spark MLlib and other technologies can be integrated into real-time processing using open source frameworks (such as Apache Storm, Spark or Flink) or products (such as IBM InfoSphere Streams or TIBCO StreamBase). A live demo shows the complete development lifecycle combining analytics, machine learning and stream processing.
Future IT Trends Talk @Stanford OIT 554 Class - Guest Speaker - 3.7.17 - Paul Hofmann
The big five future IT trends
Internet of Things:
Assets Turn Into Applications
Machine Intelligence:
AI Could Replace 50M Professional Jobs
Distributed Ledgers:
Blockchain is becoming mainstream
Sharing Economy:
We don't own anything anymore
Virtual and Augmented Reality:
Remote experiences merge the visual & digital worlds
Cloud Powered IoT: Connected Solutions Helping Communities - Amazon Web Services
Internet of Things (IoT) solutions have the ability to transform public services in a positive way, and when powered by cloud-based services, they can securely and cost-effectively scale and provide benefits to the people that need them the most. Nexleaf's mission is to preserve human life and protect the planet by designing sensor technologies, generating data analytics, and advocating for data-driven solutions. Nexleaf developed the ColdTrace and StoveTrace solutions to improve the distribution of life-saving vaccines and to increase the adoption of cleaner cookstoves. The City of Virginia Beach, in collaboration with the Virginia Institute of Marine Science, developed a sensor-based flood alerting system to give advance notice to citizens and first responders. In this session, you will hear from both about how these AWS-powered solutions are driving effective results and helping communities.
Deep Learning Image Processing Applications in the Enterprise - Ganesan Narayanasamy
The presentation covers many use cases, including the following. Image classification: "The process of identifying and detecting an object or a feature in a digital image or video," the report states. In retail, deep learning models "quickly scan and analyze in-store imagery to intuitively determine inventory movement."
Voice recognition: "The ability to receive and interpret dictation or to understand and carry out spoken commands. Models are able to convert captured voice commands to text and then use natural language processing to understand what is being said and in what context." In transportation, deep learning "uses voice commands to enable drivers to make phone calls and adjust internal controls - all without taking their hands off the steering wheel."
Anomaly detection: "Deep learning technique strives to recognize abnormal patterns which don't match the behaviors expected for a particular system, out of millions of different transactions. These applications can lead to the discovery of an attack on financial networks, fraud detection in insurance filings or credit card purchases, even isolating sensor data in industrial facilities signifying a safety issue."
Recommendation engines: "Analyze user actions in order to provide recommendations based on user behavior."
Sentiment analysis: "Leverages deep learning-heavy techniques such as natural language processing, text analysis, and computational linguistics to gain clear insight into customer opinion, understanding of consumer sentiment, and measuring the impact of marketing strategies."
Video analysis: "Process and evaluate vast streams of video footage for a range of tasks including threat detection, which can be used in airport security, banks, and sporting events."
Reddix Group - Quantum AI - Presentation - Joe Reddix
Although AI/ML has made rapid progress over the past decade, it has not yet overcome technological limitations. With the unique features of quantum computing, obstacles to achieve AGI (Artificial General Intelligence) can be eliminated. Quantum computing can be used for the rapid training of machine learning models and to create optimized algorithms. This is what we call Master Systems Integration (MSI).
An optimized and stable AI provided by quantum computing can complete years of analysis in a short time and lead to advances in technology. Neuromorphic cognitive models, adaptive machine learning, and reasoning under uncertainty are some fundamental challenges of today's AI. Quantum AI (Q-AI) is one of the most likely solutions for next-generation AI and is where our teams of SMEs excel.
Elastic as an advanced analytics solution for processes in the oil sector. Real-time analytics of sensor data to add value to organizations' strategic decisions.
29 SEPTEMBER 2021 – Aula Magna – Corso Duca degli Abruzzi, 24 – Politecnico di Torino
Research, technology transfer and support for companies on the fundamental topics of Big Data, Artificial Intelligence, robotics and the digital revolution
Applying Machine Learning to IoT: End to End Distributed Pipeline... - Carol McDonald
This talk discusses the architecture of an end-to-end application that combines streaming data with machine learning to perform real-time analysis and visualization of where and when Uber cars are clustered, in order to identify the most popular Uber locations.
Adjusting primitives for graph: SHORT REPORT / NOTES - Subhajit Sahu
Graph algorithms, like PageRank, operate on large sparse graphs. Compressed Sparse Row (CSR) is an adjacency-list-based graph representation that stores each vertex's outgoing neighbours contiguously in a single index array, with an offsets array marking where each vertex's list begins.
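For reference, here is a minimal sketch of building a CSR adjacency structure (offsets plus destination indices) from an edge list and using it for one PageRank-style iteration; the tiny graph is invented and this is not the report's benchmark code.

```python
# Build a CSR adjacency structure from an edge list and run one damped
# PageRank-style iteration over it. Tiny invented graph; illustrative only.
import numpy as np

edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]    # (source, destination)
n = 4

degree = np.zeros(n, dtype=int)
for s, _ in edges:
    degree[s] += 1
offsets = np.concatenate(([0], np.cumsum(degree)))  # offsets[v]..offsets[v+1]
dests = np.empty(len(edges), dtype=int)             # neighbours, packed per vertex
cursor = offsets[:-1].copy()
for s, d in edges:
    dests[cursor[s]] = d
    cursor[s] += 1

rank = np.full(n, 1.0 / n)
new_rank = np.full(n, 0.15 / n)                     # teleport term
for v in range(n):
    out = offsets[v + 1] - offsets[v]
    if out:
        share = 0.85 * rank[v] / out
        for d in dests[offsets[v]:offsets[v + 1]]:
            new_rank[d] += share
print(offsets, dests, np.round(new_rank, 3))
```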
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
1. BIG IMAGE ANALYTICS FOR (RE-)INSURERS
FLAVIO TROLESE
4QUANT | BIG IMAGE ANALYTICS
THURSDAY, APRIL 7 2016
2. 4Quant | BIG IMAGE ANALYTICS
BIG DATA
IMAGE ANALYSIS
BIG IMAGE ANALYTICS
REMOTE SENSING
POTENTIAL (RE-)INSURANCE CASES
Q&A
3. 4Quant | BIG IMAGE ANALYTICS
[...] Data sets that are so large or complex that traditional data processing applications are inadequate.
Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying and information privacy.
BIG DATA
https://en.wikipedia.org/wiki/Big_data
4. 4Quant | BIG IMAGE ANALYTICS
BIG DATA
Optimize Funnel Conversion
Behavioral Analytics
Customer Segmentation
Predictive Support
Market Basket Analysis and Pricing Optimizations
Predict Security Threats
Fraud Detection
5. 4Quant | BIG IMAGE ANALYTICS
IMAGE ANALYSIS
Tools and technologies (for analysis of «small (image) data»)
(limited) services for «big (image) data»
6. 4Quant | BIG IMAGE ANALYTICS
Linearly scalable, interactive, fault-tolerant big image data processing technology for two-, three- and four-dimensional image data.
Processes tera- to petabytes of image and video data in near real-time and at low costs
+ IMAGE ANALYSIS
+ BIG DATA
= BIG IMAGE ANALYTICS
7. 4Quant | BIG IMAGE ANALYTICS
We do image analytics on Big Data → BIG IMAGE ANALYTICS
Our mission is to help companies monetize their image and video content.
4QUANT
8. 4Quant | BIG IMAGE ANALYTICS
Problem
«Analyze phenotyping and genetic linkage of cortical bone microstructure in the mouse»
Terabytes of image data
Analysis of image data takes years with existing (small data) image analysis tools
Solution
«Use a Big Data Framework in combination with image analysis algorithms»
BIG IMAGE ANALYTICS PLATFORM
Linearly scalable, interactive, fault-tolerant big image data processing technology for two-, three- and four-dimensional image data
Tera- to petabytes of image and video data
Near real-time processing
TECHNOLOGY
«High-throughput phenotyping and genetic linkage of cortical bone microstructure in the mouse», K. Mader, L. Donahue, R. Müller, M. Stampanoni; BMC Genomics 2015, 16:493
«Moving image analysis to the cloud: A case study with a genome-scale tomographic study», K. Mader, M. Stampanoni; AIP Conf. Proc. 1696, 020045 (2016)
9. 4Quant | BIG IMAGE ANALYTICS
HISTORY (2013–2016)
PhD Thesis
ETH Pioneer Fellowship
Talk at Spark Summit
ETH Lecturer Assignment
Invited Big Data Workshop
Werner-Meyer-Ilse Award
Databricks Partnership
IBM Partnership
COPD Study
NPC Study
10. 4Quant | BIG IMAGE ANALYTICS
Veracity
Tested, validated correctness
Full traceability of all algorithms
Variety
For two-, three- and four-dimensional image data
Supports / parallelizes existing single-CPU algorithms (e.g. ImageJ, Matlab, R)
Velocity
Stream-based analysis in almost real-time
Ad-hoc queries
Throughput of > GB/s
Volume
Linearly scalable beyond gigavoxels and tera- to petabytes
Elastically scales to available resources
Value
Detection of anomalies, intelligent feature recognition, tracking of millions of features, smart segmentation, parameter sweeping, pattern recognition
BIG IMAGE ANALYTICS PLATFORM
Variety
Finds insights from complex, noisy, heterogeneous, longitudinal, and voluminous imaging data. Answers questions that were previously unanswered.
11. 4Quant | BIG IMAGE ANALYTICS
Flavio Trolese / CFO
Corporate Finance
HR / Administration
Dr. Kevin Mader / CTO
R&D
Product Development
Scientific Consulting
Joachim Hagger / CEO
General Management
Business Development
Fundraising
MANAGEMENT TEAM
M.Sc. in Physics ETH
Co-founder and managing partner of Netcetera, an IT consulting company, 390 employees
BScIT at ZHAW
Co-founder and managing partner of Panter AG, an IT consulting company, 40 employees; co-founder and managing partner of Colab Zurich / ImpactHub Zurich (> 350 members)
PhD on "High-throughput, synchrotron based tomographic microscopy tools" at ETH/PSI
ETH Pioneer Fellowship, Lecturer in the X-ray microscopy group, D-ITET ETH Zurich
12. 4Quant | BIG IMAGE ANALYTICS
Transportation
Material screening
INDUSTRY & TECHNOLOGY
Pathology
Radiology
Image intelligence
Pharma
Remote sensing
Human swarms
Imaging CRO
GEOIMAGING
MEDICINE & PHARMA
15. 4Quant | BIG IMAGE ANALYTICS
REMOTE SENSING
New, exciting sources of images with meter resolution, updated almost daily
225 GB / y 333 TB / y 333 TB / y 15 PB / y
16. 4Quant | BIG IMAGE ANALYTICS
REMOTE SENSING / BIAP
Petabytes of satellite image data → 4Quant's BIAP (running on clusters of VMs) → Meaningful results, e.g. flood risk calculations
17. 4Quant | BIG IMAGE ANALYTICS
Integrate deep learning algorithms to automatically identify unknown or changing regions in the image
INTEGRATION OF MACHINE INTELLIGENCE
18. 4Quant | BIG IMAGE ANALYTICS
Utilize high temporal resolution and decades of archives … to see new insights in the data and extract quantitative results for building new models
HISTORICAL CHANGES IN SURFACE PATTERNS
19. 4Quant | BIG IMAGE ANALYTICS
Satellites produce lots of data. What can we do with it?
Data courtesy of Planet Labs
Location and size of water features
Distance to nearest water feature
Building location
Identifying high-risk areas
TRANSFORM IMAGE DATA INTO MEANINGFUL QUANTITIES
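As a hedged illustration of how such quantities could be computed from raster data, the sketch below thresholds a tiny synthetic "water index", labels the connected water features, measures their size and location, and computes a per-pixel distance to the nearest water pixel with SciPy; the array is synthetic and this is not 4Quant's BIAP pipeline.

```python
# Locate water features, measure their size, and compute distance to the
# nearest water pixel on a synthetic raster. Not real satellite data or BIAP.
import numpy as np
from scipy import ndimage

raster = np.zeros((8, 8))
raster[1:3, 1:4] = 0.9          # a small lake
raster[6, 5:8] = 0.8            # a river segment
water = raster > 0.5            # values above 0.5 are treated as water

# Location and size of each water feature
labels, n_features = ndimage.label(water)
sizes = ndimage.sum(water, labels, index=range(1, n_features + 1))
centroids = ndimage.center_of_mass(water, labels, range(1, n_features + 1))
print(n_features, "water features; sizes (pixels):", sizes, "; centroids:", centroids)

# Distance from every pixel to the nearest water pixel (e.g. for flood risk)
distance_to_water = ndimage.distance_transform_edt(~water)
print(np.round(distance_to_water, 1))
```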
20. 4Quant | BIG IMAGE ANALYTICS
REAL-TIME DAMAGE ESTIMATION
Often after a major incident (hail storm, fire, flood), experts are required to assess the overall lost value. A timely and accurate loss assessment helps insurers to allocate the right amount of provisions in time.
Using satellite and drone images, 4Quant accelerates and improves the process of loss assessment. Automated analysis of the damaged assets may be both faster and more reliable than the traditional, manual assessment. This helps insurers to allocate resources more precisely, both in volume and time.
21. 4Quant | BIG IMAGE ANALYTICS
ACCELERATION / IMPROVEMENT OF CLAIMS MODELING
Fast and reliable information on damages is also crucial for claims modeling. Modeling firms undertake huge efforts to investigate damage events to improve their predictive capabilities.
As 4Quant may help insurers in assessing insured events, we may also support modeling firms with a thorough first-level event assessment. Using event imagery analytics enables modelers to base their equations on a growing base of robust data.
This in turn helps insurers to source improved severity curves in ever more reliable predictive hazard models.
22. 4Quant | BIG IMAGE ANALYTICS
ONLINE CLAIMS MANAGEMENT FOR CUSTOMERS
Example: effective car insurance claims handling
Process:
➔ After an accident, the driver or garage takes pictures of the damaged car
➔ Our database knows the topology of the healthy car
➔ Our trained algorithms calculate the expected claims in real time with great accuracy
➔ Results can be directly integrated into the insurer's claims management system
23. 4Quant | BIG IMAGE ANALYTICS
REAL-TIME INFORMATION FOR INSURANCE-LINKED SECURITIES
Combining trained algorithms with real-time satellite imagery, 4Quant develops models for real-time hurricane analytics. This enables 4Quant to provide value for insurers as the sponsors of Cat Bonds, and also for investors, asset managers, and rating agencies.
Parametric Cat Bonds are indexed to the underlying natural hazard, such as the atmospheric pressure in the eye of a hurricane. Rapid assessment of such parameters is key to enabling liquid markets and effective risk transfer. However, these parameters are often not available in real time.
24. 4Quant | BIG IMAGE ANALYTICS
INSURANCE MODELS
Worldwide Loss Events 1980-2013
2014 Münchener Rückversicherungs-Gesellschaft, Geo Risks Research, NatCatSERVICE – As at February 2014
25. 4Quant | BIG IMAGE ANALYTICS
ACCOUNTING FOR CLIMATE CHANGE
Climate Change Predictions
2014 Münchener Rückversicherungs-Gesellschaft, Geo Risks Research, NatCatSERVICE – As at February 2014
26. 4Quant | BIG IMAGE ANALYTICS
2014 Münchener Rückversicherungs-Gesellschaft, Geo Risks Research, NatCatSERVICE – As at February 2014
ACCOUNTING FOR CLIMATE CHANGE
27. 4Quant | BIG IMAGE ANALYTICS
IMPACT OF CLIMATE CHANGE
2014 Münchener Rückversicherungs-Gesellschaft, Geo Risks Research, NatCatSERVICE – As at February 2014
28. 4Quant | BIG IMAGE ANALYTICS
Image-supported premium adaptation of car insurance
29. 4Quant | BIG IMAGE ANALYTICS
ADVANCED ANALYTICS: INNOVATION DRIVER IN INSURANCE