Calculation Consulting provides data science leadership and machine learning consulting services, with a focus on developing algorithms that can generate sustainable revenue. The company is led by Dr. Charles Martin, who has over 10 years of experience in applied machine learning and developing algorithms for companies like Demand Media. Calculation Consulting helps clients address challenges like measuring the impact of data science work, managing the data science process, and ensuring algorithmic accountability and transparency.
Why Deep Learning Works: Dec 13, 2018 at ICSI, UC Berkeley - Charles Martin
Talk given on Dec 13, 2018 at ICSI, UC Berkeley
http://www.icsi.berkeley.edu/icsi/events/2018/12/regularization-neural-networks
Random Matrix Theory (RMT) is applied to analyze the weight matrices of Deep Neural Networks (DNNs), including both production-quality, pre-trained models and smaller models trained from scratch. Empirical and theoretical results clearly indicate that the DNN training process itself implicitly implements a form of self-regularization, implicitly sculpting a more regularized energy or penalty landscape. In particular, the empirical spectral density (ESD) of DNN layer matrices displays signatures of traditionally-regularized statistical models, even in the absence of exogenously specifying traditional forms of explicit regularization. Building on relatively recent results in RMT, most notably its extension to Universality classes of Heavy-Tailed matrices, and applying them to these empirical results, we develop a theory to identify 5+1 Phases of Training, corresponding to increasing amounts of implicit self-regularization. For smaller and/or older DNNs, this implicit self-regularization is like traditional Tikhonov regularization, in that there appears to be a "size scale" separating signal from noise. For state-of-the-art DNNs, however, we identify a novel form of heavy-tailed self-regularization, similar to the self-organization seen in the statistical physics of disordered systems. Moreover, we can use these heavy-tailed results to form a VC-like average-case complexity metric that resembles the product norm used in analyzing toy NNs, and we can use this to predict the test accuracy of pretrained DNNs without peeking at the test data.
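To make the idea of an empirical spectral density concrete, here is a minimal sketch, not the talk's own code. It assumes NumPy, a hypothetical 400 x 100 layer filled with Gaussian noise, and a hypothetical rank-one "signal" added on top. It compares the eigenvalues of X = W^T W / N with the Marchenko-Pastur bulk edges that RMT predicts for a purely random matrix.

```python
import numpy as np


def esd_eigenvalues(W):
    """Return the eigenvalues of X = W^T W / N for an N x M weight matrix W.

    A histogram of these eigenvalues is the empirical spectral density (ESD).
    """
    N, _ = W.shape
    X = W.T @ W / N                 # M x M correlation matrix
    return np.linalg.eigvalsh(X)    # real eigenvalues, ascending order


def marchenko_pastur_edges(N, M, sigma=1.0):
    """Bulk edges of the Marchenko-Pastur law for i.i.d. entries of variance
    sigma**2, assuming N >= M (aspect ratio Q = N / M >= 1)."""
    Q = N / M
    lam_minus = sigma**2 * (1.0 - 1.0 / np.sqrt(Q)) ** 2
    lam_plus = sigma**2 * (1.0 + 1.0 / np.sqrt(Q)) ** 2
    return lam_minus, lam_plus


rng = np.random.default_rng(0)      # fixed seed, illustrative only
N, M = 400, 100                     # hypothetical layer shape

# Pure-noise "layer": its ESD should sit close to the Marchenko-Pastur bulk.
W_noise = rng.normal(size=(N, M))
evals_noise = esd_eigenvalues(W_noise)
lam_minus, lam_plus = marchenko_pastur_edges(N, M)

# Noise plus a rank-one "signal": one eigenvalue should escape the bulk.
u = rng.normal(size=(N, 1))
v = rng.normal(size=(1, M))
W_signal = W_noise + 0.5 * (u @ v)
evals_signal = esd_eigenvalues(W_signal)

print(f"MP bulk edges: [{lam_minus:.2f}, {lam_plus:.2f}]")
print(f"noise ESD range: [{evals_noise.min():.2f}, {evals_noise.max():.2f}]")
print(f"largest eigenvalue with signal: {evals_signal.max():.2f}")
```

In this toy setting the bulk edge plays the role of the "size scale" separating signal from noise. Real layer matrices would be loaded from a trained model rather than sampled.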
Plenary session presentation from 8th Workshop on Design Theory Special Interest Group, Paris, 2015.
The main argument:
We argue that, in order to rise to the data-related challenges that society is facing, data-science initiatives should ensure a renewal of traditional research methodologies that are still largely based on trial-and-error processes depending on the talent and insights of a single researcher (or a restricted group of researchers).
We claim that design theories and methods can provide, at least to some extent, the much-needed framework. We will use a worldwide data-science challenge organized to study a technical problem in physics, namely the detection of the Higgs boson, as a use case to demonstrate some of the ways in which design theory and methods can help in analyzing and shaping the innovation dynamics in such projects.
Data Science for Business Managers - Trends and Evolutions - Akin Osman Kazakci
This first module is an overview of the current data science panorama. Why is this happening now? Who are the various actors? Where will it have an impact next? Special attention is paid to how predictive technologies will transform legacy industries.
Innovative Design Workshop - HiggsML and beyond (Machine Learning in Particle... - Akin Osman Kazakci
Introduction to design theory for strategy and analytics design (machine learning application) for particle physics.
More on:
http://www.osmanakin.org/2015/01/big-data-pushing-buzzword-into.html
Big Data & Machine Learning - TDC2013 Sao Paulo - OCTO Technology
BigData and Machine Learning: Usage and Opportunities for your IT department
Talk presented at The Developer Conference in São Paulo - 12/07/13
Mathieu DESPRIEE
Helping data scientists escape the seduction of the sandbox - Krish Swamy, We... - Sri Ambati
This talk was given at H2O World 2018 NYC and can be viewed here: https://youtu.be/xc3j20Om3UM
Description:
Data science is indeed one of the sexy jobs of the 21st century. But it is also a lot of hard work. And the hard work is seldom about the math or the algorithms. It is about building relevant machine learning products for the real world. We will go over some of the must-haves as you take your machine learning model out of the sandbox and make it work in the big, bad world outside.
Speaker's Bio:
Krish Swamy is an experienced professional with deep skills in applying analytics and BigData capabilities to challenging business problems and driving customer insights. Krish's analytic experience includes marketing and pricing, credit risk, digital analytics and most recently, big data analytics and data transformation. His key experiences lie in banking and financial services, the digital customer experience domain, with a background in management consulting. Other key skills include influencing organizational change towards a data and analytics driven culture, and building teams of analysts, statisticians and data scientists.
The Sky’s the Limit – The Rise of Machine Learning - Inside Analysis
The Briefing Room with Analyst Dr. Robin Bloor and SkyTree
Live Webcast on June 24, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=1da2b498fc39b8b331a5bbb8dea2660f
With data growing more complex these days, many organizations are looking for ways to make sense of new information sources. The goal? Sprint ahead of the competition by exploiting fast-moving opportunities. The challenge? The data volumes, variety and velocity call for significantly greater horsepower than ever before. That’s where machine learning comes into play, and it’s already fundamentally changing the Big Data Analytics landscape.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor as he explains how advanced analytics technology can transform the enterprise. He’ll be briefed by Martin Hack, CEO of Skytree, who will tout his company’s machine learning solution for big data. Hack will discuss the critical challenges facing today’s data professionals, and present use cases to show how machine learning can help organizations leverage big data as a capital asset. He’ll specifically address the power of predictive analytics, which can help companies seize opportunities and prevent serious problems.
Visit InsideAnalysis.com for more information.
Building a robust machine learning model is not an easy task. After all, most POCs don't make it into production. And even when a model does make it into production, you still need to monitor its performance.
How can you build performant, tolerant, stable, predictive models that have known and fair biases? How can you make sure your models yield their value over time and stay performant after your team has deployed them? What are the current practices of model validation (or lack thereof), how are they flawed, and how could we improve them?
Simon Dagenais from Snitch AI will go through the reasons behind using an efficient validation framework that goes beyond the common metrics used by ML practitioners and why these tests matter when building high-quality models.
Agenda:
-----------
3:45pm - 4:00pm: Arrival & Networking
4:00pm - 4:15pm: News & Intro
4:15pm - 5:15pm: How to QA your ML models
5:15pm - 5:30pm: Virtual Snack & Networking
About the main speaker:
---------------------------------
Simon Dagenais is the Lead Data Scientist at Snitch AI, a machine learning validation tool. Before working on Snitch AI, Simon was a data scientist consultant at Moov AI, the parent company of Snitch AI. During his time as a consultant, he built and deployed custom ML solutions to solve business needs at companies like DRW, Société de Transport de Montréal and Cogeco. He now aspires to solve problems that data science teams will encounter during the course of an ML project cycle. Simon obtained an M.Sc. in economics from HEC Montreal. He frequently speaks at conferences, panels and meetups.
Machine Learning: Business Perspective - Main Conference: Introduction to Machine Learning.
DutchMLSchool: 1st edition of the Machine Learning Summer School in The Netherlands.
CTO Radshow Hamburg17 - Keynote - The CxO responsibilities in Big Data and AI... - Santiago Cabrera-Naranjo
When talking about what the future of Big Data will look like, the conversation often turns straight to Artificial Intelligence and Deep Learning. However, today data science is all too often a process where new insights and models get developed as a one-time effort or deployed to production on an ad-hoc basis, i.e. they commonly require regular babysitting for monitoring and updating.
According to Gartner, 90% of Data Lakes will be useless in 2018. Furthermore, only 15% of Big Data Products are mature enough to be deployed into Production. Who is responsible for making Big Data successful and Business relevant within an enterprise?
This presentation was given to a class of MBA students at Oakland University in Rochester, Michigan. I share my personal journey of starting and running a business that provides software development and analytics consulting services.
Companies have adopted data into their DNA using a variety of methods, including data driven, data enabled, and data informed, but many implementations have fallen short of the promised ROI, the result of a gap between the cost of investing in people and infrastructure and the business value delivered.
June Andrews takes a structured look at how to strategically invest in data to maximize the benefit gained from incorporating data, highlighting situations when investment in experimental platforms doesn’t make sense and others when building a custom analysis platform does. Along the way, June explains how best practices have evolved over time. The result is to grow the mindset from creating data-driven organizations to creating data-competitive organizations that can adapt and deliver in the rapidly changing landscape between data science, machine learning, and artificial intelligence.
Data. It keeps coming up time and time again. It appears on our social media feeds and in our client conversations, and it has of course been the driver behind never-before-seen tools like ChatGPT.
But how can you do more with the data your organisation has and produces? What is data engineering and big data, and how can you enable data-driven decision-making within your organisation?
Hear from Nabi Rezvani—Lead Data Engineer—and Gaurav Thadani—Lead Software Engineer at DiUS on the latest trends, use cases and real-life examples of how our clients are using data and analytics to improve their decision making, customer experiences and business operations.
Also joining us are Jonathan Gomez—Head of Data Platforms at Wesfarmers OneDigital OnePass—and John Sullivan—CEO at ChargeFox—on their own [big and small] data journeys, along with the lessons they’ve learned along the way.
Watch the presentation on YouTube: https://youtu.be/ccghOfcdGN8
At ING Bank, machine learning models are a key factor in making relevant engagements with our customers, empowering them to stay a step ahead in life and in business. In our efforts to make the model building process more rapid, compliant, validated and accessible to roles other than data scientists (such as data analysts or customer journey experts), we have structured it for easy creation of propensity models.
In this talk, I will present this structure, focusing on pipelining data science models in Apache Spark. In particular, I will show how we use Apache Sqoop & Ranger to comply with GDPR, build a data science workflow on top of Python and Jupyter, extend the SparkML libraries on PySpark to create custom standardizers and cross-validators, and show an in-house developed monitoring tool built on top of Elasticsearch for model evaluation.
Finally, I will describe the type of engagement analysts and customer journey experts have with the result set of the models created, and how we refine our dashboards (in IBM Cognos) accordingly.
Speaker: Dor Kedem, Lead Data Scientist
ING Bank
Minne analytics presentation 2018 12 03 final compressed - Bonnie Holub
Monday was another great conference by MinneAnalytics! #MinneFRAMA was a great success with over 1,100 attendees at the Science Museum of Minnesota. Alison Rempel Brown is a great host! A Teradata colleague told me that her post about my presentation "blew up" with hits and she got over 2K views and 60+ likes. I'm proud to be a part of this great #datascience organization bringing #machinelearning and #artificialintelligence #analytics to our #bigdata clients. If you want my slides, here they are.
Data Architecture Strategies: Artificial Intelligence - Real-World Applicatio... - DATAVERSITY
Artificial Intelligence (AI) may conjure up images of robots and science fiction. But AI has practical applications in today’s data-driven organization for product recommendation engines, customer support, inventory management, and more. To support AI in order to drive concrete business outcomes, a strong data foundation is needed. This webinar will discuss practical applications for AI in your organization, and how to build a data architecture to support its use.
Description: WeightWatcher (WW) is an open-source diagnostic tool for analyzing Deep Neural Networks (DNNs) without needing access to training or even test data. It can be used to:
- analyze pre/trained PyTorch, Keras, DNN models (Conv2D and Dense layers)
- monitor models, and the model layers, to see if they are over-trained or over-parameterized
- predict test accuracies across different models, with or without training data
- detect potential problems when compressing or fine-tuning pre-trained models
See https://weightwatcher.ai
Stanford ICME Lecture on Why Deep Learning Works - Charles Martin
Random Matrix Theory (RMT) is applied to analyze the weight matrices
of Deep Neural Networks (DNNs), including production quality,
pre-trained models, and smaller models trained from scratch. Empirical
and theoretical results indicate that the DNN training process itself
implements a form of self-regularization, evident in the empirical
spectral density (ESD) of DNN layer matrices. To understand this, we
provide a phenomenology to identify 5+1 Phases of Training,
corresponding to increasing amounts of implicit self-regularization.
For smaller and/or older DNNs, this implicit self-regularization is
like traditional Tikhonov regularization, with a "size scale"
separating signal from noise. For state-of-the-art DNNs, however, we
identify a novel form of heavy-tailed self-regularization, similar to
the self-organization seen in the statistical physics of disordered systems.
To that end, building on the statistical mechanics of generalization,
and applying recent results from RMT, we derive a new VC-like
complexity metric that resembles the familiar product norms, but is
suitable for studying average-case generalization behavior in real
systems. We then demonstrate its effectiveness by testing how well
this new metric correlates with trends in the reported test accuracies
across models for over 450 pretrained DNNs covering a range of data
sets and architectures.
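As a rough illustration of two ingredients named above, the sketch below fits a power-law exponent to a heavy-tailed sample and computes a log product norm. It is not the metric derived in the lecture. The synthetic samples stand in for the tail of a layer's eigenvalue spectrum, the estimator is the standard continuous maximum-likelihood fit (an assumption, since the abstract does not say how tails are fitted), and all function names are hypothetical.

```python
import math
import random


def sample_power_law(n, alpha, x_min=1.0, seed=0):
    """Draw n samples from p(x) ~ x**(-alpha), x >= x_min, by inverse transform."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_power_law_exponent(values, x_min):
    """Maximum-likelihood estimate of alpha using only the values >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum


def log_product_norm(layer_norms):
    """log10 of the product of per-layer norms, computed as a sum of logs."""
    return sum(math.log10(norm) for norm in layer_norms)


# Synthetic stand-in for the tail of an ESD, with a known exponent of 3.
eigs = sample_power_law(5000, alpha=3.0)
alpha_hat = fit_power_law_exponent(eigs, x_min=1.0)
print(f"fitted exponent: {alpha_hat:.2f}")

# Hypothetical per-layer norms for a two-layer network.
print(f"log product norm: {log_product_norm([10.0, 100.0]):.2f}")
```

Summing logs rather than multiplying norms keeps the quantity numerically stable for deep networks, and it makes clear that each layer contributes one additive term.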
Why Deep Learning Works: Self Regularization in Deep Neural Networks - Charles Martin
Talk given on June 8, 2018 at UC Berkeley / NERSC
In Collaboration with Michael Mahoney, UC Berkeley
National Energy Research Scientific Computing Center
Empirical results, using the machinery of Random Matrix Theory (RMT), are presented that are aimed at clarifying and resolving some of the puzzling and seemingly contradictory aspects of deep neural networks (DNNs). We apply RMT to several well-known pre-trained models: LeNet5, AlexNet, and Inception V3, as well as 2 small toy models. We show that the DNN training process itself implicitly implements a form of self-regularization associated with the entropy collapse / information bottleneck. We find that the self-regularization in small models like LeNet5 resembles the familiar Tikhonov regularization, whereas large, modern deep networks display a new kind of heavy-tailed self-regularization. We characterize self-regularization using RMT by identifying a taxonomy of the 5+1 phases of training. Then, with our toy models, we show that even in the absence of any explicit regularization mechanism, the DNN training process itself leads to more and more capacity-controlled models. Importantly, this phenomenon is strongly affected by the many knobs that are used to optimize DNN training. In particular, we can induce heavy-tailed self-regularization by adjusting the batch size in training, thereby exploiting the generalization gap phenomena unique to DNNs. We argue that this heavy-tailed self-regularization has both practical implications for designing better DNNs and deep theoretical implications for understanding the complex DNN energy landscape / optimization problem.
3. calculation | consulting data science leadership
Who Are We?
c|c
(TM)
Dr. Charles H. Martin, PhD
University of Chicago, Chemical Physics
NSF Fellow in Theoretical Chemistry
Over 10 years experience in applied Machine Learning
Developed ML algos for Demand Media; the first $1B IPO since Google
Lean Start Ups: Aardvark (acquired by Google), eHow, Mode
Wall Street: BlackRock, GLG
Fortune 500: Big Pharma, Telecom, eBay
www.calculationconsulting.com
charles@calculationconsulting.com
4. BackStory: in 2011, Search Changed. Forever.
Demand Media
eHow.com
• first $1B IPO since Google
• Machine Learning based SEO algorithms
• Measure the demand for search, and fulfill it
data science algorithms created a billion $ company
5. BackStory: in 2011, Search Changed. Forever.
• Google adapted (Panda)
• Lack of diversification
• Lack of adaptation
• Stock price never recovered
algorithmic accountability: DMD or Google?
[Chart: DMD stock price 2011-2012, with the IPO and Panda marked]
6. • first $1B collapse due to Panda?
• CPC revenues down
• premium online publishers died
[Chart: stock price 2011-2012, with the collapse marked "?"]
$1B in ad revenue was repriced and reallocated
Problem: Cornering the market on search induced a market crash
8. Data Science is Different
Davenport
Generating sustainable revenue requires Data Science Leadership and Execution
“Companies need a Spock in the boardroom”
9. Data Science is Different
Davenport
Generating sustainable revenue requires Data Science Leadership and Execution
http://www.theonion.com/articles/national-science-foundation-science-hard,1405/
10. Problem: Data Scientists are Different
Davenport
not all techies are the same
11. Problem: Data Scientists are Different
Davenport
theoretical physics
machine learning specialist
experimental physics
data scientist
engineer
software, browser tech, dev ops, …
not all techies are the same
12. Problem: Data Scientists are Different
Davenport
not all techies are the same
13. Managing: Data Science Process
• Acquire Domain Knowledge
• Formulate Hypothesis
• Generate Model(s) from the Data
• Predict Revenue Gains
• Backtest Predictions on your Data
• A/B Test in Production
• Attribute Gains to Model(s)
[Diagram labels: framing, solving, acting]
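The "A/B Test in Production" and "Attribute Gains to Model(s)" steps can be illustrated with a two-proportion z-test. This is only a minimal sketch, not the method used in the talk: the function name `two_proportion_z_test` and the conversion counts are hypothetical, and it assumes independent visitors and samples large enough for the normal approximation.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and model-driven variant (B).

    Returns (lift, z, p_value), where lift is the absolute difference in
    conversion rate (B minus A) and p_value is two-sided.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Hypothetical counts: 10,000 visitors per arm.
lift, z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"lift={lift:.4f}  z={z:.2f}  p={p_value:.4f}")
```

A small p-value only says the difference is unlikely to be noise; attributing revenue to the model still depends on the experiment being designed so that the model is the only thing that differs between the two arms.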
14. Managing: Data Science Process
15. Managing: Learning from Data
• Systems Thinking: leveraging the inter-relationships between data, marketing, and the customer
• Knowledge Transfer: mentoring, not training, to develop both personal mastery and team learning
• Mental Models: create a base of small-scale models for thinking about how to use your data
• Knowledge Sharing: foster collaboration between research, engineering, and product to drive revenue
16. Data Science is Different
• Cross-functional: engineering, product, marketing, finance
• Autonomous: separate from the traditional engineering product lifecycle; self-organizing and self-managing
• Experimental: form hypothesis, analyze data, make predictions, run backtests, A/B testing
• Self-sustaining: not a cost center; generates revenue
17. Solution: Collecting and Organizing Data
• Most companies are struggling to organize their data
• Data needs to be examined
• Don’t assume data is correct or useful
• More is More: simple algos work
• More is Less: noise is noise
Data not examined is not collected
18. Solutions: Hadoop and Big Data
• Hadoop is an internal data ecosystem
• Hadoop appears to have won the adoption wars?
• Hadoop: 90% deployments internal
• Hadoop is a cost center
• ROI needs cut across business divisions
Algorithms, not data, generate revenue
19. Solutions: Cloud
• Startups don’t need infrastructure
• long term Data Storage is virtually free
• Amazon Redshift
• Google Big Query
• Cloud is the future?
Algorithms, not data, generate revenue
20. Solutions: Spark
• Next Gen Platform for Machine Learning
• Sits on Hadoop or the Cloud
• Still very high touch
• Limited algos
Algorithms, not data, generate revenue
22. Data Science’s Measurement Problem
good experiments are hard to design
http://www.forbes.com/sites/lizryan/2014/02/10/if-you-cant-measure-it-you-cant-manage-it-is-bs/
23. Data Science’s Measurement Problem
good experiments are hard to design
“Data science has a measurement problem. Simple metrics may not address complex situations. But complex metrics present myriad problems.”
“As we strive for better algorithms, we often fail to think critically about what it means for predictions to be ‘good’”
http://www.kdnuggets.com/2015/03/data-science-measurement-problem-accuracy-auroc-f1.html
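To make the point about simple metrics concrete, here is a small sketch on an imbalanced toy dataset. The data and the helper name `classification_metrics` are invented for illustration and are not taken from the linked article. A classifier that always predicts the majority class scores 99% accuracy while finding none of the positive cases.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when there are no predicted/actual positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data (hypothetical): 990 negatives, 10 positives.
y_true = [0] * 990 + [1] * 10
always_negative = [0] * 1000  # a "model" that never predicts the positive class

metrics = classification_metrics(y_true, always_negative)
print(metrics)  # accuracy is 0.99, yet recall and F1 are 0.0
```

Which metric counts as "good" depends on what a missed positive or a false alarm costs the business, which is a question the metric itself cannot answer.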
24. Data Science’s Measurement Problem
good experiments are hard to design
“Buffett found it 'extraordinary' that academics studied such things. They studied what was measurable, rather than what was meaningful. … to a man with a hammer, everything looks like a nail.”
Roger Lowenstein, Buffett: The Making of an American Capitalist
25. Problem: The Cult of the Algorithm
what can algos actually do?
“We have a new machine learning algo that anticipate your needs over time and behave accordingly”
26. Problem: What can Machine Learning Do?
what can algos actually do?
27. Demand Algos: Gas Station Analogy
Problem: where to open a gas station?
Need: good traffic, weak competition
[Diagram: fewer competitors but no traffic | sweet spot | great traffic but too many competitors]
all businesses balance supply and demand
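The gas station analogy can be sketched as a toy scoring rule. Everything here is an assumption made for illustration: the scoring formula (traffic divided by one plus the number of competitors), the function name `site_score`, and the traffic and competitor numbers. It is not the demand algorithm described in the talk.

```python
def site_score(daily_traffic, competitors):
    """Toy demand/supply score: traffic shared among the new station and its rivals."""
    return daily_traffic / (1 + competitors)


# Hypothetical candidate sites: (daily traffic, number of competing stations).
sites = {
    "fewer competitors, no traffic": (200, 0),
    "sweet spot": (5000, 1),
    "great traffic, too many competitors": (8000, 9),
}

for name, (traffic, rivals) in sites.items():
    print(f"{name}: {site_score(traffic, rivals):.0f}")

best = max(sites, key=lambda name: site_score(*sites[name]))
print("best site:", best)
```

The same shape applies to search: demand plays the role of traffic and the number of existing pages competing for a query plays the role of rival stations.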
28. SAAS Machine Learning Algos
machine learning contests
• Diabetic Retinopathy Detection: $100,000 • 167 teams
• March Machine Learning Mania 2015: $15,000 • 341 teams
32. Problem: Externalities
“Zynga is our best company ever!” (2010)
John Doerr, Google Investor, Legendary VC
http://venturebeat.com/2010/11/16/google-investor-john-doerr-zynga-is-our-best-company-ever/
one marketplace | big risks
33. Solution: Algorithmic Accountability
An asset is an economic resource. Anything tangible or intangible that is capable of being owned or controlled to produce value and that is held to have positive economic value is considered an asset.
algorithms can be valuable assets
34. Algorithmic Accountability
does revenue depend on hidden algos?
• WebMD Google SEO
• Amazon Product Listing Algo
• Pinterest Relevance Algo
• Twitter Spam filter
• Apple App Store Rankings
35. Algorithmic Accountability
do decisions depend on hidden factors?
A 'Crisis' in Online Ads: One-Third of Traffic Is Bogus
http://www.wsj.com/articles/SB10001424052702304026304579453253860786362
Now Algorithms Are Deciding Whom To Hire…
http://www.npr.org/blogs/alltechconsidered/2015/03/23/394827451/now-algorithms-are-deciding-whom-to-hire-based-on-voice
What you don’t know about Internet algorithms is hurting you…
http://www.washingtonpost.com/news/the-intersect/wp/2015/03/23/what-you-dont-know-about-internet-algorithms-is-hurting-you-and-you-probably-dont-know-very-much/
36. Solution: Algorithmic Transparency
can you be transparent and not be gamed?
How do you govern a (hidden, fluid and amoral) algorithm?
http://fortune.com/2015/03/18/how-do-you-govern-a-hidden-fluid-and-amoral-algorithm/
83% of the participants in the study changed their behavior once they knew about the algorithm
participants mistakenly believed that their friends intentionally chose not to show them stories
37. Algorithmic Accountability
Do you depend on someone else’s marketplace?
How does your revenue depend on algos?
Do you need an internal algo?
Who will manage it? build it? maintain it?
algorithms have unforeseen liabilities