Companies like Google, Microsoft, Amazon and Facebook are in fierce competition for teams that can build deep-learning applications. Because of deep learning's general usefulness in pattern recognition, those applications are surprisingly diverse, ranging from image recognition to machine translation. This talk will explore deep learning use cases for the major data types -- image, sound, text and time series -- as they're emerging in the private sector. Presented by Chris Nicholson, Co-Founder and CEO at Skymind.
Applied Machine Learning for the IoT - Data Science Pop-up Seattle - Domino Data Lab
The Internet of Things is about data, not things. Some forecast that by 2018 the number of connected things will exceed the combined number of personal computers, smartphones, and tablets. Each ’thing’ can produce a tremendous stream of data from sensors and other sources. This presentation will discuss progress, examples, challenges, and opportunities with machine learning for the IoT. A short presentation will be given on some recent applications of ML (using H2O) to the domains of machine prognostics / health management (PHM) and agriculture. Presented by Hank Roark, Data Scientist / Hacker at H2O.ai.
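The Reality of Data Scientist
Most of the overall analysis process is spent collecting and processing data.
And when it comes to applying data in an application, testing matters most.
Because these slides were prepared for an audience of ergonomics (human factors) majors, they focus on 'testing (online testing)' rather than 'data collection and cleansing'.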
How to Become a Data Scientist
SF Data Science Meetup, June 30, 2014
Video of this talk is available here: https://www.youtube.com/watch?v=c52IOlnPw08
More information at: http://www.zipfianacademy.com
Zipfian Academy @ Crowdflower
From the webinar presentation "Data Science: Not Just for Big Data", hosted by Kalido and presented by:
David Smith, Data Scientist at Revolution Analytics, and
Gregory Piatetsky, Editor, KDnuggets
These are the slides for David Smith's portion of the presentation.
Watch the full webinar at:
http://www.kalido.com/data-science.htm
Presentation at Data ScienceTech Institute campuses, Paris and Nice, May 2016, including Intro, Data Science History and Terms; 10 Real-World Data Science Lessons; Data Science Now: Polls & Trends; Data Science Roles; Data Science Job Trends; and Data Science Future
A brief introduction to data science and machine learning with an emphasis on application scenarios, from traditional ones to more innovative ones. The overview covers the basic definition of data science, an overview of machine learning, and examples of traditional scenarios, recommender systems and social network analysis, IoT, and deep learning.
A presentation delivered by Mohammed Barakat at the 2nd Jordanian Continuous Improvement Open Day in Amman. The presentation is about Data Science and was delivered on 3rd October 2015.
Keynote - An overview on Big Data & Data Science - Dr Gregory Piatetsky-Shapiro - Data ScienceTech Institute
Data Science Tech Institute - Big Data and Data Science Conference around Dr Gregory Piatetsky-Shapiro.
Keynote - An overview on Big Data & Data Science Dr Gregory Piatetsky-Shapiro - KDnuggets.com Founder & Editor.
Paris May 23rd & Nice May 26th 2016 @ Data ScienceTech Institute (https://www.datasciencetech.institute/)
Data science is the new thing! How to be a data scientist? See here.
This was originally written by the team behind DataCamp, the online interactive learning platform for data science!
Ordinary people include anyone who is not a geek like myself. This book is written for ordinary people. That includes managers, marketers, technical writers, couch potatoes and so on.
Data Science and Analytics for Ordinary People is a collection of blogs I have written on LinkedIn over the past year. As I continue to perform big data analytics, I continue to discover not only my weaknesses in communicating the information but also new insights into using and communicating the information obtained from analytics. These are the kinds of things I blog about, and they are contained herein.
Big Data & Machine Learning - TDC2013 Sao Paulo - OCTO Technology
BigData and Machine Learning: Usage and Opportunities for your IT department
Talk presented at The Developer Conference in São Paulo - 12/07/13
Mathieu DESPRIEE
Applications of Machine Learning at USC presentation by Alex Tellez
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Curious about Data Science? Self-taught on some aspects, but missing the big picture? Well, you’ve got to start somewhere and this session is the place to do it.
This session will cover, at a layman’s level, some of the basic concepts of Data Science. In a conversational format, we will discuss: What are the differences between Big Data and Data Science – and why aren’t they the same thing? What distinguishes descriptive, predictive, and prescriptive analytics? What purpose do predictive models serve in a practical context? What kinds of models are there and what do they tell us? What is the difference between supervised and unsupervised learning? What are some common pitfalls that turn good ideas into bad science?
During this session, attendees will learn the difference between k-nearest neighbor and k-means clustering, understand the reasons why we do normalize and don’t overfit, and grasp the meaning of No Free Lunch.
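The distinctions named in this abstract can be made concrete with a small sketch. The code below is an illustration written for this listing, not material from the session: the function names, the sample points, and the naive "first k points" initialization are all assumptions. It shows k-nearest neighbor as a supervised method (it needs labels), k-means as an unsupervised one (it does not), and min-max normalization as one way to keep a single large-valued feature from dominating the distance.

```python
import math
from collections import Counter


def min_max_scale(points):
    """Rescale each feature to [0, 1] so no single feature dominates the distance."""
    lows = [min(dim) for dim in zip(*points)]
    highs = [max(dim) for dim in zip(*points)]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]


def knn_predict(train, labels, query, k=3):
    """Supervised: majority vote among the k labeled points closest to the query."""
    nearest = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def kmeans(points, k=2, iterations=10):
    """Unsupervised: group unlabeled points around k centroids (Lloyd's algorithm)."""
    centroids = [list(p) for p in points[:k]]  # naive initialization (assumption)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            closest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[closest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical sample data: two well-separated groups.
    points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(knn_predict(points, labels, (2.0, 2.0)))  # uses the labels
    print(kmeans(points, k=2)[0])                   # ignores the labels
```

Both functions rely on the same Euclidean distance; the difference is only whether labels are available to vote with. In practice a library implementation would normally be preferred over a hand-rolled one like this.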
Big Data [sorry] & Data Science: What Does a Data Scientist Do? - Data Science London
What 'kind of things' does a data scientist do? What are the foundations and principles of data science? What is a Data Product? What does the data science process look like? Learning from data: Data Modeling or Algorithmic Modeling? - talk by Carlos Somohano @ds_ldn at The Cloud and Big Data: HDInsight on Azure London 25/01/13
Deep Learning - The Past, Present and Future of Artificial Intelligence - Lukas Masuch
In the last couple of years, deep learning techniques have transformed the world of artificial intelligence. One by one, the abilities and techniques that humans once imagined were uniquely our own have begun to fall to the onslaught of ever more powerful machines. Deep neural networks are now better than humans at tasks such as face recognition and object recognition. They’ve mastered the ancient game of Go and thrashed the best human players. “The pace of progress in artificial general intelligence is incredible fast” (Elon Musk – CEO Tesla & SpaceX) leading to an AI that “would be either the best or the worst thing ever to happen to humanity” (Stephen Hawking – Physicist).
What sparked this new hype? How is Deep Learning different from previous approaches? Let’s look behind the curtain and unravel the reality. This talk will introduce the core concept of deep learning, explore why Sundar Pichai (CEO Google) recently announced that “machine learning is a core transformative way by which Google is rethinking everything they are doing” and explain why “deep learning is probably one of the most exciting things that is happening in the computer industry” (Jen-Hsun Huang – CEO NVIDIA).
A map, circa 2012, of the Enterprise Products Partners TE Products Pipeline Company (TEPPCO) refined products pipeline that runs from the Gulf Coast to the Midwest and Northeast. TEPPCO ships refined products like gasoline along with natural gas liquids like propane. On Feb. 7, 2014 the Federal Energy Regulatory Commission (FERC) for the first time ever exercised its emergency authority and directed Enterprise, the owner of TEPPCO, to prioritize propane shipments for at least a week, to help alleviate the growing crisis of propane shortages in the Midwest.
The lessons imparted in Fish! are very simple and provocatively empowering. The authors assert that employees have the power to change their own attitudes about their work, creating a positive environment that makes them not only more productive but also happier in their lives.
Clearly, these are not novel ideas, but what gives them power is that Fish! uses as an example of this management technique the Pike Place fish market in Seattle, which is known for its lively fishmongers tossing fish to and fro. The idea is that if these workers, who have very difficult jobs that are not particularly lucrative, can maintain an amazingly motivated attitude about their work, then so can anyone.
View this webinar with HubSpot's Chief Marketing Officer, Mike Volpe, to find out how you can learn from what HubSpot did and get your company onto a future Inc 500 list and become one of the fastest growing companies: http://www.hubspot.com/how-to-make-the-inc-500-list/
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G... - MaRS Discovery District
Deep learning is changing the field of artificial intelligence and revolutionizing our online experience, with applications including speech and image recognition. Information and communications technology giants such as Google, Facebook, IBM and Baidu, among others, are rapidly deploying deep learning into new products and services.
Behind all of the present-day excitement about deep learning are years of high risk and hard work by a small group of eminent computer scientists and theorists connected through the Canadian Institute for Advanced Research (CIFAR).
It has been said that Mobile + Cloud + Social + Big Data = Better Run the World. IBM has invested over $20 billion since 2005 to grow its analytics business, and many companies will invest more than $120 billion by 2015 on analytics, hardware, software and services critical in almost every industry, such as healthcare, media, sports, finance and government.
It has been estimated that there is a shortage of 140,000 – 190,000 people with deep analytical skills to fill the demand of jobs in the U.S. by 2018.
Decoding the human genome originally took 10 years to process; now it can be achieved in one week with the power of analytics and BI (Business Intelligence). This lecture’s key message is that analytics provides a competitive edge to individuals, companies and institutions, and that analytics and BI are often critical to the success of any organization.
The methodology is to teach analytic techniques through real-world examples and real data, with the goal of convincing the audience of the analytics edge and the power of BI, and inspiring them to use analytics and BI in their careers and their lives.
Using past statistical data and the present situation, we can predict the future with data science.
This small PowerPoint presentation is given by P. Nikhil and D. Dhanunjaya Rao from Government College, Rajahmundry.
We hope it is useful for future generations.
Thank you.
Disruptive technologies like Data Science and Knowledge Automation are projected to have an economic impact of trillions of dollars in the next decade.
This presentation was given at the Dallas Tableau User Group on Oct 29, 2013 and
How will AI and analytics change life in the next 25 years? In this episode, we look forward to the next 25 years and will share predictions about the technological innovations prevalent then based on a projection of AI and analytics forward.
DBMSs, database models, digital economy, data mining, information systems, artificial intelligence, database ethics, security issues and safeguards, quality of life, economic and political issues
Maritime Information Warfare - The Human Dimension - Andy Fawkes
Presented at SMi's Inaugural Maritime Information Warfare Conference, London, 6/7 December 2017. A perspective on the modern sailor, training and simulation, training data, and defence organisational challenges.
COMEX2017 Smart Talks by Amjid Ali, Muscat, Oman. Covers an introduction to big data, big data definitions, the big data revolution, the big data timeline, Hadoop and MapReduce, the importance of storage and DNA, Oceanstore 9000, Microsoft R, Spark,
Presentation that I delivered at "Accelerate AI, Europe 2018" in London on Sept 19, 2018. My focus is on the socio-cultural perspective as well as providing information about the various tools, vendors and partners available to help companies get started using AI.
CASA UASSC Meeting May 2016 - Presentation by Industry Chair, Terry Martin - Terrence Martin (PhD)
This is the presentation given by Dr Terry Martin at the CASA UASSC in May 2016. He is the current Industry Chair. Note that the original presentation contains extensive animation. To preserve the animation effect, the presentation has been split into individual slides; however, this now makes it more than 270 slides. This may be off-putting to some!
Please note that the contents are the view of Dr Martin and are not necessarily endorsed by CASA.
The presentation covers 5 key areas:
Part 1 (Slides 1-65) covers some innovative UAV applications, such as NASA UTM and Google Loon, before discussing the impact of automation on the jobs market, internationally and nationally. The key point is that disruption and automation put many jobs at risk, and Australia could be vulnerable if it continues on its current path.
Part 2 (Slides 67-82) provides an overview of the recently amended RPAS regulatory framework.
Part 3 (Slides 83-171) works through what's broken in the current RPAS UASSC before describing the efforts of Europe and the US, the implications of Australia having no UAV roadmap, and the need for a pragmatic appreciation of our national capacity, particularly when it comes to the expertise necessary for making assessments about risk, safety and equipage requirements.
Part 4 (Slides 172-243) outlines selected elements of Detect and Avoid and Command and Non Payload Communications.
Part 5 (Slides 244-286) proposes steps we could take to identify our priorities, identify the operational and technical shortcomings that are hindering UAV integration, and work with industry more effectively to achieve that end state. This includes the restructure plans for the UASSC Working Groups.
Technology Through the Looking Glass: 2013-2020 - Peter Crosby
The news is filled with stories of companies promising to “disrupt” this technology or that market. Growing trends such as mobile, apps, BYOD, open source, MOOCs, Vine-video, Social TV, 'big data,' compete for our attention and understanding. Microsoft is finally in the cloud, YouTube adds 100 hours of video per minute, Google is making devices like 'Glass,' Twitter is truly revolutionary, and Facebook may be competing with them all. Yet some of the biggest social impacts are due to much lower technologies such as sms mapping, micro-payments, mobile health. Don’t miss this look into the future from two provocative thought leaders!
Delivered by Dan Callahan (CGNET) & Peter S. Crosby (Dotsub) at InsideNGO: Operational Excellence for Global Impact conference <www.insidengo.org> on July 31, 2013, in Washington, DC
Similar to Deep Learning Use Cases - Data Science Pop-up Seattle (20)
What's in your workflow? Bringing data science workflows to business analysis... - Domino Data Lab
While business analysis rapidly grows more data-driven, the analyst community is slow to adopt the best practices of data science workflows. Many parallels exist between data science “top topics” (e.g. reproducibility) and business pain points, but these common needs are obscured by the different “languages” of these two communities. The opportunity cost is greatest in heavily regulated industries such as finance and insurance where documentation and compliance are paramount.
In this talk, we will review our experience transitioning Capital One business analysts from legacy systems to open-source workflows by developing user-friendly tools. We incentivized business analysts to adopt the data science mindset by curating open-source tools and developing code packages which simplify workflows and eliminate pain points.
Our internal R package, tidycf, reimagines cumbersome Excel cashflow statements as dataframes and uses RMarkdown templates and the RStudio IDE for an intuitive, user-friendly experience without the overhead of maintaining a custom GUI. We tackle challenges in documentation and communication while immersing new users in the R language.
We will share best practices and lessons learned from our experience designing tools for non-technical end-users, standardizing workflows based on the RStudio IDE’s infrastructure, and evangelizing data science methods.
The Proliferation of New Database Technologies and Implications for Data Scie... - Domino Data Lab
In this talk, we’ll describe NoSQL (“not-only SQL”) and document-oriented databases and the value they provide for data science companies like Uptake. We will walk through the unique challenges such datastores pose for data science workflows. To make these challenges and lessons learned concrete, we’ll explore data science workflows through a discussion of the development efforts that led to “uptasticsearch”, an R package released by the Uptake Data Science team to reduce friction in interacting with a document store called Elasticsearch. The talk will conclude with a discussion of recent developments in NoSQL technologies and implications for data scientists.
Racial Bias in Policing: an analysis of Illinois traffic stops data - Domino Data Lab
Since 2004, Illinois has collected demographic information about traffic stops conducted by police in an effort to identify racial bias. This data has been used by groups such as the ACLU and the Stanford Open Policing Project to identify key markers that indicate racial bias in policing. We have applied exploratory data analysis to investigate whether systemic racial bias may appear and to what extent. This talk will walk the audience through the insights gleaned from the exploration of this data along with the challenges posed and ongoing questions raised.
Data Quality Analytics: Understanding what is in your data, before using it - Domino Data Lab
Analytics and data science are ever-growing fields, as business decision makers continue to use data to drive decisions. The pinnacle of these fields is the models and their accuracy/fit, but what about the data? Is your data clean, and how do you know that? Our discussion will focus on best practices for data preprocessing for analytic uses, beginning with essential distributional checks of a dataset and moving to a proposed method for an automated data validation process during ETL for transactional data.
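As a rough illustration of what "distributional checks" and automated validation during ETL could look like, the sketch below profiles a numeric column and compares a new batch against a baseline profile. It is not the method proposed in the talk: the function names, the 5% missing-value threshold, and the crude mean-shift test (batch mean measured in baseline standard deviations) are all assumptions chosen for brevity.

```python
import math
import statistics


def profile_column(values):
    """Summarize one numeric column: missing rate, range, mean, standard deviation."""
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    return {
        "count": len(values),
        "missing_rate": 1 - len(present) / len(values) if values else 0.0,
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": statistics.fmean(present) if present else None,
        "stdev": statistics.stdev(present) if len(present) > 1 else None,
    }


def validate_batch(values, baseline, max_missing_rate=0.05, z_limit=3.0):
    """Return human-readable issues found in a new batch, relative to a baseline profile.

    Both thresholds are hypothetical defaults, not recommendations.
    """
    issues = []
    profile = profile_column(values)
    if profile["missing_rate"] > max_missing_rate:
        issues.append(f"missing rate {profile['missing_rate']:.0%} exceeds limit")
    if profile["mean"] is not None and baseline["stdev"]:
        # Crude heuristic: how many baseline standard deviations did the mean move?
        shift = abs(profile["mean"] - baseline["mean"]) / baseline["stdev"]
        if shift > z_limit:
            issues.append(f"mean shifted by {shift:.1f} baseline standard deviations")
    return issues


if __name__ == "__main__":
    baseline = profile_column([10.0, 11.0, 9.0, 10.5, 9.5])   # hypothetical history
    print(validate_batch([10.2, 9.8, 10.1, 9.9], baseline))   # a clean batch
    print(validate_batch([50.0, 52.0, None, 49.0], baseline))  # a suspicious batch
```

A check like this would typically run as a gate in the ETL job, so that a batch with unexpected missing values or a shifted distribution is flagged before it reaches a model.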
Supporting innovation in insurance with randomized experimentation - Domino Data Lab
Recent technological advances, a dynamic competitive landscape, and an evolving regulatory environment have led to a period of rapid innovation for many insurance providers. Here, we’ll explore how data scientists may use randomized experiments to rigorously assess the causal impact of innovations on business outcomes. Particular emphasis will be placed on experimentation in “offline” channels, with some of the challenges and mitigation strategies highlighted.
Leveraging Data Science in the Automotive Industry - Domino Data Lab
Cars.com Inc. is a decision engine for car buyers and a growth engine for our partners. Data Science is the bread and butter of any decision engine and Cars is no different. In this talk, I will discuss how we quantify various parameters of a car and plan to make use of all the data in hand to put predictive models at various stages of a user’s automobile lifecycle. This talk will also cater to students looking to gain knowledge on how data science is utilized at scale while still following certain processes and leading the way for business and product partners.
Summertime Analytics: Predicting E. coli and West Nile Virus - Domino Data Lab
Lake Michigan and outdoor recreation are enjoyable aspects of summers in Chicago, but they can come with the risk of E. coli in Lake Michigan or West Nile Virus from mosquitos. This summer, the City of Chicago launched two new predictive analytics projects to forecast these risks and proactively limit them. Members of the research team, Gene Leynes and Nick Lucius, discuss the projects and how they’re being used as part of city operations.
Today, more than ever before, maps are being used to bring data to life. In this presentation I will demonstrate how geoviz can make data science more tangible by providing an interactive canvas for spatial data. Gregory Brunner will show several examples of how maps are being used to enhance how we communicate data and how this applies across all scales, including spatial, temporal, and size of data.
What you will learn:
GOALS - What is the bar for data science teams
PITFALLS - What are common data science struggles
DIAGNOSES - Why so many of our efforts fail to deliver value
RECOMMENDATIONS - How to address these struggles with best practices
Presented by Mac Steele
Director of Product at Domino Data Lab
Doing your first Kaggle (Python for Big Data sets) - Domino Data Lab
You love python. You love Data Science. But the size of your data set keeps crashing your code. Is it time to bring in big data tools or simply code smarter? Lee is going to show you efficiency hacks, drawn from top Kaggle competitors, to get python to work on large data sets. Skip the hassle of creating a Big Data infrastructure. Let’s find out how far we can push our home laptop first.
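The abstract does not list the specific techniques covered, so the following is only one commonly used approach, offered as an assumption: stream the file row by row and keep running totals, rather than loading the whole data set into memory. The function name, column names, and sample data are hypothetical.

```python
import csv
import io
from collections import defaultdict


def streaming_group_mean(rows, key, value):
    """Compute per-group means in one pass, holding only running totals in memory."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:  # rows can be any iterator, e.g. csv.DictReader over a huge file
        totals[row[key]] += float(row[value])
        counts[row[key]] += 1
    return {group: totals[group] / counts[group] for group in totals}


if __name__ == "__main__":
    # An in-memory stand-in for a large CSV file (hypothetical columns).
    sample = io.StringIO("store,sales\nA,10\nB,4\nA,20\nB,6\n")
    print(streaming_group_mean(csv.DictReader(sample), "store", "sales"))
```

Because `csv.DictReader` yields one row at a time, memory use depends on the number of distinct groups, not on the number of rows, which is often enough to keep a laptop from running out of memory.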
Most analytics modeling work today focuses on the production of single-purpose "artisanal" models for predictions. This approach to analytics is fragile with respect to model consistency, reorganization, and resource availability. This talk will argue that instead the focus of analytics modeling should be toward the production of analytics interchangeable parts, which can be combined in creative ways to produce a wide variety of analytics results. This "nuts and bolts" approach allows analytics groups to produce results in an agile way where the time between ask and answer is determined by the right combination of analytics, rather than the modeling.
How I Learned to Stop Worrying and Love Linked Data - Domino Data Lab
In this presentation, Jon Loyens will share:
-Best practices for sharing context and knowledge about your data projects
-How linked data can augment your existing data science workflow and toolchain to accelerate your work
-How a social network can unlock the power of Linked Data and data collaboration
-How Linked Data can help you easily combine private and Open Data for fun and profit
Although both disciplines are unique in their own ways, Software Engineering and Data Science make heavy use of programming languages to do their respective jobs. Data Science is a relatively new discipline and many of its practitioners have not previously been professional software engineers. There are a few techniques that Data Scientists can borrow from Software Engineering to make their tooling and environments faster to design, more easily debugged and, most importantly, clearer to read. This talk will go over some practical tips that anyone can use to better understand their code, give clarity around cloud environments and their uses and drawbacks, and finally touch briefly on the Software Development Lifecycle.
Within marketing research, big data is often described as being “census” data for the population that it represents. The devil is in the details and when we take a closer look we can see that this isn’t the case. There are many situations that are not captured within the population that big data purports to be a census of. Big data isn’t even a census of itself since it’s not uncommon for records to be excluded either by accident during the collection process or by design in the cleaning process. Unfortunately, our industry is so enamored with the size of big data that some users of data are willing to trade off precision for tonnage. Fortunately, if the shortcomings of big data are understood and corrected it can accurately represent the population that it measures in the correct proportion to the universe. We will discuss a method that Nielsen has developed called “Common Homes” that is designed to identify and correct the shortcomings of big data sets that represent media consumption.
Moving Data Science from an Event to A Program: Considerations in Creating Su... - Domino Data Lab
The exponential growth of Big Data and Analytics has outpaced the ability of organizations to govern their data appropriately. The ability to reuse the work done by data scientists is becoming an economic necessity. The mix of data sources is changing from traditional transactional and ERP systems to include a mix of structured, semi-structured and unstructured data. Data Governance needs to adapt to these changes. This session discusses these data changes and proposes how to adapt current data governance processes, including how the concept of a stakeholder has changed and the need to expand communications and content management. We look at the need to consolidate data from disparate systems and how it is governed. Lastly, we will investigate how context is emerging as an important factor in governance and how it can be leveraged to provide for accurate, reliable data reuse.
Building Data Analytics pipelines in the cloud using serverless technologyDomino Data Lab
Big Data analytics is well known to uncover hidden insights that gives an organization an edge over the competition. But data does not need to be big in order to be useful. Smaller companies and startups may lack the volume of data that qualifies as big data, yet the variety of data can still yield a trove of insights that helps in driving the business strategies of a company. Startups may also lack the resources to fund an additional, seemingly expensive development project. The key is in simplicity, start small, simple and architect for scalability and performance. But how do you start? In this presentation, we share our experience in building a cost effective, AWS serverless data analytics platform that became an invaluable tool for sales, marketing and operational efficiencies.Serverless architectures simplify development work where servers and software are managed by a third party cloud provider. Developers can focus on just building the data wrangling and data analysis logic where critical aspects like scalability and high availability are guaranteed by the cloud provider. Besides, serverless services offer the pay as you go model, where you pay only based on the amount of resources you use. This turns out to be another attractive aspect where costs can be managed based on the usage. In this presentation we will focus on techniques and best practices to build a big data analytics platform using AWS serverless services like Lambda, DynamoDB, S3, Kinesis, Athena, QuickSight and Amazon ML. We will highlight the strengths of each of these services and what role each plays in the data analytics pipeline. We compare and contrast these services with some of the other popularly used big data technologies like Hadoop, Spark and Kafka. We also demonstrate the usage of these services to build intelligent components that detect anomalies, yield recommendations, simulate chat bots and generate predictive analytics.
Leveraging Open Source Automated Data Science Tools
Domino Data Lab
The data science process seeks to transform and empower organizations by finding and exploiting market inefficiencies and potentially hidden opportunities, but this is often an expensive, tedious process. However, many steps can be automated to provide a streamlined experience for data scientists. Eduardo Arino de la Rubia explores the tools being created by the open source community to free data scientists from tedium, enabling them to work on the high-value aspects of insight creation and impact validation.
The promise of the automated statistician is almost as old as statistics itself. From the creation of vast tables, which saved the labor of calculation, to modern tools that automatically mine datasets for correlations, there has been considerable advancement in this field. Eduardo compares and contrasts a number of open source tools, including TPOT and auto-sklearn for automated model generation and scikit-feature for feature generation and other aspects of the data science workflow, evaluates their results, and discusses their place in the modern data science workflow.
Along the way, Eduardo outlines the pitfalls of automated data science and applications of the “no free lunch” theorem and dives into alternate approaches, such as end-to-end deep learning, which seek to leverage massive-scale computing and architectures to handle automatic generation of features and advanced models.
The Role and Importance of Curiosity in Data Science
Domino Data Lab
by Alfred Lee
Lead Data Scientist, White Ops
Is curiosity useful for more than serendipitous discovery? Can curiosity be taught? How do I foster curiosity in my team? Can someone be too curious? Questions!
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
Abstract: Levelwise PageRank is an alternative method of PageRank computation that decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method, where all vertices are processed in each iteration. It does, however, require that the input graph contain no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by the submission of a large number of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
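The levelwise scheme described in this abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the report's implementation: the function names, the damping factor of 0.85, the tolerance, and the five-vertex example graph are all hypothetical, and the self-loop on vertex "e" is just one simple way to make sure the example has no dead ends. The sketch finds strongly connected components with Kosaraju's algorithm, groups them into levels of the block-graph, and then iterates PageRank one level at a time while ranks from earlier levels stay fixed.

```python
def finish_order(graph):
    """Vertices in order of DFS completion (iterative, to avoid recursion limits)."""
    seen, order = set(), []
    for root in graph:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(graph[root]))]
        while stack:
            vertex, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # every successor has been explored
                order.append(vertex)
                stack.pop()
    return order


def build_levels(graph):
    """Return (levels, reverse): vertices grouped by block-graph level, plus in-edges."""
    reverse = {v: [] for v in graph}
    for v, outs in graph.items():
        for w in outs:
            reverse[w].append(v)

    # Kosaraju's second pass: components come out in topological order, sources first.
    comp_of, components = {}, []
    for root in reversed(finish_order(graph)):
        if root in comp_of:
            continue
        comp_of[root] = len(components)
        members, stack = [root], [root]
        while stack:
            v = stack.pop()
            for u in reverse[v]:
                if u not in comp_of:
                    comp_of[u] = comp_of[root]
                    members.append(u)
                    stack.append(u)
        components.append(members)

    # Level of a component = length of the longest block-graph path reaching it.
    level = [0] * len(components)
    for cid, members in enumerate(components):
        for v in members:
            for w in graph[v]:
                if comp_of[w] != cid:
                    level[comp_of[w]] = max(level[comp_of[w]], level[cid] + 1)
    levels = [[] for _ in range(max(level) + 1)]
    for cid, members in enumerate(components):
        levels[level[cid]].extend(members)
    return levels, reverse


def iterate(vertices, graph, reverse, rank, damping=0.85, tol=1e-10, max_iter=500):
    """Update `rank` in place for `vertices` only; all other ranks stay fixed."""
    base = (1.0 - damping) / len(graph)
    for _ in range(max_iter):
        new = {v: base + damping * sum(rank[u] / len(graph[u]) for u in reverse[v])
               for v in vertices}
        change = sum(abs(new[v] - rank[v]) for v in vertices)
        rank.update(new)
        if change < tol:
            break


def monolithic_pagerank(graph):
    """Standard method: every vertex is processed in every iteration."""
    _, reverse = build_levels(graph)
    rank = {v: 1.0 / len(graph) for v in graph}
    iterate(list(graph), graph, reverse, rank)
    return rank


def levelwise_pagerank(graph):
    """Converge one level at a time, in topological order of the block-graph."""
    levels, reverse = build_levels(graph)
    rank = {v: 1.0 / len(graph) for v in graph}
    for level in levels:
        iterate(level, graph, reverse, rank)
    return rank


# Hypothetical example: components {a, b} -> {c, d} -> {e}; "e" has a self-loop.
EXAMPLE = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c", "e"], "e": ["e"]}
```

Calling `levelwise_pagerank(EXAMPLE)` and `monolithic_pagerank(EXAMPLE)` should give the same ranks up to the convergence tolerance, because each vertex's rank depends only on vertices in its own level or in earlier ones.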
2. #datapopupseattle
UNSTRUCTURED
Data Science POP-UP in Seattle
www.dominodatalab.com
Produced by Domino Data Lab
Domino’s enterprise data science platform is used
by leading analytical organizations to increase
productivity, enable collaboration, and publish
models into production faster.
3. DEEP LEARNING IS
MACHINE PERCEPTION FOR…
TEXT
• CRM
• SEARCH +
• ADS
SOUND
• VOICE SEARCH
• MUSIC GEN.
• TRANSLATION
TIME SERIES
• HEALTH DATA
• SENSORS
• FINANCE
IMAGES
• FACES
• SELF-DRIVING VEHICLES
4. RECORD-BREAKING ACCURACY
• FACIAL RECOGNITION = 97% accuracy
• GENERAL IMAGE RECOG. = 93%
• SPEECH RECOGNITION = 81%
• VIDEO ACTIVITY RECOG. = 52% - 94%
(Varies by dataset)
• TEXT CLASSIFICATION = 94%
5. NEURAL NETWORKS ARE OLD
1943 – First electrical model of a neural network
1958 – Perceptron (Rosenblatt)
1986 – Backpropagation
1990s – Convolutional networks for images (LeCun)
2006 – Deep-belief networks (Hinton)
>> THEN SOMETHING HAPPENED <<
2013 – Google hires Geoff Hinton
Facebook hires Yann LeCun
2014 – Google buys DeepMind for $600 million
2015 – IBM partners with Yoshua Bengio
6. WHAT HAPPENED? WHY NOW?
DEEP LEARNING
WORKS FOR
ENTERPRISE
MORE
DATA
FASTER
HARDWARE
BETTER
ALGORITHMS
8. "The biggest disruptor that we are sure about is
the arrival of big data and machine
intelligence. This disruption will not only change
every business globally, it will also have an
important impact on the consumer."
Google Chairman Eric Schmidt
THE BIGGEST DISRUPTOR
10. • “A breakthrough in machine learning would be
worth 10 Microsofts.” - Bill Gates, Microsoft
• “Machine learning is the next Internet.”
- Tony Tether, DARPA Director
• “Web rankings today are mostly a matter of
machine learning.” - Prabhakar Raghavan,
Yahoo Director of Research
• “Machine learning is going to result in a real
revolution.”
- Greg Papadopoulos, Sun CTO
• “Machine learning is today’s discontinuity.”
- Jerry Yang, Yahoo
MACHINE LEARNING’S
BREAKTHROUGH IS DEEP…
26. WHAT WILL YOU COUNT TODAY?
SURVEYS
• Land use
• Urban development
• Property lines
• Crop growth
SURVEILLANCE
• Large-scale human movements (migrants, commuters)
• Campus and on-site security
• National Parks (poaching)
PAYLOADS
• Goods delivered
• Rescue ops in mountains, at sea
47. THE FUTURE OF ENTERPRISE AI
An open-source framework.
Open source dominates OS with Linux,
and big data with Hadoop.
Open source will win AI with
Deeplearning4j:
• Distributed deep learning on GPUs
• Serving 10M Java/Scala programmers