Dr. Kirk Borne is a Principal Data Scientist at Booz Allen Hamilton. With a rich background in Astrophysics and Computational Science, he was a precursor on implementing courses of big data in academia. He is one of the most important promotors of data literacy in the world.
About Kirk and his view on data literacy and evolution
On his first visit to Brussels, Kirk first activity was sharing his best practices to promote data literacy. While enjoying a magnificent view of Brussels from the ING headquarter building, Kirk playfully (with a pair of socks!) explained how subjectivity plays a major role in the way that data is understood, derived by the wide variety of involved. This keynote was delivered at the speakers reception, which took place the day before the DI Summit.
The following day, Kirk wrapped up the DI summit with his closing keynote on how data has shifted into something that is sense-making, following the evolution from “data” to “big data” into “smart data” composed by both enriched and semantic data and essential for IoT. He also discussed the levels of maturity in a self-driving enterprise, wrapping up his participation sharing this equation:
Big Data + IoT + Citizen Data Scientists = Partners in Sustainability
Kirk’s impression on the DI Summit was that it was a fun and informative event to join. His favorite format were the 5” pitches, as they were properly structured, providing the most critical information to the attendees. He also think that the networking dynamic ensured that all attendees met interesting people.
A takeaway from Kirk’s presentation
“Big data is not about how big it is, but the value you extract from it”
We look forward to have Kirk sometime soon back in Brussels!
Kirk’s interview:
Kirk’s presentation recording:
Kirk’s decks:
Kirk’s presentation drawing:
2) Here are some video interviews that I have done:
https://www.youtube.com/watch?v=ku2na1mLZZ8
https://www.youtube.com/watch?v=iXjvht91nFk
Here is my TedX talk: https://www.youtube.com/watch?v=Zr02fMBfuRA
DISUMMIT - Rishi Nalin Kumar from DatakindDigitYser
Rishi Nalin Kumar
Chief Scientist at eBench
Half professional, half collaborator, one quarter mathematician. Currently at eBench helping brands understand their consumers and win with their content. Previously leading data science & analytics in large-corporate consumer goods with a light touch of news & media. Proud volunteer at DataKind and a regular on the data & analytics speaker circuit.
By definition, “big data” involves large volumes of diverse data sources.
Considering all the data that your activities generate and that 99% of this data is irrelevant “noise,” business users and stakeholders have to struggle to understand your company’s status.
See how a business perspective on your big, small or just complex data will generate business value.
Is big data just a buzzword -Big data simply explainedVivek Srivastava
Big data helps us to uncover and discover those facets of data which we are not aware of . Using predictive science it helps us to provide insights on which actions can be taken and suggests those actions which will impact the business significantly boosting the revenue or market reach.For example, using large amount of data and appropriate tools, we can categorize different strata of population and build customize products. So whether companies deploy it or not, all depends on what factor constitute the value of company and where the center of value creation lies. It may be money or it may be geographic reach. - Watch this video at https://www.youtube.com/watch?v=ELyOl0fkqNM
Adatao Keynote Address @ UIUC Research Park Big-Data Summit, December 6, 2013
We were invited to give the Keynote address at the UIUC Research Park Big-Data Summit. We talked about (a) Why Big Data, (b) Big-Data Success Factors, and (c) The Future of Big Data. We also showed how Adatao approaches Big Data analysis for business users, via a beautiful, easy-to-use yet powerful, interactive web application.
Trends in Big Data & Business Challenges Experian_US
Join our #DataTalk on Thursdays at 5 p.m. ET. This week, we tweeted with Sushil Pramanick – who is the founder and president of the The Big Data Institute (TBDI).
You can learn about upcoming chats and see the archive of past big data tweetchats here
http://www.experian.com/blogs/news/about/datadriven
Content:
Introduction
What is Big Data?
Big Data facts
Three Characteristics of Big Data
Storing Big Data
THE STRUCTURE OF BIG DATA
WHY BIG DATA
HOW IS BIG DATA DIFFERENT?
BIG DATA SOURCES
BIG DATA ANALYTICS
TYPES OF TOOLS USED IN BIG-DATA
Application Of Big Data analytics
HOW BIG DATA IMPACTS ON IT
RISKS OF BIG DATA
BENEFITS OF BIG DATA
Future of big data
DISUMMIT - Rishi Nalin Kumar from DatakindDigitYser
Rishi Nalin Kumar
Chief Scientist at eBench
Half professional, half collaborator, one quarter mathematician. Currently at eBench helping brands understand their consumers and win with their content. Previously leading data science & analytics in large-corporate consumer goods with a light touch of news & media. Proud volunteer at DataKind and a regular on the data & analytics speaker circuit.
By definition, “big data” involves large volumes of diverse data sources.
Considering all the data that your activities generate and that 99% of this data is irrelevant “noise,” business users and stakeholders have to struggle to understand your company’s status.
See how a business perspective on your big, small or just complex data will generate business value.
Is big data just a buzzword -Big data simply explainedVivek Srivastava
Big data helps us to uncover and discover those facets of data which we are not aware of . Using predictive science it helps us to provide insights on which actions can be taken and suggests those actions which will impact the business significantly boosting the revenue or market reach.For example, using large amount of data and appropriate tools, we can categorize different strata of population and build customize products. So whether companies deploy it or not, all depends on what factor constitute the value of company and where the center of value creation lies. It may be money or it may be geographic reach. - Watch this video at https://www.youtube.com/watch?v=ELyOl0fkqNM
Adatao Keynote Address @ UIUC Research Park Big-Data Summit, December 6, 2013
We were invited to give the Keynote address at the UIUC Research Park Big-Data Summit. We talked about (a) Why Big Data, (b) Big-Data Success Factors, and (c) The Future of Big Data. We also showed how Adatao approaches Big Data analysis for business users, via a beautiful, easy-to-use yet powerful, interactive web application.
Trends in Big Data & Business Challenges Experian_US
Join our #DataTalk on Thursdays at 5 p.m. ET. This week, we tweeted with Sushil Pramanick – who is the founder and president of the The Big Data Institute (TBDI).
You can learn about upcoming chats and see the archive of past big data tweetchats here
http://www.experian.com/blogs/news/about/datadriven
Content:
Introduction
What is Big Data?
Big Data facts
Three Characteristics of Big Data
Storing Big Data
THE STRUCTURE OF BIG DATA
WHY BIG DATA
HOW IS BIG DATA DIFFERENT?
BIG DATA SOURCES
BIG DATA ANALYTICS
TYPES OF TOOLS USED IN BIG-DATA
Application Of Big Data analytics
HOW BIG DATA IMPACTS ON IT
RISKS OF BIG DATA
BENEFITS OF BIG DATA
Future of big data
Introduction to big data for the EA course at Solvay MBAWim Van Leuven
Introduction to what is big data, what can it do and not do, the importance of datascience and how to architect big data solutions (lambda architecture)
Big Data Applications | Big Data Application Examples | Big Data Use Cases | ...Simplilearn
In this Big Data presentation, we will be discussing the Big data growth over the last few years followed by the various big data applications. We will look into the various sectors where big data is used such as weather forecast, healthcare, media and entertainment, logistics, travel & tourism and finally in the government & law enforcement sector.
We will be discussing how below industries are using Big Data presentation:
1. Weather forecast
2. Media and entertainment
3. Healthcare
4. Logistics
5. Travel n tourism
6. Government and law enforcement
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course have been designed to impart an in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
This course will enable you to:
1. Understand the different components of Hadoop ecosystem such as Hadoop 2.7, Yarn, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro Schema, using Arvo with Hive, and Sqoop and Schema evolution
7. Understand Flume, Flume architecture, sources, flume sinks, channels, and flume configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand resilient distribution datasets (RDD) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL, creating, transforming, and querying Data frames
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
This presentation offers a basic understanding of Big Data. It does this by defining Big Data, offers a History of Big Data, Big Data by the Numbers and the 8 Laws of Big Data
Predictive Modeling & Data-Driven Product Insights at LinkedIn - Scott Nichol...Scott Nicholson
Talk given at Advanced Analytics & Big Data Forum conference in San Francisco on April 25, 2012.
Abstract: Data on 150+ million professionals' careers and networks provide a fascinating playground for analysts to discover data insights about career trends, the social web and the economy. This talk will focus on how insights extracted from the LinkedIn dataset enable individuals with limited information the ability to make better decisions about their professional lives. In the course of this theme we will discuss data tools, insights and approaches to predictive modeling in the context of the LinkedIn dataset and Analytics Team.
A presentation from Gavin Freeguard (senior researcher at the Institute for Government) opening the Young Policy Professionals event, ‘Public policy in the “big data” age’, held on 9 March 2016 at the National Audit Office, London.
Attend The Data Science Course in Bangalore From ExcelR. Practical Data Science Course in Bangalore Sessions With Assured Placement Support From Experienced Faculty. ExcelR Offers The Data Science Course in Bangalore.
A presentation from Roeland Beerten (Director of Policy and Public Affairs at the Royal Statistical Society) as part of the Young Policy Professionals event, ‘Public policy in the “big data” age’, held on 9 March 2016 at the National Audit Office, London.
7 Big Data Challenges and How to Overcome ThemQubole
Implementing a big data project is difficult. Hadoop is complex, and data governance is crucial. Learn common big data challenges and how to overcome them.
Slide 2: Etymology: The etymology of the term ‘Big Data’ can be traced back to the mid-1990s, when it was first used by John Mashey to refer to handling and analysis of massive datasets. However, by 2013, ‘Big Data’ was already being declared obsolescent as a meaningful term by some, as it was too wide ranging and vague in definition (e.g. de Goes, 2013).
Side 6: Vagaries: Kitchin argues that it is velocity and these additional key characteristics that set Big Data apart and make them a “disruptive innovation – one that radically changes the nature of data and what can be done with them” (Kitchin, 2014). However, there is no one characteristic profile that all Big Data fit and they can take multiple forms.
Slide 8: Ethics: Several ethical questions have been raised about the scope of data being generated and retained; such as those concerning privacy, informed consent, and protection from harm.
These questions raise wider issues about what kinds of data should be combined and analysed, and the purposes to which the resulting information should be put.
Slide 9: Inequalities: Challenges of inequality have also been posed:
Whose data traces will be analysed? It is likely that only those who are better off will be represented (as they are more likely to use social media, etc.)
Access and use of open data is unlikely to be equally available to everyone due to existing structural inequalities (Eynon, 2013)
Slide 11: What do Big Data actually tell us? Eynon (2013) argues that Big Data is concerned with capturing and examining patterns, and tells us more about what people actually do than about what they say they do. However, this is not sufficient for all kinds of social science research. We need to understand the meanings of behaviours which cannot be inferred simply from tracking specific patterns.
In order that Big Data are used appropriately, we need to ensure understanding of what kinds of research can or cannot be carried out using them. Big Data should not be seen as [a] “technical fix” for research, but should be used to empower, support and facilitate practice and critical research.
"Big Data" is term heard more and more in industry – but what does it really mean? There is a vagueness to the term reminiscent of that experienced in the early days of cloud computing. This has led to a number of implications for various industries and enterprises. These range from identifying the actual skills needed to recruit talent to articulating the requirements of a "big data" project. Secondary implications include difficulties in finding solutions that are appropriate to the problems at hand – versus solutions looking for problems. This presentation will take a look at Big Data and offer the audience with some considerations they may use immediately to assess the use of analytics in solving their problems.
The talk begins with an idea of how big "Big Data" can be. This leads to an appreciation of how important "Management Questions" are to assessing analytic needs. The fields of data and analysis have become extremely important and impact nearly all facets of life and business. During the talk we will look at the two pillars of Big Data – Data Warehousing and Predictive Analytics. Then we will explore the open source tools and datasets available to NATO action officers to work in this domain. Use cases relevant to NATO will be explored with the purpose of show where analytics lies hidden within many of the day-to-day problems of enterprises. The presentation will close with a look at the future. Advances in the area of semantic technologies continue. The much acclaimed consultants at Gartner listed Big Data and Semantic Technologies as the first- and third-ranked top technology trends to modernize information management in the coming decade. They note there is an incredible value "locked inside all this ungoverned and underused information." HQ SACT can leverage this powerful analytic approach to capture requirement trends when establishing acquisition strategies, monitor Priority Shortfall Areas, prepare solicitations, and retrieve meaningful data from archives.
Introduction to big data for the EA course at Solvay MBAWim Van Leuven
Introduction to what is big data, what can it do and not do, the importance of datascience and how to architect big data solutions (lambda architecture)
Big Data Applications | Big Data Application Examples | Big Data Use Cases | ...Simplilearn
In this Big Data presentation, we will be discussing the Big data growth over the last few years followed by the various big data applications. We will look into the various sectors where big data is used such as weather forecast, healthcare, media and entertainment, logistics, travel & tourism and finally in the government & law enforcement sector.
We will be discussing how below industries are using Big Data presentation:
1. Weather forecast
2. Media and entertainment
3. Healthcare
4. Logistics
5. Travel n tourism
6. Government and law enforcement
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course have been designed to impart an in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
This course will enable you to:
1. Understand the different components of Hadoop ecosystem such as Hadoop 2.7, Yarn, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro Schema, using Arvo with Hive, and Sqoop and Schema evolution
7. Understand Flume, Flume architecture, sources, flume sinks, channels, and flume configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand resilient distribution datasets (RDD) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL, creating, transforming, and querying Data frames
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
This presentation offers a basic understanding of Big Data. It does this by defining Big Data, offers a History of Big Data, Big Data by the Numbers and the 8 Laws of Big Data
Predictive Modeling & Data-Driven Product Insights at LinkedIn - Scott Nichol...Scott Nicholson
Talk given at Advanced Analytics & Big Data Forum conference in San Francisco on April 25, 2012.
Abstract: Data on 150+ million professionals' careers and networks provide a fascinating playground for analysts to discover data insights about career trends, the social web and the economy. This talk will focus on how insights extracted from the LinkedIn dataset enable individuals with limited information the ability to make better decisions about their professional lives. In the course of this theme we will discuss data tools, insights and approaches to predictive modeling in the context of the LinkedIn dataset and Analytics Team.
A presentation from Gavin Freeguard (senior researcher at the Institute for Government) opening the Young Policy Professionals event, ‘Public policy in the “big data” age’, held on 9 March 2016 at the National Audit Office, London.
Attend The Data Science Course in Bangalore From ExcelR. Practical Data Science Course in Bangalore Sessions With Assured Placement Support From Experienced Faculty. ExcelR Offers The Data Science Course in Bangalore.
A presentation from Roeland Beerten (Director of Policy and Public Affairs at the Royal Statistical Society) as part of the Young Policy Professionals event, ‘Public policy in the “big data” age’, held on 9 March 2016 at the National Audit Office, London.
7 Big Data Challenges and How to Overcome ThemQubole
Implementing a big data project is difficult. Hadoop is complex, and data governance is crucial. Learn common big data challenges and how to overcome them.
Slide 2: Etymology: The etymology of the term ‘Big Data’ can be traced back to the mid-1990s, when it was first used by John Mashey to refer to handling and analysis of massive datasets. However, by 2013, ‘Big Data’ was already being declared obsolescent as a meaningful term by some, as it was too wide ranging and vague in definition (e.g. de Goes, 2013).
Side 6: Vagaries: Kitchin argues that it is velocity and these additional key characteristics that set Big Data apart and make them a “disruptive innovation – one that radically changes the nature of data and what can be done with them” (Kitchin, 2014). However, there is no one characteristic profile that all Big Data fit and they can take multiple forms.
Slide 8: Ethics: Several ethical questions have been raised about the scope of data being generated and retained; such as those concerning privacy, informed consent, and protection from harm.
These questions raise wider issues about what kinds of data should be combined and analysed, and the purposes to which the resulting information should be put.
Slide 9: Inequalities: Challenges of inequality have also been posed:
Whose data traces will be analysed? It is likely that only those who are better off will be represented (as they are more likely to use social media, etc.)
Access and use of open data is unlikely to be equally available to everyone due to existing structural inequalities (Eynon, 2013)
Slide 11: What do Big Data actually tell us? Eynon (2013) argues that Big Data is concerned with capturing and examining patterns, and tells us more about what people actually do than about what they say they do. However, this is not sufficient for all kinds of social science research. We need to understand the meanings of behaviours which cannot be inferred simply from tracking specific patterns.
In order that Big Data are used appropriately, we need to ensure understanding of what kinds of research can or cannot be carried out using them. Big Data should not be seen as [a] “technical fix” for research, but should be used to empower, support and facilitate practice and critical research.
"Big Data" is term heard more and more in industry – but what does it really mean? There is a vagueness to the term reminiscent of that experienced in the early days of cloud computing. This has led to a number of implications for various industries and enterprises. These range from identifying the actual skills needed to recruit talent to articulating the requirements of a "big data" project. Secondary implications include difficulties in finding solutions that are appropriate to the problems at hand – versus solutions looking for problems. This presentation will take a look at Big Data and offer the audience with some considerations they may use immediately to assess the use of analytics in solving their problems.
The talk begins with an idea of how big "Big Data" can be. This leads to an appreciation of how important "Management Questions" are to assessing analytic needs. The fields of data and analysis have become extremely important and impact nearly all facets of life and business. During the talk we will look at the two pillars of Big Data – Data Warehousing and Predictive Analytics. Then we will explore the open source tools and datasets available to NATO action officers to work in this domain. Use cases relevant to NATO will be explored with the purpose of show where analytics lies hidden within many of the day-to-day problems of enterprises. The presentation will close with a look at the future. Advances in the area of semantic technologies continue. The much acclaimed consultants at Gartner listed Big Data and Semantic Technologies as the first- and third-ranked top technology trends to modernize information management in the coming decade. They note there is an incredible value "locked inside all this ungoverned and underused information." HQ SACT can leverage this powerful analytic approach to capture requirement trends when establishing acquisition strategies, monitor Priority Shortfall Areas, prepare solicitations, and retrieve meaningful data from archives.
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactDr. Sunil Kr. Pandey
This is my presentation on the Topic "Data Science - An emerging Stream of Science with its Spreading Reach & Impact". I have compiled and collected different statistics and data from different sources. This may be useful for students and those who might be interested in this field of Study.
Bringing Machine Learning and Knowledge Graphs Together
Six Core Aspects of Semantic AI:
- Hybrid Approach
- Data Quality
- Data as a Service
- Structured Data Meets Text
- No Black-box
- Towards Self-optimizing Machines
Today, we have data – lots of it. We can process information – in many ways. And with these two tools and a little bit of creativity, we are discovering the vast depths of human behavior and by extension, a way to accurately predict the future -- and our future happiness. In fact, we can quantify human movement, behaviors, desires, and even moods on a scale that wasn’t possible before a series of advances in processing power, developments in psychology and social network science, and most importantly, access to data.
In advertising, industry, and humanity, we have experienced the evolution from Web 1.0 (informational) to Web 2.0 (platform) to Web 3.0 (semantic) to elements of Web 4.0 (anticipatory) – In this anticipatory era, what can we dream of next? Beyond addressability and increasing ad relevance, how can businesses utilize these advances in product development and other market initiatives? Can we make the leap from inductive logic to human-paralleled intuition? Can this make up for our human brain mechanics that make predicting our own happiness so difficult?
In this talk we’ll cover the evolutions in data access, models for information processing, and the science of collaboration to see not only how they have been leveraged in businesses but also how they are used to better understand human behavior, and hopefully in the near future, a little bit of happiness.
Defining Data Science
• What Does a Data Science Professional Do?
• Data Science in Business
• Use Cases for Data Science
• Installation of R and R studio
What does a data scientist actually do? Here at Good Rebels we wanted to outline a profile of this new profession, with the help of various industry leaders from academia, business and institutions. In short, we concluded that the main tasks of a data scientist are to identify data, transform it when incomplete, categorize it, prepare it for analysis, perform the analysis, visualize the results and communicate them.
Everybody has heard of Big Data, and its promise as the next great frontier for innovation. However, Big Data is neither new nor easily defined. What are the key drivers that make Big Data so critically important today? What is the single idea behind Big Data that promises such game changing outcomes for capable organizations? Who are the skilled talent that deliver Big Data results?
This presentation briefly reviews the opportunities, motivation and trends that are driving Big Data disruption. Data science is introduced as the enabling engine for Big Data transformation via the creation of new Data Products. The data scientist is defined and his tools, workflow and challenges are reviewed. Finally, practical tips are presented for approaching data product development.
Key takeaways include:
- Big Data disruption is driven by four megatrends
- Data is the essential raw material for creating valuable Data Products
- Data scientists are heterogeneous by role & skill set, but share common tools, workflows and challenges
- Data science talent is more important than raw data for Big Data success
These slides are modified from an invited presentation for the Gwinnett Chamber of Commerce on March 18, 2014. An excerpt was presented at the Georgia Pacific Social Media Working Session on March 19, 2014.
JIMS IT Flash , a monthly newsletter-An Initiative by the students of IT Department, shares the knowledge to its readers about the latest IT Innovations, Technologies and News.Your suggestions, thoughts and comments about latest in IT are always welcome at itflash@jimsindia.org.
Visit Website : http://jimsindia.org/
Data Science Innovations : Democratisation of Data and Data Science suresh sood
Data Science Innovations : Democratisation of Data and Data Science covers the opportunity of citizen data science lying at the convergence of natural language generation and discoveries in data made by the professions, not data scientists.
Why CxOs care about Data Governance; the roadblock to digital masteryCoert Du Plessis (杜康)
This talk covered how data governance are scaled in large organisations, defining self-sustaining ownership models, a mechanism for managing risk and delegating decisions to those with the most knowhow.
Abstract: Knowledge has played a significant role on human activities since his development. Data mining is the process of
knowledge discovery where knowledge is gained by analyzing the data store in very large repositories, which are analyzed
from various perspectives and the result is summarized it into useful information. Due to the importance of extracting
knowledge/information from the large data repositories, data mining has become a very important and guaranteed branch of
engineering affecting human life in various spheres directly or indirectly. The purpose of this paper is to survey many of the
future trends in the field of data mining, with a focus on those which are thought to have the most promise and applicability
to future data mining applications.
Keywords: Current and Future of Data Mining, Data Mining, Data Mining Trends, Data mining Applications.
Big data-analytics-changing-way-organizations-conducting-businessAmit Bhargava
Hi Friends ,
There is an interesting post on how to leveraging Big data analytics in an Integrated GRC Environment in an Organize to have visibility in core enterprises issues on real time basis . This presentation is from Metric stream -an international and Global GRC soloutioning providers in association with Dr. Kirk. D. Borne - Big data consultant and Adviser .Hope you like it and enjoy as well.
TechWise with Eric Kavanagh, Dr. Robin Bloor and Dr. Kirk Borne
Live Webcast on July 23, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=59d50a520542ee7ed00a0c38e8319b54
Analytical applications are everywhere these days, and for good reason. Organizations large and small are using analytics to better understand any aspect of their business: customers, processes, behaviors, even competitors. There are several critical success factors for using analytics effectively: 1) know which kind of apps make sense for your company; 2) figure out which data sets you can use, both internal and external; 3) determine optimal roles and responsibilities for your team; 4) identify where you need help, either by hiring new employees or using consultants 5) manage your program effectively over time.
Register for this episode of TechWise to learn from two of the most experienced analysts in the business: Dr. Robin Bloor, Chief Analyst of The Bloor Group, and Dr. Kirk Borne, Data Scientist, George Mason University. Each will provide their perspective on how companies can address each of the key success factors in building, refining and using analytics to improve their business. There will then be an extensive Q&A session in which attendees can ask detailed questions of our experts and get answers in real time. Registrants will also receive a consolidated deck of slides, not just from the main presenters, but also from a variety of software vendors who provide targeted solutions.
Visit InsideAnlaysis.com for more information.
Data Scientist has been regarded as the sexiest job of the twenty first century. As data in every industry keeps growing the need to organize, explore, analyze, predict and summarize is insatiable. Data Science is creating new paradigms in data driven business decisions. As the field is emerging out of its infancy a wide range of skill sets are becoming an integral part of being a Data Scientist. In this talk I will discuss the different driven roles and the expertise required to be successful in them. I will highlight some of the unique challenges and rewards of working in a young and dynamic field.
Nicht zuletzt durch die medienwirksame Erfolge des maschinellen Lernens durch DeepMind, OpenAI und Kollegen ist Künstliche Intelligenz im Moment wieder in aller Munde. Einerseits locken zahlreiche neue, vorher undenkbare Anwendungen wie die automatische Diagnose von Krankheiten, autonome Fahrzeuge und Drohnen, oder die automatische Übersetzung gesprochener Wörter. Andererseits warnen mahnenden Stimmen wird vor dem zunehmendem Einflussnahme der „Algorithmen“ auf fast alle Bereiche unseres Lebens sowie vor unerwünschten Folgen von sich verselbstständigenden Computern gewarnt. Einige träumen von – oder fürchten sich vor – der vermeintlich unausweichlichen Singularität, an der sich nichts weniger als das Schicksal der gesamten Menschheit entscheiden wird. Doch was verbirgt sich hinter dem Begriff Künstliche Intelligenz? Je nachdem, wen man fragt, erhält man unterschiedliche, bisweilen gegensätzliche Antworten. Dieser Vortrag stellt einige dieser Antworten vor und versucht sie (nicht nur) anhand von Beispielen aus Forschung und Anwendung einzuordnen.
Event: Business Analytics Day, 07.03.2019
Speaker: Dr. Matthias Richter, Dr. Stefan Igel (inovex)
Mehr Tech-Vorträge: inovex.de/vortraege
Mehr Tech-Artikel: inovex.de/blog
Similar to DISUMMIT Keynote presentation from Kirk Borne - From Sensors to Sense-Making (20)
Hr pro digital day 2019 - keynote presentation from Philippe Van ImpeDigitYser
I had the pleasure to be invited to talk to 200 HR professionals.
My main message is that lifelong learning is now the norm and that lifelong employment is not any more.
People will have moments in their lives to regroup and rest in order to prepare themselves for the next digital wave.
A place like digityser.org offers this kind of shelter.
This report summarizes all the activities of
the first year and I could not be more proud
and thankful for all the help we received.
Our clubhouse has now become a roaring place
full of digital startups, expert communities and
innovation workgroups focussing on AI, VR, Iot
and Blockchain. It is used by many companies
for innovative ‘offsites’ and workshops. We host
technical trainings, master classes and hands-on
workshops. Next Tech Communities can use the
space for free to organise meetups. Digityser is
the best space in Brussels to boost innovation and
organise hackathons.
Over the past year, we employed, trained and
coached long time job seekers, students, volunteers,
young graduates. We also offered traineeships to
help tomorrow’s working people grow and learn
in a nice start-up environment. We boosted digital
innovation, by organizing hackathons. We helped
train the new data science professionals, Java
developpers and the VR specialists of tomorrow.
We made Brussels shine by hosting international
conferences and shared knowledge with workshops.
We acted as a bridge by organising jobfairs, linking
companies with promising startups and bringing
the universities in touch with the business
world. We promote social responsibility in
our organization and believe we can have
a considerable social impact on the Canal
region. We facilitated company creation with
coaching sessions and were instrumental
in the creation of réseau entreprendre
North. We had great support from the press
with articles in de Tijd, L’Echo and Le Soir
mentioning DigitYser as «the Clubhouse of
the tech communities of Brussels» and were
assisted by numerous volunteers.
It was also a year of transformation of the
venue where we upgraded a cold and empty
building into a meeting space for tech
entrepreneurs bursting with energy and
projects. We build modern meeting rooms,
installed video recording facilities, managed
to stream the content of the meetups online
and provided a space for the cyclists to
shower after their ride to work.
Digityser is now yours to use, I’m looking
forward to meeting you here soon.
Launching digit yser a first 18 months reviewDigitYser
Wrapping Up our first 18 months of activity at DigitYser
Launch:
DigitYser was initiated by the AI & Data Science Community of Belgium with the support of Sofina, the Region of Brussels and Engie.
Together with our partners and volunteers we turned an empty building near the Canal into a bursting digital clubhouse.
Over the past 18 months we organised over 500 events for over 6000 visitors aimed at sharing knowledge about digital innovation and entrepreneurship to a very broad public.
Soon after our launch in December 2017 the press called DigitYser ‘The new hart of the digital entrepreneur in Brussels.’
The fact that we were hosting under one roof all the technologies supported by the Next Tech plan received a warm welcome from all the AI, Data, VR, IoT and Blockchain communities of Brussels.
We achieved our objective to:
Help the digital entrepreneur to make their dreams come true in Brussels. We support them in their effort to transform their initial idea into a invoiceable service. We do that with the support of our technical and innovation partners together with Innoviris and Hub.Brussels.
2. Boost digital skills by offering digital trainings, hands-on workshops, meetups, bootcamps, hackathons, learnathons …
3. And we made Brussels shine by organising different international events such as the W3C annual meeting on VR or the Washington Chambers of commerce event on GDPR privacy.
We are ready for 2019:
Our team is now mature and ready to take on the workload.
Our challenges are mainly focussed on the quality of our offering that we want to be more inclusive and in line with the daily needs of a starting digital entrepreneur.
We will reach more people in 2019.
Special thanks to hub.brussels Sofina ENGIE AI & Data Science Community of Belgium
SAS Teradata DataMinded icity.brussels Accenture SAP Python Predictions @johnson J&J Timi
Thank you to the team for making this DigitYser a reality Tanguy Florence Adélaïde Ludovic Olivier Ricardo & Fouad Nicolas
Special thanks for your permanent support:
Didier Gosuin Harold Boël Nicolas Harmel Olivier Poulaert Juan Bossicard Jean Wallemacq Mathieu Poma Ioana Voicu Tania Van Loon Katrien Mondt Claudio Truzzi Bianca Debaets Jeroen Van Godtsenhoven Ivy Vanderheyden Céline Bouton Alain Heureux Cécile Jodogne Philippe Van Troeye
I hope you have a happy and healthy start to 2019.
SAS forum CDO panel presentation Jonas Vandenbruaene 12/06/2017DigitYser
CDO Panel Discussion (12/06/2017)
Did you ever wonder what the Chief Data Officer (CDO) is all about? Or does CDO rather mean Chief Digital Officer to you? Where do you find these profiles in a company structure? What is he/she responsible for? Do you link a CDO to Innovation? Or rather to Governance? Join us to discuss these and other questions during our CDO Panel at the SAS Forum, June 12th2017 where Jonas Vandenbruaene, student at the University of Antwerp, will also reveal some of the findings of the CDO survey he conducted for his Master thesis in Applied Economics, Business Engineering.
Philippe Van Impe, Founder - European Data Innovation Hub & BRUSSELS DATA SCIENCE COMMUNITY. Social entrepreneur & startup coach.
Jonas Vandenbruaene, Co-Founder at 180° Consulting Antwerp
DISUMMIT - Jos Polfliet - Suicide prevention using text analyticsDigitYser
Jos Polfliet
Making the AI revolution happen - Faction XYZ
In my daily job, I help companies translate their business problems to questions that can be answered by data, using AI, machine learning, data mining, predictive modeling and sensor analytics. This way companies can improve their business processes by making data-driven decisions, with often impressive and insightful results.
Faction XYZ is an applied artificial intelligence engineering service provider, with headquarters in Antwerp (Belgium). We serve clients with a vision and budget for a future where data science and machine learning will continue to drive our client’s core business in a rapidly transforming economic reality. Our in-depth knowledge of artificial intelligence, conversational agents, and an array of emerging technologies make us the first choice for companies with a vision and budget. We advance your strategic goals by imagining, developing, and implementing the technology stack you need to stay ahead.
DISUMMIT - Serge Masyn about denguehack.org - How to convince your CEO to do ...DigitYser
denguehack.org
Welcome to the denguehack.org website
The aim of this site is to help fight the global burden of dengue fever, a mosquito-borne viral infection. About half of the world’s population is at risk!
Tappping into private and public datasets can generate new dengue insights and improve our understanding of this infection. So we are looking for experts who want to share their data, ideas and skills.
We will gather information and datasets and use it in a data4good hackathon on November 25th & 26th 2016 in Brussels, Belgium.
In anticipation of this hackathon we will organize preparation sessions focusing on the use of datascience in this area.
Together we can stop dengue from spreading.
The European Data Innovation Hub is organizing this data4good hackathon with the sponsorship of Janssen, The Pharmaceutical Companies of Johnson & Johnson.
Luc Iterbeke opening DISUMMIT and setting up the objectivesDigitYser
Luc Iterbeke
Head of Customer Intelligence at ING
Within the Marketing Department of ING, Luc is responsible for a team that is in charge of market research, business analysis, product positioning and campaign modelling and analysis.
In keeping with this year’s theme of using data to build a better world, the diHub is thrilled to welcome back three of the winners of the DengueHack.org hackathon to present their results at the di-Summit. From team Stream, Hannah Pinson will be presenting their work on using Twitter data to predict dengue outbreaks. From the Juniors, a team made up of junior data scientists, most of whom were participants in the di-Academy’s Data Science Boot Camp, Olivier Van Tongerloo and Nico Verheyden will be presenting their work on building a model to predict the spread of Dengue in Malaysia. Finally, Maarten Vandenbroeck from Cronos will present XploData’s results on the determining factors that predict the spread of dengue.
Technical workshops during #disummit in Brussels 30 MarchDigitYser
We are looking forward to meeting you at our yearly #datascience event.
We’d like to bring to your attention that in addition to our excellent line-up of speakers, you may be interested in one of our technical workshops that provide hands-on training:
Managing Projects in Predictive Analytics, a workshop with Geert Verstraeten
Hands-on Machine Learning in Python, a workshop with Pieter Buteneeers
Using IBM Watson’s Data Platform with Ann-Elise Delbecq, Niko Tambuyser, Willem Hendriks, and Luc Goossens
These are only some of the technical workshops we have available.
Brussels is the Capital of Data Science in Europe.
Thanks to the European Data Innovation Hub and the Data Science community Brussels has become in a few years the place to be for data driven startups..
The Brussels based European Data Innovation Hub is the premier networking space where business, startup, academic and political decison makers meet and discuss policy and best practices about big data, open data and data innovation.
Our Brussels based hub connects European data professionals, enterprises, startups, government, non-profit organisations and the academic world active in the data innovation ecosystem.
We create and manage a vibrant environment where they share best practices, learn about and drive data innovation in the EU.
The European Data Innovation Hub supports data professionals with networking activities, infrastructure for training, co-workers and startups, resources for community management.
Activities are:
Data Innovation Hub
OpenTraining hub for Data related trainings
Data start-up incubator
Co-working space for data experts
Data Innovation Summit
Data Innovation Survey
Data4Good hub and meeting place
Contact:
vzw European Data Innovation Hub asbl
Vorstlaan 23 bus 411, B-1170 Watermaal-Bosvoorde, Belgium
VAT: BE0630.675.984
email: info@dihub.eu
Main contact: Philippe Van Impe
pvanimpe@dihub.eu
+32 477 23 78 42
@pvanimpe
How to become the best datascientist in EuropeDigitYser
How to become the best datascientist in Europe.
How to boost your datascience skills.
Ho to recruit the most promissing young graduates.
How a company can boost its digital transformation effort.
How to become data driven.
Join the data science bootcamp starting mid September 2016 - prepare during the summer camp for coders.
Grand Opening or the European Data Innvovation HubDigitYser
Presentations from Alexander De Croo, Bianca Debaets, Frank Koster and Jürgen Gren during the Grand Opening or the European Data Innvovation Hub (@dataeu) in Brussels
The Brussels Data Science community is supported by the European Data Innovation Hub.
Our mission is to educate, inspire and empower scholars and professionals to apply data sciences to address humanity’s grand challenges.
What we do
mind the gap
We are the fastest growing community of data scientists in Europe.
We love doing Data4Good.
We promote the value of analytics and organise events, hands-on sessions and trainings to close the gap between academics and business.
Join us if you want to share, learn and have fun with analytical & technological innovation & positive social change.
2015 06 18 datascienc meetup privacy - update - philippe van impeDigitYser
Our mission is to educate, inspire and empower scholars and professionals to apply data sciences to address humanity’s grand challenges.
We are the fastest growing community of data scientists in Europe.
We love doing Data4Good.
We promote the value of analytics and organise events, hands-on sessions and trainings to close the gap between academics and business.
Join us if you want to share, learn and have fun with analytical & technological innovation & positive social change.
How to build a Community of Practice (CoP) + How we build the Brussels Data S...DigitYser
How to build a Community of Practice (CoP)
How we build the Brussels Data Science Community
www.datasciencebe.com
check our video library: https://www.parleys.com/channel/datascience
Fintech - Presentations from the DataScience Meetup about Banking - Brussels ...DigitYser
Link to the Fintech Presentations from the DataScience Meetup about Banking - Brussels May 2015.
Brussels Data Science Community
1. KUL: Bart Baesens & Véronique Van Vlasselaer, Gotch’all! Advanced Network Analysis for Detecting Groups of Fraud
2. Euroclear : Istvan Hajnal, Doing data sciences
3. ING : Meric Potier & Thomas Carette, Data Science and (Advanced) Predictive Analytics @ ING
4. Dexia : Bart Hamers , Data Science Governance, my 6 principles.
5. DataCamp: Presentation of Data Camp, Dieter
Bigdata brussels update @datasciencebe philippe van impe - dataconomyDigitYser
Our mission is to educate, inspire and empower scholars and professionals to apply data sciences to address humanity’s grand challenges.
We are the fastest growing community of data scientists in Europe.
We love doing Data4Good.
We promote the value of analytics and organise events, hands-on sessions and trainings to close the gap between academics and business.
Join us if you want to share, learn and have fun with analytical & technological innovation & positive social change.
http://datasciencebe.com/2015/02/22/digital-wake-up-call-for-belgium-call-for-action/
https://www.youtube.com/watch?v=qSQPgADT6bc
Update on Data Science in Belgium @datasciencebe by Philippe Van Impe DigitYser
Our mission is to educate, inspire and empower scholars and professionals to apply data sciences to address humanity’s grand challenges.
We are the fastest growing community of data scientists in Europe.
We love doing Data4Good.
We promote the value of analytics and organise events, hands-on sessions and trainings to close the gap between academics and business.
Join us if you want to share, learn and have fun with analytical & technological innovation & positive social change.
Our members all want to contribute with their professional skills to make a positive impact on our local community or in the world, they are all aware that data based decision making is key to boost the performances of our organisations. We are an open community and everybody with an interest in data and business is welcome. Our members are University professors, managers in public or private organisations, leaders of NGO’s, Phd students, analytic consultants, technical bigdata experts …
Meetup January 29th 2015 Brussels Data Science CommunityDigitYser
Our mission is to educate, inspire and empower scholars and professionals to apply data sciences to address humanity’s grand challenges.
We are the fastest growing community of data scientists in Europe.
We love doing Data4Good.
We promote the value of analytics and organise events, hands-on sessions and trainings to close the gap between academics and business.
Join us if you want to share, learn and have fun with analytical & technological innovation & positive social change.
2014 Overview of the activities of the Brussels Data Science CommunityDigitYser
Opening presentation by Philippe Van Impe, leader of the Brussels Data Science Community.
Overview of the activities of 2014.
Data Science and Big Data meetup of December 2014.
We are the fastest growing community of data scientists in Europe, Driven by volunteers doing non profit data4good work.
We promote the value of analytics and organise events, hands-on sessions and trainings to close the gap between academics and business.
Join us if you want to share, learn and have fun with analytical & technological innovation & positive social change.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs.This meetup was formerly Milvus Meetup, and is sponsored by Zilliz maintainers of Milvus.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Adjusting OpenMP PageRank : SHORT REPORT / NOTESSubhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
DISUMMIT Keynote presentation from Kirk Borne - From Sensors to Sense-Making
1. Kirk Borne
Principal Data Scientist, Booz Allen Hamilton
Data Innovation Summit
March, 30 2017
#DIS2017
A Data-rich World for a Better World:
From Sensors to Sense-Making
2. Principal Data Scientist
Booz Allen Hamilton
http://www.boozallen.com/datascience
Kirk Borne
@KirkDBorne
A Data-rich World for a Better World:
From Sensors to Sense-Making
3. OUTLINE
• First came Data, then Big Data, now Smart Data
• IoT: Dynamic Data-Driven Application Systems
• The Self-Driving Enterprise
• Steps to Discovery: Data Science & Analytics
• Case Study: From Sensors to Sense-Making
• Data Rich for a Better World
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience 3
4. OUTLINE
• First came Data, then Big Data, now Smart Data
• IoT: Dynamic Data-Driven Application Systems
• The Self-Driving Enterprise
• Steps to Discovery: Data Science & Analytics
• Case Study: From Sensors to Sense-Making
• Data Rich for a Better World
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience 4
5. Ever since we first explored our world…
http://www.livescience.com/27663-seven-seas.html 5
6. …We have asked questions about everything around us.
https://jefflynchdev.wordpress.com/tag/adobe-photoshop-lightroom-3/page/5/
6
7. So, we have collected evidence (data) to answer our questions,
which leads to more questions, which leads to more data collection,
which leads to more questions, which leads to BIG DATA!
y ~ 2 * x (linear growth)
y ~ 2 ^ x (exponential)
7
https://www.linkedin.com/pulse/exponential-growth-isnt-cool-combinatorial-tor-bair
Big
Wow!
y ~ x! ≈ x ^ x
→ Combinatorial Growth!
(all possible interconnections,
linkages, and interactions)
8. Semantic, Meaning-filled Data:
• Ontologies (formal)
• Folksonomies (informal)
• Tagging / Annotation
– Automated (Machine Learning)
– Crowdsourced
– “Breadcrumbs” (user trails)
Broad, Enriched Data:
• Linked Data (RDF)
– All of those combinations!
• Graph Databases
• Machine Learning
• Cognitive Analytics
• Context
• The 360o
view
Making Sense of the World with Smart Data
The Human Connectome Project:
mapping and linking the major
pathways in the brain.
http://www.humanconnectomeproject.org/
8
9. OUTLINE
• First came Data, then Big Data, now Smart Data
• IoT: Dynamic Data-Driven Application Systems
• The Self-Driving Enterprise
• Steps to Discovery: Data Science & Analytics
• Case Study: From Sensors to Sense-Making
• Data Rich for a Better World
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience 9
12. Internet of Things (IoT):
Deploying intelligence at the point of data collection!
(Machine Learning at the edge of the network = Edge Analytics!)
Internet of
Everything
https://www.nsf.gov/news/news_images.jsp?cntn_id=122028
The Internet of Things (IoT) will be an interconnected universe of Dynamic
Data-Driven Application Systems (dddas.org) =>
Combinatorial Explosive Growth of Smart Data!
12
https://www.linkedin.com/pulse/can-elephant-bigdata-dance-iot-internet-things-tune-lambba
13. • Smart Health
• Precision Medicine
• Precision Farming
• Personalized Financial Services
• Smart Organizations
• Predictive Maintenance
• Prescriptive Maintenance
• Smart Grid
• Smart Apps
• Predictive Retail
• Precision Marketing
• Smart Highways
• Precision Traffic
• Smart Cities
• Predictive Law Enforcement
• Personalized Learning
Smart => Predictive, Precision, Personalized!
Ubiquitous Smart Data in the IoT:
Applications and Use Cases are everywhere…
… that demands dynamic action = Edge Analytics
13
14. OUTLINE
• First came Data, then Big Data, now Smart Data
• IoT: Dynamic Data-Driven Application Systems
• The Self-Driving Enterprise
• Steps to Discovery: Data Science & Analytics
• Case Study: From Sensors to Sense-Making
• Data Rich for a Better World
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience 14
15. What is Self-Driving? … ~Autonomous!
(e.g., cars, cities, deep space probes, organizations)
Defining Characteristics Data Analytics Maturity Organizational Posture
Sensing, Streaming
Responsive
Learning, Agile
Contextual, Optimizing
Deciding, Acting
Note: additional KEY CHARACTERISTICS = Stateful, Secure, Compliant, “Embedded Reporting”
= explosive growth of “in situ” data that delivers CONTEXT and SENSE-MAKING
15
16. Defining Characteristics Data Analytics Maturity Organizational Posture
Sensing, Streaming Descriptive
Responsive Diagnostic
Learning, Agile Predictive
Contextual, Optimizing Prescriptive
Deciding, Acting Cognitive
16
What is Self-Driving? … ~Autonomous!
(e.g., cars, cities, deep space probes, organizations)
Note: additional KEY CHARACTERISTICS = Stateful, Secure, Compliant, “Embedded Reporting”
= explosive growth of “in situ” data that delivers CONTEXT and SENSE-MAKING
17. Defining Characteristics Data Analytics Maturity Organizational Posture
Sensing, Streaming Descriptive Passive
Responsive Diagnostic Reactive
Learning, Agile Predictive Proactive
Contextual, Optimizing Prescriptive Predictive Reactive
Deciding, Acting Cognitive Sense-making
17
What is Self-Driving? … ~Autonomous!
(e.g., cars, cities, deep space probes, organizations)
Note: additional KEY CHARACTERISTICS = Stateful, Secure, Compliant, “Embedded Reporting”
= explosive growth of “in situ” data that delivers CONTEXT and SENSE-MAKING
18. OUTLINE
• First came Data, then Big Data, now Smart Data
• IoT: Dynamic Data-Driven Application Systems
• The Self-Driving Enterprise
• Steps to Discovery: Data Science & Analytics
• Case Study: From Sensors to Sense-Making
• Data Rich for a Better World
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience 18
19. Levels of Maturity in the
Self-Driving Enterprise
Defining Characteristics Data Analytics Maturity Organizational Posture
Sensing, Streaming Descriptive Passive
Responsive Diagnostic Reactive
Learning, Agile Predictive Proactive
Contextual, Optimizing Prescriptive Predictive Reactive
Deciding, Acting Cognitive Sense-making
19
20. 5 Levels of Analytics Maturity
1) Descriptive Analytics
– Hindsight (What happened?)
2) Diagnostic Analytics
– Oversight (real-time / What is
happening? Why did it happen?)
3) Predictive Analytics
– Foresight (What will happen?)
4) Prescriptive Analytics
– Insight (How can we optimize what
happens?)
5) Cognitive Analytics
– Right Sight (the 360 view , what is the
right question to ask for this set of data in
this context = Game of Jeopardy)
– Finds the right insight, the right action, the
right decision,… right now!
– Moves beyond simply providing answers, to
generating new questions and new
hypotheses, for your “next-best move”
20
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience
21. PREDICTIVE
Analytics
Find a function (i.e., the model) f(d,t) that
predicts the value of some predictive
variable y = f(d,t) at a future time t, given
the set of conditions found in the training
data {d}.
=> Given {d}, find y.
PRESCRIPTIVE
Analytics
Find the conditions {d’} that will produce a
prescribed (desired, optimum) value y at a
future time t, using the previously learned
conditional dependencies among the
variables in the predictive function f(d,t).
=> Given y, find {d’}.
21
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience
22. Context is King!
“It’s not what you look at that matters – it’s what you see.” – Henry David Thoreau
• Context is “other data” about your data = i.e., Metadata!
• The 3 most important things in your data are: Metadata, Metadata,
Metadata!
• Metadata are…
– Other Data that describes Other Data
– Other Data that describes Your Data
– Your Data that describes Other Data
• e.g., Connected “Smart” Cars = that car that is braking 3 vehicles
ahead of you = informs your vehicle to brake now!
• The Smart Business = predictive maintenance algorithm alerts the
corresponding asset, the right skilled technician, & the right tool to
converge at right place at the right time!
• IoT sensor data can provide a lot of context data (metadata!) 22
23. OUTLINE
• First came Data, then Big Data, now Smart Data
• IoT: Dynamic Data-Driven Application Systems
• The Self-Driving Enterprise
• Steps to Discovery: Data Science & Analytics
• Case Study: From Sensors to Sense-Making
• Data Rich for a Better World
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience 23
25. Mars Rover: intelligent data-gatherer, mobile data mining
agent, autonomous decision-support system, and self-driving!
1) Drive around surface of Mars – take samples of items (Mars rocks):
• experimental data type = mass spectroscopy = data histogram = feature vector
2) Perform Intelligent Data Operations (Data Mining in Action):
• Supervised Learning
– Search for items with known characteristics, and assign items to known classes
• Unsupervised Learning
– Discover what types of items are present, without preconceived biases; find the set of
unique classes of items; discover unusual associations; discover trends and new directions
(new leads) to follow
• Semi-supervised Learning
– Find the rare, one-of-kind, most interesting items, behaviors, contexts, or events
3) Enact On-board Intelligent Data Understanding & Decision Support
= Science Goal (or Business Goal) Monitoring:
– “stay here and do more” ; or else “follow trend to a more interesting location”
– “send discoveries to human analysts immediately” ; or else “send results later”
25
26. Smart Sensors & Sentinels for Data-Driven
Sense-Making and Decision Support
• New knowledge and insights are acquired by monitoring and mining
actionable data from all digital inputs (Sensors!)
• Alerts are triggered autonomously, without intervention (if permitted),
applying machine learning and actionable business decision rules for
pattern detection and diagnosis. (Sentinels! = embedded machine
learning / data science algorithms, at the point of data collection =
trained to minimize False Positives and “Alarm Fatigue”)
• “Smart Sensors” (powered by Machine Learning-enabled sentinels) will
therefore deliver actionable intelligence (Sense!)
From Sensors to Sentinels to Sense
(for any application domain with streaming data from sensors)
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience 26
27. The MIPS Analytics Framework
for Dynamic Data-Driven Application Systems (DDDAS)
• MIPS =
– Measurement – Inference – Prediction – Steering
• This applies to any Network of Sensors:
– Web user interactions & actions (web analytics data), Cyber network usage logs, Customer
Service interactions, Social network sentiment, Machine logs (of any kind), Manufacturing
sensors, Health & Epidemic monitoring systems, Financial transactions, National Security, Utilities
and Energy, Remote Sensing, Tsunami warnings, Weather/Climate events, Astronomical sky
events, …
• Machine Learning enables the “IP” part of MIPS:
– Autonomous (or semi-autonomous) Classification
– Intelligent Data Understanding
– Rule-based
– Model-based
– Neural Networks
– Markov Models
– Bayes Inference Engines
Alert & Response systems:
• Actionable insights from
streaming sensor data
• Automation of any data-driven
operational system
http://dddas.org
28. The MIPS Analytics Framework
for Dynamic Data-Driven Application Systems (DDDAS)
• MIPS =
– Measurement – Inference – Prediction – Steering
• This applies to any Network of Sensors:
– Web user interactions & actions (web analytics data), Cyber network usage logs, Customer
Service interactions, Social network sentiment, Machine logs (of any kind), Manufacturing
sensors, Health & Epidemic monitoring systems, Financial transactions, National Security, Utilities
and Energy, Remote Sensing, Tsunami warnings, Weather/Climate events, Astronomical sky
events, …
• Machine Learning enables the “IP” part of MIPS:
– Autonomous (or semi-autonomous) Classification
– Intelligent Data Understanding
– Rule-based
– Model-based
– Neural Networks
– Markov Models
– Bayes Inference Engines
http://dddas.org
Alert & Response systems:
• Actionable insights from
streaming sensor data
• Automation of any data-driven
operational system
29. OUTLINE
• First came Data, then Big Data, now Smart Data
• IoT: Dynamic Data-Driven Application Systems
• The Self-Driving Enterprise
• Steps to Discovery: Data Science & Analytics
• Case Study: From Sensors to Sense-Making
• Data Rich for a Better World
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience 29
30. Big Data + IoT + Citizen Data Scientists = Partners in Sustainability
http://www.unglobalpulse.org
Sustainability Development Goals
Knowing the knowable via deep, wide,
and fast data from ubiquitous sensors!
The Internet of Things (IoT): Big Data:
In the Big Data era, Everything is
Quantified and Tracked :
– Populations & Persons
– Smart Cities, Farms, Highways
– Environmental Sensors
– IoE = Internet of Everything!
Discovery through Data Science:
– Class Discovery
– Correlation Discovery
– Novelty Discovery
– Association Discovery
17 SDGs are KPIs for the World!
(currently, the SDGs have 229
key performance indicators)
30
31. Big Data + IoT + Citizen Data Scientists = Partners in Sustainability
Sustainability Development Goals
Knowing the knowable via deep, wide,
and fast data from ubiquitous sensors!
The Internet of Things (IoT): Big Data:
In the Big Data era, Everything is
Quantified and Tracked :
– Populations & Persons
– Smart Cities, Farms, Highways
– Environmental Sensors
– IoE = Internet of Everything!
Discovery through Data Science:
– Class Discovery
– Correlation Discovery
– Novelty Discovery
– Association Discovery
17 SDGs are KPIs for the World!
(currently, the SDGs have 229
key performance indicators)
31
**********
The Art & Science of being Data-Rich :
From Sensors to a Better World
**********
@KirkDBorne ---- #disummit, March 2017 ---- http://www.boozallen.com/datascience