We help enterprise, fintech and legacy companies reduce costs and unlock innovation through contextual market research, narrow automation and organizational design.
NYC Open Data Meetup-- Thoughtworks chief data scientist talk by Vivian S. Zhang
This document summarizes a presentation on data science consulting. It discusses:
1) The Agile Analytics group at ThoughtWorks which does data science consulting projects using probabilistic modeling, machine learning, and big data technologies.
2) Two case studies are described, including developing a machine learning model to improve matching of healthcare product data and using logistic regression for retail recommendation systems.
3) The origins and future of the field are discussed, noting that while not entirely new, data science has grown due to improvements in technology, programming languages, and libraries that have increased productivity and driven new career opportunities in the field.
Reproducible Dashboards and other great things to do with Jupyter by Domino Data Lab
The document discusses reproducible dashboards and other uses of Jupyter notebooks. It outlines the data science life cycle from collecting data to deploying models. It also distinguishes between repeatable, replicable, and reproducible analyses. Finally, it promotes the use of Jupyter notebooks and mentions job openings at Domino Data Lab across several cities.
5 facets of cloud computing - Presentation to AGBC by Raymond Gao
My presentation to AGBC (American German Business Club) on Cloud Computing and Social Causes. How doing non-profit work helps find and validate use cases, the heart of any application, business venture, etc.
The document discusses best practices for managing data science teams based on lessons learned. It outlines common pitfalls such as solving the wrong problem, having the wrong tools, or results being used incorrectly. Issues include data science being different from software development and forgetting other stakeholders. Recommendations include establishing processes for the full lifecycle from ideation to monitoring, using modular systems thinking, and defining roles like data scientists, managers, and product owners to address organizational challenges. The goal is to deliver measurable, reliable, and scalable insights.
This document discusses changes in IT over the past 15 years. It notes that digital technology is now present everywhere in both work and personal life, compared to 15 years ago when it was mainly used at work. The role of IT departments has also changed from a focus on technology to supporting the business. There is a shift from producing physical goods to producing data, with IT either being core to production or providing generic support. Managing non-structured data has become a challenge compared to the past focus on structured data.
1. The document discusses the perspectives of John Huffman, CTO of Philips Healthcare Informatics, on big data, analytics, and AI in healthcare.
2. It outlines the multi-stage process for advanced analytics, including data ingestion, model training, evaluation, and production. It also discusses challenges in data collection/processing and model deployment.
3. Key points made are that data is more important than algorithms/methods; analytics requires clean, interoperable data; and the stack provides tools but not full solutions - data is the intellectual property.
Intro to Data Science for Non-Data Scientists by Sri Ambati
Erin LeDell and Chen Huang's presentations from the Intro to Data Science for Non-Data Scientists Meetup at H2O HQ on 08.20.15
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Idiots guide to setting up a data science team by Ashish Bansal
Some nuggets of how I started the data science practice at Gale Partners on a budget. Presented at the Toronto Hadoop Users Group (THUG) in April, 2015.
Data Science Consulting at ThoughtWorks -- NYC Open Data Meetup by David Johnston
- ThoughtWorks is a global software consulting company that has offices worldwide and focuses on agile analytics and data science consulting.
- Data science combines various fields like mathematics, statistics, machine learning, computer science, and business consulting to solve problems by analyzing large amounts of data.
- While not entirely new, data science has grown in popularity and job opportunities due to increases in computing power, data storage, open source tools and libraries, and cloud computing which have improved data scientists' productivity and ability to tackle big data problems.
- However, challenges remain in communicating effectively with business stakeholders, dealing with vague problems, and convincing clients that data science can provide solutions rather than just hype.
This talk will discuss how Stanley Black & Decker is creating a Data Science educational environment that gives all company members hands-on access to Data Science training and an online Python development environment. Leveraging the power of WordPress as an LMS with 3Blades as a Data Science development platform, a robust Data Science learning environment was created. Come hear about the development of this learning environment and what we have learned from our initial company release.
Keynote - An overview on Big Data & Data Science - Dr Gregory Piatetsky-Shapiro by Data ScienceTech Institute
Data Science Tech Institute - Big Data and Data Science Conference around Dr Gregory Piatetsky-Shapiro.
Keynote - An overview on Big Data & Data Science Dr Gregory Piatetsky-Shapiro - KDnuggets.com Founder & Editor.
Paris May 23rd & Nice May 26th 2016 @ Data ScienceTech Institute (https://www.datasciencetech.institute/)
Data Architecture: OMG It’s Made of People by Mark Madsen
Do you have data? Do you have users? Do they use that data to solve problems? Then you have a data architecture. Maybe your architecture is organic and accidental, or maybe it’s an accumulation of the latest practices and technologies you heard about on Stack Overflow.
Spoiler: data architecture is about people and how they use data, not the latest pipeline framework or AI model. Data architecture is about enabling users to be productive, not adding the next “shiny object” and then blaming the users for using it wrong. What you design needs to focus on a different subject than either technology or data.
Join Kevin Bogusch, Ecosystem Architect, as he talks with Mark Madsen, Fellow at the Technology Innovation Office, on the crucial elements you’re missing in a successful data architecture: people and process. Find out why Mark says, “don’t buy one problem to solve another problem.”
How to build a data science team 20115.03.13v6 by Zhihao Lin
Teralytics provides real-time insights into human behavior globally using data from 350 million profiles and 180 billion daily events. They have built a data science team in Singapore that develops one of their three products deployed worldwide. The presentation outlines how to build an effective data science team, including finding team members through diverse sources, evaluating them through a multi-stage interview process, convincing them to join by emphasizing the work, data, and team environment, and getting the team working cohesively through collaborative projects with clear goals and deadlines.
The document summarizes the IT transformation efforts at Oak Ridge National Laboratory from 2006-2008. It discusses consolidating IT staff, application transformation, and cyber security revitalization to create a unified user experience with common interfaces for employees, customers, and collaborators. It highlights goals of making enterprise data easier to access, facilitating cross-discipline collaboration, and mining data to create knowledge and predictability in research and development. The summary also briefly outlines lessons learned around executive support, training, and adopting a unified architecture.
The Other 99% of a Data Science Project by Eugene Mandel
Slides from my talk at Open Data Science Conference 2016.
Algorithms and models are an important (and cool) part of data science. This talk is about all the other steps that it takes to deploy a data science project that makes a product slightly smarter. Stuff that you hear from practitioners, but is not covered well enough in books.
The document discusses machine learning techniques for big data, including:
1) Various machine learning models like decision trees, linear models, neural networks and their assumptions.
2) Applications of machine learning like predictive modeling, clustering, personalization and optimization.
3) Key aspects of building machine learning systems like feature selection, model selection, evaluation and continuous adaptation.
This document outlines a data science enablement roadmap created by the Advanced Center of Excellence at Modern Renaissance Corporation. The roadmap consists of 1 introductory course and 3 advanced courses that can earn a student a master's level certificate in data science. The introductory course provides a broad overview of topics like algorithms, statistics, machine learning, and big data platforms. The advanced courses focus on specific skills like machine learning with R, modern data platforms using Hadoop, and advanced big data analytics techniques. The goal is to give students a versatile, practical skill set for a career in data science or big data engineering.
Most analytics modeling work today focuses on the production of single-purpose "artisanal" models for predictions. This approach to analytics is fragile with respect to model consistency, reorganization, and resource availability. This talk will argue that the focus of analytics modeling should instead be on producing interchangeable analytics parts, which can be combined in creative ways to produce a wide variety of analytics results. This "nuts and bolts" approach allows analytics groups to produce results in an agile way, where the time between ask and answer is determined by finding the right combination of analytics rather than by the modeling.
Dataiku - data driven nyc - april 2016 - the solitude of the data team m... by Dataiku
This document discusses the challenges faced by a data team manager named Hal in developing a data science software platform for his company. It describes Hal's background in technical fields like functional programming. It then outlines some of the disconnects Hal experienced in determining the appropriate technologies, hiring the right people, accessing needed data, and involving product teams. The document provides suggestions for how Hal can find solutions, such as taking a polyglot approach using open source technologies, creating an API culture, and focusing on solving big business problems to gain support.
Solve User Problems: Data Architecture for Humans by Mark Madsen
We are bombarded with stories of the latest products to hit the market – products that will change everything we do. This causes us to focus on the latest technology, building IT for the sake of building IT. Meanwhile, the world still seems to run on Excel.
The “big innovators” who have and use unimaginably large amounts of data are not the norm. Aspiring to use the same complex technologies and patterns they do leads to poor investments and tradeoffs. This is an age-old problem rooted in the over-emphasis of technology as the agent of change. Technology isn’t the answer – it’s the platform on which people build answers.
To emphasize technology is to ignore the way tools change people and practices. The design focus in our market was on storing and making data accessible. If we want to make progress then we need to step back from the details and look at data from the perspective of the organization. Our design focus shifts to people learning and applying new insights, asking questions about how an organization can be more resilient, more efficient, or faster to sense and respond to changing conditions.
In this talk you will learn how to put your data architecture into a human frame of reference. Drawing inspiration from the history of technology and urban planning, we will see that the services provided by the things we build are what drive success, not the latest shiny distraction.
Data scientists and IT push the limits of what's possible -- whether that's operating more efficiently, taking advantage of new opportunities, or innovating. Here are 5 ways businesses can boost their effectiveness.
For more: http://blog.tyronesystems.com/
Advances in technology for capturing information have led to the promise of “Big Data” to dramatically alter the business environment. However, technology is only an enabler of aggregation and analysis. Many firms struggle to convert information to business knowledge and insights. Learn how organizations are using data to improve skill development at all levels and developing models for organizational structures to link these skills to executive decision-making.
Speakers: Dan McGurrin, Ph.D., NC State and Pamela Webber, Cisco
Being the 1st Chief Data Officer of San Francisco City by TheFamily
Joy Bonaguro explains her work as CDO at San Francisco City: role definition, open data & data science argumentation, being valuable for the people, starting with a problem... Almost like creating a startup inside a public institution that has existed for about 200 years :)
Building a Data Platform Strata SF 2019 by Mark Madsen
Building a data lake involves more than installing Hadoop or putting data into AWS. The goal in most organizations is to build multi-use data infrastructure that is not subject to past constraints. This tutorial covers design assumptions, design principles, and how to approach the architecture and planning for multi-use data infrastructure in IT.
[This is a new, changed version of the presentation of the same title from last year's Strata]
A Hybrid Approach to Data Science Project Management by Elaine K. Lee
A talk about how Civis Analytics, a data science consultancy and software company, does project management using a blend of approaches from academia, consulting, and software engineering.
This document provides an introduction to the concepts of data science. It defines data science as an interdisciplinary field drawing from computer science, statistics, and application domains. The document outlines the typical workflow of a data scientist, including obtaining data, exploring it, cleaning it, performing analysis, drawing conclusions, and reporting results. It describes the focus areas of the course as mathematics, technology, visualization, and communication skills. The document emphasizes the importance of learning new skills independently and communicating results effectively to non-technical audiences.
Lean AI - Automate Routine, make space to innovate by Emerson Taymor
As a partnership between Philosophie and Studio VV6, we help companies think about how they can take advantage of machine learning to make more effective teams and organizations.
This document provides an overview of how to prepare for a career in data science. It discusses the author's own career path, which included degrees in bioinformatics and machine learning as well as jobs as a data scientist. It then outlines the typical data science workflow, including identifying problems, accessing and cleaning data, exploratory analysis, modeling, and deploying results. It emphasizes that data science is an iterative process and stresses the importance of communication skills. Finally, it discusses how data science fits within business contexts and the value of working on teams with complementary skills.
These slides were presented by Pauline Chow, Lead Instructor in Data Science & Analytics at General Assembly, for her talk at Data Science Pop Up LA on September 14, 2016.
Agile Data Science is a lean methodology that is adopted from Agile Software Development. At the core it centers around people, interactions, and building minimally viable products to ship fast and often to solicit customer feedback. In this presentation, I describe how this work was done in the past with examples. Get started today with our help by visiting http://www.alpinenow.com
Computer Applications and Systems - Workshop V by Raji Gogulapati
This document provides an overview of emerging technologies and their impact on businesses. It discusses how businesses are using new approaches like online collaborative communities and technologies to solve problems. It also covers topics like Enterprise 2.0, cloud computing, big data, analytics, social networking, collaboration tools, search engines, platforms, open source, e-learning and MOOCs. The document suggests that connectivity and data are driving new applications and experiences for consumers, and technologies are becoming the drivers of business success by enabling new ways of working and finding insights.
Innovation with big data: Chr. Hansen's experiences | Microsoft
In many places, Big Data is still new and unknown, and it is not a top priority for IT because "we don't have large volumes of data". But Big Data is much more than large volumes of data. At Chr. Hansen A/S, the Research and Development (Innovation) department has worked on the value of data and, as a result, established an interdisciplinary bioinformatics program built on Big Data technologies from Microsoft.
The world around us is changing. Data is embedded in everything, and users from all lines of business want to leverage this data to influence decisions. The trick is to create a culture for pervasive analytics and empower the business to use data everywhere.
The core enabling technology to make this happen is Apache Hadoop. By leveraging Hadoop, organizations of all sizes and across all industries are making business models more predictable, and creating significant competitive advantages using big data.
Join Cloudera and Forrester to learn:
- What we mean by pervasive analytics, how it impacts your organization, and how to get started
- How leading organizations are using pervasive analytics for competitive advantage
- How Cloudera’s extensive partner ecosystem complements your strategy, helping deliver results faster
Why Everything You Know About bigdata Is A Lie | Sunil Ranka
As a big data technologist, you can bet that you have heard it all: every crazy claim, myth, and outright lie about what big data is and what it isn't that you can imagine, and probably a few that you can't. If your company has a big data initiative or is considering one, you should be aware of these false statements and the reasons why they are wrong.
This document provides an introduction to data science and machine learning concepts. It discusses data analytics, machine learning, artificial intelligence, and deep learning. It introduces popular tools for data analytics like Python, Jupyter Notebook, R, and SAS. It also discusses key platforms in data science like Kaggle and DataScientists.net that host data science competitions and allow users to work on real-world datasets. The document provides examples of data analytics applications in different industries like media, healthcare, finance, and manufacturing. It also discusses concepts related to big data like the four V's of big data - volume, velocity, variety and veracity.
The document discusses how a UX team at AstraZeneca, a large biopharmaceutical company, used Lean UX principles to drive digital transformation. Over the past year, they built internal search and user experience competency centers, developed mobile apps, and ran usability labs. This helped them accelerate the adoption of new technologies, improve existing products, and establish modern design practices across the organization.
This document provides an introduction and overview of data science. It defines data science as the field that uses scientific processes and algorithms to extract knowledge and insights from data. It describes data scientists as applying machine learning to structured and unstructured data to build AI systems. The document outlines typical data science processes and discusses different types of data scientists, including those focused on humans and machines. It explains why data science is important for businesses to increase the value of their data and help with decisions, customers, and processes. Finally, it provides a demo of a data science application.
A brief introduction to data science that explains its concepts and algorithms: machine learning, supervised and unsupervised learning, clustering, statistics, data preprocessing, real-world applications, etc.
It's part of a Data Science Corner Campaign where I will be discussing the fundamentals of data science, AI/ML, statistics, etc.
This talk is an introduction to data science. It explains data science from two perspectives: as a profession and as a discipline. While covering the benefits of data science for business, it explains how to get started with embracing data science in business.
The document discusses an open approach to increasing customer retention and lifetime value through wearable devices and data. It introduces the speaker, Jeff Katz, and covers topics like the recent history of wearables being kept in a drawer, kindergarten lessons of sharing, and three big ideas - interoperability, data stewardship, and transparency. The presentation concludes by introducing Geeny, a platform for building compelling solutions through an open and transparent approach to wearable data and consumer choice.
Big Data: From Hindsight to Insight to Foresight | Sunil Ranka
When it comes to analytics and reporting, there is a fine line between hindsight, insight, and foresight. With the evolution of big data technology, there is a need to derive value from larger datasets that were not available in the past. Even before we start using the shiny new technologies, we need to understand what is categorized as reporting, business intelligence, or big data and analytics. In my experience, people struggle to distinguish between reporting, analytics, and business intelligence.
Introduction to RPA and Document Understanding | DianaGray10
Welcome Chapter Members to our November Meet-up. We are excited to invite you to learn about RPA and Document Understanding. In this session, we will be covering:
• Introduction to RPA
• Introduction to Document Understanding
• Introduction to AI center
Please bring other topics with you that you would like us to explore in this chapter. We are looking forward to meeting everyone. Invite your colleagues as well.
Bob Selfridge - Identify, Collect, and Act Upon Customer Interactions; Rinse,... | Julia Grosman
This document discusses building a customer intelligence practice using an agile process. It recommends starting with reference data and supplementing it with transactional and subjective data. The agile process involves continuously defining needs, designing specifications, developing solutions, and delivering working solutions to measure return on investment. Key aspects of the agile approach include collaboration over negotiation, responding quickly to change, valuing working solutions over documentation, and frequent delivery of software.
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data | Kiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Open Source Contributions to Postgres: The Basics POSETTE 2024 | ElizabethGarrettChri
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You... | Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai... | Kaxil Naik
Navigating today's data landscape isn't just about managing workflows; it's about strategically propelling your business forward. Apache Airflow has stood out as the benchmark in this arena, driving data orchestration forward since its early days. As we dive into the complexities of our current data-rich environment, where the sheer volume of information and its timely, accurate processing are crucial for AI and ML applications, the role of Airflow has never been more critical.
In my journey as the Senior Engineering Director and a pivotal member of Apache Airflow's Project Management Committee (PMC), I've witnessed Airflow transform data handling, making agility and insight the norm in an ever-evolving digital space. At Astronomer, our collaboration with leading AI & ML teams worldwide has not only tested but also proven Airflow's mettle in delivering data reliably and efficiently—data that now powers not just insights but core business functions.
This session is a deep dive into the essence of Airflow's success. We'll trace its evolution from a budding project to the backbone of data orchestration it is today, constantly adapting to meet the next wave of data challenges, including those brought on by Generative AI. It's this forward-thinking adaptability that keeps Airflow at the forefront of innovation, ready for whatever comes next.
The ever-growing demands of AI and ML applications have ushered in an era where sophisticated data management isn't a luxury—it's a necessity. Airflow's innate flexibility and scalability are what makes it indispensable in managing the intricate workflows of today, especially those involving Large Language Models (LLMs).
This talk isn't just a rundown of Airflow's features; it's about harnessing these capabilities to turn your data workflows into a strategic asset. Together, we'll explore how Airflow remains at the cutting edge of data orchestration, ensuring your organization is not just keeping pace but setting the pace in a data-driven future.
Session in https://budapestdata.hu/2024/04/kaxil-naik-astronomer-io/ | https://dataml24.sessionize.com/session/667627
End-to-end pipeline agility - Berlin Buzzwords 2024 | Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
The Ipsos - AI - Monitor 2024 Report.pdf | Social Samosa
According to the Ipsos AI Monitor 2024 report, 65% of Indians said that products and services using AI have profoundly changed their daily lives in the past 3-5 years.
1. **Introduction to Jio Cinema**:
- Brief overview of Jio Cinema as a streaming platform.
- Its significance in the Indian market.
- Introduction to retention and engagement strategies in the streaming industry.
2. **Understanding Retention and Engagement**:
- Define retention and engagement in the context of streaming platforms.
- Importance of retaining users in a competitive market.
- Key metrics used to measure retention and engagement.
3. **Jio Cinema's Content Strategy**:
- Analysis of the content library offered by Jio Cinema.
- Focus on exclusive content, originals, and partnerships.
- Catering to diverse audience preferences (regional, genre-specific, etc.).
- User-generated content and interactive features.
4. **Personalization and Recommendation Algorithms**:
- How Jio Cinema leverages user data for personalized recommendations.
- Algorithmic strategies for suggesting content based on user preferences, viewing history, and behavior.
- Dynamic content curation to keep users engaged.
5. **User Experience and Interface Design**:
- Evaluation of Jio Cinema's user interface (UI) and user experience (UX).
- Accessibility features and device compatibility.
- Seamless navigation and search functionality.
- Integration with other Jio services.
6. **Community Building and Social Features**:
- Strategies for fostering a sense of community among users.
- User reviews, ratings, and comments.
- Social sharing and engagement features.
- Interactive events and campaigns.
7. **Retention through Loyalty Programs and Incentives**:
- Overview of loyalty programs and rewards offered by Jio Cinema.
- Subscription plans and benefits.
- Promotional offers, discounts, and partnerships.
- Gamification elements to encourage continued usage.
8. **Customer Support and Feedback Mechanisms**:
- Analysis of Jio Cinema's customer support infrastructure.
- Channels for user feedback and suggestions.
- Handling of user complaints and queries.
- Continuous improvement based on user feedback.
9. **Multichannel Engagement Strategies**:
- Utilization of multiple channels for user engagement (email, push notifications, SMS, etc.).
- Targeted marketing campaigns and promotions.
- Cross-promotion with other Jio services and partnerships.
- Integration with social media platforms.
10. **Data Analytics and Iterative Improvement**:
- Role of data analytics in understanding user behavior and preferences.
- A/B testing and experimentation to optimize engagement strategies.
- Iterative improvement based on data-driven insights.
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
2. What is Lean AI?
• Current technology stacks are getting more and more powerful; there is a lot more we can do with legacy software and data
• We're thinking about a new perspective that affects the entire operation, from data to machine to people, and delivers more value from your existing systems
Philosophie & Studio VV6
3. Consider: What is the narrowest task you can automate?
5. Step 1: Data are facts, such as names or numbers. If sensors are collecting these, there are electronic impulses when something happens, when something moves.
Step 2: Information is slightly different in that it combines various data to say something that the data alone can’t say. For instance, data on our spending habits tell us about our financial behavior and about our patterns of expenditures; that is information, not just groups of unrelated numbers.
Source: Information and the Modern Corporation, James W Cortada, MIT Press
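The distinction between data and information on this slide can be illustrated with a minimal Python sketch. It is not from the presentation: the transaction records, the category names, and the helper functions `spending_by_category` and `largest_category` are hypothetical. Amounts are stored in cents to keep the arithmetic exact.

```python
from collections import defaultdict

# Hypothetical raw data: isolated facts, each meaningless on its own.
transactions = [
    {"category": "groceries", "amount_cents": 5420},
    {"category": "transport", "amount_cents": 1200},
    {"category": "groceries", "amount_cents": 3180},
    {"category": "dining", "amount_cents": 4500},
]


def spending_by_category(records):
    """Combine individual records into per-category totals."""
    totals = defaultdict(int)
    for record in records:
        totals[record["category"]] += record["amount_cents"]
    return dict(totals)


def largest_category(totals):
    """Return the category with the highest total spend."""
    return max(totals, key=totals.get)


# Information: a statement about spending patterns that no single
# transaction could make by itself.
totals = spending_by_category(transactions)
top = largest_category(totals)
print(f"Largest spending category: {top} ({totals[top] / 100:.2f})")
```

In this sketch the list of transactions plays the role of data, while the per-category totals and the largest category are the information derived by combining them.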
6. Step 3: Knowledge is more complicated than data or information because it combines data, information and experiences from logically connected groups of facts (such as budget data from a department) with things that have no direction or obvious connections (such as previous jobs and experiences).
Step 4: Then there is wisdom: the ability to make sense of data, information, and knowledge in ways that are relevant to the organization.
Source: Information and the Modern Corporation, James W Cortada, MIT Press
8. Data Utility
• An organization is a technological system comprised of hardware, software and people
• What can be done to improve data utility?
• Creating thicker data will improve an organization’s data utility and create exponential marginal utility for better decision making, productivity and faster innovation cycles
9. VV6: research, strategy, writing
Philosophie: product workshops, prototyping
Diagram labels (steps 0 to 2): Kernel; Writing a plan for what is ahead; Analog p/t system thinking, hopefully with leadership
10. VV6: research, strategy, writing
Philosophie: product workshop, prototyping
Diagram labels (steps 3 to 6, step 6 optional): Hypothesis & Human-Machine System Design; Digital Prototyping; Contextual & proprietary I.P.; Org’ Design & More
12. The 25% - 25% Rule
“We think your baseline expectation should be cost reductions of 25%, with an associated productivity increase of 25%. Based on where the current average is today (around 15%), and the productivity improvement seen by some solutions (up to 90%), this should be your achievable near-term rule of thumb for initial robotic process automation efforts.”
Malcolm Frank, Paul Roehrig & Ben Pring, “What to Do When Machines Do Everything.”
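To make the arithmetic of the rule of thumb concrete, here is a minimal Python sketch. It is an illustration, not something taken from the book or the slides: the function name `apply_rule_of_thumb`, the baseline figures, and the reading of "productivity" as units of output per year are all assumptions.

```python
def apply_rule_of_thumb(annual_cost, units_per_year,
                        cost_reduction=0.25, productivity_gain=0.25):
    """Apply a cost reduction and a productivity gain to a baseline process.

    Assumption: productivity is modeled as units of output per year,
    and both effects apply to the same process at the same time.
    """
    new_cost = annual_cost * (1 - cost_reduction)
    new_units = units_per_year * (1 + productivity_gain)
    return {
        "cost": new_cost,
        "units": new_units,
        "cost_per_unit_before": annual_cost / units_per_year,
        "cost_per_unit_after": new_cost / new_units,
    }


# Hypothetical baseline: a process costing 400,000 per year that
# handles 10,000 units of work.
result = apply_rule_of_thumb(annual_cost=400_000, units_per_year=10_000)
print(result)
```

Under these assumed figures, the annual cost falls from 400,000 to 300,000 while output rises from 10,000 to 12,500 units, so the cost per unit drops from 40 to 24. The two 25% effects compound to a 40% lower unit cost in this simplified model.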
13. Philosophie: Philosophie is an agile product consultancy with offices in LA, New York, San Francisco and Seattle.
VV6: VV6 is a New York based innovation consultancy and research studio operating between academia and industry.
We have worked together on a number of projects and are excited to share more about this unique and proprietary new offering.
Philosophie: philosophie.is, emerson@philosophie.is
VV6: vv6.co, x@vv6.co