Hiring data scientists doesn't make a data cultureGeorge Mount
Some organizations try to kick-start their data capacities by hiring loads of highly-trained data scientists or investing in machine learning infrastructures.
This will not be a successful strategy because it surpasses necessary steps in building a data culture. Organizations are better off up-skilling the talent they have.
Future of data science as a professionJose Quesada
How can you thrive in a future where machine learning has been popular for a few years already?
In this talk, I will give you actionable advice from my experience training serious data scientists at our retreat center in Berlin. You are going to face these pointy, hard questions:
- What is the promise of machine learning? Has it happened yet?
- Is it easy to take advance of machine learning, now that most algorithms are nicely packaged in APIs and libraries?
- How much time should I spend getting good at machine learning? Am I good enough now?
- Are data scientists going to be replaced by algorithms? Are we all?
- Is it easy to hire talent in machine learning after the explosion of MOOCs?
Big data & data science challenges and opportunitiesJose Quesada
Even when most companies see the advantages of using more data in their decisions, few actually do. Why is that? A few ideas on challenges and opportunities for (middle-size) companies. Talk audience was an engineering association, where most people represented engineering-centric companies in Germany (often in manufacturing).
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages JaunesDataiku
Presentation made at Rennes in January for the handsome BreizhJUG. This is a mixed presentation for big data technologies, which covers topics such as : Why Hadoop ? What next ? Machine Learning for big data in practice.
Hiring data scientists doesn't make a data cultureGeorge Mount
Some organizations try to kick-start their data capacities by hiring loads of highly-trained data scientists or investing in machine learning infrastructures.
This will not be a successful strategy because it surpasses necessary steps in building a data culture. Organizations are better off up-skilling the talent they have.
Future of data science as a professionJose Quesada
How can you thrive in a future where machine learning has been popular for a few years already?
In this talk, I will give you actionable advice from my experience training serious data scientists at our retreat center in Berlin. You are going to face these pointy, hard questions:
- What is the promise of machine learning? Has it happened yet?
- Is it easy to take advance of machine learning, now that most algorithms are nicely packaged in APIs and libraries?
- How much time should I spend getting good at machine learning? Am I good enough now?
- Are data scientists going to be replaced by algorithms? Are we all?
- Is it easy to hire talent in machine learning after the explosion of MOOCs?
Big data & data science challenges and opportunitiesJose Quesada
Even when most companies see the advantages of using more data in their decisions, few actually do. Why is that? A few ideas on challenges and opportunities for (middle-size) companies. Talk audience was an engineering association, where most people represented engineering-centric companies in Germany (often in manufacturing).
BreizhJUG - Janvier 2014 - Big Data - Dataiku - Pages JaunesDataiku
Presentation made at Rennes in January for the handsome BreizhJUG. This is a mixed presentation for big data technologies, which covers topics such as : Why Hadoop ? What next ? Machine Learning for big data in practice.
Idiots guide to setting up a data science teamAshish Bansal
Some nuggets of how I started the data science practice at Gale Partners on a budget. Presented at the Toronto Hadoop Users Group (THUG) in April, 2015.
Best practices in building machine learning models in Azure MLZeydy Ortiz, Ph. D.
Microsoft Azure ML Studio provides an easy-to-use interface to build and deploy machine learning models. However, the user must carefully select and configure the modules in order to derive meaningful results. In this presentation, I discuss a case study to highlight best practices in building machine learning models.
Hiring for Data Scientists - Data Science Pop-up SeattleDomino Data Lab
Tales from the other side.
What you might be missing if you don't know what you are looking for. Presented by Amanda Casari, Senior Data Scientist at Concur.
Today, we have data – lots of it. We can process information – in many ways. And with these two tools and a little bit of creativity, we are discovering the vast depths of human behavior and by extension, a way to accurately predict the future -- and our future happiness. In fact, we can quantify human movement, behaviors, desires, and even moods on a scale that wasn’t possible before a series of advances in processing power, developments in psychology and social network science, and most importantly, access to data.
In advertising, industry, and humanity, we have experienced the evolution from Web 1.0 (informational) to Web 2.0 (platform) to Web 3.0 (semantic) to elements of Web 4.0 (anticipatory) – In this anticipatory era, what can we dream of next? Beyond addressability and increasing ad relevance, how can businesses utilize these advances in product development and other market initiatives? Can we make the leap from inductive logic to human-paralleled intuition? Can this make up for our human brain mechanics that make predicting our own happiness so difficult?
In this talk we’ll cover the evolutions in data access, models for information processing, and the science of collaboration to see not only how they have been leveraged in businesses but also how they are used to better understand human behavior, and hopefully in the near future, a little bit of happiness.
Deep Learning Use Cases - Data Science Pop-up SeattleDomino Data Lab
Companies like Google, Microsoft, Amazon and Facebook are in fierce competition for teams that can build deep-learning applications. Because of deep learning's general usefulness in pattern recognition, those applications are surprisingly diverse, ranging from image recognition to machine translation. This talk will explore deep learning use cases for the major data types -- image, sound, text and time series -- as they're emerging in the private sector. Presented by Chris Nicholson, Co-Founder and CEO at Skymind.
Visualisation & Storytelling in Data Science & AnalyticsFelipe Rego
This is a presentation I put together for a talk I gave a while back on data visualisation and storytelling in the context of data science and analytics. The content was a produced from a mix of my own experience in industry and teaching at various schools and also from the work of Edward Tufte, Nathan Yau, Angela Zoss, Peter Beshai, Mike Bostock, among others. I wanted to highlight some basic concepts of what data visualisation is and what are some of the fundamental steps I believe are important in creating analytical dashboards and visualisations. It also contains a bunch of visualisation examples that I find interesting in telling a story with data.
Running Effective Controlled Experiments (aka A/B/n Tests) - Data Science Pop...Domino Data Lab
Microsoft's Analysis and Experimentation team has enabled many different feature teams to run controlled experiments in order to make feature ship/no-ship decisions. In this talk, I will review several case studies of experiments run by teams at Microsoft. I will highlight both the value that running controlled experiments has provided these teams as well as the challenges encountered in order to get trustworthy results from the controlled experiments. Presented by Brian Frasca, Partner Data Scientist Manager at
Microsoft.
Talk from the first O'Reilly Strata, Feb 2011. Learn how to leverage data exhaust, the digital byproduct of our online activities, to solve problems and discover insights about the world around you. We will walk through a real world example which combines several datasets and statistical techniques to discover insights and make predictions about attendees at O'Reilly Strata.
Includes a preview of some of the technology behind LinkedIn Skills, which I launched in a Keynote with DJ Patil the following day.
Video: http://blip.tv/oreilly-promos/distilling-data-exhaust-4780870
Ofer Ron, senior data scientist at LivePerson.
Recently, I've had the pleasure of presenting an introduction to Data Science and data driven products at DevconTLV
I focused this talk around the basic ideas of data science, not the technology used, since I thought that far too many times companies and developers rush to play around with "big data" related technologies, instead of figuring out what questions they want to answer, and whether these answers form a successful product.
Why is it difficult to achieve strategic differentiation using AIGanes Kesari
This deck was presented at ICC 2019, the ISACA Chennai Conference, by Ganes Kesari.
Session Abstract:
Today, every organization is trying to make their business AI-ready. However, industry reports claim that “80% of data science projects will not deliver value”. While the destination is clear, the path to be taken isn’t obvious.
Companies struggle with several questions: How to lay out a robust data science roadmap? What skills do you need to hire for? How do AI models translate to business value? How to sustain a culture of innovation and insight?
This session will answer these questions and show how organizations can apply AI at scale. It achieves the following learning objectives:
* How to align data science with the organization’s strategic objectives?
* How to setup and scale teams?
* What organizational structures help deliver business value with AI?
Fixing data science & Accelerating Artificial Super Intelligence DevelopmentManojKumarR41
This presentation discusses Challenges, Problems, Issues, Measures, Mistakes, Opportunities, Ideas, Technologies, Research and Visions around Data Science
HashGraph, Data Mesh, Data Trajectories, Citrix HDX and Anonos BigPrivacy
Combination of these 5 and few other ideas will ultimately lead us to the VGB Platform. Will soon come up with other document explaining the vision and how exactly work on the vision to gradually develop this Platform, which fixes Data Science Efforts Globally.
Idiots guide to setting up a data science teamAshish Bansal
Some nuggets of how I started the data science practice at Gale Partners on a budget. Presented at the Toronto Hadoop Users Group (THUG) in April, 2015.
Best practices in building machine learning models in Azure MLZeydy Ortiz, Ph. D.
Microsoft Azure ML Studio provides an easy-to-use interface to build and deploy machine learning models. However, the user must carefully select and configure the modules in order to derive meaningful results. In this presentation, I discuss a case study to highlight best practices in building machine learning models.
Hiring for Data Scientists - Data Science Pop-up SeattleDomino Data Lab
Tales from the other side.
What you might be missing if you don't know what you are looking for. Presented by Amanda Casari, Senior Data Scientist at Concur.
Today, we have data – lots of it. We can process information – in many ways. And with these two tools and a little bit of creativity, we are discovering the vast depths of human behavior and by extension, a way to accurately predict the future -- and our future happiness. In fact, we can quantify human movement, behaviors, desires, and even moods on a scale that wasn’t possible before a series of advances in processing power, developments in psychology and social network science, and most importantly, access to data.
In advertising, industry, and humanity, we have experienced the evolution from Web 1.0 (informational) to Web 2.0 (platform) to Web 3.0 (semantic) to elements of Web 4.0 (anticipatory) – In this anticipatory era, what can we dream of next? Beyond addressability and increasing ad relevance, how can businesses utilize these advances in product development and other market initiatives? Can we make the leap from inductive logic to human-paralleled intuition? Can this make up for our human brain mechanics that make predicting our own happiness so difficult?
In this talk we’ll cover the evolutions in data access, models for information processing, and the science of collaboration to see not only how they have been leveraged in businesses but also how they are used to better understand human behavior, and hopefully in the near future, a little bit of happiness.
Deep Learning Use Cases - Data Science Pop-up SeattleDomino Data Lab
Companies like Google, Microsoft, Amazon and Facebook are in fierce competition for teams that can build deep-learning applications. Because of deep learning's general usefulness in pattern recognition, those applications are surprisingly diverse, ranging from image recognition to machine translation. This talk will explore deep learning use cases for the major data types -- image, sound, text and time series -- as they're emerging in the private sector. Presented by Chris Nicholson, Co-Founder and CEO at Skymind.
Visualisation & Storytelling in Data Science & AnalyticsFelipe Rego
This is a presentation I put together for a talk I gave a while back on data visualisation and storytelling in the context of data science and analytics. The content was a produced from a mix of my own experience in industry and teaching at various schools and also from the work of Edward Tufte, Nathan Yau, Angela Zoss, Peter Beshai, Mike Bostock, among others. I wanted to highlight some basic concepts of what data visualisation is and what are some of the fundamental steps I believe are important in creating analytical dashboards and visualisations. It also contains a bunch of visualisation examples that I find interesting in telling a story with data.
Running Effective Controlled Experiments (aka A/B/n Tests) - Data Science Pop...Domino Data Lab
Microsoft's Analysis and Experimentation team has enabled many different feature teams to run controlled experiments in order to make feature ship/no-ship decisions. In this talk, I will review several case studies of experiments run by teams at Microsoft. I will highlight both the value that running controlled experiments has provided these teams as well as the challenges encountered in order to get trustworthy results from the controlled experiments. Presented by Brian Frasca, Partner Data Scientist Manager at
Microsoft.
Talk from the first O'Reilly Strata, Feb 2011. Learn how to leverage data exhaust, the digital byproduct of our online activities, to solve problems and discover insights about the world around you. We will walk through a real world example which combines several datasets and statistical techniques to discover insights and make predictions about attendees at O'Reilly Strata.
Includes a preview of some of the technology behind LinkedIn Skills, which I launched in a Keynote with DJ Patil the following day.
Video: http://blip.tv/oreilly-promos/distilling-data-exhaust-4780870
Ofer Ron, senior data scientist at LivePerson.
Recently, I've had the pleasure of presenting an introduction to Data Science and data driven products at DevconTLV
I focused this talk around the basic ideas of data science, not the technology used, since I thought that far too many times companies and developers rush to play around with "big data" related technologies, instead of figuring out what questions they want to answer, and whether these answers form a successful product.
Why is it difficult to achieve strategic differentiation using AIGanes Kesari
This deck was presented at ICC 2019, the ISACA Chennai Conference, by Ganes Kesari.
Session Abstract:
Today, every organization is trying to make their business AI-ready. However, industry reports claim that “80% of data science projects will not deliver value”. While the destination is clear, the path to be taken isn’t obvious.
Companies struggle with several questions: How to lay out a robust data science roadmap? What skills do you need to hire for? How do AI models translate to business value? How to sustain a culture of innovation and insight?
This session will answer these questions and show how organizations can apply AI at scale. It achieves the following learning objectives:
* How to align data science with the organization’s strategic objectives?
* How to setup and scale teams?
* What organizational structures help deliver business value with AI?
Fixing data science & Accelerating Artificial Super Intelligence DevelopmentManojKumarR41
This presentation discusses Challenges, Problems, Issues, Measures, Mistakes, Opportunities, Ideas, Technologies, Research and Visions around Data Science
HashGraph, Data Mesh, Data Trajectories, Citrix HDX and Anonos BigPrivacy
Combination of these 5 and few other ideas will ultimately lead us to the VGB Platform. Will soon come up with other document explaining the vision and how exactly work on the vision to gradually develop this Platform, which fixes Data Science Efforts Globally.
These are slides from Ellen Wagner\'s featured theme presentation Making Learning Analytics Matter in the Educational Enterprise from Blackboard World 2012, New Orleasn, LA, July 12, 2012
These slides were presented by Pauline Chow, Lead Instructor in Data Science & Analytics, General Assembly for her talk at Data Science Pop Up LA in September 14, 2016.
TechWise with Eric Kavanagh, Dr. Robin Bloor and Dr. Kirk Borne
Live Webcast on July 23, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=59d50a520542ee7ed00a0c38e8319b54
Analytical applications are everywhere these days, and for good reason. Organizations large and small are using analytics to better understand any aspect of their business: customers, processes, behaviors, even competitors. There are several critical success factors for using analytics effectively: 1) know which kind of apps make sense for your company; 2) figure out which data sets you can use, both internal and external; 3) determine optimal roles and responsibilities for your team; 4) identify where you need help, either by hiring new employees or using consultants 5) manage your program effectively over time.
Register for this episode of TechWise to learn from two of the most experienced analysts in the business: Dr. Robin Bloor, Chief Analyst of The Bloor Group, and Dr. Kirk Borne, Data Scientist, George Mason University. Each will provide their perspective on how companies can address each of the key success factors in building, refining and using analytics to improve their business. There will then be an extensive Q&A session in which attendees can ask detailed questions of our experts and get answers in real time. Registrants will also receive a consolidated deck of slides, not just from the main presenters, but also from a variety of software vendors who provide targeted solutions.
Visit InsideAnlaysis.com for more information.
Slides for a talk given at "The Conference Formerly Known as Conversion Hotel" in November 2019. Covers what data science is, what data scientists do, and how you can start learning data science skills.
Snowplow had our debut at the Data Science Festival in London this April. It was a good chance for us to engage with the data science community and learn more about the important work data scientists are doing and how Snowplow best can support this work. We definitely learned a lot and would like to thank everyone who made it by our booth for a chat.
Alex, Snowplow’s Co-Founder and CEO, held a talk on the topic “What makes an effective data team”. He took the well-known concept of Maslow’s Hierarchy of Needs and applied that to the needs of the data team.
The pioneers in the big data space have battle scars and have learnt many of the lessons in this report the hard way. But if you are a general manger & just embarking on the big data journey, you should now have what they call the 'second mover advantage’. My hope is that this report helps you better leverage your second mover advantage. The goal here is to shed some light on the people & process issues in building a central big data analytics function
In this talk, we introduce the Data Scientist role , differentiate investigative and operational analytics, and demonstrate a complete Data Science process using Python ecosystem tools, like IPython Notebook, Pandas, Matplotlib, NumPy, SciPy and Scikit-learn. We also touch the usage of Python in Big Data context, using Hadoop and Spark.
Learn to Use Databricks for Data ScienceDatabricks
Data scientists face numerous challenges throughout the data science workflow that hinder productivity. As organizations continue to become more data-driven, a collaborative environment is more critical than ever — one that provides easier access and visibility into the data, reports and dashboards built against the data, reproducibility, and insights uncovered within the data.. Join us to hear how Databricks’ open and collaborative platform simplifies data science by enabling you to run all types of analytics workloads, from data preparation to exploratory analysis and predictive analytics, at scale — all on one unified platform.
Similar to How to Build Data Science Teams that Deliver Business Value (20)
Project Management Careers in Data ScienceGanes Kesari
This slide deck was used in the presentation made to the Project Management Institute (PMI) Metrolina Chapter on September 28, 2022.
Title:
Top Data Science Career Opportunities For Project Managers
- Art of the Possible with Data Science
- Industry case studies
Data & Analytics 101 for PMs: Key disciplines and terminologies
- Top roles in data analytics
- Project Management in Data Science
Takeaways:
- How managers influence D&A project outcomes
- Key responsibilities & tips for success
- Industry examples: Challenges and learnings
Penn State Guest Lecture: Business Forecasting in Real LifeGanes Kesari
This deck was used in the guest lecture conducted at the Penn State College of Information Sciences and Technology. The session was delivered by Thanoj Kattamanchi and Ganes Kesari on November 30th, 2021.
Session Abstract:
What are the most common applications of forecasting techniques? This talk will share several industry case studies on forecasting. It will dive into the details of how a large agricultural conglomerate adopted price forecasting to achieve a revenue uplift of 3.2%. The session will cover the step-by-step approach adopted by the client, from business problem definition, data identification, variable selection, model evaluation, to deployment. It will conclude by sharing the top lessons learned and guidelines for aspiring data scientists considering a career in building machine learning applications.
500 startups cognitive bias in decision making - ganes kesari - nov 2021 - finalGanes Kesari
What are the most common cognitive biases that impact leaders? How can early-stage startups lay a foundation for data-driven decisions?
In this fireside chat for 500 Startups, I addressed startup founders on how to champion good data stewardship, embed it in product development, and ensure organizational ROI from data.
The session was conducted on November 10th, 2021.
This deck was used in the guest lecture delivered by Ganes at the Rutgers Business School, New Jersey, on July 7th, 2021. These slides were supplemented with a live business case that was used for the class discussion.
5 Steps to Transform into a Data-Driven Organization - Ganes Kesari - Gramen...Ganes Kesari
This session was presented on May 27th, 2021, in a Webinar organized by Gramener.
https://info.gramener.com/5-steps-to-transform-into-data-driven-organization
Session Details:
Today, organizations struggle to get value from data despite significant investments. Did you know that there's one factor that influences the outcomes of all your data initiatives?
This webinar will highlight how an organization's data maturity influences its performance. It will show how you can assess your data maturity and plan the five steps for data-driven business transformation.
Pain points we would be discussing:
Most organizations stagnate midway in their data journey.
Gartner says that over 87% of organizations in the industry are at lower levels of data maturity (levels 1 and 2 on a scale of 5).
Just doing more data science projects will not improve your capabilities or outcomes. The fact is that the top challenges reported by CDOs fall into five common areas.
This webinar will show what they are and how you can tackle them.
Who should attend
- Executives, Chief Data/Analytics Officers, Technology leaders, Business heads, Managers
What Will You Learn?
- What is data science maturity, and why does it matter?
- How do you assess data science maturity and limitations of the assessment?
- How can data science maturity help your organization level up (explained with an example)?
How AI can help you make your Audience Sit up and take NoticeGanes Kesari
This session was delivered in the IABC World Conference 2020, on June 15.
https://wc.iabc.com/sessions/
Session Abstract:
How do you cut through the clutter and connect with your audience? Can data make your message more credible? Can analytics help you craft compelling content? How can AI help understand your audience’s response? Today there is an over-abundance of information and a paucity of meaningful communication. Data and analytics can be invaluable aids in communication.
Details:
This session will cover four ways to enable this:
Sourcing data to develop your idea: An idea becomes credible when backed by data. But where do you find the data? We’ll see how to get creative with public data sources.
Taking the help of AI to build your narrative: AI needn’t be terrifying. With the open tools available, it can be a powerful aid in discovering insights and building a narrative
Adopting storytelling to craft your message: A set of simple yet powerful principles can transform ordinary messages into powerful stories.
Using content analytics to establish a feedback loop: Analyze your content and audience response in the form of text, video, and audio to get vital clues for improvement.
This session will have live polls for audience feedback and there will be short exercises to absorb the content from each module.
You’ll learn:
How to find data to support your idea.
Which AI tools and techniques can help develop your idea.
How to establish a content analytics feedback loop to learn and improve.
'Recession-proofing' your Business with DataGanes Kesari
This session was presented on May 7th 2020, in a Webinar organized by Gramener.
https://info.gramener.com/recession-proofing-your-business-with-data
COVID-19 has disrupted every industry and precipitated a recession. With the virus still in the early stages of progression, the only certainty is that the pains to the global economy will be prolonged.
Is your business ready for the long haul? Data is your best ally to navigate the crisis and come out stronger. This webinar will show you how.
What will I learn?
Which areas of your business can benefit most with a data-driven response.
A framework to identify use cases that will deliver the biggest bang for the buck.
How to identify new market opportunities and customers through creative approaches with data.
AGENDA:
- Relevance of data in the current crisis
- How data science can help you stay prepared to navigate the recession
- Industry case studies from Gramener's work to help clients respond to COVID-19
What's the Value of Data Science for Organizations: Tips for Invincibility in...Ganes Kesari
This session was delivered as an Open Colloquium on Apr 30th 2020 for the Master in Information program students. It was organized by the Rutgers School of Communication & Information.
The session covers 3 themes:
- How do enterprises and not-for-profit organizations gain value from data science?
- What are the biggest challenges in data science that professionals are unaware of? How can students translate that into learnings, to make themselves indispensable in the industry
- What's the impact of COVID-19 and the recession on data science industry? How will the data jobs be impacted?
How Brands can use AI for Actionable Customer IntelligenceGanes Kesari
This session was delivered on Apr 20th 2020 at the Rutgers Business School. This was a Guest Lecture for the "AI in Marketing" class.
Session Brief:
Today, customers interact with brands continuously, either intentionally or indirectly. They do so on a number of channels, and leave a variety of digital footprints. Unfortunately, enterprises miss out on this opportunity to understand and connect with the customers. This session will show how brands can leverage data and technology to understand their customers.
Brands can harvest both structured and unstructured data from diverse channels. With the help of data analytics, they can build an integrated view of the customer. By overlaying the content with context, they can map every conversation on the customer journey. By stitching all these insights together, brands can use storytelling to help drive the right business decisions.
Transform your Brand's Customer Experience by using AIGanes Kesari
This deck was used in the presentation at the Concentric 2020 conference, on Jan 24th, 2020.
SESSION BRIEF:
Most organizations struggle to understand their customers. In a digital world, the power of data and technology is not tapped into. So, customer advocacy initiatives are short-sighted, transactional and unscientific. This session will show how organizations can listen to customers better. It will introduce you to AI techniques that are revolutionizing customer intelligence.
With examples, you will understand how these insights can be translated into business decisions.
LEARNING OBJECTIVES:
1. You will learn creative ways to find what customers are looking for right now
2. You will be introduced to advanced analytics techniques and storytelling with data
3. You will understand how data science can be applied for business decisions
These are the slides from the Gramener webinar conducted on 16-Jan-2020.
- What skills & roles will help you deliver your analytics and data visualization projects?
- What skills do most teams miss to hire for?
In a Gartner survey, CIOs reported 'team skills' as their biggest barrier ⚠️ to data science. They have trouble deciding the skill mix ⚗️needed or in finding the right people for the job.
This webinar will show the skills and roles you must plan for. You will learn how to tailor this based on your organization's data maturity. It will help you decide whether to upskill teams or hire externally. The session will show you how and where to find talent.
Throughout the webinar you will learn:
- Critical skills & roles needed in your data science team?
- Tips for data science hiring. What aspirants should know about the jobs?
- Insights presented using real-world examples
How Data Science can help Understand your Customers BetterGanes Kesari
This was presented at the AT&T Retirees Quarterly ELATE Luncheon, on Sep 18th, 2019.
SESSION BRIEF:
Most organizations struggle to understand their customers. In a digital world, the power of data and technology is not tapped into. So, customer advocacy initiatives are short-sighted, transactional and unscientific. This session will show how organizations can listen to customers better. It will introduce you to AI techniques that are revolutionizing customer intelligence.
With examples, you will understand how these insights can be translated into business decisions.
LEARNING OBJECTIVES:
1. You will learn creative ways to find what customers are looking for right now
2. You will be introduced to advanced analytics techniques and storytelling with data
3. You will understand how data science can be applied for business decisions
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Analysis insight about a Flyball dog competition team's performanceroli9797
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Adjusting OpenMP PageRank : SHORT REPORT / NOTESSubhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
Learn SQL from basic queries to Advance queriesmanishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfEnterprise Wired
In this guide, we'll explore the key considerations and features to look for when choosing a Trusted analytics platform that meets your organization's needs and delivers actionable intelligence you can trust.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs.This meetup was formerly Milvus Meetup, and is sponsored by Zilliz maintainers of Milvus.
The Building Blocks of QuestDB, a Time Series Databasejavier ramirez
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps just as another data type. However, when performing real-time analytics, timestamps should be first class citizens and we need rich time semantics to get the most out of our data. We also need to deal with ever growing datasets while keeping performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, or faster batch ingestion.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
How to Build Data Science Teams that Deliver Business Value
1. 1
Photo by Anna Samoylova on Unsplash
How to Build Data Science Teams that
Deliver Business Value
Ganes Kesari
Gramener
2. 2
INTRODUCTION
Ganes Kesari
Co-founder & Head
of Analytics
“Simplify Data
Science for all”
100+ Clients
Insights as Stories
Help apply & adopt
Analytics
Our data science platform,
Gramex is now open-sourced!
3. 3
“ 80% of analytics insights will not deliver business
outcomes through 2022
- Gartner, Jan 2019
TODAY, WE ARE IN THE AGE OF AI, BUT..
8. 8
THE 3 STAGES TO ORGANIZATIONAL DATA SCIENCE MATURITY
Kazia89636 on Flickr
9. 9
• Start from the top
• Align with strategic priorities
• Start as a central team
• Build an initial data science
roadmap
OrgSizeSpecialization&
“Experiment”
PHASE 1: “EXPERIMENT”
10. 10
IDENTIFYING YOUR FIRST PROJECT
Stakeholder
groups
Objectives Initiatives Questions Data
have a set of that can be met by which answer specific using
for that meet that can address suggests
Business driven approach
Data driven approach
Importance
Ease
Quick wins
Strategic
Deferred
Revenue impact
Breadth of usage
Effort reduction
Data availability
Technology feasibility
Prioritised initiatives
Output
Gap in current reports
Addressed by current reports
1
2
11. 11
WHAT TOOLS DO YOU NEED?
Alteryx
Amazon EC2
Azure ML
BigQuery
Birst
Caffe
Cassandra
Cloud Compute
Cloudera
Cognos
CouchDB
D3
Decision tree
ElasticSearch
Excel
Gephi
ggplot2
Hadoop
HP Vertica
IBM Watson
Impala
Julia
Jupyter Notebook
Kafka
Kibana
Kinesis
Lambda
Leaflet
Logstash
MapR
MapReduce
Matplotlib
Microstrategy
MongoDB
NodeXL
Pandas
Pentaho
Pivotal
PowerPoint
Power BI
Qlikview
R
R Studio
Random Forest
Redis
Redshift
Regression
Revolution R
S3
SAP Hana
SAS
Spark
Spotfire
SPSS
SQL Server
Stanford NLP
Storm
SVM
Tableau
TensorFlow
Teradata
Theano
Thrift
Torch
Weka
Word2Vec
The tool does not matter. A person’s skill with the tool does.
Pick the person. Let them pick the tool.
12. 12
WHAT SKILLS DO YOU NEED?
X
• Start with Generalists
• Breadth, more than depth
• Strong business alignment
• Curiosity
• Learn on the go
https://hackernoon.com/how-not-to-hire-your-first-data-scientist-34f0f56f81ae
13. 13
“ Start where you are. Use what you have. Do what
you can
- Arthur Ashe
15. 15
PHASE 2: “EXPAND”
• Transition from pilots to
projects
• Map deeper with chosen
Business units
• Extend relationship with
sponsors
• Expand your data footprint
Time
OrgSizeSpecialization&
- Generalists with many broad skills
- Teams organized by initiatives
- “Just get it done”
“Experiment” “Expand”
16. 16
WHAT’S HOLDING BACK COMPANIES AT THIS STAGE?
https://www.oreilly.com/data/free/ai-adoption-in-the-enterprise.csp
17. 17
WHAT’S THE SKILL NEEDED?
Deep
Domain Expertise
Information
Design &
Presentation
Deep
Programming
Statistics & Machine
Learning
Data Bootstrap
Skills
Passion for data Data handling
Design basics
Domain awareness
24. 24
PHASE 2: “EXPAND”
• Start planning skill-based
teams
• Evolve your execution model
• “Introduce processes, but
stay nimble”
Time
OrgSizeSpecialization&
- Generalists with many broad skills
- Teams organized by initiatives
- “Just get it done”
“Experiment” “Expand”
• Map deeper with chosen
Business units
• Extend relationship with
sponsors
• Expand your data footprint
• Bring in the specialists
25. 25
“If you only do things where you know the answer
in advance, your company goes away.
- Jeff Bezos, Amazon
27. 27
PHASE 3: “INTEGRATE”
“Expand” “Integrate”
Time
OrgSizeSpecialization&
- Start adding few specialists
- Sprouting of skill-based teams
- “Introduce process, but stay nimble”
(Pic/Icons Credits: José-Manuel Benito/wiki, OpenClipart-Vectors/27440 images, pngtree.com, ClipartPanda.com, macrovector/Freepik)
• Analytics business-as-usual
• Diffused org-wide
• Enable continuous change
• Experimentation labs
• ”Excel at mature execution”
28. 28Source: Peter Skomoroch – Why machine learning is harder than you think, Strata London 2019
AMAZON, GOOGLE, MICROSOFT ARE REORGANIZING THEMSELVES AROUND AI
29. 29
UNCOVER YOUR DARK DATA
Source: http://www.patrickcheesman.com/dark-data-problems-and-solutions/
• INACCESSIBLE data (e.g. technology is outdated)
• FORGOTTEN data (e.g. collected, but not actively used)
• UNCOLLECTED data (e.g. information exists, not digitized)
• SINGLE PURPOSE data (e.g. used for a specific purpose)
30. 30
Data is the product
IT is the key enabler
Tools are the differentiator
Insight is the product
Analysis is the key enabler
Skills are the differentiator
Action is the product
Culture is the key enabler
Process is the differentiator
INCULCATE A DATA CULTURE
Data Science for Business, by Foster Provost & Tom Fawcett, O’Reilly
31. 31
“ Perfection is not attainable, but if we chase
perfection, we can catch excellence
- Vince Lombardi
33. 33
WRAP-UP
“Experiment” “Expand” “Integrate”
Time
OrgSizeSpecialization&
- Generalists with many broad skills
- Teams organized by initiatives
- “Just get it done”
- Start adding few specialists
- Sprouting of skill-based teams
- “Introduce process, but stay nimble”
- Mature business units with specialists
- Data Culture
- Data science diffused org-wide
- “Excel at mature execution”