Gramener collaborated with Nasscom to conduct an online masterclass session on "Storytelling With Data." Gramener's CEO, S Anand, led the masterclass and shared some important slides on how to make data stories and how to drive storytelling.
The slides talk about the structure of data stories and how to find meaningful insights from data. There are real-world examples of data analysis and visualizations we created at Gramener to communicate insights as stories.
This is an ultimate guide on data storytelling that offers tips to create data stories, things to keep in mind while making storylines, and choosing designs to make a design-led data story.
Know more about Gramener's data storytelling workshop for analysts and data scientists at https://gramener.com/data-storytelling-workshop
Why an AI-Powered Data Catalog Tool is Critical to Business Success, by Informatica
Imagine a fast, more efficient business thriving on trusted data-driven decisions. An intelligent data catalog can help your organization discover, organize, and inventory all data assets across the org and democratize data with the right balance of governance and flexibility. Informatica's data catalog tools are powered by AI and can automate tedious data management tasks and offer immediate recommendations based on derived business intelligence. We offer data catalog workshops globally. Visit Informatica.com to attend one near you.
The promise of self-service analytics asserts that business users should be empowered to make data-driven decisions quickly without having to involve the analytics team, while critics say that it could lead to faulty choices. In this presentation we’ll cover topics such as acknowledging diverse customer needs, choosing the right tools, understanding the pitfalls, and considering the future of self-service analytics. And cake.
Learn about the basic decisions required for business document scanning: indexing, file formats, document resolution, color space, and more. Learn about estimating volumes and automated capture technology such as barcode recognition, OCR, batch document processing, and more.
Recommended for CDOs and all Data & Analytics Managers
The past 2 years have had a huge impact on organizations' journeys to become data driven. Existing data architectures were disrupted, rigid structures and processes were questioned, and many data strategies were rewritten.
On the one hand, the global pandemic emphasized the need for organizations to raise the bar, implement strategies, improve data literacy and culture, increase investments in data and analytics, and explore AI opportunities.
On the other, it also presented new challenges such as: the war for data talent and the wide literacy gap. Inadequate structures as well as outdated processes were exposed. Major changes in the data landscape (Data Fabric, Data Mesh, Transition to Data Clouds) will further disrupt existing data architectures and enhance the need for a new adaptive architecture and organization.
Holt-Winters forecasting lets users smooth a time series and use the smoothed data to forecast selected areas. Exponential smoothing assigns exponentially decreasing weights to historical data, so more recent observations carry more weight in the forecast than older ones. The right augmented analytics tool provides a user-friendly application of this method and allows business users to leverage this powerful technique.
Tackling data quality problems requires more than a series of tactical, one-off improvement projects. By their nature, many data quality problems extend across and often beyond an organization. Addressing these issues requires a holistic architectural approach combining people, process, and technology. Join Donna Burbank and Nigel Turner as they provide practical ways to control data quality issues in your organization.
This is a presentation from a meetup called "Business of Data Science". Data science is being leveraged extensively in Banking and Financial Services, and this presentation gives a brief, fundamentals-level overview of this evergreen field.
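To make the weighting idea concrete, here is a minimal pure-Python sketch of simple exponential smoothing and an additive Holt-Winters forecast. The function names, the initialization scheme, and the parameters `alpha`, `beta`, and `gamma` are illustrative assumptions, not taken from the presentation or from any particular analytics product.

```python
def exponential_weights(alpha, n):
    """Weights on the n most recent observations, newest first.

    Each step back in time multiplies the weight by (1 - alpha).
    """
    return [alpha * (1 - alpha) ** k for k in range(n)]


def simple_exponential_smoothing(series, alpha):
    """Smooth a series; the first value seeds the level (an assumption)."""
    level = series[0]
    smoothed = [level]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: level + trend + seasonal component.

    Initialization (assumed): level is the mean of the first season,
    trend is the average per-step change between the first two seasons,
    and seasonals are the first season's deviations from the level.
    """
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons of data")
    first = series[:season_len]
    second = series[season_len:2 * season_len]
    level = sum(first) / season_len
    trend = (sum(second) - sum(first)) / season_len ** 2
    seasonals = [x - level for x in first]

    for t, x in enumerate(series):
        s = seasonals[t % season_len]
        prev_level = level
        level = alpha * (x - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % season_len] = gamma * (x - level) + (1 - gamma) * s

    n = len(series)
    return [
        level + h * trend + seasonals[(n + h - 1) % season_len]
        for h in range(1, horizon + 1)
    ]
```

With `alpha = 0.5`, the three most recent observations get weights 0.5, 0.25, and 0.125, which is the "recent data counts more" behavior described above.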
Slides used for a presentation to introduce the field of business analytics. Covers what BA is, how it is a part of business intelligence, and what areas make up BA.
YouTube Link : https://www.youtube.com/watch?v=RPhNwjyLQes
Intellipaat Data Analytics training course: https://intellipaat.com/data-analytics-master-training-course/
Attract High Value Publicity - Be Seen on TV, Radio, Podcasts, Print & Blogs, by Dale Thomas Vaughn
My media system has earned hundreds of millions of impressions for my clients on major media outlets. Imagine yourself in your ideal audience's most trusted news, radio, podcast or television... what are the possibilities for you if you were that well-known?
To schedule a strategy session with me to find out your best next steps, visit dalethomasvaughn.com/media-training
I help social entrepreneurs get seen and become well-known experts in their industry.
If you're tired of seeing someone else with less expertise in your field get all the media attention, I am going to teach you the secrets to build your profile, pitch your story, and create a campaign to become the go-to expert in your field.
I'm a big believer in teaching people how to fish, rather than simply catching fish and selling them to you (which is what publicists do). Most small companies and nonprofits can't afford and frankly don't need a publicist or PR firm - they just need the tools and the confidence to try to book themselves. That's what I love to do for entrepreneurs and nonprofit leaders. Take the skills in-house and you'll have the ability to get media attention forever going forward.
by Dale Thomas Vaughn
Webinar - The Golden Key to Successful Grant Requests - 2018-05-10, by TechSoup
This presentation will help nonprofits uncover the most effective methods for documenting the need statement, as well as ways to use that information to engage the reader.
MozCon Virtual - Surviving the Covid News Agenda and What It Means for the Fu..., by Shannon McGuirk
For the past 18 months, my mindset has been firmly set in ‘survival mode’ due to the trials and tribulations that the global pandemic has brought. I will open up about my learnings, sharing insight into how Aira’s digital PR team was able to pivot their link-building activity for clients on a hairpin while navigating an oversaturated news agenda and facing pressure from clients over return on their investment.
Using ‘survival mode’ experiences, I share clear tactics setting the standard on how to future-proof your digital PR and link building. These tactics will show you how to adapt to the ever-changing news landscape and how to improve your processes from ideation through to outreach.
6 Methods to Improve Your Manufacturing Process with Computer Vision, by Gramener
Computer vision is a technology that enables computers to interpret and comprehend visual information from their surroundings, and it has the potential to transform the manufacturing industry. Manufacturers can improve their processes in a variety of ways by using computer vision, from ensuring quality control and optimizing production to inspecting and measuring products and monitoring machinery.
In this presentation you will find 6 ways to improve your manufacturing process with computer vision.
Download our E-book
bit.ly/ebookcomputervision
Detecting Manufacturing Defects with Computer Vision, by Gramener
Computer vision is the field of artificial intelligence that deals with the ability of computers to interpret and understand visual data from the world around them. In the manufacturing industry, computer vision can be used to detect defects in products as they are being produced. This can help to improve the quality of the final product and reduce the cost of rework or recalls.
In this presentation you will find out the use of computer vision for defect detection in manufacturing which aids in improving the efficiency and effectiveness of the production process, leading to higher quality products and lower costs.
Book a discovery call
https://reachus.gramener.com/damage-detection/
How to Identify the Right Key Opinion Leaders (KOLs) in Pharma & Healthcare, by Gramener
Find out the importance of KOLs (Key Opinion leaders) in the Pharma industry and everything you need to know about them.
In the presentation, we will show you who is a KOL in the Pharmaceutical Industry, what role they play and how to identify the right KOLs.
Book a free demo
https://gramener.com/demorequest/
Automated Barcode Generation System in Manufacturing, by Gramener
Find out how automating barcode generation can improve the efficiency of your company's operations.
In the presentation, we will show you how barcodes play a significant role in enabling accurate inventory control and real-time stock information and how businesses can reduce 67% of their time in handling label standards.
Get a Free BarGen Demo.
https://gramener.com/demorequest/
#barcode #lowcode
The Role of Technology to Save Biodiversity, by Gramener
Find out about the major challenges biodiversity is facing, such as deforestation, species endangerment, and poaching.
In the presentation, we will show you how some of the major technology and nature conservation organizations are building innovative solutions to protect our biodiversity.
Download this E-book to know how geospatial AI is impacting biodiversity conservation and sustainable development.
https://info.gramener.com/geospatial-analytics-ai-solutions-esg-sector-ebook
Enable Storytelling with Power BI & Comicgen Plugin, by Gramener
Gramener’s Lead Data Consultant Mrinal Ghosh and Principal Information Designer Richie Lionell conducted an exciting webinar on Power BI Comicgen.
In this webinar, they talk about the Comicgen Power BI plugin and how to use it to generate compelling comic data stories.
Who should watch: If you're a Power BI Developer, Consultant, or anyone who often develops data graphics on the Power BI dashboard.
Full webinar link: https://info.gramener.com/storytelling-with-power-bi-and-comicgen-plugin
Would you like to learn more about our Power BI capabilities? Check out: https://gramener.com/power-bi-consulting/
The Most Effective Method For Selecting Data Science Projects, by Gramener
Ganes Kesari, Gramener's Head of Analytics & Co-Founder, gives his insights on how to craft a data science roadmap that maximizes ROI.
The biggest reason why 80% of analytics projects fail is that they don’t solve the right problem. Asking an analytics or data-related question is the worst way to initiate a data analytics project.
This webinar will walk you through how to get started in the most efficient way possible. You'll discover a straightforward step-by-step strategy to unlocking corporate value through industry examples.
Things you will learn from this webinar:
-The most common reasons for the failure of data science initiatives
-Identifying projects and prioritizing them
-Building a data science strategy in three easy steps
-Real-life examples are used to explain the approach
Watch this full webinar on: https://info.gramener.com/data-science-roadmap
To know more from our industry experts book a free demo at: https://gramener.com/demorequest/
Low Code Platform To Build Data & AI Products, by Gramener
Gramener's CEO, Anand S, conducted this webinar, where he explained how to build Data and AI products using a low-code platform in less than two weeks.
A few takeaways:
-How low-code approaches can be tailored to your data/digital needs
-Decisions on Building vs. Buying
-Production-ready use cases to stimulate your thinking
Who should watch?
You will find this webinar to be valuable if you're a CPO, VP IT, handling product development, or building analytical solutions for your company.
Watch this full webinar on: https://info.gramener.com/low-code-platform-to-build-process-optimization-solutions?
Want to know more about our low-code platform, Gramex?
Visit: https://gramener.com/gramex/
5 Key Foundations To Build An Effective CX Program, by Gramener
Gramener's VP of Analytics Amit Garg hosted this webinar and talked about the principles of a good customer experience program and why it is important.
This webinar will be beneficial to leaders in the CMO, CCO, Customer Service, and any other customer-facing departments within a firm.
Pain points discussed:
-You'll be able to assess the level of CX maturity in your company.
-You'll learn the high-level steps to creating a successful CX program.
-You'll figure out what tools you'll need to improve your talents.
To watch the full webinar visit: https://info.gramener.com/5-key-foundations-effective-cx-program
Learn more about CX Analytics: https://gramener.com/customer-experience-analytics/
Using Power BI To Improve Media Buying & Ad Performance, by Gramener
Gramener's Senior Lead Data Consultant, Sidharth Parameswaran, and Navya Sri Channamsetty, Gramener's Associate Lead Data Science Engineer conducted this joint webinar session.
Pain points discussed:
-Actual vs. planned results of a campaign
-Competitor Evaluation & Comparison
-Modeling of Media Mix
-Metrics assessed across the Agency, Client, and Brand levels
- Genre/Channel Performance Evaluation
Things you will learn:
1. Power BI may be used in a variety of ways to investigate findings.
2. Various dashboards would be used to analyze ad/program performance.
3. How can you help your clients obtain higher ROI and acquire a competitive advantage?
Do join us if you are a:
Power BI Developer, Media buyer, Campaign Manager, Brand Manager, Consultant, etc.
To watch this webinar visit: https://info.gramener.com/power-bi-media-buying-ad-performance
Learn more about Gramener: https://gramener.com/
This webinar was hosted by Gramener's CEO/Co-Founder, Anand S, and Ganes Kesari, Head of Analytics/Co-Founder on how data can help firms recover quickly throughout the recession and recovery period.
Who should watch this webinar :
Analytics Leaders, Business Leaders, CDOs, CTOs, etc.
Few takeaways :
-Which aspects of your company could benefit the most from a data-driven response?
-A strategy for identifying use cases that will provide the most value for the money.
-How to use data in creative ways to uncover new market opportunities and customers.
Objectives :
-Data's utility in COVID situation
-How data science may assist you in navigating the recession
-Gramener's industry case studies to assist businesses in responding to COVID-19
Full Webinar: https://info.gramener.com/recession-proofing-your-business-with-data
To know more from industry leaders visit our official website: https://gramener.com/
Engage Your Audience With PowerPoint Decks: Webinar, by Gramener
Gramener's CEO and Co-Founder Anand S hosted a webinar on how interactive PowerPoint decks can engage your audiences.
Pain points discussed in this webinar :
-How to utilize interactive slides to answer business questions like "Where is the problem?" and "What created this problem?"
-What forms of interactivity does PowerPoint offer, and when should you utilize each?
-What tools and plug-ins can aid in the creation of interactive presentations?
Watch the full webinar on: https://info.gramener.com/interactive-powerpoint-for-operations
Book a free demo to know more about Gramener's solutions: https://gramener.com/demorequest/
Structure Your Data Science Teams For Best Outcomes, by Gramener
Gramener's Head of Analytics, Ganes Kesari conducted this webinar and discussed the following points :
-Why do data analytics and visualization initiatives require teams to work in silos?
-What are the best organizational structures for data science?
-As your data journey progresses, how should the organizational structure evolve?
-Best methods for encouraging team collaboration in data projects
This is a unique webinar designed for Executives, Chief Analytics Officers, Heads of Analytics, Directors, Technology Leaders, and Managers that work with data science teams on a daily basis.
To check out the full webinar visit: https://info.gramener.com/data-science-teams-structure-for-best-outcomes
To contact us & book a free demo visit: https://gramener.com/demorequest/
Gramener's Lead Data Scientist Soumya Ranjan and Senior Data Science Engineer Sumedh Ghatage conducted a webinar on Geospatial AI.
In this webinar, they discussed the technical know-how to get started, as well as some strategies for navigating this fascinating realm of Geospatial Analytics.
Pain points covered :
-How to begin with Geospatial Analytics in Python
-How can large-scale geospatial datasets be cleaned and analyzed?
-What is the best way to design geospatial workflows?
-How to use Geospatial Datasets for Deep Learning?
Whatever industry you're in, Geospatial Analytics will provide you with a wealth of unique solutions.
To watch the full webinar visit: https://info.gramener.com/geospatial-ai-technical-sneak-peek
To know more about Gramener's Geospatial AI solutions book a free demo on: https://gramener.com/demorequest/
5 Steps To Become A Data-Driven Organization : Webinar, by Gramener
Gramener's Chief Data Scientist and Co-founder Ganes Kesari conducted an interesting webinar that will give you an idea of how to analyze your data maturity and plan the five steps to transforming your business using data.
Who should watch this webinar?
Executives, Chief Data/Analytics Officers, Technology leaders, Business heads, Directors, and Managers.
Important points discussed on the webinar:
-The majority of businesses reach a halt in the middle of their data journey.
-According to Gartner, approximately 87% of companies in the business have a poor degree of data maturity (levels 1 and 2 on a scale of 5).
-Adding more data science projects to your portfolio will not boost your talents or results. The truth is that CDOs' primary issues are divided into five categories.
Learnings from this webinar:
-Data Science Maturity. What is it and why is it important?
-How can you determine the maturity of data science and its limitations?
-How does data science maturity (described with an example) assist your business in progressing?
Watch the full webinar on:
https://info.gramener.com/5-steps-to-transform-into-data-driven-organization
To know more about Data Maturity visit:
https://gramener.com/data-maturity/#
5 Steps To Measure ROI On Your Data Science Initiatives - Webinar, by Gramener
Gramener's Chief Decision Scientist & Co-Founder Ganes Kesari conducted an exciting webinar on how to measure ROI on your data science initiatives.
C-suite executives (CEOs, COOs), directors, and managers from various industries joined this webinar.
Ganes Kesari covered the following points with industry examples:
-Identifying business use cases with a high impact
-Choosing effective success indicators
-Ascertaining that the consequences may be traced back to your data project
The attendees had a good time. Learnings from the webinar:
-Why do businesses struggle to get a return on their data investments?
-A straightforward framework for calculating the return on investment from your data projects
-Benchmarking of typical payback from data initiatives in the industry
To check out the complete recording of the webinar please visit:
https://info.gramener.com/5-steps-to-measure-roi-on-your-data-science-initiatives
To know more about data advisory check out:
https://gramener.com/advisory-consulting/
Saving Lives with Geospatial AI - Pycon Indonesia 2020, by Gramener
There’s a powerful way to fight dengue. Infect a mosquito with Wolbachia, release it in highly populated regions, and wait for it to infect all mosquitoes in the region.
But this process is expensive, and we need to release it in the most densely populated regions in a city.
And no one really knows what population density is at a 100m x 100m level.
Can we use satellite imagery and use this to identify building density?
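As a rough illustration of the question above, the sketch below aggregates a binary building mask into a density value per 100m x 100m cell. It assumes the mask has already been produced from satellite imagery by some segmentation step that is not shown; the function name `building_density` and its parameters are hypothetical and not taken from the talk.

```python
def building_density(mask, pixel_size_m, cell_size_m=100):
    """Fraction of building pixels in each grid cell.

    mask: 2D list of 0/1 values (1 = building pixel), assumed to come
          from a separate segmentation model.
    pixel_size_m: ground size of one pixel in meters.
    cell_size_m: side length of each output cell in meters.
    """
    cell_px = max(1, int(round(cell_size_m / pixel_size_m)))
    rows, cols = len(mask), len(mask[0])
    grid = []
    for r0 in range(0, rows, cell_px):
        grid_row = []
        for c0 in range(0, cols, cell_px):
            # Collect the pixels of this cell; edge cells may be smaller.
            cell = [
                mask[r][c]
                for r in range(r0, min(r0 + cell_px, rows))
                for c in range(c0, min(c0 + cell_px, cols))
            ]
            grid_row.append(sum(cell) / len(cell))
        grid.append(grid_row)
    return grid
```

Cells with the highest density could then be ranked as candidate release sites, though how well building density tracks population density is a separate question.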
Driving Transformation in Industries with Artificial Intelligence (AI), by Gramener
Gramener's Director of Delivery, Priyaranjan Mohanty, delivered a virtual session at IIM Nagpur and talked about how organizations are moving towards digital transformation by leveraging advanced technologies such as Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning.
Starting from what the industry leaders think about digital transformation to how it can shape the global economy, this presentation explores the scope of AI in the digital world.
The Art of Storytelling Using Data Science, by Gramener
Gramener's VP - Sales, APAC Region, Vijayam Sirikonda interacted with the students of IIM Raipur and talked about the importance of data storytelling for business users.
Storyfying your Data: How to go from Data to Insights to Stories, by Gramener
Gramener's Director - Client success, Shravan Kumar A, delivered an online session to the students of Praxis Business School.
In his session he talked about how converting data into stories can benefit businesses and enable quick decision making. Furthermore, he shared approaches to create data stories along with some use cases and case studies we solved at Gramener to benefit our clients.
Check out our initiative to teach data storytelling to data scientists and analysts so that they can think out of the box and create wonderful data stories for their stakeholders: https://gramener.com/data-storytelling-workshop
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf, by Enterprise Wired
In this guide, we'll explore the key considerations and features to look for when choosing a Trusted analytics platform that meets your organization's needs and delivers actionable intelligence you can trust.
Adjusting OpenMP PageRank : SHORT REPORT / NOTES, by Subhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
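For readers unfamiliar with the computation being parallelized, here is a sequential Python sketch of PageRank by power iteration. It is not the report's OpenMP code: the function name, the uniform redistribution of dead-end rank, and the default parameters are assumptions made for illustration. The per-iteration scaling and summation steps are the kind of element-wise work that such primitives would cover.

```python
def pagerank(out_edges, damping=0.85, tol=1e-10, max_iter=500):
    """Pull-based PageRank on a graph given as a list of out-neighbor lists."""
    n = len(out_edges)
    in_edges = [[] for _ in range(n)]
    for u, targets in enumerate(out_edges):
        for v in targets:
            in_edges[v].append(u)
    out_deg = [len(targets) for targets in out_edges]

    ranks = [1.0 / n] * n
    for _ in range(max_iter):
        # Scale step: each vertex splits its rank evenly over its out-edges.
        contrib = [
            ranks[u] / out_deg[u] if out_deg[u] else 0.0 for u in range(n)
        ]
        # Assumed dead-end handling: spread their rank over all vertices.
        dangling = sum(ranks[u] for u in range(n) if out_deg[u] == 0)
        base = (1 - damping) / n + damping * dangling / n
        # Sum step: each vertex pulls contributions from its in-neighbors.
        new_ranks = [
            base + damping * sum(contrib[u] for u in in_edges[v])
            for v in range(n)
        ]
        error = sum(abs(a - b) for a, b in zip(new_ranks, ranks))
        ranks = new_ranks
        if error < tol:
            break
    return ranks
```

In a three-vertex cycle every vertex ends up with rank 1/3, which makes a convenient sanity check.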
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ..., by Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
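The decomposition step described in the abstract can be sketched as follows: find strongly connected components, then assign each component a level in the resulting block-graph so that levels can be processed in topological order. This is only an illustrative sketch using a recursive Tarjan's algorithm, suitable for small graphs; the function names are hypothetical, and the per-level rank computation is not shown.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. graph maps each vertex to a list of out-neighbors.

    Components are returned in reverse topological order of the block-graph.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            components.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return components


def component_levels(graph):
    """Group components by level; level 0 has no incoming block-graph edges."""
    comps = strongly_connected_components(graph)
    comp_of = {v: i for i, comp in enumerate(comps) for v in comp}
    level = [0] * len(comps)
    # Reversed Tarjan order is a topological order of the block-graph.
    for i in reversed(range(len(comps))):
        for u in comps[i]:
            for v in graph[u]:
                j = comp_of[v]
                if j != i:
                    level[j] = max(level[j], level[i] + 1)
    by_level = {}
    for i, comp in enumerate(comps):
        by_level.setdefault(level[i], []).append(sorted(comp))
    return [sorted(by_level[k]) for k in sorted(by_level)]
```

For the graph `{0: [1], 1: [0, 2], 2: [3], 3: [2], 4: [3]}`, the components `{0, 1}` and `{4}` land on level 0 and `{2, 3}` on level 1, so ranks for level 0 can be finalized before level 1 is touched.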
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake, by Walaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
Adjusting primitives for graph : SHORT REPORT / NOTES, by Subhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
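The Compressed Sparse Row (CSR) representation mentioned in this report's summary can be illustrated with a short Python sketch. The function names and the edge-list input format are assumptions for illustration; the report's own data structures are not shown in the text.

```python
def build_csr(num_vertices, edges):
    """Build CSR arrays from a list of (source, target) edges.

    offsets[u]..offsets[u + 1] delimits the slice of `targets`
    holding the out-neighbors of vertex u.
    """
    offsets = [0] * (num_vertices + 1)
    for u, _ in edges:
        offsets[u + 1] += 1
    # Prefix sum turns per-vertex counts into start positions.
    for i in range(num_vertices):
        offsets[i + 1] += offsets[i]

    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot for each vertex
    for u, v in edges:
        targets[cursor[u]] = v
        cursor[u] += 1
    return offsets, targets


def neighbors(offsets, targets, u):
    """Out-neighbors of u, read as one contiguous slice."""
    return targets[offsets[u]:offsets[u + 1]]
```

Because each vertex's neighbors sit in one contiguous slice, vector-style operations such as multiply and sum can scan them sequentially.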
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will present on related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf, by GetInData
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by the AI market leaders, such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). On the other hand, there is a growth in interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach for LLMs context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built upon these three concepts a robust Data Copilot that can help to democratize access to company data assets and boost performance of everyone working with data platforms.
Why do we need yet another (open-source ) Copilot?
How can we build one?
Architecture and evaluation
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
2. Data storytelling is a critical skill for data scientists, analysts & managers
But analysts present their work, not their message
Data scientists present their analysis – what they did,
and what they found. That’s not what the audience
needs.
Audiences need a message that tells them what to do,
and why. Told in an engaging way. As a story.
Share your data & analysis as data stories
Whenever you share inferences from data – whether
it’s as a presentation, or an email or document with
your analysis, or as a dashboard – craft it as a story.
This workshop will teach you the techniques of how to
convert an analysis into a memorable story – even if
you’ve never told a story before.
Storytelling has a 30X Return on Investment
Rob Walker and Joshua Glenn auctioned common
items like mugs, golf balls, toys, etc. The item
descriptions were stories purpose-written by 200+
contributing writers.
Items that were bought for $250 sold for over $8,000 –
a return of over 3,000% for storytelling!
Stories are memorable and viral
People remember stories. They’ll act on them.
People share stories. That enables collective action.
We analyze data to improve people’s decision making.
For this to be effective, data stories are needed more
than ever before.
3. With the growth of self-service BI, 85% of companies have lost track of how many dashboards they generated
BUT 3 THINGS ARE UNCLEAR ON MOST DASHBOARDS
• What QUESTION does the dashboard answer?
• Is the ANSWER evident from the dashboard?
• What ACTION should the user take now?
5. This is a dataset (1975 – 1990) that
has been around for several years and
has been studied extensively. Yet, a
visualization can reveal patterns that
are neither obvious nor well known.
For example,
• Are birthdays uniformly distributed?
• Do doctors or parents exercise the C-section option to move dates?
• Is there any day of the month that has unusually high or low births?
• Are there any months with relatively high or low births?
Very high births in September.
But this is fairly well known. Most
conceptions happen during the
winter holiday season
Relatively few births during the
Christmas and Thanksgiving
holidays, as well as New Year
and Independence Day.
Most people prefer not
to have children on the
13th of any month, given
that it’s an unlucky day
Some special days like April
Fool’s day are avoided, but
Valentine’s Day is quite
popular
More births Fewer births … on average, for each day of the year (from 1975 to 1990)
Let’s look at 15 years of US Birth Data
6. The pattern in India is quite different
This is a birth date dataset that’s
obtained from school admission data
for over 10 million children. When we
compare this with births in the US, we
see none of the same patterns.
For example,
• Is there an aversion to the 13th or is there a local cultural nuance?
• Are holidays avoided for births?
• Which months have a higher propensity for births, and why?
• Are there any patterns not found in the US data?
Very few children are born in the
month of August, and thereafter.
Most births are concentrated in
the first half of the year
We see a large number of
children born on the 5th, 10th,
15th, 20th and 25th of each month
– that is, round numbered dates
Such round-numbered patterns are a
typical indication of fraud. Here,
birthdates are brought forward to
aid early school admission.
More births Fewer births … on average, for each day of the year (from 2007 to 2013)
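The round-number check described on this slide can be sketched in a few lines of Python. This is only an illustration, not the analysis behind the slide: the `records` list is made-up sample data, and the `round_day_share` helper and the `ROUND_DAYS` set are hypothetical names chosen for this sketch.

```python
from collections import Counter
from datetime import date

# "Round" days of the month, as listed in the slide text (an assumption of this sketch).
ROUND_DAYS = {5, 10, 15, 20, 25}

def round_day_share(birth_dates):
    """Return the fraction of birth dates that fall on a round-numbered day."""
    counts = Counter(d.day for d in birth_dates)
    total = sum(counts.values())
    on_round = sum(n for day, n in counts.items() if day in ROUND_DAYS)
    return on_round / total if total else 0.0

# If births were spread evenly over a 30-day month, 5 of 30 days
# (roughly 17%) would be round days.
EXPECTED_SHARE = len(ROUND_DAYS) / 30

# Made-up sample, for illustration only.
records = [date(2010, 1, 5), date(2010, 2, 10), date(2010, 3, 15),
           date(2010, 4, 7), date(2010, 5, 20), date(2010, 6, 25)]

share = round_day_share(records)
print(f"{share:.0%} on round days vs about {EXPECTED_SHARE:.0%} expected")
```

A share far above the evenly spread baseline is the kind of signal the slide points to. On real data the baseline would need to account for month lengths.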
7. This adversely impacts children’s marks
It’s a well-established fact that older
children tend to do better at school in
most activities. Since many children
have had their birth dates brought
forward, these younger children suffer.
Children “born” on the 1st, 5th, 10th, 15th, etc. of
the month tend to score lower marks on average.
Higher marks Lower marks … on average, for children born on a given day of the year (from 2007 to 2013)
Children “born” on round numbered days score lower marks on average,
due to a higher proportion of younger children
8. You have data.
You have analysis.
Now what?
Understanding the audience & intent
Finding insights
Storylining
Designing data stories
10. DO IT: Who might be an audience for your
analysis?
• Look back at your recent analytics project.
• Who do you know that can use this analysis?
(Come up with real or hypothetical personas)
CHECK IT: Verify these yourself
Is there a name for the individual?
Was the role specific enough? (Head of sales
instead of just executive)
The same data analysis can be relevant to
many people. Each group is called a persona.
• The trends in sales data for an organization are
relevant for a CEO, head of sales, region leads,
individual sales team members & every
employee.
• The analysis of polio cases in UP is relevant to
the Minister of health, polio campaign manager,
field workers, NGOs, journalists & general
public.
This section will dive deeper into defining a persona
Define your audience: they determine the story
11. DO IT: Start with your own hypothesis
• Pick one of the personas you had listed earlier.
• What problems do you think your persona is facing?
• How do you feel the persona will use the analysis?
• Frame it as a user scenario.
CHECK IT: Verify user scenario with a partner
Is it framed as “As a [persona], I’m in [situation]
where I face [problem], leading to
[consequence]. Solving it by [action] leads to
[impact]”
Would the persona relate to this user scenario if
they heard it?
List scenario(s) for each persona
For each persona, answer the following questions:
1. What situation are they currently in?
2. What problems do they face?
3. What is the consequence?
4. What action can they take using your analysis?
5. What is the impact of this action?
Combine these as a user scenario:
“As a [persona], I’m in [situation] where I face [problem], leading
to [consequence]. Solving it by [action] leads to [impact]”
• John: As a Marketing manager, I have to create a region-wise
budget for the next quarter. I don’t know which regions give the
highest ROI, so my spend isn’t optimized. Solving it by prioritizing
regions will lead to maximum ROI.
Clear needs & a future scenario lead to effective communication.
Know your audience’s needs: that helps align the message
Reference: SPIN Selling by Neil Rackham
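The user scenario template above can be kept as a small structure so that no part is left out. A minimal sketch: the `UserScenario` class and its field names are hypothetical, and the values for John paraphrase the example on this slide.

```python
from dataclasses import dataclass

@dataclass
class UserScenario:
    """One field per question in the slide's checklist."""
    persona: str
    situation: str
    problem: str
    consequence: str
    action: str
    impact: str

    def as_sentence(self) -> str:
        # Fill the slide's template with the six answers.
        return (f"As a {self.persona}, I'm in {self.situation} where I face "
                f"{self.problem}, leading to {self.consequence}. "
                f"Solving it by {self.action} leads to {self.impact}.")

# Paraphrase of the John example; the wording is illustrative.
john = UserScenario(
    persona="marketing manager",
    situation="budget planning for the next quarter",
    problem="not knowing which regions give the highest ROI",
    consequence="unoptimized spend",
    action="prioritizing regions",
    impact="maximum ROI",
)
print(john.as_sentence())
```

Because every field is required, a scenario with a missing consequence or impact fails at construction rather than producing a vague sentence.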
13. Insights must be Big, Useful, and Surprising
Filter the analyses using these as a checklist
IS THE INSIGHT
BIG
IS THE INSIGHT
USEFUL
IS THE INSIGHT
SURPRISING
The analysis must, of course, be statistically significant.
But it should also be numerically significant.
We want a result that substantially changes the outcome.
What should the audience do after hearing the insight?
Can they take an action that improves their objective?
Even if it’s informational, what should they do next?
Is this something they didn’t know? Is it non-obvious?
Does it overturn a domain-driven belief or a gut feel?
Or does it bring consensus to a group with divided opinion?
14. Marking each analysis as Big, Useful or Surprising (High, Medium, Low)
Only those that are high or medium on all aspects are insights
Insight | Big | Useful | Surprising
Twice as many Detractors talk about our Product’s ease of use | Low | Medium | High
Typing with capitalization in a credit application indicates creditworthiness | Low | Low | High
Almost 20% of all voice search queries are triggered by just 25 words | Low | High | Medium
More engaged employees have fewer accidents | Low | High | Low
About 50% of American small businesses do not have a website | High | Medium | Low
The recommendation system influences about 80% of content streamed on Netflix | High | Low | Low
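The "high or medium on all aspects" rule can be applied mechanically once each analysis is rated. A minimal sketch: the `is_insight` helper and the `RANK` mapping are hypothetical names, and the last entry in `analyses` is a made-up row added only to show a passing case.

```python
# Order the three rating levels so they can be compared.
RANK = {"Low": 0, "Medium": 1, "High": 2}

def is_insight(big, useful, surprising, minimum="Medium"):
    """True when every aspect is rated at least `minimum`."""
    return all(RANK[r] >= RANK[minimum] for r in (big, useful, surprising))

analyses = [
    ("Almost 20% of all voice search queries are triggered by just 25 words",
     "Low", "High", "Medium"),
    ("More engaged employees have fewer accidents",
     "Low", "High", "Low"),
    # Hypothetical entry, not from the slide.
    ("Hypothetical: one region drives half of all returns",
     "High", "Medium", "High"),
]

insights = [text for text, *ratings in analyses if is_insight(*ratings)]
print(insights)
```

Note that every analysis in the slide's table has at least one Low rating, so none of them would pass this filter as written.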
16. A business storyline
• Our NPS improved 6%
• It was 34% in 4Q18. Now it’s at 40% in 2Q19
• Despite lower satisfaction with our Support,
our NPS grew
• This increase in NPS was mainly due to better
Product Quality & Research
Gladiator’s storyline
• The Emperor asks General Maximus to take
control of Rome and give it back to the people
• The ambitious Prince murders the emperor.
• Maximus is sold as a gladiator slave. His family
is murdered
• Maximus grows famous, fights the Prince in the
arena, and wins
• He joins his family in death. Rome is in the
hands of the people
Outlines are the backbone on which you flesh out your story.
This section explains how to create storylines
Storylines are plot outlines. They summarize the entire story
Notice “characters” in red. All stories
have characters, human or otherwise.
17. DO IT: Write your takeaway as one sentence
What’s the one thing you want the audience to
remember from your story?
What’s the one message that the audience
should take away?
CHECK IT: Verify these yourself
Is it a single, complete sentence?
Does it deliver what you want the audience to
remember?
Will your audience care a lot about this?
Close your eyes. Think of a childhood tale.
Summarize the moral of the story in one line
We easily remember these stories and their
summary as a moral several years later.
Close your eyes. Think of a business
presentation from last week. Can you easily
summarize the message in one line?
Stories are designed around a moral. A single
takeaway. An “elevator pitch”
It’s a one-sentence summary of the most important message for the audience.
1. Start with the takeaway (The elevator pitch. The moral of the story.)
18. DO IT: Write your supporting analysis
1. List all possible analyses
2. Re-word them as sentences
3. Strike off what’s not relevant
CHECK IT: Verify these yourself
Is each necessary? Does each analysis
support the takeaway?
Are they sufficient? Do the analyses prove
the takeaway?
What supports your takeaway from
“The Lion and the Mouse”?
http://www.read.gov/aesop/007.html
The lion was an Asiatic lion
The lion had a huge paw
The lion spared the mouse it caught
The lion was caught by a hunter’s net
It was stalking its prey when it got caught
The mouse was nibbling grass nearby
The mouse took a few minutes to cut the net
Only include analysis that proves the takeaway.
Ensure that they fully prove the takeaway.
2. Find analysis that supports your takeaway. Ignore irrelevant content
There’s no right or wrong answer. Think
about how it supports your takeaway.
19. 3. Convert analysis into messages by adding context
DO IT: Add context to your analysis
1. Take each relevant analysis
2. Convert it to a message for the audience by
adding context
CHECK IT: Verify these yourself
Will your audience understand the messages
without explanation?
Will your audience understand why this
message is relevant?
Analysis doesn’t mean anything to people. When
it does, it’s a message. We do this by adding
context. Three ways to add context are:
1. Compare with similar numbers.
Our $15 mn sales is $3 mn more than last
year, $1 mn below budget, and twice our
nearest competitors.
2. Explain with analogies.
If we stopped producing, it would take 3 months to
dispose of our excess inventory of $2 mn.
3. Add business interpretation.
Usage is correlated with discounts. For every
$1 discount, customer LTV increases by $24.
Frame each analysis as a message that the audience will understand and find relevant
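The "compare with similar numbers" technique is simple arithmetic that can be scripted so the comparisons stay consistent. A minimal sketch: only the $15 mn sales figure and the three differences come from the slide; the reference values are back-calculated from those differences, and `compare` is a hypothetical helper.

```python
def compare(value, reference):
    """Return the absolute and relative difference of value against reference."""
    diff = value - reference
    return diff, diff / reference

sales = 15.0       # $ mn, from the slide
last_year = 12.0   # implied by "$3 mn more than last year"
budget = 16.0      # implied by "$1 mn below budget"
competitor = 7.5   # implied by "twice our nearest competitors"

vs_last_year, growth = compare(sales, last_year)
vs_budget, _ = compare(sales, budget)
multiple = sales / competitor

message = (f"Our ${sales:g} mn sales is ${vs_last_year:g} mn more than last year "
           f"({growth:.0%} growth), ${abs(vs_budget):g} mn below budget, "
           f"and {multiple:g}x our nearest competitor.")
print(message)
```

Each comparison gives the audience a yardstick for the same number: history, plan, and peers.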
20. 4. Structure the messages into a pyramid or a tree
Example of a business tree
Launch sales were 30% less than target due to
high competition
• Launch sales were projected at $20 mn in the
first month, but achieved only $14 mn
o Sales in every region were 20-50% lower.
o Only Philippines & Korea were on target
• Competitors discounted price by 35%, which
is unsustainable for them
o Discounts at 80 stores increased from 15% to 35%
o The maximum sustainable discount is 20%
• Stores that offered higher discounts saw less than
20% of our target sales
Construct a pyramid or tree-like outline
• Start with the takeaway at the root of the tree
• Add a message that supports the takeaway
• Add further details or supporting messages
• Messages must prove the first message, and
only the first message
• Strike off any message that isn’t required to
prove or support the takeaway
• Add next message that supports takeaway
• Add details to prove the second message
• Remaining messages for the takeaway
• Add details as required
Arrange messages hierarchically to prove & support the parent message
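The pyramid can be held as a small tree, with the takeaway at the root and each supporting message as a child. A minimal sketch: the `Message` class and `outline` function are hypothetical names, and the tree reuses part of the launch sales example from this slide.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    text: str
    children: list = field(default_factory=list)  # supporting Message objects

def outline(node, depth=0):
    """Return the tree as indented outline lines, parent before children."""
    lines = ["  " * depth + "- " + node.text]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines

takeaway = Message(
    "Launch sales were 30% less than target due to high competition",
    [
        Message(
            "Launch sales were projected at $20 mn in the first month, "
            "but achieved only $14 mn",
            [
                Message("Sales in every region were 20-50% lower"),
                Message("Only Philippines & Korea were on target"),
            ],
        ),
        Message("Competitors discounted price by 35%"),
    ],
)

print("\n".join(outline(takeaway)))

# The root claim should follow from the details: (20 - 14) / 20 = 30%.
shortfall = (20 - 14) / 20
print(f"Shortfall vs target: {shortfall:.0%}")
```

Keeping the structure explicit makes the slide's check easy to run by eye: every child should prove its parent, and anything that does not can be struck off.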
21. 5. Re-order the messages to increase memorability and motivation
Order messages into an emotionally contrasting,
motivating sequence.
Take this aspects-based flow:
• Our profits doubled. But our sales only grew
20%. Our gross margins stayed flat.
The “emotional arc” is falling,
and not motivating.
Here’s the same message re-ordered:
• Our sales grew mildly at 20%. Our margins
didn’t improve at all. But our profits doubled!
This emotional arc falls before
rising. This is more motivating.
Structure your supporting messages into a
memorable flow. Here are 7 flows that help:
1. Time: e.g. Past, Present, Future
Sales were $15 mn. Now they’re $18 mn. We
expect them to grow to $20 mn.
2. Place: e.g. NA, EU, APAC
3. Aspects: e.g. company, competition, context
4. Benefits: e.g. better, faster, cheaper
5. Scale: e.g. local, regional, global
6. Balance: e.g. pros, cons
7. Priority or climactic: least to most important
Remember: Emotional contrast requires bad news – it makes good look better
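The re-ordering in the profits example can be illustrated by scoring each message and moving the best news to the end. This is a rough sketch, not a method from the workshop: the scores are subjective values assumed for illustration, and `end_on_high` and `arc` are hypothetical helpers.

```python
def end_on_high(messages):
    """Move the most positive message to the end, keeping the rest in order."""
    best = max(messages, key=lambda m: m[1])
    rest = [m for m in messages if m is not best]
    return rest + [best]

def arc(messages):
    """The emotional arc is just the sequence of scores."""
    return [score for _, score in messages]

# (message, assumed emotion score); higher means better news.
original = [
    ("Our profits doubled.", 2),
    ("Our sales grew only 20%.", 1),
    ("Our gross margins stayed flat.", 0),
]

reordered = end_on_high(original)
print(arc(original))
print(arc(reordered))
print(" ".join(text for text, _ in reordered))
```

The original arc only falls, while the re-ordered arc dips and then ends on its highest point, which matches the sequence the slide recommends.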
23. Visually representing data helps us to see patterns in the data quickly
• It’s hard to find patterns & derive insights from
raw data
• Statistics can summarize data, but may hide
patterns in how the data is spread
• We use visual encoding techniques to map
data to visual attributes
24. How the data should be interpreted decides the type of chart to be used
https://gramener.github.io/visual-vocabulary-vega/
• Deviation
• Correlation
• Ranking
• Distribution
• Change-over-Time
• Magnitude
• Part-to-Whole
• Spatial
• Flow
25. We use visual design cues to support our annotations & message
• Pre-attentive processing drives our attention
towards certain elements more than others.
• We can leverage these to highlight aspects
of the chart that are relevant to our story.
• For example, when listing a set of countries, if the
relevant insight is for one country, we can
make it stand out.
1. Position is the most powerful encoding.
The eye and brain are naturally wired to detect
misalignment of the smallest order.
2. Colour, when used in context, is powerful.
We can detect minuscule changes or variations in colour
when comparing an element with neighbouring elements.
This is what makes true colour (32-bit colour, i.e. about 4
billion colours) a necessity in computer graphics.
3. Size is a useful differentiator.
The eye can detect moderate size variations at
moderate distances. Size also has a natural
interpretation: that of priority.
4. Several other encodings are possible.
Aesthetics such as angle, shadows, shapes, patterns,
density, labelling, enclosures, etc. can each be used
to map data.
26. DO IT: What can you understand from the
chart shown next?
Look at the chart that will be shown next.
List everything you can understand from it as points.
CHECK IT: Verify these yourself
How many aspects did you notice from the
final list of observations?
• Meaning or message behind a chart isn’t
always obvious.
• The same chart can be interpreted in several
ways by your audience.
• You must guide your audience to see the
message you want to show.
Your audience may not understand what you meant to show
27. Class Xth English Marks Distribution
[Chart: number of students (0 to 20,000) against marks scored (0 to 100)]
28. 4 types of annotations help the audience understand your intent
[Chart: # students (0 to 20,000) against Marks (0 to 100)]
Summarize the chart in its title
Don’t describe the chart.
Don’t write the question to answer.
Write the answer itself. Like a headline.
Example: Teachers add marks to stop some students from failing
Explain the chart
How should the user read it?
What do you say when you talk through it?
Explain what the visual is. Then the axes. Then
its contents. Then the inference.
Example: This chart shows Class 10 students’ English marks in Tamil Nadu, India, in
2011. The X-axis has the mark a student has scored. The Y-axis has the # of
students who scored that mark.
This is a bell curve. But the spike at 35 (the mark at which students pass) is
unusual. Teachers must be adding marks to some of the students who are
likely to fail by a small margin.
Highlight essential elements
What should the user focus their eyes on?
Point it out.
Interpret what they’re seeing – in words.
Example: What’s unusual: a large number of students score
exactly 35 marks. Few (but not 0) students score
between 30-35.
Recommend an action
How should I act on this?
You need to change what the audience does.
(Otherwise, you made no difference.)
Example: Only some students get this benefit.
Identify a fair policy that will be applied consistently.
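The element to highlight in this chart, the spike at the pass mark, can also be found programmatically. A minimal sketch under stated assumptions: the `counts` values are made up to mimic the shape described on the slide (they are not the Tamil Nadu data), and `find_spikes` with its `ratio` threshold is a hypothetical helper.

```python
def find_spikes(counts, ratio=2.0):
    """Return marks whose count is at least `ratio` times the mean of both neighbours."""
    spikes = []
    marks = sorted(counts)
    # Slide a window of three consecutive marks over the distribution.
    for prev, mark, nxt in zip(marks, marks[1:], marks[2:]):
        neighbour_mean = (counts[prev] + counts[nxt]) / 2
        if neighbour_mean > 0 and counts[mark] >= ratio * neighbour_mean:
            spikes.append(mark)
    return spikes

# Made-up student counts around the pass mark of 35, for illustration only.
counts = {32: 900, 33: 700, 34: 500, 35: 14000, 36: 6000, 37: 6200, 38: 6400}

print(find_spikes(counts))
```

The detected mark is what the annotation should point at. The words explaining why it is unusual still have to come from the analyst.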
29. We can apply the same principles to the chart & surrounding elements
A chart is supported by several elements that give context to the data & insight.
Each element can be designed well to improve comprehension of the chart.
• Chart Title
• Chart Area
• Legends
• Data Labels
• Horizontal Axis
• Horizontal Axis Values
• Horizontal Axis Label
• Vertical Axis
• Vertical Axis Values
• Vertical Axis Label
31. With the growth of self-service BI, 85% of companies have lost track of how many dashboards they generated
BUT 3 THINGS ARE UNCLEAR ON MOST DASHBOARDS
• What QUESTION does the dashboard answer?
• Is the ANSWER evident from the dashboard?
• What ACTION should the user take now?
32. Today we use dashboards to expose data. But users must explore & interpret it.
[Dashboard with four panels: Quarterly Sales vs Target; Product-wise growth (Enterprise A, B, C and Consumer A, B, C); Country-wise revenue vs target (NA, MEA, EU, AU, AP); Country-wise product growth (%)]
Country-wise product growth (%)
Region | Cons A | Cons B | Cons C | Ent A | Ent B | Ent C
AP | 12 | 11 | 15 | 12 | 9 | 14
AU | 15 | 10 | 22 | 17 | 13 | 18
EU | 18 | 20 | 12 | 14 | 15 | 22
MEA | 22 | 30 | 9 | 16 | 18 | 20
NA | 7 | 4 | 3 | 9 | 10 | 12
33. We automate data stories. So users act, rather than interpret.
SERVICES REVENUE 5% BELOW TARGET, DESPITE 8% QOQ GROWTH
STORY GUIDE
The visual on the right shows our services revenue against target. If we’re below target, we must understand why.
The visuals below break up the revenue by product and region. Focus on the area with the weakest performance.
[Chart: Revenue vs Target by quarter, Q1 2017 to Q4 2019. Callouts: Revenue is 5% Below Target; QoQ growth is 8%]
GROWTH DRIVEN BY CONSUMER PRODUCTS, NOT ENTERPRISE (Q4 2020)
[Chart by product: Enterprise A, Enterprise B, Enterprise C, Consumer A, Consumer B, Consumer C]
TARGET IMPACTED BY NA SHORTFALL (Q4 2020)
[Chart by region: AP, MEA, EU, AU, NA]
NA CONSUMER PRODUCTS HAVE GROWN THE LEAST IN Q4 (Q4 2020)
Region | Cons A | Cons B | Cons C | Ent A | Ent B | Ent C
AP | 12 | 11 | 15 | 12 | 9 | 14
AU | 15 | 10 | 22 | 17 | 13 | 18
EU | 18 | 20 | 12 | 14 | 15 | 22
MEA | 22 | 30 | 9 | 16 | 18 | 20
NA | 7 | 4 | 3 | 9 | 10 | 12
Action: North America should grow consumer products. Leverage learning from other regions
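One way to picture how such a story could be automated is to derive the headline and the weakest area from the numbers. This is a sketch under stated assumptions, not Gramener's implementation: the growth table is taken from the slide, while the revenue figures and the `headline` and `weakest_cell` functions are hypothetical.

```python
# Country-wise product growth (%), as shown on the slide.
growth = {
    "AP":  {"Cons A": 12, "Cons B": 11, "Cons C": 15, "Ent A": 12, "Ent B": 9,  "Ent C": 14},
    "AU":  {"Cons A": 15, "Cons B": 10, "Cons C": 22, "Ent A": 17, "Ent B": 13, "Ent C": 18},
    "EU":  {"Cons A": 18, "Cons B": 20, "Cons C": 12, "Ent A": 14, "Ent B": 15, "Ent C": 22},
    "MEA": {"Cons A": 22, "Cons B": 30, "Cons C": 9,  "Ent A": 16, "Ent B": 18, "Ent C": 20},
    "NA":  {"Cons A": 7,  "Cons B": 4,  "Cons C": 3,  "Ent A": 9,  "Ent B": 10, "Ent C": 12},
}

def weakest_cell(table):
    """Return (region, product, growth) for the lowest growth in the table."""
    cells = ((region, product, value)
             for region, row in table.items()
             for product, value in row.items())
    return min(cells, key=lambda cell: cell[2])

def headline(revenue, target, prev_revenue):
    """Write a headline for the case of growing revenue that misses target."""
    gap = (revenue - target) / target
    qoq = (revenue - prev_revenue) / prev_revenue
    side = "BELOW" if gap < 0 else "ABOVE"
    return (f"SERVICES REVENUE {abs(gap):.0%} {side} TARGET, "
            f"DESPITE {qoq:.0%} QOQ GROWTH")

# Hypothetical revenue figures chosen to reproduce the slide's percentages.
print(headline(revenue=9.5, target=10.0, prev_revenue=8.8))
print(weakest_cell(growth))
```

The point of the design is that the title states the answer and the weakest cell tells the user where to act, so the reader does not have to explore the table. A fuller version would need wording for other cases, such as revenue above target.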
34. Here are some live data stories that apply these principles
Does access to new Technology facilitate Innovation? Does it
facilitate Entrepreneurship? The Global Information Technology
Report findings tell us that "innovation is increasingly based on
digital technologies and business models, which can drive
economic and social gains from ICTs...".
We were curious about whether the data on TCData360 could tell a
story about influential factors on innovation and entrepreneurship.
With over 1800 indicators, we focused on the Networked Readiness
Index, as it has indicators on entrepreneurship, technology, and
innovation.
Source: https://tcdata360.worldbank.org/stories/tech-entrepreneurship/
35. European brewery identified €15 m cost savings after consolidating vendors
A leading European brewery’s plants purchased
commodity raw materials from several vendors
each – and had low volume discounts.
Plants also placed multiple orders every
week, leading to higher logistics cost.
When plant managers were shown the data, they
objected, saying “That’s not always the case.” Or,
“That’s the only way– no one else does better.”
Gramener built a custom analytics solution that
sourced their SAP order data, automatically
identified which plants ordered which commodities
the most from multiple vendors – and when.
It showed how each plant performed compared to
peers – shaming those with poor performance.
With this, they identified savings of €15 m — which
the plant managers couldn’t refute.
€15 m savings potential identified annually
40% vendor based reduction identified
37. You have data.
You have analysis.
Now, create data stories!
Understanding the audience & intent
Finding insights
Storylining
Designing data stories
Source: How to Vanquish Management Reporting Mania, Sep 2014: https://www.cfo.com/management-accounting/2014/09/vanquish-management-report-mania/
Add steps on the right side
Who do you know that will use this analysis? (If no one, make someone up.) Edit exercise
Activities to be checked yourself
Add steps on the right side
Think of good anecdotes here
Add steps on the right side
Instructors: Give the audience 1 minute to write down a one-sentence takeaway. Ask 2 people to read it out. Apply the checklist. If they don’t meet the checklist, prompt them to revise it. Allow them to struggle through it before taking help.
Instructors: Give the audience 2 minutes to write down supporting material. Ask 1 person to read it out. Apply the checklist. Debate with them whether each point is necessary: “How does it support the takeaway?” Check if the analyses, when put together, logically prove the takeaway. There must be no alternative conclusion possible. If not, we need additional material to prove the takeaway.
Instructors: Give the audience 2 minutes to expand on the context for any one analysis. Ask 1-2 people to read it out. Apply the checklist. Debate with them how they could make the point clearer and more meaningful to the audience. Explore alternatives, using comparisons, analogies or interpretation.
Instructors: Ask 1-2 people from the audience to add supporting points to their takeaway or any message. Ask others to debate whether these points are necessary and sufficient to prove the parent message. Ask the audience if some of them are sub-bullets to a supporting point.
Instructors: The emotional arc is an interesting device and can attract questions. Clarify a few points. It’s not a graph of the data. It’s a graph of the EMOTION the graph triggers. There’s no right or wrong level to the emotional arc. It’s subjective, and audience driven. It’s important to ensure high contrast, and end with the feeling you want to deliver. That’s all. What most people are afraid to do is deliver bad news. Encourage the audience to deliver bad news as a vehicle to make the good news look better. Practice this by asking people for examples of related good and bad news. Then ask how they would word it so that the good news looks better.
For example, “Sales fell. Margins improved” can become “Despite lower sales, our profits have not worsened proportionally, thanks to a significant improvement in our margins.”
Add more visual encoding content here & move animation up
Make a longer ordered list of what cues should I use?
Source: Designing Data Visualizations by Noah Iliinsky and Julie Steele (O’Reilly).
Copyright 2011 Julie Steele and Noah Iliinsky, 978-1-449-31228-2.
Specificity? Structure?