This document discusses predictive analysis and the challenges, statistics, assumptions, and relevance to managers. It introduces the author and defines predictive analysis as using past data and statistical methods to predict the future. The main challenges are lack of good data and unique identifiers. Regression analysis is commonly used to correlate attributes and predict likelihoods. The key assumption is that the future will be like the past, but this can be invalidated over time if assumptions become irrelevant. Predictive analytics allows managers to proactively address opportunities and losses.
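The regression idea described above can be illustrated with a minimal sketch: fit a line to past observations and extrapolate one step ahead. The monthly sales figures here are hypothetical, purely for illustration.

```python
# Minimal ordinary-least-squares sketch: use past data to predict the next value.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx  # (slope, intercept)

# Hypothetical monthly sales, months 1-5
months = [1, 2, 3, 4, 5]
sales = [10, 12, 14, 16, 18]
m, b = fit_line(months, sales)
print(m * 6 + b)  # predicted sales for month 6 -> 20.0
```

The key assumption from the summary is visible in the code: the extrapolation is only valid if month 6 behaves like months 1-5.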
There are 3 questions to ask when evaluating statistics: 1) Can you see the uncertainty in the data? Many visualizations overstate certainty. 2) Can you see yourself represented in the data? Visualizations should make sense to many people. 3) How was the data collected? Government statistics are often more reliable than private statistics because of how the data is collected. Being able to evaluate statistics helps produce better decision making.
This document discusses becoming a measurement ninja and provides a framework for measurement, analysis, insight, strategy, tactics, execution, and review. It emphasizes the importance of collecting good, compatible data and using basic tools like visualization, derivatives, and moving averages for analysis. Insights involve techniques like reverse engineering, journaling, and induction to understand why things are happening. Strategy determines what comes next based on goals and methods within environmental constraints. Tactics and execution implement the strategy, and review feeds back into the measurement process. The presentation encourages attendees to use a discount code for online marketing resources.
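The "basic tools" named above (derivatives and moving averages) can each be sketched in a couple of lines; the visit counts below are made-up illustrative data.

```python
def moving_average(xs, k):
    """Smooth a series with a window of size k."""
    return [sum(xs[i:i + k]) / k for i in range(len(xs) - k + 1)]

def derivative(xs):
    """First differences: how much the metric changed each period."""
    return [b - a for a, b in zip(xs, xs[1:])]

# Hypothetical daily visit counts
visits = [100, 120, 90, 130, 150, 140, 160]
print(moving_average(visits, 3))  # smoothed trend
print(derivative(visits))         # period-over-period change
```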
A poll of 1038 likely Democratic primary voters in Massachusetts' 8th Congressional District found:
- Stephen Lynch has a 46% favorable rating to Robbie Goldstein's 36%, and leads 39-32 if the primary were held.
- Among undecided voters, support for a progressive Democrat was higher than for a moderate, and support for a pro-choice candidate was much higher than a pro-life candidate.
- A majority of undecided voters would also support a candidate who backs Medicare for All over one who opposes it.
- The poll had a 3.04-point margin of error and was conducted August 8-9 via phone and text; respondents' gender and age demographics were recorded.
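The quoted 3.04-point figure is consistent with the standard worst-case margin-of-error formula for a sample of 1,038 at 95% confidence:

```python
import math

n = 1038                            # sample size from the poll
moe = 1.96 * math.sqrt(0.25 / n)    # worst case p = 0.5, 95% confidence
print(round(moe * 100, 2))          # -> 3.04 points
```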
Data analytics and personality tests have limitations in predicting job performance, as statistics can never provide certainty. While past performance is often the strongest predictor of future behavior, performance is best evaluated by people familiar with an individual. Relying too heavily on metrics can undermine an organization over the long term by neglecting human factors. Algorithms also have inherent biases that can unfairly discriminate.
Understanding How Emergency Managers Evaluate Crowdsourced Data: A Trust Game-Based Approach - Mirjam-Mona
Presentation of Kathleen Moore, Andrea H. Tapia and Christopher Griffin on the topic "Understanding How Emergency Managers Evaluate Crowdsourced Data: A Trust Game-Based Approach" at ISCRAM2013
This document discusses the importance of communicating data effectively. It provides examples showing that without proper communication, valuable data can remain hidden or underutilized. Specifically, it notes that (1) scientist Gregor Mendel's discoveries about genetics were not widely adopted due to poor communication, and (2) relationship researchers John and Julie Gottman were able to have much greater impact by effectively disseminating their findings beyond just publishing. The document emphasizes that compiling data is useless without proper communication and that managers especially need to become better "consumers of data" who can understand and apply quantitative analysis.
Big Data & The Role Analytics Can Play In Our Organizations - Agile Technologies
The document discusses big data and data mining tools. It provides an agenda for a session that will explore big data tools and demonstrate their use in identifying patterns and predicting risks. The session aims to show how these analytics can help organizations identify areas of elevated risk and help plan mitigation strategies. A live demo is presented analyzing patterns in Facebook likes to predict attributes about users. The document concludes that big data tools are now widely available and can provide organizations a competitive advantage through predictive analytics.
Big data is a collection of extremely large data sets that are too large to be processed with traditional data processing applications. It is characterized by high volume, velocity, and variety of structured, semi-structured, and unstructured data. Big data allows companies to gather customer information, anticipate demands, and identify potential problems by analyzing vast amounts of data from various sources. It provides value by converting large volumes of raw data into useful insights and information for businesses.
Fairness and Transparency: Algorithmic Explainability, some Legal and Ethical... - Patrick Van Renterghem
In this presentation, Nazanin Gifani discussed some of the ethical and legal issues of automated decision making, including algorithmic fairness, transparency, and explainability. The big question here is: can AI help us make fairer decisions?
This document outlines the key steps and analyses involved in developing a business case as a business analyst. It includes sections on feasibility studies, stakeholder analysis, requirements gathering, prioritization, development planning, testing, and deployment. Methodologies covered include PEST analysis, SWOT analysis, Porter's Five Forces, gap analysis, MoSCoW prioritization, and the use of user stories and use cases. The role of the business analyst in justifying the business case and translating requirements between teams is also discussed.
This book discusses how analyzing large amounts of internet search data and digital traces can provide insights into human behavior and psyche. It explains that people will confess private things to Google searches that they wouldn't tell others. The data gives a view into trends of what people want from text, images, and videos. While surveys often receive dishonest answers, digital data may be closer to what people really want or feel as they search for information online.
Big Data Hype (and Reality) by Gregory Piatetsky-Shapiro - Darpan Deoghare
This document summarizes an analysis of big data hype and reality by Gregory Piatetsky-Shapiro. It discusses how while big data offers unprecedented insights into consumer behavior, human behavior remains unpredictable and inconsistent. Three key findings are discussed: 1) Netflix's algorithm to predict movie ratings improved by less than 0.1 stars after 3 years of work, showing the limits of predicting human tastes. 2) The biggest effects of big data will be creating new areas like search and social media, not radical improvements in prediction. 3) While big data can enhance predictions, managers should not expect it to make human behavior fully predictable and should continue relying on human judgment.
This document summarizes a data set from speed dating events and the analysis conducted on it to build a predictive model for matching users. It describes the data, which comes from 8,378 speed dating observations between 2002-2004. Pre-processing steps are outlined to clean and prepare the data. Exploratory analysis finds an overall 16.5% match rate. Various models are tested and a decision tree after data replacement is found to best predict matches with an 80.2% true positive rate. The predictive model results are described to show how likely users are to say yes based on their ratings and attributes. The conclusions discuss using this model to build a dating application with profiles, suggestions, ratings, and chat.
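The document's actual model is a full decision tree; as a simplified illustration of the same idea, here is a one-split "decision stump" that picks the rating threshold best separating matches from non-matches. The ratings and match labels below are invented, not from the 8,378-observation data set.

```python
# Hypothetical sketch: a single-split decision stump for predicting a match
# from one attribute rating, trained by choosing the threshold that
# maximizes training accuracy.
def fit_stump(ratings, matched):
    best_acc, best_t = 0.0, None
    for t in sorted(set(ratings)):
        # Predict "match" whenever the rating clears the threshold t
        acc = sum((r >= t) == m for r, m in zip(ratings, matched)) / len(ratings)
        if acc > best_acc:
            best_acc, best_t = acc, t
    return best_t

ratings = [3, 9, 5, 8, 2, 7]
matched = [False, True, False, True, False, True]
print(fit_stump(ratings, matched))  # -> 7 (perfectly separates this toy data)
```

A real tree repeats this split selection recursively over many attributes, which is how the document's model reaches its reported 80.2% true positive rate.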
Using data from twenty-one speed dating events to create a new dating app, we can connect two individuals based on their interests and preferences, thus expediting the dating process. The app will direct the user to rate other users' profiles based not only on the user's image but also on how much he/she likes the other user based on their profile information. The profiles will include demographic information, shared interests, and other attributes such as fun factor, attractiveness, etc. After evaluating each user's preferences and ratings, the app will suggest partners who have similar interests and matching preferences.
This document provides guidance on designing effective customer satisfaction surveys. It discusses the importance of understanding customer needs, creating surveys of an appropriate length and sample size, ensuring anonymity, and analyzing results collaboratively. The intended audience is a school that is working to improve customer service and seeks to use surveys to identify gaps and track progress. Templates and examples of surveys are provided to help attendees adapt a survey for their own organization.
Anyone who works with data downstream in an organization has seen things go...wrong, while upstream managers and business leaders are held accountable. Whether it's a failure in process or something technical, working with data is not always easy. What happened? How can we prevent it from happening again? What's next?
This talk, given at the Portland Data Science Group on October 27, 2016, uncovers 4 common foibles of working with organizational data.
This document provides an overview of predictive analytics and highlights some key considerations. It defines predictive analytics as using past data to predict the future. The most common barrier to predictive analytics is a lack of good data. All predictive models are based on assumptions about the future being like the past; these assumptions can become invalid over time if key variables change or the model is based on outdated data. Managers should understand the assumptions behind any predictive analysis and monitor whether conditions could make the assumptions invalid.
The document discusses how big data and analytics can be used to identify early indicators of mental stress. It outlines the 5 V's of big data: volume, variety, velocity, veracity, and value. A case study is presented on using uncommon insights from behavioral data sources like wearable biometrics and online behaviors to better understand users. Challenges to leveraging big data insights at scale are also examined, such as security, data quality, analytic skills and costs. The potential for data-driven personalized marketing and experiences is explored.
1) The document discusses how data analysis can be used to create hit TV shows by studying user activities and experiences while watching shows.
2) It provides examples of how Amazon and Netflix used data collected from free pilot episodes and viewership trends to inform their decisions about which shows to develop into full series.
3) The key lessons are that data analysis alone cannot determine the solution - intellectual decisions are also needed to interpret the data and take wise risks that can lead to extraordinary success. Managers should use both data and decision-making skills to solve problems.
Data Is Power: How to Harness It by Gabrielle Solomon (Director, Editorial Co...) - Hilary Ip
The document discusses the power of data and some challenges in effectively using data. It notes that data can help better understand audiences, create stronger content, and make smarter business decisions. However, it also notes that data is not always accurate, complete, relevant, or actionable. The document provides tips for ensuring data quality such as checking accuracy, sources, and relevance. It emphasizes telling stories with data and using it to create richer content that users care about.
How to Get More Value from Your Social Data - Anna OBrien
Creating a meaningful insight is similar to baking a cake: without the right ingredients, a proper recipe, and something to bake the batter, you'll struggle to produce something people want to consume. This deck explores, at a high level, how to work with data at each stage of the process. It is meant for anyone working with data, from savants to noobs.
A/B Testing, Optimization and Results Analysis by Mariia Bocheva, ATD'18
The document discusses A/B testing and optimization. It covers why companies do A/B testing, common mistakes in testing like poor data quality and not following statistics properly, and how to properly conduct tests. Key recommendations include testing everything, prioritizing high impact tests, understanding customer problems by analyzing metrics and flows, removing friction from the user experience, and building a data-driven culture.
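"Following statistics properly" usually starts with a significance test before declaring a winner. Here is a minimal two-proportion z-test sketch; the conversion counts are made-up illustrative numbers.

```python
import math

def ab_z_score(conv_a, n_a, conv_b, n_b):
    """z-score for the difference between two conversion rates (pooled SE)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)            # pooled conversion rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical test: 20.0% vs 25.0% conversion, 1,000 users per variant
z = ab_z_score(200, 1000, 250, 1000)
print(abs(z) > 1.96)  # significant at the 5% level -> True
```

Peeking at results before the sample size is reached, one of the common mistakes the talk mentions, invalidates exactly this kind of test.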
The document discusses key topics from IBM's Business Analytics Summit in Toronto in 2013. It outlines the four dimensions of big data: volume, velocity, variety, and veracity. It also discusses challenges organizations face in managing big data and key shifts driving the need for smarter analytics. Additionally, it provides examples of how leading organizations are using analytics to gain insights from data and outperform competitors. Finally, it briefly describes several IBM products for big data analytics.
The Challenges of Big Data: How Data-Capable Is Your Business? (DQM Group) - Internet World
The document discusses the challenges of big data and how to measure an organization's data capabilities. It outlines the opportunities big data provides, such as recalculating risk portfolios and analyzing social media data. However, big data also presents challenges like needing a defined data strategy and ensuring talent and technology are aligned. It introduces the data maturity curve to assess an organization's data management practices and provides examples of metrics to measure people, processes, and technology. The document emphasizes having a data strategy, considering third parties and future legislation, and thinking carefully about current capabilities before pursuing big data initiatives.
This document discusses how perceptions and prejudices can influence how data is interpreted. It notes that average data can be irrelevant and dangerous, and that relational data provides a closer interpretation of the actual scenario by accounting for variation. It emphasizes the need for data visualization and publicly available, contextualized data to overcome biases and properly communicate insights from vast databases.
The document discusses the importance and benefits of data visualization. It notes that data visualization can help make large amounts of data more understandable by providing visual context and representations. It also suggests that data visualization can help reveal patterns and insights that may not be obvious from raw data alone. Finally, it states that visualization can help managers more easily find and understand relevant data to inform important decisions.
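Even a crude text rendering shows how visual encoding surfaces a pattern hidden in raw numbers. The quarterly values below are hypothetical.

```python
def bar_chart(labels, values, width=20):
    """Render values as text bars scaled against the largest value."""
    top = max(values)
    return [f"{label:>4} | {'#' * round(v / top * width)} {v}"
            for label, v in zip(labels, values)]

# Hypothetical quarterly figures: the Q2 spike and Q3 dip jump out at a glance
for line in bar_chart(["Q1", "Q2", "Q3", "Q4"], [12, 30, 7, 25]):
    print(line)
```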
This document provides tips for aspiring data scientists. It advises them to start by focusing on a topic that interests them and to clearly define their objectives and data collection process. It also recommends that they visualize their data, understand the context, look for additional insights, evaluate results, and find effective uses of the data. The document notes that data is becoming increasingly important in all industries and companies without data-savvy managers will be at a disadvantage.
This document discusses what should be done with large amounts of data. It warns that data could be misinterpreted if not properly understood, and that perspectives and assumptions can influence data. Quality and context of data are important, as correlation does not necessarily imply causation. Overall, the document emphasizes the need to provide context and think critically about data in order to unlock its power and avoid being passive consumers or making misguided decisions based on irrelevant information.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive function. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms for those who already suffer from conditions like anxiety and depression.
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W... - Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
End-to-end pipeline agility - Berlin Buzzwords 2024Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
The Building Blocks of QuestDB, a Time Series Databasejavier ramirez
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps just as another data type. However, when performing real-time analytics, timestamps should be first class citizens and we need rich time semantics to get the most out of our data. We also need to deal with ever growing datasets while keeping performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, or faster batch ingestion.
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
State of Artificial intelligence Report 2023kuntobimo2016
Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines.
We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.
The State of AI Report is now in its sixth year. Consider this report as a compilation of the most interesting things we’ve seen with a goal of triggering an informed conversation about the state of AI and its implication for the future.
We consider the following key dimensions in our report:
Research: Technology breakthroughs and their capabilities.
Industry: Areas of commercial application for AI and its business impact.
Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
Predictions: What we believe will happen in the next 12 months and a 2022 performance review to keep us honest.
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
2. Hello!
I am Vikrant Narayan
Data Analytics enthusiast
You can find me at vikrant.m.narayan@gmail.com
3. “Decision making can sometimes seem like an inner civil war”
- Jim Rohn
4. Is Data enough?
● Events from the past have shown us that merely looking at past data and analysing the attributes of data points does not guarantee success.
● We need to ask ourselves whether data and data analysis alone are enough to guarantee the success of a model.
5. Is Data enough?
● The answer to the question is NO.
● There have been instances where similar data analysis models have been used, yet the results of the models have been drastically different.
7. Et tu, Google?
● Google applied data analytics to search data to study the influenza virus and predict future outbreaks (Google Flu Trends).
● The model worked well for a few years before it eventually failed, as the relationship between searches and outbreaks drifted.
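The failure mode above can be sketched in a few lines. This is a minimal toy illustration of concept drift, not a reconstruction of Google's system: all numbers are invented, and the "model" is deliberately simple (predict the historical mean).

```python
# Toy sketch of concept drift: a model fitted to past data quietly
# degrades once the underlying relationship changes. All figures
# here are illustrative, not from any real flu dataset.
import random

random.seed(0)

def fit_mean_rate(history):
    """'Model': predict the next value as the mean of past observations."""
    return sum(history) / len(history)

# Early period: the signal is stable around 100.
past = [100 + random.gauss(0, 5) for _ in range(36)]
model = fit_mean_rate(past)

# Later period: behaviour shifts (e.g. media coverage changes how
# people search), so the old level no longer holds.
future = [160 + random.gauss(0, 5) for _ in range(12)]

past_error = sum(abs(model - x) for x in past) / len(past)
future_error = sum(abs(model - x) for x in future) / len(future)

print(f"error on past data:   {past_error:.1f}")
print(f"error on future data: {future_error:.1f}")
```

The point of the sketch: nothing in the model's code changed, yet its error on new data explodes because the assumption "the future looks like the past" stopped holding.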
8. Reliability
● If data analysis does not guarantee success, is it okay to rely on it in more serious domains such as medicine and law enforcement?
● We find that data is involved in real-life decision making in spite of all these failures.
10. Pattern Recognition
● Analysing the patterns that separate successful and unsuccessful models reveals that the success of a model depends on the decision making around it.
● They show that results depend on instinctive decisions and risk management.
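One way to read this slide is that a prediction alone is not a decision: the same model output can lead to different actions depending on the stakes and the decision maker's risk appetite. The sketch below is a hypothetical expected-value rule with a risk limit; the function name, payoffs, and thresholds are all invented for illustration.

```python
# Hedged sketch: combining a model's output (a probability) with
# stakes and a risk limit chosen by the decision maker.
# All payoffs and limits below are illustrative.

def decide(p_success, gain, loss, max_acceptable_loss):
    """Act only if expected value is positive AND the downside is bearable."""
    expected_value = p_success * gain - (1 - p_success) * loss
    if loss > max_acceptable_loss:
        return "reject: downside exceeds risk limit"
    if expected_value <= 0:
        return "reject: negative expected value"
    return "accept"

# Identical model output, different risk appetites -> different decisions.
print(decide(p_success=0.7, gain=100, loss=80, max_acceptable_loss=200))
print(decide(p_success=0.7, gain=100, loss=80, max_acceptable_loss=50))
```

The analysis supplies `p_success`; the risk limit is a management choice, which is where the "instinctive decisions and risk management" of the slide enter.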
11. Role of Data
● Data analysis is primarily a tool for breaking up the data and understanding its parts.
● It may not be the right tool for putting the broken pieces back together into a decision.
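The decompose-versus-recombine distinction can be made concrete with a toy example. Decomposing data into per-group summaries is mechanical; deciding what to do with two groups that look identical in summary is not. The campaign names and conversion rates below are invented.

```python
# Toy illustration: analysis readily decomposes data into per-group
# summaries, but weighing those summaries against each other is a
# separate, human step. All figures are invented.
from collections import defaultdict

records = [
    ("campaign_a", 0.12), ("campaign_a", 0.08),
    ("campaign_b", 0.18), ("campaign_b", 0.02),
]

by_group = defaultdict(list)
for name, rate in records:
    by_group[name].append(rate)

summary = {name: sum(r) / len(r) for name, r in by_group.items()}
print(summary)  # both campaigns average 0.10 ...

# ... yet campaign_b is far riskier (its rates swing from 0.02 to 0.18).
# The decomposition is mechanical; choosing between an identical mean
# with different variance is a decision, not an analysis step.
```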
13. Relevance to managers
● Managers should be able not just to collect data points and analyse them numerically, but also to make decisions and take risks for the data analysis to be successful.