This is a presentation from an introductory machine learning workshop that I conducted. It was held as part of the run-up to Fifth Elephant 2014, a conference on big data and analytics.
Interpretable machine learning? Let's do this with the Break Down package, part of the DrWhy.AI initiative. Find more information at https://github.com/pbiecek/DALEX/
Ways to Extract Variable Insights when Data is Scarse | Zia Babar
The document discusses ways to extract insights from scarce data using machine learning and other techniques. It describes a dataset of Freedom of Information requests with few records. Machine learning performed poorly due to the small dataset size. However, descriptive statistics, exploratory data analysis, and natural language processing tools like n-grams and topic modeling can still provide valuable information. These alternative techniques were demonstrated on the request data through word clouds, frequency analysis, and topic modeling.
Data Science as a Career and Intro to R | Anshik Bansal
This document discusses data science as a career option and provides an overview of the roles of data analyst, data scientist, and data engineer. It notes that data analysts solve problems using existing tools and manage data quality, while data scientists are responsible for undirected research and strategic planning. Data engineers compile and install database systems. The document also outlines the typical salaries for each role and discusses the growing demand for data science skills. It provides recommendations for learning tools and resources to pursue a career in data science.
Gary Potter reminisces about his time working in marketing for Apple UK in the 1980s. He discusses how Apple UK grew from being a small distributor (Microsense) to directly being owned by Apple Inc. Vertical marketing targeting specific industries like education was very successful. The launch of the Macintosh in the UK required designing a brochure that would appeal more to a British audience. Software development established the Mac's place in publishing and design industries. Through passion and creativity of employees, Apple UK became one of Apple's top performing markets outside of the US.
The document discusses social media usage statistics and provides information about Previsite, a social media marketing company for real estate. It notes that Facebook has 850 million monthly users who upload 250 million photos daily. It also discusses focus groups that revealed real estate brokers' interest in automating social media marketing, increasing SEO, and generating more leads. Previsite was founded in 2000 and provides services to help brokers build their brands on social media, drive website traffic, increase SEO, and integrate their business into social media platforms.
For men of understanding 1. honey bees, salmons, monarch butterflies and drag... | HarunyahyaEnglish
The document summarizes the remarkable migratory journeys of three creatures - salmon, bees, and monarch butterflies. It notes that salmon can return with perfect accuracy to the exact streams where they were born to spawn, despite traveling thousands of kilometers. Bees communicate the location of food sources through an intricate dance language. Monarch butterflies migrate annually over long distances, with the fourth generation each year adapted to survive the winter months far from home. In each case, the document argues that these creatures display design and purpose that refute Darwin's theory of evolution, and instead point to Allah as their creator.
I presented these slides at the Energy Harvesting 2013 event, funded by the EPSRC, in London on March 25th.
They cover Morgan's involvement in developing a piezoelectric-based commercial solution for the emerging energy harvesting technology.
If you wish to be considered a perfect employee in your organization, try avoiding these seven deadly sins of employees.
Read more interesting content at www.thecareermuse.co.in - We intend to inform and inspire recruiters, job seekers and anyone with an interest in the workplace and HR technology.
Hope you enjoyed reading the Infographic.
Feel free to share your feedback with us at @CareerBuilderIn
Social media is no longer optional. CPAs are using social media to stay on top of major trends, increase their recognition in markets, and be recognized as thought leaders. This presentation is from a special 4-hour session at MACPA's Beach Retreat on July 3rd, 2010 in Ocean City, MD.
Faire d’internet un véritable outil commercial dans le tourisme (Making the internet a real sales tool in tourism) | Philippe Fabry
Talk given at a forum on e-marketing in the tourism sector, held on October 14, 2009 in La Rochelle at the Salon de l’entreprise. A few simple framing points on e-tourism and a closer look at the hotel sector.
Want to land a sweet tech job? But not sure how to break in? Discover the seven secrets that took me from teaching kindergarten to landing jobs at Apple, LinkedIn, and startups!
This document describes the main office productivity tools, such as Word, Excel, PowerPoint, Access, and Outlook. It explains how these tools are commonly used in work activities such as drafting documents, building spreadsheets and presentations, and managing email and contacts. It also identifies the causes of misuse of these tools and the risks to information security, such as users' lack of interest and understanding.
E book - Hiring tool kit for Smart Recruiters | Talview
The Talview e-book gives recruiters a complete working tool kit for better-quality hiring. It is divided into six brief chapters that give complete information about the innovative changes in the talent acquisition department.
The Numbers Magic (Amsterdam Node Meetup Presentation) | icemobile
An Amsterdam Node Meetup presentation by Luca Maraschi on December 16th 2014:
NaN is a Number and (0.1 + 0.2 === 0.3) is not true! JS, and so node.js, are infamous for their number management, but are we all right about this? Do we know where this comes from? Is it fixable? Can we?
A historical odyssey (excursus) from the origins of floating-point calculation to modern languages, which will take us through analysing the real business value of IEEE 754 and floating-point precision, with real case scenarios and their resolution in production.
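The two claims in this abstract are properties of IEEE 754 double-precision arithmetic rather than of JavaScript alone, so they can be reproduced in other languages. The following is a minimal Python sketch, not taken from the talk; the variable names are illustrative, and the tolerance-based and decimal comparisons are shown only as two common workarounds.

```python
import math
from decimal import Decimal

# 0.1 and 0.2 have no exact binary representation, so their sum
# is not the same double as the literal 0.3.
total = 0.1 + 0.2
naive_equal = total == 0.3                 # False

# Workaround 1: compare within a tolerance instead of exactly.
close_enough = math.isclose(total, 0.3)    # True

# Workaround 2: use decimal arithmetic when exact decimal
# fractions matter (for example, money).
exact_decimal = Decimal("0.1") + Decimal("0.2") == Decimal("0.3")  # True

# NaN is a value of the floating-point number type, yet it is
# not equal to anything, including itself.
nan = float("nan")
nan_is_float = isinstance(nan, float)      # True
nan_equals_itself = nan == nan             # False

print(naive_equal, close_enough, exact_decimal, nan_is_float, nan_equals_itself)
```

Which workaround fits depends on the use case: tolerances suit measurements, while decimal types suit quantities that are defined in base ten.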
This document discusses the challenges, pitfalls, and biases that can occur when working with social data in the real world. It notes that both social data and other data sources are flawed, and combining them is difficult. It outlines several potential issues including selection bias from non-representative social media users, confusion between correlation and causation, confirmation bias from seeking evidence to support hypotheses, overfitting models to random patterns in the data, and underfitting by missing hidden signals. The document provides general tips like being explicit about assumptions, looking for patterns before analysis, attempting to disprove findings, and keeping models as simple as possible.
Social Selling Social Media Conversions Webinar With Milk it Academy by Doyle... | Doyle Buehler
Doyle Buehler gives a presentation on social selling. He discusses how social media is not about likes, cute pictures or engagement, but rather about having meaningful conversations to build an online community. The goal is to move this audience into one's own digital ecosystem through delivering value and influencing them, in order to facilitate conversions and sales. Buehler provides tips on defining goals and metrics, setting up marketing funnels, creating landing pages, running social media ads, and monitoring performance.
Leadership howard county 03 28 12 jenn lim delivering happiness | Delivering Happiness
This document summarizes a presentation given by Jenn Lim on leadership and happiness. Some key points discussed include defining life goals and priorities, understanding what truly brings happiness, reflecting on personal passions, and examining how companies like Zappos have used a culture-first approach to build sustainable brands committed to happiness. Lim shares lessons learned from Zappos' experiment in using happiness as a business model, including the importance of commitment to core values, transparency, vision, relationships, and hiring the right team.
This document is an application guide on the nervous system for first-year secondary school students. It includes questions on the parts of the peripheral nervous system, the cranial and spinal nerves, the types of receptors and stimuli, and the areas of the nervous system affected by different diseases and conditions such as epilepsy, Parkinson's, and Alzheimer's. It also includes questions on the effects of a stroke and why certain drugs are stimulants, narcotics, or hallucinogen...
This document provides an overview of data science and the role of data scientists. It discusses how data scientists work with large datasets to solve problems, highlights example use cases at companies like LinkedIn and Uber, and outlines the data science process. Tools commonly used by data scientists like SQL, machine learning algorithms, and Python are also explained. The document concludes by discussing opportunities and challenges in the field and how to get started learning data science through programs like Thinkful.
This document provides an overview of machine learning for Java Virtual Machine (JVM) developers. It begins with introductions to the speaker and topics to be covered. It then discusses the growth of data and opportunities for machine learning applications. Key machine learning concepts are defined, including observations, features, models, supervised vs. unsupervised learning, and common algorithms like classification, regression, and clustering. Popular JVM machine learning tools are listed, with Spark/MLlib highlighted for its community support and implementation of standard algorithms. Example machine learning demos on price prediction and spam classification are described. The document concludes with recommendations for further learning resources.
This document provides an overview of machine learning, including:
- The types of machine learning such as supervised learning, unsupervised learning, and reinforcement learning. Supervised learning involves predicting outputs from labeled inputs using techniques like regression and classification, while unsupervised learning finds patterns in unlabeled data using clustering and dimensionality reduction.
- Common machine learning applications including speech recognition, machine translation, strategic gaming, computer vision, autonomous vehicles, manufacturing, and healthcare.
- Effective machine learning involves reducing programming time through existing tools, customizing and scaling products as needed, and using statistics rather than logic to make decisions from real-world data.
What data scientists really do, according to 50 data scientists | Hugo Bowne-Anderson
My talk at PyData NYC, 2018.
This is the abstract:
Hugo Bowne-Anderson, data scientist and host of the DataFramed podcast, will give you a view into the thinking of 50 leading data scientists from around the world about the trends driving the data science revolution. During his interviews with these thought leaders, Hugo discovered themes and lessons about the past, present, and future of data science.
Knowledge Graphs - The Power of Graph-Based Search | Neo4j
1) Knowledge graphs are graphs that are enriched with data over time, resulting in graphs that capture more detail and context about real world entities and their relationships. This allows the information in the graph to be meaningfully searched.
2) In Neo4j, knowledge graphs are built by connecting diverse data across an enterprise using nodes, relationships, and properties. Tools like natural language processing and graph algorithms further enrich the data.
3) Cypher is Neo4j's graph query language that allows users to search for graph patterns and return relevant data and paths. This reveals why certain information was returned based on the context and structure of the knowledge graph.
Data Workflows for Machine Learning - Seattle DAML | Paco Nathan
First public meetup at Twitter Seattle, for Seattle DAML:
http://www.meetup.com/Seattle-DAML/events/159043422/
We compare/contrast several open source frameworks which have emerged for Machine Learning workflows, including KNIME, IPython Notebook and related Py libraries, Cascading, Cascalog, Scalding, Summingbird, Spark/MLbase, MBrace on .NET, etc. The analysis develops several points for "best of breed" and what features would be great to see across the board for many frameworks... leading up to a "scorecard" to help evaluate different alternatives. We also review the PMML standard for migrating predictive models, e.g., from SAS to Hadoop.
A short presentation for beginners introducing machine learning: what it is, how it works, the popular machine learning techniques and learning models (supervised, unsupervised, semi-supervised, reinforcement learning), and how they work, with various industry use cases and popular examples.
This document provides information about a course on data mining and data warehousing. It includes the vision and mission statements of the university and computer science department. It outlines the program outcomes, program educational objectives, and course outcomes. Finally, it provides a detailed syllabus covering topics like data preprocessing, frequent pattern mining, classification, clustering, and data warehousing. The goal is to teach students how to extract useful patterns from data through techniques like association rule mining, classification, and clustering.
This session will demystify (generative) AI by exploring its workings as an advanced statistical modelling tool (suitable for any level of technical knowledge). Not only will this session explain the technological underpinnings of AI, it will also address concerns and (long-term) requirements around the ethical and practical usage of AI. This includes data preparation and cleaning, data ownership, and the value of data generated, but not owned, by libraries. It will also discuss the potential for (hypothetical) use cases of AI in collections environments and making collections data AI-ready, providing examples of AI capabilities and applications beyond chatbots.
This document provides an overview of getting started with data science using Python. It discusses what data science is, why it is in high demand, and the typical skills and backgrounds of data scientists. It then covers popular Python libraries for data science like NumPy, Pandas, Scikit-Learn, TensorFlow, and Keras. Common data science steps are outlined including data gathering, preparation, exploration, model building, validation, and deployment. Example applications and case studies are discussed along with resources for learning including podcasts, websites, communities, books, and TV shows.
This document contains summaries of "folk knowledge" or common sayings related to data science. Some key points included are:
- Machine learning requires both data and some prior knowledge or assumptions in order to generalize beyond the training data.
- Overfitting can take many forms like high bias, high variance, or sampling bias.
- Intuition fails in high dimensions according to Bellman's curse of dimensionality.
- Feature engineering is key, and more data often beats a cleverer algorithm.
The document discusses machine learning and data science concepts. It begins with an introduction to machine learning and the machine learning process. It then provides an overview of select machine learning algorithms and concepts like bias/variance, generalization, underfitting and overfitting. It also discusses ensemble methods. The document then shifts to discussing time series, functions for manipulating time series, and laying the foundation for time series prediction and forecasting. It provides examples of applying techniques like median filtering to smooth time series data. Overall, the document provides a high-level introduction and overview of key machine learning and time series concepts.
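As an illustration of the median filtering mentioned in this summary, here is a minimal pure-Python sketch. It is not taken from the summarized document: the function name `median_filter`, the window size, the edge handling (the window simply shrinks at the ends), and the sample series are all assumptions made for the example.

```python
from statistics import median


def median_filter(series, window=3):
    """Replace each point with the median of the points around it.

    Assumption: at the edges the window is truncated rather than padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(median(series[lo:hi]))
    return smoothed


# Hypothetical time series with two isolated spikes (9.0 and -5.0).
noisy = [1.0, 1.1, 9.0, 1.2, 1.3, -5.0, 1.1, 1.2]
smoothed = median_filter(noisy, window=3)
print(smoothed)
```

Unlike a moving average, the median ignores a single extreme value inside the window, so isolated spikes are removed without being smeared across their neighbours.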
In this Lunch & Learn session, Chirag Jain gives us a friendly & gentle introduction to Machine Learning & walks through High-Level Learning frameworks using Linear Classifiers.
The document discusses map reduce and how it can be used for sequential web access-based recommendation systems. It explains that map reduce separates large, unstructured data processing from computation, allowing it to run efficiently on many machines. A map reduce job could process web server logs to build a pattern tree for recommendations, with the tree continuously updated from new data. When making recommendations for a user, their access pattern would be compared to the tree generated from all user data.
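The map and reduce steps described in this summary can be sketched in a few lines. This is a toy, single-process Python illustration and an assumption-laden simplification: it counts page-to-page transitions instead of building the pattern tree the document describes, and the session data, function names, and recommendation rule are all hypothetical.

```python
from collections import defaultdict


def map_session(pages):
    """Map step: emit ((current_page, next_page), 1) for each consecutive pair."""
    for current, nxt in zip(pages, pages[1:]):
        yield (current, nxt), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def recommend(counts, current_page):
    """Suggest the page most often visited right after current_page."""
    candidates = {nxt: n for (cur, nxt), n in counts.items() if cur == current_page}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical access sequences, as they might be parsed from web server logs.
sessions = {
    "u1": ["/home", "/products", "/cart"],
    "u2": ["/home", "/products", "/reviews"],
    "u3": ["/home", "/products", "/cart"],
}

emitted = [pair for pages in sessions.values() for pair in map_session(pages)]
counts = reduce_counts(emitted)
next_page = recommend(counts, "/products")
print(next_page)
```

In a real MapReduce framework the map calls would run in parallel on separate machines and the framework would group the emitted pairs by key before the reduce step; here both happen in one process for clarity.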
OSCON 2014: Data Workflows for Machine Learning | Paco Nathan
This document provides examples of different frameworks that can be used for machine learning data workflows, including KNIME, Python, Julia, Summingbird, Scalding, and Cascalog. It describes features of each framework such as KNIME's large number of integrations and visual workflow editing, Python's broad ecosystem, Julia's performance and parallelism support, Summingbird's ability to switch between Storm and Scalding backends, and Scalding's implementation of the Scala collections API over Cascading for compact workflow code. The document aims to familiarize readers with options for building machine learning data workflows.
This document provides an overview of data science. It discusses that data science involves taking large amounts of information and turning it into something valuable. It then discusses the main components of data science: data wrangling, analytics, and predictions. It also profiles four common types of data scientists: researchers, AI gurus, statisticians, and super analysts. It describes their typical work environments, job responsibilities, and education/skills. Finally, it discusses options for learning data science, including a free two-week trial program from Thinkful.
Data science and web development are both growing fields with strong job prospects. Data science involves analyzing data to solve problems and find insights, while web development focuses on building interactive websites and applications. Some key differences are that data science often requires more advanced technical skills and mathematics training, while web development places more emphasis on designing user experiences. Both fields offer opportunities for remote and flexible work. Thinkful provides guided learning programs to help students explore these fields and land jobs through project-based learning and mentorship.
Demystifying Data Science with an introduction to Machine Learning | Julian Bright
The document provides an introduction to the field of data science, including definitions of data science and machine learning. It discusses the growing demand for data science skills and jobs. It also summarizes several key concepts in data science including the data science pipeline, common machine learning algorithms and techniques, examples of machine learning applications, and how to get started in data science through online courses and open-source tools.
Learn SQL from basic queries to Advance queries | manishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Analysis insight about a Flyball dog competition team's performanceroli9797
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
Build applications with generative AI on Google CloudMárton Kodok
We will explore Vertex AI - Model Garden powered experiences, we are going to learn more about the integration of these generative AI APIs. We are going to see in action what the Gemini family of generative models are for developers to build and deploy AI-driven applications. Vertex AI includes a suite of foundation models, these are referred to as the PaLM and Gemini family of generative ai models, and they come in different versions. We are going to cover how to use via API to: - execute prompts in text and chat - cover multimodal use cases with image prompts. - finetune and distill to improve knowledge domains - run function calls with foundation models to optimize them for specific tasks. At the end of the session, developers will understand how to innovate with generative AI and develop apps using the generative ai industry trends.
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
2. me ...
Harshad
➔ Senior Data Scientist @ Sokrati
➔ Spent last 4 years trying to understand and apply machine learning
Sokrati is a digital advertising startup based out of Pune
11. bit of (relevant) history ...
➔ Insurance, Banking industry
◆ Credit scoring
◆ mathematical models in finance
➔ Artificial Intelligence and other fancy ideas
◆ Deep Blue, Samuel’s checkers machine
◆ IBM Watson computer
12. Find ‘f’ => endeavour to understand the world!
g( y ) = f( X )
14. the two cultures in ML
Stats Culture
➔ Focus on ‘why this model’
➔ Goodness of fit, hypothesis tests, residuals
➔ or MCMC methods, Bayesian modeling
➔ regression models, survival analysis
AI / ML Culture
➔ Focus on ‘good predictions’
➔ Cross validation, ensembles of models
➔ Focus on underfit vs overfit analysis
➔ neural nets, tree based models (random forests et al.)
15. but let’s build bridges
Stats
➔ Focus on basics, sound theory
➔ Exploration, summaries
➔ Models
ML / AI
➔ Focus on predictions
➔ Model evaluation
➔ Feature selection
Computations
➔ Focus on application
➔ Achieving scale and usability
➔ Hadoop, Storm and friends ...
Business Knowledge
➔ Focus on interpretation
➔ Visualizations
➔ Creating stories out of data
16. typical ML process
➔ Objective
➔ Source data
➔ Explore
➔ Model
➔ Evaluate
➔ Apply
➔ Validate
18. bottom line : not in terms of algorithm but outcome
probability of customer churn
group set of emails by topic
predict rainfall
recommend item to a consumer to increase likelihood of click
21. introduction to R
➔ Starting R
➔ Data Structures
◆ Atomic Vectors, matrix
◆ Lists
◆ Data frames
➔ Data types
◆ usual suspects (integer, double, character)
◆ Factors
◆ logical
22. data frame , the workhorse
➔ Load sample data frame
➔ Explore data frame (head, tail)
➔ Access elements by index
◆ access rows
◆ access columns
● single
● multiple (by name, by index, by negative index)
➔ Find metadata
◆ names
◆ dimensions
➔ Explore using plot (pairs)
23. broadcasting / vectorization
➔ Very important concept
➔ Subsetting
◆ vectors
◆ data frames
➔ Applying operations
◆ operations on entire column
24. data exploration
➔ summarizing using mean
➔ quantiles, when mean is not enough
◆ outlier detection
➔ functional roots of R : sapply with summary
➔ summary function
◆ on numerical
◆ on factors
➔ plots (basic)
➔ histograms
➔ correlations
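The exploration steps above (mean, quantiles, outlier detection) can be sketched in a few lines. The workshop demos are in R; this is a plain Python sketch using only the standard library, the numbers are made up for illustration, and the 1.5 × IQR cut-off (Tukey's rule) is an assumption, not something the slides prescribe.

```python
import statistics

# Hypothetical sample with one extreme value (not workshop data).
values = [12.0, 15.0, 14.0, 13.0, 16.0, 15.0, 14.0, 120.0]

# The mean is pulled up by the extreme value; the median is not.
mean = statistics.mean(values)
median = statistics.median(values)

# Quartiles: quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Assumed rule of thumb: flag points more than 1.5 * IQR outside the quartiles.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in values if v < lower or v > upper]

print("mean:", mean, "median:", median)
print("quartiles:", q1, q2, q3)
print("outliers:", outliers)
```

The gap between mean and median is the "when mean is not enough" signal: a single extreme point moves the mean a lot and the quantiles very little.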
27. linear regression
➔ load sample dataset (cars)
➔ build linear regression model
➔ understand the output
◆ summary
◆ plots
➔ understanding train vs predict cycle
◆ most important idea!
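The train vs predict cycle can be shown with a one-variable least-squares fit. This is a minimal Python sketch, not the R demo from the workshop: `fit_line` and `predict` are hypothetical helper names, and the speed/distance numbers are invented stand-ins rather than the actual `cars` dataset.

```python
def fit_line(xs, ys):
    """Train: estimate intercept and slope by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, xs):
    """Predict: apply the fitted line to inputs the model has not seen."""
    intercept, slope = model
    return [intercept + slope * x for x in xs]


# Hypothetical training data (speed, stopping distance); values are made up.
train_speed = [4, 7, 8, 9, 10, 11, 12, 13]
train_dist = [2, 4, 16, 10, 18, 17, 24, 34]

model = fit_line(train_speed, train_dist)  # train once
new_speed = [14, 15]                       # unseen inputs
print(predict(model, new_speed))           # predict many times
```

The model is fitted once on the training data and then reused on new inputs; the quality of those predictions, not the fit on the training rows, is what matters.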
28. demo on real world dataset
data exploration, classification
29. hierarchical clustering
➔ basic idea of clustering
◆ distance as a proxy for similarity, group by distance
◆ group anything as long as distance can be calculated
➔ load and explore eurodist data
➔ fit hierarchical cluster
➔ plot dendrogram
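The "group by distance" idea can be sketched as a tiny agglomerative clustering loop that only needs pairwise distances. This is a Python illustration under stated assumptions: it uses single linkage (the distance between two clusters is the distance between their closest members), which is one choice among several, the function name `single_linkage` is hypothetical, and the four "cities" with their distances are invented, not the `eurodist` values.

```python
def single_linkage(labels, dist):
    """Repeatedly merge the two closest clusters; return the merge history.

    dist maps frozenset({a, b}) to the distance between items a and b.
    """
    clusters = [frozenset([label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: closest pair of members across the two clusters.
                d = min(dist[frozenset((a, b))]
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical pairwise distances between four places (made-up numbers).
labels = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 1.0,
    frozenset(("C", "D")): 2.0,
    frozenset(("A", "C")): 10.0,
    frozenset(("A", "D")): 11.0,
    frozenset(("B", "C")): 9.0,
    frozenset(("B", "D")): 12.0,
}

merges = single_linkage(labels, dist)
for left, right, height in merges:
    print(left, "+", right, "at distance", height)
```

The merge history is exactly what a dendrogram draws: each merge is a join, and the distance at which it happens is the height of that join.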
30. demo on real world data
and the most important idea!
31. concept of vector space model
➔ Words as axes
➔ Bag of words defines vector space
➔ Document as a point in space
➔ We can
◆ define distance
◆ measure similarity (cosine similarity)
◆ group documents
➔ what can be a document ?
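A small sketch of the vector space model: each document becomes a bag of word counts, and cosine similarity compares the direction of two such vectors. This is plain Python for illustration; the whitespace tokenizer and the example sentences are simplifying assumptions, and `bag_of_words` and `cosine_similarity` are hypothetical helper names.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a document to word counts; each distinct word is one axis."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    shared = a.keys() & b.keys()
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction
    return dot / (norm_a * norm_b)


# Hypothetical documents (made-up text).
doc1 = bag_of_words("cheap flights to pune")
doc2 = bag_of_words("cheap hotels in pune")
doc3 = bag_of_words("learn machine learning with R")

print(cosine_similarity(doc1, doc2))  # shares "cheap" and "pune"
print(cosine_similarity(doc1, doc3))  # shares no words
```

Anything that can be turned into counts over a fixed vocabulary can play the role of a "document" here, for example a search query or a user's click history.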
33. evaluation metrics
➔ depends on type of model
◆ regression : MAPE, MSE
◆ classification : accuracy, precision, recall, F score
◆ clustering : within vs between variance
➔ ML world (ref : two cultures) has a much better story
➔ Not enough to perform well on training set
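The metrics named above follow directly from counts of right and wrong predictions. A minimal Python sketch with hypothetical helper names (`classification_metrics`, `mape`) and made-up labels; in practice these would be computed on held-out data, not on the training set.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes no true value is zero."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical held-out labels and predictions (made-up values).
print(classification_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
print(mape([100, 200], [110, 180]))
```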
34. brief intro to regularization
➔ Bias vs Variance problem
➔ We want to be ‘just right’
➔ Concept of regularization
➔ Intro Cross validation
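Regularization and cross validation fit together: a penalty shrinks the model, and held-out error tells us how much shrinkage is "just right". The sketch below is a Python illustration under stated assumptions: a one-variable ridge regression where the slope is shrunk by a penalty `lam`, simple interleaved folds, invented data, and hypothetical helper names (`fit_ridge`, `k_fold_indices`, `cv_error`).

```python
def fit_ridge(xs, ys, lam):
    """One-variable ridge fit: the penalty lam shrinks the slope toward zero."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    return mean_y - slope * mean_x, slope


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def cv_error(xs, ys, lam, k=3):
    """Mean squared error on held-out folds, averaged over all points."""
    total = 0.0
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        intercept, slope = fit_ridge(train_x, train_y, lam)
        total += sum((ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_idx)
    return total / len(xs)


# Hypothetical noisy data (made-up values).
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0]

# Pick the penalty with the lowest held-out error.
candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam = min(candidates, key=lambda lam: cv_error(xs, ys, lam))
print("chosen penalty:", best_lam)
```

Too large a penalty underfits (high bias), no penalty can overfit when data is noisy or scarce (high variance); the held-out error, not the training error, arbitrates between them.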
37. the great ML divide
Lab Culture
➔ Theory
➔ Small Datasets
➔ In memory
➔ Not live
➔ R, Octave, Python ...
Source and Apply
➔ The practice
➔ Huge datasets
➔ Live in production
➔ Hadoop and friends, Python?, R?
38. processing data at scale
➔ Data is not available in final form
➔ Non standard data
◆ click streams
◆ event logs
◆ free form text
➔ Process at scale
➔ Transform, clean, group into final form
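The transform, clean and group steps can be sketched as a map phase followed by a reduce phase, which is the shape such jobs take on a cluster. This is a single-machine Python illustration only: the log format, the field names and the helper names (`parse`, `map_phase`, `reduce_phase`) are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical raw event log lines, including one malformed record.
raw_events = [
    "2014-06-01 user=u1 action=click",
    "2014-06-01 user=u2 action=view",
    "garbage line",
    "2014-06-02 user=u1 action=click",
]


def parse(line):
    """Clean: turn one raw line into a record, or None if it is malformed."""
    parts = line.split()
    if len(parts) != 3:
        return None
    date, user_kv, action_kv = parts
    if not user_kv.startswith("user=") or not action_kv.startswith("action="):
        return None
    return {"date": date,
            "user": user_kv[len("user="):],
            "action": action_kv[len("action="):]}


def map_phase(lines):
    """Transform: emit ((user, action), 1) for every valid record."""
    for line in lines:
        record = parse(line)
        if record is not None:
            yield (record["user"], record["action"]), 1


def reduce_phase(pairs):
    """Group: sum the emitted values per key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


counts = reduce_phase(map_phase(raw_events))
print(counts)
```

Because the map step treats each line independently and the reduce step only needs all values for one key, both can be spread across many machines without changing the logic.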
39. 5 min intro to Hadoop Ecosystem
➔ Write in assembly
◆ Java
➔ DSLs
◆ Pig
◆ Hive
◆ Impala
➔ Functional Languages to the rescue!