What are "lexical resources" that can go into defining words and phrases? Visualizations and resources for studying language. (Presentation given at Dictionary.com)
5. SOME OTHER PLACES TO CHECK OUT
• The Google Ngram Viewer helps you understand
trends across a bazillion books that Google has
digitized. It’s an amazing resource:
• So is the Corpus of Historical American English:
http://corpus.byu.edu/coha/ (COHA)
• And the Corpus of Contemporary American English:
http://corpus.byu.edu/coca/ (COCA)
8. TAKING CARE WITH COUNTS
• The counts in the last two slides are too small to be
anything more than interesting
• The next slide shows us tracking the collocates of
future
• Collocates are the words that appear near a given
word—one of the chief collocates of salt is pepper,
for example
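To make the idea of collocates concrete, here is a minimal sketch of counting the words that appear within a small window around a target word. The function name `collocates`, the window size of 2, and the toy text are all assumptions made for illustration; they are not from the slides or from any particular corpus tool.

```python
from collections import Counter

def collocates(tokens, target, window=2):
    """Count words appearing within `window` tokens of each occurrence of `target`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the target word itself
                counts[tokens[j]] += 1
    return counts

# Invented toy text, not real corpus data.
toy_text = ("pass the salt and pepper please . i like salt and pepper on eggs . "
            "salt and vinegar chips")
salt_collocates = collocates(toy_text.split(), "salt")
print(salt_collocates.most_common(3))
```

Raw window counts like these favor very frequent words such as "and", and with a text this small the numbers are only illustrative, which is the same caution about counts made above.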
15. MEANING IS IN THE USE
• “For a large class of cases of the employment of the word ‘meaning’—though not for all—this word can be explained in this way: the meaning of a word is its use in the language” — Wittgenstein, Philosophical Investigations
16. MEANING IN THE USE
• Tumblr moms use over 4x as many [emoji image missing] and [emoji image missing] as Twitter peeps
• What are the
collocates?
• Blue: his he him
• Purple: she’s she
• No pink heart option!
• See also http://www.washingtonpost.com/sf/opinions/2015/02/12/why-moms-love-emoji/ and
http://idibon.com/emomji-emoji-new-moms-use/
17. CO-OCCURRENCES MATTER (MOVIE REVIEW RATINGS AND WORDS)
• The idea here is that if you’re writing a review and use the word wow, you’re being very positive
or very negative. You don’t say Wow, I have a balanced and neutral opinion on this very often.
• If you’re using however, however, you’re likely to be in the middle of your movie review rating or
travel summary—not at the very positive/negative extremes.
• See also http://web.stanford.edu/~cgpotts/manuscripts/potts-schwarz-exclamatives08.pdf and
http://web.stanford.edu/~cgpotts/papers/constant-davis-potts-schwarz-expressives.pdf
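One rough way to check a pattern like this is to compute, for each rating level, the share of reviews that contain a given word. The sketch below assumes reviews arrive as `(rating, text)` pairs; the function name `rating_profile` and the six toy reviews are invented for illustration and are not data from the linked papers.

```python
from collections import Counter

def rating_profile(reviews, word):
    """Return {rating: share of reviews at that rating that contain `word`}."""
    totals = Counter()
    hits = Counter()
    for rating, text in reviews:
        totals[rating] += 1
        if word in text.lower().split():
            hits[rating] += 1
    return {rating: hits[rating] / totals[rating] for rating in sorted(totals)}

# Invented toy reviews (rating on a 1-5 scale), for illustration only.
toy_reviews = [
    (1, "wow this was terrible"),
    (1, "boring and slow"),
    (3, "good acting however the plot drags"),
    (3, "fine however forgettable"),
    (5, "wow what a film"),
    (5, "loved every minute"),
]

wow_profile = rating_profile(toy_reviews, "wow")
however_profile = rating_profile(toy_reviews, "however")
print(wow_profile)
print(however_profile)
```

In this toy set, wow shows up only at the extremes and however only in the middle, which is the U-shape versus hump-shape contrast the slide describes. Real data would need far more reviews per rating before the shape means anything.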
18. FOUR CASE STUDIES
• Wholesomeness: http://idibon.com/wholesome-branding-campaign-effectiveness/
• Entrepreneur: http://idibon.com/entrepreneurs-french-spanish-english/
• Because X: http://idibon.com/innovating-innovation/
• #BlackLivesMatter: http://idibon.com/blacklivesmatter-events-change-conversations/
21. DEEP HISTORY
• The first uses of wholesome tended to be about
‘virtuous teachings’.
• In Wycliffe’s Bible way back in 1382:
The..holsum wordis of oure Lord Jhesu Crist. (1 Timothy 6:3)
(Modern versions treat wordis as ‘words’, ‘teachings’, or
‘instructions’.)
23. HOW ABOUT IN SOCIAL MEDIA?
• You have to deal with spam (11% of data in this
case; another 36% of data is “Wholesome Radio”,
which is probably irrelevant)
• In 2014 tweets:
• Food: 23% (but mostly not about Honey Maid)
• Humans: 23% (and how they can/should live; church-
related mentions are prominent)
• Entertainment: 13% (movies, TV)
• Now let’s compare this to 2011 tweet uses:
• Humans: 32%
• Entertainment: 12%
• Food: 9%
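The comparison between the two years is easier to read as percentage-point changes. This small sketch uses only the shares listed above; the variable names are made up for illustration.

```python
# Category shares (percent of tweets) as listed on the slide.
share_2011 = {"Humans": 32, "Entertainment": 12, "Food": 9}
share_2014 = {"Food": 23, "Humans": 23, "Entertainment": 13}

# Percentage-point change from 2011 to 2014 for each category.
change = {category: share_2014[category] - share_2011[category]
          for category in share_2014}

for category, delta in sorted(change.items(), key=lambda item: -item[1]):
    print(f"{category}: {delta:+d} points")
```

Food gains the most ground and Humans loses the most, while Entertainment barely moves.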
25. MORE ON CONTESTED WORDS
• In the next slide, you’ll see an image from Monroe et al. (2008)
• This work starts from something basic that we know: Republicans and Democrats speak about the same issue differently.
• In the next slide, they show methods that can pull out how the parties speak about abortion when they take the floor.
• The words at the top are the Democratic party words; the ones at the bottom are the Republican party words.
• http://languagelog.ldc.upenn.edu/myl/Monroe.pdf
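One of the methods discussed in Monroe et al. is a log-odds ratio smoothed with a Dirichlet prior and scaled by its estimated variance. The sketch below is a simplified version under stated assumptions: it uses a flat prior (`prior=0.01` per word) rather than an informative one, the function name `log_odds_z` is made up, and the word counts are invented toy numbers that do not come from the paper or from any real speeches.

```python
import math
from collections import Counter

def log_odds_z(counts_a, counts_b, prior=0.01):
    """Z-scored log-odds ratio per word; positive leans toward group A.

    Assumes a flat Dirichlet prior of `prior` per word and a shared
    vocabulary with more than one word.
    """
    vocab = set(counts_a) | set(counts_b)
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    prior_total = prior * len(vocab)
    scores = {}
    for word in vocab:
        y_a = counts_a.get(word, 0) + prior
        y_b = counts_b.get(word, 0) + prior
        # Difference of smoothed log-odds between the two groups.
        delta = (math.log(y_a / (n_a + prior_total - y_a))
                 - math.log(y_b / (n_b + prior_total - y_b)))
        # Approximate variance of that difference.
        variance = 1.0 / y_a + 1.0 / y_b
        scores[word] = delta / math.sqrt(variance)
    return scores

# Invented toy counts for two groups of speakers, for illustration only.
group_a = Counter({"health": 8, "schools": 6, "taxes": 1})
group_b = Counter({"taxes": 7, "borders": 6, "health": 1})

scores = log_odds_z(group_a, group_b)
for word, z in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{word}: {z:+.2f}")
```

Words with large positive scores are characteristic of the first group and large negative scores of the second, which is the top-versus-bottom layout described for the image.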
28. ENTREPRENEUR IN ENGLISH, FRENCH, SPANISH
• Tycoon, mogul, industrialist
• A flavor of ‘ill-gotten gains’
• Entrepreneur doesn’t seem to have this (in English, right now)
• Collocates have to do with:
• Advice
• Success
• Investors
• Marketing
• Social (media/services/topics/techniques)
• Failure (especially fear-of)
• Lots of named entities (SXSW, Dubai, #KSA, Twitter, Google, LinkedIn,
Etsy)
• The people using entrepreneur identify themselves as
• Authors, speakers, writers, bloggers, strategists, (life) coaches,
consultants, moms, wives, husbands, fathers, food-lovers, music-lovers
30. INTERCONNECTED AXES OF DIFFERENCE
• Genre (State of the Unions vs. Reddit comments)
• Time (1940s vs. the last ten years)
• Geography (hella vs. wicked)
• Traditional demographics (age, gender, education)
• Personal identity/style (nerd, goth, bro, mom)
32. INNOVATIONS AND THEIR COMMUNITIES
• Because X’ers disproportionately like:
• YouTube
• Tumblr
• One Direction (especially Harry)
• Justin Bieber
• Ariana Grande
• “bands”
• pizza
• sex
• cats
• books
• They are decidedly less likely to talk about
• software
• basketball
• NASCAR
• business
• words associated with African-American Vernacular English
34. PART OF SPEECH TAGGERS ARE GOOD
• There’s even a pretty good one for Twitter POS
Part of speech              Word counts ≥ 50
Noun (people, spoilers)     32.02%
Compressed clause (ilysm)   21.78%
Adjective (ugly, tired)     16.04%
Interjection (sweg, omg)    14.71%
Agreement (yeah, no)        12.97%
Pronoun (you, me)           2.45%
37. TOPIC MODELING
• In the previous sections, I’ve been noting what you can
do when you have two or more comparison sets
• How is wholesome used in time x vs. time y vs. time z?
• What are the differences between English speakers talking
about entrepreneurship vs. French speakers and Spanish
speakers?
• How are people who use the innovative Because X
construction different than people who don’t use it?
• In this section, we talk about topic modeling, which is a
way to automatically identify clusters within a data set,
even if you don’t have a comparison set.
• We’ll use this to explore conversations around
#blacklivesmatter, but we’ll also see how these
conversations shift before/after a particular moment in
time
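To give a feel for what topic modeling does under the hood, here is a minimal sketch of one common approach: latent Dirichlet allocation (LDA) fitted with collapsed Gibbs sampling. The slides do not say which algorithm or library was used, so this is an assumption chosen for illustration; the function name `lda_gibbs`, the hyperparameters, and the toy documents are all invented.

```python
import random
from collections import Counter

def lda_gibbs(docs, num_topics=2, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                        # tokens per topic
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            z = rng.randrange(num_topics)
            doc_assignments.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word] += 1
            topic_total[z] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][word] -= 1
                topic_total[z] -= 1
                # Resample: topics popular in this doc and fond of this word win.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][word] += 1
                topic_total[z] += 1
    return topic_word, doc_topic

# Invented toy documents, for illustration only.
toy_docs = [
    "pizza cheese oven pizza dough".split(),
    "dough oven cheese bake".split(),
    "goal team match goal coach".split(),
    "coach team match score".split(),
]
topic_word, doc_topic = lda_gibbs(toy_docs, num_topics=2)
for k, counts in enumerate(topic_word):
    print(k, [word for word, count in counts.most_common(3) if count > 0])
```

No comparison set is supplied here: the clusters come only from which words tend to occur together in the same documents. In practice you would likely reach for an established library instead of hand-rolling the sampler, and results on a corpus this tiny are not meaningful.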
40. UNKNOWN UNKNOWNS
• In general, topic modeling is a way of addressing
the limits of our knowledge. If you’re asking a
question about data, you probably know
something about the data going in.
• But what we hear from people is that they are keenly aware
that they don’t know what they don’t know.
• Topic modeling is meant to help with that.
• In the next slides, another use of topic modeling:
identifying the themes of Martin Luther King Jr.’s
major speeches and sermons
41. • Topic modeling Dr. King’s major speeches and sermons gets these topics
• Which change over time
• See also http://idibon.com/topic-detection-mlk/