Efficient Parallel Learning of Word2Vec

Since its introduction, Word2Vec and its variants have been widely used to learn semantics-preserving representations of words or entities in an embedding space, which can be used to produce state-of-the-art results for various Natural Language Processing tasks. Existing implementations aim to learn efficiently by running multiple threads in parallel while operating on a single model in shared memory, ignoring incidental memory update collisions. We show that these collisions can degrade the efficiency of parallel learning, and propose a straightforward caching strategy that improves the efficiency by a factor of 4. This paper has been accepted for presentation at the ICML Machine Learning Systems Workshop in New York City, USA.

Efficient Parallel Learning of Word2Vec
Jeroen B. P. Vuurens¹, Carsten Eickhoff², and Arjen P. de Vries³
¹The Hague University of Applied Science
²ETH Zurich
³Radboud University Nijmegen
June 24, 2016
J. Vuurens et al. Efficient Parallel Learning of Word2Vec June 24, 2016 1 / 14

Word2Vec
Simple method for low-dimensional feature representation of words
Beneficial properties:
Unsupervised
Semantics-preserving (up to a point...)
Figure courtesy of T. Mikolov et al.

More is more...
Figure courtesy of http://deepdist.com/

Parallel Training
Shared model θ
Parallel SGD threads
Draw a random training example $x_i$
Acquire a lock on θ
Read θ
Update $\theta \leftarrow \theta - \alpha \nabla L(f_\theta(x_i), y_i)$
Release lock
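
The locked training loop listed on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation discussed in the talk: the model is a single weight fitted with a squared loss, and the names `locked_sgd`, `worker`, and the toy `data` are hypothetical.

```python
import random
import threading

def locked_sgd(data, n_threads=4, steps_per_thread=2000, alpha=0.05, seed=0):
    """Fit y = w * x with SGD; every update holds one global lock on theta."""
    theta = [0.0]                      # shared model theta (a single weight here)
    lock = threading.Lock()

    def worker(worker_seed):
        rng = random.Random(worker_seed)
        for _ in range(steps_per_thread):
            x, y = rng.choice(data)            # draw a random training example
            with lock:                         # acquire the lock on theta
                w = theta[0]                   # read theta
                grad = 2.0 * (w * x - y) * x   # gradient of the loss (w*x - y)^2
                theta[0] = w - alpha * grad    # update theta
            # leaving the `with` block releases the lock

    threads = [threading.Thread(target=worker, args=(seed + i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return theta[0]

# Toy, noise-free data generated with a true weight of 3.0 (an assumption).
data = [(x / 10.0, 3.0 * x / 10.0) for x in range(1, 11)]
w = locked_sgd(data)
print(round(w, 3))
```

Because every thread must hold the same lock to read and write θ, the updates are effectively serialized, which is the cost that lock-free schemes try to avoid.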

Hogwild!
Simply skip the locking:
Draw a random training example $x_i$
Read current state of θ
Update $\theta \leftarrow \theta - \alpha \nabla L(f_\theta(x_i), y_i)$
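
A lock-free variant of the same loop might look as follows. Again this is a minimal sketch rather than the code used in the talk: the two-dimensional model, the sparse toy data in which each example touches a single coordinate, and the name `hogwild_sgd` are assumptions made for illustration.

```python
import random
import threading

def hogwild_sgd(data, dim, n_threads=4, steps_per_thread=2000, alpha=0.05, seed=0):
    """SGD on a shared parameter list with no locking at all."""
    theta = [0.0] * dim                # shared model, never locked

    def worker(worker_seed):
        rng = random.Random(worker_seed)
        for _ in range(steps_per_thread):
            i, x, y = rng.choice(data)         # draw a random training example
            w = theta[i]                       # read current (possibly stale) state
            grad = 2.0 * (w * x - y) * x       # gradient of (w*x - y)^2 w.r.t. theta[i]
            theta[i] = w - alpha * grad        # write back; may overwrite a concurrent update

    threads = [threading.Thread(target=worker, args=(seed + k,))
               for k in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return theta

# Sparse toy data: each example reads and writes only one coordinate of theta.
true_w = [3.0, -2.0]
data = [(i, x / 10.0, true_w[i] * x / 10.0) for i in range(2) for x in range(1, 11)]
theta = hogwild_sgd(data, dim=2)
print([round(v, 2) for v in theta])
```

The sketch tolerates the occasional lost or stale update because most examples touch different coordinates. When many threads keep writing to the same few parameters, that assumption weakens.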

Parallel Word2Vec
Intel Xeon CPU E5-2698 v3, 32 cores
Original C implementation + Gensim

Hierarchical Softmax
Binary Huffman tree
V − 1 internal nodes
Each word w is represented by a number of binary decisions
The tree’s top nodes are part of most paths
Figure courtesy of X. Rong
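
To make the points on this slide concrete, the sketch below builds a binary Huffman tree over a small vocabulary and counts how often each internal node lies on a word's path. The vocabulary, its Zipf-like frequencies (rank r occurring about 1000/r times), and the helper name `build_huffman_paths` are hypothetical and chosen only for illustration.

```python
import heapq
from collections import Counter

def build_huffman_paths(freqs):
    """Return {word: [internal node ids, root first]} for a binary Huffman tree."""
    words = list(freqs)
    V = len(words)
    # Heap entries are (frequency, tie_breaker, node_id).
    # Leaves get ids 0..V-1, internal nodes get ids V..2V-2.
    heap = [(freqs[w], i, i) for i, w in enumerate(words)]
    heapq.heapify(heap)
    parent = {}
    next_id = V
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)         # two least frequent subtrees
        f2, _, b = heapq.heappop(heap)
        parent[a] = parent[b] = next_id        # merge them under a new internal node
        heapq.heappush(heap, (f1 + f2, next_id, next_id))
        next_id += 1
    paths = {}
    for i, w in enumerate(words):
        path, node = [], i
        while node in parent:                  # walk from the leaf up to the root
            node = parent[node]
            path.append(node)
        paths[w] = path[::-1]                  # store root first
    return paths

# Hypothetical Zipf-like vocabulary of 64 words.
freqs = {f"w{r}": 1000 // r for r in range(1, 65)}
paths = build_huffman_paths(freqs)

internal_nodes = {n for p in paths.values() for n in p}
node_load = Counter()                          # tokens whose update touches each node
for w, p in paths.items():
    for n in p:
        node_load[n] += freqs[w]

root = paths["w1"][0]
total = sum(freqs.values())
print(len(internal_nodes), node_load[root] == total)
for node, load in node_load.most_common(5):
    print(node, round(load / total, 2))
```

Each word's path is its sequence of binary decisions, there are V − 1 internal nodes, and the root is touched by every token. The few nodes directly below the root absorb a large share of all updates.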

Zipf’s Law
Figure courtesy of http://wugology.com/

Cached Huffman Trees
Cache the top c nodes in the tree
Each thread works on its own, possibly stale, copy of these top nodes
Update cache every u terms
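
One possible reading of this caching scheme is sketched below. It is an assumption-laden illustration, not the authors' Python/Cython code: the class `TopNodeCache`, its methods, and the choice to synchronize by adding the locally accumulated change back to the shared vectors are all hypothetical.

```python
class TopNodeCache:
    """Thread-local copy of the vectors of the c most frequently used nodes."""

    def __init__(self, shared, top_ids, u):
        self.shared = shared            # dict: node id -> list of floats (shared model)
        self.top_ids = list(top_ids)    # ids of the c cached top nodes
        self.u = u                      # synchronize after every u processed terms
        self.seen = 0
        self._refresh()

    def _refresh(self):
        # Take a fresh local copy and remember the values it started from.
        self.local = {n: list(self.shared[n]) for n in self.top_ids}
        self.base = {n: list(self.shared[n]) for n in self.top_ids}

    def vector(self, node_id):
        """Vector to read and update in place for this node."""
        if node_id in self.local:
            return self.local[node_id]   # cached top node: private copy
        return self.shared[node_id]      # any other node: shared memory

    def term_done(self):
        self.seen += 1
        if self.seen % self.u == 0:
            self.flush()

    def flush(self):
        # Add the locally accumulated change to the shared model, then re-copy.
        for n in self.top_ids:
            vec = self.shared[n]
            for k, (new, old) in enumerate(zip(self.local[n], self.base[n])):
                vec[k] += new - old
        self._refresh()

shared = {0: [0.0, 0.0], 1: [0.0, 0.0]}   # node 0 plays the role of a top node
cache = TopNodeCache(shared, top_ids=[0], u=3)
history = []
for _ in range(3):
    cache.vector(0)[0] += 1.0   # update to the cached top node stays local
    cache.vector(1)[0] += 1.0   # update to another node goes to shared memory
    cache.term_done()
    history.append(shared[0][0])
print(history, shared[1][0])
```

In this sketch the shared copy of a top node is written only once per u terms per thread instead of once per term, which is the intended reduction in write traffic on the most contended nodes.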

Efficiency
Python/Cython implementation of cached Huffman trees
Same scaling problem at c = 0 (no caching)
Significantly better performance at c = 31

Cache Size
Consistent improvements for all c ≤ 31
Best results for 1 ≤ u ≤ 10
Overly large values of u degrade model quality

Effectiveness
Stable model quality
Slight quality edge for Gensim implementation

Conclusion
Hierarchical Softmax scales badly beyond 4-8 nodes
Frequent memory accesses to top nodes
Zipf’s Law
Caching few top nodes
4x speed-up

Thank You!
j.b.p.vuurens@tudelft.nl

ChatGPT and the Future of Work - Clark Boyd

Clark Boyd

Everyone is in agreement that ChatGPT (and other generative AI tools) will shape the future of work. Yet there is little consensus on exactly how, when, and to what extent this technology will change our world. Businesses that extract maximum value from ChatGPT will use it as a collaborative tool for everything from brainstorming to technical maintenance. For individuals, now is the time to pinpoint the skills the future professional will need to thrive in the AI age. Check out this presentation to understand what ChatGPT is, how it will shape the future of work, and how you can prepare to take advantage.

Getting into the tech field. what next

Tessa Mero

The document provides career advice for getting into the tech field, including: - Doing projects and internships in college to build a portfolio. - Learning about different roles and technologies through industry research. - Contributing to open source projects to build experience and network. - Developing a personal brand through a website and social media presence. - Networking through events, communities, and finding a mentor. - Practicing interviews through mock interviews and whiteboarding coding questions.

Google's Just Not That Into You: Understanding Core Updates & Search Intent

Lily Ray

1. Core updates from Google periodically change how its algorithms assess and rank websites and pages. This can impact rankings through shifts in user intent, site quality issues being caught up to, world events influencing queries, and overhauls to search like the E-A-T framework. 2. There are many possible user intents beyond just transactional, navigational and informational. Identifying intent shifts is important during core updates. Sites may need to optimize for new intents through different content types and sections. 3. Responding effectively to core updates requires analyzing "before and after" data to understand changes, identifying new intents or page types, and ensuring content matches appropriate intents across video, images, knowledge graphs and more.

How to have difficult conversations

Rajiv Jayarajah, MAppComm, ACC

Introduction to Data Science

Christy Abraham Joy

Time Management & Productivity - Best Practices

Vit Horky

The six step guide to practical project management

MindGenius

The six step guide to practical project management If you think managing projects is too difficult, think again. We’ve stripped back project management processes to the basics – to make it quicker and easier, without sacrificing the vital ingredients for success. “If you’re looking for some real-world guidance, then The Six Step Guide to Practical Project Management will help.” Dr Andrew Makar, Tactical Project Management

Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...

RachelPearson36

Featured (20)

2024 State of Marketing Report – by Hubspot

Everything You Need To Know About ChatGPT

Product Design Trends in 2024 | Teenage Engineerings

How Race, Age and Gender Shape Attitudes Towards Mental Health

AI Trends in Creative Operations 2024 by Artwork Flow.pdf

Skeleton Culture Code

PEPSICO Presentation to CAGNY Conference Feb 2024

Content Methodology: A Best Practices Report (Webinar)

How to Prepare For a Successful Job Search for 2024

Social Media Marketing Trends 2024 // The Global Indie Insights

Trends In Paid Search: Navigating The Digital Landscape In 2024

5 Public speaking tips from TED - Visualized summary

ChatGPT and the Future of Work - Clark Boyd

Getting into the tech field. what next

Google's Just Not That Into You: Understanding Core Updates & Search Intent

How to have difficult conversations

Introduction to Data Science

Time Management & Productivity - Best Practices

The six step guide to practical project management

Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...

Efficient Parallel Learning of Word2Vec

1. Efficient Parallel Learning of Word2Vec Jeroen B. P. Vuurens1, Carsten Eickhoff2, and Arjen P. de Vries3 1The Hague University of Applied Science 2ETH Zurich 3Radboud University Nijmegen June 24, 2016 J. Vuurens et al. Efficient Parallel Learning of Word2Vec June 24, 2016 1 / 14

5. Word2Vec: Simple method for low-dimensional feature representation of words. Beneficial properties: unsupervised; semantics-preserving (up to a point...); recently very popular. Figure courtesy of T. Mikolov et al.

6. More is more... Figure courtesy of http://deepdist.com/

14. Parallel Training: Shared model θ. Parallel SGD threads each repeat: (1) draw a random training example x_i; (2) acquire a lock on θ; (3) read θ; (4) update $\theta \leftarrow \theta - \alpha \nabla L(f_\theta(x_i), y_i)$; (5) release the lock. Result: lots of waiting...
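
The locked scheme on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model is a small linear regressor with squared loss (not Word2Vec), and the name `sgd_locked` and all of its parameters are hypothetical, chosen for this example.

```python
import random
import threading


def sgd_locked(examples, dim, alpha=0.05, steps_per_thread=2000, n_threads=4, seed=0):
    """Fit y ~ theta . x with squared loss; one global lock guards theta.

    Hypothetical helper for illustration, not the authors' implementation.
    """
    theta = [0.0] * dim        # shared model
    lock = threading.Lock()    # every read and write of theta happens under this lock

    def worker(thread_id):
        rng = random.Random(seed + thread_id)
        for _ in range(steps_per_thread):
            x, y = rng.choice(examples)                         # draw a random training example
            with lock:                                          # acquire the lock on theta
                pred = sum(t * xi for t, xi in zip(theta, x))   # read theta
                err = pred - y                                  # gradient of 0.5 * (pred - y)^2 is err * x
                for j in range(dim):
                    theta[j] -= alpha * err * x[j]              # update theta
            # the lock is released when the with-block exits

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return theta
```

Because the whole read-update step sits inside the lock, only one thread makes progress at a time, which is the waiting the slide refers to.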

20. Hogwild! Simply skip the locking: (1) draw a random training example x_i; (2) read the current state of θ; (3) update $\theta \leftarrow \theta - \alpha \nabla L(f_\theta(x_i), y_i)$.
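
A lock-free variant in the spirit of Hogwild! might look like the sketch below. The assumptions are the same as for any toy example: a small linear regressor with squared loss stands in for the real model, and `sgd_hogwild` and its parameters are hypothetical names. In standard CPython the global interpreter lock still serializes bytecode, so this shows the control flow rather than the speed-up.

```python
import random
import threading


def sgd_hogwild(examples, dim, alpha=0.05, steps_per_thread=2000, n_threads=4, seed=0):
    """Fit y ~ theta . x with squared loss; threads update theta without any lock.

    Hypothetical helper for illustration, not the authors' implementation.
    """
    theta = [0.0] * dim   # shared model, deliberately unguarded

    def worker(thread_id):
        rng = random.Random(seed + thread_id)
        for _ in range(steps_per_thread):
            x, y = rng.choice(examples)                      # draw a random training example
            pred = sum(t * xi for t, xi in zip(theta, x))    # read theta; values may be partly stale
            err = pred - y
            for j in range(dim):
                theta[j] -= alpha * err * x[j]               # other threads may interleave here

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return theta
```

Occasional overwritten or stale updates are simply tolerated; the bet is that they are rare enough not to hurt convergence.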

24. Parallel Word2Vec: Intel Xeon CPU E5-2698 v3, 32 cores. Original C implementation + Gensim.

29. Hierarchical Softmax: Binary Huffman tree with V − 1 internal nodes. Each word w is represented by a number of binary decisions. The tree's top nodes are part of most paths. Figure courtesy of X. Rong.
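
To make the tree structure concrete, here is a minimal sketch that builds a binary Huffman tree over word counts and records each word's root-to-leaf path of binary decisions. The function names and the toy vocabulary are assumptions made for this example; the node numbering (leaves first, internal nodes after) is just one convenient choice.

```python
import heapq
from collections import Counter


def build_huffman_paths(word_counts):
    """Return ({word: [(internal_node_id, bit), ...]}, number_of_internal_nodes).

    Leaves get ids 0..V-1 (words in sorted order); internal nodes get ids V..2V-2.
    Paths are ordered from the root down to the word's leaf.
    """
    words = sorted(word_counts)
    V = len(words)
    heap = [(word_counts[w], i) for i, w in enumerate(words)]   # (count, node_id)
    heapq.heapify(heap)
    parent = {}          # node_id -> (parent_id, bit taken at the parent)
    next_id = V
    while len(heap) > 1:
        c0, n0 = heapq.heappop(heap)    # merge the two least frequent nodes
        c1, n1 = heapq.heappop(heap)
        parent[n0] = (next_id, 0)
        parent[n1] = (next_id, 1)
        heapq.heappush(heap, (c0 + c1, next_id))
        next_id += 1
    paths = {}
    for i, w in enumerate(words):
        path, node = [], i
        while node in parent:           # climb from the leaf to the root
            node, bit = parent[node]
            path.append((node, bit))
        paths[w] = path[::-1]           # reverse so the path starts at the root
    return paths, next_id - V


def node_touch_counts(paths, word_counts):
    """Count how often each internal node is visited, weighted by word frequency."""
    touches = Counter()
    for w, path in paths.items():
        for node_id, _bit in path:
            touches[node_id] += word_counts[w]
    return touches
```

With counts such as `{"the": 50, "of": 30, "cat": 10, "sat": 6, "mat": 4}`, the root lies on every word's path, while nodes further down are visited far less often.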

30. Zipf's Law. Figure courtesy of http://wugology.com/
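
For reference, Zipf's law is commonly written as below, where f(r) is the frequency of the word with frequency rank r. The exponent s is corpus-dependent; s ≈ 1 is the usual rule of thumb for natural-language text and is stated here as an assumption, not a value taken from the slides.

```latex
f(r) \propto \frac{1}{r^{s}}, \qquad s \approx 1
```

A small number of very frequent words therefore accounts for a large share of all training tokens, which is one way to see why updates concentrate on a few tree nodes.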

34. Cached Huffman Trees: Cache the top c nodes in the tree. Every thread works on its own stale copy of these top nodes. Update the cache every u terms.
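
One plausible reading of this caching scheme is sketched below. The class `TopNodeCache` and its methods are hypothetical names, and the write-back rule (add only the change accumulated since the last sync to the shared vectors) is an assumption; the slides do not spell out how the cache is merged back.

```python
class TopNodeCache:
    """Thread-local copy of the weight vectors of the c most frequently used nodes.

    Hypothetical sketch; the delta-based write-back is an assumption.
    """

    def __init__(self, shared, cached_ids, update_every):
        self.shared = shared                 # dict: node_id -> list of floats, shared by all threads
        self.cached_ids = list(cached_ids)   # ids of the top-c nodes
        self.update_every = update_every     # u: number of processed terms between syncs
        self.terms_seen = 0
        self._refresh()

    def _refresh(self):
        # snapshot: shared state at the last sync; local: private working copy
        self.snapshot = {n: list(self.shared[n]) for n in self.cached_ids}
        self.local = {n: list(self.shared[n]) for n in self.cached_ids}

    def vector(self, node_id):
        """Vector to read and update: the private copy for cached nodes, shared otherwise."""
        if node_id in self.local:
            return self.local[node_id]
        return self.shared[node_id]

    def term_done(self):
        """Call once per processed term; syncs with shared memory every u terms."""
        self.terms_seen += 1
        if self.terms_seen % self.update_every == 0:
            self.sync()

    def sync(self):
        # Add only the accumulated change, so updates from other threads are kept.
        for n in self.cached_ids:
            shared_vec = self.shared[n]
            for j, (new, old) in enumerate(zip(self.local[n], self.snapshot[n])):
                shared_vec[j] += new - old
        self._refresh()
```

Each worker thread would create its own `TopNodeCache`, fetch vectors through `vector(node_id)` while training on a term, and call `term_done()` afterwards. Setting the list of cached ids to empty corresponds to c = 0, where every access goes to shared memory.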

38. Efficiency: Python/Cython implementation of cached Huffman trees. Same problem at c = 0. Significantly better performance at c = 31.

42. Cache Size: Consistent improvements for all c ≤ 31. Best results for 1 ≤ u ≤ 10. Overly large values of u degrade model quality.

45. Effectiveness: Stable model quality. Slight quality edge for the Gensim implementation.

52. Conclusion: Hierarchical Softmax scales badly beyond 4-8 nodes: frequent memory accesses to top nodes; Zipf's Law. Caching a few top nodes: 4x speed-up; constant model quality. Try it yourself: http://cythnn.github.io

54. Thank You! j.b.p.vuurens@tudelft.nl
