This document discusses categorizing job titles. It begins by noting that the current 25 job functions are too few, too non-specific, and too big, while the 25,000 clean job titles are too many and, individually, variously too big, too small, too specific, or too non-specific. The document then outlines constraints for a new categorization system: around 200 categories, a unique category for each title, precision over coverage, and about 80% coverage of categorizable titles. It presents an initial machine solution for categorizing titles, compares it with feedback from domain experts, and discusses implementing additional feedback and query functionality. It also explores technical aspects of the distribution of members over titles and the geometry of the title word-vector space.
TitleCategoriesLI
1. TITLE CATEGORIZATION 2.1
MENTOR: RAMESH SUBRAMONIAN
TEAM: DATA ANALYTICS
LEADER: DANIEL TUNKELANG
ACKNOWLEDGEMENTS:
SIMLA CEYHAN
DANIEL TUNKELANG
MONICA ROGATTI
LAUREN OLERICH
RON BEKKERMAN
FLOW
CHRISTIAN POSSE
2. Motivation: CURRENT STATUS
25 JOB FUNCTIONS:
• TOO FEW: No Field Sales
• TOO NON-SPECIFIC: Reporting is difficult
• TOO BIG
25000 CLEAN JOB TITLES:
• TOO MANY
• TOO BIG (“Owner” ~ 5M)
• TOO SMALL (~ 500)
• TOO SPECIFIC (“Human Resources Info. Sys. Mgr.”)
• TOO NON-SPECIFIC (“Specialist”)
3. CONSTRAINTS
• INPUT
Clean title          “Impressions”
…                    …
facilities manager   95674
…                    …

Title 1   Title 2   Cosine
…         …         (1,0)

• OUTPUT
Clean title          Category
…                    …
Blonde hair stylist  Hair stylist
Chair stylist        Furniture maker
…                    …
Owner                VAGUE (UNCATEGORIZABLE)
barista              Independent: not vague. Doesn’t fit in any existing category, too small to form a category
…
4. CONSTRAINTS (CONTD)
• ~ 200 categories (from Sales: can be dealt with
on human scale)
• Title maps to Unique category
• Precision over coverage
• Coverage ~ 80% of categorizable titles
• 2-3 nearest categories for each category
• 2 alternate categories for each title
5. Machine solution V00
User Domain Expert Feedback (Ester/Lauren in Sales)
Less than 1.5% change in coverage!
Illustrates “goodness” of computational solution!
12. Status
• Handed over to Ester/Lauren in Sales
• Iteratively incorporate human feedback
• Solution is Public, code is documented and
with Ramesh, working on final report
• ~2-3 new technical innovations
• Developed a proposal for “titles” based on
current understanding of LinkedIn needs
13. Feedback Functionality: Implemented
• Title:
1. Delete from Category (Independent)
2. Move to vague
3. Move to another category
4. Define new category
Category:
1. Delete if empty
2. Rename
3. Merge with another
14. Cool Technical stuff
• Distribution of membership over titles
– How used
• Geometry of Title Word vector space
– How used and should be
– Lack of hyperstructure/scale
• How to cluster stars and “Local Dimension”
– How used
– Lack of asymptotic behaviour or transition point
during clustering
16. Membership Distribution in Titles
• 90% members in 6000 titles
• 10% members in 19000 titles
• Slope of curve nearly -1 at cut-off rank ~ 6000
• Slope drops to within some % of -1: diminishing marginal returns. Cut-off should be based on marginal increase in potential earnings minus marginal increase in overhead costs
[Figure: %ile of titles by impressions - %ile of titles by rank VS. Rank of title. x-axis: Rank_decr_imps (0 to 20000); y-axis: impsminustitles (0.0 to 0.6); reference line with Slope = -1]
7/13/11 Grp Mtg RSTate, LinkedIn 16
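The cut-off rule on this slide can be sketched in a few lines. The sketch assumes one reading of the plotted curve: cumulative share of impressions minus cumulative share of titles, against rank by decreasing impressions. Under that reading the slope per unit of rank share is `n * imps / total - 1`, which approaches -1 once a title adds almost no impressions. The function name, the `tolerance` parameter and the sample numbers are illustrative assumptions, not taken from the slides.

```python
def cutoff_rank(impressions, tolerance=0.1):
    """Return the 1-based rank where the curve's slope first comes within
    `tolerance` of -1 (hypothetical helper, assumed reading of the slide)."""
    ranked = sorted(impressions, reverse=True)  # rank by decreasing impressions
    total = sum(ranked)
    n = len(ranked)
    for rank, imps in enumerate(ranked, start=1):
        # Slope of (cumulative impression share - cumulative title share)
        # per unit of title share at this rank.
        slope = n * imps / total - 1.0
        if slope <= -1.0 + tolerance:
            return rank
    return n


# Made-up impression counts: a few large titles followed by a long tail.
sample = [1000, 500, 200, 100, 1, 1, 1, 1, 1, 1]
print(cutoff_rank(sample))
```

A cost-based rule, as the slide suggests, would replace the fixed `tolerance` with a comparison of marginal earnings against marginal overhead.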
17. Projective Word-vector space
• Weighted point set embedded in Euclidean space, with induced metric based on Cosine Sim.
• 25008 points in 50,000 D! Recall that n points define only n-1 D
• DIMENSIONALLY SPARSE!, not just in density
• Most angles are nearly 90 degrees
[Figure: titles Ti and Tj separated by angle ϑij, drawn against XYZ, UVW and ABC axes; labels: “Boundary of nearest neighbour polyhedra of bins”, “Ti of size ni”]
18. GEOMETRY OF DATA SPACE:
How should be used:
1. Project Title Word vectors onto N-1 simplex: Σ components = 1
2. Calculate Mean Word Vector
3. Drop Titles (KLPDS)
4. Recalculate the Mean Word Vector and MOVE there (increases discrimination)
5. Project vectors onto unit sphere
6. Angle is geodesic measure (distances, density etc.): sin(θ/2) = |Ti-Tj|/2
As opposed to?
[Figure: panels labelled 1., 2-3. and 4-5., showing titles Ti and Tj and the angle θ between them]
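A minimal sketch of steps 1, 2, 4, 5 and 6 follows, using plain Python lists as word-count vectors. Step 3 (dropping titles) is omitted because the slide does not say how titles are chosen. All function names are hypothetical, and reading "MOVE there" as shifting the origin to the mean vector is an assumption.

```python
import math


def to_simplex(v):
    """Step 1: scale components so they sum to 1."""
    s = sum(v)
    return [x / s for x in v]


def mean_vector(vectors):
    """Steps 2 and 4: component-wise mean of the title vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def to_unit_sphere(v):
    """Step 5: scale to unit length (fails for a vector equal to the mean)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def geodesic_angle(u, v):
    """Step 6: angle from the chord length, sin(theta/2) = |u - v| / 2."""
    chord = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 2.0 * math.asin(min(1.0, chord / 2.0))


def recentred_unit_vectors(count_vectors):
    """Simplex projection, shift of origin to the mean, then unit sphere."""
    on_simplex = [to_simplex(v) for v in count_vectors]
    centre = mean_vector(on_simplex)
    moved = [[x - c for x, c in zip(v, centre)] for v in on_simplex]
    return [to_unit_sphere(v) for v in moved]


# Three titles with no words in common: 90 degrees apart before recentring.
titles = [[2, 0, 0], [0, 3, 0], [0, 0, 1]]
a, b, _ = recentred_unit_vectors(titles)
print(math.degrees(geodesic_angle(a, b)))
```

With non-negative word counts, angles measured from the origin cannot exceed 90 degrees. After the origin moves to the mean, the same three titles sit 120 degrees apart, which is one way to see the "increases discrimination" remark.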
19. Radial distribution function of Titles
Almost all angles are > 45
[Figure: count (0e+00 to 1e+07) vs. theta (10 to 70)]
No SCALE OR higher order structure (for hierarchical taxonomy)
20. Log(count) vs. Theta
[Figure: log of count (0 to 7) vs. theta (10 to 70)]
No scale or higher order structure (for hierarchical taxonomy)
23. LOCAL DIMENSION
Radius   Mass
1        1
2        8
3        27
4        64
Exponent (coefficient of linear term in log-log plot) = Dimension (above, it is 3)
Each point (title) has a local dimension Di, which is used to calculate the density of the cluster: Imps/r^Di
These densities are then compared and the highest selected for categories
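The table can be checked with a small least-squares fit of log(mass) against log(radius). This is only a sketch: the function names are hypothetical, and the density call uses made-up numbers. It applies the slide's formula Imps/r^Di directly.

```python
import math


def local_dimension(radii, masses):
    """Slope of the least-squares line through (log r, log mass)."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(m) for m in masses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def cluster_density(impressions, radius, dimension):
    """Density as defined on the slide: Imps / r^Di."""
    return impressions / radius ** dimension


# Radius/mass pairs from the slide: mass grows as radius cubed.
dim = local_dimension([1, 2, 3, 4], [1, 8, 27, 64])
print(round(dim, 6))

# Illustrative values only: 800 impressions within radius 2.
print(cluster_density(800, 2, dim))
```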
24. Aggregate Radial Distribution of Titles
[Figure: log10(Number of Titles) (0 to 8) vs. log10(Theta) (1 to 2); series logcount with linear fit y = 6.5687x - 5.3293]
Average cluster dimension ~ 6.6
25. Log(count) vs Dim.
What does “dimension of cluster” mean?
[Figure: log of count (0 to 10) vs. Dim (0 to 120)]
26. Power law evolution of clustering?
No natural break points.
Exponent = -1
[Figure: log(AvgDens, 10) (2.4 to 3.6) vs. log(Cats, 10) (2.2 to 3.4)]
27. FLOW
Big Picture: Taxonomy
Use case 2: Search
CLIENT: Recruiter, Advertiser, Recc.
Sales Team or Search
FLOW: Title categorization; Semantic network
[Diagram: Top Level choices (Management, Manufacturing, Marketing, Software); below them Marketing, Sales, Sales; Relational level: VP Sales, Dir. Sales, Sales Rep]
28. FLOW
Big Picture: Relational
Use case 1: Sales
FLOW: Taxonomy; Title categorization; Semantic network
[Diagram: Categories (Sales, FIELD SALES); Titles (Sales Rep, Sales Assoc., Sales Mgr, Reg. Sales Mgr); Members; links labelled Prob Defn 1 and Prob Defn 2]
PYMJPCOJ
29. Inadequacy of Cosine Similarity
• Bit vectors differing in 1/3 of their 1-bits ~ 70% Cosine Similarity and 70% Sine Dissimilarity
• PROOF of maintaining preference order does NOT account for Computational fragility: at θ = 6.3°, +/- 0.005 in Cosine => 2.6° to 8.5° in angle
• Vectors at 30 degs have Cosine Sim ~ 85%
• NOT a distance – NO geometry
• DOES NOT provide good discrimination between close neighbours
Even as intermediate means of calculating angle, computationally fragile:
• Poor choice, prone to error in region of interest
• 0 < angle < pi/2 (Maximally dissimilar only 90 degs away!)
• Inadequate notion of “maximally dissimilar”
FLOW: Obtaining Clean titles 2.0; V2.1 LEANER DATA; Deconstruct V2.0 and V2.1; V2.2 Data Space; Title categorization; Semantic network
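The figures quoted on this slide can be reproduced with the standard library. The sketch below is illustrative and its helper names are hypothetical. It shows how a small error in a cosine value widens into a large range of angles near zero, and computes the cosine and sine for two bit vectors that share two thirds of their 1-bits.

```python
import math


def angle_range_deg(theta_deg, cos_error):
    """Angles (degrees) consistent with cos(theta) known only to +/- cos_error."""
    c = math.cos(math.radians(theta_deg))
    low = math.degrees(math.acos(min(1.0, c + cos_error)))
    high = math.degrees(math.acos(max(-1.0, c - cos_error)))
    return low, high


def bit_vector_cosine(ones, shared):
    """Cosine of two bit vectors with `ones` 1-bits each, `shared` in common."""
    return shared / ones


# Fragility near small angles: +/- 0.005 in cosine around 6.3 degrees.
low, high = angle_range_deg(6.3, 0.005)
print(round(low, 1), round(high, 1))

# Bit vectors differing in 1/3 of their 1-bits, e.g. 30 ones with 20 shared.
cos_sim = bit_vector_cosine(30, 20)
sin_dissim = math.sqrt(1.0 - cos_sim ** 2)
print(round(cos_sim, 3), round(sin_dissim, 3))
```

The first call gives roughly 2.6 and 8.5 degrees, matching the slide. The second gives a cosine of about 0.67 and a sine of about 0.75, both close to the quoted 70%.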
30. What does LinkedIn want from Titles?
1. Navigational ease for Sales, Search, Recommendation
2. Robust and maintainable structure
3. Dynamic response to labor mkt changes
4. Structure based on Domain expertise, NOT on member
information
5. Assignment of members based on profile and inferred info
6. “Universal” acceptability
7. Free and available? Has somebody else done the work?
8. Expand use of LinkedIn as point of entry for
recruiters, based on how they define jobs and use titles in
searches