2. Limits of keyword search
• Traditional ranking methods compute the relevance of a document to a query from the presence or absence of the query's words: a term is either present or it is not
• In LSI, search works by concepts: a concept, however, is not the abstraction or generalization of a term (e.g. golf → clothing) but a set of correlated terms (golf, sweater, dress), called co-occurrences or a semantic domain
3. • Given a collection of documents, LSI can detect that certain n-tuples of terms co-occur frequently (e.g. hierarchy, ordering and classification)
• If a search is run with hierarchy, ordering, documents that also (and possibly only!) contain classification are retrieved “automatically”
[Figure: semantic domain k]
4. Selection of documents based on the term ‘Golf’
Document base (20 documents, five terms each):
d1: Golf, Car, Topgear, Petrol, GTI
d2: Golf, Car, Clarkson, Petrol, Badge
d3: Golf, Petrol, Topgear, Polo, Red
d4: Golf, Tiger, Woods, Belfry, Tee
d5: Car, Petrol, Topgear, GTI, Polo
d6: Motor, Bike, Oil, Petrol, Tourer
d7: Bed, lace, legal, Petrol, button
d8: soft, Petrol, cat, line, yellow
d9: wind, full, sail, harbour, beach
d10: report, Petrol, Topgear, June, Speed
d11: Fish, Pond, gold, Petrol, Koi
d12: PC, Dell, RAM, Petrol, Floppy
d13: Core, Petrol, Apple, Pip, Tree
d14: Pea, Pod, Fresh, Green, French
d15: Lupin, Petrol, Seed, May, April
d16: Office, Pen, Desk, Petrol, VDU
d17: Friend, Pal, Help, Petrol, Can
d18: Paper, Petrol, Paste, Pencil, Roof
d19: Card, Stamp, Glue, Happy, Send
d20: Toil, Petrol, Work, Time, Cost
With the keyword model, 4 documents are extracted (the four that contain Golf).
5. Selection based on ‘Golf’
[Figure: the 4 documents selected with ‘Golf’, shown next to all 20 documents]
Let us see which words are most relevant in association with Golf in these 4 documents. They are: Car, Topgear and Petrol.
Rank of the selected documents, wf.idf weights:
Car: 2 * (20/3) = 13
Topgear: 2 * (20/3) = 13
Petrol: 3 * (20/16) = 4
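The wf.idf arithmetic above can be reproduced in a few lines. This is only a sketch: the counts are copied from the slide rather than recomputed, the weight is assumed to be `wf * (N / df)` with no logarithm (as the slide's figures suggest), and the names `counts` and `wf_idf` are illustrative.

```python
# Counts as stated on the slide: (occurrences in the 4 selected documents,
# number of documents in the whole base that contain the term).
N_DOCS = 20
counts = {"Car": (2, 3), "Topgear": (2, 3), "Petrol": (3, 16)}


def wf_idf(wf, df, n_docs=N_DOCS):
    """Weight a term by its frequency in the selection times N/df."""
    return wf * (n_docs / df)


weights = {term: wf_idf(wf, df) for term, (wf, df) in counts.items()}
rounded = {term: round(w) for term, w in weights.items()}
print(rounded)  # {'Car': 13, 'Topgear': 13, 'Petrol': 4}
```

Petrol occurs in three of the four selected documents but also in 16 of the 20 documents overall, so its weight is the lowest of the three.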
6. Selection based on ‘Golf’
[Figure: the 4 selected documents and their rank, shown next to all 20 documents]
Car: 2 * (20/3) = 13
Topgear: 2 * (20/3) = 13
Petrol: 3 * (20/16) = 4
Because the words are also weighted by their idf, Car and Topgear turn out to be more strongly associated with Golf than Petrol is.
7. Selection based on ‘Golf’ vs. selection based on the semantic domain
[Figure: all 20 documents; the 4 documents selected with ‘Golf’; the documents selected with the semantic domain, which add the document Car, Wheel, Topgear, GTI, Polo]
Car: 2 * (20/3) = 13
Topgear: 2 * (20/3) = 13
Petrol: 3 * (20/16) = 4
Now we search the document base again, using this set of words, which represents the “semantic domain” of Golf.
The list now includes a new document that the simple keyword search did not capture.
8. Golf
Car
Topgear
Petrol
GTI
Golf
Car
Clarkson
Petrol
Badge
Golf
Petrol
Topgear
Polo
Red
Golf
Tiger
Woods
Belfry
Tee
rank
dei
doc
selezionati
Selezione basata su ‘Golf’
selezione basata sul dominio semantico
Car
2 *(20/3) = 13
Topgear
2 *(20/3) = 13
Petrol
3 *(20/16) = 4
Rank 2617 17 030
Golf
Car
Topgear
Petrol
GTI
Golf
Car
Clarkson
Petrol
Badge
Golf
Petrol
Topgear
Polo
Red
Golf
Tiger
Woods
Belfry
Tee
Car
Petrol
Topgear
GTI
Polo
All 20 documents
1. Motor, Bike, Oil, Petrol, Tourer
2. Bed, lace, legal, Petrol, button
3. soft, Petrol, cat, line, yellow
4. wind, full, sail, harbour, beach
5. report, Petrol, Topgear, June, Speed
6. Fish, Pond, gold, Petrol, Koi
7. PC, Dell, RAM, Petrol, Floppy
8. Core, Petrol, Apple, Pip, Tree
9. Pea, Pod, Fresh, Green, French
10. Lupin, Petrol, Seed, May, April
11. Office, Pen, Desk, Petrol, VDU
12. Friend, Pal, Help, Petrol, Can
13. Paper, Petrol, Paste, Pencil, Roof
14. Card, Stamp, Glue, Happy, Send
15. Toil, Petrol, Work, Time, Cost
16. Golf, Car, Topgear, Petrol, GTI
17. Golf, Car, Clarkson, Petrol, Badge
18. Golf, Petrol, Topgear, Polo, Red
19. Golf, Tiger, Woods, Belfry, Tee
20. Car, Wheel, Topgear, GTI, Polo
Using a ranking based on term co-occurrence, we can rank the documents better.
Note that the most relevant document does not contain the word Golf, and that one of the documents that did contain it disappears (it used a "spurious" sense of Golf).
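The weighting used in this example can be sketched in a few lines. This is only an illustration: the counts are copied from the slide as given, the document labels (`d16` ... `d20`) are hypothetical names, and the rounding rule is an assumption chosen because it reproduces the slide's 13, 13 and 4.

```python
# Counts as stated on the slide:
# (occurrences among the "Golf" documents, number of documents containing the word).
N_DOCS = 20
slide_counts = {"Car": (2, 3), "Topgear": (2, 3), "Petrol": (3, 16)}

# Weight = occurrences among the Golf documents * (N / document frequency), rounded.
weights = {word: round(tf * N_DOCS / df) for word, (tf, df) in slide_counts.items()}

# The five candidate documents (labels are hypothetical).
candidates = {
    "d16": ["Golf", "Car", "Topgear", "Petrol", "GTI"],
    "d17": ["Golf", "Car", "Clarkson", "Petrol", "Badge"],
    "d18": ["Golf", "Petrol", "Topgear", "Polo", "Red"],
    "d19": ["Golf", "Tiger", "Woods", "Belfry", "Tee"],
    "d20": ["Car", "Wheel", "Topgear", "GTI", "Polo"],
}

def score(words):
    """Sum the semantic-domain weights of the words in a document."""
    return sum(weights.get(word, 0) for word in words)

scores = {name: score(words) for name, words in candidates.items()}
print(weights)
print(scores)
```

Under these assumptions the document about the sport (Tiger, Woods, Belfry, Tee) scores 0, while the document that never mentions Golf gets a high score because it shares Car and Topgear with the domain.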
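11. Term co-occurrences in documents
$$A = L^T L$$
$$A_{ij} = \sum_{k=1}^{n} L^T_{ik} L_{kj} = \sum_{k=1}^{n} L_{ki} L_{kj}$$
$A_{ij}$ is the number of co-occurrences in the documents between term i and term j. In this formula the index k runs over the n documents, so L is indexed documents × terms; with a terms × documents matrix the same product is written $LL^T$.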
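A small numerical check of the co-occurrence sum may help. The sketch assumes a hypothetical 0/1 matrix whose rows are documents and whose columns are terms, so that the sum over k runs over documents; the data are made up.

```python
import numpy as np

# Hypothetical incidence matrix: 4 documents (rows) x 3 terms (columns).
L = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
])

# Matrix form of the co-occurrence counts.
A = L.T @ L

# The same counts computed entry by entry with the explicit sum over documents k.
n_docs, n_terms = L.shape
A_loop = np.zeros((n_terms, n_terms), dtype=int)
for i in range(n_terms):
    for j in range(n_terms):
        A_loop[i, j] = sum(L[k, i] * L[k, j] for k in range(n_docs))

print(A)
```

Terms 0 and 1 appear together in two documents, so `A[0, 1]` is 2; terms 0 and 2 never share a document, so `A[0, 2]` is 0.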
14. Co-occurrence matrices
• If L is an n×m matrix (terms × documents)
• Then:
– $L^T L$ is the matrix whose rows $a_i$ hold the term co-occurrences between $d_i$ and $d_j$, for every $d_j$. Given a document, it shows which documents are most similar to it.
– $LL^T$ is the matrix whose rows $a_i$ hold the co-occurrences in documents between $t_i$ and $t_j$, for every $t_j$. Given a term, it shows which terms are most correlated with it.
– Using the matrix $LL^T$, for example, I could "expand" each term with the terms that have the highest correlation value (that is, add to a query containing the word w the words that co-occur with w most frequently)
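The expansion idea can be sketched as follows. The matrix, the vocabulary and the helper `expand` are hypothetical and serve only to illustrate reading the most correlated terms off a row of $LL^T$.

```python
import numpy as np

terms = ["golf", "car", "petrol", "tee"]

# Hypothetical term-document matrix: rows = terms, columns = 4 documents.
L = np.array([
    [1, 1, 1, 0],  # golf
    [1, 1, 0, 1],  # car
    [1, 0, 0, 1],  # petrol
    [0, 0, 1, 0],  # tee
])

doc_doc = L.T @ L      # document-document co-occurrences
term_term = L @ L.T    # term-term co-occurrences

def expand(term, top=1):
    """Return the `top` terms that co-occur most often with `term`."""
    i = terms.index(term)
    row = term_term[i].copy()
    row[i] = -1  # exclude the term itself
    best = np.argsort(-row, kind="stable")[:top]
    return [terms[j] for j in best]

print(expand("golf"))
```

With these made-up counts, a query containing "golf" would be expanded with "car", the term it shares the most documents with.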
15. Observation
• All possible co-occurrences would be far more numerous than the individual terms (with L the term-document matrix, I would have to compute $A = LL^T$).
• However, although the matrix A has high dimensionality, most of its cells are zero
• With the classical methods, every document or query is a vector in a t-dimensional space
• LSI tries to project this space onto a space of reduced dimension in which the dimensions represent co-occurrences, or semantic domains, rather than terms, keeping only the dominant ones
• LSI performs this rank reduction with purely mathematical tools (singular value decomposition, SVD).
17. Singular value decomposition
• As said, LSI projects the term-document matrix L onto a conceptual space of reduced dimension, where the dimensions are groups of co-occurring concepts, each defining a "semantic domain"
• The method used for this projection is the singular value decomposition, an algebraic method.
• We need a short algebra refresher to understand it.
19. Eigenvalues & Eigenvectors
Eigenvectors of a square m×m matrix S:
$$S v = \lambda v$$
where v ≠ 0 is a (right) eigenvector and λ is the corresponding eigenvalue.
How many eigenvalues can S have at most?
$(S - \lambda I) v = 0$ has non-zero solutions only if $\det(S - \lambda I) = 0$.
If S is m×m, this is an equation of degree m in λ with at most m distinct solutions (the roots of the characteristic polynomial); they can be complex even if S is real.
20. Example: computing eigenvalues and eigenvectors
• Def: a vector $v \in R^n$, $v \neq 0$, is an eigenvector of an n×n matrix A with corresponding eigenvalue λ if:
$$Av = \lambda v$$
$$A = \begin{pmatrix} 1 & -1 \\ 3 & 5 \end{pmatrix}, \quad v = \begin{pmatrix} 1 \\ -3 \end{pmatrix}, \quad \lambda = 4$$
$$Av = \begin{pmatrix} 1 & -1 \\ 3 & 5 \end{pmatrix} \begin{pmatrix} 1 \\ -3 \end{pmatrix} = \begin{pmatrix} 1 + (-3)(-1) \\ 3 + 5(-3) \end{pmatrix} = \begin{pmatrix} 4 \\ -12 \end{pmatrix} = 4 \begin{pmatrix} 1 \\ -3 \end{pmatrix} = \lambda v$$
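The worked example can be checked numerically. This is a minimal sketch using NumPy's general eigenvalue solver; it only verifies the numbers on the slide.

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [3.0, 5.0]])
v = np.array([1.0, -3.0])

# Direct check of the definition Av = lambda * v with lambda = 4.
Av = A @ v

# All eigenvalues of A (the roots of lambda^2 - 6*lambda + 8).
eigenvalues, eigenvectors = np.linalg.eig(A)
print(Av, np.sort(eigenvalues))
```

Besides λ = 4, the matrix has a second eigenvalue, 2, which the slide does not discuss.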
23. Geometric meaning of eigenvalues and eigenvectors
• Multiplying an m×n matrix A by a vector v is a linear transformation that maps v from the space $R^n$ to $R^m$
• The eigenvectors are those vectors whose direction is not changed by the transformation A
24. Multiplying by a matrix is a linear transformation
In this linear transformation of the Mona Lisa, the image is modified but the central vertical axis stays fixed.
The blue vector has changed direction, while the red one has not.
So the red vector is an eigenvector of the transformation and the blue one is not. Moreover, since the red vector was neither stretched, nor compressed, nor flipped, its eigenvalue is 1 (in general, the eigenvalue is the factor by which the transformation scales vectors lying along the eigenvector's direction). All vectors on the vertical axis are scalar multiples of the red vector, and they are all eigenvectors.
$$Av = \lambda v$$
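The picture can be mimicked with a small matrix. The sketch assumes the transformation is a shear that keeps the vertical axis fixed; the slide does not give the actual matrix, so `S` and the helper `is_parallel` are illustrative choices.

```python
import numpy as np

# Assumed shear: x' = x, y' = y + 0.5 * x. Points on the vertical axis do not move.
S = np.array([[1.0, 0.0],
              [0.5, 1.0]])

red = np.array([0.0, 1.0])   # lies on the vertical axis
blue = np.array([1.0, 0.0])  # horizontal vector

def is_parallel(M, v):
    """True if M @ v lies on the same line as v (2-D cross product is zero)."""
    w = M @ v
    return bool(np.isclose(v[0] * w[1] - v[1] * w[0], 0.0))

print(is_parallel(S, red), is_parallel(S, blue))
```

The red vector is mapped onto itself (eigenvalue 1), while the blue vector is tilted and is therefore not an eigenvector of this shear.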
25. Linear transformations
• If v is any vector, A a square matrix (a linear transformation), $v_i$ the eigenvectors of A and $\lambda_i$ its eigenvalues, the transformation of the vector is completely determined by the eigenvalues and eigenvectors of A. For a symmetric A with orthonormal eigenvectors:
$$A\vec{v} = \lambda_1 (v_1 \cdot v)\, v_1 + \lambda_2 (v_2 \cdot v)\, v_2 + \dots + \lambda_k (v_k \cdot v)\, v_k$$
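A quick numerical check of this expansion, reading it as $\sum_i \lambda_i (v_i \cdot v)\, v_i$, which holds when A is symmetric and its eigenvectors are orthonormal. The matrix and the vector are arbitrary examples.

```python
import numpy as np

# Arbitrary symmetric matrix; eigh returns orthonormal eigenvectors as columns of V.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, V = np.linalg.eigh(A)

v = np.array([3.0, -1.0])

# Rebuild A @ v from eigenvalues and eigenvectors only.
reconstructed = sum(
    lam * (V[:, i] @ v) * V[:, i] for i, lam in enumerate(eigenvalues)
)
print(A @ v, reconstructed)
```

Dropping the terms with the smallest eigenvalues from this sum is what the following slides call reducing the dimensionality.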
26. Dimensionality reduction (or rank-k approximation of a matrix)
Multiplying a matrix by a vector has two effects on the vector: rotation (the vector changes direction) and scaling (the length of the vector changes). The maximum compression and rotation depend on the eigenvalues of the matrix (see the previous formula)
27. Dimensionality reduction (or rank-k approximation of a matrix)
The largest singular values of the matrix (s1 and s2 in the figure) play the main role in the squashing and compression.
The eigenvalues therefore describe how much the matrix distorts (shrinks and compresses) the original vector
28. Dimensionality reduction (or rank-k approximation of a matrix)
Here, instead of rotating a single vector, suppose we rotate a set of orthonormal vectors. If, for example, we neglect one of three eigenvalues because it is the smallest, it is as if we removed one dimension (if we drop two eigenvalues, the ellipsoid collapses onto a line)
29. What does all this have to do with it?
• Let us summarise:
– If q is the vector of a query and L is the term-document matrix, the product $L^T q$ gives the vector of similarities between q and the documents of the collection, according to the standard vector model
– But $L^T q$ is a linear transformation and, informally, if $\lambda_i$ and $v_i$ are the eigenvalues and eigenvectors of $L^T = A$, then
$$Aq = \lambda_1 (v_1 \cdot q)\, v_1 + \lambda_2 (v_2 \cdot q)\, v_2 + \dots + \lambda_k (v_k \cdot q)\, v_k$$
– If I can neglect some eigenvalues, it is as if I projected q onto a space of reduced dimension: but how?
– We need more definitions (and we still have to understand what the co-occurrence matrices $L^T L$ and $LL^T$ seen earlier have to do with this)
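The first point of the summary, $L^T q$ as a vector of query-document similarities, looks like this on a toy collection. The vocabulary and the matrix are hypothetical.

```python
import numpy as np

terms = ["golf", "car", "petrol"]

# Hypothetical term-document matrix: rows = terms, columns = 3 documents.
L = np.array([
    [1, 0, 1],  # golf
    [1, 1, 0],  # car
    [0, 1, 0],  # petrol
])

# Query containing "golf" and "car".
q = np.array([1, 1, 0])

# One similarity score per document: the number of query terms it contains.
similarities = L.T @ q
print(similarities)
```

In this unnormalised form the score is a plain dot product; the standard vector model usually divides by the vector lengths to obtain a cosine.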
30. Singular values and singular vectors
• Given an n×m matrix L, the square roots of the m eigenvalues of $L^T L$ are called the singular values of L
• The m eigenvectors of $L^T L$ are called the right singular vectors
• The n eigenvectors of $LL^T$ are called the left singular vectors
• And finally…
31. Singular Value Decomposition!!
Let L be an n×m matrix.
Then there exist three matrices U, Σ and $V^T$ such that:
$$L = U \Sigma V^T$$
• U and V are the matrices of the left and right singular vectors of L (that is, the eigenvectors of $LL^T$ and $L^T L$, respectively)
• The columns of U and the columns of V each form an orthonormal set, that is: $U^T U = I$ and $V^T V = I$
• Σ is the diagonal matrix of the singular values σ of L
The singular values are the square roots of the eigenvalues of $LL^T$ or $L^T L$ (it can be shown that their non-zero eigenvalues are equal). Since $LL^T$ is SYMMETRIC, its eigenvalues $\lambda = \sigma^2$ are real, and the values $\sigma = \sqrt{\lambda}$ appear in decreasing order along Σ.
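These properties can be verified with NumPy's SVD routine. The matrix below is an arbitrary example; `full_matrices=False` requests the compact form in which U has as many columns as there are singular values.

```python
import numpy as np

# Arbitrary 4 terms x 3 documents matrix.
L = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
])

U, s, Vt = np.linalg.svd(L, full_matrices=False)
Sigma = np.diag(s)

# Eigenvalues of L^T L, sorted in decreasing order, for comparison with s**2.
eig_LtL = np.sort(np.linalg.eigvalsh(L.T @ L))[::-1]

print(s)
print(np.allclose(U @ Sigma @ Vt, L))
```

NumPy returns the singular values as a one-dimensional array already sorted in decreasing order, and returns $V^T$ rather than V.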
32. Rank reduction in LSI
The diagonal elements of Σ are non-negative and decreasing in magnitude. The first k are kept and the others are set to zero.
The zero rows and columns of Σ are deleted, together with the corresponding columns of U and V. The result is:
$$\hat{L} = U' \Sigma' V'^T \approx L$$
Interpretation
If k is chosen appropriately, the expectation is that the new matrix $\hat{L}$ keeps the semantic information of L but removes the noise arising from synonymy and polysemy (different senses of a word have different co-occurrences) and captures the dependence between co-occurring terms.
33. Rank reduction
$$\hat{L}_{(t \times d)} = U'_{(t \times k)} \, \Sigma'_{(k \times k)} \, V'^T_{(k \times d)}$$
k is the number of singular values chosen to represent the concepts in the document collection.
In general, k ≪ d.
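A rank-k approximation can be sketched as below. The helper name `truncated_svd`, the matrix and the choice k = 2 are illustrative assumptions.

```python
import numpy as np

def truncated_svd(L, k):
    """Keep the k largest singular values and the matching singular vectors."""
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    return U[:, :k], np.diag(s[:k]), Vt[:k, :]

# Arbitrary 5 terms x 4 documents matrix.
L = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
])

k = 2
U_k, Sigma_k, Vt_k = truncated_svd(L, k)
L_hat = U_k @ Sigma_k @ Vt_k  # t x d matrix of rank k

# The approximation error is determined by the discarded singular values.
all_s = np.linalg.svd(L, compute_uv=False)
error = np.linalg.norm(L - L_hat)  # Frobenius norm
print(U_k.shape, Sigma_k.shape, Vt_k.shape, error)
```

The three factors have shapes t × k, k × k and k × d, and the Frobenius error equals the square root of the sum of the squared discarded singular values.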
34. So what do co-occurrences have to do with it?
• We said that U, Σ and V are built from the eigenvalues and eigenvectors of $L^T L$ and $LL^T$ (and are also the singular values and singular vectors of L).
• But how do we compute, for example, the eigenvalues of $L^T L$?
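35. Remember what the matrix $L^T L$ looks like
$$L^T L = \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{12} & w_{22} & w_{23} \\ w_{13} & w_{23} & w_{33} \end{pmatrix}$$
To find the eigenvalues, I have to compute the determinant of:
$$\begin{pmatrix} w_{11} - \lambda & w_{12} & w_{13} \\ w_{12} & w_{22} - \lambda & w_{23} \\ w_{13} & w_{23} & w_{33} - \lambda \end{pmatrix}$$
In this example the characteristic equation, of third degree, is obtained by setting to zero:
$$(w_{11} - \lambda)\left[(w_{22} - \lambda)(w_{33} - \lambda) - w_{23}^2\right] - w_{12}\left[w_{12}(w_{33} - \lambda) - w_{23} w_{13}\right] + w_{13}\left[w_{12} w_{23} - (w_{22} - \lambda) w_{13}\right]$$
As can be seen, it contains products of co-occurrences: the largest eigenvalues (and hence the largest singular values of L) are determined by the products of co-occurrences that are all non-zero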
36. Example: $LL^T$ (remember?)
$$LL^T = \begin{pmatrix} 2 & 2 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
It is clear that there are two dimensions: that of w1 and w2 (w12), and that of w3.
Computing eigenvalues and eigenvectors at:
http://www.bluebit.gr/matrix-calculator/calculate.aspx
gives the characteristic polynomial:
$$\lambda^3 - 5\lambda^2 + 4\lambda$$
with eigenvalues 4, 1 and 0.
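The same result can be reproduced locally instead of with the online calculator. This sketch uses `numpy.poly`, which returns the coefficients of the characteristic polynomial when given a square matrix.

```python
import numpy as np

A = np.array([
    [2.0, 2.0, 0.0],
    [2.0, 2.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Coefficients of the characteristic polynomial, highest power first.
coefficients = np.poly(A)

# Eigenvalues of the symmetric matrix, sorted in decreasing order.
eigenvalues = np.linalg.eigvalsh(A)[::-1]

print(coefficients)
print(eigenvalues)
```

The zero eigenvalue reflects the fact that the first two rows are identical: w1 and w2 collapse into a single dimension, leaving two non-trivial ones.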
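38. The term-document matrix L
[Figure: the decomposition $L = U \Sigma V^T$, with the three factors labelled as follows.]
• U: eigenvectors of $LL^T$, or left singular vectors of L
• Σ: square roots of the eigenvalues of $LL^T$, or singular values of L
• V: eigenvectors of $L^T L$, or right singular vectors of L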
39. SVD in LSI: conclusions
• In the vector model, queries and documents are vectors in a space whose dimensions are the terms, which are treated as orthonormal, that is, independent of one another
• LSI maps these vectors into a space whose dimensions are concepts, that is, co-occurrences between terms
• Rank reduction has the effect of removing the less relevant concepts
40. Summary of the procedure
• $L = U \Sigma V^T$ where L is n×m
1. Compute the transpose $L^T$ of L
2. Determine the eigenvalues of $L^T L$ and sort them in decreasing order. Compute their square roots.
3. Build the matrix Σ
4. Compute the eigenvectors of $L^T L$. These are the columns of V. Generate $V^T$
5. Compute $U = L V \Sigma^{-1}$
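The five steps above can be sketched as follows. The function name `svd_via_eigen` and the tolerance `tol` are assumptions of this sketch; zero eigenvalues are dropped so that Σ can be inverted in step 5, a detail the slide does not address.

```python
import numpy as np

def svd_via_eigen(L, tol=1e-10):
    """Compact SVD of L built from the eigen-decomposition of L^T L."""
    LtL = L.T @ L                                    # step 1: transpose and product
    eigenvalues, eigenvectors = np.linalg.eigh(LtL)  # ascending order
    order = np.argsort(eigenvalues)[::-1]            # step 2: decreasing order
    eigenvalues = eigenvalues[order]
    V = eigenvectors[:, order]                       # step 4: columns of V
    keep = eigenvalues > tol                         # drop zero eigenvalues
    sigma = np.sqrt(eigenvalues[keep])               # step 2: square roots
    V = V[:, keep]
    Sigma = np.diag(sigma)                           # step 3: build Sigma
    U = L @ V @ np.linalg.inv(Sigma)                 # step 5: U = L V Sigma^-1
    return U, Sigma, V.T

# Arbitrary 4 terms x 3 documents matrix.
L = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
])
U, Sigma, Vt = svd_via_eigen(L)
print(np.allclose(U @ Sigma @ Vt, L))
```

In practice a dedicated routine such as `numpy.linalg.svd` is numerically preferable, since forming $L^T L$ squares the condition number; the eigenvalue route is shown only because it mirrors the slide.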
44. 3a. Computing query-document similarity
• For N documents, V contains N rows, each of which gives the coordinates of document $d_i$ projected into the LSI space
• A query is treated as a document and is projected into the LSI space in the same way
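45. 3b. $L = U \Sigma V^T$
• If $L = U \Sigma V^T$, then we also have
• $V = L^T U \Sigma^{-1}$
• $\hat{d} = d^T U \Sigma^{-1}$
• $\hat{q} = q^T U \Sigma^{-1}$
• After the rank-k reduction:
– $\hat{d} = d^T U_k \Sigma_k^{-1}$
– $\hat{q} = q^T U_k \Sigma_k^{-1}$
– $\mathrm{sim}(\hat{q}, \hat{d}) = \mathrm{sim}(q^T U_k \Sigma_k^{-1},\ d^T U_k \Sigma_k^{-1})$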
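The projection of documents and queries into the reduced space can be sketched as follows. The matrix, the vocabulary, k = 2 and the use of cosine as `sim` are assumptions of this example; the slide leaves the similarity function unspecified.

```python
import numpy as np

terms = ["golf", "car", "petrol", "tee", "woods"]

# Hypothetical term-document matrix: rows = terms, columns = 4 documents.
L = np.array([
    [1.0, 1.0, 1.0, 0.0],  # golf
    [1.0, 1.0, 0.0, 1.0],  # car
    [1.0, 0.0, 0.0, 1.0],  # petrol
    [0.0, 0.0, 1.0, 0.0],  # tee
    [0.0, 0.0, 1.0, 0.0],  # woods
])

U, s, Vt = np.linalg.svd(L, full_matrices=False)
k = 2
U_k = U[:, :k]
Sigma_k_inv = np.diag(1.0 / s[:k])

def fold_in(vec):
    """Project a term-space vector (document or query) into the k-dim LSI space."""
    return vec @ U_k @ Sigma_k_inv

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Projecting each document column reproduces the corresponding row of V_k.
docs_k = np.array([fold_in(L[:, j]) for j in range(L.shape[1])])

# Query containing only "car", treated like a document.
q = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
q_k = fold_in(q)
sims = [cosine(q_k, d) for d in docs_k]
print(sims)
```

Because the query is folded in with the same $U_k \Sigma_k^{-1}$ as the documents, it can match documents that share none of its terms, as long as they lie close to it in the reduced space.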