What’s on Wikipedia, and What’s Not…?
Completeness of Information on the Online Collaborative Encyclopedia
Cindy Royal, Ph.D.
Assistant Professor
Texas State University
School of Journalism and Mass Communication
Deepina Kapila
Graduate Student
Texas State University
School of Journalism and Mass Communication
Introduction - Wikipedia
• Wikipedia (www.wikipedia.com), deemed “the free
encyclopedia,” was launched on the web in 2001.
• Since then, it has become the Web’s 3rd most
popular news and information source
• It uses the Wiki software format, which allows a
community of users to develop and monitor content
• Wikipedia operates under the assumption that the
public will act as a policing force, keeping content
reliable and up to date.
Introduction - Research
• Denning et al. (2005) listed the risks inherent in
Wikipedia’s model: accuracy, motives, uncertain
expertise, volatility, coverage, sources.
• Bopp and Smith (2001) state that coverage in an
encyclopedia should be “Even across all subjects”
• Shoemaker and Reese (1995) identified the
individual as a news influencer. Web users and
content creators tend to be young.
• Tankard/Royal (2005) – inherent biases in Web
content, based on systematic searches.
Research Questions
This project measures the content of Wikipedia against
various indexes or standards of completeness to identify
and uncover potential inherent biases.
We are asking:
1. Are there some systematic gaps or biases in the overall presentation of
information made available on Wikipedia?
2. Is recency (or currency) a predictor of amount of information on Wikipedia?
3. Is importance of information a predictor of amount of information on
Wikipedia?
4. Is population a predictor of amount of information about particular countries
on Wikipedia?
5. Is economic power a predictor of amount of information about individual
corporations on Wikipedia?
Method
• Using predictors of recency, importance, country
population, and economic power, several systematic
searches on Wikipedia were conducted
• Each article for each topic was visited, the relevant
content highlighted, and the selection’s words were
counted
• Word counts were captured in a spreadsheet, and
items were plotted on charts
• Ascending order
• Predictor variable
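The counting and ordering steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which tool produced the word counts, so the tokenizing rule, the function names count_words and ascending_series, and the sample figures are all hypothetical.

```python
import re

# Assumed tokenizing rule: a "word" is a run of letters or digits,
# optionally joined by apostrophes or hyphens (e.g. "Wikipedia's", "L-shaped").
WORD_PATTERN = re.compile(r"[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*")


def count_words(text: str) -> int:
    """Return the number of word-like tokens in the selected article text."""
    return len(WORD_PATTERN.findall(text))


def ascending_series(counts: dict[str, int]) -> list[tuple[str, int]]:
    """Order (topic, word count) pairs from shortest to longest article,
    which is the ordering used for an 'Ascending Order' chart."""
    return sorted(counts.items(), key=lambda item: item[1])


# Hypothetical word counts, not figures from the study.
sample_counts = {"2005": 9800, "1900": 850, "1950": 1200}
for topic, words in ascending_series(sample_counts):
    print(topic, words)
```

Keeping the counts in a mapping keyed by topic makes it easy to produce both views of the same data: sort by value for the ascending chart, or by the predictor (year, population, revenue) for the second chart.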
Topics Covered
• Years (1900-2010)
• Academy Award Winning Films
• Time Magazine’s Person of the Year
• #1 Song on Billboard Top 100 (1940-2006)
• Encyclopedia Terms
• Countries in the United Nations
• Fortune 1000 companies
Results - Years
[Charts: Wikipedia article word count per year, vertical axis 0 to 12,000. Panels: Ascending Order; Chronological Order (1900 to 2008).]
-Backward L-shaped curve
-Clear progression of length of article with year; dramatic increase in
years after 2001
-Years in the future displayed understandably shorter word counts
-Spearman Correlation between variables: .79
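For readers unfamiliar with the statistic, a Spearman correlation is a Pearson correlation computed on the ranks of the two variables rather than on their raw values. The sketch below is a minimal pure-Python version, assuming tied values share the average of their ranks; the slides do not say which software the authors used, and the sample years and word counts are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank lists.
    Raises ZeroDivisionError if either variable is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: predictor (year) and article word count.
years = [1900, 1950, 2000, 2005]
words = [900, 800, 3000, 9500]
print(round(spearman(years, words), 2))
```

Because only ranks matter, the statistic captures any monotonic relationship and is not dominated by a handful of very long articles, which suits the skewed, backward L-shaped distributions reported here.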
Results - Films
[Charts: Wikipedia article word count per Academy Award winning film, vertical axis 0 to 9,000. Panels: Ascending Order; Chronological Order (1928 to 2004).]
-Backward L-shaped curve is apparent.
-With few exceptions (e.g., Gone with the Wind, 1939, and Casablanca, 1943), the results
show a progression favoring more current films. Recency is important, but certain films
transcend time and are deemed important for other reasons.
-Average word count for films since 2001 was 80% higher than word count before
2001.
-Spearman correlation between variables: .49; increased to .62 simply by removing 2
Results - Person of the Year
[Charts: Wikipedia article word count per Time Person of the Year, vertical axis 0 to 18,000. Panels: Ascending Order; Chronological Order (1927 to 2001).]
-Softer backward L-shaped curve
-Even distribution over time suggests the bias is unrelated to recency and is driven by
another variable, importance
-Spearman correlation between variables: 0; there was no relationship with time.
Results - Billboard Top 100
[Charts: Wikipedia article word count per artist with a #1 Billboard song, vertical axis 0 to 14,000. Panels: Ascending Order; Chronological Order (1940 to 2006).]
-Backward L-shaped curve
-Although average word count was 32% higher for artists since 1990, the distribution
shows a trend similar to movies in that some artists transcend time.
-Spearman correlation between variables: .40 (after eliminating 2 outliers)
Encyclopedia Terms
[Chart: Wikipedia article word count per encyclopedia term, vertical axis 0 to 14,000. Panel: Ascending Order.]
-Comparison between Encyclopedia Britannica and Wikipedia articles
-Backward L-shaped distribution apparent
-Spearman correlation used to compare inches of content in Encyclopedia Britannica
with word count in Wikipedia: .26
-Of 100 terms, 14 were not represented in Wikipedia
Results - UN Countries
[Charts: Wikipedia article word count per UN member country, vertical axis 0 to 14,000. Panels: Ascending Order; Ordered by population.]
-Backward L-shaped curve: although fairly evenly distributed, a SHARP increase appears
for the top 22 countries.
-Gradual upward curve in 2nd chart shows that as population increases, so does word count
-Average word count for top 10% of countries was 63% higher than the rest on the list
-Spearman correlation between variables: .55
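The "63% higher" figure is a comparison of two group means. A minimal sketch of that arithmetic follows, assuming the top 10% is chosen by the predictor (population); the slide does not state whether the grouping was by population or by word count, and the function name top_decile_premium and the sample rows are hypothetical.

```python
from statistics import mean


def top_decile_premium(rows):
    """rows: list of (predictor, word_count) pairs, at least two of them.
    Returns the percentage by which the mean word count of the top 10%
    (ranked by predictor) exceeds the mean word count of the remainder."""
    ranked = sorted(rows, key=lambda row: row[0], reverse=True)
    cut = max(1, len(ranked) // 10)  # size of the top group, at least one item
    top_mean = mean(words for _, words in ranked[:cut])
    rest_mean = mean(words for _, words in ranked[cut:])
    return (top_mean - rest_mean) / rest_mean * 100


# Hypothetical (population in millions, word count) rows, not study data.
sample = [(1300, 1630)] + [(p, 1000) for p in range(9, 0, -1)]
print(round(top_decile_premium(sample)))
```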
Results - Fortune 1000
[Charts: Wikipedia article word count per Fortune 1000 company, vertical axis 0 to 6,000. Panels: Ascending Order; Ordered by Revenue.]
-Backward L-shaped curve
-SHARP increase for top 10% of companies by revenue
-Top 10% of companies by revenue accounted for 30% of total word count on companies
-Spearman correlation between variables: .49
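The share held by the top 10% is a concentration measure: the summed word count of the highest-revenue companies divided by the total for all companies. The sketch below shows one way to compute it; the function name top_share, the rounding rule for the group size, and the sample revenue and word-count figures are assumptions for illustration only.

```python
def top_share(rows, fraction=0.10):
    """rows: list of (revenue, word_count) pairs.
    Returns the share of total word count held by the top `fraction`
    of companies when ranked by revenue (a value between 0 and 1)."""
    ranked = sorted(rows, key=lambda row: row[0], reverse=True)
    cut = max(1, round(len(ranked) * fraction))  # at least one company
    total_words = sum(words for _, words in ranked)
    top_words = sum(words for _, words in ranked[:cut])
    return top_words / total_words


# Hypothetical (revenue, word count) rows, not study data.
companies = [
    (300, 3000), (90, 1000), (80, 1000), (70, 1000), (60, 1000),
    (50, 1000), (40, 1000), (30, 500), (20, 250), (10, 250),
]
print(f"{top_share(companies):.0%}")
```

If word count were spread evenly, the top 10% of companies would hold about 10% of the words, so a share near 30% indicates coverage roughly three times the even baseline.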
Conclusion
-Information on Wikipedia is volatile, dynamic and constantly changing over time
-Wikipedia’s purpose is to serve as a general reference source, but the content is
weighted due to its contributors’ demographics
-In each search performed for the dimensions, strong biases were evident and strong
correlations were observed:
-Currency/Recency: the more current topics were covered the most
-Random Selection: Encyclopedia terms showed clear bias towards more
common or popular terms
-Relevancy: Wikipedia’s word count correlates to inches in a traditional
encyclopedia, showing a strong agenda by each publication
-Population: the larger a country's population, the higher the
word count
-Revenue: The larger the revenue, the higher the word count
