1. The ‘Web-Bang’: an interdisciplinary study of the evolution of the Web through network analysis
Michele Zadra
November 2018
Web Science Institute, University of Southampton
MSc in Web Science 2018/2019, mod. WEBS6203, Interdisciplinary Thinking
2. A few numbers on the Web-Bang…
The total number of websites has grown from 1 in 1990 to 1.8 billion in 2018.
55% of the world population, over 4 billion people, have access to the Web.
The number of pages of the ‘invisible Web’ is estimated at over 155 billion!
Source: http://www.netcraft.com/
3. Why study the Web-Bang?
The Web is the biggest knowledge-sharing system that has ever existed.
It has brought great benefits in almost every area of human activity, but it has created great threats too.
If we understand the evolutionary patterns that have driven the Web's growth so far, we can shape it into a better and safer place (Shadbolt and Berners-Lee, 2008).
Therefore…
We need to reveal the mechanisms of the Web's evolution!
4. Two disciplines for an interdisciplinary study
The evolution of the Web can be studied from many complementary angles: sociology, economics, psychology, history, biology, physics, computer science, and so on…
However, if we are looking at evolutionary patterns, we need predictive models that can be tested.
‘Hard sciences’ produce predictive theories that can be verified through experiments.
The Web is a vast and complex network. Through network analysis we can understand how it develops over time.
Physics and Computer Science (CS) are disciplines capable of producing predictive models, and both can be applied to network analysis.
5. The Web Evolution through Network Analysis
[Diagram: a timeline of the Web's evolution as seen by the two disciplines. Physics: BA model (1999) → PageRank-based model (2010) → ‘Web-Bang’ model (?); Computer Science: Web 1.0 → Web 2.0 → Web 3.0; inset: a map of the network of Web DBs in 2009.]
7. Research Method: Computer Science
An iterative process based on trial and error (Hakken, 2003).
[Diagram: the main problem is narrowed down to a workable sub-problem, which yields the research questions; these feed the iterative phases of analysis → design → test → deployment, looping back when the result is not OK and exiting when it is OK.]
8. The Common Ground!
The Web is a complex network. Both physicists and computer scientists study complex networks.
The common ground is found in the field of Network Analysis.
Physics uses analytical tools to discover the universal laws of networks.
CS brings algorithms, data management and data mining.
Integration: a model of the Web network's evolution based on Physics theories integrated with CS tools.
Example: the PageRank algorithm integrated into the BA model (Giammatteo et al., 2010).
9. Conclusion? A work in progress…
This project proposal is a work in progress. There isn't a research design yet. More details will follow in a forthcoming report.
We know the common ground for the selected disciplines: network analysis.
Expected result: an improved predictive model of the Web's evolution that will incorporate Web-specific technical parameters.
10. References
Barabási, A.-L. (2016). Network Science. Cambridge: Cambridge University Press.
Giammatteo, P., Donato, D., Zlatić, V. and Caldarelli, G. (2010). A PageRank-based preferential attachment model for the evolution of the World Wide Web. EPL (Europhysics Letters), 91(1), p.18004.
Hakken, D. (2003). Knowledge Landscapes. London: Routledge.
Shadbolt, N. and Berners-Lee, T. (2008). Web Science Emerges. Scientific American, 299(4), pp.76-81. doi:10.1038/scientificamerican1008-76.
Weidner Tilghman, R. and Brown, L. (2018). Physics - The methodology of physics. Available at: https://www.britannica.com/science/physics-science/The-methodology-of-physics
Editor's Notes
We all know the importance of the Web in our lives. Almost every everyday activity can be done on the Web or through the Web. There is nothing as pervasive as the Web. All businesses and organisations are on the Web. If we exclude children under 8 and the elderly aged 75 or over, almost everyone on this planet is aware of the Web and probably uses it in one way or another. However, despite the Web playing such an important part in our lives, we often forget that it is a relatively young technology, created only in 1990 by Tim Berners-Lee at CERN. In just 28 years we went from 1 to 1.8 billion websites, and over 4 billion people got access to the Web. Beyond the known Web, there are an estimated 155 billion pages not indexed by search engines, the so-called ‘invisible Web’. These numbers make the growth rate of the Web unprecedented in the history of human artefacts: it's the Web-Bang.
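As a back-of-the-envelope check on these figures (taking the 1990 and 2018 counts above at face value), the implied average annual growth factor over those 28 years is

$$\left(\frac{1.8 \times 10^{9}}{1}\right)^{1/28} \approx 2.14,$$

i.e. on average the number of websites has more than doubled every single year since the Web was created.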
The Web has benefited governments, businesses, other organisations and individuals in many ways. It has given billions of people access to virtually all existing knowledge and information. We now have access to services that make our lives much easier, services we could not even imagine before the Web. We can educate and inform ourselves on the Web. In most countries we can voice our opinions and reach a global audience thanks to the Web. Through social media, the Web has completely changed the way we interact with each other and has allowed us to connect with people otherwise completely out of reach. Unfortunately, the Web has also amplified existing risks for individuals and organisations, and created new, highly harmful threats: copyright infringement, personal data misuse, identity theft, mass surveillance, scams and fraud, cyber-bullying, cyber-terrorism, extremism, racism, discrimination…and the list goes on and on. Two founders of Web Science, Nigel Shadbolt and Tim Berners-Lee, wrote in a 2008 article titled ‘Web Science Emerges’ that the ultimate pursuit of web scientists is to answer the question: ‘What evolutionary patterns have driven the Web's growth?’ By discovering the drivers and patterns of the Web's evolution, we can create a model to predict its future direction that can help to prevent, limit or mitigate its harmful side. That is why an ambitious research project aiming to answer the above question is timely and of real importance.
Web Science is an inherently interdisciplinary field, because the Web is a highly complex socio-technological system. Consequently, any research endeavour about the Web that wants to achieve a proper understanding of how the Web works and how it influences society (and vice versa) must contemplate different approaches, methods and philosophies. Academics from all disciplines can contribute to the development of Web Science. However, not every angle is needed, or well suited, for any given piece of research about the Web. In this case, the aim of the project is to develop an explanatory and predictive model of the evolution of the Web. Only hard sciences can produce predictive theories that can be objectively tested through lab experiments or computer simulations; ‘soft sciences’ like sociology, psychology and economics could only speculate about the future of the Web.
Thanks to previous network analysis studies we know that the evolution of the Web can be explained by the laws that govern the development of its complex network. Physicists and computer scientists are capable of producing verifiable predictive models and both take great interest in network analysis.
Physics and Computer Science describe how the Web network has evolved through time in quite different ways. Starting from the Barabási-Albert model in 1999, physicists have proposed a series of mathematical models that explain the topology of the network and its characteristics (such as robustness, continuous growth and preferential attachment of links). They look into the relationships between vertices (a single website or a single HTML document) established through links. These relationships constitute the ever-changing structure of the Web and can explain its growth. Each model is more accurate than the previous one; the next could be the outcome of this project, the ‘Web-Bang’ model.
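To make the preferential-attachment mechanism concrete, here is a minimal, self-contained Python sketch of BA-style growth. It illustrates the general mechanism only, not the original authors' code; the function name and parameters are chosen for this example:

```python
import random

def barabasi_albert_graph(n, m, seed=0):
    """Grow an undirected graph by preferential attachment: each new
    node attaches m edges to existing nodes chosen with probability
    proportional to their current degree."""
    rng = random.Random(seed)
    # Seed the process with a small fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in this list once per incident edge, so a
    # uniform draw from it is a degree-proportional (preferential) draw.
    pool = [v for edge in edges for v in edge]
    for new_node in range(m + 1, n):
        targets = set()
        while len(targets) < m:            # sample m distinct targets
            targets.add(rng.choice(pool))
        for t in targets:
            edges.append((new_node, t))
            pool.extend((new_node, t))     # update degree-weighted pool
    return edges

# Hubs emerge: a few early nodes accumulate most of the links, giving
# the power-law degree distribution P(k) ~ k^-3 typical of the BA model.
g = barabasi_albert_graph(n=10_000, m=3)
```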
Computer scientists, on the other hand, have analysed the evolution of the Web by using the versioning language of software development to describe important shifts in its functionalities and technological characteristics: Web 1.0, read-only; Web 2.0, read-write, which allows social interactions and content co-creation; Web 3.0, or the Semantic Web, the next phase. They look at how the technological architecture of the network (e.g. protocols, formats and languages) has changed through time. Based on past and current technological analyses of the Web, they can make informed hypotheses about how the Web is going to look in the future.
The research process in Physics is different from that in Computer Science. Physics can be taken as the paradigm discipline for the application of the ‘scientific method’: the observation of natural phenomena triggers researchers to formulate research questions and propose theories that answer them. These theories are then used to produce mathematical models that describe the observed phenomena and make predictions about their behaviour. The predictions (hypotheses) are tested in experiments run in laboratories or through computer simulations. Finally, the results are analysed and fed into new observations and improved theories. Each successive research cycle makes the mathematical model a more accurate description of the observed phenomenon.
Computer scientists try to tackle a technical problem, or a challenge, by developing machine-readable algorithms. Often they can tackle only a specific aspect of the overarching issue, so they define a ‘workable sub-problem’ by applying their expertise and the previous literature. This preliminary stage allows them to formulate research questions, which in turn prompt an iterative process: a requirements analysis followed by the design, development and testing of the algorithm. This iterative process goes on until the results are considered satisfactory and the algorithm can be published and/or deployed in the real world. Unlike physicists, computer scientists don't operate in controlled environments like laboratories. Instead, they design computer-mediated systems that they further develop through trial and error.
Having analysed the methods and approaches used by the two disciplines involved in this study, the next step is to integrate the disciplines' insights and create a common ground. As we have seen, both physicists and computer scientists study, although in different ways, complex networks like the Web. Therefore, Network Analysis is their common ground. In Network Analysis, physicists discover laws and principles common to all complex networks using mathematics and analytical tools, while computer scientists provide the algorithms, data management and data mining needed to design or map computer networks.
How would physicists and computer scientists use the common ground to integrate their knowledge and achieve a more comprehensive understanding of the Web? By developing a predictive model of the Web's growth capable of accounting for technical parameters like the robustness and scalability of the network. A good example of this kind of integration between Physics and CS in network analysis is a study published in 2010 by Giammatteo and other physicists, which improved the Barabási-Albert model of the Web by integrating the Google PageRank algorithm.
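As a rough illustration of what such an integration can look like, the sketch below grows a directed graph in which each new node links to existing nodes with probability proportional to their current PageRank rather than their degree. This is only a toy version in the spirit of Giammatteo et al. (2010), not the paper's actual model; it assumes the networkx library is available, and the function name and parameters are invented for this example:

```python
import random
import networkx as nx

def pagerank_attachment_graph(n, m, alpha=0.85, seed=0):
    """Toy PageRank-driven growth: each new node links to m existing
    nodes sampled with probability proportional to their PageRank."""
    rng = random.Random(seed)
    # Start from a small fully connected directed core of m + 1 nodes.
    g = nx.complete_graph(m + 1, create_using=nx.DiGraph)
    for new_node in range(m + 1, n):
        # Recompute PageRank at every step (fine for a toy model; a
        # real implementation would update the ranks incrementally).
        ranks = nx.pagerank(g, alpha=alpha)
        nodes, weights = zip(*ranks.items())
        targets = set()
        while len(targets) < m:  # m distinct, rank-weighted targets
            targets.update(rng.choices(nodes, weights=weights, k=1))
        g.add_edges_from((new_node, t) for t in targets)
    return g

g = pagerank_attachment_graph(n=500, m=3)
```

Replacing degree with PageRank makes attachment sensitive to the global link structure of the network rather than to raw link counts, which is precisely the kind of Web-specific technical parameter this project aims to incorporate.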
This research proposal is still a work in progress. The literature review and the information collected so far, and presented in these slides, are not sufficient to define a clear research design and an effective work plan. Still, the key ingredients of a successful interdisciplinary study are already known: the two disciplines that will lend researchers to the team, Physics and Computer Science; and the common ground on which to integrate their work, Network Analysis.
As in any research endeavour, results are never guaranteed a priori. However, if successful, this project has the potential to deliver a model that describes and predicts the evolution of the Web better than previous monodisciplinary attempts, thanks to the integration of Web-specific technical parameters into a mathematical model.