This document outlines the organization of a seminar on algorithms of the Internet. It motivates the study of Internet algorithms by showing how crises have spurred algorithmic solutions. The goals of the seminar are for students to research the literature, write a survey, and give presentations. Topics covered include mobile networks, peer-to-peer networks, web caching, search engines, the structure of the Web, security, denial of service attacks, viruses and spam, epidemic algorithms, DNS, TCP bandwidth allocation, routing, broadcasting, and the self-organization of the Internet. Students give two presentations and write a 5-10 page survey on their assigned topic.
HEINZ NIXDORF INSTITUTE
University of Paderborn
Algorithms and Complexity
Seminar: Algorithms of the Internet
Christian Schindelhauer
2004-04-19

Motivation
• The Internet
– is the public global wide-area interconnection network for computers
– grows exponentially
– evolves
The evolution of the Internet
• Crises and catastrophes
– Computer hackers since the 70s
– The traffic breakdown in the 80s
– Denial of service attacks in the 90s
– SPAM forever
• Clever algorithmic solutions
– Secure protocols
– TCP bandwidth control
– DoS-detection
– SPAM-filters

Goals of the Seminar
• Algorithms of the Internet
• Literature research on a hot topic
• Write a survey on the state of the art
• Give a presentation on this field
• Interact with others on scientific research
• Provide material for
– the community
– future lectures and seminars
– a book

Organization
• See the web page: http://wwwcs.upb.de/cs/ag-madh/WWW/Teaching/2004SS/AlgInternet/
• Today: registration and assignment of topics
• Next two meetings: kickoff and feedback
• May 10th/17th, 2pm-6pm: 1st presentation
• From May to July: weekly (voluntary) meetings for consultation
• July 19th/26th, 1pm-7pm: 2nd presentation
• August 1st (11:59 pm): deadline for written assignment
• August 30th (2pm): evaluation, grades and comments (voluntary participation)

The Deliverables
• 1st presentation
– Duration 15 min.
– Presents
• main issue
• strategy to get the work done
• 2nd presentation
– Duration 45 min.
– Survey of the research area
• Written assignment
– 5-10 pages (pure text without title, references, and figures)
– Survey of the most relevant and interesting work in the assigned area

How it Counts for the Grade
• 1st presentation
– 0%
• 2nd presentation
– 25%
• Written assignment
– 75%

The Topics
1. The mobile Internet
2. P2P-networks
3. Web caching
4. Algorithms for Web search engines
5. The structure of the Web
6. Security mechanisms of the Internet
7. Denial of service attacks
8. Worms, viruses and spam
9. Epidemic algorithms
10. The Domain Name System (DNS)
11. Bandwidth allocation of TCP
12. Routing algorithms of IP
13. Broadcasting and Multicasting in IP
14. The self-organization of the Internet
15. “Wild card”

Contents of Written Assignment
• Scientific survey on the assigned topic
• For a broad audience interested in algorithms and the Internet
Example: table of contents for “Tachyonic Networking”
1. What are Tachyonic Network Transmissions (TNT)?
2. Applications of TNT
3. How TNT began
4. The main streams of TNT
5. Recent developments in Tachyonic Networking and Computing
6. Focus: The TachyNet - A clever solution of TNT
7. Open problems and upcoming developments
• References

Format and Layout of Written Assignment
• American English
• Neutral style
• 5-10 pages (pure text without title, references, and figures)
• Accurate and correct citations and references
• LaTeX, BibTeX
• Deliverables
– LaTeX source file
– BibTeX file
– Compiled PDF output
– As many of the referenced documents as possible
• Electronically if possible
• On paper if necessary

1. The Mobile Internet
• IP Tunneling
• Mobile Ad-hoc Networks
• Handhelds, PDAs
• UMTS, WAP

2. P2P-Networks
• 1st generation
– Napster, Kazaa, Gnutella
• Modern P2P-Networks
– CAN, CHORD, Tapestry, …
• Attend the lecture “Algorithmen für Peer-to-Peer-Netzwerke”
• But do not copy (all)

3. Web Caching
• Relieving hot spots in the Internet
• Akamai
• Distributed Hash Tables
• See the lecture notes “Algorithmische Grundlagen des Internets”, Summer 2003
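Distributed hash tables for web caching are commonly built on consistent hashing, which maps both caches and URLs onto a ring so that removing a cache only moves the URLs that cache was serving. The following is a minimal sketch of that idea, not Akamai's actual system; the class name `ConsistentHashRing`, the `replicas` parameter, and the example URLs are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring (hash choice is an assumption)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical helper: assigns URLs to caches via consistent hashing."""

    def __init__(self, caches, replicas=64):
        # Each cache is placed at several pseudo-random points to even out the load.
        self._ring = sorted(
            (_hash(f"{cache}#{i}"), cache) for cache in caches for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, url: str) -> str:
        # The responsible cache is the first ring point clockwise from the URL's hash.
        idx = bisect.bisect_right(self._points, _hash(url)) % len(self._ring)
        return self._ring[idx][1]


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
smaller = ConsistentHashRing(["cache-a", "cache-b"])  # cache-c has left
urls = [f"http://example.org/page{i}" for i in range(200)]
moved = sum(1 for u in urls if ring.lookup(u) != smaller.lookup(u))
print(f"{moved} of {len(urls)} URLs changed cache")
```

Only URLs previously assigned to the departed cache are remapped; with a plain `hash(url) % number_of_caches` scheme, most URLs would move.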

4. Web Search Engines
• Commercial systems
– Google, AlltheWeb, AltaVista, etc.
– look at websearchengineshowdown.com
• Algorithmic solutions
– PageRank by Brin and Page (Google)
– Kleinberg’s HITS-algorithm
• See the lecture notes “Algorithmische Grundlagen des Internets”, Summer 2003 and 2004
• Contact Peter Mahlmann!
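As a starting point for the PageRank topic, the basic idea can be sketched as a power iteration over the link graph: a page is important if important pages link to it. This is a simplified illustration, not Google's production algorithm; the function name `pagerank`, the damping value 0.85, the fixed iteration count, and the toy graph are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration on a dict mapping each page to the list of pages it links to."""
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                # Each page passes its rank on in equal shares to the pages it links to.
                new_rank[q] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank


# Toy web graph (hypothetical): page "c" is linked from a, b and d.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

The ranks form a probability distribution over pages, so they sum to 1 and can be read as the long-run share of time a random surfer spends on each page.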

5. The Structure of the Web
• The WWW is made by individuals
• Yet the structure can be described by a Pareto distribution
– number of links, size of connected components
• The graph structure of the WWW
• See the lecture notes “Algorithmische Grundlagen des Internets”, Summer 2002
• Link to Web search engines

6. Security Mechanisms of the Internet
• Are there any?
• Is it all trust-based?
• Start your search for the missing crypto layer of TCP/IP at the Secure Shell (SSH) protocol

7. Denial of Service Attacks
• A new problem from the 90s
• Solutions
– Ingress filtering
– Link testing
– Logging
– ICMP traceback
– Marking (!!!)
• See the lecture notes “Algorithmische Grundlagen des Internets”, Summer 2003

8. Worms, Viruses, and SPAM
• Definition
• How they spread
• How they work
• The perfect Antivirus-Software
• An Immune-System for Computers

9. Epidemic Algorithms
• Demers et al.
– Epidemic algorithms for mirroring databases
– Idea: Spread information like a virus
• Some analyses and new ideas by
– Karp et al. 2001, Randomized Rumor Spreading
• See the lecture notes “Algorithmische Grundlagen des Internets”, Summer 2002
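The idea of spreading information like a virus can be illustrated with a simple push protocol: in every round, each node that already knows the rumor calls one node chosen uniformly at random and passes it on. The sketch below is a simplified simulation under assumed conditions (synchronous rounds, a complete network, random targets that may include the caller itself); the function name `push_rounds` and the seed are illustrative choices, not taken from the cited papers.

```python
import random


def push_rounds(n, seed=0):
    """Simulate push-style rumor spreading among n >= 1 nodes; return rounds until all know it."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the run reproducible
    informed = {0}             # node 0 starts the rumor
    rounds = 0
    while len(informed) < n:
        # Every informed node calls one uniformly random node, possibly itself.
        informed |= {rng.randrange(n) for _ in range(len(informed))}
        rounds += 1
    return rounds


rounds_needed = push_rounds(1024)
print("rounds until all 1024 nodes are informed:", rounds_needed)
```

Because the number of informed nodes can at most double per round, at least log2(n) rounds are required; the interesting analytical question is how few additional rounds and messages suffice.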

10. The Domain Name System
• What is it?
• How does it work?
• Why is it so stable?
• Alternatives?

11. Bandwidth Allocation of TCP
• A very classical problem of TCP/IP
• Strangely enough, network congestion in the Internet is resolved in the transport layer and not in the network layer
• Start with the lecture notes
– “Algorithmische Grundlagen des Internets”, Summer 2002 and 2003
• Random early detection (RED)
• New TCP allocation for tera-baud connections
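TCP's congestion avoidance follows the additive-increase/multiplicative-decrease (AIMD) principle: a sender raises its rate slowly while the network copes and cuts it sharply on congestion. The sketch below is a deliberately simplified model, assuming synchronized rounds, a single shared link, and a congestion signal that reaches all flows at once; it omits slow start, timeouts, and round-trip times. The function name `aimd` and all numeric values are illustrative assumptions.

```python
def aimd(rates, capacity, rounds, increase=1.0, decrease=0.5):
    """Evolve per-flow sending rates on one shared link under a simplified AIMD rule."""
    rates = list(rates)
    for _ in range(rounds):
        if sum(rates) > capacity:
            # Congestion: every flow cuts its rate multiplicatively.
            rates = [r * decrease for r in rates]
        else:
            # No congestion: every flow probes for bandwidth additively.
            rates = [r + increase for r in rates]
    return rates


# Two flows start very unequal on a link of capacity 100 (hypothetical units).
final_rates = aimd([1.0, 60.0], capacity=100.0, rounds=500)
print([round(r, 2) for r in final_rates])
```

Each multiplicative decrease halves the gap between the two flows while additive increases leave it unchanged, so the rates converge toward an equal share without any coordination between the senders.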

12. Routing Algorithms of IP
• Start with the standard algorithms
• There is a lot of theoretical work on routing algorithms
• Concentrate on those algorithms that are related to IP

13. Broadcasting and Multicasting in IP
• IPv4 and IPv6 provide for multicasting
• How does it work?
• How can it be improved?
• Is it possible to have TV on IP?

14. The Self-Organization of the Internet
• Official organizations of the Internet (IETF, …)
• The Internet and its self-regulation
– socially and technologically based

15. “Wild Card”
• Did we miss something?
• Take a hot topic of your choice within this area.
• If all else fails, I’ll help.
Thanks and let’s go!