PageRank is an algorithm used by Google to rank web pages. It models the web as a Markov chain in which a random surfer moves from page to page by clicking links. Pages that are visited frequently, or that are linked to by important pages, receive higher PageRank scores. The algorithm represents the web as a column-stochastic link matrix. The principal eigenvector of this matrix gives the PageRank scores, which are the theoretical steady-state probabilities that the random surfer is on each page.
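The random-surfer idea above can be sketched with a few lines of power iteration. This is a minimal illustration, not Google's implementation: the `pagerank` function, the three-page `links` graph, the damping factor of 0.85, and the even redistribution of rank from pages without outgoing links are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate the steady-state visit probabilities of a random surfer.

    `links` maps each page to the list of pages it links to. Every link
    target must also appear as a key. `damping` is an assumed parameter:
    the probability of following a link instead of jumping to a random page.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution
    for _ in range(max_iter):
        # Every page receives the random-jump share first.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            out = links[p]
            targets = out if out else pages  # dangling page: spread evenly
            share = damping * rank[p] / len(targets)
            for q in targets:
                new[q] += share
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:  # the distribution has stopped changing
            break
    return rank


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 4))
```

Each pass multiplies the current distribution by the column-stochastic link matrix, so the loop converges toward the principal eigenvector described above. The scores sum to 1 because they are probabilities.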
The Battle Against Penguins and Pandas: Getting Back Ranking and Running Future Proof SEO Strategies
Online Marketing Summit (OMS) 2012 Presentation by Benj Arriola
This document outlines Starbucks' 2016 social media strategy. Key objectives are to increase followers by 500,000 and interactions by 30% across platforms. The strategy focuses on paid promotion, owned hashtags, and earned media through coupons. Performance will be measured by social referral traffic and engagement. Critical response plans address inappropriate content and product issues to manage impact.
This document provides Starbucks' social media strategy. Key points:
- The strategy focuses on showcasing Starbucks' products and personality through social media.
- Major strategies are featuring customer stories and advocating causes aligned with Starbucks' mission.
- Twitter and Facebook are the most followed channels. Instagram and Twitter drive the most traffic to Starbucks' website.
- The target audience is mostly ages 18-30 who engage on Instagram and Twitter.
- Competitors have strong presences but could improve response times and two-way communication.
- Objectives are to acquire new followers, increase engagement, and grow visual content by 40% in 6 months.
Google Panda PPT | Google Panda Presentation, by Keshav Kole
The Google Panda update was launched in August 2011 to reduce rankings for low-quality websites and increase rankings for high-quality websites. The update targeted issues like duplicate content, low-quality inbound links, irrelevant advertising, and keyword stuffing. To protect their sites from being affected, website owners should focus on optimizing for quality by ensuring original and error-free content, removing duplicate content, avoiding keyword stuffing, and obtaining links from high-quality sources.
The document outlines Starbucks' six-month social media strategy beginning in October. The objectives are to grow their online following and increase customer satisfaction/loyalty by sharing more personalized content. Two key strategies are increasing published content and promoting customer interaction. Goals include a 40% increase in Twitter/Instagram followers and more unique website visits from Facebook/LinkedIn. Progress will be measured through analytics on social media engagement, website traffic sources, and the #FallintoStarbucks campaign performance.
Social Media Marketing Strategy @ Kentucky Fried Chicken- Customer Service & ... by Rajesh Prabhakar
Kentucky Fried Chicken has been actively and aggressively using the various social media platforms to promote their menu offerings, promotions, contests, new product launches and most importantly focusing on the freshness and juiciness of their chicken compared to its competitors.
The document provides background information on Pearl Continental Hotel in Rawalpindi, Pakistan. It discusses the hotel's ownership and management under the Hashoo Group of Companies. It also outlines the hotel's vision, organizational structure, marketing strategies, and external factors that influence its business such as demographics, economic conditions, and competition.
The Starbucks social media strategy document outlines objectives to grow millennial followers across social media platforms by 25% and increase engagement with posts by 15% by January 2017. Key strategies include increasing video and visual content, engaging with followers daily on Twitter, and encouraging followers to share photos with the #mystarbucks hashtag. Metrics such as website traffic sources, social media follower counts and engagement rates will be reported on in February, May, August and November.
Coca-Cola is the largest beverage company in the world with over 3,500 products sold in over 200 countries. It employs over 146,000 people globally and has maintained over 50 consecutive years of profit growth. Coca-Cola utilizes social media platforms like Facebook, Twitter, Google+, and Pinterest to engage with customers, though it has had more success on Facebook and Twitter than other platforms due to not having a clear target audience. While Coca-Cola highlights environmental programs, it uses a large amount of water in producing its beverages.
Social media marketing strategy & plan 2017, by Fraser Hay
Social media marketing strategy plan coaching 2017 is an overview of the social media marketing coaching program available at http://www.growyourbusiness.club
Social Media Marketing Strategy
Social Media Marketing Plan 2017
Social Media marketing strategy 2017
Social media marketing plan
social media marketing course 2017
social media marketing
marketing plan 2017
marketing plan
marketing strategy 2017
marketing strategy
Social media marketing 2017
social media strategy 2017
grow your business coaching
grow your business club
grow your business
online marketing strategy
online marketing plan
Google uses complex algorithms to rank websites in search results. The main algorithms include PageRank, Penguin, Panda, and Hummingbird. PageRank is the original algorithm that analyzes backlinks to determine importance. Penguin penalizes spam sites and paid links. Panda targets low-quality "content farms." Hummingbird incorporates previous algorithms and aims to better understand search queries through techniques like semantic search, location data, and knowledge graphs to provide more relevant, personalized results. It seeks to answer questions more conversationally rather than just returning keywords.
The Starbucks 2017 social media strategy aims to grow its following among parents and customers aged 56+ through targeted content on Facebook and Twitter. Key objectives include increasing tweets by 40% and Twitter followers by 100,000 in 6 months. The strategy involves introducing relevant #ageless content across channels, and improving engagement on Facebook and Twitter. Measurement will track growth in social metrics and website traffic from social sources.
Google AdWords is an advertising service by Google for businesses wanting to display ads on Google and its advertising network.
The AdWords program enables businesses to set a budget for advertising and only pay when people click the ads. The ad service is largely focused on keywords.
Practical SEO provides a blueprint for an effective SEO process that involves researching keywords, optimizing pages for those keywords, promoting optimized pages to generate links and traffic, and analyzing results to refine strategies. The process is presented as a cyclical model where SEO activities are ongoing and adjusted based on data to continually improve search visibility and results.
Google introduced the Hummingbird algorithm in 2013 to improve search results. Hummingbird analyzes longer, more complex questions to provide more direct answers compared to previous algorithms like Panda and Penguin. It uses information like user location and interests to personalize results. The algorithm also leverages Google's Knowledge Graph to understand context and relationships to better answer follow-up questions. Hummingbird aims to have a more human-like conversation with users through its semantic search capabilities.
Consumer search behaviour is complex.
You perform multiple searches on multiple devices over multiple days, and everything you see and experience, in the SERPs and beyond, influences your brand preference and purchase decisions.
Traditional funnel analysis and marketing models do a poor job of measuring and managing this ecosystem.
We need to re-think the way we talk about, measure and manage SEO if we want bigger budgets, integrated strategies, and to win big - and we need to move fast; some of the world's biggest companies are already beating us to it.
7thingsmedia's Founder & CEO, Chris Bishop, presented an intimate introduction to search engine optimization (SEO). Bishop took the group through the basics of how the various search engines work, how to rank efficiently in them, and the key on-page and off-page ranking factors. The session also included an update on Google Panda and Google Penguin.
1) Inbound marketing is a holistic, data-driven approach that attracts potential customers through helpful content and converts them into lasting customers, rather than interrupting people with ads, emails, and calls.
2) Traditional marketing is broken because people tune out interruptive tactics, whereas inbound marketing provides value to customers through content such as blogs, videos, and tools.
3) HubSpot is a marketing platform that makes inbound marketing simple through integrated software and tools that help businesses attract visitors and convert them into customers more effectively than outbound marketing.
This slideshare highlights 40 mini case studies of businesses in Singapore that have stood out by implementing creative social media marketing campaigns.
Search engine optimization (SEO) is the process of affecting the visibility of a website or a web page in a search engine's "natural" or un-paid ("organic") search results.
Read all SEO Tips Shared by Matt Cutts in this PPT.
This lengthy (150+ slides), but very comprehensive presentation is designed to help experienced SEOs train those new to the practice over a 2-3 hour, interactive session. It covers the search engine landscape, the SEO process, keyword research, link building and the emergence of social media as a ranking signal.
A snapshot of internet, social media, and mobile use in every country in the world. This report is part of a suite of reports brought to you by We Are Social and Hootsuite - read the other reports for free at http://www.slideshare.net/wearesocialsg/presentations
- More than half of the world's population now uses the internet, with global internet users growing 8% year-over-year. Mobile internet and social media usage are also growing significantly.
- Social media users grew over 20% in the past year to over 2.5 billion active users monthly. Mobile social media use in particular saw 30% growth.
- The report provides statistics on internet, social media, and mobile usage globally and by region, finding continued growth in connectivity and usage around the world.
In this paper, we consider a criminal investigation into the collective guilt of some members of a working group. Assuming that the statistics we use are reliable, we present a Page Rank Model based on mutual information. First, we use the average mutual information between non-suspicious topics and suspicious topics to score the topics by degree of suspicion. Second, we build the correlation matrix based on the degree of suspicion and derive the corresponding Markov state transition matrix. Then, we set the initial value for every member of the working group based on the degree of suspicion. Finally, we calculate the degree of suspicion of each member of the working group. For the small 10-person case, we build the improved Page Rank model. From the statistics of this case, we obtain a table that ranks the members by degree of suspicion. Compared with the results given in this issue, the two sets of results basically match, which indicates that the model is feasible. In the current case, we first obtain a ranking of 15 topics in order of suspicion via the Page Rank Model based on mutual information. Second, we find the stable point of the Markov state transition matrix using the Markov chain. Then, we build the connection matrix based on the degree of suspicion and derive the corresponding Markov state transition matrix. Last, we calculate the degree of suspicion of the 83 candidates. The results show that suspicious people are at the top of the ranking list while innocent people are at the bottom, which again indicates that the model is feasible. When the suspicious topics and conspirators are changed, the model still gives relatively good results. In the current case, we have evidence to believe that Dolores and Jerome, who are senior managers, are highly suspicious. We recommend that they receive further attention.
The Page Rank Model based on mutual information takes full account of the information flow in a message distribution network. The model can not only handle the statistics used in the conspiracy case, but can also be applied to detect infected cells in a biological network. Finally, we present the advantages and disadvantages of the model and directions for improvement.
Application of finite Markov chain to a model of schooling, by Alexander Decker
This document discusses the application of finite Markov chains to model school progression. It begins by introducing key concepts of Markov chains including states, transition probabilities, and transition matrices. It then presents a model of school progression with states representing each year of school plus outcomes of graduating or dropping out. Transition probabilities between states are used to formulate the model. The document applies this model to secondary school data from Nigeria to predict academic progression. It analyzes the data using knowledge of finite Markov chains and finds the model provides an efficient way to predict student academic progress.
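A school-progression chain of the kind summarized above can be sketched as follows. This is an illustrative toy, not the model or data from the document: the three-year structure, the state names, and every transition probability are hypothetical, and the long-run outcome is approximated by repeatedly applying the transition step rather than by solving the chain analytically.

```python
STATES = ["year1", "year2", "year3", "graduated", "dropped_out"]

# Hypothetical transition probabilities (each row sums to 1).
# "graduated" and "dropped_out" are absorbing: once entered, never left.
P = {
    "year1": {"year1": 0.10, "year2": 0.80, "dropped_out": 0.10},
    "year2": {"year2": 0.10, "year3": 0.80, "dropped_out": 0.10},
    "year3": {"year3": 0.10, "graduated": 0.85, "dropped_out": 0.05},
    "graduated": {"graduated": 1.0},
    "dropped_out": {"dropped_out": 1.0},
}


def step(dist):
    """Advance the probability distribution over states by one school year."""
    new = {s: 0.0 for s in STATES}
    for state, mass in dist.items():
        for target, prob in P[state].items():
            new[target] += mass * prob
    return new


def long_run(start, steps=200):
    """Approximate the eventual outcome for a student starting in `start`."""
    dist = {s: 0.0 for s in STATES}
    dist[start] = 1.0
    for _ in range(steps):
        dist = step(dist)
    return dist


outcome = long_run("year1")
print("P(graduate):", round(outcome["graduated"], 4))
print("P(drop out):", round(outcome["dropped_out"], 4))
```

After enough steps almost all probability sits in the two absorbing states, so the result reads as the chance that a first-year student eventually graduates or drops out under these assumed rates.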
This document provides an overview of knowledge representation techniques and object recognition. It discusses syntax and semantics in representation, as well as descriptions, features, grammars, languages, predicate logic, production rules, fuzzy logic, semantic nets, and frames. It then covers statistical and cluster-based pattern recognition methods, feedforward and backpropagation neural networks, unsupervised learning including Kohonen feature maps, and Hopfield neural networks. The goal is to represent knowledge in a way that enables object classification and decision-making.
This document summarizes a presentation given at the 2022 AERA Annual Conference exploring academic literacy infusion in STEM contexts. Over a four-year project at UTSA, researchers designed professional development for STEM faculty to increase diversity in STEM fields by engaging curriculum change and addressing achievement gaps for Latinx students. The presentation focuses on one engineering instructor's lesson on mass spring systems over multiple iterations and modalities (in-person and online synchronous). Researchers analyzed transcripts using codes related to academic discourse, instructional framing, modeling, visual literacy, and student interactions. Findings showed the instructor intentionally infusing literacy practices to support student comprehension and more inclusive learning across modalities. Lesson study enhanced the instructor's ability to promote
This document provides an introduction to Bayesian phylogenetic analysis for empirical biologists. It explains that Bayesian phylogenetic models were introduced in the 1990s and have revolutionized data analysis by using powerful models and user-friendly computer programs. The purpose of the paper is to explain basic Bayesian statistical concepts like priors, likelihoods, Markov chain Monte Carlo algorithms, and assessing convergence and mixing to help biologists use Bayesian phylogenetic programs. It covers topics like choosing appropriate priors, types of data that can be used, how MCMC sampling works, and diagnosing convergence, burn-in and mixing. Bayesian phylogenetics has grown due to available models and programs and is commonly used for tasks like divergence time estimation, species tree estimation, and species delimitation.
The Google Pagerank algorithm - How does it work?, by Kundan Bhaduri
This document discusses how Google uses Markov chains and the PageRank algorithm to rank web pages. It begins by explaining Markov chains and how they can model random user behavior on the web. It then describes how Google implemented PageRank as a non-absorbing Markov chain to calculate the probability of a random user reaching any given page. The document outlines issues with applying this to the large-scale web, and proposes techniques like the power method to efficiently approximate PageRank values for the trillion-page internet graph. Finally, it provides an example of how links between related high-authority sites can increase the PageRank of a given page.
This document provides instructions for an assignment on data and file structures for the MCS-021 course. It outlines four questions to answer for the assignment, which is worth 100 marks total and 25% of the course grade. Students must answer all four questions, with each question worth 20 marks. The assignment should be submitted by October 15th, 2013 for the July 2013 session or April 15th, 2014 for the January 2014 session. Question 1 asks students to write an algorithm for implementing doubly linked lists.
An application of artificial intelligent neural network and discriminant anal...Alexander Decker
This document presents a study that compares the predictive abilities of artificial neural networks and linear discriminant analysis for credit scoring. A credit dataset from a Nigerian bank with 200 applicants and 15 variables is used to build both neural network and linear discriminant models. The models are evaluated based on measures like accuracy, Wilks' lambda, and canonical correlation. Key findings are that the neural network model performs slightly better with less misclassification cost. However, variable selection is important for both models' success. Age, length of service, and other borrowing are found to be the most important predictor variables.
This document provides a preface for a textbook on systems and control. It describes the scope and objectives of the textbook, which is to familiarize readers with dynamical system theory and provide tools for control system design. The textbook can be used for senior/graduate level courses in systems and control, introductory courses in nonlinear systems, or introductory courses in modern automatic control. It includes modeling of dynamical systems from various engineering and scientific disciplines. Analysis methods like Lyapunov stability theory and numerical techniques are discussed. Classical and modern control approaches like fuzzy and neural networks are presented in a unified manner.
This Presentation is on recommended system on question paper predication using machine learning techniques. We did literature survey and implement using same technique.
The document discusses algorithms for tree data structures. It describes AVL trees as self-balancing binary search trees where the heights of subtrees can differ by at most one. It also describes red-black trees, noting they are similar to AVL trees but use color attributes (red or black) to balance the tree. The key differences are that AVL trees are generally faster for lookup-intensive applications as they are more rigidly balanced, while red-black tree insertions and deletions may require fewer rotations than AVL trees.
[Emnlp] what is glo ve part i - towards data scienceNikhil Jaiswal
This document introduces GloVe (Global Vectors), a method for creating word embeddings that combines global matrix factorization and local context window models. It discusses how global matrix factorization uses singular value decomposition to reduce a term-frequency matrix to learn word vectors from global corpus statistics. It also explains how local context window models like skip-gram and CBOW learn word embeddings by predicting words from a fixed-size window of surrounding context words during training. GloVe aims to learn from both global co-occurrence patterns and local context to generate word vectors.
In this Machine Learning tutorial, we will cover the top Neural Network Algorithms. These algorithms are used to train the Artificial Neural Network. This blog provides you a deep learning of the Gradient Descent, Evolutionary Algorithms, and Genetic Algorithm in Neural Network.
Today’s market evolution and high volatility of business requirements put an increasing emphasis on the
ability for systems to accommodate the changes required by new organizational needs while maintaining
security objectives satisfiability. This is all the more true in case of collaboration and interoperability
between different organizations and thus between their information systems. Ontology mapping has been
used for interoperability and several mapping systems have evolved to support the same. Usual solutions
do not take care of security. That is almost all systems do a mapping of ontologies which are unsecured.
We have developed a system for mapping secured ontologies using graph similarity concept.
The document provides a mid-year report from an Autodesk Olin SCOPE team tasked with rethinking self-paced learning for Autodesk Inventor. It outlines 8 concept areas for improving the Autodesk Inventor learning experience, along with examples. It also presents research in the form of an "Autodesk Learning Deck of Cards" to help explain ideas. Key points include: 1) The current tutorials are text-based and don't adapt. 2) The concepts aim to increase flexibility, relevance, and connection between learners. 3) The concept areas and deck of cards were developed to organize ideas and map them to research. Moving forward, the team will focus on specific concepts to
Autodesk olin scope mid year presentationMaia Bittner
The document outlines the mid-year presentation for an Autodesk learning project. It includes sections on the project overview, team members, concept areas explored, and design concepts developed. The concepts focus on unlockable tutorials, personalization features, tutorial generation tools, competitions, and strategies to overcome learning obstacles when using the software.
1) In 2008, Maia Bittner took 13 trips that totaled 60,556 km, or about 16% of the distance to the moon.
2) She spent time in Bangkok, Stockholm, Needham, San Francisco, and Bellingham and coincided most with Tim, Michael, Greg, and Stefan.
3) Her carbon footprint for 2008 was estimated at 7,199 kg CO2, which is close to the annual emissions of one Hummer H3 truck.
Maia Bittner in Olin College promotional brochureMaia Bittner
I was a featured graduate for my alma mater, the Franklin W. Olin College of Engineering. This page was shown in promotional brochures, as well as circulated as an advertisement in magazines such as Nautilus.
Nails not wood: focus on building, not writing codeMaia Bittner
My lightning talk for Women Who Code on how I built RocksBox: leveraged my time by focusing on connecting SASS & opensource software together rather than rewriting that functionality myself.
A system is designed to allow users to send barcodes via text or email to receive price comparisons from multiple vendors. The system extracts the barcode from incoming messages, queries Amazon and Google to retrieve price pages, parses the HTML to find the lowest four prices from each site, then composes and sends an email to the user with the price information. Users can ask questions by replying to the email address.
The document outlines a 24-hour project marathon conducted by the Olin College Autodesk SCOPE Team to simulate a year's worth of work on a project, describing the SCOPE program which engages students in significant engineering projects for clients over two semesters. It also provides details about the accelerated project experience and introduces ideas for possible areas of exploration for the team's year-long research project that were developed prior to meeting with their Autodesk liaisons.
Final Project Presentation for Computer NetworkingMaia Bittner
This document discusses an instant messenger project that aims to integrate students' informal questions with official course material in a time-sensitive and low bandwidth supplement to the campus network. It provides video demonstrations and outlines possible improvements such as expanding the network beyond a LAN, allowing students to join courses already in progress, selecting a multicast IP address via a proxy, and future steps like sharing videos, pictures, and applications on mobile devices and over the internet.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIVladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Zilliz
Join us to introduce Milvus Lite, a vector database that can run on notebooks and laptops, share the same API with Milvus, and integrate with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
20240605 QFM017 Machine Intelligence Reading List May 2024
Google PageRank
Linear Algebra in Use: Ranking Web Pages with an Eigenvector
Maia Bittner, Yifei Feng

Abstract—Google PageRank is an algorithm that uses the underlying, hyperlinked structure of the web to determine the theoretical number of times a random web surfer would visit each page. Google converts these numbers into probabilities and uses them as each web page's relative importance. The most important pages for a given query can then be returned first in search results. In this paper, we focus on the math behind PageRank, which includes eigenvectors and Markov Chains, and on explaining how it is used to rank web pages.

Google PageRank, Eigenvector, Eigenvalue, Markov Chain

I. INTRODUCTION

The internet is humanity's largest collection of information, and its democratic nature means that anyone can contribute more information to it. Search engines help users sort through the billions of available web pages to find the information that they are looking for. Most search engines use a two-step process to return web pages based on the user's query. The first step involves finding which of the pages the search engine has indexed are related to the query, either by containing the words in the query, or by more advanced means that use semantic models. The second step is to order this list of relevant pages according to some criterion. For example, the first web search engines, launched in the early 1990s, used text-based matching systems as their criterion for ordering returned results by relevancy. This ranking method often resulted in returning exact matches in unauthoritative, poorly written pages before results that the user could trust. Even worse, this system was easy for page owners to exploit: they could fill their pages with irrelevant words and phrases in the hope of ranking highly. These problems prompted researchers to investigate more advanced methods of ranking.

Larry Page and Sergey Brin were researching a new kind of search engine at Stanford University when they had the idea that pages could be ranked by link popularity. The underlying social basis of their idea was that more reputable pages are linked to by other pages more often. Page and Brin developed an algorithm that could quantify this idea, and in 1998 christened the algorithm PageRank [2] and published their first paper on it. Shortly afterwards, they founded Google, a web search engine that uses PageRank to help rank the returned results. Google's famously useful search results [3] helped it reach almost $29 billion in revenue in 2010 [1]. The original algorithm organized the indexed web pages such that the links between them are used to construct the probability of a random web surfer navigating from one page to another. This system can be characterized as a Markov Chain, a mathematical model described below, in order to take advantage of its convenient properties.

In this paper, we will explain how the interlinking structure of the web and the properties of Markov Chains can be used to quantify the relative importance of each indexed page. We will examine Markov Chains, eigenvectors, and the power iteration method, as well as some of the problems that arise when using this system to rank web pages.

A. Markov Chains

Markov chains are mathematical models that describe particular types of systems. For example, we can model the number of students at Olin College who are sick as a Markov Chain if we know how likely it is that a student will become sick. Let us say that if a student is feeling well, she has a 5% chance of becoming sick the next day, and that if she is already sick, she has a 35% chance of feeling better tomorrow. In our example, a student can only be healthy or sick; these two states are called the state space of the system. In addition, we've decided to only ask how the students are feeling in the morning, and their health on any day only depends on how they were feeling the previous morning. Observing the system at fixed daily steps makes it a discrete-time process, and because the transition probabilities are the same every day, the system is time-homogeneous. We can generate a set of linear equations that describe how many students at Olin College are healthy and sick on any given day. If we let $m_k$ indicate the number of healthy students and $n_k$ the number of sick students on morning $k$, then we get the following two equations:

$$m_{k+1} = 0.95\,m_k + 0.35\,n_k$$
$$n_{k+1} = 0.05\,m_k + 0.65\,n_k$$

Putting this system of linear equations into matrix notation, we get:

$$\begin{bmatrix} m_{k+1} \\ n_{k+1} \end{bmatrix} = \begin{bmatrix} 0.95 & 0.35 \\ 0.05 & 0.65 \end{bmatrix} \begin{bmatrix} m_k \\ n_k \end{bmatrix} \tag{1}$$

We can take this matrix full of probabilities and call it $P$, the transition matrix.

$$P = \begin{bmatrix} 0.95 & 0.35 \\ 0.05 & 0.65 \end{bmatrix} \tag{2}$$

The columns can be viewed as representing the present state of the system, and the rows can be viewed as representing the future state of the system. The first column accounts for the students who are healthy today, and the second column accounts for the students who are sick today, while the first row indicates the students who will be healthy tomorrow and the
second row indicates the students who will be sick tomorrow. The intersecting elements of the transition matrix represent the probability that a student will transition from that column state to the row state. So we can see that $p_{1,1}$ indicates that 95% of the students who are healthy today will be healthy tomorrow. The total number of students who will be healthy tomorrow is represented by the first row, so that it is 95% of the students who are healthy today plus 35% of the students who are sick today. Similarly, the second row shows us the number of students who will be sick tomorrow: 5% of the students who are healthy today plus 65% of the students who are sick today.

You can see that each column sums to one, to account for 100% of the students who are healthy and 100% of the students who are sick in the current state. Square matrices like this, that have nonnegative real entries and where every column sums to one, are called column-stochastic.

We can find the total number of students who will be in a state on day $k + 1$ by multiplying the transition matrix by a vector containing how many students were in each state on day $k$.

$$P x_k = x_{k+1} \tag{3}$$

For example, if 270 students at Olin College are healthy today, and 30 are sick, we can find the state vector for tomorrow:

$$\begin{bmatrix} 0.95 & 0.35 \\ 0.05 & 0.65 \end{bmatrix} \begin{bmatrix} 270 \\ 30 \end{bmatrix} = \begin{bmatrix} 267 \\ 33 \end{bmatrix} \tag{4}$$

which shows that tomorrow, 267 students will be healthy and 33 students will be sick. To find the next day's values, you can multiply again by the transition matrix:

$$\begin{bmatrix} 0.95 & 0.35 \\ 0.05 & 0.65 \end{bmatrix} \begin{bmatrix} 267 \\ 33 \end{bmatrix} = \begin{bmatrix} 265.2 \\ 34.8 \end{bmatrix} \tag{5}$$

which is the same as

$$\begin{bmatrix} 0.95 & 0.35 \\ 0.05 & 0.65 \end{bmatrix}^{2} \begin{bmatrix} 270 \\ 30 \end{bmatrix} = \begin{bmatrix} 265.2 \\ 34.8 \end{bmatrix} \tag{6}$$

So we can see that in order to find $m_k$ and $n_k$, we can multiply $P^k$ by the state vector containing $m_0$ and $n_0$, as in the equation below:

$$\begin{bmatrix} m_k \\ n_k \end{bmatrix} = \begin{bmatrix} 0.95 & 0.35 \\ 0.05 & 0.65 \end{bmatrix}^{k} \begin{bmatrix} m_0 \\ n_0 \end{bmatrix} \tag{7}$$

If you continue to multiply the state vector by the transition matrix for very high values of $k$, you will see that it eventually converges upon a steady state, regardless of initial conditions. This is represented by vector $q$ in Eqn. (8).

$$P q = q \tag{8}$$

Being able to find this steady-state vector is the main advantage of using a column-stochastic matrix to model a system. Column-stochastic transition matrices are used to represent the known probabilities of transitioning between states in Markov Chains. To model a system as a Markov Chain, it must be a discrete-time process that has a finite state space, and the probability distribution for any state must depend only on the previous state. Every system that can be modeled as a Markov Chain in this way has steady-state values at which it remains constant once they are reached; in this example they are reached regardless of the initial state. This steady-state vector is a specific example of an eigenvector, explained below.

B. Eigenvalues and Eigenvectors

An eigenvector is a nonzero vector $x$ that, when multiplied by a matrix $A$, only scales in length and does not change direction, except for potentially a reverse in direction.

$$A x = \lambda x \tag{9}$$

The corresponding amount that an eigenvector is scaled by, $\lambda$, is called its eigenvalue. There are several techniques to find the eigenvalues and eigenvectors of a matrix. We will demonstrate one technique below with matrix $A$.

$$A = \begin{bmatrix} 1 & 2 & 5 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix}$$

How to find the eigenvalues for matrix $A$: We know that $\lambda$ is an eigenvalue of $A$ if and only if the equation

$$A x = \lambda x \tag{10}$$

has a nontrivial solution. This is equivalent to finding $\lambda$ such that:

$$(A - \lambda I)x = 0 \tag{11}$$

The above equation has a nontrivial solution when the determinant of $A - \lambda I$ is zero.

$$\det(A - \lambda I_3) = \begin{vmatrix} 1-\lambda & 2 & 5 \\ 0 & 3-\lambda & 0 \\ 0 & 0 & 4-\lambda \end{vmatrix} = (4 - \lambda)(1 - \lambda)(3 - \lambda) = 0$$

Solving for $\lambda$, we get that the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = 3$, and $\lambda_3 = 4$. If we solve $A v_i = \lambda_i v_i$, we will get the corresponding eigenvector for each eigenvalue:

$$v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 5 \\ 0 \\ 3 \end{bmatrix}$$

This means that if $v_2$ is transformed by $A$, the result will scale $v_2$ by its eigenvalue, 3.

You can see in Eqn. (8) that the steady state of a Markov Chain has an eigenvalue of 1. This is why those steady-state vectors are a special case of eigenvectors. Because they are column-stochastic, all transition matrices of Markov Chains will have an eigenvalue of 1 (we invite the reader to prove this in Exercise 4). A system having an eigenvalue of 1 is the same as it having a steady state.

In some matrices, we may get repeated roots when solving $\det(A - \lambda I) = 0$. We will demonstrate this for the column-stochastic matrix $P$:

$$P = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \\ 0 & 0 & 0 & 0 & \tfrac{1}{2} \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}$$
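A quick numerical look at this $P$ can make the repeated root concrete. The sketch below, in standard-library Python with exact fractions, checks that $P$ is column-stochastic and that two different probability vectors are each left unchanged by $P$. The vectors `q1` and `q2` and the helper names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

half = Fraction(1, 2)

# The 5x5 matrix P from the text; P[i][j] is the probability of moving
# from state j+1 to state i+1.
P = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, half],
    [0, 0, 0, 0, half],
    [0, 0, 1, 0, 0],
]


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def is_column_stochastic(matrix):
    """True if all entries are nonnegative and every column sums to one."""
    n = len(matrix)
    nonnegative = all(entry >= 0 for row in matrix for entry in row)
    columns_ok = all(sum(matrix[i][j] for i in range(n)) == 1 for j in range(n))
    return nonnegative and columns_ok


# Hypothetical candidate steady states (illustrative, not from the paper):
# q1 lives on states 1-2 only, q2 on states 3-5 only.
q1 = [half, half, 0, 0, 0]
q2 = [0, 0, Fraction(2, 5), Fraction(1, 5), Fraction(2, 5)]

print(is_column_stochastic(P))  # True
print(mat_vec(P, q1) == q1)     # True: P q1 = q1
print(mat_vec(P, q2) == q2)     # True: P q2 = q2
```

Two linearly independent vectors that both satisfy $Pq = q$ mean the eigenvalue 1 has more than one independent eigenvector, which is only possible when 1 is a repeated root of the characteristic equation.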
To find the eigenvalues, solve:

$$\det(P - \lambda I_5) = \begin{vmatrix} -\lambda & 1 & 0 & 0 & 0 \\ 1 & -\lambda & 0 & 0 & 0 \\ 0 & 0 & -\lambda & 1 & \tfrac{1}{2} \\ 0 & 0 & 0 & -\lambda & \tfrac{1}{2} \\ 0 & 0 & 1 & 0 & -\lambda \end{vmatrix} = -\frac{1}{2}(\lambda - 1)^2(\lambda + 1)(2\lambda^2 + 2\lambda + 1) = 0$$

When we solve the characteristic equation, we find that the five eigenvalues are: $\lambda_1 = 1$, $\lambda_2 = 1$, $\lambda_3 = -1$, $\lambda_4 = -\tfrac{1}{2} - \tfrac{i}{2}$, $\lambda_5 = -\tfrac{1}{2} + \tfrac{i}{2}$. Since 1 appears twice as an eigenvalue, we say that it has algebraic multiplicity of 2. The number of linearly independent eigenvectors associated with eigenvalue 1 is called the geometric multiplicity of $\lambda = 1$. The reader can confirm that in this case, $\lambda = 1$ has geometric multiplicity of 2 with associated eigenvectors $x$ and $y$.

$$x = \begin{bmatrix} \tfrac{\sqrt{2}}{2} \\ \tfrac{\sqrt{2}}{2} \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad y = \begin{bmatrix} 0 \\ 0 \\ \tfrac{2}{3} \\ \tfrac{1}{3} \\ \tfrac{2}{3} \end{bmatrix}$$

We can see that when the transition matrix of a Markov chain has geometric multiplicity greater than 1 for the eigenvalue 1, it's unclear which independent eigenvector should be used to represent the steady state of the system.

II. PAGERANK

When the founders of Google created PageRank, they were trying to discern the relative authority of web pages from the underlying structure of links that connects the web. They did this by calculating an importance score for each web page. Given a webpage, call it page $k$, we can use $x_k$ to denote the importance of this page among the total number of $n$ pages. There are many different ways that one could calculate an importance score. One simple and intuitive way to do page ranking is to count the number of links from other pages to page $k$, and assign that number as $x_k$. We can think of each link as being one vote for the page it links to, and of the number of votes a page gets as showing the importance of the page. In the example network of Figure 1, there are four webpages. Page 1 is linked to by pages 2 and 4, so its importance score is $x_1 = 2$. In the same way, we can get $x_2 = 3$, $x_3 = 1$, $x_4 = 2$. Page 2 has the highest importance score, indicating that page 2 is the most important page in this web.

Figure 1. A small web of 4 pages, connected by directional links

However, this approach has several drawbacks. First, the pages that have more links to other pages would have more votes, which means that a website can easily gain more influence by creating many links to other pages. Second, we would expect that a vote from an important page should weigh more than a vote from an unimportant one, but every page's vote is worth the same amount with this method. A way to fix both of these problems is to give each page the amount of voting power that is equivalent to its importance score. So webpage $k$, with an importance score of $x_k$, has a total voting power of $x_k$, which we distribute equally among all the pages it links to. We can define the importance score of a page as the sum of all the weighted votes it gets from the pages that link to it. So if $S_k$ is the set of pages that link to webpage $k$, we have

$$x_k = \sum_{j \in S_k} \frac{x_j}{n_j} \tag{12}$$

where $n_j$ is the number of links from page $j$. If we apply this method to the network of Figure 1, we get a system of linear equations:

$$\begin{aligned} x_1 &= \frac{x_2}{1} + \frac{x_4}{2} \\ x_2 &= \frac{x_1}{3} + \frac{x_3}{2} + \frac{x_4}{2} \\ x_3 &= \frac{x_1}{3} \\ x_4 &= \frac{x_1}{3} + \frac{x_3}{2} \end{aligned}$$

which can be written in the matrix form $x = Lx$, where $x = [x_1, x_2, x_3, x_4]^T$ and

$$L = \begin{bmatrix} 0 & 1 & 0 & \tfrac{1}{2} \\ \tfrac{1}{3} & 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{3} & 0 & 0 & 0 \\ \tfrac{1}{3} & 0 & \tfrac{1}{2} & 0 \end{bmatrix}$$

$L$ is called the link matrix of this network since it encapsulates the links between all the pages in the system. Because we've evenly distributed $x_k$ to each of the pages $k$ links to, the link matrix is column-stochastic as long as every page has at least one outgoing link. As we defined earlier, vector $x$ contains the importance scores of all web pages. To find these scores, we can solve $Lx = x$. You'll notice that this looks similar to Eqn. (8), and indeed, we can transform this problem into finding the eigenvector with eigenvalue $\lambda = 1$ for the matrix $L$! For matrix $L$, the eigenvector is $[0.387, 0.290, 0.129, 0.194]^T$ for $\lambda = 1$. So we know the importance score of each page is $x_1 \approx 0.387$, $x_2 \approx 0.290$, $x_3 \approx 0.129$, $x_4 \approx 0.194$. Note that with this more sophisticated method, page 1 has the highest importance score instead of page 2. This is because page 2 only links to page 1, so it casts its entire vote to page 1, boosting up its score. Knowing that the link matrix is a column-stochastic matrix, let us now look at the problem of ranking internet pages in terms of a Markov Chain system.
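The scores quoted above can be checked by solving $Lx = x$ directly, together with the normalization $x_1 + x_2 + x_3 + x_4 = 1$. The sketch below is one way to do that, assuming the link matrix read off from the four linear equations for the network of Figure 1. It uses exact fractions from Python's standard library, and the function name `steady_state` is illustrative rather than anything from the paper.

```python
from fractions import Fraction

third, half = Fraction(1, 3), Fraction(1, 2)

# Link matrix of the four-page web: L[i][j] is the share of page j+1's
# vote that goes to page i+1 (taken from the linear equations in the text).
L = [
    [0,     1, 0,    half],
    [third, 0, half, half],
    [third, 0, 0,    0],
    [third, 0, half, 0],
]


def steady_state(link):
    """Solve link @ x = x with sum(x) == 1 by exact Gauss-Jordan elimination."""
    n = len(link)
    # Augmented rows of (link - I) x = 0.
    rows = [
        [Fraction(link[i][j]) - (1 if i == j else 0) for j in range(n)] + [Fraction(0)]
        for i in range(n)
    ]
    # The rows of (link - I) add up to the zero row, so one of them is
    # redundant; replace it with the normalization x_1 + ... + x_n = 1.
    rows[-1] = [Fraction(1)] * n + [Fraction(1)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


scores = steady_state(L)
print([str(s) for s in scores])  # ['12/31', '9/31', '4/31', '6/31']
```

The exact solution $(12/31,\ 9/31,\ 4/31,\ 6/31)$ rounds to the values 0.387, 0.290, 0.129, and 0.194 given in the text. Exact elimination like this is only practical for small examples.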
Figure 2. Two small subwebs, which do not exchange links

For a network with $n$ pages, the $i$th entry of the $n \times 1$ vector $x_k$ denotes the probability of visiting page $i$ after $k$ clicks. The link matrix $L$ is the transition matrix such that the entry $l_{ij}$ is the probability of clicking on a link to page $i$ when on page $j$. Finding the importance score of a page is the same as finding its entry in the steady-state vector of the Markov chain that describes the system. For example, say we start from page 1, so that the vector that represents our beginning state is $x_0 = [1, 0, 0, 0]^T$. To find the probability of ending up on each web page after $n$ clicks, we compute:

$$x_n = L^n x_0 \tag{13}$$
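Equation (13) can be evaluated without ever forming $L^n$: start from $x_0$ and multiply by $L$ once per click. The following minimal sketch does this for the four-page link matrix of Figure 1, again with exact fractions from Python's standard library. The helper names `mat_vec` and `after_clicks` are illustrative and not from the paper.

```python
from fractions import Fraction

third, half = Fraction(1, 3), Fraction(1, 2)

# Link matrix of the four-page web from Figure 1 (column j holds the
# probabilities of leaving page j+1 for each other page).
L = [
    [0,     1, 0,    half],
    [third, 0, half, half],
    [third, 0, 0,    0],
    [third, 0, half, 0],
]


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def after_clicks(link, start, clicks):
    """Return x_n = link^n @ start by applying the link matrix n times."""
    state = list(start)
    for _ in range(clicks):
        state = mat_vec(link, state)
    return state


x0 = [1, 0, 0, 0]  # the surfer starts on page 1
for n in (1, 2, 25):
    xn = after_clicks(L, x0, n)
    print(n, [round(float(p), 3) for p in xn])
```

After one click the surfer is on page 2, 3, or 4 with probability 1/3 each; after 25 clicks the probabilities agree with the importance scores from $Lx = x$ to three decimal places. Repeated matrix-vector products need far less work than repeated matrix-matrix products, which matters once the matrix is large.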
In Eqn. (13), $x_n$ represents the state after $n$ clicks (the probability of being on each page), and $L$ is the link matrix, or transition matrix. So by calculating the powers of $L$, we can determine the steady-state vector. This process is called the Power Iteration method, and it converges on an estimate of the eigenvector for the greatest eigenvalue, which is always 1 in the case of a column-stochastic matrix. For example, by raising the link matrix $L$ to the 25th power, we have

$$B = L^{25} \approx \begin{bmatrix} 0.387 & 0.387 & 0.387 & 0.387 \\ 0.290 & 0.290 & 0.290 & 0.290 \\ 0.129 & 0.129 & 0.129 & 0.129 \\ 0.194 & 0.194 & 0.194 & 0.194 \end{bmatrix}$$

If we multiply matrix $B$ by our initial state $x_0$, we get our steady-state vector

$$s = \begin{bmatrix} 0.387 & 0.290 & 0.129 & 0.194 \end{bmatrix}^T$$

which shows the probability of each page being visited. This power iteration process gives us approximately the same result as finding the eigenvector of the link matrix, but is often more computationally feasible, especially for matrices with a dimension of around 1 billion, like Google's. These computation savings are why this is the method by which Google actually calculates the PageRank of web pages [5]. By taking powers of the matrix to estimate eigenvectors, Google is doing the reverse of many applications, which diagonalize

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \\ 0 & 0 & 0 & 0 & \tfrac{1}{2} \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}$$

Mathematically, this problem poses itself as being a multi-dimensional Markov Chain. Link matrix $A$ has a geometric multiplicity of 2 for the eigenvalue of 1, as we showed in Section I.B. It's unclear which of the two associated eigenvectors should be chosen to form the rankings. The two eigenvectors are essentially eigenvectors for each subweb, and each shows rankings which are accurate locally, but not globally.

Google has chosen to solve this problem of subwebs by introducing an element of randomness into the link matrix. Defining a matrix $S$ as an $n \times n$ matrix with all entries $\tfrac{1}{n}$ and a value $m$ between 0 and 1 as a relative weight, we can replace link matrix $A$ with:

$$M = (1 - m)A + mS \tag{14}$$

If $m > 0$, there will be no parts of matrix $M$ which represent entirely disconnected subwebs, as every web surfer has some probability of reaching another page regardless of the page they're on. In the original PageRank algorithm, an
matrices into their eigenvector components in order to take m value of .15 was used. Today, it is spectulated by those
them to a high power. Few applications actually use the power outside of Google that the value currently in use lies between
iteration method, since it is only appropriate given a narrow .1 and .2. The larger the value of m, the higher the random
range of conditions. The sparseness of the web’s link matrix, matrix is weighted, and the more egalitarian the corresponding
and the need to only know the eigenvector corresponding to PageRank values are. If m is 1, a web surfer has equal
the dominant eigenvalue, make it an application well-suited to probability of getting to any page on the web from any other
take advantages of the power iteration method. page, and the all links would be ignored. If m is 0, any
subwebs contained in the system will cause the eigenvalues to
A. Subwebs have a multiplicity greater than 1, and there will be ambiguity
in the system.
We now address a problem that this model has when
faced with real-world constraints. We refer to this problem
as disconnected subwebs, as shown in Figure 2. If there are III. D ISCUSSION
two or more groups of linked pages that do not link to each We’ve shown how systems that can be characterized as
other, it is impossible to rank all pages relative to each other. a Markov Chain will converge to a steady state, and that
The matrix shown below is the link matrix for the web shown these steady state values can be found by using either the
in Figure 2: characteristic equation or the power iteration method. We then
5. 5
investigated how the web can be viewed as a Markov Chain
when the state is which page a web surfer is on, and the hy-
perlinks between pages dictate the probability of transitioning
from one state to another. With this characterization, we can
view the steady-state vector as the proportional amount of time
a web surfer would spend on every page, hence, a valuable
metric for which pages are more important.
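
To make the power iteration of equation (13) and the weighted matrix M of equation (14) concrete, here is a minimal sketch in plain Python applied to the two-subweb link matrix A of Figure 2. The helper names `damped_matrix` and `power_iteration`, the choice of 200 iterations, and the two starting vectors are assumptions made for illustration; they are not taken from the paper.

```python
def damped_matrix(A, m):
    """Return M = (1 - m) * A + m * S, where every entry of S is 1/n."""
    n = len(A)
    return [[(1 - m) * A[i][j] + m / n for j in range(n)] for i in range(n)]


def power_iteration(M, x0, steps=200):
    """Apply x <- M x repeatedly, starting from the probability vector x0."""
    n = len(M)
    x = list(x0)
    for _ in range(steps):
        x = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


# Link matrix A for the two disconnected subwebs of Figure 2.
A = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0.5],
    [0, 0, 0, 0, 0.5],
    [0, 0, 1, 0, 0],
]

M = damped_matrix(A, m=0.15)                      # m = 0.15 as quoted in the text
rank_a = power_iteration(M, [1, 0, 0, 0, 0])      # surfer starts on page 1
rank_b = power_iteration(M, [0, 0, 0, 0, 1])      # surfer starts on page 5
print([round(v, 3) for v in rank_a])
print([round(v, 3) for v in rank_b])
```

Both starting points lead to the same vector, in which pages 1 and 2 each score 0.2. With m = 0 the two runs would stay confined to their own subwebs, which is the ambiguity described in the Subwebs section.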
REFERENCES
[1] O'Neill, Nick. Why Facebook Could Be Worth $200 Billion. All Facebook. Available at http://www.allfacebook.com/is-facebooks-valuation-hype-a-macro-perspective-2011-01
[2] Page, L., Brin, S., Motwani, R., and Winograd, T. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998.
[3] Granka, Laura A., Joachims, Thorsten, and Gay, Geri. Eye-tracking analysis of user behavior in WWW search. SIGIR, 2004. Available at http://www.cs.cornell.edu/People/tj/publications/granketal04a.pdf
[4] Grinstead, Charles M. and Snell, J. Laurie. Introduction to Probability: Chapter 11 Markov Chains. Available at http://www.dartmouth.edu/~chance/teachingaids/booksarticles/probabilitybook/Chapter11.pdf
[5] Ipsen, Ilse, and Wills, Rebecca M. Analysis and Computation of Google's PageRank. 7th IMACS International Symposium on Iterative Methods in Scientific Computing, Fields Institute, Toronto, Canada, 5 May 2005. Available at http://www4.ncsu.edu/~ipsen/ps/slidesimacs.pdf

Figure 3. A small web of 5 pages

IV. EXERCISES
Exercise 1: Find the eigenvalues and corresponding eigenvectors of the following matrix.

$$A = \begin{bmatrix} 3 & 0 & -1 & 0 \\ 4 & -3 & 2 & 0 \\ 0 & 0 & 1 & 0 \\ -2 & 0 & 3 & 2 \end{bmatrix}$$
Exercise 2: Given the column-stochastic matrix P:

$$P = \begin{bmatrix} 0.6 & 0.3 \\ 0.4 & 0.7 \end{bmatrix}$$

find the steady-state vector for P.
Exercise 3: Create a link matrix for the network with 5 internet pages
in Figure 3, then rank the pages.
Exercise 4: In section I B. we claim that all transition matrices for
Markov chains have 1 as an eigenvalue. Why is this true for every
column-stochastic matrix?
V. SOLUTION TO EXERCISES
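Solution 1: The eigenvalues are $\lambda_1 = 2$, $\lambda_2 = -3$, $\lambda_3 = 3$ and $\lambda_4 = 1$, and corresponding eigenvectors are

$$x_1 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \quad x_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad x_3 = \begin{bmatrix} 3 \\ 2 \\ 0 \\ -6 \end{bmatrix}, \quad x_4 = \begin{bmatrix} 1 \\ 2 \\ 2 \\ -4 \end{bmatrix}$$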
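
The eigenpairs of Solution 1 can be checked directly against the definition $Ax = \lambda x$. The sketch below does this with integer arithmetic in plain Python; the helper names `matvec` and `is_eigenpair` are illustrative and not part of the paper.

```python
# Matrix from Exercise 1.
A = [
    [3, 0, -1, 0],
    [4, -3, 2, 0],
    [0, 0, 1, 0],
    [-2, 0, 3, 2],
]

# (eigenvalue, eigenvector) pairs as stated in Solution 1.
pairs = [
    (2, [0, 0, 0, 1]),
    (-3, [0, 1, 0, 0]),
    (3, [3, 2, 0, -6]),
    (1, [1, 2, 2, -4]),
]


def matvec(M, v):
    """Multiply the square matrix M by the column vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def is_eigenpair(M, lam, v):
    """True when M v equals lam * v exactly (integer entries, so no rounding)."""
    return matvec(M, v) == [lam * c for c in v]


checks = [is_eigenpair(A, lam, v) for lam, v in pairs]
print(checks)  # [True, True, True, True]
```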
Solution 2: The steady-state vector for P is,

$$s = \begin{bmatrix} 0.4286 \\ 0.5714 \end{bmatrix}$$
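
One way to reach this result by hand is to solve $Ps = s$ together with the requirement that the entries of s sum to 1. The component names $s_1$ and $s_2$ are notation introduced here for the worked steps, not taken from the paper.

```latex
\begin{aligned}
Ps = s \;&\Longrightarrow\; 0.6\,s_1 + 0.3\,s_2 = s_1 \;\Longrightarrow\; s_2 = \tfrac{4}{3}\,s_1, \\
s_1 + s_2 = 1 \;&\Longrightarrow\; s_1 = \tfrac{3}{7} \approx 0.4286, \quad s_2 = \tfrac{4}{7} \approx 0.5714.
\end{aligned}
```

The second row of $Ps = s$, $0.4\,s_1 + 0.7\,s_2 = s_2$, reduces to the same relation, so it adds no further constraint.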
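Solution 3: The link matrix is,

$$A = \begin{bmatrix} 0 & 1 & \tfrac{1}{3} & 0 & 0 \\ \tfrac{1}{3} & 0 & \tfrac{1}{3} & 0 & 0 \\ 0 & 0 & 0 & \tfrac{1}{2} & 1 \\ \tfrac{1}{3} & 0 & \tfrac{1}{3} & 0 & 0 \\ \tfrac{1}{3} & 0 & 0 & \tfrac{1}{2} & 0 \end{bmatrix}$$

The ranking vector is,

$$x = \begin{bmatrix} \tfrac{1}{4} & \tfrac{1}{6} & \tfrac{1}{4} & \tfrac{1}{6} & \tfrac{1}{6} \end{bmatrix}^T$$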
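
A quick consistency check for Solution 3 can be sketched with exact fractions: every column of the link matrix should sum to 1, and the ranking vector should satisfy $Ax = x$. The matrix entries below are an assumption: they are a reading of the solution's link matrix chosen to be consistent with the stated ranking vector, since Figure 3 itself is not reproduced here.

```python
from fractions import Fraction as F

# Assumed reading of the 5-page link matrix from Solution 3.
A = [
    [0, 1, F(1, 3), 0, 0],
    [F(1, 3), 0, F(1, 3), 0, 0],
    [0, 0, 0, F(1, 2), 1],
    [F(1, 3), 0, F(1, 3), 0, 0],
    [F(1, 3), 0, 0, F(1, 2), 0],
]

# Ranking vector stated in Solution 3.
x = [F(1, 4), F(1, 6), F(1, 4), F(1, 6), F(1, 6)]

# Column sums (should all be 1 for a column-stochastic matrix).
column_sums = [sum(A[i][j] for i in range(5)) for j in range(5)]

# Product A x (should reproduce x for a steady-state vector).
Ax = [sum(A[i][j] * x[j] for j in range(5)) for i in range(5)]

print(column_sums == [1] * 5, Ax == x)
```

Using `Fraction` avoids floating-point rounding, so the equality tests are exact.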
Solution 4: By definition, every column in a column-stochastic matrix A contains no negative numbers and sums to 1. As shown in Section I. B., eigenvalues are the numbers that, when subtracted from the main diagonal of a matrix, cause its determinant to equal 0. Subtracting 1 from each diagonal entry lowers every column sum from 1 to 0, so the rows of A − I add up to the zero vector. The rows are therefore linearly dependent, and the determinant of A − I equals 0. Therefore, 1 is always an eigenvalue of column-stochastic matrices.