The document is a presentation by Edward Chang from Google Research in Beijing. It discusses search and social synergies, including how search can lead to social interactions and vice versa. It also covers the scalability and elasticity challenges of cloud computing platforms at Google's scale. Specific techniques mentioned include distributed latent Dirichlet allocation for modeling large-scale text data and UserRank for evaluating user contributions.
Keyword-based Search and Exploration on Databases (SIGMOD 2011) - weiw_oz
Keyword-based search aims to support searching databases using keywords rather than structured queries. This allows for a large user population but comes with challenges including structural and keyword ambiguity. The tutorial discusses approaches to infer structure from keywords and rank candidate structures and results to provide high-quality answers. Future work includes better handling of keyword ambiguity and more effective result analysis and exploration.
Vks Presentation, Jankowski, 15 Jan 2009, Websites & Books, Near Final - Nick Jankowski
Network Venues & Scholarly Monographs:
Pioneering Initiatives in Publishing e-Scholarship
Abstract
Scholarly publishers are increasingly incorporating Web sites into facets of the enterprise. Often, such sites primarily serve basic promotional and purchasing functions, but occasionally sites of both publishers and authors reflect other functionalities: search facilities, availability of published text, referral to instructional and research materials, hyperlinks to external sources, opportunity for reader-author exchange. This presentation provides a panoramic overview of Web sites recently prepared by publishers and/or authors that complement traditionally published scholarly monographs. This overview is intended to stimulate discussion of suitable Web functionalities that might be incorporated into monograph publications being prepared by scholars affiliated with the Virtual Knowledge Studio.
Sinks Method Paper Presentation @ Duke Political Networks Conference 2010 - Daniel Katz
This document proposes and evaluates distance measures for dynamic citation networks. It discusses how standard community detection methods do not capture the expected stability of relationships in citation networks. The authors introduce a distance measure based on the idea that two documents are more similar if they share more common "sinks", or ideas introduced into the network. They test this sink-based distance measure through simulations and on a United States Supreme Court citation network, finding it performs better than standard approaches at revealing stable communities over time.
"You can just tell whether a website looks reliable or not." People's modes o...Lynn Connaway
Connaway, L. S. (2018). "You can just tell whether a website looks reliable or not." People's modes of online engagement. Keynote presented at Universidad Javeriana, October 2, 2018, Bogota, Colombia.
Convenient isn't always simple: Digital Visitors and Residents. - Lynn Connaway
Connaway, L. S. (2019). Convenient isn't always simple: Digital Visitors and Residents. Presented at the University of Adelaide, February 18, 2019, Adelaide, Australia.
Deep neural networks for matching online social networking profiles - Traian Rebedea
The document presents a study on using deep neural networks to match online social networking profiles that belong to the same individual. It describes extracting features from profiles, including domain-specific and text-based features. A deep neural network model with multiple fully-connected layers is proposed and shown to achieve high precision and recall on a large dataset, outperforming other supervised and unsupervised baseline methods. The study demonstrates applying deep learning techniques to the task of linking profiles from different social networks that refer to the same person.
The document summarizes a data mining program held at Renmin University in Beijing from May 21-27, 2012. It discusses the various lecturers and topics covered during the program. Professors Yang, Han, and Pei each gave lectures on their areas of expertise, including classification and transfer learning, information network models, and mining uncertain data. The curriculum focused mainly on data mining and included both basic and advanced concepts. Participants were encouraged to actively engage and ask questions throughout the program.
Modelling and Analysis of User Behaviour in Online Communities - Matthew Rowe
The document discusses modeling and analyzing user behavior in online communities. It presents an ontology for modeling behavioral roles and features over time. Eight common community roles are defined, such as elitist, grunt, and popular initiator. Semantic rules are used to infer user roles based on their behavioral features, like initiation of threads and response rates. The composition of roles in a community over time can provide insights into its health and suggest when intervention may be needed to influence the community.
2010 09-24 - Prof. 闕志克 - cloud computing where do we go - nccuscience
1) The document discusses cloud computing and Taiwan's positioning in the cloud computing industry. It outlines the basic concepts of cloud computing including infrastructure as a service, platform as a service, and software as a service models.
2) It examines where Taiwan can fit within the cloud computing "food chain" and identifies opportunities in becoming a major hardware/software solution provider for cloud service providers through developing technologies like container computers and cloud operating systems.
3) The conclusion emphasizes that cloud computing involves consolidation of IT infrastructures and usage-based resource allocation. It also states that ITRI's integrated data center solution of Container Computer 1.0 and Cloud OS 1.0 could provide 80% of the functionality of current market leaders.
AIT partner university, the Capital University of Economics and Business (CUEB), Beijing, consists of 14 colleges and departments, including business administration, accountancy, finance, banking and economics. Some 20,000 students attend the university, undertaking programmes at bachelor, master and doctoral degree level. Approximately half of them are full-time undergraduate students.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise causes chemical changes in the brain that may help boost feelings of calmness and well-being.
This presentation will guide you in selecting custom window fashions. Learn what the actual cost is to have custom drapery and window fashions in your home.
Katherine Mansfield was an influential New Zealand writer known for her unconventional lifestyle and new style of short story writing. She published her first story at age 9 and went on to study in England. Her works were greatly influenced by Anton Chekhov and she published many short stories during her most productive period while struggling with tuberculosis. Sadly, her life was cut short at age 34 when she suffered a fatal lung hemorrhage while seeking unconventional cures for her illness.
2010 07-26-interdisciplinary research and learning - nccuscience
This document summarizes Ming-Yang Kao's talk on interdisciplinary research and learning. The talk covered Kao's research areas including algorithms, finance, DNA self-assembly, and more. It provided examples of projects in predicting stock markets using computational complexity and data compression techniques. The talk also discussed designing index-based stock portfolios as a hard computational problem. Finally, it offered general thoughts on opportunities, values, and strategies for interdisciplinary research and learning.
The document discusses the neurophysiological mechanisms and neurobehavioral model of insomnia. It begins by outlining the two main systems that regulate sleep and wakefulness - the homeostatic system and circadian system, as well as the arousal system. It then presents a neurobehavioral model of insomnia which involves interactions between these physiological systems and psychological/behavioral factors. The model posits that insomnia may stem from imbalances or dysfunctions in the homeostatic system, circadian system and/or arousal system, as well as maladaptive sleep-related cognitions and behaviors. Clinical implications and studies on insomnia patients are discussed to support this neurobehavioral conceptualization of insomnia.
Slides from presentation at CHI2015:
Paper Title: Designing for Citizen Data Analysis: A Cross-Sectional Case Study of a Multi-Domain Citizen Science Platform
Abstract:
Designing an effective and sustainable citizen science (CS) project requires consideration of a great number of factors. This makes the overall process unpredictable, even when a sound, user-centred design approach is followed by an experienced team of UX designers. Moreover, when such systems are deployed, the complexity of the resulting interactions challenges any attempt at generalisation from retrospective analysis. In this paper, we present a case study of the largest single platform of citizen-driven data analysis projects to date, the Zooniverse. By eliciting, through structured reflection, experiences of core members of its design team, our grounded analysis yielded four sets of themes, focusing on Task Specificity, Community Development, Task Design and Public Relations and Engagement. For each, we propose a set of design claims (DCs), drawing comparisons to the literature on crowdsourcing and online communities to contextualise our findings.
EdChang - Parallel Algorithms For Mining Large Scale Data - gu wendong
Parallel algorithms are used for mining large-scale data. Edward Chang from Google Research in Beijing discusses parallel algorithms for key machine learning subroutines like frequent itemset mining, latent Dirichlet allocation, clustering, and support vector machines. These algorithms allow distributed computing across large datasets.
Towards a Social Learning Analytics for Online Communities of Practice for Ed... - dcambrid
Presentation on social learning analytics for online professional learning by Kathleen Perez-Lopez and me at Learning Analytics and Knowledge, May 2, 2012 in Vancouver.
Netnography is a type of online ethnographic research that adapts traditional participant observation methods to study online communities. It involves planning research topics and questions, preparing for entry into online communities by considering ethics and data collection, observing community interactions, collecting archive, elicited, and fieldnote data, analyzing data through coding and developing understanding, and representing findings while considering criteria like coherence and reflexivity. Key steps include gaining entrée into communities by contributing meaningfully, collecting different types of qualitative data, analyzing data through coding and developing insights, and evaluating research quality.
Joining the docs: Back to the Future with Taxonomy and the New Semantics - Russell Blakeborough
An introduction to a web system that automatically works out 'you may also be interested in...' style links, and performs semantic clustering on folksonomies.
- NUTN Library
- NUTN CSIE Department
- NUTN Digital Art and Interactive Design Lab
- NUTN Student Affairs Office
- NUTN Career Planning and Counseling Center
- NUTN College of Liberal Arts
- NUTN College of Science and Engineering
- NUTN College of Management
- NUTN College of Electrical Engineering and Computer Science
- NUTN College of Human Ecology
- NUTN College of Arts
- NUTN College of Medicine
- NUTN College of Nursing
- NUTN College of Education
- NUTN College of Social Sciences
- NUTN College of Communication
- NUTN College of
This document discusses using linked open data to find experts who can solve problems from different domains. It presents challenges in open innovation on the semantic web, such as finding experts from various sources and related domains. Methods for expert finding before linked data are described. The document then proposes using linked data metrics based on topic distribution and proximity to select expert finding hypotheses. Finally, it describes a prototype expert finding system and the next challenge of exploring relevant knowledge domains to expand expert search.
This document describes a collective intelligence tool called the Evidence Hub for evidence-based policy deliberations. It discusses how the Evidence Hub aims to harness the collective intelligence of online communities to crowdsource policy deliberation around complex issues. It provides an overview of the conceptual model, prototype tools, and case studies using the Evidence Hub including for educational policy issues. The Evidence Hub allows users to collaboratively annotate resources, make semantic connections between ideas, and engage in structured online discussions to facilitate the emergence of collective intelligence around contested policy topics.
Global lodlam_communities and open cultural data - Minerva Lin
This document provides an overview of linked open data in libraries, archives, and museums. It defines linked open data and open cultural data, and discusses their importance in enabling connections and collaboration. The history and role of communities in advancing open cultural data initiatives are described. Key events like the LODLAM summits that brought the community together are summarized. The document promotes open data standards and licensing to realize the full potential of linked open cultural data.
Transdisciplinary Research: A short introduction - tyndallcentreuea
This document provides an introduction to transdisciplinary research from the Network for Transdisciplinary Research (td-net). It defines transdisciplinary research as aiming to solve societal problems through close interaction with stakeholders. The research process links scientific knowledge production with societal problem solving through co-production of knowledge. Principles of transdisciplinary research include grasping complexity, considering diverse perspectives, linking different types of knowledge, and promoting the common good. Stakeholder participation and collaboration across disciplines are key to applying these principles.
Keynote reusability measurement and social community analysis from mooc con... - HannibalHsieh
This document summarizes a project on analyzing reusability and social communities from MOOC content users. The project aims to:
1) Specify an exchangeable learning object format to encourage content sharing across platforms.
2) Develop an authoring/search tool and repository to facilitate content creation and discovery.
3) Build an online social community tool to promote interaction among MOOC users.
The project will analyze learning objects, develop metadata standards, and define functions for searching, matching, and ranking content based on similarity, diversity, and relatedness. Insights from user behaviors and content usage will also be studied to support recommendations.
Large Scale Social Analytics on Wikipedia, Delicious, and Twitter (presented ... - Ed Chi
Ed H. Chi, Palo Alto Research Center
Large-Scale Social Analytics in Wikipedia, Delicious, and Twitter
Abstract
We will illustrate an analytical research approach in social computing. Our research in Augmented Social Cognition is aimed at enhancing the ability of a group of people to remember, think, and reason. The drive to build models and theories for social computing research should further our understanding of how network science, behavioral economics, and evolutionary theories could explain how social systems work. Here we will summarize the published research we conducted on large-scale social analytics in Wikipedia, Delicious, and Twitter, and point out how social analytics can help us understand the intricacies of large social systems.
About the Speaker
Ed H. Chi is area manager and principal scientist at Palo Alto Research Center's Augmented Social Cognition Group. He leads the group in understanding how Web 2.0 and Social Computing systems help groups of people to remember, think and reason. Ed completed his three degrees (B.S., M.S., and Ph.D.) in 6.5 years at the University of Minnesota, and has been doing research on user interface software systems since 1993. He has been featured and quoted in the press, such as the Economist, Time Magazine, LA Times, and the Associated Press. With 20 patents and over 70 research articles, he has won awards for both teaching and research. In his spare time, Ed is an avid Taekwondo martial artist, photographer, and snowboarder.
Keynote presentation for the International Semantic Web Conference in Athens Greece, on November 9, 2023. The talk addresses the generative AI explosion and its potential impacts on the Semantic Web and Knowledge Graph communities and, in fact, may spark a research Renaissance.
Abstract:
We are living in an age of rapidly advancing technology. History may view this period as one in which generative artificial intelligence is seen as reshaping the landscape and narrative of many technology-based fields of research and application. Times of disruptions often present both opportunities and challenges. We will discuss some areas that may be ripe for consideration in the field of Semantic Web research and semantically-enabled applications. Semantic Web research has historically focused on representation and reasoning and enabling interoperability of data and vocabularies. At the core are ontologies along with ontology-enabled (or ontology-compatible) knowledge stores such as knowledge graphs. Ontologies are often manually constructed using a process that (1) identifies existing best practice ontologies (and vocabularies) and (2) generates a plan for how to leverage these ontologies by aligning and augmenting them as needed to address requirements. While semi-automated techniques may help, there is typically a significant portion of the work that is often best done by humans with domain and ontology expertise. This is an opportune time to rethink how the field generates, evolves, maintains, and evaluates ontologies. We consider how hybrid approaches, i.e., those that leverage generative AI components along with more traditional knowledge representation and reasoning approaches to create improved processes. The effort to build a robust ontology that meets a use case can be large. Ontologies are not static however and they need to evolve along with knowledge evolution and expanded usage. There is potential for hybrid approaches to help identify gaps in ontologies and/or refine content. Further, ontologies need to be documented with term definitions and their provenance. Opportunities exist to consider semi-automated techniques for some types of documentation, provenance, and decision rationale capture for annotating ontologies. 
The area of human-AI collaboration for population and verification presents a wide range of areas of research collaboration and impact. Ontologies need to be populated with class and relationship content. Knowledge graphs and other knowledge stores need to be populated with instance data in order to be used for question answering and reasoning. Population of large knowledge graphs can be time consuming. Generative AI holds the promise to create candidate knowledge graphs that are compatible with the ontology schema. The knowledge graph should contain provenance information identifying how the content was populated and its source and correctness and currency should be checked. A human-AI assistant approach is presented.
Prior empirical and theoretical work has discussed the role of dominant search engine plays in the function of information gatekeeping on the Web, and there are reports on the high ranking of Wikipedia website among the search engine result pages (SERP). However, little research has been conducted on non-Google search engines and non-English versions of user-generated encyclopedias. This paper proposes a method to quantify the “display” gatekeeping differences of the SERP ranking and presents findings based on the Chinese SERP data. Based on 2,500 mainly-Chinese-language search queries, the data set includes the SERP outcome of four Chinese-speaking regions (mainland China, Singapore, Hong Kong and Taiwan) provided by three major search engines (Baidu, and Google and Yahoo), covering over 97% of the search engine market in each region. The findings, analysed and visualized using network analysis techniques, demonstrate the followings: major user-generated encyclopedias are among the most visible; localization factors matter (certain search engine variants produce the most divergent outcomes, especially mainland Chinese ones). The indicated strong effects of “network gatekeeping” by search engines also suggest similar dynamics inside user-generated encyclopedias.
A Tale of Two Platforms: Emerging communicative patterns in two scientific bl...Cornelius Puschmann
Invited talk given as part of the Nuffield/Oxford Internet Institute Social Netowkrs Seminar Series at Nuffield College. I thank Bernie Hogan for inviting me and Ralph Schroeder and Eric Meyer for being my hosts at OII.
A description of the idea of transitive credit, a potential way of quantifying the contribution of various people to scholarly products, where the contribution is direct or indirect (through additional levels of products)
The document discusses designing an online collaborative workshop. It provides background on principles for successful online communities and collaborative tools. It analyzes various technology options before determining that a customized Ning platform best meets the client's needs of encouraging online sharing, feedback and scaling capabilities. Key features of the customized Ning prototype are described.
JISC Institutional Innovation Support and SynthesisGeorge Roberts
This document outlines the support and synthesis activities of an institutional innovation support team. It discusses using an asset-based community development approach to improve educational technology projects. It identifies initial project clusters and stakeholders. It also describes the support structure including analysis and discovery teams, support and synthesis activities like conferences and cluster events, and benefits like fostering institutional innovation centers.
7. Outline
• Search + Social Synergy
• Search → Social
• Social → Search
• Scalability and Elasticity
2010-9-17 Ed Chang@NCCU Invited Talk 7
8. Related Papers
• AdHeat (Social Ads):
– AdHeat: An Influence-based Diffusion Model for Propagating Hints to Match Ads,
H.J. Bao and E. Y. Chang, WWW 2010 (best paper candidate), April 2010.
– Parallel Spectral Clustering in Distributed Systems,
Wen-Yen Chen, Yangqiu Song, Hongjie Bai, Chih-Jen Lin, and E. Y. Chang,
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2010.
• UserRank:
– Confucius and its Intelligent Disciples, X. Si, E. Y. Chang, Z. Gyongyi, VLDB,
September 2010.
– Topic-dependent User Rank, Xiance Si, Z. Gyongyi, E. Y. Chang, and M.S. Sun,
Google Technical Report.
• Large-scale Collaborative Filtering:
– PLDA*: Parallel Latent Dirichlet Allocation for Large-Scale Applications, ACM
Transactions on Internet Technology, 2010.
– Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior,
W.-Y. Chen, J. Chu, E. Y. Chang, WWW 2009: 681-690.
– Combinational Collaborative Filtering for Personalized Community Recommendation,
W.-Y. Chen, E. Y. Chang, KDD 2008: 115-123.
– Parallel SVMs, E. Y. Chang, et al., NIPS 2007.
9. Query: What are must-see attractions at Yellowstone
11. Query: What are must-see attractions at Yosemite
12. Query: What are must-see attractions at Beijing
Hotel ads
17. Confucius: Google Q&A
Providing High-Quality Answers in a Timely Fashion
[Figure: diagram connecting Search (S), Community (C; CCS/Q&A), and users (U); labels: Differentiated Content (DC), Discussion/Question]
• Trigger a discussion/question session during search
• Provide labels to a post (semi-automatically)
• Given a post, find similar posts (automatically)
• Evaluate the quality of a post: relevance and originality
• Evaluate user credentials in a topic-sensitive way
• Route questions to experts
• Provide the most relevant, high-quality content for Search to index
• Generate answers using NLP
19. Q&A Uses Machine Learning
• Label suggestion using the LDA algorithm.
• Real-time topic-to-topic (T2T) recommendation using the LDA algorithm.
• Gives out related high-quality links to previous questions before human answers appear.
20. Collaborative Filtering
[Figure: binary membership matrix with Questions as rows and Labels/Qs as columns; a 1 marks a known membership]
• Based on membership so far, and memberships of others
• Predict further membership
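The idea of predicting further membership from existing memberships can be sketched with a simple neighbour-based scorer. This is a swapped-in illustration, not the method used in the talk (which relies on frequent itemset mining and LDA in later slides); the function name `predict_labels` and the toy questions and labels are assumptions.

```python
def predict_labels(memberships, target, top_n=2):
    """Suggest labels the target question lacks, weighted by label overlap
    with every other question (a simple neighbour-based heuristic)."""
    own = memberships[target]
    scores = {}
    for other, labels in memberships.items():
        if other == target:
            continue
        overlap = len(own & labels)            # shared labels act as similarity
        for label in labels - own:
            scores[label] = scores.get(label, 0) + overlap
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, score in ranked if score > 0][:top_n]

# Hypothetical question -> labels memberships (rows and columns of the matrix).
memberships = {
    "q1": {"travel", "yellowstone"},
    "q2": {"travel", "yellowstone", "hiking"},
    "q3": {"travel", "hotel"},
    "q4": {"iphone", "apps"},
}
suggested = predict_labels(memberships, "q1")
```

Questions that share more labels with the target contribute more weight, so labels from unrelated questions (here `q4`) are never suggested.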
22. FIM (Frequent Itemset Mining) Preliminaries
• Observation 1: If an item A is not frequent, any pattern containing A cannot be frequent [R. Agrawal]
→ use a threshold to eliminate infrequent items
{A} not frequent → {A, B} not frequent
• Observation 2: Patterns containing A are subsets of (or found from) transactions containing A [J. Han]
→ divide-and-conquer: select transactions containing A to form a conditional database (CDB), and find patterns containing A from that conditional database
{A, B}, {A, C}, {A} → CDB A
{A, B}, {B, C} → CDB B
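The two observations can be sketched in a few lines of Python. This is a minimal illustration rather than the talk's implementation; the helper names `frequent_items` and `conditional_db` and the `min_support` value are assumptions, and the toy database is chosen so that its conditional databases match the ones listed on the slide.

```python
from collections import Counter

def frequent_items(transactions, min_support):
    """Observation 1: keep only items whose support reaches the threshold."""
    support = Counter(item for t in transactions for item in set(t))
    return {item: n for item, n in support.items() if n >= min_support}

def conditional_db(transactions, item):
    """Observation 2: patterns containing `item` can only be found in
    transactions that contain `item`, so select exactly those."""
    return [t for t in transactions if item in t]

# Toy database (an assumption consistent with the CDBs shown on the slide).
transactions = [{"A", "B"}, {"A", "C"}, {"A"}, {"B", "C"}]

freq = frequent_items(transactions, min_support=2)
cdb_a = conditional_db(transactions, "A")
cdb_b = conditional_db(transactions, "B")
```

Each conditional database can then be mined independently, which is what makes the divide-and-conquer step easy to distribute.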
23. Preprocessing
• According to Observation 1, we count the support of each item by scanning the database, and eliminate the infrequent items from the transactions.
• According to Observation 3, we sort the items in each transaction in order of descending support value.
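A minimal sketch of this preprocessing pass, assuming transactions are plain Python lists. The function name `preprocess`, the toy data, and the alphabetical tie-break between items of equal support are assumptions made for the example.

```python
from collections import Counter

def preprocess(transactions, min_support):
    """Drop infrequent items, then sort each transaction by descending support."""
    support = Counter(item for t in transactions for item in set(t))
    keep = {i for i, n in support.items() if n >= min_support}
    # Ties are broken alphabetically so the order is deterministic (assumption).
    ordered = sorted(keep, key=lambda i: (-support[i], i))
    rank = {item: position for position, item in enumerate(ordered)}
    cleaned = []
    for t in transactions:
        items = sorted((i for i in set(t) if i in keep), key=rank.__getitem__)
        if items:                      # transactions left empty are dropped
            cleaned.append(items)
    return cleaned, rank

# Hypothetical toy database.
transactions = [["a", "b", "d"], ["b", "c"], ["a", "b", "c"], ["b", "e"]]
cleaned, rank = preprocess(transactions, min_support=2)
```

Here `d` and `e` occur only once and are removed, and `b` moves to the front of every transaction because it has the highest support.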
24. Example of Projection
Example of Projection of a database into CDBs.
Left: sorted transactions in order of f, c, a, b, m, p
Right: conditional databases of frequent items
27. Recursive Projections [H. Li, et al. ACM RS 08]
• Recursive projections form a search tree
• Each node is a CDB
• The order of items is used to prevent duplicated CDBs.
• Each level of a breadth-first search of the tree can be done by a MapReduce iteration.
• Once a CDB is small enough to fit in memory, we can invoke FP-growth to mine it, and its subtree grows no further.
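The search tree can be sketched in a single process as follows. Several things here are assumptions: plain recursion (depth-first) stands in for the level-by-level MapReduce iterations, a brute-force counter named `mine_in_memory` stands in for FP-growth, and `max_rows` is a hypothetical stand-in for "fits in memory". Transactions are expected to be lists already sorted in one global item order.

```python
from collections import Counter
from itertools import combinations

def mine_in_memory(cdb, prefix, min_support):
    """Stand-in for FP-growth: brute-force count of every itemset in a small CDB."""
    counts = Counter()
    for t in cdb:
        for k in range(1, len(t) + 1):
            for combo in combinations(t, k):
                counts[combo] += 1
    return {tuple(sorted(prefix + c)): n
            for c, n in counts.items() if n >= min_support}

def project(db, prefix, min_support, max_rows, out):
    """One node of the search tree: `db` is the CDB conditioned on `prefix`."""
    if len(db) <= max_rows:            # small enough: mine locally, stop growing
        out.update(mine_in_memory(db, prefix, min_support))
        return
    support = Counter(item for t in db for item in t)
    for item, n in support.items():
        if n < min_support:
            continue
        out[tuple(sorted(prefix + (item,)))] = n
        # Keep only the items that precede `item` in the global order, so each
        # itemset is reached through exactly one path (no duplicated CDBs).
        child = [t[:t.index(item)] for t in db if item in t]
        child = [t for t in child if t]
        project(child, prefix + (item,), min_support, max_rows, out)

# Hypothetical sorted transactions (global order: b, a, c).
db = [["b", "a"], ["b", "c"], ["b", "a", "c"], ["b"]]
patterns = {}
project(db, (), min_support=2, max_rows=2, out=patterns)
```

Truncating each transaction at the projected item is what the item order buys: the itemset {a, b} is only ever found under the CDB of `a`, never again under `b`.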
29. Collaborative Filtering
[Figure: binary membership matrix with Questions as rows and Labels/Related Qs as columns; a 1 marks a known membership]
• Based on membership so far, and memberships of others
• Predict further membership
30. Distributed Latent Dirichlet Allocation (LDA)
• Search: match Users/Music/Ads/Question against Users/Music/Ads/Answers
– Construct a latent layer for better semantic matching
• Example:
– iPhone crack
– Apple pie
[Figure: two documents ("1 recipe pastry for a 9 inch double crust, 9 apples, 1/2 cup brown sugar" and "How to install apps on Apple mobile phones?") and two user queries ("iPhone crack", "Apple pie"), each mapped to a topic distribution]
• Other Collaborative Filtering Apps
– Recommend Users → Users
– Recommend Music → Users
– Recommend Ads → Users
– Recommend Answers → Q
• Predict the "?" in the light-blue cells
31. Latent Dirichlet Allocation [D. Blei, M. Jordan 04]
• α: uniform Dirichlet prior for the per-document topic distribution θd (corpus-level parameter)
• β: uniform Dirichlet prior for the per-topic word distribution φz (corpus-level parameter)
• θd is the topic distribution of document d (document level)
• zdj is the topic of the jth word in d, wdj the specific word (word level)
[Figure: plate diagram α → θ → z → w ← φ ← β, with plates Nm, M, and K]
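The roles of α, β, θ, φ, z, and w can be made concrete by sampling a tiny corpus from the generative process. This is a sketch using only the standard library; the function names, the vocabulary, and the parameter values are assumptions, and symmetric Dirichlet draws are built from Gamma samples.

```python
import random

def dirichlet(concentration, k, rng):
    """Sample from a symmetric Dirichlet of dimension k via Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(num_docs, doc_len, num_topics, vocab, alpha, beta, seed=0):
    rng = random.Random(seed)
    # phi[z]: per-topic word distribution, drawn once per topic from Dirichlet(beta)
    phi = [dirichlet(beta, len(vocab), rng) for _ in range(num_topics)]
    docs = []
    for _ in range(num_docs):
        theta = dirichlet(alpha, num_topics, rng)        # theta_d (document level)
        words = []
        for _ in range(doc_len):
            z = rng.choices(range(num_topics), weights=theta)[0]   # z_dj
            w = rng.choices(vocab, weights=phi[z])[0]              # w_dj
            words.append(w)
        docs.append(words)
    return docs, phi

# Hypothetical vocabulary echoing the talk's "iPhone crack" / "Apple pie" example.
vocab = ["apple", "pie", "iphone", "crack"]
docs, phi = generate_corpus(num_docs=3, doc_len=5, num_topics=2,
                            vocab=vocab, alpha=0.5, beta=0.5)
```

Inference runs this process in reverse: given only the words, it estimates θ and φ.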
32. Combinational Collaborative Filtering Model (CCF)
[KDD 2008]
[Figure: three graphical models: Communities with users, Communities with descriptions, and Communities with both users and descriptions]
33. LDA Gibbs Sampling: Inputs & Outputs
Inputs:
1. training data: documents as bags of words
2. parameter: the number of topics
Outputs:
1. model parameters: a co-occurrence matrix of topics and words.
2. by-product: a co-occurrence matrix of topics and documents.
[Figure: a docs × words matrix as input; docs × topics and topics × words matrices as output]
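These inputs and outputs map directly onto a collapsed Gibbs sampler. The sketch below is the standard textbook form, not code from the talk; the function name `gibbs_lda`, the hyperparameter defaults, and the toy corpus of integer word ids are assumptions.

```python
import random

def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=20, seed=0):
    """Collapsed Gibbs sampling; docs are lists of integer word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # docs x topics
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # topics x words
    topic_total = [0] * num_topics

    def bump(d, z, w, delta):
        doc_topic[d][z] += delta
        topic_word[z][w] += delta
        topic_total[z] += delta

    # Random initial topic for every word occurrence.
    assign = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, z in zip(doc, assign[d]):
            bump(d, z, w, +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for j, w in enumerate(doc):
                bump(d, assign[d][j], w, -1)       # take the word out of the counts
                weights = [
                    (doc_topic[d][k] + alpha) * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][j] = z
                bump(d, z, w, +1)                  # put it back under the new topic
    return topic_word, doc_topic

# Hypothetical corpus: word ids 0-1 and 2-3 tend to appear in different documents.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0], [3, 2, 2, 3]]
topic_word, doc_topic = gibbs_lda(docs, num_topics=2, vocab_size=4)
```

`topic_word` is the model output (topics × words counts) and `doc_topic` is the by-product (docs × topics counts).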
34. Parallel Gibbs Sampling
Inputs:
1. training data: documents as bags of words
2. parameter: the number of topics
Outputs:
1. model parameters: a co-occurrence matrix of topics and words.
2. by-product: a co-occurrence matrix of topics and documents.
[Figure: a docs × words matrix as input; docs × topics and topics × words matrices as output]
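One common way to parallelize the sampler, following the distributed inference scheme of Newman et al. (reference [12] in this deck), is to shard the documents across workers, let each worker sample against its own copy of the topic-word counts, and then fold all workers' changes back into one matrix. Whether the talk's system does exactly this is an assumption; the sketch below shows only the merge step, with hypothetical names and counts.

```python
def allreduce_merge(global_counts, local_counts):
    """Fold every worker's change to the topic-word counts into one matrix."""
    merged = [row[:] for row in global_counts]
    for local in local_counts:
        for k, row in enumerate(local):
            for w, value in enumerate(row):
                # Add only the delta this worker produced during its sweep.
                merged[k][w] += value - global_counts[k][w]
    return merged

# Topics x words counts at the start of an iteration (hypothetical numbers).
global_counts = [[2, 0], [1, 3]]
worker_a = [[3, 0], [0, 3]]   # moved one token of word 0 from topic 1 to topic 0
worker_b = [[2, 1], [1, 2]]   # moved one token of word 1 from topic 1 to topic 0
merged = allreduce_merge(global_counts, [worker_a, worker_b])
```

The doc-topic counts never need merging because each document lives on exactly one worker; only the topic-word matrix is shared.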
35. PLDA -- enhanced parallel LDA
[ACM Transactions on IT]
• PLDA is restricted by memory: Topic-word matrix has
to fit into memory
• Restricted by Amdahl’s Law: communication costs
too high
36. PLDA -- enhanced parallel LDA
• Take advantage of bag of words modeling: each Pw
machine processes vocabulary in a word order
• Pipelining: fetching the updated topic distribution
matrix while doing Gibbs sampling
37. Speedup
1,500x using 2,000 machines
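As a back-of-the-envelope reading of this figure, the sketch below computes the parallel efficiency and the serial fraction that Amdahl's law would imply. Treating everything that does not parallelize (including communication) as a single serial fraction is a simplifying assumption, and the function names are hypothetical.

```python
def parallel_efficiency(speedup, machines):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / machines

def implied_serial_fraction(speedup, machines):
    """Solve Amdahl's law S = 1 / (f + (1 - f) / N) for the serial fraction f."""
    return (machines / speedup - 1) / (machines - 1)

efficiency = parallel_efficiency(1500, 2000)
serial = implied_serial_fraction(1500, 2000)
```

Under that assumption, a 1,500x speedup on 2,000 machines corresponds to 75% efficiency and a serial fraction of roughly 0.017%.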
39. Confucius: Google Q&A
[Figure: diagram connecting Search (S), Community (C; CCS/Q&A), and users (U); labels: Differentiated Content (DC), Discussion/Question]
• Trigger a discussion/question session during search
• Provide labels to a post (semi-automatically)
• Given a post, find similar posts (automatically)
• Evaluate the quality of a post: relevance and originality
• Evaluate user credentials in a topic-sensitive way
• Route questions to experts
• Provide the most relevant, high-quality content for Search to index
• NLQA
40. UserRank
• Rank users by quantity (number of links) and quality (weights on links) of contributions
• Quality includes:
– Relevance. Is an answer relevant to the Q? Measured by KL divergence between latent-topic vectors of A and Q
– Coverage. Compared among different answers
– Originality. Detect potential plagiarism and spam
– Promptness. Time between Q and A posting time
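The relevance measure can be sketched as a KL divergence between two topic distributions. The direction of the divergence (question relative to answer), the smoothing constant `eps`, and the example vectors are assumptions; the slide only states that KL divergence between the latent-topic vectors is used.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same topics."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

# Hypothetical latent-topic vectors (each sums to 1).
question = [0.7, 0.2, 0.1]
on_topic_answer = [0.6, 0.3, 0.1]
off_topic_answer = [0.1, 0.1, 0.8]

close = kl_divergence(question, on_topic_answer)
far = kl_divergence(question, off_topic_answer)
```

A smaller divergence means the answer's topic mix is closer to the question's, so `close` is much smaller than `far`.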
41. Outline
• Search + Social Synergy
• Search → Social
• Social → Search
• Scalability and Elasticity
43. Social + Data Networks
50. Google File System (GFS)
[Figure: GFS architecture with clients, a GFS master and master replicas, and chunkservers holding replicated chunks (C0, C1, C3, C4, C5)]
• Master manages metadata
• Data transfers happen directly between clients/chunkservers
• Files broken into chunks (typically 64 MB)
• Chunks triplicated across three machines for safety
• See SOSP '03 paper at http://labs.google.com/papers/gfs.html
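The chunking and replication described above can be sketched as simple arithmetic. The function names and the round-robin placement are assumptions made for illustration; the real placement policy is more involved and is described in the GFS paper.

```python
CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB, the typical chunk size on the slide
REPLICAS = 3                    # chunks are stored on three machines

def split_into_chunks(file_size):
    """Return (offset, length) for each chunk of a file of `file_size` bytes."""
    return [(off, min(CHUNK_SIZE, file_size - off))
            for off in range(0, file_size, CHUNK_SIZE)]

def place_replicas(num_chunks, servers):
    """Round-robin placement onto distinct servers (an illustrative assumption)."""
    return {c: [servers[(c + r) % len(servers)] for r in range(REPLICAS)]
            for c in range(num_chunks)}

MB = 1024 * 1024
chunks = split_into_chunks(150 * MB)
# Hypothetical chunkserver names.
placement = place_replicas(len(chunks), ["cs0", "cs1", "cs2", "cs3", "cs4"])
```

A 150 MB file becomes two full chunks plus a 22 MB tail, and each chunk lands on three different chunkservers.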
51. Big Table
• Memory management
• <Row, Column, Timestamp> triple as the key; lookup, insert, and delete API
• Example
– Row: a Web Page
– Columns: various aspects of the page
– Time stamps: time when a column was fetched
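The data model can be illustrated with an in-memory map keyed by row, column, and timestamp. The class name `TinyTable`, its method signatures, and the sample row are hypothetical; this sketch mirrors the lookup/insert/delete operations named on the slide and is not the actual Bigtable API.

```python
class TinyTable:
    """Toy (row, column, timestamp) -> value store."""

    def __init__(self):
        self._cells = {}   # (row, column) -> {timestamp: value}

    def insert(self, row, column, timestamp, value):
        self._cells.setdefault((row, column), {})[timestamp] = value

    def lookup(self, row, column, timestamp=None):
        """Return the value at `timestamp`, or the newest version if omitted."""
        versions = self._cells.get((row, column), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def delete(self, row, column, timestamp):
        self._cells.get((row, column), {}).pop(timestamp, None)

table = TinyTable()
# Row: a web page; column: one aspect of the page; timestamp: fetch time.
table.insert("www.example.com", "contents", 1, "<html>v1</html>")
table.insert("www.example.com", "contents", 2, "<html>v2</html>")
latest = table.lookup("www.example.com", "contents")
table.delete("www.example.com", "contents", 2)
after_delete = table.lookup("www.example.com", "contents")
```

Keeping every timestamped version lets a reader ask for the page as it was at an earlier fetch.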
52. MapReduce
[Figure: input Data is split into Data Blocks 1 to 5; each block is processed by a map task, and a Reduce step combines the map outputs into Results]
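The map, shuffle, and reduce phases can be simulated in a single process. This is a teaching sketch with hypothetical function names and a word-count job chosen for illustration; it does not use or mimic any real MapReduce API.

```python
from collections import defaultdict

def map_reduce(blocks, mapper, reducer):
    groups = defaultdict(list)
    for block in blocks:                   # map phase: one call per data block
        for key, value in mapper(block):
            groups[key].append(value)      # shuffle: group values by key
    # reduce phase: one call per distinct key
    return {key: reducer(key, values) for key, values in groups.items()}

def word_count_mapper(block):
    for word in block.split():
        yield word, 1

def sum_reducer(key, values):
    return sum(values)

# Hypothetical data blocks.
blocks = ["apple pie", "iphone crack", "apple iphone apple"]
counts = map_reduce(blocks, word_count_mapper, sum_reducer)
```

Because each mapper call sees only its own block, the map phase can run on as many machines as there are blocks.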
61. New Parallel Algorithms
                                                            MapReduce               Pregel      MPI
GFS/IO and task rescheduling overhead between iterations    Yes                     No (+1)     No (+1)
Flexibility of computation model                            AllReduce only (+0.5)   ? (+1)      Flexible (+1)
Efficient AllReduce                                         Yes (+1)                Yes (+1)    Yes (+1)
Recover from faults between iterations                      Yes (+1)                Yes (+1)    Apps
Recover from faults within each iteration                   Yes (+1)                Yes (+1)    Apps
Final score for scalable machine learning                   3.5                     5           4
62. Design Challenges
• New programming models
– Parallel
– Flash (SSD); GPUs?
• Energy conservation
– Hardware, software, warehouse
– Encode/compress/transmit data well
• Fault recovery
– Hardware/software faults
– Deal with stragglers
80. Concluding Remarks
• Search + Social
• Increasing quantity and complexity of data demands scalable
solutions
• Have parallelized key subroutines for mining massive data sets
– Spectral Clustering [ECML 08]
– Frequent Itemset Mining [ACM RS 08]
– PLSA [KDD 08]
– LDA [WWW 09, AAIM 09]
– UserRank [Google TR 09]
– Support Vector Machines [NIPS 07]
• Relevant papers
– http://infolab.stanford.edu/~echang/
• Open Source PSVM, PLDA
– http://code.google.com/p/psvm/
– http://code.google.com/p/plda/
81. Collaborators
• Prof. Chih-Jen Lin (NTU)
• Hongjie Bai (Google)
• Hongji Bao (Google)
• Wen-Yen Chen (UCSB)
• Jon Chu (MIT)
• Haoyuan Li (PKU)
• Zhiyuan Liu (Tsinghua)
• Xiance Si (Tsinghua)
• Yangqiu Song (Tsinghua)
• Matt Stanton (CMU)
• Yi Wang (Google)
• Dong Zhang (Google)
• Kaihua Zhu (Google)
82. References
[1] Alexa internet. http://www.alexa.com/.
[2] D. M. Blei and M. I. Jordan. Variational methods for the dirichlet process. In Proc. of the 21st international
conference on Machine learning, pages 373-380, 2004.
[3] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent dirichlet allocation. Journal of Machine Learning Research, 3:993-
1022, 2003.
[4] D. Cohn and H. Chang. Learning to probabilistically identify authoritative documents. In Proc. of the Seventeenth
International Conference on Machine Learning, pages 167-174, 2000.
[5] D. Cohn and T. Hofmann. The missing link - a probabilistic model of document content and hypertext connectivity. In
Advances in Neural Information Processing Systems 13, pages 430-436, 2001.
[6] S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman. Indexing by latent semantic
analysis. Journal of the American Society of Information Science, 41(6):391-407, 1990.
[7] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the em algorithm.
Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1-38, 1977.
[8] S. Geman and D. Geman. Stochastic relaxation, gibbs distributions, and the bayesian restoration of images. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 6:721-741, 1984.
[9] T. Hofmann. Probabilistic latent semantic indexing. In Proc. of Uncertainty in Artificial Intelligence, pages 289-296,
1999.
[10] T. Hofmann. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems, 22(1):89-
115, 2004.
[11] A. McCallum, A. Corrada-Emmanuel, and X. Wang. The author-recipient-topic model for topic and role discovery in
social networks: Experiments with enron and academic email. Technical report, Computer Science, University of
Massachusetts Amherst, 2004.
[12] D. Newman, A. Asuncion, P. Smyth, and M. Welling. Distributed inference for latent dirichlet allocation. In
Advances in Neural Information Processing Systems 20, 2007.
[13] M. Ramoni, P. Sebastiani, and P. Cohen. Bayesian clustering by dynamics. Machine Learning, 47(1):91-121, 2002.
83. References (cont.)
[14] R. Salakhutdinov, A. Mnih, and G. Hinton. Restricted boltzmann machines for collaborative filtering. In Proc. of the
24th international conference on Machine learning, pages 791-798, 2007.
[15] E. Spertus, M. Sahami, and O. Buyukkokten. Evaluating similarity measures: a large-scale study in the orkut social
network. In Proc. of the 11th ACM SIGKDD international conference on Knowledge discovery in data mining,
pages 678-684, 2005.
[16] M. Steyvers, P. Smyth, M. Rosen-Zvi, and T. Griffiths. Probabilistic author-topic models for information discovery. In
Proc. of the 10th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 306-
315, 2004.
[17] A. Strehl and J. Ghosh. Cluster ensembles - a knowledge reuse framework for combining multiple partitions.
Journal of Machine Learning Research (JMLR), 3:583-617, 2002.
[18] T. Zhang and V. S. Iyengar. Recommender systems using linear classifiers. Journal of Machine Learning Research,
2:313-334, 2002.
[19] S. Zhong and J. Ghosh. Generative model-based clustering of documents: a comparative study. Knowledge and
Information Systems (KAIS), 8:374-384, 2005.
[20] L. Adamic and E. Adar. How to search a social network. 2004.
[21] T.L. Griffiths and M. Steyvers. Finding scientific topics. Proceedings of the National Academy of Sciences, pages
5228-5235, 2004.
[22] H. Kautz, B. Selman, and M. Shah. Referral Web: Combining social networks and collaborative filtering.
Communications of the ACM, 3:63-65, 1997.
[23] R. Agrawal, T. Imielinski, A. Swami. Mining association rules between sets of items in large databases. SIGMOD
Rec., 22:207-216, 1993.
[24] J. S. Breese, D. Heckerman, and C. Kadie. Empirical analysis of predictive algorithms for collaborative filtering. In
Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 1998.
[25] M. Deshpande and G. Karypis. Item-based top-n recommendation algorithms. ACM Trans. Inf. Syst., 22(1):143-
177, 2004.
84. References (cont.)
[26] B.M. Sarwar, G. Karypis, J.A. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms.
In Proceedings of the 10th International World Wide Web Conference, pages 285-295, 2001.
[27] M. Deshpande and G. Karypis. Item-based top-n recommendation algorithms. ACM Trans. Inf. Syst., 22(1):143-
177, 2004.
[28] B.M. Sarwar, G. Karypis, J.A. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms.
In Proceedings of the 10th International World Wide Web Conference, pages 285-295, 2001.
[29] M. Brand. Fast online svd revisions for lightweight recommender systems. In Proceedings of the 3rd SIAM
International Conference on Data Mining, 2003.
[30] D. Goldberg, D. Nichols, B. Oki and D. Terry. Using collaborative filtering to weave an information tapestry.
Communications of the ACM 35, 12:61-70, 1992.
[31] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. Grouplens: An open architecture for collaborative
filtering of netnews. In Proceedings of the ACM Conference on Computer Supported Cooperative Work, pages
175-186, 1994.
[32] J. Konstan, et al. Grouplens: Applying collaborative filtering to usenet news. Communications of the ACM 40, 3:77-87,
1997.
[33] U. Shardanand and P. Maes. Social information filtering: Algorithms for automating “word of mouth”. In
Proceedings of ACM CHI, 1:210-217, 1995.
[34] G. Linden, B. Smith and J. York. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet
Computing, 7:76-80, 2003.
[35] T. Hofmann. Unsupervised learning by probabilistic latent semantic analysis. Machine Learning Journal 42, 1:177-
196, 2001.
[36] T. Hofmann and J. Puzicha. Latent class models for collaborative filtering. In Proceedings of International Joint
Conference in Artificial Intelligence, 1999.
[37] http://www.cs.carleton.edu/cs_comps/0607/recommend/recommender/collaborativefiltering.html
[38] E. Y. Chang, et al., Parallelizing Support Vector Machines on Distributed Machines, NIPS, 2007.
[39] Wen-Yen Chen, Dong Zhang, and E. Y. Chang, Combinational Collaborative Filtering for personalized community
recommendation, ACM KDD 2008.
[40] Y. Sun, W.-Y. Chen, H. Bai, C.-j. Lin, and E. Y. Chang, Parallel Spectral Clustering, ECML 2008.
[41] Wen-Yen Chen, Jon Chu, Junyi Luan, Hongjie Bai, Yi Wang, and Edward Y. Chang, Collaborative Filtering for
Orkut Communities: Discovery of User Latent Behavior, International World Wide Web Conference (WWW)
Madrid, Spain, April 2009.
[42] Yi Wang, Hongjie Bai, Matt Stanton, Wen-Yen Chen, and Edward Y. Chang, PLDA: Parallel Latent Dirichlet
Allocation, International Conference on Algorithmic Aspects in Information and Management, June 2009.