A proposal for combining two different technologies, Solr and a triple store, in order to improve the (user) search experience by decoupling the “search” from the “view” perspective.
1. The document describes matching companies and individuals from the Crunchbase database to patent data from the PATSTAT database.
2. It outlines challenges with missing address data and disambiguating entities, and solutions used like standardizing names, adding country codes, and comparing inventors to company staff.
3. The final results found around 50,000 companies from Crunchbase that own over 12 million patents, with improved precision and recall after filtering based on applicant and inventor matches between the two databases.
This document discusses using data structures for robotic learning. It provides background on data structures and their efficiency measured by Big O notation. Common data structures like linked lists, binary trees, and quad trees are described. Game AI often uses tree structures to evaluate game states. Learning AI can also use tree structures to store learning data from experiences. Robotic learning may be able to apply similar techniques as game AI and learning AI by using data structures like trees to store learned data from experiences and interactions.
This document discusses Last.fm's use of HFiles outside of HBase. It summarizes tests performed comparing Last.fm's original plain text file format to a new binary format based on HFiles. The HFile format reduced file size by 80% and query times by over 90%. Last.fm is moving its chartserver data storage to HBase to address indexing slowness and allow different teams to use different NoSQL systems. The document also advertises two open data scientist positions at Last.fm.
HFile: A Block-Indexed File Format to Store Sorted Key-Value Pairs | Schubert Zhang
HFile mimics Google’s SSTable. It is now available in Hadoop HBase-0.20.0; previous releases of HBase temporarily used an alternate file format, MapFile, which is a common file format in the Hadoop IO package. I think HFile should also become a common file format once it matures, and should be moved into the common IO package of Hadoop in the future.
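The idea of a block-indexed file of sorted key-value pairs can be sketched in a few lines. This is only an in-memory illustration of the general technique, not the actual HFile layout or API: the names `build_blocks`, `lookup`, and `BLOCK_SIZE` are hypothetical, and real blocks would live on disk and be sized in bytes rather than entries.

```python
from bisect import bisect_right

BLOCK_SIZE = 2  # entries per block; hypothetical and tiny, for illustration only


def build_blocks(sorted_pairs, block_size=BLOCK_SIZE):
    """Split sorted (key, value) pairs into blocks and index each block's first key."""
    blocks = [sorted_pairs[i:i + block_size]
              for i in range(0, len(sorted_pairs), block_size)]
    index = [block[0][0] for block in blocks]
    return blocks, index


def lookup(blocks, index, key):
    """Use the index to pick the one block that could hold `key`, then scan only it."""
    pos = bisect_right(index, key) - 1
    if pos < 0:
        return None  # key sorts before the first block
    for k, v in blocks[pos]:
        if k == key:
            return v
    return None


pairs = sorted({"apple": 1, "banana": 2, "cherry": 3, "date": 4, "fig": 5}.items())
blocks, index = build_blocks(pairs)
```

Because the pairs are sorted, the small index is enough to narrow a lookup to a single block, which is the property that makes this kind of format cheap to query.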
This document is the introduction to the history book of the Kizhakkekara family. It acknowledges the contributions of the ancestors who established and maintained the family and its traditions over generations. It hopes that recording the history will help preserve the memories and legacy for future generations. The introduction honors the ancestors and seeks blessings for the well-being and unity of current and future family members. It concludes with a prayer asking God to protect the family and fulfill their needs and aspirations.
The document discusses competency mapping for the role of Program Director in a Management Institute. It outlines the core competencies required for the role, including personal attributes, communication skills, leadership, analytical thinking, and strategic planning. Administrative competencies like placement, faculty training, appraisals and organizational development are also outlined. Behavioral competencies such as time management, language skills, human relations and stress tolerance are discussed as important for day-to-day functioning and managing human relations effectively. The competencies required are mapped over the course of a 10 year period, with different competency levels expected in the 1st 2-3 years compared to 4-6 years and 7-9 years.
We all have dreams, but few dare to work to achieve them.
In this magazine you will find the stories of people who, by working intelligently and building teams, have achieved the much-talked-about financial freedom. We can all achieve it if we find the right vehicle.
Christian Lacroix is a renowned French fashion designer born in southern France. He studied at several prominent French universities and has shown talent in many areas of design including fashion, perfume, jewelry, and costumes. Today Lacroix has over 1000 retail locations worldwide selling his designs. The document then provides product details and pricing for various Lacroix accessories, clothing, and other items.
The document discusses developing faith integration by teaching the gospel message based on the Book of John. It provides an introduction to the Book of John and outlines the gospel message. It then discusses objectives, content, strategies, and goals for teaching faith through the gospel message, focusing on developing students' faith, belief in Jesus Christ, and hope for eternal life. The conclusion emphasizes that the Gospel of John presents Jesus Christ as creator, redeemer, and hope for everlasting life, and stresses believing in Jesus and hoping for eternal life.
The "top kill" effort to stop the BP oil spill in the Gulf of Mexico by pumping drilling mud into the well has failed. BP will now try a new approach to stop the flow of crude oil fouling coastal areas. Over 1.2 million gallons of mud were used in the "top kill" attempt over three days, but it did not plug the leak. The spill is still flowing at an estimated rate of 18-40 million gallons and is now the largest in U.S. history, surpassing the Exxon Valdez disaster. BP said it is readying a new plan but did not provide details on its likelihood of success.
- Learn the meaning of a problem
- Know the process of solving a problem
- Learn the difference between solver and executor
- Know the concept of an algorithm
- Know the properties of an algorithm
The Military Spouse Employment Partnership (MSEP) has grown to include 96 partner companies, up from 72 previously. The program aims to connect military spouses with flexible employment opportunities and has already provided jobs to over 5,600 spouses. Updates to the MSEP website will allow spouses to directly apply for jobs and be notified of resume matches. The Department of Defense is committed to continuing improvements to help reduce high military spouse unemployment and wage gaps.
Desmania is a multidisciplinary design house with over 21 years of experience in product, packaging, and retail design. It has a strong workforce of designers, engineers, and modelers working across various industry segments. Desmania believes in design thinking and a user-centric approach to excavate hidden user needs and deliver total brand experiences. It has state-of-the-art facilities for prototyping, 3D printing, modeling, and more. Some of Desmania's clients include major corporations and MSMEs. The document then highlights three of the author's projects at Desmania - a fan design inspired by Indian culture, new packaging for an oil company, and a serving tray with modern yet classic designs.
The document compares the calorie, fat, and sugar content of meals from Maria, Jataya, a kid, and an adult to two healthy meals and one other meal. It finds that unhealthy meals like Maria's sausage croissant and kid's chicken strips and fruit punch have much higher calorie, fat, and sugar contents than healthier options like a side salad, water, and apple or a chicken pita and sprite. The document thanks the Jack in the Box website for providing the nutritional information and pictures used.
This document summarizes the assessment of self-regulated learning. It discusses widely used methodologies like self-report surveys such as the Motivated Strategies for Learning Questionnaire and the Self-Regulated Learning Interview Scale. It also discusses innovative methodologies like trace measures that track student actions, model tracing of human actions, and computerized self-evaluation measures with mandatory reports and performance comparisons. The study aims to systematically review the development of technology-supported assessments of self-regulated learning.
The document describes the author's observations of several retail stores, including an Apple store, gold jewelry store, furniture expo, flower market, Domino's pizza outlet, and city supermarket. For each store, the author notes details about the environment, customers, products, noise levels, lighting, flooring, and how these elements make them feel. The stores are contrasted in terms of whether the door is open or closed, the size and font of signage, and whether the design draws customers in before they enter.
This document provides information on travel packages and membership opportunities with DreamTrips. It includes descriptions of various international vacation packages with prices and details. It also outlines the benefits and compensation structure for becoming a DreamTrips representative.
1. The document presents several group dynamics aimed at integration, self-knowledge, and the development of values. The dynamics involve exercises such as introductions using adjectives, choosing photographs, identifying feelings with colors, and more.
We have lots of data. Let’s use it to create more credible estimates to help tame the growth beast. In spite of the estimating community’s efforts to provide credible estimates, government programs still seem to deliver less than promised, cost more than planned, and take longer than needed. When estimates are consistently biased low:
- Decisions of choice are distorted
- Cost growth causes more growth as programs are stretched out to fund portfolios with fixed budgets
- Taxpayers become more cynical and negative about government
- The estimating community’s credibility is seriously questioned
The technology company announced a new smartphone with an improved camera, a larger screen, and better performance. The device also has additional artificial intelligence features and enhanced data security. The launch of the new smartphone is scheduled for the end of this year.
An effective message for a campaign or organizing effort should be credible, concise, relevant, and compelling. It answers the question of "why" by explaining why the issue matters and why people should care. The message is a conversation, not just a slogan or list of demands. It must be credible based on trustworthy content and messenger. It should also be compelling to the audience by connecting on a personal level. Finally, an effective message is concise and clear without jargon, and contrasts the choices to offer a decision between the campaign and its opposition.
The document discusses 9 collision events that could occur from 53 emerging realities, including the personal consumer experiencing accelerating serendipity and having simultaneous identities, everything becoming connected, and disruptive inventions in areas like robotics, self-driving cars, and alternative energies. The events also explore trends like the power of the crowd, a new age of sharing through renting and accessing over owning, and computers becoming super-smart through advances like passive computing and speech recognition.
This document provides an assignment guide for an introductory psychology course covering the following:
1) It outlines the reading assignments, homework assignments, quizzes, and important reminders for each class day from June 13th to July 4th.
2) Students are expected to complete reading assignments from the textbook, supplementary materials, and work shows before each class and complete related homework assignments.
3) Quizzes will cover the reading assignments and important concepts for each class, and students are warned to thoroughly study all sections, including enrichment materials, to earn an A.
4) The guide provides important reminders about course procedures, materials, and expectations to help students succeed in the course.
The document discusses several announcements and events from the Department of Defense and organizations that support military families:
1) The launch of the Military Spouse Employment Partnership program to connect military spouses to career opportunities with over 70 employer partners.
2) An upcoming hiring fair in Los Angeles on July 10th that is open to both veterans and military spouses, and will be attended by Prince William and Catherine.
3) Updates from the Family Advocacy Program on a meeting between program staff and service representatives, as well as a DOD summit on preventing child and domestic abuse fatalities.
4) Upcoming commissary on-site sales for Guard/Reserve members in several locations throughout July.
The Archives Forum - The National Archives - 02 March 2011 | David F. Flanders
The document summarizes a presentation given by David F. Flanders about digital infrastructure innovation and the future of archives. It discusses how archives can innovate with limited budgets in the short term by improving search engine optimization, using application programming interfaces, and engaging communities. In the medium term, archives can prepare for increased budgets by crowdsourcing content and metadata from communities. Long term innovations may include addressing why digitization is endless, understanding how context is missing from the web, embracing open licensing, and preparing for technologies like augmented reality.
The document discusses challenges facing the semantic web as it tries to keep up with the growth of the regular web, including not having enough agreed upon vocabularies, data, and links between data. It also notes problems with reasoning over large amounts of noisy and inconsistent web data from different sources. Solutions proposed include cleverly injecting semantic web technologies into content management systems to extract and link more data, as well as developing lightweight vocabularies and simplified reasoning techniques.
II-SDV 2017: Custom Open Source Search Engine with Drupal 8 and Solr at Frenc... | Dr. Haxel Consult
A journey in the Dark Web, for companies looking to take control of their search strategy. The objective of this presentation is to prove that, at a reasonable cost, any organisation can set up its own search strategy, outside or in parallel with its document management strategy.
The challenge at the French Ministry is to aggregate internal content, external content on social networks (Pinterest, YouTube, Facebook), and external legacy website content (other websites from agencies related to the Ministry), and to provide a brand new website with a "best of breed" interface: search engine, autocompletion and word correction, and easy, custom, secured navigation.
The result is awesome: for a budget kept under control, we provided a new Drupal module to monitor and configure Solr 6 indexing and the search engine, together with a custom API to index external websites.
This session will include a presentation of the project architecture (multi-tier servers) and a live demo of the search interface.
Towards a rebirth of data science (by Data Fellas) | Andy Petrella
Nowadays, Data Science is buzzing all over the place.
But what is a so-called Data Scientist?
Some will argue that a Data Scientist is a person able to report and present insights in a data set. Others will say that a Data Scientist can handle a high throughput of values and expose them in services. Yet another definition includes the capacity to create meaningful visualizations on the data.
However, we are entering an age where velocity is key. Not only is the velocity of your data high, but the time to market is also shortened. Hence, the time between the moment you receive a set of data and the moment you are able to deliver added value is crucial.
In this talk, we’ll review the legacy Data Science methodologies and what they meant in terms of delivered work and results.
Afterwards, we’ll move towards different concepts, techniques and tools that Data Scientists will have to learn and adopt in order to accomplish their tasks in the age of Big Data.
The talk closes by presenting the Data Fellas view of a solution to these challenges, notably through the Spark Notebook and the Shar3 product we develop.
Utopoll is a decentralized survey aggregator platform built on blockchain and IPFS technologies. It uses Tether tokens (USDT) for transactions and provides a faster, safer, and more open environment for conducting surveys compared to traditional centralized methods. Users can distribute survey data across a global network for storage using IPFS's distributed file storage approach, improving transmission speeds, privacy, security, and permanent storage of survey results.
COinS allow metadata about articles to be hidden on web pages in a standardized format, OpenURL. Browser plugins can use COinS to find documents or store metadata. Adding COinS to pages gives users simpler and more personalized ways of working with resources, by linking to library services or saving references.
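As a rough illustration of what such hidden metadata looks like, the sketch below builds a COinS `<span>` for a journal article. COinS conventionally uses an empty span with class `Z3988` whose `title` attribute holds a URL-encoded OpenURL ContextObject. The helper name `coins_span` and the sample article values are made up for this example.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(metadata):
    """Return a COinS <span> carrying an OpenURL ContextObject for a journal article."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ]
    # Referent fields are prefixed with "rft." in the key/encoded-value format.
    fields += [(f"rft.{name}", value) for name, value in metadata.items()]
    # Escape the query string so that "&" becomes "&amp;" inside the HTML attribute.
    title = escape(urlencode(fields), quote=True)
    return f'<span class="Z3988" title="{title}"></span>'


# Hypothetical article metadata, for illustration only.
span = coins_span({"atitle": "An Example Article",
                   "jtitle": "Example Journal",
                   "date": "2010"})
```

The span renders as nothing on the page, which is why the metadata is described as hidden; a plugin that knows the `Z3988` convention can read the `title` attribute and hand it to a library link resolver or a reference manager.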
Architecture Patterns for Semantic Web Applications | bpanulla
This document provides an overview of non-relational database (NoSQL) architectures and patterns for semantic web applications. It discusses NoSQL key-value and graph databases as alternatives to relational databases for domains where schemas change rapidly or data is sparse. It also covers semantic web technologies like RDF, OWL, SPARQL and linked data for representing information and relationships in a machine-readable way. The document uses examples to illustrate concepts like modeling bookmark data from a social bookmarking site in RDF and querying it with SPARQL.
Emanuele Bartolesi will present on Real Solutions Day about Azure Media Services and Azure Search. He will demonstrate how to use Azure Media Services to encode, package and stream video and audio files. He will also demonstrate how to use Azure Search to build a search index and query data. The agenda includes overviews of Azure Media Services and Azure Search, followed by demonstrations of each service.
This document summarizes research into discovering lost web pages using techniques from digital preservation and information retrieval. Key points include:
- Web pages are frequently lost due to broken links or content being moved/removed, but copies may still exist in search engine caches or archives.
- Techniques like lexical signatures (representing a page's content in a few keywords) and analyzing page titles, tags and link neighborhoods can help characterize lost pages and find similar replacement content.
- Experiments showed that lexical signatures degrade over time but page titles are more stable, and combining techniques improves performance in locating replacement content. The goal is to develop a browser extension to help users find lost web pages.
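A lexical signature of the kind described above can be sketched as the top few terms of a page ranked by TF-IDF. This is a minimal illustration, not the researchers' implementation: the smoothed IDF formula, the tokenizer, the function names, and the toy corpus are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as terms."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(page, corpus, k=3):
    """Return the k terms of `page` with the highest TF-IDF score against `corpus`."""
    tf = Counter(tokenize(page))
    doc_tokens = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(corpus)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for tokens in doc_tokens if term in tokens)
        # Smoothed IDF (an assumption): terms found in every document score zero.
        scores[term] = count * math.log((n_docs + 1) / (df + 1))
    # Sort by score descending, then alphabetically so ties are deterministic.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Toy background corpus and page, for illustration only.
corpus = ["the cat sat on the mat",
          "the dog chased the ball",
          "the weather is mild today"]
page = "lost page archive copy of the lost page"
signature = lexical_signature(page, corpus)
```

The resulting terms could then be submitted to a search engine as a query to look for a copy or replacement of the missing page; common words drop out because they appear throughout the background corpus.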
Migration from a Commercial Search Platform (specifically FAST ESP) to Lucene/Solr
Presented by Michael McIntosh, VP, Enterprise Search Technologies, TNR Global
There are many reasons why an IT department with a large-scale search installation would want to move from a proprietary platform to Lucene/Solr. In the case of FAST Search, the company’s purchase by Microsoft and the discontinuation of the Linux platform have created urgency for FAST users.
This presentation will compare Lucene/Solr to FAST ESP on a feature basis, and as applied to an enterprise search installation. We will further explore how various advanced features of commercial enterprise search platforms can be implemented as added functions for Lucene/Solr. Actual cases will be presented describing how to map the various functions between systems.
What is a distributed data science pipeline. how with apache spark and friends.Andy Petrella
What was a data product before the world changed and got so complex.
Why distributed computing/data science is the solution.
What problems does that add?
How to solve most of them using the right technologies like spark notebook, spark, scala, mesos and so on in a accompanied framework
Utopoll is a decentralized survey aggregator platform built on blockchain and IPFS technologies. It allows users to conduct surveys, store survey data across a distributed network for improved security and speed, and uses the Tether cryptocurrency (a stablecoin pegged to the US dollar) for payments. The platform aims to provide a faster, safer, and more open environment for online surveys compared to traditional centralized methods. Users can earn rewards for participating in surveys, inviting others, and promoting the platform.
This document provides an introduction to the Semantic Web and RDF (Resource Description Framework). It discusses how the Semantic Web aims to extend the current web by giving data well-defined meaning to enable computers and people to better work together. It introduces RDF as a standard for representing information in the Semantic Web and provides examples of how RDF can be used to represent different types of data, such as relational data and evolving data scenarios.
- The document discusses the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) standard which allows interoperability between digital archives and repositories.
- It describes key aspects of the OAI-PMH standard including verbs, identifiers, sets, data and service providers, and harvesting metadata from multiple sources.
- The document also provides an example of implementing OAI-PMH through the CulturaItalia project in Italy which aggregates metadata about artworks in Tuscany from different source repositories.
The document discusses the Semantic Web and Linked Data. It provides an overview of RDF syntaxes, storage and querying technologies for the Semantic Web. It also discusses issues around scalability and reasoning over large amounts of semantic data. Examples are provided to illustrate SPARQL querying of RDF data, including graph patterns, conjunctions, optional patterns and value testing.
Making the Most of In-Memory: More than SpeedInside Analysis
The document discusses how in-memory platforms are more than just speed - they are designed to efficiently exploit RAM and are optimized for analytics workloads that involve complex "crunching" of data. It explains that analytics workloads are CPU-intensive and benefit from techniques like parallelization across CPU cores. Additionally, the document notes that declining RAM prices and interest in advanced analytics are driving more adoption of in-memory platforms for both large and small data use cases.
Open for Business Open Archives, OpenURL, RSS and the Dublin CoreAndy Powell
UKOLN is supported by various open standards and protocols to facilitate digital information management, including OpenURL, RSS, Dublin Core, and the OAI Protocol for Metadata Harvesting. Andy Powell from UKOLN gave a presentation on using these standards to integrate resources from multiple content providers and enable user-focused discovery and access across heterogeneous collections. The presentation provided an overview of each standard and how they address issues like joining up discovery services with delivery of appropriate copies.
Project Tungsten: Bringing Spark Closer to Bare MetalDatabricks
As part of the Tungsten project, Spark has started an ongoing effort to dramatically improve performance to bring the execution closer to bare metal. In this talk, we’ll go over the progress that has been made so far and the areas we’re looking to invest in next. This talk will discuss the architectural changes that are being made as well as some discussion into how Spark users can expect their application to benefit from this effort. The focus of the talk will be on Spark SQL but the improvements are general and applicable to multiple Spark technologies.
Retrieving musical records from a corpus of Information, using an audio input as a query is not an easy task. Various approaches try to solve the problem modelling the query and the corpus of Information as an array of hashes calculated from the chroma features of the audio input.
Scope of this talk is to introduce a novel approach in calculating such hashes, considering the intervals of the most intense pitches of sequential chroma vectors.
Building on the theoretical introduction, a prototype will show you this approach in action with Apache Solr with a sample dataset and the benefits of positional queries.
Challenges and future works will follow up as conclusive considerations.
Music Information Retrieval is about retrieving information from music entities.
The slides will introduce the basic concepts of the music language, passing through different kind of music representations and it will end up describing some low level features that are used when dealing with music entities.
Every team working on information retrieval software struggles with the task of evaluating how well their system performs in terms of search quality(currently and historically). Evaluating search quality is important both to understand and size the improvement or regression of your search application across the development cycles, and to communicate such progress to relevant stakeholders. In the industry, and especially in the open source community, the landscape is quite fragmented: such requirements are often achieved using ad-hoc partial solutions that each time require a considerable amount of development and customization effort. To provide a standard, unified and approachable technology, we developed the Rated Ranking Evaluator (RRE), an open source tool for evaluating and measuring the search quality of a given search infrastructure. RRE is modular, compatible with multiple search technologies and easy to extend.
Haystack London - Search Quality Evaluation, Tools and Techniques Andrea Gazzarini
Every search engineer ordinarily struggles with the task of evaluating how well a search engine is performing. Improving the correctness and effectiveness of a search system requires a set of tools which help measuring the direction where the system is going. The talk will describe the Rated Ranking Evaluator from a developer perspective. RRE is an open source search quality evaluation tool, that could be used for producing a set of deliverable reports and that could be integrated within a continuous integration infrastructure.
Search Quality Evaluation: a Developer PerspectiveAndrea Gazzarini
Search quality evaluation is an ever-green topic every search engineer ordinarily struggles with. Improving the correctness and effectiveness of a search system requires a set of tools which help measuring the direction where the system is going.
The slides will focus on how a search quality evaluation tool can be seen under a practical developer perspective, how it could be used for producing a deliverable artifact and how it could be integrated within a continuous integration infrastructure.
ADLUG 2013 - A proposal for an RDF assembly lineAndrea Gazzarini
This document summarizes an agenda for a meeting discussing an RDF assembly line. It describes intermediate and final storage layers used in the conversion process. An intermediate MARC storage layer is used for querying during conversion. Records are converted to RDF triples and stored in a triple store as the final layer, allowing for standardized querying and exchange of data. A Camel integration framework is used to define processors that split, aggregate, manipulate, and perform tasks to chain the whole conversion process together.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsDianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
Creating a compelling user experience for any software, without the limitations of APIs.
Accelerating the app creation process, saving time and effort
Enjoying high-performance CRUD (create, read, update, delete) operations, for
seamless data management.
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Keywords: AI, Containeres, Kubernetes, Cloud Native
Event Link: https://meine.doag.org/events/cloudland/2024/agenda/#agendaId.4211
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
At this talk we will discuss DDoS protection tools and best practices, discuss network architectures and what AWS has to offer. Also, we will look into one of the largest DDoS attacks on Ukrainian infrastructure that happened in February 2022. We'll see, what techniques helped to keep the web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on Ukraine experience
Getting the Most Out of ScyllaDB Monitoring: ShareChat's TipsScyllaDB
ScyllaDB monitoring provides a lot of useful information. But sometimes it’s not easy to find the root of the problem if something is wrong or even estimate the remaining capacity by the load on the cluster. This talk shares our team's practical tips on: 1) How to find the root of the problem by metrics if ScyllaDB is slow 2) How to interpret the load and plan capacity for the future 3) Compaction strategies and how to choose the right one 4) Important metrics which aren’t available in the default monitoring setup.
inQuba Webinar Mastering Customer Journey Management with Dr Graham HillLizaNolte
HERE IS YOUR WEBINAR CONTENT! 'Mastering Customer Journey Management with Dr. Graham Hill'. We hope you find the webinar recording both insightful and enjoyable.
In this webinar, we explored essential aspects of Customer Journey Management and personalization. Here’s a summary of the key insights and topics discussed:
Key Takeaways:
Understanding the Customer Journey: Dr. Hill emphasized the importance of mapping and understanding the complete customer journey to identify touchpoints and opportunities for improvement.
Personalization Strategies: We discussed how to leverage data and insights to create personalized experiences that resonate with customers.
Technology Integration: Insights were shared on how inQuba’s advanced technology can streamline customer interactions and drive operational efficiency.
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving
What began over 115 years ago as a supplier of precision gauges to the automotive industry has evolved into being an industry leader in the manufacture of product branding, automotive cockpit trim and decorative appliance trim. Value-added services include in-house Design, Engineering, Program Management, Test Lab and Tool Shops.
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
ScyllaDB is making a major architecture shift. We’re moving from vNode replication to tablets – fragments of tables that are distributed independently, enabling dynamic data distribution and extreme elasticity. In this keynote, ScyllaDB co-founder and CTO Avi Kivity explains the reason for this shift, provides a look at the implementation and roadmap, and shares how this shift benefits ScyllaDB users.
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...DanBrown980551
This LF Energy webinar took place June 20, 2024. It featured:
-Alex Thornton, LF Energy
-Hallie Cramer, Google
-Daniel Roesler, UtilityAPI
-Henry Richardson, WattTime
In response to the urgency and scale required to effectively address climate change, open source solutions offer significant potential for driving innovation and progress. Currently, there is a growing demand for standardization and interoperability in energy data and modeling. Open source standards and specifications within the energy sector can also alleviate challenges associated with data fragmentation, transparency, and accessibility. At the same time, it is crucial to consider privacy and security concerns throughout the development of open source platforms.
This webinar will delve into the motivations behind establishing LF Energy’s Carbon Data Specification Consortium. It will provide an overview of the draft specifications and the ongoing progress made by the respective working groups.
Three primary specifications will be discussed:
-Discovery and client registration, emphasizing transparent processes and secure and private access
-Customer data, centering around customer tariffs, bills, energy usage, and full consumption disclosure
-Power systems data, focusing on grid data, inclusive of transmission and distribution networks, generation, intergrid power flows, and market settlement data
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...Fwdays
Direct losses from downtime in 1 minute = $5-$10 thousand dollars. Reputation is priceless.
As part of the talk, we will consider the architectural strategies necessary for the development of highly loaded fintech solutions. We will focus on using queues and streaming to efficiently work and manage large amounts of data in real-time and to minimize latency.
We will focus special attention on the architectural patterns used in the design of the fintech system, microservices and event-driven architecture, which ensure scalability, fault tolerance, and consistency of the entire system.
From Natural Language to Structured Solr Queries using LLMsSease
This talk draws on experimentation to enable AI applications with Solr. One important use case is to use AI for better accessibility and discoverability of the data: while User eXperience techniques, lexical search improvements, and data harmonization can take organizations to a good level of accessibility, a structural (or “cognitive” gap) remains between the data user needs and the data producer constraints.
That is where AI – and most importantly, Natural Language Processing and Large Language Model techniques – could make a difference. This natural language, conversational engine could facilitate access and usage of the data leveraging the semantics of any data source.
The objective of the presentation is to propose a technical approach and a way forward to achieve this goal.
The key concept is to enable users to express their search queries in natural language, which the LLM then enriches, interprets, and translates into structured queries based on the Solr index’s metadata.
This approach leverages the LLM’s ability to understand the nuances of natural language and the structure of documents within Apache Solr.
The LLM acts as an intermediary agent, offering a transparent experience to users automatically and potentially uncovering relevant documents that conventional search methods might overlook. The presentation will include the results of this experimental work, lessons learned, best practices, and the scope of future work that should improve the approach and make it production-ready.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way to break data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is re-paid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
1. 31st ADLUG ANNUAL MEETING 2012
Sala Brunelleschi of the OPA – CESVOT - Firenze
19 – 21 September 2012
Linking Linked Data
Andrea Gazzarini
Software Architect
Copyright 2009-2010 @CULT. All rights reserved
2. Agenda
Goals
Information Retrieval
Triple store
Proof of concept
Q&A
4. Goals
1) Combine two different technologies in order to improve the (user) search
experience by decoupling the “search” from the “view” perspective.
2) Provide a fast, full-featured full-text search that scales to billions
of records and offers typical search features like faceting, stemming,
autocompletion and so on...
3) Provide a system that can benefit from the extensibility of
Linked Data.
5. Le avventure di Pinocchio
This is a record extracted from the recordset we will use during
this presentation.
000 00694nam a2200241 i 4500
008 971205s1997 it j 000 0 ita c
020 a 880921191X
082 1 a 853.8
100 1 a Collodi, Carlo.
245 13 a Le avventure di Pinocchio /
c C. Collodi ; illustrazioni di Attilio Mussino.
260 a Firenze :
b Giunti,
c 1997.
440 0 a Collana favolosa / [Giunti]
521 a Letteratura per ragazzi
700 1 a Mussino, Attilio.
6. Agenda
Goals
Information Retrieval
Triple store
Proof of concept
Q&A
7. Information Retrieval (1/2)
For our purposes we will (simplistically) define an Information Retrieval (IR) system as
a full-text search framework that indexes textual data and manipulates it
in order to enable search features that matter to end users, like:
» Relevance computation and boosting
» Autocompletion
» Faceting
» Stemming
» Did you mean?
» Search by phoneme (i.e. Sounds Like)
» More like this
» ...and many many others...
But there's a price to pay for that...
8. Inverted index
In computer science, an inverted index (also referred to as postings file or
inverted file) is an index data structure storing a mapping from content, such
as words or numbers, to its locations in a database file, or in a document or a
set of documents. The purpose of an inverted index is to allow fast full text
searches, at a cost of increased processing when a document is added to the
database. The inverted file may be the database file itself, rather than its
index. It is the most popular data structure used in document retrieval systems
http://en.wikipedia.org/wiki/Inverted_index
An inverted index is an optimized structure that allows fast searches, but it is
designed to be immutable: if you need to change something in
your data, you have to rebuild your index.
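To make the definition above concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration, not how Solr or Lucene store their postings; the function names and the second sample title are our own assumptions.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return the ids of the documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Illustrative documents: the second title is made up for the example.
docs = {
    1: "Le avventure di Pinocchio",
    2: "Le avventure di Tom Sawyer",
}
index = build_inverted_index(docs)
```

A lookup is a dictionary access per term, which is why searches are fast. The cost is paid at indexing time: in this sketch, changing a document means calling `build_inverted_index` again.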
9. Semantic destruction (1/3)
A search engine doesn't care how much accuracy you put in or how
much time you spent cataloguing a bibliographic resource...once
indexed, it will lose any semantic meaning!
[Figure: the text “...ipsum dolor sit amet, consectetur adipiscing...” shown next to a jumble of scattered single letters]
10. Semantic destruction (2/3)
Step               Result
(input)            The adventures of Pinocchio
Tokenization       The adventures of Pinocchio
Stopwords          adventures Pinocchio
Lowercase          adventures pinocchio
Stemming (light)   adventure pinocchio
Phoneme (!)        ATFN PNX
These are the only tokens that will be indexed!
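The analysis chain above can be sketched as a sequence of small functions. This is a simplified illustration under our own assumptions (a tiny stopword list, a naive plural-stripping stemmer); the phonetic step is left out because it depends on the encoding algorithm configured in the search engine.

```python
# Hypothetical stopword list, just large enough for the example.
STOPWORDS = {"the", "of", "a", "an", "and"}


def tokenize(text):
    """Split the text on whitespace."""
    return text.split()


def remove_stopwords(tokens):
    """Drop very common words, comparing case-insensitively."""
    return [t for t in tokens if t.lower() not in STOPWORDS]


def lowercase(tokens):
    return [t.lower() for t in tokens]


def light_stem(tokens):
    """Naive light stemming: strip a trailing plural 's' from longer tokens."""
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]


def analyze(text):
    """Run the chain in the same order as on the slide."""
    tokens = tokenize(text)
    for step in (remove_stopwords, lowercase, light_stem):
        tokens = step(tokens)
    return tokens
```

Calling `analyze("The adventures of Pinocchio")` leaves only two tokens, and nothing in them says that the text was a title.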
12. Agenda
Goals
Information Retrieval
Triple store
Proof of concept
Q&A
13. Triple store (1/2)
A triplestore is a purpose-built database for the storage and retrieval of triples,
a triple being a data entity composed of subject-predicate-object, like "Bob
is 35" or "Bob knows Fred".
http://en.wikipedia.org/wiki/Triplestore
Subject   Predicate      Object
book      hasTitle       The adventures of Pinocchio
book      hasAuthor      Collodi, Carlo
book      hasPublisher   Giunti
A triple store is therefore much closer to a database and has essentially
nothing in common with an inverted index.
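As a rough illustration of the idea, the three triples above can be stored and queried with a few lines of Python. The `TripleStore` class and its `match` method are hypothetical names for this sketch; a real triple store exposes SPARQL instead.

```python
class TripleStore:
    """A toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        # Adding a triple is a plain insert: nothing has to be reindexed.
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return the triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("book", "hasTitle", "The adventures of Pinocchio")
store.add("book", "hasAuthor", "Collodi, Carlo")
store.add("book", "hasPublisher", "Giunti")
```

Here `store.match(predicate="hasAuthor")` answers "who is the author?" by exact pattern matching on the stored statements; no tokenization or stemming is involved.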
14. Triple store (2/2)
Using a triple store you can have
1) a standard Query language (SPARQL) to query the store;
2) a standard format for exchanging data (RDF);
3) a storage where you are free to change your data in realtime
without doing any kind of reindex operation;
But, most importantly, you cannot have
any of the search features we described in the previous slides; for
some of them it is practically impossible (e.g. faceting), for others
(e.g. autocompletion) the main problem is the response time.
15. Agenda
Goals
Information Retrieval
Triple store
Proof of concept
Q&A
16. Proof of Concept
Our system combines the two technologies described so far, trying
to keep the advantages of both while minimizing their disadvantages.
[Diagram: input formats (MARC binary, MARC XML, RDF / XML, N3, Turtle, NTriples); “Search” is served by the Information Retrieval component, “View” by the triple store]
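The split between “search” and “view” can be sketched as two independent components joined only by the record URI. This is a toy model of the idea, not the actual system: `SearchIndex`, `ViewStore` and `find` are names we made up for the example.

```python
class SearchIndex:
    """Stand-in for the Information Retrieval side: text in, record URIs out."""

    def __init__(self):
        self._postings = {}

    def index(self, uri, text):
        for term in text.lower().split():
            self._postings.setdefault(term, set()).add(uri)

    def search(self, query):
        hits = [self._postings.get(t, set()) for t in query.lower().split()]
        return sorted(set.intersection(*hits)) if hits else []


class ViewStore:
    """Stand-in for the triple store side: URI in, current description out."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def describe(self, uri):
        description = {}
        for s, p, o in sorted(self._triples):
            if s == uri:
                description.setdefault(p, []).append(o)
        return description


def find(index, view, query):
    """Search decides WHICH records match; the view decides WHAT is shown."""
    return [(uri, view.describe(uri)) for uri in index.search(query)]


BOOK = "http://www.cbt.trentinocultura.net/biblio/000002577949"
index, view = SearchIndex(), ViewStore()
index.index(BOOK, "Le avventure di Pinocchio")
view.add(BOOK, "dcterms:title", "Le avventure di Pinocchio")

# Enrich the view later on; the search index is not touched.
view.add(BOOK, "dcterms:extent", "186 p.")
results = find(index, view, "pinocchio")
```

The design choice is that the index only ever returns identifiers, so anything added to the view side shows up in the next result page without a reindex.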
18. Le avventure di Pinocchio (MARC)
000 00694nam a2200241 i 4500
008 971205s1997 it j 000 0 ita c
020 a 880921191X
082 1 a 853.8
100 1 a Collodi, Carlo.
245 13 a Le avventure di Pinocchio /
c C. Collodi ; illustrazioni di Attilio Mussino.
260 a Firenze :
b Giunti,
c 1997.
440 0 a Collana favolosa / [Giunti]
521 a Letteratura per ragazzi
700 1 a Mussino, Attilio.
19. Le avventure di Pinocchio (RDF / XML)
The book...
<bibo:Book rdf:about="http://www.cbt.trentinocultura.net/biblio/000002577949">
<dcterms:identifier>000002577949</dcterms:identifier>
<bibo:isbn10>880921191X</bibo:isbn10>
<dcterms:shortTitle>Le avventure di Pinocchio</dcterms:shortTitle>
<dcterms:title>
Le avventure di Pinocchio / C. Collodi ; illustrazioni di Attilio Mussino
</dcterms:title>
<dc:creator rdf:resource="http://www.cbt.trentinocultura.net/person/collodi_carlo"/>
<dcterms:language>ita</dcterms:language>
<dcterms:audience rdf:resource="http://www.cbt.trentinocultura.net/subject/opera_per_bambini"/>
<dcterms:isPartOf rdf:resource="http://www.cbt.trentinocultura.net/biblio/2378129373323" />
<dcterms:extent>186 p.</dcterms:extent>
<isbd:hasPlaceOfPublicationProductionDistribution>
Firenze
</isbd:hasPlaceOfPublicationProductionDistribution>
<dcterms:issued>1997</dcterms:issued>
<dcterms:publisher rdf:resource="http://www.cbt.trentinocultura.net/organisations/giunti"/>
</bibo:Book>
...the author...
<foaf:Person rdf:about="http://www.cbt.trentinocultura.net/person/collodi_carlo">
<foaf:name>Collodi, Carlo</foaf:name>
</foaf:Person>
...and the publisher
<foaf:Organization rdf:about="http://www.cbt.trentinocultura.net/organisations/giunti">
<foaf:name>Giunti</foaf:name>
</foaf:Organization>
20. Step 1: transform MARC in RDF
As a first step we need to transform the MARC records into their corresponding RDF
representation.
This presentation does not focus on that advanced topic; we will index just ten
MARC records to demonstrate the capabilities of the system.
We chose the RDF / XML format for expressing the resulting triples. This
will be the input data of the system.
[Diagram: MARC 21 → RDF / XML]
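Just to give a feeling of what this step involves, here is a very small sketch that maps three MARC subfields to the RDF / XML properties used on the previous slide. The `FIELD_MAP` table and the `marc_to_rdf` function are assumptions made for this example; a real conversion has to deal with many more fields, indicators and authority links.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "bibo": "http://purl.org/ontology/bibo/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

# Hypothetical mapping: (MARC tag, subfield) -> (prefix, property name).
FIELD_MAP = {
    ("020", "a"): ("bibo", "isbn10"),
    ("245", "a"): ("dcterms", "shortTitle"),
    ("260", "c"): ("dcterms", "issued"),
}


def qname(prefix, local):
    """Build the '{namespace}local' name that ElementTree expects."""
    return "{%s}%s" % (NS[prefix], local)


def marc_to_rdf(record_uri, subfields):
    """Serialize the mapped subfields of one record as an RDF / XML string."""
    root = ET.Element(qname("rdf", "RDF"))
    book = ET.SubElement(root, qname("bibo", "Book"),
                         {qname("rdf", "about"): record_uri})
    for key, value in subfields.items():
        if key in FIELD_MAP:
            prop = ET.SubElement(book, qname(*FIELD_MAP[key]))
            # MARC values carry ISBD punctuation ("/", ".", ":") at the end.
            prop.text = value.strip().rstrip(" /.,:;")
    return ET.tostring(root, encoding="unicode")


rdf_xml = marc_to_rdf(
    "http://www.cbt.trentinocultura.net/biblio/000002577949",
    {
        ("020", "a"): "880921191X",
        ("245", "a"): "Le avventure di Pinocchio /",
        ("260", "c"): "1997.",
    },
)
```

Subfields without an entry in the mapping are simply skipped, which is acceptable for a demo but would silently lose data in a real pipeline.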
21. Step 2: submit RDF data
The RDF data created in the previous step needs to be submitted to the
system.
RDF / XML
22. Step 3: make a search...
Autocompletion
Faceting
23. Step 4: more publisher data...
It would be great if my users could see
additional data on search results.
For example, I could ask publishers for data
(logo, homepage and so on)...for them it
could be a kind of advertisement, while my users would get
additional information displayed in my catalog.
But
1) I don't want that data to be part of my search index;
2) I don't want to include that data in my bibliographic database;
3) I don't want to reindex my data when some publisher information changes;
4) I would like to manage and improve that data without affecting searches.
24. Step 6: Our sample publisher
Before...
<foaf:Organization rdf:about="http://www.cbt.trentinocultura.net/organisations/giunti">
<foaf:name>Giunti</foaf:name>
</foaf:Organization>
...and after
<foaf:Organization rdf:about="http://www.cbt.trentinocultura.net/organisations/giunti">
<foaf:name>Giunti</foaf:name>
<foaf:logo rdf:resource="http://www.giunti.it/custom/src/@css/images/logo_Giunti.jpg"/>
<rdfs:comment>Fondata nel pieno delle battaglie risorgimentali...</rdfs:comment>
<foaf:mbox rdf:resource="mailto:contactsus@domain.it"/>
<foaf:homepage rdf:resource="http://www.giunti.it"/>
</foaf:Organization>
As you can see, we added a logo, a brief description of the publisher, a mailbox and a
homepage. We took this data directly from the publisher's website.
The data is submitted to the system again, but the search index is not rebuilt.
As a consequence, changes made to publisher data are immediately available.
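To see how small the change really is, the two descriptions can be read as sets of triples and compared. The sketch below does that with Python's standard XML parser; the `triples` helper is our own simplification, it only handles flat descriptions like the ones on this slide and ignores the type triple implied by the element name.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# The slide omits the namespace declarations, so we wrap each fragment.
HEADER = ('<rdf:RDF xmlns:rdf="%s" '
          'xmlns:foaf="http://xmlns.com/foaf/0.1/" '
          'xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">' % RDF)
PUBLISHER = "http://www.cbt.trentinocultura.net/organisations/giunti"


def triples(fragment):
    """Extract (subject, predicate, object) from flat RDF / XML descriptions."""
    root = ET.fromstring(HEADER + fragment + "</rdf:RDF>")
    found = set()
    for node in root:
        subject = node.get("{%s}about" % RDF)
        for prop in node:
            obj = prop.get("{%s}resource" % RDF) or (prop.text or "").strip()
            found.add((subject, prop.tag, obj))
    return found


before = """
<foaf:Organization rdf:about="%s">
<foaf:name>Giunti</foaf:name>
</foaf:Organization>""" % PUBLISHER

after = """
<foaf:Organization rdf:about="%s">
<foaf:name>Giunti</foaf:name>
<foaf:logo rdf:resource="http://www.giunti.it/custom/src/@css/images/logo_Giunti.jpg"/>
<rdfs:comment>Fondata nel pieno delle battaglie risorgimentali...</rdfs:comment>
<foaf:mbox rdf:resource="mailto:contactsus@domain.it"/>
<foaf:homepage rdf:resource="http://www.giunti.it"/>
</foaf:Organization>""" % PUBLISHER

# Only these statements are new, and only the triple store has to receive them.
added = triples(after) - triples(before)
```

The publisher name, which is the only publisher value that could have been indexed for searching, is unchanged, so the inverted index stays valid.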
25. Step 7: see additional data...
26. Step 7 bis: another publisher...
27. Step 8: still more (linked) data... (1/3)
Great! My users were enthusiastic!!
So I'd like more...and not only publishers...
but what else?
Sir, I think it would be very useful to show
author information beside each record.
Yes, it definitely would, but you have no idea how much work
it took to insert all the publisher data, and I don't
want to do the same for authors...too much work!
If I remember well, your system is
using Linked Data, isn't it?
Yes
So in this case the right question is not “How can I do it? I have no data”,
but “What kind of data would I like to show?”
???
28. Step 8: still more (linked) data...(2/3)
There are a lot of authoritative RDF endpoints that expose their data free of charge;
the main advantage is that you can link this information to your system and you
don't have to worry about its maintenance: it's not your data! See
http://viaf.org or http://dbpedia.org
By linking those resources, you can get data in a standardized way because the sources
share one or more (accepted) ontologies for describing authors, subjects,
things and so on...
So for the example above we need to gather additional information about people
(authors), and fortunately there's an ontology called Friend of a Friend (FOAF) that
fits our needs exactly. This ontology is widely used by RDF sources describing persons
(like VIAF and DBpedia).
In our example, instead of copying and storing in our triple store (as we did for
publishers) all the information about Carlo Collodi, the author of “The adventures of
Pinocchio”, we will simply link our internal representation to the same resource
as defined in DBpedia.
29. Step 8: still more (linked) data...(3/3)
30. Step 9: Our sample author
Before...
<foaf:Person rdf:about="http://www.cbt.trentinocultura.net/person/collodi_carlo">
<foaf:name>Collodi, Carlo</foaf:name>
</foaf:Person>
...and after
<foaf:Person rdf:about="http://www.cbt.trentinocultura.net/person/collodi_carlo">
<foaf:name>Collodi, Carlo</foaf:name>
<owl:sameAs rdf:resource="http://dbpedia.org/resource/Carlo_Collodi"/>
</foaf:Person>
As you can see, we didn't add any information, just a “link” expressed with the sameAs predicate.
The URI (http://dbpedia.org/resource/Carlo_Collodi) points to a web resource describing
Carlo Collodi, so we can gather that data and, for example, display it to the end user.
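Following the link could look more or less like this. It is a sketch under stated assumptions: `fetch_rdf` and `linked_descriptions` are hypothetical helpers, and we assume the remote endpoint honours HTTP content negotiation and returns RDF / XML when asked for it.

```python
import urllib.request

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
AUTHOR = "http://www.cbt.trentinocultura.net/person/collodi_carlo"


def fetch_rdf(uri, timeout=10):
    """Dereference a Linked Data URI, asking for RDF / XML (assumed to be supported)."""
    request = urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def linked_descriptions(local_triples, subject, fetch=fetch_rdf):
    """Fetch the remote description behind every owl:sameAs link of `subject`."""
    targets = [o for s, p, o in local_triples
               if s == subject and p == OWL_SAME_AS]
    return {target: fetch(target) for target in targets}


# The only author data we keep locally: a name and a link.
local_triples = [
    (AUTHOR, "http://xmlns.com/foaf/0.1/name", "Collodi, Carlo"),
    (AUTHOR, OWL_SAME_AS, "http://dbpedia.org/resource/Carlo_Collodi"),
]
```

Calling `linked_descriptions(local_triples, AUTHOR)` performs one HTTP request per link at display time. The `fetch` parameter is there so that a cache or a test stub can replace the network call.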
31. Step 10: again the same search...
32. Step 10 bis: another author...
33. Step 11: still more data??? yes!
Wow!! And now?
Is there some other content I could “link”?
Yes sir, subjects for example...are you using subjects
coming from the “Nuovo Soggettario”?
Yes
So in this case you can link those subjects directly
to the concepts of the thesaurus, giving
end users information like scope notes,
history notes, term relationships and so on...
And, as another example, for places you can link to GeoNames
resources, which provide RDF descriptions of cities and countries.
34. Step 12: Linking the “Nuovo Soggettario“
35. Step 13: Linking Firenze with Geonames
36. Agenda
Goals
Information Retrieval
Triple store
Proof of concept
Q&A
37. 31st ADLUG ANNUAL MEETING 2012
Sala Brunelleschi of the OPA – Firenze
19 – 21 September 2012
Linking Linked Data
Thank You!