This document provides an overview of Linked Data efforts at Globo.com, including motivation, current status, architecture, and plans for a new open semantic data management platform called Brainiak. Brainiak aims to provide a scalable and mobile-friendly API for managing and interconnecting Globo's semantic data with external sources to facilitate flexible organization of content and seamless navigation across products.
1. Linked Data at globo.com
Semantic Team
semantica@corp.globo.com
Tatiana Al-Chueyr and Rodrigo D. A. Senra
{tatiana.martins, rodrigo.senra}@corp.globo.com
8. Isabella Nardoni was killed on March 29, 2008, in the North Zone of São Paulo (Photo: Reproduction)
Isabella de Oliveira Nardoni, aged 5, was killed on the night of March 29, 2008. The forensic investigation concluded that the girl was thrown from the sixth floor of the building where her father, Alexandre Nardoni, her stepmother, Anna Carolina Jatobá, and the couple's two young children lived, in Vila Isolina Mazzei, in the north zone of São Paulo.
Isabella's grave becomes a visiting site in SP; the Nardoni couple is in prison.
Isabella Nardoni case
Juliana Cardilli
G1 SP
Semantic markup in web pages: RDF, FOAF, GEO, Dublin Core, SKOS
Motivation
13. Outcomes
● Flexible ways to organize content
● Ease of finding related issues
● Explicit relations derived from annotated content
● Up-to-date topic pages with little editorial effort
● Linking content across different web products
● Seamless navigation leading to flow state
14. Status Quo
Used by the main web products of Globo.com
linking, among others:
○ 18,485 organizations
○ 82,386 people
○ 9,129 places
○ 1,000,000+ annotated news items
from August 2010 to May 2013
17. Poor data management
○ direct access to triple store (unmanaged)
○ difficulty sharing data (distributed DBs)
○ need to re-sync the triple store and the search engine index
○ scalability of the triple store
○ high entropy in distributed ontology engineering
Problems
21. Semantic as a library
○ many different versions in production
○ programming language dependent
○ steep learning curve for RDF/OWL/SPARQL
Problems
22. Create an open semantic data management platform
● Scalable
● Mobile and Web friendly
● Interconnect Globo's data with external data sources
● Automate content extraction (including NER)
Next Step
26. Requirements
● Indirect usage of SPARQL
● Programming language independent
● Data management with quality
● Finer-grained authorization and authentication
● Isolate applications from triplestore
● Improve triplestore performance
27. SPARQL query
DEFINE input:inference <http://data.globo.com/ruleset>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?uri ?label
FROM <http://data.globo.com/sports/>
WHERE {
  ?uri a <http://data.globo.com/sports/Team> ;
       rdfs:label ?label .
}
LIMIT 10
OFFSET 0
task: list all sports teams
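The "indirect usage of SPARQL" requirement suggests that clients supply a few parameters and the platform renders a query like the one above. The following is a minimal sketch of that idea, not the platform's actual code: `ListQuery` and `build_list_query` are hypothetical names introduced here, and only the URIs come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListQuery:
    """Parameters a client would supply instead of writing SPARQL by hand.

    Hypothetical structure, introduced for illustration only.
    """
    graph_uri: str
    class_uri: str
    ruleset_uri: str
    limit: int = 10
    offset: int = 0


def build_list_query(q: ListQuery) -> str:
    """Render the 'list instances of a class' query shown on the slide."""
    for uri in (q.graph_uri, q.class_uri, q.ruleset_uri):
        # Reject characters that could break out of the <...> IRI delimiters.
        if any(ch in uri for ch in '<>"{} \n'):
            raise ValueError(f"unsafe URI: {uri!r}")
    if q.limit < 0 or q.offset < 0:
        raise ValueError("limit and offset must be non-negative")
    return "\n".join([
        f"DEFINE input:inference <{q.ruleset_uri}>",
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT ?uri ?label",
        f"FROM <{q.graph_uri}>",
        "WHERE {",
        f"  ?uri a <{q.class_uri}> ;",
        "       rdfs:label ?label .",
        "}",
        f"LIMIT {q.limit}",
        f"OFFSET {q.offset}",
    ])


# Reproduce the slide's query: list all sports teams.
teams_query = build_list_query(ListQuery(
    graph_uri="http://data.globo.com/sports/",
    class_uri="http://data.globo.com/sports/Team",
    ruleset_uri="http://data.globo.com/ruleset",
))
print(teams_query)
```

Keeping query construction in one place like this is one way to isolate applications from the triplestore: callers never see SPARQL, so the query dialect can change without touching them.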
40. Hypermedia
● Flexibility and programmatic adaptation
● Semantic affordances
● Client has to understand what is consumed
● "Hypermedia APIs are not fully baked yet"
42. Services
● List Contexts
● List Collections
● Get a Schema
● List Prefixes
● Status of Services
● Create
● Retrieve
● Delete
● Edit
● List
Instances
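One way to picture the services above is as a table from service name to HTTP method and URL template. The sketch below is an assumption made for illustration: the slides name the services but not their routes, so every path, the `ROUTES` table, and the `resolve` helper are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical mapping from the listed services to HTTP verbs and URL
# templates. The slides do not give the real routes; these are invented.
ROUTES: Dict[str, Tuple[str, str]] = {
    "list_contexts":     ("GET",    "/"),
    "list_collections":  ("GET",    "/{context}/"),
    "get_schema":        ("GET",    "/{context}/{collection}/_schema"),
    "list_prefixes":     ("GET",    "/_prefixes"),
    "status":            ("GET",    "/_status"),
    "list_instances":    ("GET",    "/{context}/{collection}/"),
    "create_instance":   ("POST",   "/{context}/{collection}/"),
    "retrieve_instance": ("GET",    "/{context}/{collection}/{instance}"),
    "edit_instance":     ("PUT",    "/{context}/{collection}/{instance}"),
    "delete_instance":   ("DELETE", "/{context}/{collection}/{instance}"),
}


def resolve(service: str, **params: str) -> Tuple[str, str]:
    """Return (HTTP method, path) for a service name and its parameters.

    Raises KeyError if the service is unknown or a parameter is missing.
    """
    method, template = ROUTES[service]
    return method, template.format(**params)


method, path = resolve(
    "retrieve_instance", context="sports", collection="Team", instance="42"
)
print(method, path)
```

Because the interface is plain HTTP, any client language can use it, which matches the "programming language independent" requirement.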
49. SPARQL query
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?class
WHERE {
  <http://data.globo.com/place/City> rdfs:subClassOf ?class
    OPTION (TRANSITIVE, t_distinct, t_step('step_no') as ?n, t_min(0)) .
  ?class a owl:Class .
}
task: retrieve all superclasses of a class
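The transitive option in the query above asks the store to follow rdfs:subClassOf links repeatedly. The same traversal can be sketched in memory as a breadth-first search. This is an illustration under stated assumptions: the toy hierarchy is invented, `superclasses` is a hypothetical helper, t_min(0) is read as "include the starting class itself", and the `?class a owl:Class` filter is omitted.

```python
from collections import deque
from typing import Dict, List, Set


def superclasses(start: str, sub_class_of: Dict[str, List[str]]) -> List[str]:
    """Return `start` and every class reachable via subClassOf edges.

    Classes appear in breadth-first order and each appears once, which
    plays the role of t_distinct in the query.
    """
    seen: Set[str] = {start}
    order: List[str] = [start]  # zero steps: the class itself (t_min(0))
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for parent in sub_class_of.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


# Invented toy hierarchy; not taken from Globo's ontologies.
edges = {
    "place:City": ["place:PopulatedPlace"],
    "place:PopulatedPlace": ["place:Place"],
    "place:Place": ["owl:Thing"],
}
result = superclasses("place:City", edges)
print(result)
```

The `seen` set also guards against cycles in the hierarchy, so the traversal terminates even if two classes are declared subclasses of each other.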
53. Next steps
● SEO (automatic schema.org)
● Improved annotator (DBpedia Spotlight)
● Richer content relationships (inference)
● Link to open data (e.g. DBpedia, dados.gov.br)