Brainiak is a new semantic data management platform being developed by Globo to address problems with their legacy linked data architecture. It features a RESTful API to access and manage semantic data. This decouples applications from the triplestore and improves performance. Brainiak will enable Globo to enrich search, improve annotation and content relationships, and link data to external sources like DBpedia. It has the potential to enhance the user experience on Globo's websites.
SQL For Programmers -- Boston Big Data Techcon April 27th, by Dave Stokes
SQL For Programmers is an introduction to SQL concepts, when SQL is a better choice, and a look at the future of databases. Presented April 27th, 2015 at Big Data Techcon Boston
HiBISCuS: Hypergraph-Based Source Selection for SPARQL Endpoint Federation, by Muhammad Saleem
Efficient federated query processing is of significant importance to tame the large amount of data available on the Web of Data. Previous works have focused on generating optimized query execution plans for fast result retrieval. However, devising source selection approaches beyond triple pattern-wise source selection has not received much attention. This work presents HiBISCuS, a novel hypergraph-based source selection approach to federated SPARQL querying. Our approach can be directly combined with existing SPARQL query federation engines to achieve the same recall while querying fewer data sources. We extend three well-known SPARQL query federation engines with HiBISCuS and compare our extensions with the original approaches on FedBench. Our evaluation shows that HiBISCuS can efficiently reduce the total number of sources selected without losing recall. Moreover, our approach significantly reduces the execution time of the selected engines on most of the benchmark queries.
Relational databases have been the center of the world for many years, although they impose a predefined schema you have to adhere to. Now you have a choice: a NoSQL database.
OrientDB is a NoSQL, multimodel and amazingly fast database since it can store 220,000 records per second on common hardware. This talk will show you some graph theory and the main advantages of using a graph database such as OrientDB.
Just a few years ago, knowledge graphs were the domain of academic papers; today they underpin the natural language capabilities of Alexa, Siri, Cortana and Google Now. Graphs are a natural fit for this use case: treating every data item as equivalent and embracing rapid schema mutation. For the past few years, Thomson Reuters has been building a professional information knowledge graph to power our next generation of products. Our graph is RDF based, fast growing and supports a number of different products and user experiences. In this session, Dan will cover our experiences, architecture, tools and lessons learned from building, integrating and maintaining a 100bn triple graph.
Facebook made its Marketing and Ads API generally available in 2015, empowering developers to manage and potentially optimize marketing actions in a programmatic and automated way. CARD.com released an open-source R package to interact with this API and facilitate rapid development around data-driven ad management.
This talk will feature hands-on examples and a complete case study, including:
-Authenticating with the Facebook API.
-Creating and defining a new custom audience and target specs.
-Starting a new campaign, ad set, and ad group with multiple creatives for A/B testing.
-Analyzing the results of an ad.
Domino app ➟ https://app.dominodatalab.com/daroczi...
GH repo ➟ https://github.com/cardcorp/fbRads
Slides presenting some numbers from the PythonBrasil[8] conference (PyCon Brasil), which took place in Rio de Janeiro in November 2012. Authors: @tati_alchueyr and @turicas
Transifex: Teaching your Public Software to speak new languages, by Tatiana Al-Chueyr
(Portuguese)
Presentation related to Transifex.net, Public Software Portal and InVesalius. It shows the improvements in the translation process of InVesalius after using Transifex.
Here is the 3-minute Lesson Plan I prepared... just practicing embedding a PP, using slideshare..... Wish me luck!
By the way, what excellent lesson plans! I got so many useful ideas, starting with aliens and ending with Deutsch rappers... and thanks, Alana, for saving my butt and helping me with the presentation!
M.
Presentation about some common mistakes English learners make, and how some of them (spelling, capitalization and articles) can be identified automatically. This presentation was given at PyCon SK on the 12th of March 2016. Many of the results are due to the partnership between the University of Cambridge and Education First.
Developing mobile applications with Python and Android, by Tatiana Al-Chueyr
Talk presented at PyConAr 2011 (Junín, Argentina) on how to develop mobile applications with Python and Android.
The example code can be downloaded from:
http://github.com/tatiana/pyandroid
Open Data and News Analytics Demo from the 4th Sofia Open Data & Linked Data meetup
http://www.meetup.com/Sofia-Open-Data-Linked-Data-Meetup/events/228747999/
Mar'2016, Sofia | BG
[Webinar] FactForge Debuts: Trump World Data and Instant Ranking of Industry ..., by Ontotext
This webinar continues a series demonstrating how linked open data and semantic tagging of news can be used for comprehensive media monitoring, market and business intelligence. The platform for the demonstrations is FactForge: a hub for news and data about people, organizations, and locations (POL). FactForge embodies a big knowledge graph (BKG) of more than 1 billion facts that allows various analytical queries, including tracing suspicious patterns of company control and media monitoring of people, including companies owned by them, their subsidiaries, etc.
Boost your data analytics with open data and public news content, by Ontotext
Get guidance through the gigantic sea of freely available Open Data and learn how it can empower your analysis of any kind of source.
This webinar is a live demo of news and data analytics, based on rich links within big knowledge graphs. It will show you how to:
Build ranking reports (e.g. for people and organisations)
View topics linked implicitly (e.g. daughter companies, key personnel, products …)
Draw trend lines
Extend your analytics with additional data sources
Simple fuzzy name matching in Elasticsearch (Paris meetup), by Basis Technology
These are the slides presented during the Elasticsearch meetup in Paris on July 29th.
Normalization is crucial to high quality search results -- who wants irrelevant variations between queries and documents leading to missed hits (e.g., “celebrity” v. “celebrities”)? Normalizing dictionary words works, but what if your application focuses on names? Whether you’re tackling log analysis, e-commerce, watch list screening or other applications, names are often the key. Can you find “Abdul Jabbar, Karim” if you search for “Kareem AbdalJabar” or “كريم عبد الجبار”?
Applications using Elasticsearch provide some fuzziness by mixing its built-in edit-distance matching and phonetic analysis with more generic analyzers and filters. We’ve tried to go beyond that to provide both better matching and a simpler integration. We use a custom Mapper and Score Function so that linguistic nuances can be handled behind the scenes. We’ll talk about how we built this sort of plug-in for Rosette, its customization, and its connection to the broader trend of entity-centric search.
How to Reveal Hidden Relationships in Data and Risk Analytics, by Ontotext
Imagine a risk analysis manager or compliance officer who can easily discover relationships like this: Big Bucks Café out of Seattle controls My Local Café in NYC through an offshore company. Such a discovery can be a game changer if My Local Café pretends to be an independent small enterprise while Big Bucks has recently been experiencing financial difficulties.
Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking, by Ontotext
A presentation of Ontotext’s CEO Atanas Kiryakov, given during Semantics 2018 - an annual conference that brings together researchers and professionals from all over the world to share knowledge and expertise on semantic computing.
Market analysis through Consumer Behavior Pattern Insights, by CARTO
In this webinar, in partnership with Safegraph, you will learn how to use spatial analysis and leading POI data to drive superior market analysis workflows.
Watch the recorded webinar at: https://go.carto.com/webinars/safegraph-market-analysis-recorded
[Conference] Cognitive Graph Analytics on Company Data and News, by Ontotext
Atanas Kiryakov, Ontotext's CEO, presented at the Data Day Texas 2018 conference, which took place in Austin, TX, USA, on January 27th.
Ontotext's talk was part of the Graph Day Sessions and its focus was 'Cognitive graph analytics on company data and news', aiming to demonstrate the power of Graph Analytics to create links between various datasets and lead to knowledge discovery.
10 best platforms to find free datasets, by Aparna Sharma
If “the data is the new oil” then there is a lot of free oil just waiting to be used. And you can do some pretty interesting things with that data, like finding the answer to the question: Is Buffalo, New York really that cold in the winter?
There is plenty of free data out there, ready to be used for school projects, market research, or just for fun. Before you go crazy, however, you should be aware of the quality of the data you find. Here are some great sources of free data and some ways to determine their quality.
All of these dataset sources have strengths, weaknesses, and specialties. All in all, they are great resources, and you can spend a lot of time going down rabbit holes.
But if you want to stay focused and find what you need, it’s important to understand the nuances of each source and use their strengths to your advantage.
Semantic SEO in the post Hummingbird Era and WordLift, by Andrea Volpini
This presentation is focused on Semantic SEO techniques, the importance of curating structured data and the new Google search algorithm called Hummingbird.
The hummingbird is a very fast and accurate bird; in the world of search engines the changes introduced by this new algorithm are enormous. Google begins to understand the search intent, using natural language processing (NLP), and provides answers instead of the traditional list of blue links.
In this new scenario, it becomes crucial to "curate" your blog contents and link them with publicly available datasets.
WordLift, a WordPress plugin (soon available in its version 3.0), allows us to publish content as Linked Open Data and connects these datasets with the Knowledge Graph of Google (the knowledge base Google uses to "respond" to users' queries).
These slides, made by Kim Renberg and Andrea Volpini, were presented (in Italian) at the WordPress Meetup in Rome (#wproma on Twitter).
A Day in the Life of a Functional Data Scientist, by C4Media
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/1tnEpjY.
Richard Minerich explains how ideas and tools from functional programming can save time, prevent subtle mistakes in data science, and how he incorporates them into his everyday workflow. Filmed at qconnewyork.com.
Richard Minerich works tirelessly at Bayard Rock to apply cutting edge research to anti-money laundering and fraud while using typed functional programming whenever possible. As an F# MVP he's been running events, speaking, and writing for over five years.
Building search and discovery services for Schibsted (LSRS '17), by Sandra Garcia
Presentation given at the Large Scale Recommender Systems workshop (LSRS) in Recsys 2017.
This presentation describes the search and discovery products we are working on in Schibsted for the domains of news and marketplaces as well as the challenges within each of these domains. It also covers how we bring these services into production including the system architecture and deployment process.
Similar to Rio info 2013 - Linked Data at Globo.com (20)
Talk given at the London AICamp meet up on the 13 July 2023. It's an introduction on building open-source ChatGPT-like chat bots and some of the considerations to have while training/tuning them using Airflow.
From an idea to production: building a recommender for BBC Sounds, by Tatiana Al-Chueyr
This presentation was given on the 28th of September 2021 at the first MLOps London Meetup
Event website: https://www.meetup.com/mlopslondon/events/280295841/
Presentation given on the 21st of September 2021 at the London Beam Meet-up
Event website: https://www.meetup.com/London-Apache-Beam-Meetup/events/280442419/
Presentation given on the 15th July 2021 at the Airflow Summit 2021
Conference website: https://airflowsummit.org/sessions/2021/clearing-airflow-obstructions/
Recording: https://www.crowdcast.io/e/airflowsummit2021/40
Artificial intelligence is breaking into our lives. In the future everything will probably be clear, but for now questions have arisen, and they increasingly touch on morality and ethics. Which principles do we need to keep in mind while surfacing machine learning algorithms? How does the editorial team affect the day-to-day development of applications at the BBC?
Place: Kharkiv National University of Radio Electronics, Ukraine
When: 17th November 2019.
Presented at PyCon UK 2018 (18 September 2018, Cardiff).
The slides are incomplete.
Recording available at:
https://www.youtube.com/watch?v=-weU0Zy4Yd8
16. Isabella Nardoni was killed on 29 March 2008 in the North Zone of São Paulo (Photo: Reproduction)
Isabella de Oliveira Nardoni, aged 5, was killed on the night of 29 March 2008. The forensic investigation concluded that the girl was thrown from the sixth floor of the building where her father, Alexandre Nardoni, her stepmother, Anna Carolina Jatobá, and the couple's two young children lived, in Vila Isolina Mazzei, in the north zone of São Paulo.
Isabella's grave becomes a place of visitation in SP; the Nardoni couple is in prison.
Isabella Nardoni case
Juliana Cardilli, G1 SP
RDF
FOAF
GEO
Dublin Core
SKOS
Semantic markup in web pages
Motivation
24. Outcomes
● Replacing words with entities improved:
○ Finding
○ Linking
○ Reconciling
○ Organizing
multiple layers of information
25. Outcomes
● Flexible ways to organize content
● Ease of finding related issues
● Explicit relations derived from annotated content
● Up-to-date topic pages with little editorial effort
● Linking content across different web products
● Seamless navigation leading to flow state
26. Status Quo
Used by the main web products of Globo.com:
○ 18,485 organizations
○ 83,000 people
○ 9,129 places
○ 1,000,000+ annotated news articles
In total, 2,500,000+ entities,
from August 2010 to May 2013
30. Poor data management
○ direct (unmanaged) access to the triple store
○ difficulty sharing data (distributed DBs)
○ need to re-sync the triple store and the search engine index
○ scalability of the triple store
○ high entropy in distributed ontology engineering
Problems
34. Semantic as a library
○ many different versions in production
○ programming language dependent
○ steep learning curve for RDF/OWL/SPARQL
Problems
35. Create an open semantic data management platform
● Scalable
● Mobile and Web friendly
● Interconnect Globo's data with external data sources
● Automate content extraction (including NER)
Solution
39. Requirements
● Indirect usage of SPARQL
● Programming language independent
● Data management with quality
● Finer-grained authorization and authentication
● Isolate applications from triplestore
● Improve triplestore performance
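One way to picture "indirect usage of SPARQL" and "isolate applications from triplestore" is a thin API layer that turns a plain HTTP path into the pieces a query needs. The sketch below is only an illustration: the path scheme `/<context>/<Class>` and the `page` / `per_page` parameters are hypothetical, not Brainiak's documented API; only the `http://data.globo.com/` namespace is taken from the slides.

```python
from urllib.parse import urlsplit, parse_qs

# Namespace seen in the slides' queries; everything else here is assumed.
BASE_URI = "http://data.globo.com/"


def parse_resource_request(url_path):
    """Map a hypothetical REST path such as '/sports/Team?page=2&per_page=10'
    to the graph URI, class URI, LIMIT and OFFSET a backend would need."""
    parts = urlsplit(url_path)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 2:
        raise ValueError("expected a path of the form /<context>/<Class>")
    context, class_name = segments

    # parse_qs returns lists of values; fall back to defaults when absent.
    params = parse_qs(parts.query)
    page = int(params.get("page", ["1"])[0])
    per_page = int(params.get("per_page", ["10"])[0])
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")

    graph_uri = f"{BASE_URI}{context}/"
    return {
        "graph_uri": graph_uri,
        "class_uri": f"{graph_uri}{class_name}",
        "limit": per_page,
        "offset": (page - 1) * per_page,
    }
```

With a layer like this, a client written in any language only needs to issue an HTTP request; it never sees SPARQL or the triplestore's address.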
40. SPARQL query
DEFINE input:inference <http://data.globo.com/ruleset>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?uri ?label
FROM <http://data.globo.com/sports/>
WHERE
{
  ?uri a <http://data.globo.com/sports/Team> ;
       rdfs:label ?label .
}
LIMIT 10
OFFSET 0
task: list all sports teams
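A query like this is the kind of text a management layer could generate on the application's behalf. The following is a minimal sketch, assuming a hypothetical helper named `build_list_query`; it leaves out the `DEFINE input:inference` line, which appears to be specific to the triplestore in use, and performs only a basic check that the URIs cannot break out of the angle brackets.

```python
import re

# Reject characters that are not allowed inside a SPARQL IRI reference.
_SAFE_URI = re.compile(r'^[^<>"{}|^`\\\s]+$')


def build_list_query(graph_uri, class_uri, limit=10, offset=0):
    """Return a SPARQL query listing instances of class_uri with their labels.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    for uri in (graph_uri, class_uri):
        if not _SAFE_URI.match(uri):
            raise ValueError(f"unsafe URI: {uri!r}")
    limit, offset = int(limit), int(offset)
    if limit < 1 or offset < 0:
        raise ValueError("limit must be >= 1 and offset >= 0")

    return "\n".join([
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT ?uri ?label",
        f"FROM <{graph_uri}>",
        "WHERE {",
        f"  ?uri a <{class_uri}> ;",
        "       rdfs:label ?label .",
        "}",
        f"LIMIT {limit}",
        f"OFFSET {offset}",
    ])


if __name__ == "__main__":
    print(build_list_query("http://data.globo.com/sports/",
                           "http://data.globo.com/sports/Team"))
```

Keeping query construction in one place means pagination and graph selection can change without touching the applications that consume the data.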
44. SPARQL query
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?class
WHERE {
  <http://data.globo.com/place/City> rdfs:subClassOf ?class
    OPTION (TRANSITIVE, t_distinct, t_step('step_no') as ?n, t_min (0)) .
  ?class a owl:Class .
}
task: retrieve all superclasses of a class
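What the transitive option computes can be illustrated outside the triplestore as a breadth-first walk over direct `rdfs:subClassOf` links. The sketch below is plain Python over an in-memory dictionary, not the triplestore's implementation; the class hierarchy is invented for illustration, and it assumes that `t_min (0)` means the starting class itself is part of the result. It does not reproduce the `?class a owl:Class` filter.

```python
from collections import deque


def superclasses(start, direct_superclasses):
    """Return start followed by every class reachable through
    direct subclass-of links, in breadth-first order, without repeats."""
    ordered = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for parent in direct_superclasses.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                ordered.append(parent)
                queue.append(parent)
    return ordered


if __name__ == "__main__":
    # Hypothetical hierarchy, for illustration only.
    hierarchy = {
        "place:City": ["place:Place"],
        "place:Place": ["base:Thing"],
    }
    print(superclasses("place:City", hierarchy))
```

Queries of this shape are a good example of why the slides ask for indirect usage of SPARQL: the vendor-specific syntax stays behind the API, and clients only ask for "the superclasses of City".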
48. ● Enrich Globo.com search
● SEO (automatic schema.org)
● Improve annotator (DBpedia Spotlight)
● Richer content relationships (inference)
● Link to open data (e.g. DBpedia, dados.gov.br)
Next steps