The document describes the Open Land Use Map (OLU) project which aims to create harmonized land use maps by integrating heterogeneous land use and land cover data from different sources into a common INSPIRE compliant data model. It also describes the Smart Points of Interest (SPOI) project which creates a seamless open dataset of points of interest (POIs) by combining data from various sources and representing it in a linked data format with semantic descriptions and classifications. The SPOI data is accessible through OGC services and a SPARQL endpoint.
Wherecamp Navigation Conference 2015 - SPOI (SDI4Apps: Points of Interest) - WhereCampBerlin
The document summarizes the SPOI (SDI4Apps: Points of Interest) data set, which harmonizes data from various sources to provide open tourism and travel point of interest data. The SPOI data set follows Linked Data principles and is published as RDF. It contains over 4.7 million POIs from many countries categorized using a common vocabulary. The data set and its development aim to improve accessibility and usability of geospatial tourism data through open standards.
Data Integration & Disintegration: Managing SN SciGraph with SHACL and OWL - Tony Hammond
A presentation on 23 October 2017 by Tony Hammond, Michele Pasin and Evangelos Theodoridis to the International Semantic Web Conference (ISWC) 2017 Industry Track on managing Springer Nature (SN) SciGraph with SHACL and OWL. See http://scigraph.com/ for more information on the project.
This document discusses linked spatial data and spatial data infrastructures. It provides examples of using URIs to represent spatial things and linking spatial datasets. Key points discussed include:
1. Using URIs and HTTP to identify spatial things like locations and allowing information about those things to be retrieved in different formats like RDF and GML.
2. Examples of using linked spatial data for tasks like looking up information, identifying locations, linking datasets, and querying spatial relationships between objects.
3. Initiatives to link spatial metadata standards like ISO19115 to open data schemas like DCAT-AP to make spatial data more accessible on the web.
4. Revenue models for linked data providers including public funding, advertisements, and
This document discusses using the GT.M database to store and query geospatial data from OpenStreetMap. It describes how OpenStreetMap contains large amounts of map data that is currently stored in PostgreSQL but could benefit from GT.M's capabilities for querying data by tag or spatial area more efficiently. The document provides examples of how the OpenStreetMap data schema could be represented within GT.M using its key-value data storage model and indexing capabilities.
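The key-value layout described above can be illustrated with a small sketch. GT.M stores data in hierarchical key-value trees ("global arrays"); here a plain Python dict with tuple keys stands in for a global, and all schema names (the node subscripts, the tag index) are illustrative assumptions, not the document's actual schema.

```python
# Sketch of how OSM nodes might be laid out in a GT.M-style global array.
# A Python dict with tuple keys stands in for the hierarchical global;
# all names are illustrative.

osm = {}  # stands in for a ^OSM global

def store_node(node_id, lat, lon, tags):
    """Store a node under ("node", id, ...) plus a secondary tag index."""
    osm[("node", node_id, "lat")] = lat
    osm[("node", node_id, "lon")] = lon
    for key, value in tags.items():
        osm[("node", node_id, "tag", key)] = value
        # secondary index ("tagindex", key, value, id) enables lookup by tag,
        # much like a GT.M cross-reference global
        osm[("tagindex", key, value, node_id)] = ""

def nodes_with_tag(key, value):
    """Scan the tag index, as a GT.M $ORDER loop over subscripts would."""
    return [k[3] for k in osm
            if k[0] == "tagindex" and k[1] == key and k[2] == value]

store_node(1, 50.08, 14.43, {"amenity": "cafe"})
store_node(2, 50.09, 14.44, {"amenity": "pub"})
```

The point of the secondary index is that a tag query never touches the per-node records; it walks only the keys that share the `("tagindex", key, value, ...)` prefix.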
Presented at: JIST2015, Yichang, China
Prototype: http://rc.lodac.nii.ac.jp/rdf4u/
Video: https://www.youtube.com/watch?v=z3roA9-Cp8g
Abstract: The Semantic Web and Linked Open Data (LOD) are powerful technologies for knowledge management, and explicit knowledge is expected to be represented in RDF (Resource Description Framework), yet most users never touch RDF because of the technical skills it demands. Since a concept map or node-link diagram can enhance learning from beginner to advanced level, RDF graph visualization is a natural tool for familiarizing users with semantic technology. However, an RDF graph generated from a whole query result is hard to read: it is highly connected, like a hairball, and poorly organized. To make a graph that presents knowledge more readable, this research introduces an approach that sparsifies the graph by combining three main functions: graph simplification, triple ranking, and property selection. These functions interpret RDF data as knowledge units and apply statistical analysis in order to deliver an easily readable graph to users. A prototype demonstrates the suitability and feasibility of the approach: the simple, flexible graph visualization is easy to read and makes a strong impression on users, and the tool helps users appreciate the advantages of linked data for knowledge management.
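The triple-ranking step in the abstract can be sketched with a toy scoring rule: rank triples by how rare their predicate is, so that common, hairball-forming properties are pruned first. The frequency-based score below is an illustrative assumption, not the paper's actual ranking formula.

```python
# Toy "triple ranking" for graph sparsification: keep the triples whose
# predicates are least frequent, on the assumption that rare properties
# carry more distinctive information. The scoring rule is illustrative.
from collections import Counter

def rank_triples(triples, keep):
    """Return the `keep` triples with the least frequent predicates."""
    freq = Counter(p for _, p, _ in triples)
    ranked = sorted(triples, key=lambda t: freq[t[1]])
    return ranked[:keep]

triples = [
    ("Tokyo", "type", "City"),
    ("Kyoto", "type", "City"),
    ("Osaka", "type", "City"),
    ("Tokyo", "capitalOf", "Japan"),
]
sparse = rank_triples(triples, keep=1)  # the rare "capitalOf" triple wins
```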
M/DB and M/DB:X are open source NoSQL databases based on GT.M. M/DB emulates the Amazon SimpleDB API and data model, allowing use of SimpleDB-compatible clients on premise. M/DB:X provides a native XML database with DOM and XPath APIs that can store and retrieve XML documents in JSON or XML format using the SimpleDB security model. Both leverage the high performance and scalability of the underlying GT.M database.
Managing and querying large data sets using Data Factory, Cosmos DB and Azure... - Marc Duiker
Slides of the Cosmos DB session for the Global Azure bootcamp held at Xebia Amsterdam on the 21st of April 2018.
Related GitHub repo: https://github.com/XpiritBV/GABC2018_HandsOnLabs/tree/master/Cosmos
CEDAR & PRELIDA Preservation of Linked Socio-Historical Data - PRELIDA Project
by Albert Meroño, presented at the 3rd PRELIDA Consolidation and Dissemination Workshop, Riva, Italy, October 17, 2014. More information about the workshop at: prelida.eu
Improving D3 Performance with CANVAS and other Hacks - Philip Tellis
This document discusses techniques for improving the performance of D3 visualizations. It begins with an overview of D3 and some basic tutorials. It then describes issues with performance for force-directed layouts and edge-bundled layouts as the number of nodes and links increases. Solutions proposed include using canvas instead of SVG for rendering, reducing unnecessary calculations, and caching repeated drawing states. The document concludes that the number of DOM nodes has major performance implications and techniques like canvas can help when exact mouse interactions are not required.
The document summarizes Yandex's academic initiatives including:
1) The Yandex School of Data Analysis, a two-year master's program in data analysis.
2) Monthly scientific seminars on data analysis and information retrieval organized by Microsoft Research and Yandex.
3) The Internet Mathematics Competition (IMAT) which involves machine learning tasks such as web search ranking and traffic prediction.
4) The Russian Initiative on Scholarly Search and Information Retrieval (RuSSIR) which is an annual IR conference.
RSP-QL*: Querying Data-Level Annotations in RDF Streams - keski
This document proposes an extension to RSP-QL called RSP-QL* that allows querying of statement-level annotations in RDF streams. RSP-QL* uses the RDF* model, which allows embedding RDF triples as the subject or object of other triples. This provides an efficient way to represent statement-level metadata in RDF. The semantics of RSP-QL are extended to support RSP-QL* patterns, which can include basic graph patterns, named graphs, windows and other operators. Future work includes adding more functionality to the RDF* model, prototyping an implementation, and evaluating performance.
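The RDF* idea at the core of the proposal can be illustrated with a minimal data-model sketch: a triple itself appears as the subject of another triple, which is a natural fit for statement-level annotations such as timestamps on stream elements. Plain Python tuples stand in for RDF terms here; this is a sketch of the data model, not an RSP-QL* implementation, and all names are made up.

```python
# Minimal illustration of the RDF* embedded-triple idea: a base triple is
# used as the subject of annotation triples. Tuples stand in for RDF terms;
# all identifiers are illustrative.

obs = ("sensor1", "hasTemperature", 21.5)          # base (data-level) triple
annotated = [
    (obs, "generatedAt", "2018-04-21T10:00:00Z"),  # triple as subject
    (obs, "confidence", 0.9),
]

def annotations_of(triple, graph):
    """Collect predicate/object pairs whose subject is the embedded triple."""
    return {p: o for s, p, o in graph if s == triple}
```

Compared with standard RDF reification, which needs four extra triples per statement, the embedded form keeps one annotation per triple, which is the efficiency argument the abstract refers to.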
Finding Insights In Connected Data: Using Graph Databases In Journalism - William Lyon
When dealing with datasets, journalists have many options to choose from when moving beyond Excel. Usually the first step is using a relational (or SQL) database. While a relational database can be a good choice for some datasets, data analysts today turn to new tools to gain deeper insight. This talk will show how we can use a graph database to analyze highly connected data using examples from U.S. Congressional data and political email archives. Using the U.S. Congress data, we’ll show you how to explore the dataset using Cypher, the Neo4j query language, to discover legislator activity including bill sponsorship and voting activity. Building up our knowledge of Cypher as we progress, we’ll show how you can use principles from social network analysis to find influential legislators and discover what topics legislators have influence over. Finally, we will examine how to draw insights from the Hillary Clinton email dataset, released as part of a FOIA request earlier this year. We will explore this dataset as a graph of interactions among users, answering questions like: Who is communicating with Hillary the most? What are the topics of these emails? You’ll learn how to visualize these using the Neo4j browser to quickly make sense of the data as we are exploring.
The goal of this talk is to provide a demonstration of database tools that any journalist can use to explore datasets and draw insights from connected datasets.
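The "influential legislators" analysis mentioned above can be reduced to a back-of-the-envelope sketch: degree centrality over a bipartite legislator-to-bill sponsorship graph. The real talk uses Neo4j and Cypher; this pure-Python version with made-up data only shows the shape of the computation.

```python
# Degree centrality over a legislator -> bill sponsorship graph, the simplest
# of the social-network-analysis measures the talk discusses. The data is
# illustrative, not real Congressional data.
from collections import defaultdict

sponsorships = [  # (legislator, bill)
    ("Smith", "HR1"), ("Smith", "HR2"), ("Smith", "HR3"),
    ("Jones", "HR1"), ("Lee", "HR2"),
]

def most_active(edges):
    """Return the legislator sponsoring the most bills (highest degree)."""
    degree = defaultdict(int)
    for legislator, _bill in edges:
        degree[legislator] += 1
    return max(degree, key=degree.get)
```

In Cypher the same question is a one-liner matching the sponsorship relationship and ordering by `count(*)`; the point of the graph database is that such traversals stay cheap as the data grows more connected.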
Adventures in Linked Data Land (presentation by Richard Light) - jottevanger
"Adventures in Linked Data Land: bringing RDF to the Wordsworth Trust" is a paper given by RIchard Light (http://uk.linkedin.com/pub/richard-light/a/221/ba5) to a Linked Data meeting run by the Collections Trust in February 2010. He runs through the basics of LD, how it relates to cultural heritage, and some of his experiments with it, specifically with the data of the Wordsworth Trust, finally listing a series of challenges that face museums in trying to get on board the Linked Data bus.
1. The document discusses how to make transportation stop data from Switzerland's Federal Office of Topography linkable and accessible on the web.
2. It introduces the concept of linked data and the four rules for publishing linked data: use URIs to identify things, use HTTP URIs so that people can look up those names, provide useful information at those URIs, and include links to other data.
3. The document provides steps for publishing transportation stop data as linked data, such as mapping data fields to ontology terms, converting coordinates between reference systems, and linking the data to other datasets to connect it to the larger web of data.
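One concrete step from the pipeline above, converting coordinates between reference systems, can be sketched using swisstopo's published approximate formulas for CH1903/LV03 planar coordinates to WGS84 (accurate to roughly a metre). The formulas below are the standard approximation, but treat the sketch as illustrative of the step rather than as the document's actual code.

```python
# Approximate CH1903/LV03 -> WGS84 conversion (swisstopo's published
# polynomial approximation, good to about a metre).

def lv03_to_wgs84(y, x):
    """y = easting, x = northing in metres (LV03). Returns (lat, lon)."""
    yp = (y - 600_000) / 1_000_000
    xp = (x - 200_000) / 1_000_000
    lon = (2.6779094 + 4.728982 * yp + 0.791484 * yp * xp
           + 0.1306 * yp * xp**2 - 0.0436 * yp**3)
    lat = (16.9023892 + 3.238272 * xp - 0.270978 * yp**2
           - 0.002528 * xp**2 - 0.0447 * yp**2 * xp - 0.0140 * xp**3)
    # intermediate results are in units of 10000 arc-seconds; convert to degrees
    return lat * 100 / 36, lon * 100 / 36

# The LV03 origin (600000, 200000) is the old Bern observatory,
# approximately 46.9511 N, 7.4386 E.
lat, lon = lv03_to_wgs84(600_000, 200_000)
```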
1. The document outlines the evolution of graph schemas from early semantic web schemas like RDFS and OWL to simpler property graph schemas.
2. It discusses elements of graph schemas including entity types, relationship types, indexes, and schema imports.
3. Graph and schema management techniques are covered including schema validation, initialization, migration, and revision control.
4. Graph generation techniques are presented for capacity planning and benchmarking graphs of different sizes based on schema statistics.
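The statistics-driven generation in point 4 can be sketched simply: measure per-relationship-type edge counts on a real graph, scale them to a target node count, and emit random edges for benchmarking. Everything here (the stats format, the relationship names) is an illustrative assumption.

```python
# Sketch of schema-statistics-driven graph generation for benchmarking:
# observed edges-per-node ratios are scaled to a target size. All names
# and ratios are illustrative.
import random

def generate_graph(schema_stats, node_count, seed=42):
    """schema_stats maps relationship type -> observed edges per node."""
    rng = random.Random(seed)
    nodes = range(node_count)
    edges = []
    for rel_type, per_node in schema_stats.items():
        for _ in range(int(per_node * node_count)):
            # uniform endpoint choice; a realistic generator would also
            # reproduce the observed degree distribution
            edges.append((rng.randrange(node_count), rel_type,
                          rng.randrange(node_count)))
    return edges

edges = generate_graph({"FOLLOWS": 2.0, "LIKES": 0.5}, node_count=100)
```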
At the Dublin Fashion Insights Centre, we are exploring methods of categorising the web into a set of known fashion related topics. This raises questions such as: How many fashion related topics are there? How closely are they related to each other, or to other non-fashion topics? Furthermore, what topic hierarchies exist in this landscape? Using Clojure and MLlib to harness the data available from crowd-sourced websites such as DMOZ (a categorisation of millions of websites) and Common Crawl (a monthly crawl of billions of websites), we are answering these questions to understand fashion in a quantitative manner.
The latest generation of big data tools such as Apache Spark routinely handle petabytes of data while also addressing real-world realities like node and network failures. Spark's transformations and operations on data sets are a natural fit with Clojure's everyday use of transformations and reductions. Spark MLlib's excellent implementations of distributed machine learning algorithms puts the power of large-scale analytics in the hands of Clojure developers. At Zalando's Dublin Fashion Insights Centre, we're using the Clojure bindings to Spark and MLlib to answer fashion-related questions that until recently have been nearly impossible to answer quantitatively.
Hunter Kelly @retnuh
tech.zalando.com
Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open... - Thomas Gottron
The intensive growth of the Linked Open Data (LOD) Cloud has spawned a web of data where a multitude of data sources provides huge amounts of valuable information across different domains. Nowadays, when accessing and using Linked Data, the challenging question is increasingly not whether relevant data is available, but where it can be found, how it is structured, and how to make best use of it.
In this lecture I will start by giving a brief introduction to the concepts underlying LOD. Then I will focus on three aspects of current research:
(1) Managing Linked Data. Index structures play an important role for making use of the information in the LOD cloud. I will give an overview of indexing approaches, present algorithms and discuss the ideas behind the index structures.
(2) Analysing Linked Data. I will present methods for analysing various aspects of LOD. From an information theoretic analysis for measuring structural redundancy, over formal concept analysis for identifying alternative declarative descriptions to a dynamics analysis for capturing the evolution of Linked Data sources.
(3) Making Use of Linked Data. Finally I will give a brief overview and outlook on where the presented techniques and approaches are of practical relevance in applications.
(Talk at the IRSS summer school 2014 in Athens)
Slide show for the webinar on "Spatial Data Science with R" organized for the GeoDevelopers.org community. The video of the webinar and all the related materials including source code and sample data can be downloaded from this link: http://amsantac.co/blog/en/2016/08/07/spatial-data-science-r.html
In this webinar I talked about Data Science in the context of its application to spatial data and explained how we can use the R language for the analysis of geographic information within the different stages of a data science workflow, from the import and processing of spatial data to visualization and publication of results.
Analytics with MongoDB Aggregation Framework and Hadoop Connector - Henrik Ingo
This document provides an overview of analytics with MongoDB and Hadoop Connector. It discusses how to collect and explore data, use visualization and aggregation, and make predictions. It describes how MongoDB can be used for data collection, pre-aggregation, and real-time queries. The Aggregation Framework and MapReduce in MongoDB are explained. It also covers using the Hadoop Connector to process large amounts of MongoDB data in Hadoop and writing results back to MongoDB. Examples of analytics use cases like recommendations, A/B testing, and personalization are briefly outlined.
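The Aggregation Framework mentioned above expresses analytics as a pipeline of stages. The sketch below writes a one-stage `$group` pipeline in the shape pymongo expects, plus a tiny pure-Python evaluator so the example runs without a MongoDB server; the field names are made up for illustration.

```python
# A $group aggregation stage (pre-aggregation by page, counting hits),
# written as the pymongo pipeline shape, with a minimal pure-Python
# evaluator standing in for the server. Field names are illustrative.

pipeline = [{"$group": {"_id": "$page", "hits": {"$sum": 1}}}]

def run_group(docs, stage):
    """Evaluate a single {$group: {_id: "$field", hits: {$sum: 1}}} stage."""
    spec = stage["$group"]
    key_field = spec["_id"].lstrip("$")
    out = {}
    for doc in docs:
        key = doc[key_field]
        out.setdefault(key, {"_id": key, "hits": 0})
        out[key]["hits"] += 1
    return list(out.values())

docs = [{"page": "/home"}, {"page": "/home"}, {"page": "/about"}]
result = run_group(docs, pipeline[0])
```

Against a real deployment the same `pipeline` would be passed to `collection.aggregate(pipeline)` and evaluated server-side, which is the point of the framework: the grouping runs where the data lives.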
This document discusses demos and tools for linking knowledge discovery (KDD) and linked data. It summarizes several tools that integrate linked data and KDD processes like data preprocessing, mining, and postprocessing. OpenRefine, RapidMiner, R, Matlab, ProLOD++, DL-Learner, Spark, KNIME, and Gephi were highlighted as tools that support tasks like enriching data, running SPARQL queries, loading RDF data, and visualizing linked data. The document concludes by asking about gaps and how to increase adoption, noting linked data could benefit KDD with validation, enrichment, and reasoning over semantic web data.
Geohash encodes latitude and longitude coordinates into alphanumeric strings to simplify representation and allow for proximity searches. It subdivides geographic areas into nested grid "buckets" represented by strings, with longer strings indicating smaller areas. This hierarchical structure allows nearby locations to be identified by searching for records with similar geohash prefixes. While it approximates location rather than representing a single point, geohash enables easy grouping, zooming, and proximity searches of geographic data in databases.
This is a very, very basic introduction to R.
The purpose of this presentation is to cover the fundamentals of using R to work with data frames, which you can easily obtain by importing data from a relational database table or a CSV/text file.
This document contains a summary of John Mark M. Canonizado's qualifications and work experience. He has over 10 years of experience in safety officer roles in the Philippines, Qatar, Saudi Arabia, and United Arab Emirates. He has a bachelor's degree from KOLEHIYO NG SUBIC and certifications in first aid, safety training, and firefighting. His strengths include strong communication and leadership skills as well as the ability to work independently and under pressure.
The document announces the Swedish Kata Trophy karate tournament to be held on March 16th, 2013 in Stockholm, Sweden. Last year there were 650 individual competitors and 20 teams from 7 countries. The tournament aims to be international in scope by inviting judges and competitors from Europe. It will feature male and female categories for seniors, juniors, cadets, children and teams. The top sponsor is Budo Nord, a martial arts equipment supplier. Clubs are invited to participate and help make this an entirely international event.
Managing and querying large data sets using Data Factory, Cosmos DB and Azure...Marc Duiker
Slides of the Cosmos DB session for the Global Azure bootcamp held at Xebia Amsterdam on the 21st of April 2018.
Related GitHub repo: https://github.com/XpiritBV/GABC2018_HandsOnLabs/tree/master/Cosmos
CEDAR & PRELIDA Preservation of Linked Socio-Historical DataPRELIDA Project
by Albert Meroño, presented at the 3rd PRELIDA Consolidation and Dissemination Workshop, Riva, Italy, October, 17, 2014. More information about the workshop at: prelida.eu
Improving D3 Performance with CANVAS and other HacksPhilip Tellis
This document discusses techniques for improving the performance of D3 visualizations. It begins with an overview of D3 and some basic tutorials. It then describes issues with performance for force-directed layouts and edge-bundled layouts as the number of nodes and links increases. Solutions proposed include using canvas instead of SVG for rendering, reducing unnecessary calculations, and caching repeated drawing states. The document concludes that the number of DOM nodes has major performance implications and techniques like canvas can help when exact mouse interactions are not required.
The document summarizes Yandex's academic initiatives including:
1) The Yandex School of Data Analysis, a two-year master's program in data analysis.
2) Monthly scientific seminars on data analysis and information retrieval organized by Microsoft Research and Yandex.
3) The Internet Mathematics Competition (IMAT) which involves machine learning tasks such as web search ranking and traffic prediction.
4) The Russian Initiative on Scholarly Search and Information Retrieval (RuSSIR) which is an annual IR conference.
RSP-QL*: Querying Data-Level Annotations in RDF Streamskeski
This document proposes an extension to RSP-QL called RSP-QL* that allows querying of statement-level annotations in RDF streams. RSP-QL* uses the RDF* model, which allows embedding RDF triples as the subject or object of other triples. This provides an efficient way to represent statement-level metadata in RDF. The semantics of RSP-QL are extended to support RSP-QL* patterns, which can include basic graph patterns, named graphs, windows and other operators. Future work includes adding more functionality to the RDF* model, prototyping an implementation, and evaluating performance.
Finding Insights In Connected Data: Using Graph Databases In JournalismWilliam Lyon
When dealing with datasets, journalists have many options to choose from when moving beyond Excel. Usually the first step is using a relational (or SQL) database. While a relational database can be a good choice for some datasets, data analysts today turn to new tools to gain deeper insight. This talk will show how we can use a graph database to analyze highly connected data using examples from U.S. Congressional data and political email archives. Using the U.S. Congress data, we’ll show you how to explore the dataset using Cypher, the Neo4j query language, to discover legislator activity including bill sponsorship and voting activity. Building up our knowledge of Cypher as we progress, we’ll show how you can use principles from social network analysis to find influential legislators and discover what topics legislators have influence over. Finally, we will examine how to draw insights from the Hillary Clinton email dataset, released as part of a FOIA request earlier this year. We will explore this dataset as a graph of interactions among users, answering questions like: Who is communicating with Hillary the most? What are the topics of these emails? You’ll learn how to visualize these using the Neo4j browser to quickly make sense of the data as we are exploring.
The goal of this talk is to provide a demonstration of database tools that any journalist can use to explore datasets and draw insights from connected datasets.
Adventures in Linked Data Land (presentation by Richard Light)jottevanger
"Adventures in Linked Data Land: bringing RDF to the Wordsworth Trust" is a paper given by RIchard Light (http://uk.linkedin.com/pub/richard-light/a/221/ba5) to a Linked Data meeting run by the Collections Trust in February 2010. He runs through the basics of LD, how it relates to cultural heritage, and some of his experiments with it, specifically with the data of the Wordsworth Trust, finally listing a series of challenges that face museums in trying to get on board the Linked Data bus.
1. The document discusses how to make transportation stop data from Switzerland's Federal Office of Topography linkable and accessible on the web.
2. It introduces the concept of linked data and the four rules for publishing linked data, including using URIs to identify things, providing HTTP URIs so that people can look up those names, and including links between data.
3. The document provides steps for publishing transportation stop data as linked data, such as mapping data fields to ontology terms, converting coordinates between reference systems, and linking the data to other datasets to connect it to the larger web of data.
1. The document outlines the evolution of graph schemas from early semantic web schemas like RDFS and OWL to simpler property graph schemas.
2. It discusses elements of graph schemas including entity types, relationship types, indexes, and schema imports.
3. Graph and schema management techniques are covered including schema validation, initialization, migration, and revision control.
4. Graph generation techniques are presented for capacity planning and benchmarking graphs of different sizes based on schema statistics.
At the Dublin Fashion Insights Centre, we are exploring methods of categorising the web into a set of known fashion related topics. This raises questions such as: How many fashion related topics are there? How closely are they related to each other, or to other non-fashion topics? Furthermore, what topic hierarchies exist in this landscape? Using Clojure and MLlib to harness the data available from crowd-sourced websites such as DMOZ (a categorisation of millions of websites) and Common Crawl (a monthly crawl of billions of websites), we are answering these questions to understand fashion in a quantitative manner.
The latest generation of big data tools such as Apache Spark routinely handle petabytes of data while also addressing real-world realities like node and network failures. Spark's transformations and operations on data sets are a natural fit with Clojure's everyday use of transformations and reductions. Spark MLlib's excellent implementations of distributed machine learning algorithms puts the power of large-scale analytics in the hands of Clojure developers. At Zalando's Dublin Fashion Insights Centre, we're using the Clojure bindings to Spark and MLlib to answer fashion-related questions that until recently have been nearly impossible to answer quantitatively.
Hunter Kelly @retnuh
tech.zalando.com
Leveraging the Web of Data: Managing, Analysing and Making Use of Linked Open...Thomas Gottron
The intensive growth of the Linked Open Data (LOD) Cloud has spawned a web of data where a multitude of data sources provides huge amounts of valuable information across different domains. Nowadays, when accessing and using Linked Data more and more often the challenging question is not so much whether there is relevant data available, but rather where it can be found, how it is structured and to make best use of it.
I this lecture I will start with giving a brief introduction to the concepts underlying LOD. Then I will focus on three aspects of current research:
(1) Managing Linked Data. Index structures play an important role for making use of the information in LOD cloud. I will give an overview of indexing approaches, present algorithms and discuss the ideas behind the index structures.
(2) Analysing Linked Data. I will present methods for analysing various aspects of LOD. From an information theoretic analysis for measuring structural redundancy, over formal concept analysis for identifying alternative declarative descriptions to a dynamics analysis for capturing the evolution of Linked Data sources.
(3) Making Use of Linked Data. Finally I will give a brief overview and outlook on where the presented techniques and approaches are of practical relevance in applications.
(Talk at the IRSS summerschool 2014 in Athens)
Slide show for the webinar on "Spatial Data Science with R" organized for the GeoDevelopers.org community. The video of the webinar and all the related materials including source code and sample data can be downloaded from this link: http://amsantac.co/blog/en/2016/08/07/spatial-data-science-r.html
In this webinar I talked about Data Science in the context of its application to spatial data and explained how we can use the R language for the analysis of geographic information within the different stages of a data science workflow, from the import and processing of spatial data to visualization and publication of results.
Analytics with MongoDB Aggregation Framework and Hadoop ConnectorHenrik Ingo
This document provides an overview of analytics with MongoDB and Hadoop Connector. It discusses how to collect and explore data, use visualization and aggregation, and make predictions. It describes how MongoDB can be used for data collection, pre-aggregation, and real-time queries. The Aggregation Framework and MapReduce in MongoDB are explained. It also covers using the Hadoop Connector to process large amounts of MongoDB data in Hadoop and writing results back to MongoDB. Examples of analytics use cases like recommendations, A/B testing, and personalization are briefly outlined.
This document discusses demos and tools for linking knowledge discovery (KDD) and linked data. It summarizes several tools that integrate linked data and KDD processes like data preprocessing, mining, and postprocessing. OpenRefine, RapidMiner, R, Matlab, ProLOD++, DL-Learner, Spark, KNIME, and Gephi were highlighted as tools that support tasks like enriching data, running SPARQL queries, loading RDF data, and visualizing linked data. The document concludes by asking about gaps and how to increase adoption, noting linked data could benefit KDD with validation, enrichment, and reasoning over semantic web data.
Geohash encodes latitude and longitude coordinates into alphanumeric strings to simplify representation and allow for proximity searches. It subdivides geographic areas into nested grid "buckets" represented by strings, with longer strings indicating smaller areas. This hierarchical structure allows nearby locations to be identified by searching for records with similar geohash prefixes. While it approximates location rather than representing a single point, geohash enables easy grouping, zooming, and proximity searches of geographic data in databases.
This is a very, very basic introduction to R.
The purpose of this presentation is to cover the fundamentals of using R to work with data frames, which you can easily obtain by importing data from a relational database table or a CSV/text file.
This document contains a summary of John Mark M. Canonizado's qualifications and work experience. He has over 10 years of experience in safety officer roles in the Philippines, Qatar, Saudi Arabia, and United Arab Emirates. He has a bachelor's degree from KOLEHIYO NG SUBIC and certifications in first aid, safety training, and firefighting. His strengths include strong communication and leadership skills as well as the ability to work independently and under pressure.
The document announces the Swedish Kata Trophy karate tournament to be held on March 16th, 2013 in Stockholm, Sweden. Last year there were 650 individual competitors and 20 teams from 7 countries. The tournament aims to be international in scope by inviting judges and competitors from Europe. It will feature male and female categories for seniors, juniors, cadets, children and teams. The top sponsor is Budo Nord, a martial arts equipment supplier. Clubs are invited to participate and help make this an entirely international event.
Use of Linked Open Data for Educational Purposes - plan4all
This document discusses using Linked Open Data for educational purposes such as creating quizzes about rivers in Europe. It provides examples of quiz questions about which countries certain rivers flow through and examples of SPARQL queries that can be used to generate questions and answers. The goal is to create an automated or GUI-based system for generating geography questions and a game interface for students to answer the questions using Linked Open Data as the knowledge base.
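A quiz generator of this kind would issue SPARQL queries against a knowledge base such as DBpedia. The sketch below only constructs such a query as a string (it does not run it); the `dbo:country` property and the `dbr:Danube` resource name are assumptions for illustration, not taken from the presentation.

```python
# Build (but do not execute) a SPARQL query of the kind a river-quiz
# generator might send to DBpedia.  The dbo:country property and the
# dbr:Danube resource name are illustrative assumptions.
def river_countries_query(river):
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?country WHERE {{
  dbr:{river} dbo:country ?country .
}}
""".strip()

query = river_countries_query("Danube")
print(query)
```

The answer set of such a query becomes the correct options of a generated question, with distractors drawn from countries the river does not flow through.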
The document presents the liturgy for the fifth Sunday of Easter, cycle A. It includes readings from the Acts of the Apostles, Revelation, and the Gospel of John on mutual love among the disciples as a sign of belonging to Christ. The homily exhorts the faithful to build a new world in pursuit of the Kingdom of God through love and self-giving to others.
The video for The Weeknd's song "Often" features slow camera movements and close-ups that focus on his face and body as the central subject. Additional shots introduce a naked woman on a bed and female lingerie on the floor. Later, many women are suddenly shown throughout the room despite not being there before. Cinematography, editing, and mise-en-scene aim to present the women as sexual objects for the male gaze without identities of their own, while The Weeknd remains the main focus.
The document discusses font selection for an album cover design. It describes searching dafont.com for suitable fonts, settling on "Bebas Neue" for the track list due to its bold yet simple style. For the artist name, the designer chose "Colours of Autumn" for its elegance and sharpness that matches the artist's style. Typography inspiration came from Google images, experimenting with font sizing and positioning on the inside covers to make them modern and highlight impactful lyrics.
01 What is sustainable planning and development - Mark M. Miller
The document discusses concepts related to sustainability and sustainable development. It references definitions from the Brundtland Commission which defined sustainable development as "development that meets the needs of current generations without compromising the ability of future generations to meet their own needs." Jeffrey Sachs identifies three pillars of sustainable development: economic development, social inclusion, and environmental sustainability. The document also discusses concepts like carrying capacity, planning, and challenges around balancing economic growth, social welfare, and environmental protection.
The Bal Sudhar Griha juvenile correction home in Bhaktapur, Nepal is managed by UCEP Nepal in partnership with the Ministry of Women, Children and Social Welfare. It provides residential treatment, education, counseling, and health services to maladjusted children and juveniles referred by courts. A trainee at the organization learned about its operations, introduced personality development activities to students, and helped organize sports and holiday events. The trainee applied principles of acceptance, participation, and self-awareness during interactions and gained practical experience in communication, rapport building, and program leadership while overcoming challenges of being untrained and having difficulty adjusting to the new environment.
Mapbox, a Google map alternative
You can watch the presentation video on:
youtube:
https://www.youtube.com/playlist?list=PLT2xIm2X7W7gTTEy77_FZGvoqo3DQcVT-
aparat:
https://www.aparat.com/v/F5GAH
The document describes a Maps4Finland workshop on common geographic information programming examples. It provides an agenda that includes discussions of content from various Finnish organizations like the National Land Survey of Finland and Statistics Finland. It also gives examples of code samples for creating maps using APIs and standards like OpenLayers, WMS, and GeoJSON. Finally, it discusses handling and converting spatial and non-spatial content between different formats.
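One of the recurring standards in such workshops is WMS. As a small stdlib-only sketch of the kind of map-request code the workshop covers, the function below assembles a WMS 1.3.0 GetMap URL; the endpoint and layer name are hypothetical placeholders, not services from the workshop.

```python
from urllib.parse import urlencode

# Assemble a WMS 1.3.0 GetMap request URL.  The endpoint and layer
# name below are hypothetical placeholders.
def wms_getmap_url(endpoint, layer, bbox, width=512, height=512):
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "CRS": "EPSG:4326",
        # Note: WMS 1.3.0 with EPSG:4326 uses lat,lon axis order,
        # so the bbox is (minlat, minlon, maxlat, maxlon).
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)

url = wms_getmap_url("https://example.org/wms", "demo:municipalities",
                     (59.8, 19.1, 70.1, 31.6))
print(url)
```

A client library such as OpenLayers builds URLs of exactly this shape for each map tile it requests.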
Geospatial applications created using JavaScript (and NoSQL) - Comsysto Reply GmbH
Ever wondered how geospatial data works? Come along and learn: you’ll be presented with a fully functioning geospatial application that uses metadata from images to pinpoint them on a map. You’ll be introduced to a NoSQL tool and learn the basics of NoSQL technologies in a fun and intuitive way. Along the way you’ll experience geospatial data, full-stack application development using JavaScript, and a little bit of semantic data as well. You will see how easy it is to manage hybrid data (JSON documents, JPEG images, and RDF triples) in one database, how to query geospatial data, and how to work with JavaScript across a three-tiered application.
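The core of pinning an image to a map is taking the latitude/longitude pulled from the image's metadata and wrapping it in a GeoJSON Feature that any web map can render. This is a minimal sketch of that step only; the filename and coordinates are invented, and real code would first extract them from EXIF.

```python
import json

# Turn a (lat, lon) pair extracted from an image's metadata into a
# GeoJSON Feature suitable for any web map.  Filename and coordinates
# are invented for illustration.
def image_to_feature(filename, lat, lon):
    return {
        "type": "Feature",
        "geometry": {
            # GeoJSON uses [longitude, latitude] order.
            "type": "Point",
            "coordinates": [lon, lat],
        },
        "properties": {"filename": filename},
    }

feature = image_to_feature("IMG_0042.jpg", 48.137, 11.575)
print(json.dumps(feature))
```

Storing the JSON document, the JPEG, and any RDF triples about the photo side by side is exactly the hybrid-data scenario the abstract describes.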
Validating and Describing Linked Data Portals using RDF Shape Expressions - Jose Emilio Labra Gayo
Presentation at 1st Linked Data Quality Workshop, Leipzig, 2nd Sept. 2014
Author: Jose Emilio Labra Gayo
Applies Shape Expressions to validate the WebIndex linked data portal
Linked services: Connecting services to the Web of Data - John Domingue
Keynote from the International Conference on e-Business Engineering, September 2013. The talk covers a short introduction to Linked Data, our approach to building applications on top of the Web of Data (which we term Linked Services), and a number of applications in the areas of house hunting, crowdsourced car parking, and sharing human body processes. The talk also covers recent work on transforming SAP's Unified Service Description Language to a Linked Data format.
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA... - Micah Altman
The WorldMap platform http://worldmap.harvard.edu is the largest open source collaborative mapping system in the world, with over 13,000 map layers contributed by thousands of users from Harvard and around the world. Researchers may upload large spatial datasets to the system, create data-driven visualizations, edit data, and control access. Users may keep their data private, share it in groups, or publish to the world.
The user base is interdisciplinary, including scholars from the humanities, social sciences, sciences, public health, design, planning, etc. All are able to access, view, and use one another’s data, either online, via map services, or by downloading.
Current work is underway to create and maintain a global registry of map services and take us a step closer to one-stop-access for public geospatial data. Another project is working on tools to support the visualization of spatial datasets with over a billion features. Current collaborations are underway with groups inside Harvard, such as Dataverse, HarvardX, and various departments, and with groups outside Harvard, such as Cornell University and the University of Pennsylvania. Major additional contributors to the underlying source code include the WorldBank, the U.S. State Department, and the United Nations.
The source code for the WorldMap platform is available on GitHub https://github.com/cga-harvard/cga-worldmap.
Location: E25-202
Discussant: Ben Lewis is system architect and project manager for WorldMap, an open source infrastructure that supports collaborative research centered on geospatial information. Before joining Harvard, Ben was a project manager with Advanced Technology Solutions of Pennsylvania, where he led the company in adopting platform independent approaches to GIS system development. Ben studied Chinese at the University of Wisconsin and has a Masters in Planning from the University of Pennsylvania. After Penn, Ben helped start the GIS Lab at U.C. Berkeley, founded the GIS group for transportation engineering firm McCormick Taylor, and coordinated the Land Acquisition Mapping System for South Florida Water Management District. Ben is especially interested in technologies that lower the barrier to spatial technology access.
Information Science Brown Bag talks, hosted by the Program on Information Science, consist of regular discussions and brainstorming sessions on all aspects of information science and on uses of information science and technology to assess and solve institutional, social and research problems. These are informal talks. Discussions are often inspired by real-world problems being faced by the lead discussant.
Knowledge graph construction with a façade - The SPARQL Anything Project - Enrico Daga
The document discusses a project called "SPARQL Anything" which aims to simplify knowledge graph construction by using SPARQL as the single language for representing and transforming diverse data formats into RDF. It presents an approach called "Facade-X" which defines a common RDF structure that can be applied over different formats like CSV, JSON, HTML, etc. This facade focuses on the RDF meta-model and aims to apply minimal ontological commitments. The document outlines how Facade-X can be used to represent different formats and provides examples of using SPARQL to transform sample data into RDF without committing to a specific domain ontology.
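The facade idea can be illustrated roughly in plain Python: a CSV is exposed as RDF containers, with each row a blank-node container whose cells hang off the `rdf:_1`, `rdf:_2`, ... membership properties. This is only a hand-drawn illustration of the pattern, not SPARQL Anything's actual output; the data and blank-node labels are invented.

```python
import csv, io

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Rough illustration of the Facade-X pattern: each CSV row becomes a
# blank-node container, and each cell hangs off an rdf:_N membership
# property.  The data and node labels are invented.
def csv_to_facade_triples(text):
    triples = []
    for r, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        # The root container points at the row container...
        triples.append(("_:root", RDF + f"_{r}", f"_:row{r}"))
        # ...and the row container points at each cell value.
        for c, cell in enumerate(row, start=1):
            triples.append((f"_:row{r}", RDF + f"_{c}", cell))
    return triples

triples = csv_to_facade_triples("name,pop\nBerlin,3600000\n")
print(len(triples))
```

Because only containers and membership slots are used, no domain ontology is committed to; mapping the generic triples onto domain terms is then left to an ordinary SPARQL CONSTRUCT query.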
GRASS and OSGeo: a framework for archeology - Markus Neteler
Use of GIS and geospatial data in archeology. Contribution to:
Quarto Workshop Italiano "Open Source, Free Software e Open Format nei processi di ricerca archeologica", Roma, 27 e 28 aprile 2009. Sede centrale del Consiglio Nazionale delle Ricerche (CNR)
http://www.archeo-foss.org/
Abstract:
With the widespread availability of desktop GIS, archaeologists have gained the tools to comprehensively analyze the important spatial component of their data. Initial archaeological use of GIS was (and still is in many instances) for making maps of archaeological sites. Rather quickly GIS became used for predictive modeling of site locations. More recently, viewshed analysis has seen increasing use, in efforts to understand prehistoric perceptions of the landscape.
In recent years, Open Source GIS software has evolved into a powerful set of products which support both scientific and everyday GIS users. In particular, the integration of GIS with image processing capabilities, geospatial data analysis, database management systems and Web mapping software enables archaeologists to perform their tasks in a completely free environment. Since 2006, the Open Source Geospatial Foundation (OSGeo) has operated as an umbrella foundation for Web Mapping, Desktop GIS Applications, Geospatial Libraries, and the Metadata Catalog, as well as the Public Geospatial Data project and the Education and Curriculum project.
In our presentation, we focus on GRASS GIS (http://grass.osgeo.org/) for spatial data analysis and visualization. GRASS is the largest Open Source GIS program currently available. The new version GRASS 6.4.0 is interoperable, as it supports all common vector and raster GIS formats. Its capabilities cover raster and volume spatial analysis and modeling, time-series and landscape analysis, image processing, and visualization of 2D and 3D (voxel) raster data. Vector data can be digitized, extracted, extruded to 3D, and vector networks analyzed. Vector data are handled topologically. Vector attributes are stored in internal or externally connected databases. All general GIS tasks like map reprojection, georeferencing, and transformations are available for raster and vector data. The data storage concept of GRASS permits single- as well as multi-user access, set up via a network file system.
GRASS 6.4.0, the new stable release after more than one year of development and testing, brings a number of exciting enhancements to the GIS, beyond the hundreds of new module features, supported data formats, and language translations. The 6.4.0 release also runs on MS Windows, for which a new installer is provided. A new graphical user interface with an integrated location wizard and a new vector digitizer is also included.
The presentation concludes with a series of applications relevant to archaeology including image processing, Lidar data analysis, fast viewshed analysis and more.
New Discovery Tools for Digital Humanities and Spatial Data (Summary of the J... - Micah Altman
This document discusses new tools for digital humanities and spatial data. It describes how physical discovery of manuscripts led to new methods of transmission and preservation of information over time. Modern libraries are indexing resources through internal catalogs and digital objects. The text advocates for moving resources on the semantic web using linked open data with RDF to better integrate geographic data and connect projects. The future of catalogs may involve direct access to digital resources through APIs, linked open data, and graph databases to allow deeper analysis of content and spatial indexing of metadata.
LarKC Tutorial at ISWC 2009 - Urban Computing - LarKC
The document discusses understanding and manipulating urban computing workflows. It describes three workflows used in the Alpha Urban LarKC system: 1) a monument selection workflow, 2) an event selection workflow, and 3) a path finding workflow. Each workflow utilizes various LarKC plugins to integrate, transform, and reason over distributed data sources for responding to user queries about points of interest in a city.
The Linked Map project is part of the FP7 PlanetData project (http://planet-data.eu/), whose aim is to help organisations expose their large amounts of data online in a useful, high-quality form. In line with this goal, and as a demonstration of the LMS technology (a transparent semantic proxy for WMS 1.3.0), the Linked Map project has developed a Web platform (http://linkedmap.unizar.es/crowdsourcing-platform/). This platform enables users to assess the quality of an automatic integration of INSPIRE data and Volunteer Geographic Information (VGI). The platform uses an LMS instance. This demonstration involves an experiment that combines, in a meaningful way, a big INSPIRE dataset containing data from Annex I and Annex III themes (BCN/BTN25) with VGI data (OpenStreetMap).
The Linked Map project was developed by IAAA Lab (Universidad Zaragoza) and GeoSpatiumLab. These slides were presented at JIIDE 2014 (Lisbon)
RDF and linked data standards allow for layering and linking of information on the web. There is a large and growing amount of RDF data available from sources like Wikipedia, Flickr, government data sets, and more. Standards like RDF, RDFS, OWL, SKOS, and SPARQL enable publishing, linking, querying and reusing this structured data on the web in a way that is machine-readable. Integrating RDF and linked data into systems like Drupal could provide benefits like improved searchability, cross-linking of content, and reuse of external taxonomies and metadata schemas.
04 Applications of Smart Points of Interest - plan4all
1) Smart Points of Interest (SPOI) is a dataset of over 27 million geographic points of interest published as linked open data and accessible via a SPARQL endpoint and map client.
2) SPOI was developed to interconnect tourism data across borders in a standardized, linked data format. It populates points from 49 external sources and supports applications like tourism guides.
3) Future work will focus on optimizing data collection, improving links and metadata, eliminating errors, and developing new applications like augmented reality with SPOI data.
This document discusses a location-based application called DBpedia Mobile that uses linked open data. It retrieves data from multiple sources like GeoNames, Flickr, Factbook, Revyu, and Yago based on a user's current GPS location. A SPARQL query is run to gather relevant information from these sources. The data is displayed on a map indicating nearby locations along with background details. It also allows users to publish and interlink their own location data, photos, and reviews with DBpedia resources. This mobile application demonstrates the real-world use of linked data and semantic web technologies to provide useful information to tourists or those exploring new places.
Implementing a VO archive for datacubes of galaxiesJose Enrique Ruiz
The document describes implementing a VO archive for galaxy datacubes. It details collections of FITS files containing 2D spatial and spectral data on galaxies from two telescopes. A MySQL database stores metadata on the datasets extracted from FITS headers using IPython notebooks. The web interface allows discovering, viewing metadata, and accessing the data through use cases like moment maps and channel maps. The archive aims to provide characterization of emission lines and provenance to better understand the radio interferometric data.
Apache Drill (http://incubator.apache.org/drill/) is a distributed system for interactive analysis of large-scale datasets, inspired by Google’s Dremel technology. It is designed to scale to thousands of servers and able to process Petabytes of data in seconds. Since its inception in mid 2012, Apache Drill has gained widespread interest in the community, attracting hundreds of interested individuals and companies. In the talk we discuss how Apache Drill enables ad-hoc interactive query at scale, walking through typical use cases and delve into Drill's architecture, the data flow and query languages as well as data sources supported.
This document provides a history of Semantic MediaWiki (SMW) and Wikidata, including:
- SMW was created in 2006 as a MediaWiki extension to add structured data to wiki pages. It is now used on over 1500 sites.
- Wikidata launched in 2012 as a structured database to provide common facts across Wikimedia projects.
- SMW allows adding both unstructured text and structured data to wiki pages through features like online forms, queries, and semantic web standards.
- An example FINA wiki uses several SMW extensions to integrate structured data from Wikidata into person pages and visualizations.
- Opportunities are discussed to further link SMW and Wikidata data through mappings, reconciliation
Similar to Open Land Use Map and Smart Points of Interest
Agrihub INSPIRE Hackathon 2021: Challenge #7: Analysis, processing and standa... - plan4all
This is a presentation of results of Challenge #7: Analysis, processing and standardisation of data from agriculture machinery for easier utilization by farmers of the Agrihub INSPIRE Hackathon 2021.
Challenge #3 agro environmental services final presentation - plan4all
This document discusses 3 use cases for agro-environmental services. Use case 1 aims to improve access and sharing of geo data by addressing challenges with data inventory, publishing, and communication. Use case 2 examines structural changes to water canals by mapping a canal in Slovakia and analyzing subsidized land nearby. Use case 3 identifies dynamic landscape changes over time through vulnerability analysis and analyzing landscape structure changes using data like Corine Land Cover and NDVI indexes. The team behind this work includes representatives from universities and government organizations in Slovakia.
The document discusses a pilot project that will implement a new soil methodology in 7 locations in the EU and 4 locations in China. It will create an integrated platform using advanced sensing tools, land monitoring, and data fusion to achieve sustainable land management. The platform aims to maximize land productivity while minimizing environmental impacts. The project partners will run pilots of this new methodology in Greece, Spain, Belgium, Czech Republic, and several locations in China.
Calculation of agro climatic factors from global climatic data - plan4all
Authors: Pavel Hájek, Raitis Berzins, Jiří Valeš, Martin Pitoňák, Vincent Onckelet, Tomáš Andrš, Veronika Osmiková, Ronald Ssembajwe, Amit Kirschenbaum, Jörg Schliesser, Michal Kepka & Karel Jedlička
Digitalization of indigenous knowledge in African agriculture for fostering f... - plan4all
Authors:
Antoine Kantiza, AKANTIZA CONSULT, Burundi
Didier Muyiramye, Swedish University of Agricultural Sciences, Rwanda
Elias Cherenet Weldemariam, HARAMAYA UNIVERSITY, Ethiopia
Petr Horak, WIRELESSINFO, Czech Republic
Robert Sabimana, Frutus Fresco Ltd, Uganda
Pavel Hajek, West Bohemia University, Czech Republic
Tuula Löytty, Smart & Lean Hub Oy, Finland
Demet Osmancelebioglu, Smart & Lean Hub Oy, Finland
This document summarizes social innovations in rural areas to address challenges from the Covid-19 pandemic. It describes projects in Finland and Spain that raised awareness of local food and businesses, supported community centers, repurposed rural accommodations, and improved rural digital infrastructure and connectivity. The atlas of best practices aims to provide rural stakeholders with ideas to support farmers and communities, integrate new residents, promote local tourism, and improve business practices as the collection expands.
The EUXDAT project, which received Horizon 2020 funding, has come to an end after three years of collaboration and development. The project developed an e-infrastructure platform to address UN Sustainable Development Goals and the European Green Deal. Key deliverables of the project included the final specification of the infrastructure platform (D4.5), definition of three pilots and scenarios (D5.6), description of the end users' platform (End Users’ Platform v3), and description of the infrastructure platform and services (D4.6, D5.7). Webinars were held in October 2020 to present results of the pilots and infrastructure.
Karel Charvat: map-compositions-format-intro-presentation-by-karel (1) - plan4all
Karel Charvat on behalf of Plan4all, Lesprojekt, BOSC and Asplan Viak gave a presentation about the project to create a Google Docs-like map application and map composition format.
Karel Charvat: map-whiteboard-collaborative-map-making-breakout-session - plan4all
Karel Charvat on behalf of Plan4all, Lesprojekt, BOSC and Asplan Viak gave a presentation about the project to create a Google Docs-like map application and map composition format.
This document discusses codes of conduct for farm data sharing. It notes the challenges in balancing farmers' rights to protect business data while also needing to share data for precision agriculture and services. Existing codes focus on consent, disclosure and transparency. Challenges include ensuring proper representation, independent oversight, and alignment with legal frameworks. Success requires adoption, credibility, clarity and balancing interests. Codes can help build trust but self-regulation has limitations; cooperation across stakeholders is important.
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai... - Kaxil Naik
Navigating today's data landscape isn't just about managing workflows; it's about strategically propelling your business forward. Apache Airflow has stood out as the benchmark in this arena, driving data orchestration forward since its early days. As we dive into the complexities of our current data-rich environment, where the sheer volume of information and its timely, accurate processing are crucial for AI and ML applications, the role of Airflow has never been more critical.
In my journey as the Senior Engineering Director and a pivotal member of Apache Airflow's Project Management Committee (PMC), I've witnessed Airflow transform data handling, making agility and insight the norm in an ever-evolving digital space. At Astronomer, our collaboration with leading AI & ML teams worldwide has not only tested but also proven Airflow's mettle in delivering data reliably and efficiently—data that now powers not just insights but core business functions.
This session is a deep dive into the essence of Airflow's success. We'll trace its evolution from a budding project to the backbone of data orchestration it is today, constantly adapting to meet the next wave of data challenges, including those brought on by Generative AI. It's this forward-thinking adaptability that keeps Airflow at the forefront of innovation, ready for whatever comes next.
The ever-growing demands of AI and ML applications have ushered in an era where sophisticated data management isn't a luxury—it's a necessity. Airflow's innate flexibility and scalability are what makes it indispensable in managing the intricate workflows of today, especially those involving Large Language Models (LLMs).
This talk isn't just a rundown of Airflow's features; it's about harnessing these capabilities to turn your data workflows into a strategic asset. Together, we'll explore how Airflow remains at the cutting edge of data orchestration, ensuring your organization is not just keeping pace but setting the pace in a data-driven future.
Session in https://budapestdata.hu/2024/04/kaxil-naik-astronomer-io/ | https://dataml24.sessionize.com/session/667627
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag... - sameer shah
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
Analysis insight about a Flyball dog competition team's performance - roli9797
Insights from my analysis of a Flyball dog competition team's performance over the last year. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Learn SQL from basic queries to advanced queries - manishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
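The progression the guide describes, from basic retrieval and filtering up to aggregation, can be sketched end-to-end with Python's built-in sqlite3 module. The table and rows below are invented for illustration.

```python
import sqlite3

# A tiny end-to-end SQL session: basic filtering, then an aggregate
# query with GROUP BY / HAVING.  Table and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 120.0), ("north", 80.0), ("south", 40.0)])

# Basic query: retrieval with a WHERE filter.
big = conn.execute(
    "SELECT region, amount FROM sales WHERE amount > 100").fetchall()

# Advanced query: aggregate per region, keeping only large totals.
totals = conn.execute(
    "SELECT region, SUM(amount) AS total FROM sales "
    "GROUP BY region HAVING total > 50 ORDER BY total DESC").fetchall()
print(big, totals)
```

The same statements run unchanged against most SQL databases, which is what makes this progression a good practice path.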
End-to-end pipeline agility - Berlin Buzzwords 2024 - Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data - Kiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W... - Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
Global Situational Awareness of A.I. and where it's headed - vikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
Open Source Contributions to Postgres: The Basics POSETTE 2024ElizabethGarrettChri
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Land Use Map and Smart Points of Interest
1. Open Land Use Map and Smart Points of Interest
Tomáš Mildorf, Otakar Čerba, Dmitrii Kožuch
University of West Bohemia, Help Service Remote Sensing
Czech Republic
INSPIRE Conference 2016
2. Open Land Use Map (OLU)
• Harmonisation and integration of heterogeneous land use and land cover data
• Reusing the INSPIRE land use data specifications → transformation into a common INSPIRE-compliant data model
• Mapping different classifications → HILUCS
• Uniform visualisation
• Using linked data
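The mapping of different source classifications into HILUCS can be pictured as a lookup table. A minimal sketch in Python — the Corine Land Cover codes are real CLC classes, but the HILUCS class names and the specific correspondences here are illustrative assumptions, not the project's official mapping table:

```python
# Sketch: mapping source land-cover codes (here Corine Land Cover) to
# HILUCS classes. The correspondences below are illustrative assumptions,
# not the official OLU mapping.
CLC_TO_HILUCS = {
    "112": "5_ResidentialUse",   # discontinuous urban fabric
    "211": "1_1_Agriculture",    # non-irrigated arable land
    "311": "1_2_Forestry",       # broad-leaved forest
}

def to_hilucs(clc_code: str) -> str:
    """Return the HILUCS class for a CLC code, or a fallback class."""
    return CLC_TO_HILUCS.get(clc_code, "NotKnownUse")
```

A table like this is the core of the harmonisation step: every source classification gets its own dictionary, and all of them resolve to the shared HILUCS vocabulary.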
3. • Corine Land Cover 2006
• Urban Atlas
• Czech cadastre
• Land Parcel Identification System – LPIS
• Spatial plans
• Other sources
Different level of detail
Different geometry
Open Land Use Map
10. Smart Points of Interest (SPOI)
● POI: a specific point location that someone may find useful or interesting
● SPOI domain: tourism and related spheres (transport, logistics, advertising...)
● Smart: links to other data and information
● An open and seamless data set of POIs as a “data fuel” for developing tourism applications and services
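The “smart” part — links to other data and information — can be illustrated as RDF triples attached to a POI. A minimal sketch; the SPOI namespace URI and the DBpedia link below are hypothetical examples, not the data set's actual identifiers:

```python
# Sketch: a POI as RDF triples in N-Triples form. The example.org
# namespace and the particular sameAs target are hypothetical.
def poi_triples(poi_id: str, label: str, same_as: str) -> list[str]:
    """Emit a label triple and an owl:sameAs link for one POI."""
    s = f"<http://example.org/spoi/{poi_id}>"  # hypothetical namespace
    return [
        f'{s} <http://www.w3.org/2000/01/rdf-schema#label> "{label}" .',
        f"{s} <http://www.w3.org/2002/07/owl#sameAs> <{same_as}> .",
    ]
```

The `owl:sameAs` link is what turns an isolated point into Linked Data: a client that dereferences the target URI can pull in descriptions from the linked data set.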
11. Essential attributes of SPOI
● Many heterogeneous input data sources
● A complicated data harmonisation process
● Based on standards, semantic description and Linked Data
● Seamless data (no borders)
● Published on a map portal and via a SPARQL endpoint
● Open Database License (ODbL)
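Because the data set is published on a SPARQL endpoint, POIs can be retrieved with ordinary SPARQL. A minimal sketch of assembling a bounding-box query — using the W3C WGS84 vocabulary (`geo:lat` / `geo:long`) for coordinates is an assumption here, not a documented detail of the SPOI model:

```python
# Sketch: assembling a SPARQL query for POIs inside a bounding box.
# The geo:lat / geo:long properties follow the W3C WGS84 vocabulary;
# whether SPOI uses exactly these properties is an assumption.
def bbox_query(min_lat, max_lat, min_lon, max_lon, limit=100):
    return f"""
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?poi ?lat ?lon WHERE {{
  ?poi geo:lat ?lat ; geo:long ?lon .
  FILTER(?lat >= {min_lat} && ?lat <= {max_lat} &&
         ?lon >= {min_lon} && ?lon <= {max_lon})
}}
LIMIT {limit}
""".strip()
```

The resulting string would be sent to the endpoint as the `query` parameter of an HTTP request.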
12. Data resources & harmonization
Each source passes through its own pipeline and ends in a transformation (classification, metadata, links) into the SPOI data model, published as Smart Points of Interest (RDF):
● Belluno (SHP): change character coding, change coordinate system, filtering attributes (QGIS) → text modification (LibreOffice Calc) → data (CSV) → transformation to the SPOI data model (BASH script)
● GeoNames.org (ZIP): data download (wget) → unpacking (unzip) → text modification (awk, BASH script) → data (TXT/XML) → transformation (Saxon/Java, XSLT template)
● Natural Earth (KML): transformation (Saxon/Java, 2× XSLT templates)
● Antwerpen (XML): transformation (Saxon/Java, XSLT template)
● OpenStreetMap (BZ2): data download (wget) → unpacking (bunzip2, tar) → data (OSM binary) → filtering attributes and converting (osmconvert) → data (OSM XML) → filtering nodes (osmfilter) → transformation (Saxon/Java, XSLT template, BASH script)
● Citadel on the Move (JSON): data download (wget) → text modification (sed + BASH script) → data (XML) → transformation (Saxon/Java, XSLT template, BASH script)
● Issy (XML): filtering attributes → transformation (Saxon/Java, XSLT template)
● UWB experimental ontologies (OWL): filtering attributes → transformation (Saxon/Java, XSLT template)
● Travel agency information (text): transcription to a table (LibreOffice Calc) → data (CSV) → transformation
● Sicily (text): transcription to a table (LibreOffice Calc) → data (CSV) → transformation (BASH script, Saxon/Java, XSLT template)
● Pošumaví (XLS): text modification (LibreOffice Calc) → format conversion (web service) → data (XML) → transformation (Saxon/Java, XSLT template)
● Zemgale (XLS): format conversion (web service) → data (XML) → filtering attributes → transformation (Saxon/Java, XSLT template)
● Prague Open data (GML): filtering attributes → transformation (Saxon/Java, XSLT template)
● Wikidata (JSON): data download (wget) → text modification (BASH script) → transformation (BASH script)
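All of the pipelines above end the same way: an intermediate table or XML file is classified and transformed into the SPOI data model. A minimal sketch of that final step for a CSV intermediate — the column names and the source-to-SPOI category mapping are assumptions for illustration:

```python
import csv
import io

# Illustrative source-category → harmonized-category mapping
# (an assumption, not the project's vocabulary).
CATEGORY_MAP = {"ristorante": "restaurant", "museo": "museum"}

def harmonize(csv_text: str) -> list[dict]:
    """Turn a CSV intermediate (name, lat, lon, category) into
    records with a harmonized classification and typed coordinates."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        records.append({
            "name": row["name"],
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
            "category": CATEGORY_MAP.get(row["category"].lower(), "other"),
        })
    return records
```

In the actual pipelines this role is played by the XSLT templates (run with Saxon/Java) and BASH scripts; the sketch only shows the shape of the step: normalize fields, map the source classification, emit records ready for serialization to RDF.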