PGDay Amsterdam 2018
Instead of using ETL tools, which consume large amounts of memory on their own systems, you will learn how to do ETL jobs directly in and with a database.
PostgreSQL Management of External Data (SQL/MED, http://www.iso.org/iso/catalogue_detail.htm?csnumber=38643) is also known as Foreign Data Wrapper (FDW). With FDW, there is almost no limit to the external data you can use directly inside a PostgreSQL database.
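As a minimal sketch of the FDW idea (the server address, table, and credentials below are placeholders, not from the talk), querying a remote PostgreSQL table through postgres_fdw looks roughly like this:

```sql
-- Load the foreign data wrapper for remote PostgreSQL servers
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Describe the remote server (host and dbname are placeholders)
CREATE SERVER remote_crm
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'crm.example.com', dbname 'crm');

-- Map the local role to a remote login (credentials are placeholders)
CREATE USER MAPPING FOR CURRENT_USER
    SERVER remote_crm
    OPTIONS (user 'report', password 'secret');

-- Expose a remote table locally; it can then be joined like any local table
CREATE FOREIGN TABLE customers (
    id   integer,
    name text
) SERVER remote_crm OPTIONS (schema_name 'public', table_name 'customers');

SELECT c.name FROM customers AS c;  -- executed against the remote server
```

Once the foreign table exists, ordinary SQL joins and inserts against it are what make "ETL directly in the database" possible.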
PGDay.Amsterdam 2018 - Stefanie Stoelting - PostgreSQL As Data Integration Tool (PGDay.Amsterdam)
A talk given at the Ingenta Publisher Forum in November 2008.
See http://www.ldodds.com/blog/archives/000264.html for a detailed description of the talk.
The document discusses the Digital Object Identifier (DOI) system. Some key points:
- DOI provides persistent identifiers for content on digital networks, governed by the International DOI Foundation. It is implemented through Registration Agencies that allocate prefixes and register DOI names.
- DOIs have a prefix-suffix syntax (e.g. doi:10.314/56789) and can resolve to metadata and services about the associated entity. This metadata can describe relationships and provide additional discovery capabilities.
- The system aims to enable persistent access and citation of digital resources through long-term identifier resolution and curation. It has growing coverage of academic publications, research datasets, and other content types.
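The prefix-suffix syntax described above can be illustrated with a short sketch (the helper names are invented for illustration; the example DOI is the one from the summary):

```python
def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into its registrant prefix and item suffix."""
    # Strip an optional "doi:" scheme before splitting on the first slash
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    prefix, suffix = doi.split("/", 1)
    return prefix, suffix

def resolver_url(doi: str) -> str:
    """Build the conventional HTTP resolver URL for a DOI."""
    prefix, suffix = split_doi(doi)
    return f"https://doi.org/{prefix}/{suffix}"

print(split_doi("doi:10.314/56789"))    # ('10.314', '56789')
print(resolver_url("doi:10.314/56789"))  # https://doi.org/10.314/56789
```

The prefix identifies the registrant (allocated by a Registration Agency) while the suffix is chosen by the registrant, which is why splitting only on the first slash is important.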
RDAP 16 Poster: Hacking the figshare API to Create Enhanced Metadata Records (ASIS&T)
Research Data Access and Preservation Summit, 2016
Atlanta, GA
May 4-7, 2016
Poster session (Wednesday, May 4)
Presenters:
Simon Porter, Digital Science
Dan Valen, figshare
Addressing the data challenge in IIoT system evolution with Protocol Buffers (Toby McClean)
This document discusses how Protocol Buffers can help address data challenges as industrial IoT systems evolve. It notes that as these systems evolve, the volume, velocity, and variety of data will increase and the data structure will change. Protocol Buffers is presented as a solution because it allows for self-describing and evolvable data sharing across languages, platforms, and middleware in an open source and interoperable way. This reduces time to market and development/integration time for industrial IoT systems.
Data Integration Solutions Created By Koneksys (Koneksys)
This document summarizes data integration solutions created by Koneksys including OSLC adapters and clients, data management apps, specifications, and community efforts. It also describes other solutions such as model-based systems engineering, linked data research, blockchain, web applications, engineering and analysis, and network security and database work done by Koneksys. Open source projects for many of these solutions are listed.
The Web has until now mostly been considered a Web of documents, more specifically a Web of HTML pages. However, Tim Berners-Lee, the inventor of the Web, considers that the Web has not yet reached its full potential. The Data Web and Linked Data will enable more precise search services, transforming the Web into a smarter and richer Web. Google, for example, uses Linked Data concepts in its knowledge graph to process voice commands and voice queries. Linked Data concepts are not limited to the public Web: they can also be used to capture private knowledge in private company Webs, making them potentially applicable as the backbone for future PLM solutions.
Building Windows Phone Database App Using MVVM Pattern (Fiyaz Hasan)
This document discusses building a Windows Phone app using the MVVM pattern with a local database. It introduces MVVM and LINQ to SQL for accessing the database in an object-oriented way. A data context represents the database with table objects mapping to database tables containing entity objects. Links are provided for implementing ICommands and building local database apps for Windows Phone 8 using MVVM.
DOI (Digital Object Identifier) is a standardized system for identifying electronic or digital objects like online journal articles. A DOI is a unique alphanumeric string that provides a persistent link to the article. It is often included in citations and can be found in the article itself or through a service like Crossref.org. Including the DOI in references, as required by citation styles like APA, helps provide a stable link to the article. While not all journals have DOIs assigned, they are useful for linking to online content over the long term.
This document discusses using local databases in Windows Phone 8.1 apps. It explains that SQL Server Compact Edition (SQLCE) can be used for Silverlight apps to store relational data in the app's local folder using LINQ to SQL. For WinRT apps, SQLite is recommended as a third party database that supports multiple tables, triggers, views and indices and follows ACID rules. The document outlines how to define entities and mappings using attributes in a DataContext object to represent database tables and perform CRUD operations with LINQ to SQL queries that get translated to SQL and back.
Koneksys - Offering Services to Connect Data using the Data Web (Koneksys)
Koneksys provides consulting and software services to connect data silos using the Data Web (Linked Data on the World Wide Web). They create open-source software, promote data integration standards like OSLC, and help clients integrate their data from different sources and systems for improved traceability, transparency, collaboration and analytics. Connecting data using web technologies avoids vendor lock-in and proprietary solutions, allowing organizations to establish relationships between related data to facilitate sharing and decision making across silos.
LDCache - a cache for linked data-driven web applications (MetaSolutions AB)
This document describes LDCache, a cache for Linked Data-driven web applications. LDCache solves problems with relying on third party data sources like availability and alignment issues. It pre-fetches and caches RDF data based on a configuration to follow certain predicates and resources to a specified depth. Developers can then query the cache to retrieve consistent RDF data in various formats.
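The pre-fetching behaviour described above can be sketched generically. The toy graph, predicate names, and depth limit below are invented for illustration; LDCache's actual configuration format is not shown:

```python
from collections import deque

# A toy RDF-like graph: subject -> list of (predicate, object) pairs.
GRAPH = {
    "ex:course1": [("ex:teacher", "ex:alice"), ("ex:topic", "ex:rdf")],
    "ex:alice":   [("ex:knows", "ex:bob")],
    "ex:bob":     [("ex:knows", "ex:carol")],
    "ex:rdf":     [],
    "ex:carol":   [],
}

def prefetch(start, follow_predicates, max_depth):
    """Cache resources reachable via the configured predicates up to max_depth."""
    cache, queue = {}, deque([(start, 0)])
    while queue:
        resource, depth = queue.popleft()
        if resource in cache or resource not in GRAPH:
            continue
        cache[resource] = GRAPH[resource]  # "fetch" and store the description
        if depth < max_depth:
            for predicate, obj in GRAPH[resource]:
                if predicate in follow_predicates:
                    queue.append((obj, depth + 1))
    return cache

cached = prefetch("ex:course1", {"ex:teacher", "ex:knows"}, max_depth=2)
print(sorted(cached))  # ['ex:alice', 'ex:bob', 'ex:course1'] -- carol is too deep
```

Applications then query this local cache instead of the third-party source, which is what insulates them from availability problems.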
The document discusses using linked data to solve problems of identity and data access/integration. It describes linked data as data that is accessible over HTTP and implicitly associated with metadata. It then outlines problems around identity, such as repeating credentials across different apps/enterprises. The solution proposed is assigning individuals HTTP-based IDs and binding IDs to certificates and profiles. Problems of data silos across different databases and apps are also described, with the solution being to generate conceptual views over heterogeneous sources using middleware and RDF.
SQLite is a widely popular database format used extensively almost everywhere. Both iOS and Android employ SQLite as their storage format of choice, with built-in and third-party applications relying on SQLite to keep their data. A wide range of desktop and mobile Web browsers (Chrome, Firefox) and instant messaging applications use SQLite, including newer versions of Skype (the older versions no longer work without a forced upgrade), WhatsApp, iMessage, and many other messengers.
Forensic analysis of SQLite databases is often concluded by simply opening a database file in one or another database viewer. One common drawback of using a free or commercially available database viewer for examining SQLite databases is the inherent inability of such viewers to access and display recently deleted (erased) as well as recently added (but not yet committed) records. Here we examine the forensic implications of three features of the SQLite database engine: Free Lists, Write Ahead Log and Unallocated Space.
More information: http://belkasoft.com/sqlite-analysis
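The Write Ahead Log behaviour mentioned above is easy to observe with Python's built-in sqlite3 module (file and table names here are throwaway examples):

```python
import os
import sqlite3
import tempfile

# WAL mode needs a file-backed database; :memory: databases do not support it.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.db")

con = sqlite3.connect(path)
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]

con.execute("CREATE TABLE msg (id INTEGER PRIMARY KEY, body TEXT)")
con.execute("INSERT INTO msg (body) VALUES ('hello')")
con.commit()

# In WAL mode, committed pages sit in demo.db-wal until a checkpoint copies
# them into the main file, so a forensic tool must examine both files.
wal_present = os.path.exists(path + "-wal")
print(mode, wal_present)  # wal True
con.close()
```

A viewer that opens only the main `.db` file can therefore miss recent, committed changes that still live in the `-wal` companion file.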
Building nTier Applications with Entity Framework Services (David McCarter)
Learn how to build real world nTier applications with the Entity Framework and related services. With this new technology built into .NET, you can easily wrap an object model around your database and have all the data access automatically generated, or use your own stored procedures and views. Then learn how to easily and securely expose your object model using WCF with just a few lines of code using ADO.NET Data Services. The session will demonstrate how to create and consume these new technologies from the ground up.
Linked Data Driven Data Virtualization for Web-scale Integration (rumito)
- Linked data and data virtualization can help address challenges of growing data heterogeneity, complexity, and need for agility by providing a common data model and identifiers.
- Linked data uses RDF to represent information as graphs of triples connected by URIs, allowing different data sources to be integrated and queried together.
- As more data is published using common vocabularies and linking to existing URIs, it increases opportunities for discovery, integration and novel ways to extract value from diverse data sources.
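The triple model summarized above can be sketched with plain tuples (the URIs, sources, and names below are invented for illustration):

```python
# Two "sources" publish triples (subject, predicate, object) that share URIs.
source_a = [
    ("http://ex.org/book/1", "http://purl.org/dc/terms/title", "Linked Data"),
    ("http://ex.org/book/1", "http://purl.org/dc/terms/creator", "http://ex.org/person/7"),
]
source_b = [
    ("http://ex.org/person/7", "http://xmlns.com/foaf/0.1/name", "J. Doe"),
]

# Because both sources use the same URI for person/7, a merged graph can be
# queried across them with an ordinary join on that shared identifier.
graph = source_a + source_b

def objects(graph, subject, predicate):
    return [o for s, p, o in graph if s == subject and p == predicate]

author = objects(graph, "http://ex.org/book/1", "http://purl.org/dc/terms/creator")[0]
print(objects(graph, author, "http://xmlns.com/foaf/0.1/name"))  # ['J. Doe']
```

The join from book to author only works because both publishers reused the same URI, which is exactly the "linking to existing URIs" point made above.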
What's with the 1s and 0s? Making sense of binary data at scale with Tika and... (gagravarr)
If you have one or two files, you can take the time to manually work out what they are, what they contain, and how to get the useful bits out (probably....). However, this approach really doesn't scale, mechanical turks or no! Luckily, there are Apache projects out there which can help!
In this talk, we'll first look at how we can work out what a given blob of 1s and 0s actually is, be it textual or binary. We'll then see how to extract common metadata from it, along with text, embedded resources, images, and maybe even the kitchen sink! We'll see how to do all of this with Apache Tika, and how to dive down to the underlying libraries (including its Apache friends like POI and PDFBox) for specialist cases. Finally, we'll look a little bit about how to roll this all out on a Big Data or Large-Search case.
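The first step the talk describes, working out what a given blob of bytes actually is, often starts with "magic" numbers. A much-simplified sketch of that idea (the signature table below covers only three formats and is nothing like Tika's full detector):

```python
# Map leading "magic" bytes to a media type, the way detectors such as
# Apache Tika begin before falling back to deeper content parsing.
SIGNATURES = [
    (b"%PDF-",             "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"PK\x03\x04",        "application/zip"),  # also docx, jar, epub...
]

def sniff(data: bytes) -> str:
    for magic, media_type in SIGNATURES:
        if data.startswith(magic):
            return media_type
    # Crude text-vs-binary fallback: pure-ASCII content is "probably" text.
    try:
        data.decode("ascii")
        return "text/plain"
    except UnicodeDecodeError:
        return "application/octet-stream"

print(sniff(b"%PDF-1.7 ..."))    # application/pdf
print(sniff(b"PK\x03\x04rest"))  # application/zip
print(sniff(b"hello world"))     # text/plain
```

Note the ZIP signature illustrates why magic bytes alone aren't enough at scale: a docx, jar, and epub all start the same way, which is where container-aware detection takes over.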
Building nTier Applications with Entity Framework Services (Part 1) (David McCarter)
Learn how to build real world nTier applications with the new Entity Framework and related services. With this new technology built into .NET, you can easily wrap an object model around your database and have all the data access automatically generated, or use your own stored procedures and views. The session will demonstrate how to create and consume these new technologies from the ground up and focus on database modeling, including views and stored procedures, along with coding against the model via LINQ. A dynamic data website will also be demonstrated.
This document provides an overview of data binding in .NET applications. It discusses data binding goals, the BindingSource class, interfaces for custom data sources, and new features in the Binding class for formatting and parsing data. The key points covered are:
- The BindingSource class acts as the binding context and supports operations like add/remove on supported data sources.
- Interfaces like IBindingList, INotifyPropertyChanged help create custom data sources that support binding.
- The Binding class supports custom formatting/parsing of data through events and type converters.
- Data binding aims to synchronize data between controls and data sources at runtime.
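The change-notification idea behind interfaces like INotifyPropertyChanged can be sketched outside .NET as a tiny observer pattern (the class and handler names below are invented; this is an analogue of the concept, not the .NET API):

```python
class Observable:
    """Minimal analogue of INotifyPropertyChanged: notify listeners on change."""
    def __init__(self):
        self._listeners = []

    def subscribe(self, handler):
        self._listeners.append(handler)

    def _notify(self, name, value):
        for handler in self._listeners:
            handler(name, value)

class Person(Observable):
    def __init__(self, name):
        super().__init__()
        self._name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value
        self._notify("name", value)  # the "binding" reacts to this event

# A bound "control" simply mirrors the latest value it was told about.
label = {}
p = Person("Ada")
p.subscribe(lambda prop, value: label.__setitem__(prop, value))
p.name = "Grace"
print(label)  # {'name': 'Grace'}
```

This is the runtime synchronization the bullet list describes: the data source raises an event, and the bound control updates itself in response.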
The British Library has created a two-year Digital Scholarship Training Programme to teach their staff about digital tools and skills. They have developed fifteen one-day courses on topics like digitization, metadata, data visualization, and digital curation. Over 200 staff members have attended the courses so far. Feedback shows the courses are helping staff apply digital skills to their work and collaborations. The program aims to empower staff and facilitate new discoveries through digital scholarship.
DataCite and its DOI infrastructure - IASSIST 2013 (Frauke Ziedorn)
- DataCite is an international consortium that aims to make research data citable and accessible by establishing a system for minting DOIs (Digital Object Identifiers) for research data.
- DataCite has grown to include 17 member organizations from 12 countries that work with the Technical Information Library (TIB) to register over 1.5 million DOIs for research data.
- The DataCite metadata schema, based on Dublin Core, requires core metadata for DOI registration and encourages linking related publications, data, and other research objects to facilitate discovery and access.
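The "required core metadata" idea can be sketched as a simple completeness check. The field list below loosely follows the mandatory properties of the DataCite schema, but the record values are invented for illustration:

```python
# Fields loosely following the mandatory DataCite properties; the record
# values themselves are made up for this example.
REQUIRED = ["identifier", "creator", "title", "publisher", "publicationYear"]

def missing_fields(record: dict) -> list:
    """Return required fields that are absent or empty in a metadata record."""
    return [f for f in REQUIRED if not record.get(f)]

record = {
    "identifier": "10.1234/example-dataset",
    "creator": "Example Researcher",
    "title": "Example Ocean Temperature Dataset",
    "publisher": "Example Data Centre",
    "publicationYear": "2013",
}

print(missing_fields(record))                      # [] -> ready to register
print(missing_fields({"title": "Untitled data"}))  # everything else missing
```

A registration agency performs essentially this kind of validation before minting a DOI; the optional relation fields mentioned above are what link the dataset to related publications.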
Corpus Protocols IFLA Geneva August 2014 by Neil Smyth and Stella Wisdom (Stella Wisdom)
"Corpus Protocols: digital transformations of commercial newspaper collections for text and data mining to support academic research"
Presented by Neil Smyth and Stella Wisdom, at the IFLA 2014 Pre-Conference; "Digital Transformation and the Changing Role of News Media in the 21st Century" held at ITU, Geneva, August 13-14, 2014
The Digital Research & Curator Team at the British Library was formed in 2010 to support digital scholarship. Their mission is to develop innovative models for digital scholarship using digital content and technologies. Some of their main activities include staff training, promoting digital scholarship at the library, curating digital research data, and engaging with users. They offer various training courses, organize discussions on digital topics, and support digital collections and services at the library.
This document summarizes drivers for research data management in UK higher education, including policies from research funders like RCUK and AHRC. It also describes resources for supporting research data management, such as the Jisc Managing Research Data programme, the Digital Curation Centre (DCC), and projects funded through the Jisc programme like CAiRO and KAPTUR. The DCC provides guidance on data management planning, training, and curation best practices. Research data is broadly defined as any digital evidence used or created during the research process to generate new knowledge.
TIB's action for research data management as a national library's strategy in... (Peter Löwe)
The document discusses the TIB's strategy for research data management as a national library in the era of big data. It provides background on the TIB, including its size, budget, collections and networks. It then discusses key initiatives and projects related to research data management, including DataCite for assigning DOIs to datasets, the GOPORTIS library network, and the RADAR project which aims to create a research data repository. The goal is to improve access, discovery and preservation of research data by integrating datasets into the scholarly record through persistent identifiers and linking from publications.
Myth Busters II: BI Tools and Data Virtualization are Interchangeable (Denodo)
Watch Here: https://bit.ly/2NcqU6F
We take on the second myth about data virtualization: that a BI tool can substitute for data virtualization software.
You might be thinking: if I can run multi-source queries and define a logical model in my reporting tool, why would I need data virtualization software?
Reporting tools, no doubt important and necessary, focus on the visualization of data and its presentation to the business user. Data virtualization is a governed data access layer designed to connect to and provide transparency across all enterprise data.
Yet the myth suggests that these technologies are interchangeable. So we’re going to take it on!
Watch this webinar as we compare and contrast BI tools and data virtualization to draw a final conclusion.
This document provides an overview of database concepts and the history of data access APIs in Microsoft technologies. It defines what a database and DBMS are, lists some common DBMSs, and explains what data access is and why universal data access is important. It then summarizes the evolution of Microsoft's data access APIs from ODBC and DAO, which had limitations, to RDO, OLE DB, and ADO, which improved performance and universality.
This document discusses building a single database containing all web data by creating a scalable web crawler, data store, and data retrieval system. It describes the challenges of collecting and structuring data from millions of websites, building a NoSQL data store using Cassandra to handle terabytes of data, and providing an intuitive RESTful API for querying the unified database. The project aims to make web data easily accessible through a single source as if querying a database.
This is Part 4 of the GoldenGate series on Data Mesh - a series of webinars helping customers understand how to move off of old-fashioned monolithic data integration architecture and get ready for more agile, cost-effective, event-driven solutions. The Data Mesh is a kind of Data Fabric that emphasizes business-led data products running on event-driven streaming architectures, serverless, and microservices based platforms. These emerging solutions are essential for enterprises that run data-driven services on multi-cloud, multi-vendor ecosystems.
Join this session to get a fresh look at Data Mesh; we'll start with core architecture principles (vendor agnostic) and transition into detailed examples of how Oracle's GoldenGate platform is providing capabilities today. We will discuss essential technical characteristics of a Data Mesh solution, and the benefits that business owners can expect by moving IT in this direction. For more background on Data Mesh, Part 1, 2, and 3 are on the GoldenGate YouTube channel: https://www.youtube.com/playlist?list=PLbqmhpwYrlZJ-583p3KQGDAd6038i1ywe
Webinar Speaker: Jeff Pollock, VP Product (https://www.linkedin.com/in/jtpollock/)
Mr. Pollock is an expert technology leader for data platforms, big data, data integration and governance. Jeff has been CTO at California startups and a senior exec at Fortune 100 tech vendors. He is currently Oracle VP of Products and Cloud Services for Data Replication, Streaming Data and Database Migrations. While at IBM, he was head of all Information Integration, Replication and Governance products, and previously Jeff was an independent architect for US Defense Department, VP of Technology at Cerebra and CTO of Modulant – he has been engineering artificial intelligence based data platforms since 2001. As a business consultant, Mr. Pollock was a Head Architect at Ernst & Young’s Center for Technology Enablement. Jeff is also the author of “Semantic Web for Dummies” and "Adaptive Information,” a frequent keynote at industry conferences, author for books and industry journals, formerly a contributing member of W3C and OASIS, and an engineering instructor with UC Berkeley’s Extension for object-oriented systems, software development process and enterprise architecture.
Build Business Web Applications with PHPOpenbiz Framework and Cubi PlatformAgus Suhartono
Openbiz is a PHP application framework that provides an object-oriented, metadata-driven platform for application developers to build web applications with as little programming code as possible (80% metadata, 20% programming code).
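The metadata-driven idea the abstract describes can be sketched language-agnostically: most of a view is declared as data, and a small generic engine interprets it. (Openbiz itself uses PHP and XML metadata; the names and structure below are invented for illustration, not the framework's actual API.)

```python
# Hypothetical sketch of a metadata-driven form: the form definition is
# pure data, and one generic renderer serves every form in the app.
FORM_META = {
    "name": "ContactForm",
    "title": "Contact",
    "fields": [
        {"name": "email", "label": "E-mail", "required": True},
        {"name": "phone", "label": "Phone", "required": False},
    ],
}


def render_form(meta: dict) -> str:
    """Generic engine: turns form metadata into HTML, no per-form code."""
    rows = []
    for f in meta["fields"]:
        star = " *" if f["required"] else ""
        rows.append(f'<label>{f["label"]}{star}</label>'
                    f'<input name="{f["name"]}">')
    return f'<form id="{meta["name"]}">' + "".join(rows) + "</form>"
```

Adding a new screen then means writing metadata, not code, which is where the claimed 80/20 split comes from.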
Building IoT and Big Data Solutions on AzureIdo Flatow
This document discusses building IoT and big data solutions on Microsoft Azure. It provides an overview of common data types and challenges in integrating diverse data sources. It then describes several Azure services that can be used to ingest, process, analyze and visualize IoT and other large, diverse datasets. These services include IoT Hub, Event Hubs, Stream Analytics, HDInsight, Data Factory, DocumentDB and others. Examples and demos are provided for how to use these services to build end-to-end IoT and big data solutions on Azure.
The document summarizes discussions from a Clearinghouse WG Telecon meeting. It provides updates on deployed software, including upcoming releases of Isite, SMMS/GeoConnect Server, Blue Angel Gateway, and the ICN Web Search Module. It also discusses the NSDI FAQ resource, CAP grants, standards activities regarding ISO metadata and OGC catalog services, and plans for an NSDI workshop in Fall 2001.
The document discusses Microsoft's ALM Search service architecture and design. It describes plans for the search indexing and query pipelines, including using Elastic Search for indexing and querying across artifacts. It addresses security, performance, deployment topology, and futures like semantic search and integration with on-premise systems. Key points include indexing millions of files in hours, scaling out the indexing pipeline, and supporting cross-account and public repository search.
Data-Blitz is a processing platform that provides high throughput and availability for organizations that lack the resources to build such infrastructure themselves. It uses modern techniques like those pioneered at LinkedIn, Twitter, and others. Data-Blitz allows building, testing, deploying, and managing big data applications at scale across various infrastructure, with built-in security, monitoring, and DevOps tools.
DataFinder is software developed by the German Aerospace Center (DLR) to help scientists and engineers efficiently manage and organize their large and growing scientific data sets. It provides a structured way to organize data through customizable data models and metadata, and can integrate various storage resources. DataFinder was created in Python due to its ease of use and maintainability. It uses a client-server model with a WebDAV server to manage metadata and data structures, and can access different storage backends. Customizations through Python scripts allow users to automate tasks and integrate it into their workflows.
Big Data Analytics from Azure Cloud to Power BI MobileRoy Kim
This document discusses using Azure services for big data analytics and data insights. It provides an overview of Azure services like Azure Batch, Azure Data Lake, Azure HDInsight and Power BI. It then describes a demo solution that uses these Azure services to analyze job posting data, including collecting data using a .NET application, storing in Azure Data Lake Store, processing with Azure Data Lake Analytics and Azure HDInsight, and visualizing results in Power BI. The presentation includes architecture diagrams and discusses implementation details.
Next Gen Data Modeling in the Open Data Platform With Doron Porat and Liran Y...HostedbyConfluent
Next Gen Data Modeling in the Open Data Platform With Doron Porat and Liran Yogev | Current 2022
At Yotpo, we have a rich and busy data lake consisting of thousands of data sets ingested and digested by different engines, the main one being Spark.
We built our data infrastructure to enable our users to produce and consume data via self-service tooling, giving them the utmost freedom.
This freedom came with a cost.
We had trouble with bad standardization, little data reusability, lack of data lineage, and flaky data sets.
We also witnessed the landscape under which we built our platform change dramatically and so have our analytics needs and expectations.
We came to the understanding that the modeling layer should be decoupled from the execution layer in order to get rid of the limitations we were bound by:
- Batch and stream should be no more than attributes within a wider abstraction
- A Kafka topic and a data lake table are no different and should be treated the same way
- Observability of our data pipelines should have the same quality and depth across all execution engines, storage methods, and formats
- Governance should be an implicit part of our ecosystem, serving as a basis for both exploration and automation/anomaly detection
That's when we started building YODA (soon to be open sourced) that gives us killer dev experience with the level of abstraction we always dreamed of.
Combining DBT, Databricks, lakeFS, and a multitude of streaming engines - we started seeing our vision come to life.
In this talk, we'll share from our journey redesigning the data lake, and how to best address organizational needs, without having to give up on high-end tooling and technology. We are taking this to the next level.
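The unifying principle in this abstract — a Kafka topic and a lake table treated as the same thing — can be sketched as a single dataset interface. This is an illustrative toy (YODA is not yet open source; all names here are assumptions), with the stream simulated by an in-memory buffer:

```python
# Batch and stream as mere attributes of one Dataset abstraction:
# transformations are written once and run against either source.
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Record:
    key: str
    value: dict


class Dataset:
    """Uniform interface over batch tables and streaming topics."""

    def read(self) -> Iterator[Record]:
        raise NotImplementedError


class LakeTable(Dataset):
    """Batch: rows already materialized in the data lake."""

    def __init__(self, rows: Iterable[Record]):
        self._rows = list(rows)

    def read(self) -> Iterator[Record]:
        return iter(self._rows)


class KafkaTopic(Dataset):
    """Stream: simulated here by a buffer of consumed events."""

    def __init__(self, events: Iterable[Record]):
        self._events = list(events)

    def read(self) -> Iterator[Record]:
        return iter(self._events)


def count_by_key(ds: Dataset) -> dict:
    """A transformation written once, agnostic of batch vs. stream."""
    counts: dict = {}
    for rec in ds.read():
        counts[rec.key] = counts.get(rec.key, 0) + 1
    return counts
```

Because `count_by_key` only sees the `Dataset` interface, observability and governance hooks could likewise be attached once at that layer rather than per engine.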
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Dandelion Hashtable: beyond billion requests per second on a commodity serverAntonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line chaining. This design (1) offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. On a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
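For orientation, here is a toy illustration of the general closed-addressing-with-bounded-chains family that DLHT belongs to — not the paper's lock-free, prefetching design. The chain bound of 7 is an assumption standing in for "entries that fit one cache line"; note how a delete frees its slot immediately, which open-addressing schemes struggle with:

```python
# Toy closed-addressing hashtable with bounded per-bucket chains.
CHAIN_BOUND = 7  # stand-in for a cache-line-sized bucket (assumption)


class BoundedChainTable:
    def __init__(self, n_buckets: int = 16):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value) -> bool:
        b = self._bucket(key)
        for i, (k, _) in enumerate(b):
            if k == key:
                b[i] = (key, value)   # update existing entry in place
                return True
        if len(b) >= CHAIN_BOUND:
            return False              # a real design resizes here
        b.append((key, value))
        return True

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

    def delete(self, key) -> bool:
        b = self._bucket(key)
        for i, (k, _) in enumerate(b):
            if k == key:
                b.pop(i)              # slot is free for reuse instantly
                return True
        return False
```

DLHT's contribution is making operations on such chains lock-free, single-access in the common case, and resizable without blocking — none of which this sketch attempts.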
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
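The mutation-testing idea described above can be sketched with one minimal operator. This is a hypothetical illustration, not the paper's operator set or Eclipse plugin: the operator drops a single training phrase from an intent, and a mutant counts as "killed" when the test scenarios fail against it.

```python
# Minimal mutation-testing sketch for a chatbot design represented as
# a dict of intents -> training phrases (invented representation).
import copy


def drop_phrase_mutants(chatbot: dict):
    """Yield one mutant per training phrase removed from an intent."""
    for intent, phrases in chatbot["intents"].items():
        for i in range(len(phrases)):
            mutant = copy.deepcopy(chatbot)
            del mutant["intents"][intent][i]
            yield mutant


def mutation_score(mutants, run_tests) -> float:
    """Fraction of mutants killed: run_tests(mutant) False == killed."""
    mutants = list(mutants)
    killed = sum(1 for m in mutants if not run_tests(m))
    return killed / max(len(mutants), 1)
```

A low score signals weak test scenarios: many seeded faults go unnoticed, which is exactly the test-strength measurement the paper argues chatbot testing currently lacks.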
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, a free SAP software asset management tool for customers.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
AppSec PNW: Android and iOS Application Security with MobSFAjin Abraham
Mobile Security Framework - MobSF is a free and open source automated mobile application security testing environment designed to help security engineers, researchers, developers, and penetration testers to identify security vulnerabilities, malicious behaviours and privacy concerns in mobile applications using static and dynamic analysis. It supports all the popular mobile application binaries and source code formats built for Android and iOS devices. In addition to automated security assessment, it also offers an interactive testing environment to build and execute scenario based test/fuzz cases against the application.
This talk covers:
Using MobSF for static analysis of mobile applications.
Interactive dynamic security assessment of Android and iOS applications.
Solving Mobile app CTF challenges.
Reverse engineering and runtime analysis of Mobile malware.
How to shift left and integrate MobSF/mobsfscan SAST and DAST in your build pipeline.
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/how-axelera-ai-uses-digital-compute-in-memory-to-deliver-fast-and-energy-efficient-computer-vision-a-presentation-from-axelera-ai/
Bram Verhoef, Head of Machine Learning at Axelera AI, presents the “How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-efficient Computer Vision” tutorial at the May 2024 Embedded Vision Summit.
As artificial intelligence inference transitions from cloud environments to edge locations, computer vision applications achieve heightened responsiveness, reliability and privacy. This migration, however, introduces the challenge of operating within the stringent confines of resource constraints typical at the edge, including small form factors, low energy budgets and diminished memory and computational capacities. Axelera AI addresses these challenges through an innovative approach of performing digital computations within memory itself. This technique facilitates the realization of high-performance, energy-efficient and cost-effective computer vision capabilities at the thin and thick edge, extending the frontier of what is achievable with current technologies.
In this presentation, Verhoef unveils his company’s pioneering chip technology and demonstrates its capacity to deliver exceptional frames-per-second performance across a range of standard computer vision networks typical of applications in security, surveillance and the industrial sector. This shows that advanced computer vision can be accessible and efficient, even at the very edge of our technological ecosystem.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsDianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
Creating a compelling user experience for any software, without the limitations of APIs.
Accelerating the app creation process, saving time and effort
Enjoying high-performance CRUD (create, read, update, delete) operations for seamless data management.
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
HCL Notes and Domino license cost reduction in the world of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and licensing under the CCB and CCX models have been a hot topic for many in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new kind of licensing works and what benefits it brings you. Above all, you surely want to stay within your budget and save costs wherever possible. We understand that, and we want to help!
We explain how to solve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove superfluous or unused accounts to save money. There are also practices that can lead to unnecessary expenses, for example using a person document instead of a mail-in database for shared mailboxes. We show you such cases and their solutions. And of course we explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder introduce you to this new world. It will give you the tools and the know-how to keep track of everything. You will be able to reduce your costs through an optimized Domino configuration and keep them low in the future.
These topics will be covered
- Reducing license costs by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to use it best
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Real-world examples and best practices you can apply immediately
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready, whose client coverage is growing, and for which scaling and performance are life-and-death questions. The system uses Redis, MongoDB, and stream processing based on ksqlDB. In this talk, we will first analyze scaling approaches and then select the proper ones for our system.