WMS Benchmarking presentation and results from the FOSS4G 2010 event in Barcelona. Eight development teams participated in this exercise to see who could display common data through the WMS standard the fastest. http://2010.foss4g.org/wms_benchmarking.php
QGIS server: the good, the not-so-good and the ugly - Ross McDonald
Fiona Hemsley-Flint's presentation on QGIS Server given at the 6th Scottish QGIS UK user group meeting. Compares QGIS server with Mapserver and Geognosis.
Apache Spark presentation at HasGeek Fifth Elephant
https://fifthelephant.talkfunnel.com/2015/15-processing-large-data-with-apache-spark
Covering Big Data Overview, Spark Overview, Spark Internals and its supported libraries
Spatio-temporal Data Handling With GeoServer for MetOc And Remote Sensing - GeoSolutions
This presentation will provide detailed information on how to handle SpatioTemporal metadata in GeoServer for serving with OGC Services, with a particular focus on WMS and WCS.
SRE Demystified - 16 - NALSD - Non-Abstract Large System Design - Dr Ganesh Iyer
According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain what is Non-Abstract Large System Design (NALSD). Video recording can be found at https://youtu.be/barW9f1aNig
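NALSD is, at its core, disciplined back-of-the-envelope math: start from load and data-size assumptions and derive concrete machine counts. A minimal sketch of that exercise; all numbers are hypothetical inputs for illustration, not figures from the video:

```python
def machines_needed(peak_qps: int, qps_per_machine: int) -> int:
    """Round up: you cannot provision a fraction of a machine."""
    return -(-peak_qps // qps_per_machine)  # ceiling division

def storage_tb(records: int, bytes_per_record: int, replication: int) -> float:
    """Raw storage footprint in terabytes, including replication."""
    return records * bytes_per_record * replication / 1e12

# Hypothetical service: 120k QPS peak, 8k QPS per machine,
# 10B records of 1 KB each, stored with 3x replication.
print(machines_needed(120_000, 8_000))                 # serving machines
print(storage_tb(10_000_000_000, 1_000, 3))            # TB of storage
```

Iterating this estimate as the assumptions change (growth, failure domains, headroom) is the heart of the NALSD design loop.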
Stage Level Scheduling Improving Big Data and AI Integration - Databricks
In this talk, I will dive into the stage level scheduling feature added to Apache Spark 3.1. Stage level scheduling extends Project Hydrogen by improving big data ETL and AI integration, and it also enables multiple other use cases. It is beneficial any time the user wants to change container resources between stages in a single Apache Spark application, whether those resources are CPU, memory or GPUs. One of the most popular use cases is enabling end-to-end scalable Deep Learning and AI to efficiently use GPU resources. In this type of use case, users read from a distributed file system, do data manipulation and filtering to get the data into a format that the Deep Learning algorithm needs for training or inference, and then send the data into a Deep Learning algorithm. Using stage level scheduling combined with accelerator-aware scheduling enables users to seamlessly go from ETL to Deep Learning running on the GPU by adjusting the container requirements for different stages in Spark within the same application. This makes writing these applications easier and can help with hardware utilization and costs.
There are other ETL use cases where users want to change CPU and memory resources between stages, for instance when there is data skew or the data size is much larger in certain stages of the application. In this talk, I will go over the feature details, cluster requirements, the API and use cases. I will demo how the stage level scheduling API can be used by Horovod to seamlessly go from data preparation to training with the TensorFlow Keras API using GPUs.
The talk will also touch on other new Apache Spark 3.1 functionality, such as pluggable caching, which can be used to enable faster dataframe access when operating from GPUs.
Hive Bucketing in Apache Spark with Tejas Patil - Databricks
Bucketing is a partitioning technique that can improve performance in certain data transformations by avoiding data shuffling and sorting. The general idea of bucketing is to partition, and optionally sort, the data based on a subset of columns while it is written out (a one-time cost), making successive reads of the data more performant for downstream jobs if the SQL operators can make use of this property. Bucketing can enable faster joins (i.e., single-stage sort-merge join), the ability to short-circuit a FILTER operation if the file is pre-sorted on the column in a filter predicate, and quick data sampling.
In this session, you’ll learn how bucketing is implemented in both Hive and Spark. In particular, Patil will describe the changes in the Catalyst optimizer that enable these optimizations in Spark for various bucketing scenarios. Facebook’s performance tests have shown bucketing to make Spark 3-5x faster when the optimization is enabled. Many tables at Facebook are sorted and bucketed, and migrating these workloads to Spark has resulted in a 2-3x savings when compared to Hive. You’ll also hear about real-world applications of bucketing, like loading of cumulative tables with daily deltas, and the characteristics that can help identify suitable candidate jobs that can benefit from bucketing.
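The bucketing idea can be sketched in plain Python: hash rows on the join key into a fixed number of buckets at write time, then join bucket-by-bucket so only matching buckets are ever compared. This is an illustration of the concept only; Hive and Spark use their own hash functions and file layouts.

```python
import zlib

NUM_BUCKETS = 4

def bucket_of(key: str) -> int:
    # Stable hash so the same key always lands in the same bucket.
    return zlib.crc32(key.encode()) % NUM_BUCKETS

def write_bucketed(rows, key_col):
    """Partition rows into buckets on the join key (the one-time write cost)."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for row in rows:
        buckets[bucket_of(row[key_col])].append(row)
    return buckets

def bucketed_join(left_buckets, right_buckets, key_col):
    """Join bucket i against bucket i only -- no shuffle of either side."""
    out = []
    for lb, rb in zip(left_buckets, right_buckets):
        index = {}
        for r in rb:
            index.setdefault(r[key_col], []).append(r)
        for l in lb:
            for r in index.get(l[key_col], []):
                out.append({**l, **r})
    return out

users = write_bucketed([{"id": "u1", "name": "Ann"}, {"id": "u2", "name": "Bo"}], "id")
orders = write_bucketed([{"id": "u1", "total": 30}, {"id": "u1", "total": 12}], "id")
print(bucketed_join(users, orders, "id"))
```

Because both tables were bucketed with the same function and bucket count, matching keys are guaranteed to sit in the same bucket index on both sides, which is exactly the property the optimizer exploits.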
OSA Con 2022 - Apache Iceberg: An Architectural Look Under the Covers - Alex Merced - Altinity Ltd
OSA Con 2022: Apache Iceberg: An Architectural Look Under the Covers
Alex Merced - Dremio
The data lakehouse is one of the most exciting trends in the data space promising to merge the best aspects of data lakes and data warehouses without either of their problems. Open source tech is making this promise a reality and in this talk Dremio Developer Advocate, Alex Merced, explores these technologies.
In this talk Alex Merced will cover:
- What is a Data Lakehouse?
- Why open matters in preserving the promise of lakehouses (better costs, vendor freedom, data freedom)
- What are technologies that enable lakehouses like Apache Iceberg, Apache Parquet, Apache Arrow and Project Nessie
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Parquet - Dremio Corporation
Essentially every successful analytical DBMS in the market today makes use of column-oriented data structures. In the Hadoop ecosystem, Apache Parquet (and Apache ORC) provide similar advantages in terms of processing and storage efficiency. Apache Arrow is the in-memory counterpart to these formats and has been embraced by over a dozen open source projects as the de facto standard for in-memory processing. In this session the PMC Chair for Apache Arrow and the PMC Chair for Apache Parquet discuss the future of column-oriented processing.
Apache Phoenix: Past, Present and Future of SQL over HBase - enissoz
HBase, as the NoSQL database of choice in the Hadoop ecosystem, has already proven itself at scale and in many mission-critical workloads in hundreds of companies. Phoenix, the SQL layer on top of HBase, has increasingly become the tool of choice as the perfect complement to HBase. Phoenix is now being used more and more for super low latency querying and fast analytics across a large number of users in production deployments. In this talk, we will cover what makes Phoenix attractive among current and prospective HBase users, like SQL support, JDBC, data modeling, secondary indexing, UDFs, and also go over recent improvements like Query Server, ODBC drivers, ACID transactions, Spark integration, etc. We will conclude by looking into items in the pipeline and how Phoenix and HBase interact with other engines like Hive and Spark.
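One of the data-modeling features in this space is salted tables (Phoenix's `SALT_BUCKETS` option), which avoid region hot-spotting when row keys increase monotonically: a salt byte derived from the key is prepended so sequential writes fan out across regions. A rough sketch of the idea, using CRC32 as a stand-in for Phoenix's own hash:

```python
import zlib

SALT_BUCKETS = 8  # as in: CREATE TABLE ... SALT_BUCKETS = 8

def salted_rowkey(rowkey: bytes) -> bytes:
    """Prepend one salt byte so sequential keys spread across buckets."""
    salt = zlib.crc32(rowkey) % SALT_BUCKETS
    return bytes([salt]) + rowkey

# Monotonically increasing keys (e.g. event ids) would normally all hit
# the same region; after salting they land in several buckets.
keys = [f"event-{i:08d}".encode() for i in range(1000)]
salts = {salted_rowkey(k)[0] for k in keys}
print(sorted(salts))
```

The trade-off, as with the real feature, is that point reads must know the salt and range scans must fan out across all buckets.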
DAOS is an open-source storage-as-a-service stack designed from the ground up to address many of the problems that arise when scaling out storage. DAOS takes advantage of next-generation non-volatile memory technologies while presenting a rich and scalable storage interface providing features such as transactional non-blocking list I/O, data resiliency on top of commodity hardware, fine-grained data control, and storage tiering to optimize performance and cost.
GeoServer in Production: we do it, here is how! - GeoSolutions
The presentation will describe how to set up a production system based on GeoServer from the points of view of performance, availability and security. The suggestions will start by covering how a single-node GeoServer should be prepared for internet usage: tuning logging, connection pools, security, data and JVM preparation, and keeping disk, memory and CPU usage in check within the limits of the available resources. We’ll then move to tools used to monitor the production instances, ranging from probes to request auditing and watch-dogs. Finally the presentation will cover setting up a cluster of servers and the strategies for keeping them in sync, from the traditional multi-tier setup (testing vs production) to systems that need to keep an ever-evolving catalog of layers constantly online and in sync.
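The JVM-preparation step usually means pinning the heap and choosing a garbage collector up front. A hedged sketch of that kind of configuration; the sizes and flags below are placeholder values to adapt to your hardware, not recommendations from the talk:

```shell
# Hypothetical JAVA_OPTS for a single GeoServer node; tune -Xms/-Xmx to your RAM.
export JAVA_OPTS="-Xms2g -Xmx2g \
  -XX:+UseParallelGC \
  -Dorg.geotools.referencing.forceXY=true"
```

Setting `-Xms` equal to `-Xmx` avoids heap resizing pauses under load, which is the usual motivation for fixing both.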
ORC files were originally introduced in Hive, but have now migrated to an independent Apache project. This has sped up the development of ORC and simplified integrating ORC into other projects, such as Hadoop, Spark, Presto, and Nifi. There are also many new tools that are built on top of ORC, such as Hive’s ACID transactions and LLAP, which provides incredibly fast reads for your hot data. LLAP also provides strong security guarantees that allow each user to only see the rows and columns that they have permission for.
This talk will discuss the details of the ORC and Parquet formats and the relevant tradeoffs, in particular how to format your data and which options to use to maximize your read performance. We’ll discuss when and how to use ORC’s schema evolution, bloom filters, and predicate push down. It will also show you how to use the tools to translate ORC files into human-readable formats, such as JSON, and display the rich metadata from the file, including the types in the file and the min, max, and count for each column.
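The min/max column statistics and predicate push down mentioned above work together: if a predicate cannot possibly match a stripe's value range, the reader never touches that stripe. A simplified sketch of the pruning logic (illustrative only; real ORC keeps these statistics in the file footer and stripe index):

```python
# Per-stripe min/max statistics for one numeric column.
stripes = [
    {"id": 0, "min": 1,   "max": 99},
    {"id": 1, "min": 100, "max": 250},
    {"id": 2, "min": 251, "max": 400},
]

def stripes_to_read(stripes, lo, hi):
    """Keep only stripes whose [min, max] range overlaps the predicate [lo, hi]."""
    return [s["id"] for s in stripes if s["max"] >= lo and s["min"] <= hi]

# Predicate: value BETWEEN 120 AND 200 -> only stripe 1 is read.
print(stripes_to_read(stripes, 120, 200))  # [1]
```

This is why sorting data on the filtered column before writing pays off: tight, non-overlapping min/max ranges let far more stripes be skipped.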
Apache Iceberg Presentation for the St. Louis Big Data IDEA - Adam Doyle
Presentation on Apache Iceberg for the February 2021 St. Louis Big Data IDEA. Apache Iceberg is an open table format that works with Hive and Spark.
Fieldtrip GB is an app for Android and iOS which enables users to collect data against quality cartographic background maps. Its custom data capture forms allow users to collect exactly the information they need to support their research and export the data in a usable format.
mod-geocache / mapcache - A fast tiling solution for the Apache web server - tbonfort
MapServer MapCache (formerly known as mod-geocache) is a new member in the family of tile caching servers. It aims to be simple to install and configure (no need for intermediate glue such as mod-python, mod-wsgi or fastcgi), to be (very) fast (written in C and running as a native module under Apache), and to be capable (serving WMTS, googlemaps, virtualearth, KML, TMS, WMS). When acting as a WMS server, it will also respond to untiled requests by merging its cached tiles vertically (multiple layers) and/or horizontally.
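Merging cached tiles to answer an untiled WMS request starts with working out which tiles intersect the requested bbox. A simplified sketch of that grid arithmetic; MapCache's real gridsets are configured in XML, so the tile size, origin and resolution here are illustrative assumptions:

```python
TILE = 256  # tile width/height in pixels

def tiles_for_bbox(minx, miny, maxx, maxy, origin_x, origin_y, resolution):
    """Return (col, row) indices of every grid tile intersecting the bbox.

    maxx/maxy are treated as exclusive so a bbox aligned to tile
    boundaries does not pull in an extra row/column of tiles.
    """
    span = TILE * resolution  # ground size of one tile
    c0 = int((minx - origin_x) // span)
    c1 = int((maxx - origin_x) / span - 1e-9)
    r0 = int((miny - origin_y) // span)
    r1 = int((maxy - origin_y) / span - 1e-9)
    return [(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)]

# A 512x512 ground-unit request aligned to a 256-unit grid touches 4 tiles.
print(tiles_for_bbox(0, 0, 512, 512, 0, 0, 1.0))
```

Once the tile indices are known, the server fetches each tile from the cache and mosaics them into the single image the WMS client asked for.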
WMS Benchmarking presentation and results from the FOSS4G 2011 event in Denver. Six development teams participated in this exercise to see who could display common data through the WMS standard the fastest. http://2011.foss4g.org/sessions/web-mapping-performance-shootout
A brief overview of the Oracle R12 Warehouse Management System (WMS): the available functionality and various processes, along with the benefits of WMS.
Agenda:
- WMS overview
- Processes of the warehouse
- Inbound logistics
- Outbound logistics
- Material Control
- Q & A session
Time, Change and Habits in Geospatial-Temporal Information StandardsGeorge Percivall
Keynote for HIC 2014 – 11th International Conference on Hydroinformatics, New York, USA August 17 – 21, 2014
Time, Change and Habits in Geospatial-Temporal Information Standards
Time and change are fundamental to our scientific understanding of the world. Standards for geospatial-temporal information exist but new needs outstrip current standards. Geospatial-temporal information includes capturing change in features and coverages and modeling the processes that inform change. Key standards for time, calendars, and temporal reference systems are in place. Time series modeling from the WaterML standard is a recent advance of high value to hydrology. The OGC Moving Features standard will establish an encoding format for changes in “rigid” features. Interoperability standards are needed for Coverages with values that change based on observations, analytical expressions, or simulations. Applying a coverage model to time-varying, fluid Earth systems was the topic of the groundbreaking GALEON Interoperability Experiment. Standards developments for spatial-temporal process models are progressing with WPS, OpenMI and ESMF - supporting a Model Web concept. A robust framework for sharing geospatial-temporal information is now coming into place based on developments captured in standards by ISO, WMO, ITU, ICSU and OGC - including the newly established OGC Temporal domain working group. The new framework will enable capabilities in expressing and sharing scientific investigations including research on the emergence of forms over time. With these new capabilities we may come to understand Peirce’s observation that over time “all things have a tendency to take habits.”
UAVs are a disruptive technology bringing new geographic data and information to many application domains. UASs are similar to other geographic imagery systems, so existing frameworks are applicable. But the diversity of UAVs as platforms, along with the diversity of available sensors, presents challenges in the processing and creation of geospatial products. Efficient processing and dissemination of the data is achieved using software and systems that implement open standards. The challenges identified point to the need to use existing standards and to extend them. Results from the use of the OGC Sensor Web Enablement set of standards are presented. Next steps in the progress of UAVs and UASs may follow the path of open data, open source and open standards.
Open source is gleefully rewriting the rules of IT development at all levels of industry and government. Adoption of open source in government is well underway, with success stories illustrating the benefits.
This decade we are going further - fostering a healthy, sustainable, working relationship between government and open source:
* This presentation digs into the flexibility of open source licensing and how government organizations can meet the challenges of developing with open source.
* We will look at the advantages of government participation in open source at the project, institutional, and foundation level.
Attend this talk to understand how your organization can not only benefit from open source, but be open source.
Using GeoServer for spatio-temporal data management with examples for MetOc and Remote Sensing - GeoSolutions
This presentation will provide detailed information on how to ingest and configure spatio-temporal data in GeoServer, to be served using OGC services, with examples from WMS and WCS services.
Accelerating HBase with NVMe and Bucket Cache - David Grier
This set of slides describes some initial experiments we designed to discover performance improvements in Hadoop technologies using NVMe technology.
Accelerating HBase with NVMe and Bucket Cache - Nicolas Poggi
The Non-Volatile Memory express (NVMe) standard promises an order of magnitude faster storage than regular SSDs, while at the same time being more economical than regular RAM in TB/$. This talk evaluates the use cases and benefits of NVMe drives for use in Big Data clusters with HBase and Hadoop HDFS.
First, we benchmark the different drives using system level tools (FIO) to get maximum expected values for each different device type and set expectations. Second, we explore the different options and use cases of HBase storage and benchmark the different setups. And finally, we evaluate the speedups obtained by the NVMe technology for the different Big Data use cases from the YCSB benchmark.
In summary, while the NVMe drives show up to 8x speedup in best case scenarios, testing the cost-efficiency of new device technologies is not straightforward in Big Data, where we need to overcome system level caching to measure the maximum benefits.
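The device-level baseline step described above is typically done with a fio job along these lines. This is a hypothetical job file, not the one used in the talk; the device path and sizes are placeholders:

```ini
; 4 KiB random reads, direct I/O, queue depth 32 -- a common NVMe baseline.
[nvme-randread]
filename=/dev/nvme0n1
ioengine=libaio
direct=1
rw=randread
bs=4k
iodepth=32
runtime=60
time_based=1
```

Running the same job against each device class (SATA SSD, NVMe, RAM disk) gives the "maximum expected values" the talk uses to frame the HBase results.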
Voldemort & Hadoop @ LinkedIn, Hadoop User Group Jan 2010 - Bhupesh Bansal
Jan 22nd, 2010 Hadoop meetup presentation on Project Voldemort and how it plays well with Hadoop at LinkedIn. The talk focuses on the LinkedIn Hadoop ecosystem: how LinkedIn manages complex workflows, data ETL, data storage and online serving of 100 GB to TBs of data.
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ... - javier ramirez
How would you build a database to support sustained ingestion of several hundred thousand rows per second while running near real-time queries on top?
In this session I will go over some of the technical decisions and trade-offs we applied when building QuestDB, an open source time-series database developed mainly in Java, and how we can achieve over four million row writes per second on a single instance without blocking or slowing down the reads. There will be code and demos, of course.
We will also review a history of some of the changes we have gone through over the past two years to deal with late and unordered data, non-blocking writes, read replicas, and faster batch ingestion.
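The "late and unordered data" problem can be pictured with a toy model: keep a fast append path for in-order timestamps, and fall back to a sort-merge only when a late row arrives. This is a conceptual sketch of the idea, not QuestDB's actual implementation:

```python
import bisect

class Partition:
    """Toy time-series partition: one sorted column of timestamps."""

    def __init__(self):
        self.timestamps = []  # kept sorted at all times

    def append(self, ts: int):
        if not self.timestamps or ts >= self.timestamps[-1]:
            self.timestamps.append(ts)          # fast path: in-order append
        else:
            bisect.insort(self.timestamps, ts)  # slow path: merge a late row

p = Partition()
for ts in [1, 2, 5, 3, 8]:  # 3 arrives late
    p.append(ts)
print(p.timestamps)  # [1, 2, 3, 5, 8]
```

The engineering work the talk describes is about making that slow path cheap and non-blocking at millions of rows per second, rather than an in-place insert like this sketch.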
I'm currently experiencing a strong cognitive dissonance, and it won't let me go. You see, I visit various programmers' forums and see topics where people discuss noble ideas about how to write super-reliable classes; somebody says he builds his project with the switches -Wall -Wextra -pedantic -Weffc++, and so on. But, God, where are all these scientific and technological achievements? Why do I come across the same silly mistakes again and again? Perhaps something is wrong with me?
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C... - Odinot Stanislas
After a short intro on distributed storage and a description of Ceph, Jian Zhang presents some interesting benchmarks: sequential tests, random tests and, above all, a comparison of the results before and after optimizations. The configuration parameters and optimizations applied (large page numbers, Omap data on a separate disk, ...) bring at least a 2x performance gain.
The state of SQL-on-Hadoop in the Cloud - Nicolas Poggi
With the increase of Hadoop offerings in the Cloud, users are faced with many decisions to make: which Cloud provider, which VMs to choose, cluster sizing, storage type, or even whether to go to fully managed Platform-as-a-Service (PaaS) Hadoop. As the answer is always "it depends on your data and usage", this talk will give participants an overview of the different PaaS solutions from the leading Cloud providers, highlighting the main results of benchmarking their SQL-on-Hadoop (i.e., Hive) services using the ALOJA benchmarking project. It compares their current offerings in terms of readiness, architectural differences, and cost-effectiveness (performance-to-price) for entry-level Hadoop deployments, and briefly presents how to replicate the results and create custom benchmarks from internal apps, so that users can make their own decisions about choosing the right provider for their particular data needs.
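Performance-to-price in this kind of comparison reduces to normalizing benchmark runtime by hourly cost. A minimal sketch of that calculation; the provider names and numbers are hypothetical placeholders, not ALOJA results:

```python
# Hypothetical benchmark runtimes (hours) and on-demand prices (USD/hour).
offerings = {
    "provider-a": {"runtime_h": 2.0, "usd_per_h": 4.0},
    "provider-b": {"runtime_h": 1.5, "usd_per_h": 7.0},
}

def cost_per_run(o):
    """Total dollars to complete one benchmark run."""
    return o["runtime_h"] * o["usd_per_h"]

best = min(offerings, key=lambda k: cost_per_run(offerings[k]))
print(best, cost_per_run(offerings[best]))  # provider-a 8.0
```

Note the twist this surfaces: the fastest offering (provider-b here) is not necessarily the most cost-effective one, which is exactly the distinction the talk draws.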
sudoers: Benchmarking Hadoop with ALOJA - Nicolas Poggi
Presentation for the sudoers Barcelona group, Oct 06 2015, on benchmarking Hadoop with the ALOJA open source benchmarking platform. The presentation was mostly a live DEMO; posting some slides for the people who could not attend.
http://lanyrd.com/2015/sudoers-barcelona-october/
Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ... - Nagios
Dan Wittenberg's presentation on using Nagios at a Fortune 50 Company
The presentation was given during the Nagios World Conference North America held Sept 25-28th, 2012 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
Presentation on Large Scale Data Management - Chris Bunch
These are the slides for a presentation I recently gave at a seminar on Large Scale Data Management. The first half talks about the current state of affairs in the debate between MapReduce and parallel databases, while the second half focuses on two recent papers on virtual machine migration.
A look at the journey of my career, and a glimpse of the spatial industry through my eyes today. Presented to the students of COGS (Centre of Geographic Sciences), in Nova Scotia, Canada, on 2016-03-31.
History of the GRASS GIS Video from 1987 (with William Shatner) - Jeff McKenna
Peter Lowe's presentation from FOSS4G 2014 in Portland, explaining the history of the famous GRASS video, shot in 1987, starring 'Captain James T Kirk'.
Keynote presentation at the Geo-Information Society of South Africa (GISSA) conference, in Johannesburg, on 2 October 2012. Over 600 attendees from all over Africa.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... - Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. Fostering a culture of innovation, however, takes hard work: it takes vision, leadership and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf - Peter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
GraphRAG is All You Need? LLM & Knowledge Graph - Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Elevating Tactical DDD Patterns Through Object Calisthenics - Dorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
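As a taste of how an Object Calisthenics constraint maps onto a tactical DDD pattern, here is the "wrap all primitives" rule applied to a value object. The `Money` type and its behavior are a hypothetical illustration, not an example from the talk:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # value objects are immutable and compared by value
class Money:
    amount: int      # minor units (cents), avoiding float rounding issues
    currency: str

    def add(self, other: "Money") -> "Money":
        # The wrapped type can enforce an invariant a bare int never could.
        if other.currency != self.currency:
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.amount + other.amount, self.currency)

print(Money(1000, "EUR").add(Money(250, "EUR")))
```

Wrapping the primitive gives the domain rule (no cross-currency addition) a single home, which is exactly the kind of "mechanical" guidance the calisthenics constraints provide for DDD value objects.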
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... - UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
UiPath Test Automation using UiPath Test Suite series, part 3 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
• UI automation introduction
• UI automation sample
• Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Generative AI Deep Dive: Advancing from Proof of Concept to Production - Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Observability Concepts EVERY Developer Should Know - DeveloperWeek Europe - Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and I will share these foundational concepts to build on.
DevOps and Testing slides at DASA Connect - Kari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30 May 2024. We discuss what testing is, what agile testing is, and finally what testing in DevOps looks like. We also ran a lovely workshop with the participants, exploring different ways to think about quality and testing in the different parts of the DevOps infinity loop.
6. FOSS4G 2008: OpenGeo ran and published the second comparison, with some review from the MapServer developers. Focus on simple thematic mapping, raster data access, WFS and tile caching
48. Common bottlenecks are the CPU, disk access, network access and remote database access
49. Looking at system monitoring on the WMS machines during the benchmark runs, the disk was the bottleneck in most tests. For some servers, on some runs, the data became fully cached and performance increased dramatically
51. Some teams managed to turn this disk-bound scenario into a disk-unbound scenario, allowing servers to run fully in memory, with no disk access, putting all the load on the CPU
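The slides do not spell out how teams eliminated disk access; a common trick is to pre-warm the OS page cache by reading the whole dataset once before the timed runs, so subsequent reads are served from memory. A minimal sketch (the `warm_page_cache` helper is hypothetical, not from the benchmark harness):

```python
import pathlib

def warm_page_cache(data_dir: str) -> int:
    """Read every file under data_dir once so the OS page cache holds
    the data and subsequent benchmark runs hit memory, not disk."""
    total = 0
    for path in sorted(pathlib.Path(data_dir).rglob("*")):
        if path.is_file():
            # The bytes are discarded; reading them is enough to
            # populate the page cache.
            total += len(path.read_bytes())
    return total  # total bytes touched
```

This only works when the dataset fits in RAM; otherwise pages are evicted mid-run and the scenario stays disk-bound.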
52. Results observed in the disk-unbound scenario were 5 to 10 times better than in the disk-bound scenario
54. The 2000 WMS requests are exactly the same across all runs
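One way to guarantee identical request sets across runs is to generate them deterministically from a fixed seed. A sketch under assumed parameters (the endpoint, layer name and extent are hypothetical; the actual benchmark replayed pre-generated request lists):

```python
import random

def generate_getmap_requests(n=2000, seed=42,
                             base_url="http://server/wms",  # hypothetical endpoint
                             extent=(-180.0, -90.0, 180.0, 90.0)):
    """Generate n WMS GetMap URLs with pseudo-random bounding boxes.
    The fixed seed guarantees every run replays the exact same requests."""
    rng = random.Random(seed)
    minx, miny, maxx, maxy = extent
    urls = []
    for _ in range(n):
        # Pick a random 1x1 degree window inside the extent.
        x = rng.uniform(minx, maxx - 1.0)
        y = rng.uniform(miny, maxy - 1.0)
        bbox = f"{x:.6f},{y:.6f},{x + 1.0:.6f},{y + 1.0:.6f}"
        urls.append(f"{base_url}?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap"
                    f"&LAYERS=roads&SRS=EPSG:4326&BBOX={bbox}"
                    f"&WIDTH=800&HEIGHT=600&FORMAT=image/png")
    return urls
```

Because the request list is a pure function of the seed, every server under test answers the same 2000 GetMap calls in the same order.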
57. The Windows server was down during the last benchmark days, preventing some teams from trying to reproduce the disk-unbound scenario
58. As apples cannot be compared to oranges, all participants agreed that teams were not required to publish their benchmark results in the final presentation
95. Suggested setup: when you have more than 4 CPUs, and you are normally CPU-bound, set up multiple processes and use a load balancer to distribute the load among them (diagram: a load balancer in front of GeoServer 1 and GeoServer 2)
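A minimal sketch of this setup with nginx as the balancer (the slide does not prescribe a particular load balancer, and the ports and paths here are hypothetical):

```nginx
# Round-robin two GeoServer worker processes behind one frontend port.
upstream geoserver_workers {
    server 127.0.0.1:8081;  # GeoServer 1
    server 127.0.0.1:8082;  # GeoServer 2
}

server {
    listen 80;
    location /geoserver/ {
        proxy_pass http://geoserver_workers;
    }
}
```

nginx's default upstream behaviour is round-robin, so WMS requests alternate between the two worker JVMs, spreading rendering load across CPUs.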
177. => Tried to harmonize styles, but forgot to check image sizes and compression levels; avoided discussions of quality, although it is key to some participants
224. A huge disk-read bottleneck reduced the effect of rendering optimizations on overall performance
225. Tests started very late because of data availability. As a result, issues were discovered late in the test design, leaving no time to fix them.
228. As apples cannot be compared to oranges, the disk-bound scenario cannot be compared to the CPU-bound scenario.
230. Given the inconsistencies in the test design that were discussed, ERDAS is concerned that the differing throughput results between server applications might confuse and mislead the community.
231. ERDAS plans to conduct a webinar in the future to review the methodology and results of the FOSS4G benchmark. ERDAS will also provide analysis and ideas for future improvements.