This is from a two-hour talk introducing in-memory databases. First a look at traditional RDBMS architecture and some of its limitations, then a look at some in-memory products and finally a closer look at OrigoDB, the open source in-memory database toolkit for .NET/Mono.
Apache HBase™ is the Hadoop database, a distributed, scalable, big data store. It's a column-oriented database management system that runs on top of HDFS.
Apache HBase is an open source NoSQL database that provides real-time read/write access to those large data sets. ... HBase is natively integrated with Hadoop and works seamlessly alongside other data access engines through YARN.
The Future of Postgres Sharding / Bruce Momjian (PostgreSQL) / Ontico
Database sharding involves spreading database contents across multiple servers, with each server holding only part of the database. While it is possible to vertically scale Postgres, and to scale read-only workloads across multiple servers, only sharding allows multi-server read-write scaling. This presentation will cover the advantages of sharding and future Postgres sharding implementation requirements, including foreign data wrapper enhancements, parallelism, and global snapshot and transaction control. This is a followup to my Postgres Scaling Opportunities presentation.
HBase has established itself as the backend for many operational and interactive use-cases, powering well-known services that support millions of users and thousands of concurrent requests. In terms of features HBase has come a long way, offering advanced options such as multi-level caching on- and off-heap, pluggable request handling, fast recovery options such as region replicas, table snapshots for data governance, tuneable write-ahead logging and so on. This talk is based on the research for an upcoming second release of the speaker's HBase book, correlated with practical experience in medium to large HBase projects around the world. You will learn how to plan for HBase, starting with the selection of matching use-cases, to determining the number of servers needed, leading into performance tuning options. There is no reason to be afraid of using HBase, but knowing its basic premises and technical choices will make using it much more successful. You will also learn about many of the new features of HBase up to version 1.3, and where they are applicable.
Building robust CDC pipeline with Apache Hudi and Debezium / Tathastu.ai
We will cover the need for CDC and the benefits of building a CDC pipeline. We will compare various CDC streaming and reconciliation frameworks. We will also cover the architecture and the challenges we faced while running this system in production. Finally, we will conclude the talk by covering Apache Hudi, Schema Registry and Debezium in detail and our contributions to the open-source community.
Making MySQL Great For Business Intelligence / Calpont
This presentation describes how to make MySQL a great database for business intelligence, and presents a special focus on column databases and InfiniDB from Calpont.
Building tiered data stores using Aesop to bridge SQL and NoSQL systems / Regunath B
Slides from my talk on building tiered data stores using Aesop to bridge SQL and NoSQL data stores. Aesop is a pub-sub-like change data capture and propagation system.
Take an in-depth look at data warehousing with Amazon Redshift and get answers to your technical questions. We will cover performance tuning techniques that take advantage of Amazon Redshift's columnar technology and massively parallel processing architecture. We will also discuss best practices for migrating from existing data warehouses, optimizing your schema, loading data efficiently, and using workload management and interleaved sorting.
Moderated by Lars Hofhansl (Salesforce), with Matteo Bertozzi (Cloudera), John Leach (Splice Machine), Maxim Lukiyanov (Microsoft), Matt Mullins (Facebook), and Carter Page (Google)
The future of HBase, via a variety of viewpoints.
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb... / Amazon Web Services
Get a look under the hood: Understand how to take advantage of Amazon Redshift's columnar technology and parallel processing capabilities to improve your delivery of queries and improve overall database performance. You’ll also hear about how the University of Technology Sydney (UTS) are using Redshift. The University of Technology Sydney will describe how utilizing Amazon Redshift enabled agility in dealing with Data Quality, a capacity to scale when required, and optimizing development processes through rapid provisioning of Data Warehouse environments.
Speaker: Ganesh Raja, Solutions Architect, Amazon Web Services with Susan Gibson, Manager, Data and Business Intelligence, UTS
Level: 300
This presentation gives you an overview of SAP HANA, explains how SAP HANA works, addresses the comprehensive SAP big data solution, and finally illustrates how to create a SAP HANA One instance in AWS to tame your big data challenges.
Presentation of SAP's latest in-memory technology, HANA, given to the School of Information and Service Economy of Aalto University, Helsinki (Prof. Matti Rossi). The presentation includes links to demo systems and explains how to apply for access to a real SAP HANA system.
In-Memory Computing: How, Why? and common Patterns / Srinath Perera
Traditionally, big data is mostly read from disks and processed. However, most big data systems are latency bound, which means the CPU often sits idle waiting for data to arrive. This problem is more prevalent with use cases like graph searches that need to randomly access different parts of datasets. In-memory computing proposes an alternative model where data is loaded or stored in memory and processed there instead of being processed from disk. Although such designs cost more in terms of memory, the resulting systems can sometimes be orders of magnitude faster (e.g. 1000X), which could lead to savings in the long run. With rapidly falling memory prices, this cost difference is shrinking by the day. Furthermore, in-memory computing can enable use cases like ad hoc analysis over a large set of data that was not possible earlier. This talk will provide an overview of in-memory technology and discuss how WSO2 technologies like complex event processing can be used to build in-memory solutions. It will also provide an overview of upcoming improvements in the WSO2 platform.
CTO View: Driving the On-Demand Economy with Predictive Analytics / SingleStore
In the on-demand economy real-time analytics is both a necessity and a competitive advantage. The next evolution in the on-demand economy is in predictive analytics fueled by live streams of data—in effect knowing what customers want before they do. This session will feature technical examples of real-time pipelines, machine learning, and custom dashboards as well as off-the-shelf dashboards with Tableau.
Why Companies Need New Approaches for Faster Time-to-Insight / SAP Asia Pacific
An IDC infographic, sponsored by SAP. Volumes and variety of data are continuing to grow. The speed of data usage is also increasing, leading to new user expectations. Given these realities, organizations need to: Sift through data to find meaning, identify risks and opportunities, and identify factors that impact future performance.
Presentation at Open Day on Enterprise-Architecture and Systems-Thinking, London, 21 October 2014, for SCiO (Systems and Cybernetics in Organisations) http://scio.org.uk/
This used my development-work on the Enterprise Canvas framework as a worked-example of how we might create tools to bridge the gaps between enterprise-architecture and systems-thinking, in support of organisations' needs.
(This slidedeck also provides a useful overview and primer for Enterprise Canvas itself.)
Unify Line of Business Data with SAP Digital Boardroom / SAP Analytics
In this sample use case, you can see the power and potential of unifying your business data using SAP Digital Boardroom to gain meaningful insight. Learn more at http://www.sap.com/digital-boardroom
With SAP Digital Boardroom, you’re able to:
• Connect to Cloud data
• Leverage SAP business networks such as Hybris, Fieldglass, Ariba, and SuccessFactors
• Make faster decisions on live data
• Gain actionable insights
• Collaborate seamlessly in real-time
How do we explore the context for a business-architecture? Short-answer: raid the kids' toy-box!
This slidedeck provides a practical overview of how to explore and identify service-context or business-context, whilst developing a business-architecture. The key theme here is that it's easier to engage people in architecture-development if we make it both fun and thought-provoking, in an immediate, tangible way. As shown in the slidedeck, tools to do this include a wooden train-set and a Victorian toy-theatre - cheap, easily-obtainable and directly practical. Share And Enjoy!
Slidedeck for presentation at IASA-ITARC conference, London, 25 November 2016 - http://iasaglobal.org/itarc-london/
(Note: This is a big slidedeck - almost 75Mb. It'll take some time to download. But worth it, I trust!)
Leading Business Disruption Strategy with EA - Hugh Evans / Craig Martin
A Digital disruption presentation delivered as a webinar to the Open Group by Hugh Evans - CEO of Enterprise Architects.
The world is undergoing unprecedented change, driven largely by developments in digital technologies.
Organizations must now consider how to invent new business models as well as new products and services, and they must hone their transformational capabilities to rapidly execute on these plans.
In the recently published Hype Cycle for Enterprise Architecture 2013 Gartner places disruptive forces at the center of the emerging EA mandate:
"Enterprise Architecture (EA) is a discipline for proactively and holistically leading enterprise responses to disruptive forces by identifying and analyzing the execution of change toward desired business vision and outcomes."
"EA practitioners have the opportunity to take a quantum leap toward not only becoming integral to the business, but also leading business change."
[Source: Hype Cycle for Enterprise Architecture 2013, Gartner 2013]
Today, businesses are being forced to come to terms with their vulnerability and opportunities when it comes to disruptive innovation. Enterprise Architecture, by leveraging its emergent business architecture capabilities and its traditional technology and innovation focus, has the opportunity to fill a key void, aiding businesses to win in this new world.
This webinar will explore how EA can drive an organization’s disruptive agenda.
This presentation provides a clear overview of how Oracle Database In-Memory optimizes both analytics and mixed workloads, delivering outstanding performance while supporting real-time analytics, business intelligence, and reporting. It provides details on what you can expect from Database In-Memory in both Oracle Database 12.1.0.2 and 12.2.
The material was created around the end of 2010 but published in February 2011.
The main reasons for creating this document were as follows:
- very few people were working on SAP HANA at that time
- information about SAP HANA was not readily available and, where it was available, it was scattered
- I was constantly pulled into presentations and technical demos, which generally hampered my own hands-on work :). I was alone in San Jose, whereas Ulrich was busy in Walldorf coordinating with SAP.
BTW, this was my first full-fledged SAP HANA presentation, given at the end of 2010 although published in 2011. The document is quite old now, but most of it still holds good today.
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments / MapR Technologies
SAP HANA is an increasingly popular platform for various analytical and transactional use cases with its in-memory architecture. If you’re an SAP customer you’ve experienced the benefits.
However, the underlying storage for SAP HANA is painfully expensive. This slows down your ability to grow your SAP HANA footprint and serve up more applications.
This offering could be the solution for a Flash starting point which, together with Spectrum Scale, gives you a scalable Software Defined Storage solution to meet the requirements of unstructured storage and big data.
7 Reasons Not to Put an External Cache in Front of Your Database.pptx / ScyllaDB
Teams experiencing subpar latency commonly turn to an external cache to meet the required SLAs. Placing a cache in front of your database might seem like a fast and easy fix, but it often ends up introducing unanticipated complexity, costs, and risks. Caches can be one of the more problematic components of distributed application architecture.
Join this webinar for a technical discussion of the risks associated with using an external cache and a look at an alternative strategy that simplifies your architecture without compromising latency. We’ll cover:
- Different approaches to caching (pre-caching vs. caching, side cache vs. transparent cache)
- 7 specific reasons why external caching can be a bad choice
- Why Linux’s default caching doesn’t work well for databases
- The advantages & architecture of specialized row-based caches
- Real-world examples of why and how teams eliminated their external cache
Getting Started with Managed Database Services on AWS - September 2016 Webina... / Amazon Web Services
On AWS you can choose from a variety of managed database services that save effort, save time, and unlock new capabilities and economies. In this session, we make it easy to understand how they differ, what they have in common, and how to choose one or more. We'll explain the fundamentals of Amazon RDS, a managed relational database service in the cloud; Amazon DynamoDB, a fully managed NoSQL database service; Amazon ElastiCache, a fast, in-memory caching service in the cloud; and Amazon Redshift, a fully managed, petabyte-scale data-warehouse solution that can be surprisingly economical. We will cover how each service might help support your application, how much each service costs, and how to get started.
Learning Objectives:
• Overview of managed database services available on AWS
• How to combine them for high-performance cost effective architectures
• Learn how to choose between the AWS database services based on the use case
Who Should Attend:
• IT Managers, DBAs, Enterprise and Solution Architects, DevOps Engineers and Developers
3. What is a database?
• An organised collection of information
• Allows reading and writing.
• Provides authorisation and authentication.
• Provides some level of data safety.
4. Traditional RDBMS
• The relational model was developed by E. F. Codd in the early 1970s.
• The model is based on tables, rows and columns, and the manipulation of the data stored within them.
• A relational DB is the collection of all these tables.
• Examples: Oracle, MySQL and Microsoft Access
5. What is a database?
• An organised collection of information
• Allows reading and writing.
• Provides authorisation and authentication.
• Provides some level of data safety.
6. Data store for typical RDBMS
• Data resides on disk.
• Data may be cached in memory for access.
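The disk-plus-cache arrangement on this slide can be sketched roughly as follows. This is a minimal illustration, not how any particular RDBMS implements its buffer cache: the class name `CachedTable`, the `kv` table and the `disk_reads` counter are all hypothetical, and SQLite on a temporary file merely stands in for "data on disk".

```python
import os
import sqlite3
import tempfile


class CachedTable:
    """Key/value table that lives on disk, with an in-memory read cache (illustrative only)."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)  # the authoritative copy is the file on disk
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)"
        )
        self.cache = {}       # hypothetical cache: key -> value held in memory
        self.disk_reads = 0   # counts how often a read had to go to disk

    def put(self, key, value):
        # Write to disk first, then refresh the cached copy (write-through).
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )
        self.conn.commit()
        self.cache[key] = value

    def get(self, key):
        # Serve from memory when possible; otherwise fall back to disk.
        if key in self.cache:
            return self.cache[key]
        self.disk_reads += 1
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        self.cache[key] = row[0]
        return row[0]

    def close(self):
        self.conn.close()


# Usage: the second read of the same key is answered from memory.
table = CachedTable(os.path.join(tempfile.mkdtemp(), "kv.db"))
table.put("user:1", "alice")
table.cache.clear()          # simulate a cold cache
table.get("user:1")          # goes to disk
table.get("user:1")          # served from the cache
print(table.disk_reads)
```

The point of the sketch is that the disk copy stays authoritative and memory only speeds up repeated reads.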
7. PROBLEM
• Existing disk-based systems can no longer offer timely responses because of the high access latency of hard disks.
• This unacceptable performance is an obstacle to meaningful real-time services.
• E.g. real-time bidding, advertising, social gaming, stock markets.
8. “Memory is the new disk, disk is the new tape”
Jim Gray
Computer scientist
Co-developer of IBM System R
11. IN-MEMORY DATABASE SYSTEMS
• In an in-memory DB, data resides permanently in main memory.
• Source data is loaded into system memory in a compressed, non-relational format.
• Only a backup copy is kept on disk.
• Memory-optimised data structures are used.
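As a rough sketch of the idea on this slide, the store below keeps its working data in a Python dict and touches disk only to write or reload a backup snapshot. The class `InMemoryStore`, its method names and the JSON snapshot format are assumptions made for illustration; real in-memory databases use far more elaborate logging and recovery schemes.

```python
import json
import os
import tempfile


class InMemoryStore:
    """Toy in-memory store: all reads and writes hit memory, disk holds only a backup."""

    def __init__(self, snapshot_path):
        self.snapshot_path = snapshot_path
        self.data = {}  # primary copy of the data lives here, in main memory
        # On startup, reload the backup copy if one exists.
        if os.path.exists(snapshot_path):
            with open(snapshot_path, "r", encoding="utf-8") as f:
                self.data = json.load(f)

    def put(self, key, value):
        self.data[key] = value  # no disk I/O on the write path

    def get(self, key, default=None):
        return self.data.get(key, default)  # no disk I/O on the read path

    def snapshot(self):
        # Write to a temporary file, then rename, so a crash mid-write
        # never leaves a half-written backup behind.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)


# Usage: data survives a "restart" only if a snapshot was taken.
path = os.path.join(tempfile.mkdtemp(), "snapshot.json")
store = InMemoryStore(path)
store.put("bid:42", {"price": 3.5})
store.snapshot()
restarted = InMemoryStore(path)
print(restarted.get("bid:42"))
```

Anything written after the last snapshot would be lost on a crash, which is the volatility trade-off of keeping the primary copy in memory.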
12. Disk vs. Memory
• Access time for main memory is orders of magnitude lower than for disk.
• Main memory is normally volatile, while disk storage is not.
• Data layout is much more critical on disk than in main memory.
15.
• SAP HANA is the market leader in IMDB systems. It is also a platform for big data processing, analysis and prediction.
• SAP HANA can help businesses build real-time applications and analytics that accelerate their processes.
22. REFERENCES
• In-Memory Big Data Management and Processing, by Hao Zhang, Gang Chen (Member, IEEE), Beng Chin Ooi (Fellow, IEEE), Kian-Lee Tan (Member, IEEE) and Meihui Zhang (Member, IEEE)
• SAP HANA Distributed In-Memory Database System: Transaction, Session, and Metadata Management, by Juchang Lee, Yong Sik Kwon, Franz Färber, Michael Muehle, Chulwon; SAP Labs, Korea
• In-memory database, www.wikipedia.org
From one core to multi-core, to multiple processors per server, to multi-threaded cores, where we now have servers with up to 8 CPUs (with 24 MB caches each) and 160 threads!
Relentless technology progress by Intel, AMD, ARM and others will lead to even bigger caches and more cores. The name of the game is data locality and parallelization. The “Sandy Bridge” generation for servers has just been released.
By accessing data in column-store order, you benefit immensely from simplified table-scan and data pre-caching. This can make all the difference in performance.
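To make the column-store point concrete, the sketch below stores the same three records once as rows and once as per-column arrays. The sample data and the function names (`to_columns`, `sum_amount_row_store`, `sum_amount_column_store`) are hypothetical; the only claim is the layout difference, not any specific product's storage format.

```python
from array import array

# Row-store layout: each record is kept together (id, name, amount).
rows = [
    (1, "ann", 30.0),
    (2, "bob", 12.5),
    (3, "cy", 7.5),
]


def to_columns(records):
    """Pivot row-oriented records into one contiguous sequence per column."""
    return {
        "id": array("q", (r[0] for r in records)),       # packed 64-bit integers
        "name": [r[1] for r in records],
        "amount": array("d", (r[2] for r in records)),   # packed doubles
    }


def sum_amount_row_store(records):
    # Every record must be visited even though only one field is needed.
    return sum(r[2] for r in records)


def sum_amount_column_store(columns):
    # Only the 'amount' column is scanned; its values sit next to each other.
    return sum(columns["amount"])


columns = to_columns(rows)
print(sum_amount_row_store(rows), sum_amount_column_store(columns))
```

Both functions return the same total; the difference is that the column scan reads one densely packed array, which is the access pattern that benefits from sequential scans and hardware pre-fetching.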
Big building 1910
Basketball hoop – 10 feet
Ratio of 106M to 4.9k
Memory access is 1M – 10M times faster than disk. In the past, memory was so expensive that database vendors optimized for disk. However, with memory costs dropping so dramatically over the last 20 years, it’s now possible to harness the power of in-memory computing.