In-memory databases (IMDBs) store data primarily in RAM for faster access than disk-based databases. While an older concept, IMDBs have become more practical due to lower RAM costs, multi-core CPUs, and 64-bit systems allowing more memory. IMDBs have different architectures, data representations, indexing, and query processing optimized for memory versus disk. They also face challenges in providing durability without disk and scaling to very large data sizes.
1. In Memory Database (IMDB)
(old is new again)
Kodamasimham Pridhvi
MT2012066
2. Agenda
› Introduction
› Architecture of IMDB
› Practical Application
› Myths about IMDBs
› IMDB vs. DRDB
› Impact of IMDB
› Challenges in IMDB
› Open-Source IMDBs
3. Introduction
› What is an In-Memory Database (IMDB)?
An IMDB, also called a Main-Memory Database (MMDB), is a database whose primary data store is main memory.
4. History
› Is this a new idea? NO!!!
› Why is it so important now?
Due to four factors:
5. Factors:
› LOWERING COSTS & GROWING SIZES (RAM)
– in early 2000, 64 MB of RAM cost about $71
– now, 8 GB of DDR3 costs about $69.99
› MULTICORE PROCESSORS
– parallel and faster computation
› 64-BIT COMPUTING
– multiple GB of addressable main memory
› FASTER RESPONSES TO QUERIES
7. Practical Application
› Applications that demand very fast data access, storage and manipulation
› Real-time embedded systems
› Music databases in MP3 players
› Programming data in set-top boxes
› E-commerce and social networking sites
› Financial services and many more…
8. IMDB vs. DRDB (Disk-Resident DB)

Disk-Resident Database         | In-Memory Database
-------------------------------|---------------------------------
Carries file I/O burden        | No file I/O burden
Extra memory for cache         | No extra memory for cache
Algorithms optimized for disk  | Algorithms optimized for memory
More CPU cycles                | Fewer CPU cycles
Assumes memory is abundant     | Uses memory more efficiently
9. Myths about IMDBs
› Given the same amount of RAM, disk DBs can perform at the same speed as IMDBs (by using caching technology).
› If a RAM disk is created and a traditional disk DB is deployed on it, it delivers the same performance as an in-memory database.
10. Myths about IMDBs
› An in-memory database is the same as an embedded database.
› Since RAM size is limited, the sizes of IMDBs are also limited.
11. Impact of IMDB:
› Data Representation
› Concurrency Control
› Data Access Methods
› Query Processing
› ACID Properties
› Recovery
12. Data Representation
› In a disk-resident DB, flat files and sequential access are used.
› In an IMDB, relational tuples store direct pointers to values.
– Space efficient: each distinct value is stored only once.
– Values can be shared between columns and relations.
13. Data Representation Diagram
[Diagram: relational tuples (RollNo, Age, Marks) whose fields are pointers to shared values in a domain table; the same Age value is referenced from multiple tuples. A code sketch of this idea follows.]
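As a minimal illustration of this pointer-based layout (my sketch, not from the original deck), the following Python snippet interns values in a shared domain table so that every tuple references a single stored copy; the DomainTable class and the sample relation are hypothetical names.

```python
# Minimal sketch of pointer-based tuple representation in an IMDB.
# Values live once in a shared "domain table"; tuples hold references
# to them (raw pointers in C, object references here).

class DomainTable:
    """Interns values so each distinct value is stored exactly once."""
    def __init__(self):
        self._values = {}

    def intern(self, value):
        # Return the single shared copy of `value`, creating it if new.
        return self._values.setdefault(value, value)

domain = DomainTable()

# Tuples of (RollNo, Age, Marks); the Age value 21 is stored once
# even though it appears in both tuples.
students = [
    (domain.intern("R001"), domain.intern(21), domain.intern(78)),
    (domain.intern("R002"), domain.intern(21), domain.intern(91)),
]

# Both tuples reference the identical shared object for Age == 21.
assert students[0][1] is students[1][1]
```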
14. Concurrency Control (lock-based)
› In a DRDB, locking granules are low-level (fields or records)
– to reduce contention.
› In an IMDB, fast processing permits coarser locks (see the sketch below)
– locking granules such as a relation or the entire database
– no need for a hash-table lookup in a lock manager
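To make the coarse-granule idea concrete, here is a minimal Python sketch (my illustration, not from the deck) in which one database-level lock serializes whole transactions, replacing a per-record lock table and its hash lookups; the class and method names are hypothetical.

```python
# Minimal sketch of coarse-grained concurrency control: because data is
# memory-resident and transactions finish quickly, a single lock on the
# whole database can stand in for a per-record lock manager.

import threading

class InMemoryDB:
    def __init__(self):
        self._data = {}                 # the entire database
        self._lock = threading.Lock()   # one coarse lock for all of it

    def execute(self, transaction):
        # Serialize whole transactions; no per-record lock table needed.
        with self._lock:
            return transaction(self._data)

db = InMemoryDB()
db.execute(lambda data: data.update({"R001": {"Age": 21}}))
print(db.execute(lambda data: data["R001"]))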
15. Data Access Methods
› In a DRDB, the B-tree index structure is used
– for range and exact-match queries
– resides on the hard disk
› In an IMDB, the T-tree index structure is explicitly designed for main memory
– reduces CPU processing
– eliminates index-value compression and expansion
16. T-tree
› A T-tree node consists of
– ordered elements between the node's minimum and maximum values
– two pointers, to the left and right child nodes
[Diagram: node layout – parent pointer; control data; keys 1..n from the minimum to the maximum element; left and right child pointers. A code sketch follows.]
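The node layout above translates almost directly into code. Below is a minimal Python sketch of a T-tree node and lookup (rebalancing, inserts and deletes are omitted); the class and function names are illustrative, not from any particular product.

```python
# Minimal sketch of a T-tree node and search, following the slide:
# each node holds a sorted run of keys bounded by its min and max,
# plus left/right child pointers (and a parent pointer). Real T-trees
# also rebalance like AVL trees; that is omitted here.

import bisect

class TTreeNode:
    def __init__(self, keys, left=None, right=None, parent=None):
        self.keys = sorted(keys)   # ordered elements of this node
        self.left = left           # subtree with keys < self.min
        self.right = right         # subtree with keys > self.max
        self.parent = parent

    @property
    def min(self):
        return self.keys[0]

    @property
    def max(self):
        return self.keys[-1]

def search(node, key):
    """Descend until `key` falls inside a node's [min, max] range."""
    while node is not None:
        if key < node.min:
            node = node.left
        elif key > node.max:
            node = node.right
        else:
            i = bisect.bisect_left(node.keys, key)
            return node.keys[i] == key   # found within the bounding node
    return False

root = TTreeNode([10, 20, 30],
                 left=TTreeNode([1, 5]),
                 right=TTreeNode([40, 50]))
print(search(root, 20), search(root, 7))   # True False
```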
17. Query Processing
› In a DRDB, query processing chiefly attempts to minimize disk accesses.
› In an IMDB, the focus is on processing costs; the main factors are
– cardinality of the table
– presence of an index
– any ORDER BY clause
– predicate evaluation
› Ex: TimesTen provides range, hash and bitmap indexes and supports two join methods, nested-loop and merge join (sketched below).
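To illustrate the two join methods named above, here is a minimal Python sketch (not TimesTen code; the table and column names are made up). The nested-loop join needs no ordering, while the merge join makes a single pass over inputs already sorted on the join key.

```python
# Minimal sketch of nested-loop join vs. merge join.

def nested_loop_join(r, s, key_r, key_s):
    # Compare every pair of rows; needs no ordering or index.
    return [(a, b) for a in r for b in s if a[key_r] == b[key_s]]

def merge_join(r, s, key_r, key_s):
    # One pass over both inputs, assuming each is already sorted on its
    # join key and keys are unique (duplicate handling omitted).
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        a, b = r[i][key_r], s[j][key_s]
        if a < b:
            i += 1
        elif a > b:
            j += 1
        else:
            out.append((r[i], s[j]))
            i += 1
            j += 1
    return out

students = [{"roll": 1, "age": 21}, {"roll": 2, "age": 22}]
marks = [{"roll": 1, "marks": 78}, {"roll": 2, "marks": 91}]
assert (nested_loop_join(students, marks, "roll", "roll")
        == merge_join(students, marks, "roll", "roll"))
```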
18. ACID Properties
› IMDBs can be said to lack support for the durability portion of ACID.
› Many MMDBs have added durability via the following mechanisms (a logging sketch follows):
– checkpoints
– transaction logging
– NVRAM (non-volatile RAM)
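A minimal sketch of the transaction-logging mechanism, assuming a simple JSON-record-per-line log format (the LoggedStore class and file name are hypothetical): each commit is forced to disk with fsync before the in-memory copy is updated, which is what restores durability.

```python
# Minimal sketch of durability via transaction logging: each committed
# change is appended to an on-disk log and fsync'd before the commit is
# acknowledged, even though the primary data store stays in memory.

import json
import os

class LoggedStore:
    def __init__(self, log_path="imdb.log"):
        self.data = {}                    # the in-memory primary store
        self.log = open(log_path, "a")

    def commit(self, key, value):
        record = json.dumps({"key": key, "value": value})
        self.log.write(record + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())   # durable before we acknowledge
        self.data[key] = value        # then apply to the in-memory copy

store = LoggedStore()
store.commit("R001", {"Age": 21})
```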
19. Recovery
› Mechanisms for recovery are
– logging
– checkpoints
– reloading
› Transactional durability is maintained by keeping two separate but synchronized copies of the database at all times, as well as storing log files on disk (see the sketch below).
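Continuing the hypothetical logging sketch above, recovery can be illustrated as reloading the last checkpoint and replaying the log records written after it; the file names and formats are again illustrative, not from any specific product.

```python
# Minimal sketch of recovery: reload the last checkpoint (a snapshot of
# the whole in-memory database), then replay logged commits. A real
# system would also truncate the log after a successful checkpoint.

import json
import os

def checkpoint(data, path="imdb.ckpt"):
    # Persist a full snapshot of the in-memory database.
    with open(path, "w") as f:
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())

def recover(ckpt_path="imdb.ckpt", log_path="imdb.log"):
    data = {}
    if os.path.exists(ckpt_path):        # 1. reload the checkpoint
        with open(ckpt_path) as f:
            data = json.load(f)
    if os.path.exists(log_path):         # 2. replay logged commits
        with open(log_path) as f:
            for line in f:
                rec = json.loads(line)
                data[rec["key"]] = rec["value"]
    return data

print(recover())
```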
Today's agenda for my presentation will be as follows:
1. Introduction – what an IMDB is, some brief history, and why it is becoming popular now.
2. Architecture of IMDB – some technical details.
3. Applications – where IMDBs can be used.
4. Some myths about IMDBs.
5. Differences between IMDB and DRDB – points where IMDB is superior to DRDB.
6. Impact of IMDB.
7. Some of the challenges in IMDB.
8. Open-source IMDBs.
9. Thank you.
Let's start with the basic question: what is an in-memory database? From the name itself you can guess what it means: an IMDB, also called a main-memory database (MMDB), is a database whose primary data store is main memory. That means in an IMDB the primary copy lives permanently in memory.
1. The idea of using an in-memory database is not new; IMDBs have existed for over a decade. They have evolved from a period when they were only used for caching or in high-speed data systems to the present, when they form an established part of mainstream IT. 2. Early in this century, although larger main memories were affordable, processors were not fast enough for main-memory databases to be popular. Today's processors, however, are faster and available in multicore and multiprocessor configurations with 64-bit memory addressability, stocked with multiple gigabytes of main memory.
Three developments in recent years have made in-memory analytics increasingly feasible: 64-bit computing, multi-core servers, and lower RAM prices together with growing RAM sizes. One of the key reasons for the interest in in-memory databases is to get faster responses to queries, which would otherwise be limited by the speed of the disk storage system.
Architecture:
1. An IMDB eliminates disk access by storing and manipulating the entire database in main memory.
2. The access time for main memory is orders of magnitude less than for disk storage.
3. Disks have a high, fixed cost per access that does not depend on the amount of data retrieved during the access. For this reason, disks are block-oriented storage devices; main memory is not block-oriented.
4. The layout of data on a disk is much more critical than the layout of data in main memory, since sequential access to a disk is faster than random access. Sequential access is not as important in main memory.
5. Buffer-pool management disappears entirely, the number of machine instructions is reduced, and the structure and size of index pages are simplified; consequently the design becomes simpler and more compact and, most importantly, requests are executed faster.
6. You can see there is secondary storage used for writing logs, checkpoints, etc.
Most real-time applications need very short and predictable response times, and main memory, as we know, has short response times. IMDSs running on real-time operating systems (RTOSs) provide the responsiveness needed in applications including IP network routing, telecom switching, and industrial control. The Open Music Daemon music player uses an IMDB. In-memory databases' typically small memory and CPU footprint makes them ideal because most embedded systems are highly resource-constrained. E-commerce and social networking sites use in-memory databases to cache portions of their back-end on-disk database systems.
1. If the cache of a DRDB is large enough, copies of the data will reside in memory at all times. Although such a system will perform well, it is not taking full advantage of the memory. For example, the index structures will be designed for disk access (e.g., B-trees), even though the data are in memory. Also, applications may have to access data through a buffer manager, as if the data were on disk. 2. Data in an on-disk database system must be transferred to numerous locations as it is used; consider the handoffs required for an application to read a piece of data from an on-disk database, modify it, and write that record back to the database. These steps require time and CPU cycles, and cannot be avoided in a traditional database, even when it runs on a RAM disk. Still more copies and transfers are required if transaction logging is active. In contrast, an in-memory database system entails a single data transfer. Elimination of multiple data transfers streamlines processing, removing multiple copies of data reduces memory consumption, and the simplified processing makes for greater reliability and minimizes CPU demands.
1. An EDB is a database system that is built into the software program by the application developer, whereas in-memory database systems typically employ the client/server model (e.g., Oracle TimesTen, Polyhedra). 2. The database size is limited by the amount of physical RAM in the server. On 32-bit platforms it is constrained by the 32-bit address space, so the database must be under 2 GB in size, or smaller depending on the specific platform. On 64-bit platforms there is no limit other than the amount of physical memory in the machine; we have customers that deploy with database sizes ranging from 1 GB (gigabyte) to over 2 TB (terabytes).
Main-memory databases can also take advantage of efficient pointer following for data representation. Relational tuples can be represented as a set of pointers to data values. The use of pointers is space-efficient when large values appear multiple times in the database, since the actual value needs to be stored only once. Pointers also simplify the handling of variable-length fields, since variable-length data can be represented using pointers into a heap. Relational data are usually represented as flat files, with tuples stored sequentially. Enumerated types larger than the pointer size are stored in the tuple as pointers to domain-table values; domain tables can be shared among different columns and even among different relations.
In a DRDB, systems choose small locking granules (fields or records) to reduce contention. In an IMDB, fast processing permits coarser locks: since contention is already low because data are memory-resident, the principal advantage of small lock granules is effectively removed, so we suggest large lock granules such as a relation or an entire database. Implementation: in a conventional system, locks are implemented via a hash table that contains entries for the objects currently locked; the objects themselves (on disk) contain no lock information. If the objects are in memory, we may be able to afford a small number of bits in them to represent their lock status.
An MMDB uses the T-tree index structure, unlike the B-tree index structure used by a DRDB. Since the ultimate aim of an MMDB is to reduce computation time while using little memory, the T-tree index structure is explicitly designed for MMDBs. A T-tree node consists of ordered elements between the node's min and max values, and two pointers to the left and right child nodes. Index structures can store pointers to the indexed data rather than the data itself; this eliminates the problem of storing variable-length fields in an index and saves space as long as the pointers are smaller than the data they point to. The indexes are very space-efficient and reasonably fast for range and exact-match queries, although updates are slow. Use of T-trees dramatically reduces the CPU processing required to access data and completely eliminates the index-value compression and expansion found in B-trees.
T-trees exploit the fact that the actual data is always in main memory together with the index; hence they do not keep copies of actual attribute values within the index tree nodes, but instead contain only pointers to the actual data fields. A T-tree is an ordered structure like an AVL tree with multiple keys per node, an ideal index structure for ordered search over data. Another index structure supported by MMDBs is the heap file, for handling a large number of fixed-length data items. A hash file supports unordered scans of data items as well as locking of data items, obtained transparently when items are inserted, deleted, updated or scanned. Oracle TimesTen uses T-tree and hash indexing algorithms to speed access to indexed data while also reducing CPU consumption.
Query processors for memory-resident data must focus on processing costs, whereas most conventional systems attempt to minimize disk accesses. TimesTen and IMDB Cache provide range, hash and bitmap indexes and support two join methods, nested-loop and merge join. The optimizer can create temporary indexes as needed; it also accepts hints that give applications the flexibility to make tradeoffs between factors such as temporary space usage and performance.
IMDBs are logically more exposed to failure than DRDBs due to their high-performance requirements, since the data is accessed directly by the processor in volatile memory.